Statistics are hard, as Charles helpfully pointed
out to me a couple of weeks ago. But one has the idea that techier people
grasp the relevant concepts better than your standard arty journalist might. And
then one reads Paul Boutin
in Slate:
"While the Web guys admit they could be off by half, Nielsen claims its television
ratings have a margin of error of 4 percent."
If you follow that link, you’ll find that it doesn’t quite say what Boutin
says it says. In fact, the words "margin of error" don’t even appear.
Rather, one finds this:
"According to sampling theory and a very tasty laboratory test, 19 out of
20 times we take a well-stirred sample of soup containing 5,000 vegetable
pieces, we get between 48% and 52% carrots. There is no guarantee that the
percentage of carrots in a sample of this size will be between 48% and 52%
(one time in 20 it will be outside this range, but usually not far outside
this range). The same sampling errors apply to a representative sample of
television viewers."
Ignore the carrot language for the time being. What Nielsen is saying here
is that the company is 95% certain that its TV ratings are within 4 percentage
points of, well, something. But that something isn’t the "true figure"
– the actual number of households watching a certain program. Nielsen
first assumes that its sample is perfectly representative (that’s what
they mean by "a well-stirred sample"); only then does it
calculate the margin of error. (This is true of all opinion polls, by the way,
including – and especially – political ones.)
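For anyone who wants to see the arithmetic, here is a rough sketch of the textbook sampling-theory calculation. It assumes a simple random sample and the usual normal approximation, with 1.96 standing in for "19 out of 20 times". The function name `margin_of_error` is mine, and nothing here is claimed to be Nielsen's actual method. Note what goes in: a sample size and a proportion, and nothing at all about whether the soup was stirred.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the ~95% interval for a sample proportion (as a fraction).

    Assumes a perfectly representative simple random sample; the formula
    has no term for bias, misreporting or faulty equipment.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# The soup from the quote: 5,000 vegetable pieces, about half of them carrots.
moe = margin_of_error(5000)
print(f"+/- {moe * 100:.2f} percentage points")  # prints "+/- 1.39 percentage points"
```

Under those assumptions the figure comes out at about 1.4 percentage points either side, which sits inside the 48% to 52% window in the quote.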
In other words, there are two ways that Nielsen can be more than 4% out in
its TV ratings. On the one hand, it could simply be unlucky. Indeed, 5% of its
ratings are more than 4% out; it’s just that no one knows which 5%
they are. Alternatively, its methodology could be imperfect. Any problem with
the representativeness of the sample, or reporting bias, or technological glitches,
is not included in what Boutin calls the "margin of error". Which
means that if there were any way of actually measuring exactly how many households
were watching a given TV program on a given night, we’d find that more than
5% of Nielsen’s ratings would be more than 4% off base.
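The two ways of being wrong can be put into a toy simulation. Everything in it is hypothetical: a program truly watched by 20% of households, a panel of 1,000, and a panel that over-represents viewers by two percentage points. The function `share_outside_margin` and its parameters are illustrative names of mine. The stated margin is computed the textbook way, assuming a perfect sample, and the simulation then counts how often the rating misses the truth by more than that margin.

```python
import math
import random

def share_outside_margin(true_share, bias, sample_size, margin, trials, seed=0):
    """Fraction of simulated ratings that miss the true share by more than `margin`."""
    rng = random.Random(seed)
    # Share of viewers within the population the panel is actually drawn from.
    sampled_share = true_share + bias
    misses = 0
    for _ in range(trials):
        viewers = sum(rng.random() < sampled_share for _ in range(sample_size))
        estimate = viewers / sample_size
        if abs(estimate - true_share) > margin:
            misses += 1
    return misses / trials

TRUE_SHARE = 0.20   # hypothetical true audience share
PANEL_SIZE = 1000   # hypothetical panel size
# Stated 95% margin, calculated as if the panel were perfectly representative.
MARGIN = 1.96 * math.sqrt(TRUE_SHARE * (1 - TRUE_SHARE) / PANEL_SIZE)

unlucky_only = share_outside_margin(TRUE_SHARE, 0.00, PANEL_SIZE, MARGIN, trials=1000)
with_bias = share_outside_margin(TRUE_SHARE, 0.02, PANEL_SIZE, MARGIN, trials=1000)
print(f"perfect panel: {unlucky_only:.1%} of ratings outside the margin")
print(f"skewed panel:  {with_bias:.1%} of ratings outside the margin")
```

With a perfect panel the miss rate should land somewhere near the advertised one in 20. With even a modest skew it should come out several times higher, although the stated margin of error has not changed at all.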
But no such measurement exists. So Nielsen ratings are accepted as the least bad option for
broadcasters and advertisers. On the web, of course, there are alternative ways
of measuring traffic to websites – looking at one’s own server logs being
the most obvious – and so it’s much easier to tell when Nielsen is wide
of the mark. "The more I dig into how Web ratings work, the more I realize
people in other media are in denial," says Boutin. Which might be true,
if people in other media really believed the Nielsen ratings. But in fact,
those people are simply making the only decision they can make: to take Nielsen’s
figures at face value, because there is no alternative.
Anybody counting anything is going to make a mistake. Take SAT scores, for
instance. There will, on occasion, be errors in the way that a certain person’s
test has been graded. The machine goes wonky, the wrong score is spat out, with
major or minor consequences. One hopes those errors are very infrequent. But
more to the point, there will often be occasions when someone with high scholastic
aptitude gets a low SAT score, or someone with low scholastic aptitude gets
a high SAT score. Everything from a hangover to a complicated love life to a
successful test-cramming service can affect SAT scores, which means they are
a far from perfect proxy for whatever it is they’re trying to measure. But it’s
useful for there to be something standard and quantifiable in the academic world,
and the SAT is one of the least-bad options.
The fact is that it’s not people in other media who are in denial; it’s Boutin’s
"web moguls". They think that because they have hard-and-fast numbers
for their own website, there is or can be some kind of knowable truth about
how many pageviews and unique visitors they have. In reality, however, just
as with anything else quantifiable, there are going to be measuring mistakes both
big and small. Everybody else has been resigned to this for decades. It’s only
on the web that people still dare to hope.