What BARB’s error reveals about the bizarre world of TV ratings
UPDATE: Jack Knight has written a fantastic comment on this post, giving a lot more background to BARB and sampled ratings in general. I highly recommend reading the comments after the post. If you’re interested in the history of audience ratings, ‘Rating The Audience: The Business of Media’ is also well worth reading.
BARB – the organisation that measures ratings for UK TV channels – has admitted that there were errors in its tracking system, and as a result some Channel 4 and ITV shows have ended up with false ratings. Broadcast Magazine’s article says that one of the programmes given false ratings was ITV’s X Factor – their highest-rated show, and one of the biggest advertising targets in broadcast television:
“The entertainment show originally recorded an overnight audience of 8.96m (33.6%) on ITV1 and ITV1 HD in figures released last Monday, but this has now grown to 9.84m (36.91%) under the revised data. Meanwhile, C4 shows including 999: What’s Your Emergency and Grand Designs have also experienced audience uplifts. However, others have fallen, such as the 22 September episode of The Comedy World Cup, which dropped from 1.8m (7.9%) to 631k (2.76%) on the back of the gaffe.”
How can mistakes like this happen in a multi-billion pound industry reliant on accurate audience metrics? Looking for an answer to that question opens up lots more questions about why such a huge and influential industry relies on relatively crude measuring techniques that haven’t changed much in decades.
TV ratings are measured using mechanical devices that record the presence of viewers in the room when the TV is on, usually by the viewers pressing a button to register that they’ve entered the room. So it really registers presence rather than attention – the viewer could be reading a newspaper, doing the ironing or using their iPhone, but for the sake of the ratings they count as an avid viewer.
Ratings technologies have been refined over time, but the basic concept hasn’t changed since Arthur C Nielsen invented it to measure radio audiences in the 1930s. BARB is the UK version of TV ratings, using a panel of 5,100 homes to represent the UK TV viewing public. So each percentage point in the examples above stands for a measurement sample of just 51 homes. The number of people in these homes is around 11,300, so each percentage point stands for a maximum of 113 people pressing their buttons when they walk into the living room. It’s often a lot less, as the percentages above are shares of the total viewing audience at that time (BARB calls this the ‘universe’) – many BARB panellists might be out of their homes, or might not have the TV on at that time.
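The panel arithmetic above can be sketched in a couple of lines of Python, using only the figures quoted in this post (5,100 homes, roughly 11,300 people):

```python
# Back-of-envelope BARB panel arithmetic, using the figures quoted above.
PANEL_HOMES = 5_100      # homes on the BARB panel
PANEL_VIEWERS = 11_300   # approximate number of people in those homes

homes_per_point = PANEL_HOMES / 100      # 51 homes per share point
viewers_per_point = PANEL_VIEWERS / 100  # at most 113 button-pressers per point

print(homes_per_point)    # 51.0
print(viewers_per_point)  # 113.0
```

A single share point, in other words, rests on the button-pressing of at most a few dozen households.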
If we take the numbers above, we can work out the size of the TV viewing universe when these errors occurred. The 8.96m audience originally reported for X Factor was 33.6% of total viewers that night, so one percentage point of that audience is 8.96m / 33.6 – roughly 266,700 viewers. This means that BARB’s estimate for the total UK TV viewing audience on a Saturday night is around 26.7m people, which is about 43% of the UK population of 62m people. Scaling that down to the panel, the number of BARB panellists registering themselves as viewers that night was roughly 43% of 11,300 – about 4,860 people.
Still with me? Let’s now take X Factor’s reported share to work out how many BARB panellists registered themselves as watching that programme. The original share reported was 33.6% of total viewers. We know the total number of BARB panellists watching TV was about 4,860, so the number watching X Factor according to the original report was 33.6% of 4,860 – roughly 1,630 people. So BARB measures around 1,630 people watching a TV programme, and extrapolates that number to report an audience rating of 8.96m viewers. No matter how scientific and representative the survey, it is remarkable to think that multi-billion pound creative decisions are made on such a small sample size.
Now let’s look at the size of the error. BARB under-represented X Factor’s share by 3.31 percentage points, which was a difference of 880,000 viewers in the reported ratings. In panel terms, 3.31% of the roughly 4,860 panellists watching TV that night is about 160 people. (You can check this the other way round: each panellist stands for roughly 62m / 11,300 – around 5,500 viewers – and 880,000 / 5,500 also comes out at about 160.)
An error in measuring around 160 people pressing a button when they walk into a room means that one of the UK’s largest media businesses under-represented the performance of their most important programme by 880,000 viewers. Is it just me, or is that completely insane?
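The whole back-of-envelope chain can be reproduced in a few lines of Python, using only the figures quoted from the Broadcast article and BARB’s published panel size (the variable names and rounding are mine):

```python
# Reproduce the back-of-envelope BARB arithmetic from the post.
PANEL_VIEWERS = 11_300        # approximate people in BARB's 5,100 panel homes
UK_POPULATION = 62_000_000    # rough UK population figure used in the post

original_audience = 8_960_000   # X Factor overnight audience as first reported
original_share = 0.336          # 33.6% share
revised_audience = 9_840_000    # audience after BARB's correction
revised_share = 0.3691          # 36.91% share

# Size of the total TV-viewing "universe" implied by the reported share.
universe = original_audience / original_share          # ~26.7m viewers

# Sanity check: the revised figures imply virtually the same universe,
# so the error was in X Factor's share, not in the size of the universe.
universe_revised = revised_audience / revised_share    # ~26.7m viewers

# How many of the 11,300 panellists were watching TV at all that night?
panellists_viewing = PANEL_VIEWERS * universe / UK_POPULATION   # ~4,860

# How many panellists does the X Factor rating actually rest on?
panellists_xfactor = panellists_viewing * original_share        # ~1,630

# The reporting error, expressed in panellists rather than viewers.
viewers_per_panellist = UK_POPULATION / PANEL_VIEWERS           # ~5,500
error_viewers = revised_audience - original_audience            # 880,000
error_panellists = error_viewers / viewers_per_panellist        # ~160

print(f"universe: {universe:,.0f}")
print(f"panellists watching TV: {panellists_viewing:,.0f}")
print(f"panellists watching X Factor: {panellists_xfactor:,.0f}")
print(f"error, in panellists: {error_panellists:,.0f}")
```

The striking line is the last one: an 880,000-viewer swing in the published ratings corresponds to the measured behaviour of around 160 panellists.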