A study was published on 4th March 2014 with a press release claiming “Meat and cheese may be as bad for you as smoking”.
I’ve gone through the study in detail here.
Just to remind ourselves of the headline numbers from the press release and where the numbers come from in the main paper:
1) The “four times more likely to die of cancer” comes from Table 1, Model 1, for 50-65 year olds. The Hazard Ratio (HR) is given as 4.33 for the high protein intake group when the low protein intake group is referenced as 1.00.
2) The “protein-lovers were 74 percent more likely to die of any cause” also comes from Table 1, Model 1, for 50-65 year olds. The Hazard Ratio is given as 1.74 for the high protein intake group when the low protein intake group is referenced as 1.00.
I mentioned in the narrative that I had emailed Dr Longo asking for the numbers behind the hazard ratios. I have not heard back from Dr Longo – not even an auto response. A journalist I know has also failed to get any numbers from the research team. Their view was “we don’t send raw and incomplete data to journalist (sic) since it is usually misinterpreted.”
No – raw data gives the facts. Manipulated data can be deliberately presented so that it is misinterpreted – or rather, interpreted as the ‘researchers’ want you to interpret it. This kind of misinterpretation for example: “eating a diet rich in animal proteins during middle age makes you four times more likely to die of cancer than someone with a low-protein diet — a mortality risk factor comparable to smoking.”
However – well done to “Sam” who posted a comment next to the article.
Sam asked: “Please disclose the raw death rates for cancer in the 3 groups following the stratification into -65 and +65 (that division is the central pillar of your human study, hence it is bad science and poor reviewing not to have asked for them). Without those, your modelling rests on nothing and is thus open to severe criticism. Your table 1 in suppl material shows the raw death rates (9-10%) and protein seems to have no effect whatsoever on them (as has been pointed out by others).”
Morgan Levine replied: “The frequencies by protein group are as follows:
Age 50-65 All-cause Mortality
Age 50-65 Cancer Mortality
Age 66+ All-cause Mortality
Age 66+ Cancer Mortality
So now we’re in business!
Page 10 of the supplemental PDF tells us:
Ages 50-65: Low Protein (N=219), Moderate Protein (N=2,277), High Protein (N=543)
Ages 66+: Low Protein (N=218), Moderate Protein (N=2,521), High Protein (N=603)
So we know how many people were in each of the two age groups and three protein intake groups. This gives us row 1 in the table below…
|1||Number of people (from supplemental)|
|2||All-cause mortality (from Levine)|
|3||Cancer Mortality (from Levine)|
|4||All-cause deaths (Row 1 x Row 2)|
|5||Cancer deaths (Row 1 x Row 3)|
|6||All-cause RR (uses Row 2)|
|7||All-cause HR from Table 1|
|8||Cancer RR (uses Row 3)|
|9||Cancer HR from Table 1|
|10||Average person yrs follow-up (from supplemental)|
|11||Cancer death rates per year (Row 3/Row 10)|
|12||Person years (Row 1 x Row 10)|
Rows 2 and 3 have been provided by Morgan Levine, in the reply to Sam’s query.
Rows 4 and 5 calculate actual numbers of deaths using the number of people and death rates. As my “p.s.” in the original post hypothesised, there are substantially more deaths in the over 65s (3.34 times as many).
Row 6 takes low protein intake for all-cause mortality as the reference point of 1.00 and then works out the ratio of moderate and high protein intake relative to 1.00. Row 7 repeats the Hazard Ratios taken from Table 1, Model 1, of the main paper, as a reminder of the source of the headlines.
We are not comparing like with like, as Levine has given us top level death rates for each protein intake group and Model 1 has adjusted for loads of things (age, race, sex, education, waist circumference, smoking, diabetes, cancer, previous heart attack, diet changed in past year, tried to lose weight in past year and the kitchen sink – I made up that last one).
It is interesting that, for total mortality, the unadjusted data would only elicit a headline of “45% more likely”, not “74% more likely” – notwithstanding that this is still association, not causation and still relative not absolute risk and still only featuring the 50-65 year old group. The headline from Table 1, Model 1 for over 65s could have been “low protein consumers are 39% more likely to die” (using high protein as the reference).
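For anyone who wants to check how a ratio turns into a "percent more likely" headline, here is a minimal sketch. The function names are my own, and the 1.45 and 1.39 ratios are simply the values implied by the "45%" and "39%" figures quoted above, not numbers read from the paper's tables.

```python
def percent_more_likely(ratio):
    """Turn a hazard ratio or relative risk into 'X% more likely' wording."""
    return (ratio - 1.0) * 100.0


def flip_reference(ratio):
    """Re-express a ratio with the other group as the 1.00 reference."""
    return 1.0 / ratio


# Ratios quoted in the text (1.45 and 1.39 are implied by the percentages).
quoted = [
    ("All-cause HR, 50-65, Table 1 Model 1", 1.74),
    ("Cancer HR, 50-65, Table 1 Model 1", 4.33),
    ("All-cause unadjusted RR, 50-65", 1.45),
    ("Low vs high protein, over 65s", 1.39),
]

for label, ratio in quoted:
    print(f"{label}: {ratio:.2f} -> {percent_more_likely(ratio):.0f}% more likely")

# The over-65s figure seen from the other side: high protein vs low protein.
print(f"High vs low, over 65s: {flip_reference(1.39):.2f}")
```

Note that a ratio of 4.33 is "4.33 times as likely", which is 333% more likely; which group is used as the reference decides which way round the headline reads.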
Row 8 takes low protein intake as the reference point of 1.00 and then works out the ratio of moderate and high protein intake relative to 1.00. Row 9 repeats the Hazard Ratios taken from Table 1, Model 1, of the main paper. Here our relative risk numbers are much closer to the Model 1 HRs – there is an exact match (3.06) for the moderate protein intake group for 50-65 year olds.
Here we find the real headline. What the researchers didn’t want us to find out. The “four times more likely to die” global headline grabber was based on a reference group of six deaths. Yes, six deaths. And not just six deaths – but six deaths over an 18-year study. And the ‘researchers’ tried to claim that animal protein is as bad as smoking based on this?
I warned in the original post about the dangers of basing relative risks on small group sizes. Don’t forget that the researchers didn’t divide the 6,381 people into three even groups: they put 75% of participants into a moderate intake group (which they created) and just 6-7% into a low intake group (which they created). Had just 4 more cancer deaths occurred in the low protein intake group and 4 fewer in the high protein intake group, the relative risk would have halved.
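The group shares, and the fragility of a ratio built on six deaths, can be checked with a short sketch. The group sizes and the six reference deaths come from the text above. The high protein death counts in the loop are purely illustrative placeholders (I am not quoting the study's figure here), chosen only to show that the result barely depends on that number.

```python
# Group sizes from page 10 of the supplemental PDF, as quoted above.
GROUP_SIZES = {
    "50-65": {"low": 219, "moderate": 2277, "high": 543},
    "66+": {"low": 218, "moderate": 2521, "high": 603},
}


def protein_group_shares(sizes):
    """Return total participants and each protein group's share of them."""
    total = sum(n for groups in sizes.values() for n in groups.values())
    shares = {
        level: sum(groups[level] for groups in sizes.values()) / total
        for level in ("low", "moderate", "high")
    }
    return total, shares


def relative_risk(deaths_exposed, n_exposed, deaths_ref, n_ref):
    """Crude relative risk: death proportion in one group over the reference."""
    return (deaths_exposed / n_exposed) / (deaths_ref / n_ref)


def rr_change_factor(high_deaths, low_deaths=6, shift=4):
    """Factor by which the RR changes if `shift` deaths move from high to low."""
    before = relative_risk(high_deaths, 543, low_deaths, 219)
    after = relative_risk(high_deaths - shift, 543, low_deaths + shift, 219)
    return after / before


total, shares = protein_group_shares(GROUP_SIZES)
print(f"Total participants: {total}")
for level, share in shares.items():
    print(f"{level}: {share:.1%}")

# Illustrative high-group death counts only; NOT figures from the study.
for hypothetical_high_deaths in (20, 40, 60):
    factor = rr_change_factor(hypothetical_high_deaths)
    print(f"{hypothetical_high_deaths} high-group deaths: RR multiplied by {factor:.2f}")
```

Most of the drop comes from the reference group alone: going from 6 to 10 deaths multiplies the ratio by 0.6 whatever happens in the high protein group.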
Absolute risk per year
Row 10: The supplemental PDF has a useful table (S1), which gives us person years by protein intake group (not by age group). We can use this to work out an average number for person years of follow-up by protein intake group (total person years divided by the number of people in that group).
Row 11 is row 3 divided by row 10 – this gives the death rate per year of study.
Row 12 calculates back the person years (Row 1 x Row 10) as a sense check. The person years in this row add up to 83,315. Total person years in table S1 in the supplementary PDF is given as 83,308. We’re almost bang on – not bad – given how little help we’re getting.
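The sense check itself is simple arithmetic, sketched below using only the totals quoted above. The overall average follow-up is my own extra calculation (total person years over total participants), not a figure taken from the paper.

```python
# Totals quoted in the text.
TOTAL_PEOPLE = 219 + 2277 + 543 + 218 + 2521 + 603  # supplemental PDF, page 10
PERSON_YEARS_TABLE_S1 = 83_308       # total given in table S1
PERSON_YEARS_RECALCULATED = 83_315   # sum of row 12 in my table

# Gap between the recalculated person years and the published total.
gap = PERSON_YEARS_RECALCULATED - PERSON_YEARS_TABLE_S1
relative_gap = gap / PERSON_YEARS_TABLE_S1

# Overall average follow-up per participant (my own derived figure).
mean_follow_up_years = PERSON_YEARS_TABLE_S1 / TOTAL_PEOPLE

print(f"Gap: {gap} person years ({relative_gap:.4%} of the published total)")
print(f"Average follow-up: {mean_follow_up_years:.2f} years per person")
```

A gap this small is what you would expect from rounding the average follow-up in row 10 before multiplying back up.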
So cancer death rates per year are as follows:
|NEW||Cancer deaths per 1,000 person years|
We now take cancer deaths (row 5) and person years (row 12). To compare like with like, we scale every group to the same base of 1,000 person years: cancer deaths divided by person years, multiplied by 1,000, for each protein group and age group.
In the 50-65 year old group there were 2.18 deaths per 1,000 person years in the low protein group, 5.95 deaths per 1,000 person years in the moderate protein group and 7.84 deaths per 1,000 person years in the high protein group. The over 65s had 15.2 deaths per 1,000 person years in the low protein group, 9.75 deaths per 1,000 person years in the moderate protein group and 6.31 deaths per 1,000 person years in the high protein group.
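To put the relative and absolute views side by side, here is a small sketch using only the six rates quoted in the paragraph above. The variable and function names are mine.

```python
# Cancer deaths per 1,000 person years, as quoted in the text.
RATES_PER_1000 = {
    "50-65": {"low": 2.18, "moderate": 5.95, "high": 7.84},
    "66+": {"low": 15.2, "moderate": 9.75, "high": 6.31},
}


def high_vs_low(rates):
    """Return (rate ratio, absolute difference per 1,000 person years)."""
    ratio = rates["high"] / rates["low"]
    difference = rates["high"] - rates["low"]
    return ratio, difference


for age_group, rates in RATES_PER_1000.items():
    ratio, difference = high_vs_low(rates)
    print(
        f"{age_group}: high vs low ratio {ratio:.2f}, "
        f"difference {difference:+.2f} deaths per 1,000 person years"
    )
```

The same data that gives a large-sounding ratio for 50-65 year olds is an absolute difference of under 6 deaths per 1,000 person years, and the direction reverses in the over 65s.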
Please don’t forget association not causation and no plausible mechanism blah blah. But do you think that, using a base of six deaths, 7.84 deaths per 1,000 person years vs 2.18 deaths per 1,000 person years would have held the front page?!
p.s. Only Longo declares his conflict of interest in L-Nutra. Many thanks to the people who commented on my original post to point out that three other authors are also part of the L-Nutra team: Priya Balasubramanian, Sebastian Brandhorst and Luigi Fontana.
p.s.2 I was contacted by an actuary, Dermot, who had put the deaths and person years from above into a mortality spreadsheet that he has developed for his work. From this, Dermot calculated the confidence intervals for the 2 age groups and 3 protein intake groups – as shown in the table below.
Below the table is a chart, which Dermot has produced from the table. The pale blue area on this chart shows how big or small the actual mortality rate could be, with 95% probability (i.e. there is a 5% chance it could be even higher or lower). As Dermot concluded: “It shows that the shaded area for the 3 rates for age 50-65 overlaps considerably, almost allowing a straight line to be drawn at about 0.5. This greatly reduces the statistical reliability of the raw rates.”
|Age group||Person years||Deaths||Observed probability of death (p)||Confidence interval|
|Low : p1||High : p2|
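To give a feel for how wide the uncertainty is around six deaths, here is a rough sketch of a 95% interval for the reference group. This is not Dermot's spreadsheet method, which I don't have. It swaps in a standard Wilson score interval for a proportion, applied to 6 cancer deaths among the 219 low protein 50-65 year olds over the whole follow-up, rather than a rate per person year.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = events / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# Six cancer deaths in the low protein, 50-65 reference group (N=219).
lower, upper = wilson_interval(6, 219)
print(f"Observed proportion: {6 / 219:.2%}")
print(f"Approximate 95% interval: {lower:.2%} to {upper:.2%}")
```

The interval runs from roughly 1.3% to 5.8%, so the upper bound is more than four times the lower bound. Any ratio that uses this group as its 1.00 reference inherits that uncertainty.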