But what about when systematic reviews disagree? When the “he said, she said” of conflicting studies goes meta, it can be even more confusing. New layers of disagreement get piled onto the layers from the original research. Yikes! This post is going to be tough going…
A group of us defined this discordance among reviews as: the review authors disagree about whether or not there is an effect, or the direction of effect differs between reviews. A difference in direction of effect can mean one review gives a “thumbs up” and another a “thumbs down.”
Some people are surprised that this happens. But it’s inevitable. Sometimes you need to read several systematic reviews to get your head around a body of evidence. Different groups of people approach even the same question in different but equally legitimate ways. And there are lots of different judgment calls people can make along the way. Those decisions can change the results the systematic review will get.
Reviewers differ in when and how they searched for studies, and in what types of study and which subjects they looked for. So it’s not at all unusual for groups of reviewers to be looking at different sets of studies for much the same question.
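To make the “different sets of studies” point concrete, here is a minimal sketch of how you might measure the overlap between reviews. The review names and trial labels are made up for illustration, and the Jaccard index is just one possible overlap measure, not a method taken from the reviews discussed here.

```python
from itertools import combinations

# Hypothetical reviews and study labels, for illustration only.
reviews = {
    "Review A": {"trial_1", "trial_2", "trial_3", "trial_4"},
    "Review B": {"trial_2", "trial_3", "trial_5"},
    "Review C": {"trial_3", "trial_6"},
}


def jaccard(a, b):
    """Share of studies two reviews have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def shared_by_all(review_sets):
    """Studies that every review included."""
    return set.intersection(*review_sets.values())


# Overlap score for every pair of reviews.
overlap = {
    (x, y): jaccard(reviews[x], reviews[y])
    for x, y in combinations(reviews, 2)
}

for pair, score in overlap.items():
    print(pair, round(score, 2))
print("In every review:", shared_by_all(reviews))
```

In this toy example only one trial appears in all three reviews, so each review is largely summarizing a different body of evidence.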
It’s a little like watching a game of football where there are several teams on the field at once. Some of the players are on all the teams, but some are playing for only one or two. Each team has goal posts in slightly different places – and each team isn’t necessarily playing by the same rules. And there’s no umpire.
Here’s an example of how you can end up with controversy and people taking different positions even when there’s a systematic review. The reviews in this subset disagree about psychological intervention after trauma to prevent post-traumatic stress disorder (PTSD) or other problems.
The conclusions range from saying debriefing has a large benefit to saying there is no evidence of benefit and it seems to cause some PTSD. Most of the others, but not all, fall somewhere in between, leaning to “we can’t really be sure”. Most are based only on randomized trials, but one has none, and one has a mixture of study types.
The authors are sometimes big independent national or international agencies. A couple of others include authors of the studies they are reviewing. The definition of trauma isn’t the same – they may or may not include childbirth, for example. The interventions aren’t the same.
The quality of evidence is very low. And the biggest discordance – whether or not there is evidence of harm – hinges mostly on how much weight you put on one trial.
In that trial, the people in the debriefing group were at considerably higher risk of PTSD in the first place. Data for more than 20% of the people randomized is missing – and that biases the results too (it’s called attrition bias). You can’t rule out that those people failed to return because they were depressed, for example. If so, that could change the results.
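Here is a rough sketch of why that much missing data matters. All the numbers are hypothetical, not taken from the trial in question: 50 people randomized to each group, 40 followed up in each (20% missing), with 10 PTSD cases in the debriefing group and 6 in the control group. The function names are made up for this example too. It compares the risk ratio from the people who came back with two extreme assumptions about the people who didn’t.

```python
def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk in the treated group divided by risk in the control group."""
    return (events_t / n_t) / (events_c / n_c)


def attrition_scenarios(events_t, seen_t, rand_t, events_c, seen_c, rand_c):
    """Risk ratios under three assumptions about the people lost to follow-up."""
    miss_t = rand_t - seen_t  # missing from the treated group
    miss_c = rand_c - seen_c  # missing from the control group
    return {
        # Only count the people who were followed up.
        "complete_case": risk_ratio(events_t, seen_t, events_c, seen_c),
        # Best case for the treatment: missing treated people were all fine,
        # missing control people all had the outcome.
        "best_case": risk_ratio(events_t, rand_t, events_c + miss_c, rand_c),
        # Worst case for the treatment: the reverse.
        "worst_case": risk_ratio(events_t + miss_t, rand_t, events_c, rand_c),
    }


# Hypothetical numbers, for illustration only.
scenarios = attrition_scenarios(
    events_t=10, seen_t=40, rand_t=50,
    events_c=6, seen_c=40, rand_c=50,
)
for name, rr in scenarios.items():
    print(f"{name}: risk ratio = {rr:.2f}")
```

With these made-up numbers, the risk ratio swings from below 1 (treatment looks protective) to above 1 (treatment looks harmful) depending only on what you assume happened to the missing people. Real reviews use less extreme assumptions, but the point is the same: the missing 20% could be enough to change the conclusion.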