
The evidence underpinning sports performance products: a systematic assessment

Authors: Carl Heneghan, Jeremy Howick, Braden O’Neill, Peter J. Gill, Daniel Lasserson, Deborah Cohen, Ruth Davis, Alison Ward, Adam Smith, Greg Jones, Matthew Thompson
Journal: BMJ Open
Year: 2012
DOI: 10.1136/bmjopen-2012-001702
Citations: 45

TL;DR

The websites of 104 sports products made 431 performance-enhancing claims, yet 53% of the websites making claims cited no references at all. Of the 74 cited studies that could be critically appraised, 84% were at high risk of bias. Most of the sports product claims sampled are therefore backed by poor or non-existent evidence.

What they tested

This was not a test of a single product. Instead, the researchers tested the *evidence base* behind sports performance products. They examined:

**Interventions:** Sports drinks, oral supplements (powders, pills, gels), footwear, clothing (compression gear), and devices (wristbands, bracelets)

**Comparators:** None directly — they compared the claims made by manufacturers against the quality of scientific evidence cited to support those claims

**Outcome measures:** Whether each product claimed to improve strength, speed, endurance, recovery, or reduce muscle fatigue; whether references were provided; and the methodological quality of those references (using the Centre for Evidence-Based Medicine levels of evidence and Cochrane risk of bias tools)

The researchers did not measure whether any product actually worked. They measured whether the *evidence* provided by manufacturers was trustworthy.

Who was studied

This was a systematic review of advertising claims and their supporting references, not a study of human participants. However, the researchers examined the *participants* in the 74 studies they were able to critically appraise:

Participants in the underlying studies were categorised as: "regular people" who do not exercise or compete seriously in sport; amateur athletes (including "regular people" who exercise seriously); and sports professionals

The researchers did not report exact participant numbers across all studies, but the studies they appraised ranged from small (often fewer than 20 participants) to moderate-sized trials

The setting was the UK and the US: the researchers examined UK and US magazines (top 100 general magazines and top 10 sport/fitness magazines) and the associated websites of products advertised within them

How they measured it

The researchers used a systematic, multi-step process:

**Magazine screening:** One reviewer examined every page of 92 magazines (March 2012 issues) to identify adverts. A second team of reviewers confirmed which adverts were sports-related. A third round identified which adverts made specific performance-enhancing claims.

**Website analysis:** For each product making a claim, researchers viewed all associated web pages (1,035 total) and extracted every performance claim (431 total) and every reference cited (146 total).

**Evidence quality assessment:** They used the Centre for Evidence-Based Medicine (CEBM) Levels of Evidence, where Level 1 = systematic reviews of randomised trials, Level 2 = individual randomised trials, Level 3 = non-randomised studies, Level 4 = case series, and Level 5 = mechanistic reasoning (expert opinion, animal studies).

**Risk of bias assessment:** They used the Cochrane Collaboration's tool, assessing studies as high, unclear, or low risk of bias based on: clear hypothesis, control group, power calculation, randomisation, allocation concealment, intention-to-treat analysis, and blinding of investigators, outcome assessors, and participants.

**Manufacturer contact:** Researchers emailed all 42 manufacturers asking them to confirm whether the list of claims and references was complete, whether additional data existed, and whether unpublished research was available. Only 16 responded.
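
The risk-of-bias checklist described above lends itself to a simple per-item tally, which is how figures such as "x% used randomisation" arise. The following is a minimal sketch, not the authors' procedure: the `Appraisal` record, the `domain_rates` helper, and the four example records are all hypothetical, and blinding is collapsed into a single flag for brevity. It does not attempt to reproduce how an overall high, unclear, or low judgement was reached.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Appraisal:
    """One appraised study; each flag is True if the item was clearly reported."""
    clear_hypothesis: bool
    control_group: bool
    power_calculation: bool
    randomisation: bool
    allocation_concealment: bool
    intention_to_treat: bool
    blinding: bool


def domain_rates(appraisals):
    """Return, for each checklist item, the share of studies that reported it."""
    n = len(appraisals)
    if n == 0:
        raise ValueError("need at least one appraisal")
    return {
        f.name: sum(getattr(a, f.name) for a in appraisals) / n
        for f in fields(Appraisal)
    }


# Made-up records for illustration only; these are not studies from the review.
example = [
    Appraisal(True, True, False, True, False, False, False),
    Appraisal(True, True, False, False, False, False, False),
    Appraisal(True, False, False, True, True, True, True),
    Appraisal(False, True, False, True, False, False, False),
]

for item, rate in domain_rates(example).items():
    print(f"{item:24s} {rate:.0%}")
```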

Methodology

**Study design:** This was a systematic review — a study that collects, evaluates, and synthesises all available evidence on a topic using pre-specified, reproducible methods. It is not a meta-analysis (which pools data statistically), because the studies were too heterogeneous and mostly too poor to combine.

**Sampling strategy:** The researchers selected the top 100 general magazines and top 10 sport/fitness magazines in the UK and USA by circulation (Magazine Audit Bureau of Circulations). This was a purposive sample designed to capture products marketed to the general public, not niche bodybuilding magazines. They examined only the March 2012 issues — a single month snapshot.

**Data extraction:** Five reviewers extracted data using a standardised template. All data were checked by a second reviewer. One reviewer acted as custodian and checked all entries for consistency. This reduces transcription errors.

**Quality assessment:** Two reviewers independently assessed each study's risk of bias. Discrepancies were resolved by consultation with additional reviewers. This is standard best practice for systematic reviews.

**What this design can prove:**

It can show that the *evidence base* for the sports performance products sampled is poor

It can quantify how many claims go unsupported, how many studies are flawed, and what specific methodological weaknesses are most common

It can identify patterns (e.g., which product categories are worst offenders, whether athlete endorsements substitute for evidence)

**What this design cannot prove:**

It cannot prove that any specific product does or does not work — it only shows that the evidence provided by manufacturers is insufficient to know

It cannot rule out that some products might be effective — it only shows that the studies cited are too weak to trust

It cannot estimate effect sizes for any product — the review did not pool data

It cannot assess unpublished negative studies that manufacturers chose not to cite

**Major methodological strengths:**

Systematic, reproducible methods

Independent duplicate data extraction and quality assessment

Attempted to contact manufacturers for additional evidence

Used validated quality assessment tools (Cochrane risk of bias, CEBM levels)

**Major methodological weaknesses:**

Single month snapshot (March 2012) — may miss seasonal products or campaigns

Only UK and US magazines — may not generalise to other countries

Manufacturers were given limited time to respond — the authors acknowledge that more time might have yielded more references

The sample may be biased toward the worst end of the spectrum (as the authors acknowledge), since better products might not need magazine advertising

No assessment of whether the claims themselves were biologically plausible, only of whether evidence was provided and whether that evidence was good

Key findings

**Claims and references:**

92 magazines examined, containing 1,807 adverts

615 (34%) advertised sports products

235 products were included after excluding non-product-specific adverts

54 (23%) of these made 113 enhanced performance or recovery claims in magazines

Of these 113 magazine claims, only 3 (2.7%) provided references (all for one product: ACCELERADE)

22 (42%) products were endorsed by athletes

6 (12%) made direct comparisons with other products

Only 3 provided disclaimers (all US-based, citing FDA: "these statements have not been evaluated by the FDA")

**Website analysis:**

1,035 web pages viewed for 104 different products

431 performance-enhancing claims identified (median 7 per product, range 0–65)

146 references cited across all websites

52.8% of websites that made claims provided zero references

One site (Powerade GB) provided 46 references — approximately one-third of all references found

**Manufacturer responses:**

42 companies contacted, 16 responded

2 refused to share research (Panache, New Balance)

1 said a video of the product in use was "sufficient" (Nike)

1 pointed to a researcher's work but not their own product (Merrell)

1 directed back to their website

9 provided additional material: 2 published studies, 1 in-press study, 2 unpublished studies, and 4 bibliographies

**Quality of evidence:**

None of the 146 references were systematic reviews (Level 1 evidence)

72 of 146 references (49.3%) could not be critically appraised at all (reasons included: not retrievable, not relevant to the claim, or not a study)

Of the 74 studies that were critically appraised:

- 84% (62/74) were judged at high risk of bias

- 58.1% (43/74) used randomisation

- Only 6.8% (5/74) clearly reported allocation concealment

- Only 27.0% (20/74) clearly reported blinding of investigators, outcome assessors, or participants

- Only 2.7% (2/74) were judged to be of high quality and at low risk of bias
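
The percentages in this list can be recomputed directly from the counts reported above. The short check below uses only those counts; the `pct` helper and the dictionary labels are illustrative names, not anything from the paper.

```python
APPRAISED = 74          # studies that could be critically appraised
TOTAL_REFERENCES = 146  # references cited across all websites

# Counts as reported in the list above.
counts = {
    "high risk of bias": 62,
    "used randomisation": 43,
    "allocation concealment clearly reported": 5,
    "blinding clearly reported": 20,
    "high quality and low risk of bias": 2,
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


for label, k in counts.items():
    print(f"{label}: {k}/{APPRAISED} = {pct(k, APPRAISED)}%")

not_appraised = TOTAL_REFERENCES - APPRAISED
print(f"not appraised: {not_appraised}/{TOTAL_REFERENCES} = "
      f"{pct(not_appraised, TOTAL_REFERENCES)}%")
```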

**Participant types in appraised studies:**

Most studies used amateur athletes or "regular people who exercise seriously"

Few studies used elite/professional athletes

Adverse events were rarely reported

Effect magnitude

This review does not report effect sizes because it did not pool results from individual studies. However, the *practical* effect size is stark:

If you pick any sports product at random from a magazine or website, there is roughly a 50% chance the manufacturer provides zero evidence at all

If they do provide evidence that can be appraised, there is an 84% chance that the cited study is at high risk of bias, meaning its results are likely unreliable

Only about 1 in 37 appraised studies (2 of 74, 2.7%) was judged to be of high quality and at low risk of bias

In plain English: as a rough extrapolation from these figures, the chance that a randomly selected sports product claim is backed by trustworthy science is on the order of 1–3%. The rest are either unsupported or supported by studies that are too flawed to rely on.
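
One way to see where a figure of this order can come from is to divide the two high-quality studies by each plausible denominator. This is a reading of the reported counts, not a calculation the paper itself presents, and the variable names are illustrative. Note that the unit is cited references, not products or claims, which is why the result is only a rough guide.

```python
REFERENCES = 146                  # all references cited on websites
APPRAISED = 74                    # references that could be critically appraised
LOW_RISK = 2                      # judged high quality and at low risk of bias
SHARE_SITES_WITHOUT_REFS = 0.528  # websites making claims but citing nothing

# Upper end: share of appraised studies that were trustworthy.
share_of_appraised = LOW_RISK / APPRAISED

# Lower end: share of all cited references that were trustworthy.
share_of_all_refs = LOW_RISK / REFERENCES

# For context: share of claim-making websites that cited anything at all.
share_sites_with_refs = 1 - SHARE_SITES_WITHOUT_REFS

print(f"of appraised studies: {share_of_appraised:.1%}")
print(f"of all references:    {share_of_all_refs:.1%}")
print(f"sites citing any reference: {share_sites_with_refs:.1%}")
```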

Limitations

**Acknowledged by authors:**

The sample of products may be at the "worst end of the spectrum" — better products might not need magazine advertising

Manufacturers were given limited time to respond; more time might have yielded additional references

Only UK and US magazines were sampled; results may not generalise globally

Only March 2012 issues were examined; seasonal variation in advertising is possible

**Additional limitations a critical reader should note:**

The review is from 2012 — the evidence base may have changed (though subsequent investigations suggest it has not improved substantially)

The researchers did not assess whether the *claims themselves* were biologically plausible — a product could make a modest, plausible claim supported by weak evidence, and still be counted as "poor evidence"

The review did not assess harms or side effects systematically — only whether adverse events were reported in the cited studies

The researchers did not evaluate the products directly — a product could work even if its manufacturer provides poor evidence (though you'd have no way to know)

The definition of "performance-enhancing claim" was broad — some claims might be trivial (e.g., "this shoe provides cushioning") while others are substantial (e.g., "this drink will make you run 10% faster")

The review did not assess whether the claims matched the evidence — a manufacturer might cite a study on ingredient X to support a claim about product Y, which is a logical error the review didn't fully capture

Industry funding of the underlying studies was not systematically assessed, though it is known to correlate with positive results

Practical takeaways

For someone running their own n=1 experiment:

**What to test:**

Pick ONE specific product claim that is measurable and plausible. Do not test vague claims like "improves recovery" — test something concrete like "reduces perceived muscle soreness 24 hours after a workout" or "improves 5 km run time by 30 seconds"

Good candidates for self-testing: sports drinks vs. water, caffeine vs. placebo, compression garments vs. normal clothing, specific supplements (creatine, beta-alanine) that have SOME evidence outside of manufacturer-funded studies

Avoid products that make multiple simultaneous claims — you won't know what caused any effect

**Minimum meaningful duration:**

For acute effects (e.g., energy drinks, pre-workout supplements): test at least 3–5 separate sessions with the product and 3–5 without, alternating in a randomised order

For chronic effects (e.g., recovery supplements, compression gear worn during training): minimum 2–4 weeks per condition, ideally 4–6 weeks

For strength or endurance gains: minimum 6–8 weeks per condition, as physiological adaptations take time

The longer you test, the more reliable your results — but balance this against the risk of losing motivation
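
A randomised session order for the acute-effect case can be generated in a few lines. This sketch shuffles a balanced list rather than flipping a coin each session, which is a deliberate substitution: independent coin flips can leave you with unequal numbers of product and placebo sessions. The function name `make_schedule`, its parameters, and the seed value are hypothetical.

```python
import random


def make_schedule(sessions_per_condition=5,
                  conditions=("product", "placebo"),
                  seed=None):
    """Return a shuffled list in which every condition appears equally often."""
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    schedule = list(conditions) * sessions_per_condition
    rng.shuffle(schedule)
    return schedule


schedule = make_schedule(sessions_per_condition=5, seed=2012)
for number, condition in enumerate(schedule, start=1):
    print(f"session {number}: {condition}")
```

Generate the schedule once, before the first session, and stick to it. Re-rolling after seeing early results defeats the purpose of randomising.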

**What to measure (specific metrics):**

Primary outcome: Choose ONE objective measure (e.g., time to complete a fixed distance, number of reps at a fixed weight, heart rate at a fixed intensity)

Secondary outcomes: Subjective measures (e.g., rating of perceived exertion on a 1–10 scale, "how sore do you feel 24 hours later?" on a 1–10 scale), sleep quality, mood, motivation

Objective measures are better than subjective ones — they are less susceptible to placebo effects

Measure at the same time of day, under the same conditions (same temperature, same time since last meal, same warm-up)
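
Recording every session in the same structure makes it easier to check afterwards that conditions really were comparable. The record below is one possible layout, not a prescribed one: the `Session` fields, the `write_log` helper, and the example row are all hypothetical, and the outcome is assumed to be a time in seconds.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Session:
    date: str                 # ISO date of the session
    condition_code: str       # neutral label such as "A" or "B"
    start_time: str           # keep this the same across sessions
    primary_outcome_s: float  # e.g. time over a fixed distance, in seconds
    soreness_24h: int         # subjective rating, 1-10
    sleep_hours: float
    hours_since_meal: float
    temperature_c: float


def write_log(sessions, handle):
    """Write sessions as CSV with one header row and one row per session."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(Session)])
    writer.writeheader()
    for session in sessions:
        writer.writerow(asdict(session))


# Made-up example row, written to memory here; pass an open file to keep a real log.
buffer = io.StringIO()
write_log([Session("2024-03-01", "A", "07:30", 1512.0, 4, 7.5, 2.0, 12.0)], buffer)
print(buffer.getvalue())
```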

**Key confounds to control for:**

**Order effects:** If you always test the product first and placebo second, any difference could be due to practice (which favours later sessions) or fatigue (which penalises them) rather than the product. Randomise the order, for example by flipping a coin each session

**Expectation:** If you know you're taking the product, your belief can create a real effect (placebo). Use blinding if possible — have someone else prepare your drinks/pills so you don't know which is which

**Diet and sleep:** These affect performance massively. Keep them as consistent as possible across all test sessions. Log what you ate and how long you slept each night

**Training load:** Don't change your training programme during the experiment. If you're testing a recovery product, keep your workouts identical in intensity and duration

**Hydration status:** Especially important for sports drink tests. Standardise water intake before each session

**Time of day:** Performance varies by circadian rhythm. Test at the same time each day

**Menstrual cycle (if applicable):** Can affect strength, endurance, and perceived exertion. Either test across a full cycle (4+ weeks per condition) or schedule tests in the same phase
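
The blinding step can be made concrete with a small allocation key that the helper generates and keeps private. This is a sketch under stated assumptions: `make_blinding_key` is a hypothetical helper, and it assumes the product and placebo can be made to look and taste alike, which is often the hard part in practice.

```python
import random


def make_blinding_key(conditions=("product", "placebo"), seed=None):
    """Map neutral codes (A, B, ...) to conditions in a random order."""
    rng = random.Random(seed)
    shuffled = list(conditions)
    rng.shuffle(shuffled)
    codes = [chr(ord("A") + i) for i in range(len(shuffled))]
    return dict(zip(codes, shuffled))


# The helper runs this, labels the containers, and keeps the key to themselves.
key = make_blinding_key()

# The person being tested only ever sees the codes.
print(sorted(key))
```

The key is revealed only after the last session has been logged, so that neither expectations nor the analysis can be influenced by knowing which code was the product.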

**What a positive result would look like:**

For an acute effect: You consistently perform better (e.g., run 15–30 seconds faster on a 5 km, or lift 1–2 more reps) on at least 4 out of 5 product sessions compared to placebo sessions

For a chronic effect: Your performance improves steadily over 4–6 weeks with the product, and plateaus or improves more slowly without it

The effect should be larger than your normal day-to-day variation. If your 5 km time varies by ±45 seconds normally, a 20-second improvement might just be noise. Track your baseline variability first

A "positive" result in a well-designed n=1 trial can be more trustworthy for your own decision than most of the studies in this review, because you control for your own biology, you test the actual product (not a related ingredient), and you can replicate the finding. It says nothing, however, about how the product works for anyone else
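
The comparison against day-to-day variation can be sketched as follows. A permutation test is used here as one reasonable choice for small samples; it is an assumption of this example, not a method taken from the review. The helper `permutation_p_value` and all the times below are made up for illustration.

```python
import random
from statistics import mean, stdev


def permutation_p_value(product, placebo, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in mean outcome."""
    rng = random.Random(seed)
    observed = abs(mean(product) - mean(placebo))
    pooled = list(product) + list(placebo)
    k = len(product)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign condition labels at random
        diff = abs(mean(pooled[:k]) - mean(pooled[k:]))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up 5 km times in seconds; lower is better.
baseline = [1510, 1475, 1540, 1498, 1522]  # sessions before the experiment
product = [1480, 1472, 1495, 1469, 1488]
placebo = [1515, 1499, 1530, 1507, 1521]

noise = stdev(baseline)                # normal day-to-day variation
gain = mean(placebo) - mean(product)   # positive means faster with the product
p = permutation_p_value(product, placebo)

print(f"baseline SD: {noise:.1f} s, mean gain: {gain:.1f} s, p ~ {p:.3f}")
```

With only five sessions per condition, the smallest p-value this test can produce is about 0.008, so a handful of sessions can suggest an effect but cannot establish one firmly. A gain smaller than the baseline standard deviation should be treated as indistinguishable from noise.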

**Bottom line:** Before spending money on any sports product, assume the advertised claims are unsupported. If you want to know if something works for *you*, run your own blinded, randomised, repeated test. The evidence from manufacturers is not worth trusting.
