
# Working memory training in typically developing children: A meta-analysis of the available evidence.

**Authors:** Sala G, Gobet F

**Journal:** Dev Psychol

**Year:** 2017

**Citations:** 216

## TL;DR

Working memory training in typically developing children produces moderate improvements on tasks very similar to the training itself (near-transfer, effect size g = 0.46), but produces negligible to zero improvement on real-world outcomes like intelligence, maths, reading, or attention (far-transfer, effect size g = 0.12). For someone running a self-experiment, this means you should expect to get better at the training game itself, but don't count on it boosting your general cognitive abilities or academic performance.

## What they tested

This is a meta-analysis — a statistical synthesis of 26 separate experimental studies. The researchers tested whether computerised working memory (WM) training programmes (e.g., Cogmed, n-back tasks, span tasks) improve:

**Near-transfer outcomes:** Tasks closely related to what was trained, such as visuospatial working memory, verbal working memory, and short-term memory (both verbal and visuospatial).

**Far-transfer outcomes:** Tasks outside the domain of working memory, including:

- Fluid intelligence (e.g., Raven's Progressive Matrices)

- Attention/inhibition (e.g., Stroop task, go/no-go task)

- Academic skills: mathematics (arithmetic, problem-solving) and literacy (reading comprehension, spelling)

- Other cognitive skills (e.g., processing speed)

The comparator was either an active control group (children who did a different computer task, like a low-difficulty version of the training or a non-WM game) or a passive control group (children who received no intervention or continued with normal classroom activities). All studies compared trained children against these control groups at post-test.

## Who was studied

**Total sample:** 2,401 typically developing children across 26 studies.

**Age range:** 3 to 16 years old.

**Population:** Typically developing children only — studies of children with ADHD, learning disabilities, or other clinical diagnoses were excluded.

**Setting:** Mostly school-based interventions in Europe, North America, and Australia. Some studies were conducted in laboratory settings.

**Key exclusion:** Children with diagnosed working memory deficits, ADHD, autism, or other neurodevelopmental conditions were not included. This meta-analysis specifically focuses on children without any known cognitive impairment.

## How they measured it

The meta-analysis extracted effect sizes from each study using standardised mean differences (Hedges' g). The specific instruments varied across studies, but included:

**Near-transfer measures:**

- Automated Operation Span Task (OSPAN)

- Backwards digit span (verbal WM)

- Corsi block-tapping task (visuospatial WM)

- Forward digit span (verbal STM)

- Dot matrix task (visuospatial STM)

**Far-transfer measures:**

- Fluid intelligence: Raven's Progressive Matrices, Cattell Culture Fair Test, Kaufman Brief Intelligence Test (KBIT)

- Attention/inhibition: Stroop task (colour-word interference), go/no-go task (response inhibition), Flanker task

- Mathematics: Woodcock-Johnson Applied Problems, KeyMath, curriculum-based maths tests

- Literacy: Woodcock-Johnson Reading Comprehension, word reading tests, spelling tests

**Follow-up measures:** Some studies re-tested children 3–12 months after training ended.
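The standardised mean difference mentioned at the start of this section can be sketched as follows. This is a minimal illustration of the standard Hedges' g formula with its usual small-sample correction, not the authors' extraction code; the function names and the example numbers are hypothetical.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference with Hedges' small-sample correction."""
    df = n_t + n_c - 2
    # Pooled variance across the trained (t) and control (c) groups
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    # Common approximation of the correction factor J
    j = 1 - 3 / (4 * df - 1)
    return j * d

def hedges_g_variance(g, n_t, n_c):
    """Approximate sampling variance of g (needed later for pooling)."""
    df = n_t + n_c - 2
    j = 1 - 3 / (4 * df - 1)
    d = g / j
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j**2 * var_d

# Hypothetical study: digit span 5.2 vs 4.7, SD 1.0, 30 children per group
g = hedges_g(5.2, 1.0, 30, 4.7, 1.0, 30)
print(round(g, 3), round(hedges_g_variance(g, 30, 30), 4))
```

With these made-up numbers the raw difference is half a standard deviation, and the correction shrinks it slightly.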

## Methodology

**Study design:** Meta-analysis of 26 experimental studies (randomised controlled trials and quasi-experimental designs).

**Randomisation:** 18 of the 26 studies used random assignment to groups; 8 studies did not (they used convenience assignment, e.g., whole classrooms assigned to condition).

**Blinding:** The meta-analysis coded whether studies used blinding of assessors (i.e., the person administering the post-test didn't know which group the child was in). Only a minority of studies reported blinding. No studies blinded participants (children obviously knew they were doing a training programme).

**Control groups:** 15 studies used an active control group (children did a different computerised task, often a low-difficulty version of the training or a non-WM game). 11 studies used a passive control group (no intervention, or business-as-usual classroom activities).

**Duration of training:** Ranged from 2 to 12 weeks. Most programmes involved 15–25 sessions of 15–45 minutes each, typically 3–5 times per week.

**Statistical approach:** The authors used random-effects meta-analysis (which assumes true effects vary across studies) and calculated Hedges' g effect sizes. They tested for publication bias using funnel plots and Egger's regression test. They also ran moderator analyses to see if study quality (randomisation, active vs passive control) affected results.
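As a rough illustration of random-effects pooling, the sketch below uses the DerSimonian-Laird estimator. That choice is an assumption: the summary does not say which estimator the authors used. The function name and the example inputs are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool study effect sizes with DerSimonian-Laird random-effects weights."""
    k = len(effects)
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, floored at 0
    # Random-effects weights add tau2 to each study's own variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return pooled, ci, tau2

# Hypothetical effect sizes and variances from four small studies
pooled, ci, tau2 = random_effects_pool([0.60, 0.10, 0.35, 0.05],
                                       [0.07, 0.05, 0.06, 0.04])
print(round(pooled, 3), [round(x, 3) for x in ci], round(tau2, 4))
```

When the studies disagree more than sampling error alone would predict, `tau2` rises above zero and the weights become more equal across studies.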

**What this design can and cannot prove:**

**Can prove:** The overall average effect of WM training across many studies, with statistical power to detect small effects. The moderator analysis can show whether study design features (like randomisation) influence results.

**Cannot prove:** Causality at the individual level. Meta-analysis aggregates group averages — it cannot tell you whether *your* child (or you) will benefit. It also cannot rule out that some specific training programmes or specific populations might work, because the analysis averages across all programmes and all typically developing children.

**Major methodological weakness flagged by the authors:** The quality of the included studies varied substantially. Studies without random assignment and studies using passive control groups showed larger (likely inflated) effects. When only high-quality studies (randomised + active control) were analysed, far-transfer effects essentially disappeared (g = 0.04, not statistically significant).

## Key findings

**Primary outcome — Near-transfer (WM-related tasks):**

Overall effect: g = 0.46, 95% CI [0.30, 0.62], p < 0.001

This is a moderate effect. Children who did WM training scored about half a standard deviation higher on WM tests than controls.

However, this effect was significantly smaller in studies with active control groups (g = 0.28) compared to passive control groups (g = 0.68), suggesting that some of the "improvement" is due to placebo effects or general computer familiarity.

**Primary outcome — Far-transfer (non-WM tasks):**

Overall effect: g = 0.12, 95% CI [0.04, 0.20], p = 0.004

This is a very small effect. For context, assuming normally distributed scores, g = 0.12 places the average trained child at roughly the 55th percentile of the control group, about 5 percentile points above the average control child.

When only studies with random assignment AND active control groups were analysed: g = 0.04, 95% CI [-0.06, 0.14], p = 0.43 — not statistically significant.

**Secondary outcomes — Specific far-transfer domains:**

Fluid intelligence: g = 0.07, 95% CI [-0.04, 0.18], p = 0.21 — not significant

Attention/inhibition: g = 0.10, 95% CI [-0.04, 0.24], p = 0.16 — not significant

Mathematics: g = 0.08, 95% CI [-0.06, 0.22], p = 0.26 — not significant

Literacy: g = 0.07, 95% CI [-0.08, 0.22], p = 0.36 — not significant

Other cognitive skills: g = 0.19, 95% CI [0.02, 0.36], p = 0.03 — small but significant, though this category was a mixed bag of different tests
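The p-values in this list can be approximately recovered from each effect size and its confidence interval. The sketch below assumes a symmetric normal-theory (Wald) interval, which may not be exactly how the authors computed theirs; `p_from_ci` is an illustrative name.

```python
import math

def p_from_ci(g, lower, upper):
    """Recover SE, z and a two-sided p-value from an effect and its 95% CI."""
    se = (upper - lower) / (2 * 1.96)  # CI half-width divided by 1.96
    z = g / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

domains = {
    "Fluid intelligence": (0.07, -0.04, 0.18),
    "Attention/inhibition": (0.10, -0.04, 0.24),
    "Mathematics": (0.08, -0.06, 0.22),
    "Literacy": (0.07, -0.08, 0.22),
    "Other cognitive skills": (0.19, 0.02, 0.36),
}
for name, (g, lo, hi) in domains.items():
    se, z, p = p_from_ci(g, lo, hi)
    print(f"{name}: SE={se:.3f}, z={z:.2f}, p={p:.2f}")
```

Under that assumption, the recomputed values agree with the reported p-values to two decimal places, which is a quick sanity check that the intervals and p-values are internally consistent.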

**Moderator analyses:**

Type of control group: Studies with passive controls showed significantly larger effects than those with active controls (p = 0.02 for near-transfer; p = 0.04 for far-transfer).

Randomisation: Studies without random assignment showed larger effects than randomised studies (p = 0.03 for near-transfer; p = 0.06 for far-transfer, trend level).

Age: No significant moderation — effects were similar across the 3–16 age range.

Training duration: No significant moderation — longer training did not produce larger effects.

**Follow-up effects (3–12 months post-training):**

Near-transfer: g = 0.27, 95% CI [0.08, 0.46], p = 0.005 — still significant but reduced

Far-transfer: g = 0.07, 95% CI [-0.06, 0.20], p = 0.29 — not significant

**Publication bias:**

Funnel plot asymmetry was detected for near-transfer effects (Egger's test p = 0.03), suggesting that small studies with null or negative results may be missing from the literature.

No significant publication bias for far-transfer effects.
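For readers unfamiliar with Egger's test, here is a minimal sketch of the regression behind it: each study's standardised effect (g / SE) is regressed on its precision (1 / SE), and an intercept far from zero suggests funnel plot asymmetry. The data below are invented for illustration, and the function returns only the intercept and its t statistic (k - 2 degrees of freedom), not a p-value.

```python
import math

def egger_intercept(effects, std_errors):
    """Ordinary least squares of (g / SE) on (1 / SE); returns intercept and t."""
    k = len(effects)
    x = [1 / se for se in std_errors]                   # precision
    y = [g / se for g, se in zip(effects, std_errors)]  # standardised effect
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    # Residual variance with k - 2 degrees of freedom
    resid_var = sum((yi - intercept - slope * xi) ** 2
                    for xi, yi in zip(x, y)) / (k - 2)
    se_intercept = math.sqrt(resid_var * (1 / k + mx**2 / sxx))
    return intercept, intercept / se_intercept

# Invented example: smaller studies (larger SE) report larger effects
effects = [0.9, 0.5, 0.45, 0.25, 0.2]
std_errors = [0.4, 0.3, 0.2, 0.15, 0.1]
intercept, t_stat = egger_intercept(effects, std_errors)
print(round(intercept, 2), round(t_stat, 2))
```

In this invented data set the intercept is clearly positive, the pattern one would expect if small studies with null results were missing.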

## Effect magnitude

Translate the numbers into plain English:

**Near-transfer:** If you trained working memory for 4–8 weeks, you'd likely improve your score on a working memory test by about half a standard deviation. That's roughly the difference between a child at the 50th percentile and a child at the 69th percentile. But this improvement is mostly on tasks that look very similar to the training — it's like getting better at the game itself.

**Far-transfer:** The average effect of g = 0.12 means that if you took 100 children who did WM training and 100 who didn't, the trained children would score, on average, only about 2 points higher on an IQ test (0.12 × 15 = 1.8, assuming a standard deviation of 15). That's negligible: less than the typical test-retest variability. For maths and reading, the effect was essentially zero.

**Practical meaning:** To get a meaningful boost in intelligence or school performance, you'd need an effect size of at least 0.3–0.5. The observed far-transfer effect (g = 0.12) is roughly 2.5 to 4 times smaller than that. In plain terms: WM training does not make typically developing children smarter or better at school.
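The conversions in this section can be reproduced with a few lines. This sketch assumes normally distributed scores with equal spread in both groups; the function names are illustrative.

```python
from statistics import NormalDist

def percentile_of_average_trained(g):
    """Percentile of the control distribution reached by the average trained child."""
    return 100 * NormalDist().cdf(g)

def points_on_scale(g, sd=15):
    """Convert an effect size to raw points on a scale with the given SD (IQ: 15)."""
    return g * sd

for label, g in [("near-transfer", 0.46), ("far-transfer", 0.12)]:
    print(label,
          f"percentile={percentile_of_average_trained(g):.1f}",
          f"IQ-scale points={points_on_scale(g):.1f}")
```

Under these assumptions, g = 0.46 puts the average trained child at about the 68th percentile of controls, while g = 0.12 corresponds to about the 55th percentile and 1.8 points on an IQ-style scale.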

## Limitations

**What the authors acknowledge:**

The number of studies (26) is modest, especially for subgroup analyses (e.g., only 6–8 studies per far-transfer domain).

Most studies had small sample sizes (median n = 60), meaning individual studies were underpowered to detect small far-transfer effects.

Few studies reported follow-up data (only 8 studies), limiting conclusions about long-term effects.

The meta-analysis could not control for the specific type of WM training (e.g., Cogmed vs n-back vs span tasks) because too few studies used each type.

Publication bias was present for near-transfer effects, meaning the true near-transfer effect may be even smaller than reported.
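To see why a median sample of 60 leaves individual studies underpowered, here is a rough normal-approximation power calculation. It assumes the 60 children are split evenly into two groups of 30 (the summary gives only the total) and a two-sided test at alpha = 0.05; `approx_power` is an illustrative name.

```python
import math
from statistics import NormalDist

def approx_power(g, n_per_group, alpha=0.05):
    """Approximate power of a two-group comparison to detect effect size g."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected z statistic when the true standardised difference is g
    ncp = g * math.sqrt(n_per_group / 2)
    # Probability of landing in either rejection region
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)

print(f"far-transfer  (g=0.12): {approx_power(0.12, 30):.2f}")
print(f"near-transfer (g=0.46): {approx_power(0.46, 30):.2f}")
```

Under these assumptions a single study of this size has under 10% power to detect the far-transfer effect and roughly 43% power for the near-transfer effect, which is why pooling across studies matters.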

**What a critical reader would note:**

**Lack of blinding:** No studies blinded participants, and only a minority reported blinded assessors. Children knew they were in a "brain training" programme, which can create strong expectancy (placebo) effects. Active control groups partially address this, but even active controls may not be perfectly matched for expectations.

**Commercial funding:** Several studies (especially those using Cogmed) were funded or co-authored by the company that sells the training programme. The meta-analysis did not formally test for industry funding as a moderator.

**Selective outcome reporting:** Studies that measure many outcomes may report only the significant ones. The meta-analysis relied on published data, which may overrepresent positive findings.

**Training adherence:** Few studies reported how many sessions children actually completed. Poor adherence could dilute real effects, but the authors note that even studies with high adherence showed no far-transfer.

**Age range is broad:** Combining 3-year-olds with 16-year-olds may mask developmental differences. However, the moderator analysis found no age effect, suggesting the null result is consistent across childhood.

**Only typically developing children:** These results do not apply to children with ADHD, working memory deficits, or learning disabilities, where WM training might have different effects.

## Practical takeaways

For someone running their own n=1 experiment (or for a parent considering WM training for their typically developing child):

### What to test (specific intervention and dose)

**Intervention:** A computerised WM training programme (e.g., dual n-back, Cogmed, or a custom span task). The most common dose in the literature was 20–25 sessions of 20–30 minutes each, done 4–5 times per week for 4–6 weeks.

**What NOT to expect:** Do not expect improvements in IQ, maths, reading, or attention. The evidence says these are unlikely.

**What you CAN expect:** You will likely improve your score on the training task itself and on similar WM tests (e.g., digit span, Corsi blocks). This is real but narrow.

### Minimum meaningful duration

**4 weeks** is a reasonable minimum, matching the low end of the most common dose in the literature (some included studies trained for as little as 2 weeks). Training duration did not moderate effects in the meta-analysis, so don't assume "more is better."

For a self-experiment, run the training for 4–6 weeks, then test. Extending the intervention is unlikely to produce far-transfer: even 12-week programmes showed none on average, though your n=1 result might differ.

### What to measure (specific metrics)

**Primary outcome (near-transfer):** Backwards digit span (verbal WM) and Corsi block-tapping task (visuospatial WM). Measure at baseline and post-training.

**Secondary outcomes (far-transfer):**

- Fluid intelligence: Raven's Progressive Matrices (online version available) or a matrix reasoning subtest

- Attention: Stroop task (online versions exist) or a simple go/no-go reaction time test

- Academic: A standardised maths test (e.g., timed arithmetic) and a reading comprehension test

**Training metric:** Track your score on the training task itself (e.g., n-back level achieved, reaction time, accuracy). This is your manipulation check — if you didn't improve on the training, you didn't actually train WM.

### Key confounds to control for

**Placebo effect:** The biggest confound. If you believe WM training will make you smarter, you may try harder on post-tests. Use an active control condition if possible (e.g., do a different computer game for the same amount of time, then switch).

**Practice effects:** Taking the same test twice will improve your score even without training. Use alternate versions of tests at pre and post, or include a control period where you test twice without training.

**Motivation and fatigue:** Training is boring. If you do 20 sessions, your motivation will wane. Track your effort and mood daily.

**Sleep and nutrition:** These affect cognitive performance. Keep them constant across pre- and post-testing.

**Age and development:** If you're testing a child, natural cognitive development over 4–6 weeks could produce small improvements. A control group (or control period) is essential.
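One simple way to use a control period is to subtract the gain seen without training from the gain seen with training. The sketch below assumes three testing blocks (two baselines separated by a no-training period of the same length as the training period, then a post-training block); the function name and scores are hypothetical.

```python
from statistics import mean

def training_specific_gain(baseline_1, baseline_2, post_training):
    """Gain over the training period minus gain over an equal-length control period.

    Each argument is a list of scores from repeated sessions within one block.
    """
    # Improvement with no training: practice effects, maturation, drift
    control_gain = mean(baseline_2) - mean(baseline_1)
    # Improvement across the training period: the same influences plus training
    training_gain = mean(post_training) - mean(baseline_2)
    return training_gain - control_gain

# Hypothetical backwards digit span scores, two test sessions per block
gain = training_specific_gain([5, 5], [5, 6], [6, 7])
print(gain)
```

A value near zero suggests the improvement is no larger than what retesting alone produced. This does nothing about expectancy effects, which still call for an active control condition.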

### What a positive result would look like

**Near-transfer positive:** Your backwards digit span increases by 1–2 digits (roughly 0.5 SD). Your Corsi span increases by 1–2 blocks. This is realistic and expected.

**Far-transfer positive:** Your Raven's matrices score increases by 2–3 points (out of 60) or your maths test improves by 5–10%. Given the meta-analytic evidence, this is unlikely. If you see it, check for:

- Practice effects (did you use the same test twice?)

- Placebo effects (did you expect improvement?)

- Regression to the mean (was your pre-test score unusually low?)

**What would be convincing:** A far-transfer improvement that exceeds the measure's test-retest noise (e.g., > 0.3 SD), that replicates across multiple tests (e.g., both maths and reading improve), and that shows a clear dose-response relationship (more training sessions = more improvement). Even then, be sceptical: the meta-analysis of 2,401 children found essentially zero far-transfer.
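One way to make "larger than measurement noise" concrete is the Reliable Change Index of Jacobson and Truax. The meta-analysis does not use it; it is offered here only as a possible yardstick for a single person's scores. The normative SD and test-retest reliability in the example are hypothetical placeholders that you would need to look up for your chosen test.

```python
import math

def reliable_change_index(pre, post, norm_sd, retest_r):
    """Change score divided by the standard error of the difference."""
    # Standard error of measurement for a single administration
    sem = norm_sd * math.sqrt(1 - retest_r)
    # Two administrations, each with independent measurement error
    se_diff = math.sqrt(2) * sem
    return (post - pre) / se_diff

# Hypothetical: IQ-style scale (SD 15), retest reliability 0.90, 5-point gain
rci = reliable_change_index(100, 105, norm_sd=15, retest_r=0.90)
print(round(rci, 2), "reliable" if abs(rci) > 1.96 else "within measurement noise")
```

By convention, an absolute value above 1.96 is treated as change unlikely to come from measurement error alone. In this hypothetical case a 5-point gain falls well short of that threshold.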

**Bottom line for your self-experiment:** Train WM if you enjoy the challenge or want to improve your WM test scores. Do not train WM expecting to boost your IQ, grades, or attention. The evidence from 26 studies on 2,401 children says that doesn't work.
