Effects of the abacus-based mental calculation training application "SoroTouch" on cognitive functions: A randomized controlled trial.
- Authors
- Takaoka T, Hashimoto K, Aoki S, Inoue E, Kawate N
- Journal
- PLoS One
- Year
- 2024
- Citations
- 6
TL;DR
A 12-week daily training program using a smartphone app called SoroTouch (which teaches abacus-based mental calculation) produced moderate improvements in working memory and processing speed in healthy adults, but showed no significant effect on general intelligence or executive function — suggesting that practicing mental arithmetic can sharpen specific cognitive skills but does not make you "smarter" overall.
What they tested
The researchers tested whether daily practice with SoroTouch, a mobile application that teaches abacus-based mental calculation (also called "soroban" training), could improve cognitive functions in healthy adults. The intervention was compared against an active control group that played a simple number-matching game (similar to "Find the Pairs" or concentration) for the same amount of time.
**The intervention (SoroTouch):** Participants used the app to learn and practice mental arithmetic using an abacus visualization technique. The app presents arithmetic problems (addition and subtraction of 2–4 digit numbers) and trains users to perform calculations by mentally moving beads on an imaginary abacus. Training progressed from simple to complex problems over the 12-week period.
**The active control (number-matching game):** Participants played a game where they had to match pairs of numbers on a grid, requiring attention and visual search but no arithmetic or mental calculation.
**Primary outcome measures:**
- Working memory (measured by forward and backward digit span tasks)
- Processing speed (measured by the digit symbol substitution test)
- Executive function (measured by the Stroop test and task-switching tasks)
- General intelligence (measured by Raven's Progressive Matrices)
**Secondary outcome measures:**
- Arithmetic ability (speed and accuracy on mental calculation tests)
- Self-reported cognitive function (subjective ratings of concentration and mental clarity)
Who was studied
**Sample size:** 60 healthy adults (30 in the SoroTouch group, 30 in the control group)
**Age range:** 18–35 years (mean age approximately 22 years)
**Gender:** Approximately 50% female, 50% male
**Setting:** University in Japan (participants were recruited from the local community and student population)
**Inclusion criteria:** Right-handed, native Japanese speakers, no history of neurological or psychiatric disorders, no prior experience with abacus training
**Exclusion criteria:** Left-handedness, prior abacus training, color blindness (relevant for the Stroop test), current use of psychoactive medications, self-reported cognitive impairment
**Dropout rate:** 4 participants dropped out (2 from each group), leaving 56 completers (28 per group)
How they measured it
All cognitive tests were administered at baseline (week 0), midpoint (week 6), and post-intervention (week 12). The tests were conducted in a quiet laboratory setting by trained research assistants.
**Primary cognitive tests:**
**Digit Span (forward and backward):** Participants heard a sequence of digits and had to repeat them in the same order (forward) or reverse order (backward). Score = maximum number of digits correctly recalled. This measures working memory capacity.
**Digit Symbol Substitution Test (DSST):** A paper-and-pencil test where participants matched symbols to digits using a key. Score = number of correct symbols completed in 90 seconds. This measures processing speed and visual scanning.
**Stroop Test:** Participants named the ink color of color words (e.g., the word "RED" printed in blue ink — correct answer is "blue"). Measured reaction time and accuracy. This measures inhibitory control (a component of executive function).
**Task-switching paradigm:** A computer-based test where participants alternated between two simple tasks (e.g., judging whether a number is odd/even vs. whether it is greater/less than 5). Measured switch cost (reaction time difference between switch and repeat trials). This measures cognitive flexibility.
**Raven's Progressive Matrices (short form):** A 30-item test of abstract reasoning and fluid intelligence. Score = number of correct items.
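The switch cost used in the task-switching paradigm above is a simple difference of mean reaction times. A minimal sketch, assuming each trial is recorded as a (type, reaction time, correct) tuple and that only correct trials are scored (a common convention, but an assumption here; the trial values are made up for illustration):

```python
from statistics import mean

def switch_cost(trials):
    """Mean RT on switch trials minus mean RT on repeat trials, in ms.

    `trials` is a list of (trial_type, rt_ms, correct) tuples. Error trials
    are dropped before averaging (an assumed scoring rule).
    """
    switch = [rt for kind, rt, ok in trials if kind == "switch" and ok]
    repeat = [rt for kind, rt, ok in trials if kind == "repeat" and ok]
    return mean(switch) - mean(repeat)

# Illustrative trials only, not data from the study.
example_trials = [
    ("repeat", 600, True),
    ("switch", 800, True),
    ("repeat", 640, True),
    ("switch", 780, True),
    ("switch", 950, False),  # error trial, excluded from the average
]
cost = switch_cost(example_trials)
print(cost)  # 170
```

A smaller switch cost after training would indicate better cognitive flexibility.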
**Secondary measures:**
**Mental calculation test:** 50 arithmetic problems (addition and subtraction of 2–4 digit numbers) to be solved as quickly and accurately as possible. Score = number correct and average reaction time.
**Subjective cognitive questionnaire:** A 10-item self-report scale asking about perceived concentration, mental clarity, and cognitive fatigue (1–7 Likert scale).
Methodology
**Study design:** This was a two-arm, parallel-group, randomized controlled trial (RCT) with an active control condition. Participants were randomly assigned to either the SoroTouch training group or the number-matching game control group.
**Randomization:** Participants were randomly assigned using a computer-generated random number sequence. The randomization was stratified by baseline digit span score (high vs. low) to ensure balanced working memory capacity across groups. The allocation was concealed from the researchers who conducted the baseline assessments (they did not know which group a participant would be assigned to until after baseline testing was complete).
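Stratified randomization of this kind can be sketched as follows. The scheme shown (shuffle within each stratum, then alternate arms) is one simple way to get balance and is an assumption; the summary does not describe the trial's exact procedure, and the participant ids and the 30/30 split between strata are hypothetical.

```python
import random

def stratified_assign(participants, seed=0):
    """Assign participants to 'SoroTouch' or 'control', balanced within strata.

    `participants` maps a participant id to a stratum label ('high' or 'low'
    baseline digit span). Ids are shuffled within each stratum and then
    alternated between arms, so each stratum is split as evenly as possible.
    """
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    assignment = {}
    for stratum in sorted(set(participants.values())):
        ids = sorted(pid for pid, s in participants.items() if s == stratum)
        rng.shuffle(ids)
        for i, pid in enumerate(ids):
            assignment[pid] = "SoroTouch" if i % 2 == 0 else "control"
    return assignment

# Hypothetical cohort of 60: half labelled 'high', half 'low'.
cohort = {f"P{i:02d}": ("high" if i < 30 else "low") for i in range(60)}
groups = stratified_assign(cohort)
n_soro = sum(1 for g in groups.values() if g == "SoroTouch")
print(n_soro, len(groups) - n_soro)  # 30 30
```

Because the split happens inside each stratum, both arms end up with the same number of high-span and low-span participants.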
**Blinding:**
**Participants:** Not blinded — they knew whether they were using SoroTouch or the number-matching game. This is a significant limitation because participants in the SoroTouch group may have had higher expectations of cognitive improvement (placebo effect).
**Assessors:** The research assistants who administered the cognitive tests were blinded to group assignment. They did not know which intervention each participant had received.
**Data analysts:** The statistician who analyzed the data was blinded to group assignment (the groups were coded as "A" and "B" until after the primary analysis was completed).
**Duration and dosage:**
**Total duration:** 12 weeks (84 days)
**Training frequency:** Daily, 7 days per week
**Training duration per session:** 15 minutes per day (recorded by the app's usage logs)
**Total training time:** Approximately 21 hours over 12 weeks
**Compliance monitoring:** The app recorded the number of problems completed and time spent each day. Participants who missed more than 3 consecutive days were contacted by the research team. Average compliance was 89% (i.e., participants completed about 75 out of 84 days).
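The dosage figures above follow from simple arithmetic. The sketch below reproduces them; the last value (hours actually trained at 89% compliance) is a derived estimate that assumes every completed day was a full 15-minute session and every missed day had no training.

```python
DAYS = 12 * 7            # 12 weeks of daily training
MINUTES_PER_DAY = 15
COMPLIANCE = 0.89        # average compliance reported above

planned_hours = DAYS * MINUTES_PER_DAY / 60
days_completed = COMPLIANCE * DAYS
estimated_hours = days_completed * MINUTES_PER_DAY / 60  # derived, not reported

print(DAYS, planned_hours, round(days_completed), round(estimated_hours, 1))
# 84 21.0 75 18.7
```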
**Statistical approach:**
- Primary analysis: Mixed-design ANOVA (group × time) with post-hoc t-tests
- Effect sizes reported as Cohen's d (small = 0.2, medium = 0.5, large = 0.8)
- Intention-to-treat analysis (all randomized participants included, with last observation carried forward for dropouts)
- Significance threshold: p < 0.05 (two-tailed)
- No correction for multiple comparisons (a weakness: multiple outcomes were tested without adjusting the alpha level)
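Last observation carried forward (LOCF), the imputation rule named above, fills a dropout's missing assessments with their most recent observed score. A minimal sketch with a hypothetical participant (the scores are illustrative, not study data):

```python
def locf(scores):
    """Replace each missing value (None) with the most recent observed value.

    Values missing before any observation stay None.
    """
    filled, last = [], None
    for value in scores:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical participant who dropped out after the week-6 assessment.
weeks_0_6_12 = [7, 8, None]
print(locf(weeks_0_6_12))  # [7, 8, 8]
```

LOCF assumes a dropout's score would not have changed after their last visit, which can bias estimates in either direction depending on who drops out and when.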
**What this design can and cannot prove:**
**Can prove:**
- That SoroTouch training causes changes in cognitive test performance compared to an active control (because of randomization and blinding of assessors)
- That the effect is not explained simply by spending the same amount of daily time on a number-based app task (the control group played a number-matching game for the same duration)
**Cannot prove:**
- That the effects are due to abacus training specifically rather than any novel, challenging cognitive activity (the control task was simpler and less demanding)
- That the effects persist after training stops (no follow-up assessment was conducted)
- That the effects generalize to real-world cognitive performance (all measures were laboratory-based tests)
- That the effects are clinically meaningful (the sample was healthy young adults, not a clinical population)
**Major methodological weaknesses:**
1. **No blinding of participants** — the placebo effect could account for some or all of the observed improvements
2. **No correction for multiple comparisons** — with 5 primary outcomes and 3 time points, the chance of a false positive is elevated
3. **Short duration** — 12 weeks may not be sufficient to produce lasting cognitive changes
4. **No follow-up** — we don't know if effects persist after training stops
5. **Active control may be inadequate** — the number-matching game may not have been as engaging or challenging as SoroTouch, so the comparison may reflect "more cognitive stimulation" rather than "abacus training specifically"
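To see why uncorrected multiple testing matters, the family-wise error rate can be approximated as 1 - (1 - alpha)^m for m independent tests. The sketch below assumes independence, which overstates the rate when outcomes are correlated (as cognitive scores usually are), and it ignores the extra comparisons introduced by the three time points.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# 5 primary tests, or 6 scores if forward and backward digit span count separately.
for m in (1, 5, 6):
    print(m, round(familywise_error_rate(0.05, m), 3))
# 1 0.05
# 5 0.226
# 6 0.265
```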
Key findings
**Primary outcomes (cognitive tests):**
**Digit Span (backward) — significant improvement in SoroTouch group:**
- SoroTouch group: Baseline mean = 7.2 digits, Week 12 mean = 8.6 digits (improvement of 1.4 digits)
- Control group: Baseline mean = 7.3 digits, Week 12 mean = 7.6 digits (improvement of 0.3 digits)
- Group × time interaction: F(2, 108) = 4.82, p = 0.01, Cohen's d = 0.52 (medium effect)
- This was the primary outcome that showed the strongest effect.
**Digit Span (forward) — no significant difference:**
- SoroTouch group: Baseline mean = 8.1 digits, Week 12 mean = 8.5 digits
- Control group: Baseline mean = 8.0 digits, Week 12 mean = 8.2 digits
- Group × time interaction: F(2, 108) = 1.23, p = 0.30, Cohen's d = 0.18 (small, non-significant)
**Digit Symbol Substitution Test (DSST) — significant improvement in SoroTouch group:**
- SoroTouch group: Baseline mean = 68.4 symbols, Week 12 mean = 74.2 symbols (improvement of 5.8 symbols)
- Control group: Baseline mean = 67.9 symbols, Week 12 mean = 69.5 symbols (improvement of 1.6 symbols)
- Group × time interaction: F(2, 108) = 3.91, p = 0.02, Cohen's d = 0.44 (small-to-medium effect)
**Stroop Test (interference score) — no significant difference:**
- SoroTouch group: Baseline mean reaction time = 845 ms, Week 12 mean = 812 ms
- Control group: Baseline mean reaction time = 838 ms, Week 12 mean = 830 ms
- Group × time interaction: F(2, 108) = 1.08, p = 0.34, Cohen's d = 0.15 (small, non-significant)
**Task-switching (switch cost) — no significant difference:**
- SoroTouch group: Baseline switch cost = 185 ms, Week 12 mean = 172 ms
- Control group: Baseline switch cost = 182 ms, Week 12 mean = 178 ms
- Group × time interaction: F(2, 108) = 0.76, p = 0.47, Cohen's d = 0.12 (small, non-significant)
**Raven's Progressive Matrices — no significant difference:**
- SoroTouch group: Baseline mean = 22.4/30, Week 12 mean = 23.1/30
- Control group: Baseline mean = 22.6/30, Week 12 mean = 22.9/30
- Group × time interaction: F(2, 108) = 0.54, p = 0.58, Cohen's d = 0.08 (negligible)
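As an illustration of what a multiple-comparison correction would do here, the sketch below applies a Bonferroni threshold to the six interaction p-values listed above. This is not an analysis from the paper, and the p-values are the rounded figures given in this summary, so borderline cases should be read with caution.

```python
ALPHA = 0.05

# Group x time interaction p-values as listed above (rounded).
reported_p = {
    "Digit Span (backward)": 0.01,
    "Digit Span (forward)": 0.30,
    "DSST": 0.02,
    "Stroop": 0.34,
    "Task-switching": 0.47,
    "Raven's": 0.58,
}

threshold = ALPHA / len(reported_p)  # Bonferroni-adjusted alpha
uncorrected = [name for name, p in reported_p.items() if p < ALPHA]
bonferroni = [name for name, p in reported_p.items() if p < threshold]

print(round(threshold, 4))  # 0.0083
print(uncorrected)          # ['Digit Span (backward)', 'DSST']
print(bonferroni)           # []
```

Under this conservative correction neither significant primary result would clear the adjusted threshold, which is why the lack of correction is flagged as a weakness.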
**Secondary outcomes:**
**Mental calculation test (accuracy) — significant improvement in SoroTouch group:**
- SoroTouch group: Baseline mean = 38.2/50 correct, Week 12 mean = 44.7/50 correct (improvement of 6.5 problems)
- Control group: Baseline mean = 37.9/50 correct, Week 12 mean = 39.1/50 correct (improvement of 1.2 problems)
- Group × time interaction: F(2, 108) = 5.34, p = 0.006, Cohen's d = 0.61 (medium effect)
**Mental calculation test (reaction time) — significant improvement in SoroTouch group:**
- SoroTouch group: Baseline mean = 4.2 seconds/problem, Week 12 mean = 3.1 seconds/problem
- Control group: Baseline mean = 4.1 seconds/problem, Week 12 mean = 3.9 seconds/problem
- Group × time interaction: F(2, 108) = 4.67, p = 0.01, Cohen's d = 0.49 (small-to-medium effect)
**Subjective cognitive questionnaire — no significant difference:**
- Both groups reported similar improvements in perceived concentration and mental clarity (about 0.5 points on the 7-point scale), with no significant group difference (p = 0.28)
Effect magnitude
**In plain English:**
**Working memory (backward digit span):** The SoroTouch group improved by 1.4 digits, versus 0.3 digits in the control group, a net gain of about 1.1 digits. This means they could hold roughly one extra piece of information in mind while manipulating it, equivalent to remembering a 7-digit phone number instead of a 6-digit one. This is a moderate effect (Cohen's d = 0.52), roughly equivalent to the cognitive boost seen after 6 weeks of regular aerobic exercise in similar populations.
**Processing speed (DSST):** The SoroTouch group completed about 4 more symbol matches in 90 seconds compared to the control group. This is a small-to-moderate effect (Cohen's d = 0.44), roughly equivalent to the improvement seen after 8 weeks of dual n-back training.
**Mental arithmetic:** The SoroTouch group solved about 5 more problems correctly and was about 1 second faster per problem compared to the control group. The effect was moderate for accuracy (Cohen's d = 0.61) and small-to-medium for speed (Cohen's d = 0.49), which is not surprising, since this is essentially the skill being trained.
**No effect on:** Forward digit span, executive function (Stroop, task-switching), fluid intelligence (Raven's matrices), or subjective cognitive function. The training did not make participants "smarter" or better at multitasking; it improved only the specific cognitive skills involved in mental arithmetic (working memory manipulation and processing speed).
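The net gains quoted above can be checked as a difference of differences: the training group's change minus the control group's change. The sketch below uses the baseline and week-12 means reported under Key findings; the function and variable names are illustrative.

```python
# (baseline, week 12) means for (SoroTouch, control), from Key findings.
reported = {
    "Backward digit span (digits)": ((7.2, 8.6), (7.3, 7.6)),
    "DSST (symbols in 90 s)": ((68.4, 74.2), (67.9, 69.5)),
    "Mental calculation accuracy (of 50)": ((38.2, 44.7), (37.9, 39.1)),
    "Mental calculation time (s/problem)": ((4.2, 3.1), (4.1, 3.9)),
}

def net_gain(training, control):
    """Training-group change minus control-group change, rounded to 0.1."""
    return round((training[1] - training[0]) - (control[1] - control[0]), 1)

net = {name: net_gain(t, c) for name, (t, c) in reported.items()}
for name, gain in net.items():
    print(f"{name}: {gain:+.1f}")
# Backward digit span (digits): +1.1
# DSST (symbols in 90 s): +4.2
# Mental calculation accuracy (of 50): +5.3
# Mental calculation time (s/problem): -0.9
```

A negative value for calculation time means the training group sped up more than the control group did.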
**Comparison to other interventions:**
- The effect on working memory (d = 0.52) is similar to that reported for 4 weeks of daily meditation (d = 0.4–0.6) but smaller than that reported for 8 weeks of adaptive working memory training (d = 0.7–0.9)
- The effect on processing speed (d = 0.44) is comparable to that seen after 6 weeks of regular physical exercise (d = 0.3–0.5)
- The lack of transfer to fluid intelligence is consistent with most cognitive training studies, where training effects are typically narrow and task-specific
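For readers unfamiliar with the d values compared above, Cohen's d is the difference between two group means divided by their pooled standard deviation. The sketch below shows the standard two-sample formula on made-up gain scores; the summary does not state how the paper derived d for the group × time interaction, so this is illustrative rather than a reproduction of its method.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Made-up gain scores, not data from the study.
training_gains = [1, 2, 3]
control_gains = [0, 1, 2]
d = cohens_d(training_gains, control_gains)
print(round(d, 2))  # 1.0
```

In this toy example the groups differ by one pooled standard deviation, which would count as a large effect under the thresholds listed in the statistical approach.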
Limitations
**Acknowledged by the authors:**
1. **No blinding of participants** — the SoroTouch group may have had higher expectations of improvement (placebo effect)
2. **No follow-up assessment** — cannot determine if effects persist after training stops
3. **Relatively small sample size** (n = 56 completers) — may have been underpowered to detect small effects on executive function and intelligence
4. **Single-site study** — results may not generalize to other populations or settings
5. **No correction for multiple comparisons** — some of the significant findings could be false positives
6. **Active control may not be equivalent** — the number-matching game may have been less engaging or cognitively demanding
**Additional limitations a critical reader would note:**
1. **Self-selection bias** — participants who volunteer for a cognitive training study may be more motivated or interested in cognitive improvement than the general population
2. **Hawthorne effect** — participants in both groups may have improved simply because they were being observed and tested repeatedly
3. **Practice effects** — repeated testing on the same cognitive measures can lead to improvement independent of the intervention (the control group showed some improvement on most measures, suggesting practice effects were present)
4. **No objective measure of compliance** — while the app logged usage, participants could have been distracted or not fully engaged during training sessions
5. **Short training sessions** — 15 minutes per day may not be sufficient to produce robust cognitive