
Defining Training and Performance Caliber: A Participant Classification Framework

Authors
Alannah K.A. McKay, Trent Stellingwerff, Ella S. Smith, David T. Martin, Iñigo Mujika, Vicky L. Goosey-Tolfrey, Jeremy Sheppard, Louise M. Burke
Journal
International Journal of Sports Physiology and Performance
Year
2021
Citations
2,249

TL;DR

This paper proposes a standardized 6-tier framework (Tier 0: Sedentary through Tier 5: World Class) for classifying research participants by training volume and performance metrics. It targets a long-standing problem: terms like "elite" and "trained" are used inconsistently across the literature, which makes it hard to compare results or to know whether a finding applies to you.

What they tested

This is not an experiment but a **consensus framework** — a proposed classification system developed by experts in sport science and physiology. The authors tested nothing in the traditional sense. Instead, they:

Reviewed how previous studies have misused terms like "elite," "trained," "well-trained," and "recreationally active"

Identified that 91 publications examined across multiple sports used "elite" to describe everyone from national-level competitors to Olympic medalists — a range so wide it renders the term meaningless

Developed a 6-tier system based on two core metrics: **training volume** (hours per week, sessions per week, years of experience) and **performance caliber** (competition level, results, rankings)

Estimated what percentage of the general population falls into each tier using global and continent-specific statistics

Discussed how the framework can be adapted for different sports, ages (junior, masters), and Paralympic athletes

The outcome is a **classification tool** — not a finding about what works or doesn't work.

Who was studied

No participants were studied directly. This is a **commentary/review paper** that synthesizes existing literature and expert opinion. The authors reference population statistics from:

World Health Organization physical activity guidelines (adults aged 18–64)

National sport participation reports from Australia, Canada, the UK, and the US

Previous classification systems for cyclists (Jeukendrup et al., 2000; Decroix et al., 2016; Mujika, 2017)

The Swann et al. (2015) framework for classifying elite athletes in sports psychology

The closest thing to a "sample" is the published literature on participant classification in exercise science, including the roughly 91 papers examined for inconsistent use of the term "elite."

How they measured it

No instruments or scales were used. The framework itself is the measurement tool. Classification is based on:

**Training volume:** Minutes per week of moderate/vigorous activity, number of training sessions per week, years of sport-specific training

**Performance metrics:** Highest competition level achieved (local, regional, national, international, Olympic/World Championship), results/rankings at that level

**Population statistics:** WHO physical activity compliance rates, national sport participation data, elite athlete density estimates

The authors provide worked examples for specific sports (cycling, running, swimming, team sports) showing how to classify a given athlete.

Methodology

**Study design:** This is an **expert consensus commentary** — not a systematic review, meta-analysis, or experiment. The authors (all experienced sport scientists who have worked with Olympic and professional athletes) developed the framework through iterative discussion and refinement, then published it as an "Invited Commentary" in the International Journal of Sports Physiology and Performance.

**How the framework was built:**

1. The authors identified the problem: inconsistent terminology across the literature

2. They reviewed existing classification attempts (Jeukendrup for cyclists, Swann for sports psychology)

3. They defined 6 tiers with specific criteria for each

4. They estimated population percentages for each tier using published statistics

5. They discussed edge cases: junior athletes, masters athletes, Paralympic athletes, athletes transferring between sports, athletes at different career stages

6. They provided sport-specific examples to demonstrate practical application

**What this design can support:**

That a standardized classification system is needed (the evidence of inconsistent terminology is clear)

That such a system can be constructed using existing population data

That the framework can accommodate different sports, ages, and ability levels

**What this design cannot establish:**

Whether using this framework actually improves research quality or reproducibility (that would require a prospective study comparing studies that use the framework vs. those that don't)

Whether the population estimates are accurate for all countries or sports (they are based on limited data from a few Western nations)

Whether the tier boundaries are optimal (they are expert judgments, not empirically derived cutoffs)

**Major methodological weaknesses:**

No systematic literature search methodology is described (this is not a systematic review)

No formal consensus process (e.g., Delphi method) was used — the framework reflects the opinions of 8 authors, all from similar backgrounds

Population estimates are rough and based on limited data — the authors acknowledge this in the supplementary materials

The framework has not been validated against actual athlete outcomes or research quality metrics

No inter-rater reliability testing was done (would two different researchers classify the same athlete into the same tier?)
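The inter-rater question above can be made concrete. One common agreement statistic is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a minimal illustration only: the function name `cohens_kappa` and the two rating lists are invented for this example, and the paper reports no such data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning the same athletes to tiers."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set of athletes")
    n = len(rater_a)
    # Observed agreement: share of athletes placed in the same tier by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal tier frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[t] * count_b[t] for t in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical tier for everyone
    return (observed - expected) / (1 - expected)


# Hypothetical tier assignments for eight athletes (not data from the paper).
researcher_1 = [2, 2, 3, 3, 4, 2, 3, 1]
researcher_2 = [2, 3, 3, 3, 4, 2, 2, 1]
print(round(cohens_kappa(researcher_1, researcher_2), 2))  # 0.64
```

In this made-up case the two researchers disagree only on athletes near the Tier 2/Tier 3 boundary, which is where disagreement is most likely to occur in practice.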

Key findings

Since this is a framework paper, the "findings" are the classification system itself. Here are the specific tiers and their criteria:

**Tier 0: Sedentary**

Criteria: Fails to meet WHO physical activity guidelines (less than 150 min of moderate activity and less than 75 min of vigorous activity per week, with no equivalent combination of the two)

Estimated population: ~46–52% of adults (based on WHO compliance data)

Appropriate for: Control groups in mechanistic studies, pharmacokinetic studies, proof-of-concept investigations

NOT appropriate for: Performance research (poor reliability of performance measures)
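The Tier 0 boundary can be checked mechanically. The sketch below is one way to do it, assuming the common convention that one vigorous minute counts as two moderate minutes when the two are combined. The function name `meets_who_guidelines` and the 2:1 conversion are illustrative assumptions, not something the paper specifies.

```python
def meets_who_guidelines(moderate_min: float, vigorous_min: float) -> bool:
    """Return True if weekly activity reaches the WHO aerobic minimum.

    Assumption: vigorous minutes count double, so the threshold is
    150 moderate-equivalent minutes per week.
    """
    if moderate_min < 0 or vigorous_min < 0:
        raise ValueError("minutes must be non-negative")
    return moderate_min + 2 * vigorous_min >= 150


print(meets_who_guidelines(120, 0))  # False: below the moderate threshold
print(meets_who_guidelines(0, 75))   # True: vigorous threshold met
print(meets_who_guidelines(90, 30))  # True: 90 + 2 * 30 = 150 equivalent minutes
```

Anyone for whom this returns `False` would fall in Tier 0 under the criteria above.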

**Tier 1: Recreationally Active**

Criteria: Meets WHO physical activity guidelines; may participate in various activities/sports without specific commitment to competition

Estimated population: ~48–54% of adults meet the guidelines; this is the complement of the Tier 0 estimate, so it covers Tier 1 together with every tier above it

Appropriate for: Control groups, mechanistic studies, dose-response investigations

NOT appropriate for: Performance research

**Tier 2: Trained/Developmental**

Criteria: Sport-specific training ~3 times per week; intention to compete at local level; no performance standard required

Estimated population: ~12–19% of population (based on national sport participation rates)

Appropriate for: Large trials requiring statistical power, cross-sectional studies, survey designs

Performance research: Possible with familiarization trials and sensitive laboratory tests

**Tier 3: Highly Trained/National Level**

Criteria: National-level representative; structured/periodized training; developing proficiency for highest level; includes Provincial/State or Academy Programs (e.g., NCAA Division II/III)

Estimated population: ~0.01–0.03% of population

Appropriate for: Studies requiring high-performance athletes, intervention studies with adequate statistical power

**Tier 4: Elite/International Level**

Criteria: International-level competitor; top 10–15% of national rankings; full-time training commitment; includes NCAA Division I, professional athletes in lower-tier leagues

Estimated population: <0.001% of population (exact figure not provided)

Appropriate for: Studies at the extremes of human performance, genetic studies, intervention studies with small but meaningful effects

**Tier 5: World Class**

Criteria: Medal contender at Olympic Games, World Championships, or equivalent; top 5–10 in world rankings; multiple years at this level

Estimated population: Extremely small — likely <0.0001% of population

Appropriate for: Case studies, small-n designs, studies requiring the absolute highest level of performance
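To get a feel for how small the upper tiers are, the percentages can be turned into rough headcounts. This is a back-of-envelope sketch, assuming a round world population of 8 billion; that number and the helper `headcount` are assumptions for illustration, and the paper's own denominators (adults versus total population) may differ.

```python
from decimal import Decimal

# Assumption: a round world population of 8 billion (not a figure from the paper).
WORLD_POPULATION = 8_000_000_000


def headcount(percent: str, population: int = WORLD_POPULATION) -> int:
    """Convert a percentage given as a string (e.g. "0.01") to a number of people."""
    return int(Decimal(percent) * population / 100)


print(headcount("0.01"), headcount("0.03"))  # Tier 3 range: 800000 2400000
print(headcount("0.001"))                    # Tier 4 upper bound: 80000
print(headcount("0.0001"))                   # Tier 5 upper bound: 8000
```

Percentages are passed as strings so that `Decimal` does the arithmetic exactly rather than through binary floating point.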

**Key insight from the paper:** The authors demonstrate that many published studies claiming to study "elite" athletes actually studied Tier 2 or Tier 3 athletes. They provide examples showing that interventions effective in Tier 2 athletes (e.g., carbohydrate periodization, beetroot juice supplementation) often fail to replicate in Tier 4 or Tier 5 athletes — suggesting that training status modifies intervention responses.

Effect magnitude

This is not applicable in the traditional sense — there are no effect sizes, confidence intervals, or p-values. However, the practical "effect" of this framework, if widely adopted, would be:

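**Reduction in misclassification:** The 91 publications reviewed by Swann et al. (2015), cited by the authors, applied "elite" to athletes ranging from national-level competitors to Olympic medalists. A shared tier system would replace that label with an explicit level, although the share of studies that misclassify their participants is not quantified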

**Improved comparability:** Studies using the same tier system could be directly compared, potentially revealing that many "conflicting" findings in the literature are actually due to studying different tiers of athletes

**More appropriate generalization:** Researchers and readers would know exactly what population a finding applies to — no more assuming that what works for a "trained" cyclist applies to a "world-class" one

The authors cite specific examples where tier differences matter:

Carbohydrate periodization: Effective in "trained" (Tier 2) athletes but not in "elite" (Tier 4) athletes

Beetroot juice supplementation: Performance-enhancing in lower-caliber athletes but not in elite athletes

These are not quantified effect sizes but qualitative observations from the literature.

Limitations

**Acknowledged by the authors:**

Population estimates are rough and based on limited data from a few countries (Australia, Canada, UK, US)

The framework may need sport-specific adaptations (e.g., what counts as "national level" differs between sports and countries)

Junior and masters athletes may not fit neatly into the age-based assumptions of the framework

Paralympic athletes require additional classification considerations beyond training and performance

The framework does not account for athletes who are between tiers (e.g., injured, returning from break, transitioning sports)

Depth of sport participation (e.g., a Tier 3 athlete in a highly competitive sport vs. a Tier 3 athlete in a niche sport) is not fully addressed

**Additional limitations a critical reader should note:**

**No empirical validation:** The framework has not been tested to see if it actually improves research quality, reproducibility, or interpretability

**Expert bias:** All 8 authors work in elite sport environments — the framework may overemphasize the needs of elite sport research at the expense of general population research

**No inter-rater reliability data:** Two researchers could classify the same athlete differently, especially at tier boundaries

**Training volume focus:** The framework emphasizes training volume over training quality, periodization, or recovery — two athletes with the same hours/week could have vastly different adaptations

**Performance metric challenges:** In some sports (e.g., team sports, action sports), objective performance metrics are hard to define — the framework relies on competition level as a proxy

**Cultural and national differences:** What counts as "national level" in a small country with low sport participation may be equivalent to "regional level" in a larger, more competitive nation

**No consideration of non-sport factors:** Genetics, age at peak performance, injury history, and psychological factors are not included but clearly affect performance caliber

**Publication venue bias:** Published as an "Invited Commentary" rather than a peer-reviewed original research paper — the framework has not undergone the same scrutiny as a systematic review or meta-analysis

Practical takeaways

For someone running their own n=1 experiment, this framework is valuable for **understanding where you fit** and **interpreting whether published research applies to you**.

### What to test (specific intervention and dose)

You don't "test" this framework — you **use it** to classify yourself and others. Here's how:

1. **Classify yourself honestly** using the tier criteria:

- Tier 0: Do you fall short of the WHO activity guidelines (less than 150 min moderate and less than 75 min vigorous activity per week, with no equivalent combination)?

- Tier 1: Do you exercise regularly but without specific sport commitment?

- Tier 2: Do you train 3+ times/week in a specific sport and compete locally?

- Tier 3: Do you compete at state/provincial or national level?

- Tier 4: Do you compete internationally?

- Tier 5: Are you a medal contender at World Championships or Olympics?

2. **Use your tier to filter research:** Before trying an intervention from a study, check what tier the participants were. If the study says "trained cyclists" but doesn't specify further, be skeptical — they could be Tier 2 or Tier 4.

3. **Track your own progression:** Use the framework to set tier-based goals. Moving from Tier 1 to Tier 2 requires committing to sport-specific training 3x/week and entering local competitions. Moving from Tier 2 to Tier 3 requires reaching national-level performance standards.
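The self-classification questions in step 1 can be sketched as a small decision function. This is a simplified illustration, not the authors' procedure: the `Profile` fields, the competition-level strings, and the order of checks are assumptions made for the example, and it ignores rankings, years at a level, and sport-specific standards.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    SEDENTARY = 0
    RECREATIONALLY_ACTIVE = 1
    TRAINED = 2
    HIGHLY_TRAINED = 3
    ELITE = 4
    WORLD_CLASS = 5


# Hypothetical labels for the highest competition level reached.
LEVELS = ("none", "local", "state", "national", "international", "world_medal_contender")


@dataclass
class Profile:
    meets_who_guidelines: bool
    sport_sessions_per_week: int
    competition_level: str  # one of LEVELS


def classify(p: Profile) -> Tier:
    """Walk the tier questions from the top down and return the first match."""
    if p.competition_level not in LEVELS:
        raise ValueError(f"unknown competition level: {p.competition_level!r}")
    if p.competition_level == "world_medal_contender":
        return Tier.WORLD_CLASS
    if p.competition_level == "international":
        return Tier.ELITE
    if p.competition_level in ("state", "national"):
        return Tier.HIGHLY_TRAINED
    if p.competition_level == "local" and p.sport_sessions_per_week >= 3:
        return Tier.TRAINED
    if p.meets_who_guidelines:
        return Tier.RECREATIONALLY_ACTIVE
    return Tier.SEDENTARY


me = Profile(meets_who_guidelines=True, sport_sessions_per_week=4, competition_level="local")
print(classify(me).name)  # TRAINED
```

Checking from the top tier down means an athlete who satisfies several questions lands in the highest tier they qualify for.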

### Minimum meaningful duration

This framework is not time-bound — you can classify yourself at any point. However, for tracking changes:

**Reclassification:** Allow 3–6 months of consistent training at a new level before reclassifying (to ensure the change is real, not a temporary peak)

**Performance metrics:** Use competition results from the most recent season (12 months) for classification

**Training volume:** Average over 4–8 weeks to account for periodization and variation
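The training-volume averaging above is simple to automate from a training log. The sketch below assumes you have one total per week, oldest first; the function name `average_weekly_hours`, the default six-week window, and the sample numbers are all illustrative.

```python
from statistics import mean


def average_weekly_hours(weekly_hours, window=6):
    """Average the most recent `window` weeks of training hours (window of 4 to 8)."""
    if not 4 <= window <= 8:
        raise ValueError("window should be between 4 and 8 weeks")
    if len(weekly_hours) < window:
        raise ValueError("not enough weeks of data for this window")
    return mean(weekly_hours[-window:])


# Hypothetical log with hard weeks and recovery weeks, oldest first.
log = [6.0, 8.5, 9.0, 4.0, 7.5, 9.5, 10.0, 5.0]
print(average_weekly_hours(log))  # 7.5
```

Averaging across the window keeps a single recovery week or a single peak week from shifting your classification.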

### What to measure (specific metrics)

To classify yourself using this framework, collect:

**Training volume:** Hours per week of sport-specific training, number of sessions per week, years of consistent training in your sport

**Performance level:** Highest competition level achieved (local, regional, national, international), best result/ranking at that level

**Physical activity baseline:** Minutes per week of moderate and vigorous activity (for Tier 0 vs. Tier 1 classification)

**Sport-specific metrics:** For endurance sports: VO2max, power output at lactate threshold, race times. For strength sports: 1RM relative to bodyweight. For skill sports: competition ranking, coach assessment.

### Key confounds to control for

When using this framework to interpret research or design your own experiments:

**Don't assume "elite" means elite:** Always check the actual tier classification. A study claiming "elite cyclists" might have Tier 2 or Tier 3 athletes.

**Training history matters:** Two athletes in the same tier may have different training histories (e.g., 5 years vs. 15 years) — this affects response to interventions.

**Sport specificity:** A Tier 3 runner and a Tier 3 weightlifter have vastly different physiological profiles — the framework is sport-specific.

**Age effects:** Junior athletes (under 18) and masters athletes (over 40) may not fit neatly into the tier system — their performance relative to age-group competition may be higher than their absolute performance suggests.

**Gender differences:** The framework is gender-neutral, but population statistics and competition depth differ between men's and women's sports — a Tier 4 female athlete may face different competition depth than a Tier 4 male athlete.

**Career stage:** An athlete returning from injury or at the end of their career may have lower current performance than their tier suggests — use recent (12-month) data.

### What a positive result would look like

A "positive result" from using this framework would be:

**You can clearly classify yourself** and any research participants you recruit

**You can identify why a published intervention worked or didn't work for you** — because you now know whether you're in the same tier as the study participants

**You can set tier-appropriate goals** — e.g., "I'm currently Tier 2. To reach Tier 3, I need to compete at national level and increase my training volume to 5–6 sessions per week"

**You notice patterns** — e.g., "Beetroot juice works for me (Tier 2) but doesn't work for my friend who's Tier 4" — this is consistent with the literature the authors cite

**You can communicate your level precisely**: instead of saying "I'm pretty fit," you can say "I'm a Tier 2 trained cyclist who trains 4x/week and races locally"

**For your own n=1 experiments:** Before starting any intervention, classify yourself using this framework. Then, when you read a study, check the tier of the participants. If you're in the same tier, the results are more likely to apply to you. If you're in a different tier, be cautious — the intervention may work differently or not at all. This single practice — matching your tier to the study tier — could dramatically improve how well you can translate published research into personal results.
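The tier-matching habit can be written down as a small helper for a reading log. This is a sketch under stated assumptions: the function name `applicability` and the three-way "same / adjacent / distant" grading are invented here for illustration and are not part of the framework.

```python
def applicability(my_tier, study_tiers):
    """Rate how far a study's participant tiers (0-5) are from your own tier."""
    tiers = set(study_tiers)
    if not tiers:
        return "unknown"  # the study did not report a usable classification
    if my_tier in tiers:
        return "same tier"
    # Distance to the nearest tier represented in the study.
    gap = min(abs(my_tier - t) for t in tiers)
    return "adjacent tier" if gap == 1 else "distant tier"


print(applicability(2, {2, 3}))  # same tier
print(applicability(2, {4}))     # distant tier
print(applicability(2, []))      # unknown
```

A result of "unknown" is itself informative: a study that only says "trained" or "elite" gives you no way to tell how close its participants are to you.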

