Experiment Design

Power analysis, adaptive designs, randomization, interference, and SUTVA.

Research synthesis · 7 min read

The Confounding Problem: Why Your Experiments Can Fool You

You logged 60 days of data. The intervention looks like it worked. But something else changed at the same time — something you didn't track. Here's how confounding sneaks into personal experiments and what to do about it.

The Experiment That Looked Convincing

A user ran a 45-day magnesium glycinate experiment. Took 400mg each evening on intervention days. Measured sleep score every morning from their Oura ring. The result: sleep scores averaged 73 on magnesium nights versus 68 on placebo nights. A 5-point difference. 82% posterior probability that magnesium helped. Looked convincing.

Then they noticed something. They'd started a new job three weeks into the experiment. And they'd been more likely to take their magnesium on days when work had gone well — less anxious, winding down earlier, not lying awake rehearsing tomorrow's tasks. The magnesium and the stress level had moved together. Not because of each other, but because the same underlying thing — a good versus difficult workday — caused both.

The magnesium may have done nothing. The stress level may have done everything. Or magnesium may have helped a little and the stress confounded the rest. With observational data entangled this way, there is no clean answer.

This is the confounding problem in personal experiments. It's not exotic. It happens constantly.

What Confounding Actually Is

A confounder is a variable that influences both which condition you end up in and what your outcome is.

The clearest way to see this: suppose you're testing whether your morning run improves your afternoon focus. You tend to run on days when you sleep well. You also tend to have better focus on days when you sleep well. Now you find that focus is higher on run days. Is it the running or the sleep? You can't tell from the data, because sleep quality is tangled with both the exposure (whether you ran) and the outcome (focus score).

Sleep quality here is a confounder. It's not the intervention. It's not the outcome. But it's driving a correlation between them that doesn't reflect a real causal relationship.
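The run/sleep/focus example can be made concrete with a small simulation. This is a minimal sketch under invented assumptions: the function names, the 80%/20% run probabilities, and the focus formula are all hypothetical, and the true effect of running is set to zero on purpose.

```python
import random
import statistics

def simulate_confounded_days(n_days, true_run_effect=0.0, seed=0):
    """Simulate days where sleep quality drives both running and focus."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        sleep = rng.gauss(0.0, 1.0)  # standardized sleep quality (assumed scale)
        # Assumption: after a good night you run 80% of the time, otherwise 20%.
        ran = rng.random() < (0.8 if sleep > 0 else 0.2)
        # Focus depends on sleep; running adds true_run_effect (zero by default).
        focus = 60.0 + 5.0 * sleep + true_run_effect * ran + rng.gauss(0.0, 3.0)
        days.append((sleep, ran, focus))
    return days

def naive_difference(days):
    """Mean focus on run days minus mean focus on rest days."""
    run_focus = [focus for _, ran, focus in days if ran]
    rest_focus = [focus for _, ran, focus in days if not ran]
    return statistics.mean(run_focus) - statistics.mean(rest_focus)

days = simulate_confounded_days(2000)
print(f"Apparent effect of running: {naive_difference(days):.2f} focus points")
```

Even though running does nothing in this simulated world, the naive comparison shows a clearly positive gap, because run days are mostly good-sleep days.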

The same dynamic appears across almost every common personal experiment:

  • Caffeine cutoff experiments: stress and workload affect both when you stop drinking coffee and how well you sleep
  • Exercise timing experiments: motivation levels affect both when you work out and how you perform afterward
  • Diet interventions: social context affects both what you eat and your mood
  • Supplement experiments: how consistently you take a supplement is often correlated with how "on" you're feeling that day

If your condition assignment is not random — if you're choosing which condition to follow based on how you feel, or external conditions are pushing you toward one condition more than the other — confounders are almost certainly at work.

The Difference Between Confounding and Noise

Noise is random variation that makes your estimates imprecise. More data reduces noise. With enough trials, a real effect emerges through the noise.

Confounding is systematic bias. More data doesn't fix it — it just makes the biased estimate more precise. You get a very confident wrong answer.

This is why confounding is more dangerous than noise. A noisy experiment produces uncertain results, and they look appropriately uncertain. A confounded experiment can produce confident results that are wrong in size or even in direction. The statistical analysis has no way to know that the effect it's measuring isn't the one you think it is.

The practical implication: running your experiment longer doesn't help if the confound is systematic. You need to address the source of the confounding, not collect more data around it.
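A short simulation illustrates the difference. It is a sketch under made-up assumptions (the function name, the stress model, and the 75%/25% self-selection probabilities are hypothetical), with a true supplement effect of zero. As the number of days grows, the standard error shrinks in both designs, but only the randomized estimate moves toward zero.

```python
import math
import random
import statistics

def estimate_effect(n_days, randomized, seed=0):
    """Return (difference in means, standard error) when the true effect is zero."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_days):
        stress = rng.gauss(0.0, 1.0)
        if randomized:
            took = rng.random() < 0.5  # coin flip, unrelated to stress
        else:
            # Assumption: low-stress days make taking the supplement more likely.
            took = rng.random() < (0.75 if stress < 0 else 0.25)
        sleep_score = 70.0 - 4.0 * stress + rng.gauss(0.0, 5.0)  # no supplement term
        (treated if took else control).append(sleep_score)
    diff = statistics.mean(treated) - statistics.mean(control)
    se = math.sqrt(statistics.variance(treated) / len(treated)
                   + statistics.variance(control) / len(control))
    return diff, se

for n in (50, 500, 5000):
    confounded = estimate_effect(n, randomized=False, seed=n)
    randomized = estimate_effect(n, randomized=True, seed=n)
    print(n, "confounded: %.2f +/- %.2f" % confounded,
          "randomized: %.2f +/- %.2f" % randomized)
```

In the confounded rows the estimate settles on a nonzero value while its standard error keeps shrinking, which is the "very confident wrong answer" described above.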

The Classic Fix: Randomization

The reason randomized controlled trials are considered the gold standard is not that randomization is magical. It's that randomization breaks the link between confounders and condition assignment.

If you flip a coin each morning to decide whether today is a magnesium day or a placebo day, your stress level that day can't systematically push you toward one condition. On average, stressful days are equally distributed across the two conditions. The confounder becomes uncorrelated with the intervention. It adds noise but not bias.

This is why the randomization step in your experiment design matters more than most people realize. Convenience-based condition assignment — "I'll do the hard protocol when I feel ready for it" — is a recipe for confounded results. The coin flip feels unnecessary when you're logging every day. It is not.
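One way to take the decision out of your own hands is to generate the schedule before the experiment starts. The sketch below shows a plain coin-flip schedule and, as an optional variant not described above, a permuted-block schedule that keeps the two conditions balanced. The function names and condition labels are illustrative.

```python
import random

CONDITIONS = ["magnesium", "placebo"]

def coin_flip_schedule(n_days, seed=None):
    """Independent coin flip for each day."""
    rng = random.Random(seed)
    return [rng.choice(CONDITIONS) for _ in range(n_days)]

def balanced_block_schedule(n_days, block_size=4, seed=None):
    """Permuted blocks: every block contains each condition equally often."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two conditions")
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_days:
        block = CONDITIONS * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule[:n_days]

schedule = balanced_block_schedule(28, block_size=4, seed=2024)
print(schedule[:8])
print({c: schedule.count(c) for c in CONDITIONS})
```

Fixing the seed and writing the schedule down in advance means that how you feel on a given day cannot influence which condition you follow.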

When You Can't Randomize

Some interventions are hard to randomize. You can't flip a coin each morning and decide whether to practice intermittent fasting today. The protocol has multi-day structure. You have to commit to it for a run of consecutive days, then switch.

In these cases — crossover designs with block structure — the confounders tend to be trends over time: seasonal changes, work project cycles, relationship stress, training load building up. The fix is to measure those potential confounders and account for them.

The key variables to track alongside any experiment:

  • Stress level (1–10, morning rating)
  • Sleep quality the night before
  • Alcohol the previous evening (binary or units)
  • Exercise (yes/no, or intensity)
  • Social plans (high-contact vs. low-contact day)
  • Any work or life stressor that's variable across the experiment period

You don't need to track all of these. You need to track the ones that are likely to co-vary with your condition assignment. If you run more on weekends and you're testing a protocol that also happens to change on weekends, day type (weekday versus weekend) is a confounder you should log.
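A daily log that carries these covariates next to the outcome could look like the sketch below. The `DailyLog` record and its field names are hypothetical, chosen to mirror the list above; adapt them to whatever you actually track.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

@dataclass
class DailyLog:
    date: str               # ISO date, e.g. "2024-03-01"
    condition: str          # which protocol was followed that day
    outcome: float          # e.g. sleep score or focus score
    stress: int             # 1-10 morning rating
    sleep_quality: float    # the night before
    alcohol_units: float    # previous evening
    exercised: bool
    high_contact_day: bool  # social plans: high-contact vs. low-contact

def to_csv(logs):
    """Serialize the logs to CSV text with one row per day."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(DailyLog)])
    writer.writeheader()
    for log in logs:
        writer.writerow(asdict(log))
    return buffer.getvalue()

logs = [
    DailyLog("2024-03-01", "magnesium", 73.0, 4, 7.5, 0.0, True, False),
    DailyLog("2024-03-02", "placebo", 68.0, 7, 6.0, 1.0, False, True),
]
print(to_csv(logs))
```

Keeping covariates in the same row as the condition and outcome makes later balance checks and adjustment straightforward.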

Reading the Warning Signs

A few patterns in your data should raise confound suspicion:

Your condition assignment is predictable from your context. If you can predict, before looking at outcomes, that intervention days cluster on weekends / low-stress periods / times when you're sleeping well — your condition isn't effectively randomized. Whatever drives that clustering is a likely confounder.

Your outcome metric is highly correlated with a tracked confounder. If your focus score tracks your stress level at r = 0.6, and your stress level also tracks whether you did your focus protocol, the protocol's apparent effect may be partially or fully explained by stress.

The effect estimate depends heavily on what else was true on the days in each condition. If your "caffeine-free days" happen to be your low-workload days and vice versa, the caffeine effect estimate may really be a workload effect in disguise.

The effect shrinks or disappears when you control for a confounder. If including stress level as a covariate in your analysis reduces the apparent intervention effect from 1.2 to 0.3 points, stress was doing most of the work.

What to Do With a Potentially Confounded Result

If you've already run an experiment and you're suspicious that confounding is distorting the results:

Step 1: Check your data for the confound. Look at whether the confounder value is systematically different across your two conditions. Plot stress-by-condition, sleep-by-condition, whatever seems likely. If the distributions are similar, confounding from that variable is probably minimal. If they're clearly different, you have a problem.
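Alongside a plot, a single number can summarize how different a confounder is across conditions. The sketch below uses the standardized mean difference, a common balance diagnostic that is not named in the text above. The function name and the stress ratings are invented for illustration.

```python
import math
import statistics

def standardized_mean_difference(treated_values, control_values):
    """Difference in means divided by the pooled standard deviation."""
    mean_diff = statistics.mean(treated_values) - statistics.mean(control_values)
    pooled_sd = math.sqrt(
        (statistics.variance(treated_values) + statistics.variance(control_values)) / 2
    )
    return mean_diff / pooled_sd if pooled_sd > 0 else 0.0

# Hypothetical morning stress ratings (1-10) under each condition.
stress_on_magnesium = [3, 4, 2, 5, 3, 4, 3, 2]
stress_on_placebo = [6, 7, 5, 6, 8, 5, 7, 6]

smd = standardized_mean_difference(stress_on_magnesium, stress_on_placebo)
print(f"Stress SMD (magnesium vs. placebo): {smd:.2f}")
```

A value near zero suggests the two conditions saw similar stress levels. A large absolute value, as in this made-up data, means stress differs systematically between conditions and deserves a closer look.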

Step 2: Adjust statistically if you have enough data. With 40+ trials per condition and a tracked confounder, you can partial out the confounder's effect using regression adjustment. This gives you an estimate of the intervention effect holding the confounder constant. It only corrects for confounders you actually measured, so it's not as clean as having randomized in the first place, but it's far better than ignoring the problem.
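Regression adjustment with a single tracked confounder can be sketched without any statistics library. The version below uses residualization (the Frisch-Waugh-Lovell approach): remove the confounder's linear contribution from both the condition and the outcome, then relate what is left. Everything here is illustrative: the helper names are hypothetical, and the data are simulated with a true effect of 1.0 and far more days than a personal experiment would have, purely so the pattern is easy to see.

```python
import random
import statistics

def slope(x, y):
    """Least-squares slope of y on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def residuals(x, y):
    """What remains of y after removing its linear dependence on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    b = slope(x, y)
    return [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]

def adjusted_effect(condition, confounder, outcome):
    """Coefficient on condition in the model outcome ~ condition + confounder."""
    return slope(residuals(confounder, condition), residuals(confounder, outcome))

# Simulated days: low stress makes taking the supplement more likely (assumption).
rng = random.Random(42)
condition, stress, sleep_score = [], [], []
for _ in range(2000):
    s = rng.gauss(0.0, 1.0)
    took = 1.0 if rng.random() < (0.75 if s < 0 else 0.25) else 0.0
    condition.append(took)
    stress.append(s)
    sleep_score.append(70.0 + 1.0 * took - 4.0 * s + rng.gauss(0.0, 2.0))

naive = slope(condition, sleep_score)  # equals the raw difference in means
adjusted = adjusted_effect(condition, stress, sleep_score)
print(f"naive: {naive:.2f}  adjusted: {adjusted:.2f}  (true effect: 1.00)")
```

The naive estimate absorbs the stress effect, while the adjusted estimate lands close to the simulated truth. The adjustment works here because the outcome really is linear in the measured confounder; with real data that is an assumption, not a guarantee.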

Step 3: Re-run with better controls. For important questions, a re-run with strict randomization and explicit confounder tracking will give you a cleaner answer than any amount of post-hoc adjustment on the original data.

The goal is not a perfect experiment. The goal is an experiment whose results you can interpret honestly. Knowing where your confounders are is half the battle — an identified confounder is a problem you can manage. An unidentified one is a silent distortion in your conclusions.


Steady Practice now automatically flags confounders that are correlated with your condition assignment (r ≥ 0.3). You'll see these as warnings in your experiment analysis — a starting point for deciding whether a re-run or adjustment is warranted. Start a new experiment or check your existing results.
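For readers who want to run a similar check on their own logs, the sketch below flags covariates whose correlation with the condition indicator reaches 0.3. It is not the product's implementation: the function names are hypothetical, and using the absolute value of Pearson's r is an assumption made here so that negative correlations are flagged too.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either series has no variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def flag_confounders(condition, covariates, threshold=0.3):
    """Return covariates whose |r| with the 0/1 condition meets the threshold."""
    flagged = {}
    for name, values in covariates.items():
        r = pearson_r(condition, values)
        if abs(r) >= threshold:
            flagged[name] = round(r, 2)
    return flagged

# Hypothetical ten-day log: 1 = intervention day, 0 = control day.
condition = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
covariates = {
    "stress": [2, 3, 6, 7, 3, 5, 2, 6, 4, 7],
    "alcohol": [0, 1, 0, 1, 1, 0, 0, 1, 0, 1],
}
print(flag_confounders(condition, covariates))
```

In this made-up log, stress is strongly tied to the condition and gets flagged, while alcohol falls below the threshold.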

Evidence briefs

Reviewed claims

Claim-level summaries connect a practical takeaway to the papers that actually support it.

High confidence · Published

A/A tests for randomization integrity verification · positive · Detection of systematic bias in treatment assignment

A/A tests can detect failures in randomization (e.g., due to cookie churn or implementation bugs) by checking for statistically significant differences on pre-experiment covariates, with false positive rate controlled at the chosen significance level.

Population: Any online controlled experiment · Comparator: No verification or reliance on randomization alone

Primary evidence

Online controlled experiments at large scale

High confidence · Published

Overall Evaluation Criterion (OEC) · positive · Trustworthiness and actionability of experiment results

Using a composite OEC that is sensitive, reliable, actionable, resistant to gaming, and aligned with long-term goals improves the trustworthiness and actionability of experiment decisions compared to relying on individual metrics alone.

Population: Online controlled experiments (A/B tests) in technology companies · Comparator: Single metrics (e.g., click-through rate, conversion rate)

Primary evidence

Trustworthy Online Controlled Experiments

High confidence · Published

A/A tests and Sample Ratio Mismatch (SRM) checks · positive · Detection of systematic biases and false positives

Running A/A tests and checking for SRM helps detect violations of experimental assumptions (e.g., randomization failures, instrumentation errors) that would otherwise lead to untrustworthy conclusions.

Population: Online controlled experiments · Comparator: No diagnostic checks

Primary evidence

Trustworthy Online Controlled Experiments
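An SRM check is commonly done as a chi-square goodness-of-fit test on the number of units per arm. The sketch below is one such version for two arms. The function name is hypothetical, and the 0.001 alert threshold is a conventional strict choice assumed here rather than something stated in this brief.

```python
import math

def srm_p_value(n_a, n_b, expected_share_a=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for a two-arm split."""
    total = n_a + n_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = (n_a - expected_a) ** 2 / expected_a + (n_b - expected_b) ** 2 / expected_b
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

p = srm_p_value(5000, 4600)  # designed as a 50/50 split
print(f"SRM p-value: {p:.2e}")
if p < 0.001:
    print("Sample ratio mismatch: investigate assignment and logging before trusting results")
```

A very small p-value says the observed split is unlikely under the intended allocation. It does not say what went wrong, only that the results should not be trusted until the cause is found.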

High confidence · Published

Sequential testing (always-valid p-values, group sequential designs) · positive · Validity of statistical inference under repeated looks

Sequential testing methods allow valid inference even when experimenters peek at results repeatedly, avoiding inflated false positive rates that occur with traditional fixed-horizon tests when stopping early based on interim results.

Population: Online controlled experiments with continuous monitoring · Comparator: Fixed-horizon testing with peeking

Primary evidence

Trustworthy Online Controlled Experiments

High confidence · Published

Always valid p-values and confidence intervals via mixture sequential probability ratio test (mSPRT) · positive · Type I error rate control under data-dependent stopping

Always valid methods maintain Type I error at nominal level (e.g., 5%) under any stopping rule, whereas classical methods inflate Type I error to 25% or higher with moderate sample sizes.

Population: A/B tests with continuous monitoring and optional stopping · Comparator: Classical fixed-sample p-values and confidence intervals

Primary evidence

Always Valid Inference: Bringing Sequential Analysis to A/B Testing
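To show what an always valid p-value looks like in practice, here is a minimal sketch for the simplest textbook case: a one-sample test of a normal mean with known standard deviation `sigma`, using a normal mixing distribution with variance `tau**2` centered on the null value `theta0`. These modeling choices and all names are assumptions made for illustration; they are not taken from the brief above and do not reproduce any platform's implementation.

```python
import math

def always_valid_p_values(observations, theta0=0.0, sigma=1.0, tau=1.0):
    """Running mSPRT p-values: p_n = min(p_{n-1}, 1 / Lambda_n)."""
    s2, t2 = sigma ** 2, tau ** 2
    p_value = 1.0
    running_sum = 0.0
    p_values = []
    for n, y in enumerate(observations, start=1):
        running_sum += y
        mean = running_sum / n
        # Log of the mixture likelihood ratio for normal data with a normal mixture.
        log_lambda = (0.5 * math.log(s2 / (s2 + n * t2))
                      + (n ** 2 * t2 * (mean - theta0) ** 2) / (2 * s2 * (s2 + n * t2)))
        # Cap the exponent so exp() cannot overflow; values above 1 never lower the minimum.
        p_value = min(p_value, math.exp(min(-log_lambda, 0.0)))
        p_values.append(p_value)
    return p_values

no_effect = always_valid_p_values([0.0] * 20)
clear_effect = always_valid_p_values([1.0] * 50)
print(no_effect[-1], clear_effect[-1])
```

Because the p-value is a running minimum, it can only go down as data arrives, and the guarantee is that it may be checked after every observation without inflating the false positive rate.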

High confidence · Published

Mixture sequential probability ratio test (mSPRT) · positive · Power and expected sample size

The mSPRT achieves power close to the fixed-horizon test while allowing continuous monitoring, with expected sample size under the alternative being comparable to the fixed-horizon sample size.

Population: A/B tests with i.i.d. observations from single-parameter exponential family · Comparator: Fixed-horizon hypothesis test

Primary evidence

Always Valid Inference: Bringing Sequential Analysis to A/B Testing

Evidence base

50 papers

Book · Wiki · Canonical · High evidence score

The Design of Experiments

Ronald A. Fisher · Oliver and Boyd · 1935

The foundational text on randomized experiments, experimental control, and the statistical logic of designed experiments.

Study · Wiki · Canonical · High confidence

On Causal Inference in the Presence of Interference

Eric J. Tchetgen Tchetgen, Tyler J. VanderWeele · Statistical Methods in Medical Research · 2012 · 500 citations

A compact overview of causal inference when one unit's treatment can affect another unit's outcome, including finite-sample inference, IPW estimators, and estimands.

Study · Preprint · Wiki · Canonical · Moderate

Always Valid Inference: Bringing Sequential Analysis to A/B Testing

Ramesh Johari, Leo Pekelis, David J. Walsh · 2015 · 101 citations

A/B tests are typically analyzed via frequentist p-values and confidence intervals; but these inferences are wholly unreliable if users endogenously choose samples sizes by *continuously monitoring* their tests. We define *always valid* p-values and confidence intervals that let users try to take advantage of data as fast as it becomes available, providing valid statistical inference whenever they make their decision. Always valid inference can be interpreted as a natural interface for a sequential hypothesis test, which empowers users to implement a modified test tailored to them. In particular, we show in an appropriate sense that the measures we develop tradeoff sample size and power efficiently, despite a lack of prior knowledge of the user's relative preference between these two goals. We also use always valid p-values to obtain multiple hypothesis testing control in the sequential context. Our methodology has been implemented in a large scale commercial A/B testing platform to analyze hundreds of thousands of experiments to date.

Study · Preprint · Wiki · Canonical · Moderate

A Tutorial on Thompson Sampling

Daniel Russo, Benjamin Van Roy, Abbas Kazerouni +2 more · 2017 · 1,175 citations

Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product recommendation, assortment, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. We will also discuss when and why Thompson sampling is or is not effective and relations to alternative algorithms.

Systematic Review · High evidence score

Rayyan—a web and mobile app for systematic reviews

Mourad Ouzzani, Hossam M. Hammady, Zbys Fedorowicz +1 more · Systematic Reviews · 2016 · 24,121 citations

BACKGROUND: Synthesis of multiple randomized controlled trials (RCTs) in a systematic review can summarize the effects of individual outcomes and provide numerical answers about the effectiveness of interventions. Filtering of searches is time consuming, and no single method fulfills the principal requirements of speed with accuracy. Automation of systematic reviews is driven by a necessity to expedite the availability of current best evidence for policy and clinical decision-making. We developed Rayyan ( http://rayyan.qcri.org ), a free web and mobile app, that helps expedite the initial screening of abstracts and titles using a process of semi-automation while incorporating a high level of usability. For the beta testing phase, we used two published Cochrane reviews in which included studies had been selected manually. Their searches, with 1030 records and 273 records, were uploaded to Rayyan. Different features of Rayyan were tested using these two reviews. We also conducted a survey of Rayyan's users and collected feedback through a built-in feature. RESULTS: Pilot testing of Rayyan focused on usability, accuracy against manual methods, and the added value of the prediction feature. The "taster" review (273 records) allowed a quick overview of Rayyan for early comments on usability. The second review (1030 records) required several iterations to identify the previously identified 11 trials. The "suggestions" and "hints," based on the "prediction model," appeared as testing progressed beyond five included studies. Post rollout user experiences and a reflexive response by the developers enabled real-time modifications and improvements. The survey respondents reported 40% average time savings when using Rayyan compared to others tools, with 34% of the respondents reporting more than 50% time savings. 
In addition, around 75% of the respondents mentioned that screening and labeling studies as well as collaborating on reviews to be the two most important features of Rayyan. As of November 2016, Rayyan users exceed 2000 from over 60 countries conducting hundreds of reviews totaling more than 1.6M citations. Feedback from users, obtained mostly through the app web site and a recent survey, has highlighted the ease in exploration of searches, the time saved, and simplicity in sharing and comparing include-exclude decisions. The strongest features of the app, identified and reported in user feedback, were its ability to help in screening and collaboration as well as the time savings it affords to users. CONCLUSIONS: Rayyan is responsive and intuitive in use with significant potential to lighten the load of reviewers.

Study · Top journal · Moderate

Inference and analysis of cell-cell communication using CellChat

Suoqin Jin, Christian F. Guerrero‐Juarez, Lihua Zhang +6 more · Nature Communications · 2021 · 8,112 citations

Understanding global communications among cells requires accurate representation of cell-cell signaling links and effective systems-level analyses of those links. We construct a database of interactions among ligands, receptors and their cofactors that accurately represent known heteromeric molecular complexes. We then develop CellChat, a tool that is able to quantitatively infer and analyze intercellular communication networks from single-cell RNA-sequencing (scRNA-seq) data. CellChat predicts major signaling inputs and outputs for cells and how those cells and signals coordinate for functions using network analysis and pattern recognition approaches. Through manifold learning and quantitative contrasts, CellChat classifies signaling pathways and delineates conserved and context-specific pathways across different datasets. Applying CellChat to mouse and human skin datasets shows its ability to extract complex signaling patterns. Our versatile and easy-to-use toolkit CellChat and a web-based Explorer ( http://www.cellchat.org/ ) will help discover novel intercellular communications and build cell-cell communication atlases in diverse tissues.

Study · Moderate

PsychoPy2: Experiments in behavior made easy

Jonathan W. Peirce, Jeremy Gray, Sol Simpson +5 more · Behavior Research Methods · 2019 · 5,318 citations

PsychoPy is an application for the creation of experiments in behavioral science (psychology, neuroscience, linguistics, etc.) with precise spatial control and timing of stimuli. It now provides a choice of interface; users can write scripts in Python if they choose, while those who prefer to construct experiments graphically can use the new Builder interface. Here we describe the features that have been added over the last 10 years of its development. The most notable addition has been that Builder interface, allowing users to create studies with minimal or no programming, while also allowing the insertion of Python code for maximal flexibility. We also present some of the other new features, including further stimulus options, asynchronous time-stamped hardware polling, and better support for open science and reproducibility. Tens of thousands of users now launch PsychoPy every month, and more than 90 people have contributed to the code. We discuss the current state of the project, as well as plans for the future.

Observational · High evidence score

GSVA: gene set variation analysis for microarray and RNA-Seq data

Sonja Hänzelmann, Robert Castelo, Justin Guinney · BMC Bioinformatics · 2013 · 16,300 citations

BACKGROUND: Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets. RESULTS: To address this challenge, we introduce Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. We demonstrate the robustness of GSVA in a comparison with current state of the art sample-wise enrichment methods. Further, we provide examples of its utility in differential pathway activity and survival analysis. Lastly, we show how GSVA works analogously with data from both microarray and RNA-seq experiments. CONCLUSIONS: GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point to build pathway-centric models of biology. Moreover, GSVA contributes to the current need of GSE methods for RNA-seq data. GSVA is an open source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org.

Study · Moderate

Reporting animal research: Explanation and elaboration for the ARRIVE guidelines 2.0

Nathalie Percie du Sert, Amrita Ahluwalia, Sabina Alam +25 more · PLoS Biology · 2020 · 2,747 citations

Improving the reproducibility of biomedical research is a major challenge. Transparent and accurate reporting is vital to this process; it allows readers to assess the reliability of the findings and repeat or build upon the work of other researchers. The ARRIVE guidelines (Animal Research: Reporting In Vivo Experiments) were developed in 2010 to help authors and journals identify the minimum information necessary to report in publications describing in vivo experiments. Despite widespread endorsement by the scientific community, the impact of ARRIVE on the transparency of reporting in animal research publications has been limited. We have revised the ARRIVE guidelines to update them and facilitate their use in practice. The revised guidelines are published alongside this paper. This explanation and elaboration document was developed as part of the revision. It provides further information about each of the 21 items in ARRIVE 2.0, including the rationale and supporting evidence for their inclusion in the guidelines, elaboration of details to report, and examples of good reporting from the published literature. This document also covers advice and best practice in the design and conduct of animal studies to support researchers in improving standards from the start of the experimental design process through to publication.

Study · Moderate

The MR-Base platform supports systematic causal inference across the human phenome

Gibran Hemani, Jie Zheng, Benjamin Elsworth +17 more · eLife · 2018 · 8,325 citations

Results from genome-wide association studies (GWAS) can be used to infer causal relationships between phenotypes, using a strategy known as 2-sample Mendelian randomization (2SMR) and bypassing the need for individual-level data. However, 2SMR methods are evolving rapidly and GWAS results are often insufficiently curated, undermining efficient implementation of the approach. We therefore developed MR-Base (http://www.mrbase.org): a platform that integrates a curated database of complete GWAS results (no restrictions according to statistical significance) with an application programming interface, web app and R packages that automate 2SMR. The software includes several sensitivity analyses for assessing the impact of horizontal pleiotropy and other violations of assumptions. The database currently comprises 11 billion single nucleotide polymorphism-trait associations from 1673 GWAS and is updated on a regular basis. Integrating data with software ensures more rigorous application of hypothesis-driven analyses and allows millions of potential causal relationships to be efficiently evaluated in phenome-wide association studies.

Study · Top journal · Moderate

Online University Teaching During and After the Covid-19 Crisis: Refocusing Teacher Presence and Learning Activity

Chrysi Rapanta, Luca Botturi, Peter Goodyear +2 more · Postdigital Science and Education · 2020 · 2,034 citations

The Covid-19 pandemic has raised significant challenges for the higher education community worldwide. A particular challenge has been the urgent and unexpected request for previously face-to-face university courses to be taught online. Online teaching and learning imply a certain pedagogical content knowledge (PCK), mainly related to designing and organising for better learning experiences and creating distinctive learning environments, with the help of digital technologies. With this article, we provide some expert insights into this online-learning-related PCK, with the goal of helping non-expert university teachers (i.e. those who have little experience with online learning) to navigate in these challenging times. Our findings point at the design of learning activities with certain characteristics, the combination of three types of presence (social, cognitive and facilitatory) and the need for adapting assessment to the new learning requirements. We end with a reflection on how responding to a crisis (as best we can) may precipitate enhanced teaching and learning practices in the postdigital era.

Meta-analysis · High evidence score

Clinical Trial Designs

Brijesh Nair · Indian Dermatology Online Journal · 2019 · 141 citations

“To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination - he may be able to say what the experiment dies of.” R.A. Fisher The delivery of an intervention whether drug, a dietary change, a lifestyle change, or a psychological therapy session counts as an intervention and hence must be dealt as a clinical trial [Figure 1]. Clinical trial design is an important aspect of interventional trials that serves to optimize, ergonomise and economize the clinical trial conduct. The purpose of the clinical trial is assessment of efficacy, safety, or risk benefit ratio. Goal may be superiority, non-inferiority, or equivalence. A well-conducted study with a good design based on a robust hypothesis evolved from clinical practice goes a long way in facilitating the implementation of the best tenets of evidence-based practice. A robust well-powered trial adds to the meta-analyzable evidence base and contributes huge quanta to our knowledge of dermatological practice. This article sets out to describe the various trial designs and modifications and attempts to delineate the pros and cons of each design and attempts to provide illustrative samples for the same where possible.Figure 1: Basic framework of clinical trialsUncontrolled Trials This design incorporates no control arm. This design is usually utilized to determine pharmacokinetic properties of a new drug (Phase 1 trials). Uncontrolled trials are known to produce greater mean effect estimates than a controlled trial, thereby inflating the expectations from the intervention. There is a threat of inherent bias and results are considered less valid than RCT. Another issue is use of this design in spontaneously resolving maladies that might again overstate the effect [Figure 2].Figure 2: Single arm trial schematicIllustrative example In immunotherapy in warts, it is imperative to avoid an uncontrolled study. 
Warts can be self-resolving and hence the efficacy of immunotherapy as opposed to the self-resolution compromises the validity of the results. Control Arm Options in Controlled Trials Controlled trials allow discrimination of the patient outcome from an outcome caused by other factors (such as natural history or observer or patient expectation). Choosing a right control at the right dose and right frequency is pivotal to trial success. The controls which can be used are: Placebo concurrent control – Placebo is a form of inert substance, or an intervention designed to simulate medical therapy, without specificity for the condition being treated. The placebo must share the same appearance, frequency, and formulation as the active drug. Placebo control helps to discriminate outcomes due to intervention (new product) from outcomes due to other factors. This design is used to demonstrate superiority or equivalence. This design must be adopted only when no effective treatment exits, and it will be deemed unethical to use a placebo control if an effective standard of care exits. Placebo must only be used if no permanent harm (death or irreversible morbidity) accrues by delaying available active treatment for the duration of the trial and is preferable for a minimal risk, short-term study “No treatment” concurrent control – No intervention will be administered in control arm in this design. Study end points must be objective in this design. The downsides are potential for observer bias and difficulty in blinding in this design Active treatment concurrent control – This design involves comparison of a new drug to a standard drug or compare combination of new and standard therapies vis a vis standard therapy alone. A therapeutic modality that should preferably be the current standard of care against which the active drug to be studied is compared with. This design can be used to demonstrate equivalence, non-inferiority, and superiority. 
This design is most ethical whenever approved drugs are available for the disease under study. The Declaration of Helsinki mandates the use of standard treatment as controls.

Dose-comparison concurrent control: Different doses or regimens of the same treatment are used as the active arm and the control arm. The purpose is to establish a relationship between dose and the efficacy or safety of the intervention. This design may also include active and placebo groups in addition to the different dose groups. It may be inefficient if the therapeutic range of the drug is not known.

Historical control (external and non-concurrent): The controls come from outside the present study and were treated at an earlier time (with an earlier therapeutic gold standard) or in a different setting. The advantage of historical controls lies in studying rare conditions where an adequate sample size is difficult to recruit. The downside is that no randomization or blinding is possible. Co-interventions also evolve over time, which reduces the comparability of the present intervention with the historical control, and baseline characteristics may differ between subjects in the trial arm and those in the historical arm. For example, in toxic epidermal necrolysis, clinical outcomes in cyclosporine-treated patients can be compared with historical controls treated with IVIg at the same center in the past.

Variants of Placebo Controlled Trial Designs

Add-on design: This design denotes a placebo-controlled comparison on top of a standard treatment given to all patients. If the improvement achievable in addition to that obtained from the standard treatment is small, such a trial may need to be very large.

Early escape design: The early escape design using a placebo control allows a patient to be withdrawn from the study as soon as a predefined negative efficacy criterion has been attained.
This reduces the time on placebo or in treatment failure. Because the design analyses the failure rate, it minimizes exposure to ineffective treatment. The time to withdrawal is then used as the primary outcome variable. The patient can then be switched to another therapy, including the test treatment if appropriate. The attendant limitations are a loss of study power as “escape” cases increase and evaluation of only short-term efficacy. If the drug has a slow effect that emerges only with long-term use, this design might miss it.

Unbalanced assignment of patients to placebo and test treatment: In this design a smaller number of patients is assigned to the placebo group than to the test treatment group (e.g., 2/3 case arm, 1/3 placebo arm).

Double-dummy design [Figure 3]: This design is of great utility when the comparator interventions are of a different nature.

Illustrative example: Comparison of oral acitretin versus injection purified protein derivative (PPD) in extensive verruca vulgaris. Because one treatment is oral and the other is injected, blinding of patients is not straightforward. This can be circumvented by administering oral acitretin with a dummy injection such as normal saline to one study group, and injection PPD along with a placebo capsule identical in size and appearance to the acitretin capsule to the comparator arm.

Placebo run-in design: A placebo run-in period is a period before a clinical trial commences during which placebo is administered to all study subjects. The clinical data from this stage are only occasionally of value, but the run-in can serve a valuable role in screening out ineligible or non-compliant participants, in ensuring that participants are in a stable condition, and in providing baseline observations. After the run-in phase, patients are randomized into study arms in which different active interventions are added to the placebo [Figure 4].
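The unbalanced assignment described above (for example 2/3 of patients to the test arm and 1/3 to placebo) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from the article: it assumes patients are allocated in shuffled blocks of three so that the 2:1 ratio holds throughout recruitment, and the function name `unbalanced_allocation` and the arm labels are hypothetical.

```python
import random


def unbalanced_allocation(n_patients, rng, block=("test", "test", "placebo")):
    """Build an allocation list in shuffled blocks of three (2 test : 1 placebo).

    Assumption: block-wise shuffling is used here only to keep the ratio
    stable over time; the source text does not prescribe a mechanism.
    """
    schedule = []
    while len(schedule) < n_patients:
        current = list(block)
        rng.shuffle(current)  # random order within each block
        schedule.extend(current)
    return schedule[:n_patients]


rng = random.Random(2024)  # fixed seed so the schedule is reproducible
schedule = unbalanced_allocation(12, rng)
print(schedule.count("test"), schedule.count("placebo"))  # prints: 8 4
```

With a number of patients that is a multiple of three, the arms end up in an exact 2:1 ratio; otherwise the last, incomplete block can leave the ratio slightly off.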
Figure 3: Double dummy trial design

Figure 4: Run in design

Randomized Clinical Trials (RCT)

In randomized controlled trials, trial participants are randomly assigned to either treatment or control arms. The process of randomly assigning a trial participant to a treatment or control arm is called “randomization”. Different tools can be used to randomize (closed envelopes, computer-generated sequences, random numbers). There are two components to randomization: the generation of a random sequence and the implementation of that sequence, ideally in a way that keeps participants unaware of it (allocation concealment). Randomization removes the potential for systematic error or bias. The biggest upside of an RCT is that it balances both the known and the unknown confounding factors that would otherwise lead to wrong conclusions.

Randomization Schemes in Randomized Controlled Trials to Eliminate Confounding Factors

Stratified randomization: Strata are constructed based on the values of prognostic variables, and a randomization scheme is implemented separately within each stratum. The objective is to ensure balance of the treatment groups with respect to the various combinations of the prognostic variables, that is, balance among groups in terms of subjects' baseline characteristics (covariates). Specific covariates must be identified by a researcher who understands the potential influence each covariate has on the dependent variable. To avoid strata with very few patients, the number of strata should be kept to a minimum. After all the subjects have been identified and assigned to strata, simple randomization is performed within each stratum to assign subjects to either case or control groups.

Block randomization: Blocking is the arranging of experimental units in groups (blocks) that are similar to one another.
Typically, a blocking factor is a source of variability that is not of primary interest to the experimenter. An example of a blocking factor might be the sex of a patient; by blocking on sex, this source of variability is controlled for, leading to greater accuracy. The block randomization method is designed to randomize subjects into groups of equal sample size and to keep the sample sizes balanced across groups over time. Blocks are small and balanced with predetermined group assignments, which keeps the number of subjects in each group similar at all times.

Randomization by body halves or paired organs (Split Body trials): This design is used most often in dermatology and ophthalmic practice: one intervention is administered to one half of the body and the comparator intervention to the other half. It can be implemented only if the experimental treatment acts locally. Randomization is used to select which side of the body receives which drug. The upside is the elimination of confounding factors between trial arms, as the baseline characteristics of both arms are the same. The downsides are difficulty in blinding the investigator, difficulty in statistical analysis, and the possibility that therapy administered to one half of the body influences disease on the other side, as the halves of the human body are a continuum and not entirely independent entities (carryover of the experimental treatment to the control half). Allocation between paired organs or split skin can obscure systemic adverse events. Paired-data statistical tests need to be used in this scenario.

Cluster randomization: Study patients and treating interventionists do not exist in isolation. Sometimes interventions need to be applied at ward level, village level, hospital level, or group practice level. Hence the intervention is administered to clusters by randomization to prevent contamination.
Each cluster forms a unit of the trial, and either the active or the comparator intervention is administered to each cluster.

Allocation by randomized consent (Zelen trials): Eligible patients are allocated to one of the two trial arms prior to informed consent. This is utilized when the informed consent process acts as an impediment to study subject accrual. However, this design raises serious ethical uncertainties; it should be used only for trials of great public health importance that are failing to reach an adequate sample size, and it is not recommended in routine clinical trial design.

Minimization: Stratification based on multiple covariates (age, sex, gender, baseline severity of disease, personal habits, co-morbidities, treatment naivety, etc.) leads to an excessive number of strata, at times with a small number of patients in each stratum. An alternative strategy that controls for prognostic variables while avoiding such small strata is minimization. After these variables are identified, they are dichotomized, at some break point in the case of continuous variables or on the presence or absence of a categorical variable. Each dichotomized half is then given a value of 0 or 1 (e.g., male = 0, female = 1; age <50 years = 0, age >50 years = 1). Thus, for a female of age 55, the total will be 1 + 1 = 2. A male of age 65 will be allocated 0 + 1 = 1 point, a female of age 45 will have a score of 1 + 0 = 1 point, and so on. For example, patient number 1, with a score of 2, is randomized to the control arm. Patient number 2 has 1 point and, to minimize the difference in total scores between the study arms, is allocated to the case arm. The control arm total is now 2 and the case arm total is 1. Patient 3 is a female with a score of 1 and is allocated to the case arm, so the cumulative score in both groups is balanced at 2 points. Once the running scores in both arms are tied, the next recruited subject is again randomly allocated and the whole cycle repeats.
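The scoring walk-through above can be traced in code. The following Python sketch follows only the simplified summed-score scheme described in the text; minimization as usually implemented balances each prognostic factor separately, which this sketch does not attempt. The function names `covariate_score` and `allocate_by_minimization` are hypothetical, and because the text leaves age exactly 50 undefined, the sketch assumes it scores 0.

```python
import random


def covariate_score(is_female: bool, age: int) -> int:
    """Sum of dichotomized covariates: male = 0, female = 1; age > 50 scores 1.

    Assumption: age exactly 50 scores 0 (the text does not define this case).
    """
    return int(is_female) + int(age > 50)


def allocate_by_minimization(scores, rng):
    """Allocate patients one at a time, keeping running arm totals close.

    When the totals are tied the arm is chosen at random; otherwise the
    patient goes to the arm with the lower running total.
    """
    totals = {"case": 0, "control": 0}
    arms = []
    for score in scores:
        if totals["case"] == totals["control"]:
            arm = rng.choice(["case", "control"])
        else:
            arm = min(totals, key=totals.get)
        totals[arm] += score
        arms.append(arm)
    return arms, totals


# The three patients from the text: female aged 55, male aged 65, female aged 45.
patients = [(True, 55), (False, 65), (True, 45)]
scores = [covariate_score(is_female, age) for is_female, age in patients]
arms, totals = allocate_by_minimization(scores, random.Random(7))
print(scores, arms, totals)
```

Whichever arm the first patient is randomly placed in, the second and third patients both go to the other arm and the running totals end tied at 2 points each, matching the worked example.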
Thus, minimization is a viable alternative to randomization for known prognostic factors, but does not factor in the unknown prognostic confounding variables. Hence, it can be considered a platinum standard to the gold standard of random allocation. RCT Designs a. Parallel group trial design Parallel arm design is the most commonly used study design. In this design, subjects are randomized to one or more study arms and each study arm will be allocated a different intervention. After randomization each participant will stay in their assigned treatment arm for the duration of the study [Figure 5]. Parallel group design can be applied to and allows running in a number of and groups can be in The randomized patients in groups should not the other group by co-interventions or Parallel arm in a arm trial design of study subject into study after consent Randomization Allocation to either test drug arm or control arm example – A trial of and in where clinical as efficacy can be as a randomized controlled as active arm trial design. over design Another advantage is of a smaller sample size [Figure The ethical limitations of a placebo control are by a over design in which each patient receives both interventions but in a different The in which patient receives interventions is Each serves as control results in balancing the covariates in treatment and control arms. Another advantage is of a smaller sample In this design, some participants with drug A and then to drug in one trial arm, subjects in other trial arm with drug and then to drug A period must be given before to the of the administered intervention. are then compared within the same subject of A effect of The are two The disease must be and and characteristics must not for the duration of the two study and the out period and the effect of each drug must not be and usually a over design. 
The duration of for the patient is than for a design, and is a risk that a number of patients do not the study and out leading to study points over trials are in over trial 1: to be in over of designs design – A A in one arm versus A in other arm. The and multiple designs are of with the of where and of a to a can only be by such trial of 1 design – of 1 trials or or randomized controlled trial are used to all interventions in a A patient trial of treatment a number of The of treatment is randomly assigned within each treatment period the primary objective of such a trial is to determine the treatment for the patient and this design is in The advantage of this design is such that it can be a can be for the subject being The utility also in that in different subjects. from 1 subjects can be to effect by or 2 This is a design for the study of two or more interventions in various combinations in one study and helps in the study of from combination of use versus of and different dose of one can be studied as in Figure This design can two or more with one trial and more for the with sample sizes. 
In a 2 2 design with patients are randomized into to treatment A treatment both A and or of placebo are using of all patients who treatment A 1 and with not treated with A 2 and and all patients who treatment 2 and with not treated with 1 and The sample size reduces by as compared to out drug A and drug comparison with placebo in 2 different However, a is that is no between A and If exits, then of power is possible in case of analyses of the different If an is then that has to be into the sample size in addition to sample Hence, it is not for rare where between A and are The limitations of this trial design are of trial, difficulty in of both drugs study subject to two and statistical designs are used when it is deemed unethical to a and the placebo only arm is Randomized withdrawal design randomized withdrawal In this design, after an period which all subjects are assigned to the are from the trial and the are randomized to intervention or placebo in the of the only are and Study is using only data from the withdrawal and outcome is usually of The statistical power for a given sample The randomized withdrawal design to the duration of a treatment in patients who to the treatment. The advantage is in the time on placebo only are randomized to placebo thereby an ethical This to trial subjects and hence This design can if treatment to be or can be This design allows subjects who have therapy withdrawn arm to effective therapy after they have the study (e.g., of thereby both ethical and patient placebo assignment by the time that patients are on placebo [Figure withdrawal of are on a and a group of patients at the are from the trial due to of efficacy. are on the drug or are assigned to placebo and of at 1 after randomization can be compared between two arms. 
If is no difference in outcomes between the arms, an can be administered is and then thereby reducing of The of this design are data due to that can be by or using time to over from to can external validity of results. The treatment effect is only are for (e.g., or with slow can be used only for stable duration of is Study Designs a. randomization the the This is only for with outcomes and are most when the effect size being is The and the designs to the group with the best of by the of patients being randomized to that The of being randomized to one group or another is to the results obtained with patients. The of each patient after treatment an role in the of of the study In the design, more study subjects are randomized to the effective intervention. In the study subjects are from the ineffective intervention arm. The advantage is exposure of subjects to an effective intervention and increased of and this can also result in group which can statistical The can be based on the which can be in select trial 2: Factors to in design 3: designs used in clinical trials – This design reduces patient to ineffective intervention arms. 
available is in and therapies with a time to This design can and result in bias as trial can also be by in patient or treatment characteristics over time that are with the treatment from inherent of and of this design has subjects randomized to interventions and The best therapy from 1 is compared with placebo in a randomized or design in The comparison is between all subjects the intervention versus all the subjects placebo in both is best for multiple intervention comparison in sample size However, is a that wrong of the most therapy in will the trial results design – This design allows and end point of efficacy, or is In to trials, the number of participants for a trial is unknown at The trial at the which thereby the number of subjects to an or treatment or one that is can be performed after each patient or after a or number of patients This design is only effective when study is to be and treatment outcomes soon after so that outcomes can be before the next patient or group of patients is to be include the of multiple and of and number of design There is also interest in a of continuous trial process by different trial from to called design. the stage of trials and stage of In the of subjects are randomized into the treatment arms of therapy of A and or control. 
An is then performed to determine which active arm should be In the stage of the treatment groups with the effective active arm and control arm will be This design has two and In the the subjects will their treatment arm from to phase, and the data in both will be For the the data in two are design In a participants are often ineligible for along with cases in due to and patients are in number as in case of rare to a study than the study could be as a In an the of the study is a and the study is this sample size is and incorporates the subjects In to external can be as they do not patients and do not time or of Trial Designs is no study design for each trial an is in 2 and 4: for of study in statistical and power to allow for more study designs and and should best advantage of these There is a of evidence generated by alternative trial designs by and is imperative to that same may be alternative designs and that is no trial design for The time and of study subjects are to such as objective of the trial, number of patients of trial, and the variability is could be important in the of the most trial design. The are of the that no trial design is and no design to all In this all the must the to study the most design among a of and they must in trial design and and of interest There are no of

Study · Moderate

Gorilla in our midst: An online behavioral experiment builder

Alexander Leslie Anwyl-Irvine, Jessica Massonnié, Adam Flitton +2 more · Behavior Research Methods · 2019 · 1,905 citations

Behavioral researchers are increasingly conducting their studies online, to gain access to large and diverse samples that would be difficult to get in a laboratory environment. However, there are technical access barriers to building experiments online, and web browsers can present problems for consistent timing, an important issue with reaction-time-sensitive measures. For example, to ensure accuracy and test-retest reliability in presentation and response recording, experimenters need a working knowledge of programming languages such as JavaScript. We review some of the previous and current tools for online behavioral research, as well as how well they address the issues of usability and timing. We then present the Gorilla Experiment Builder (gorilla.sc), a fully tooled experiment authoring and deployment platform, designed to resolve many timing issues and make reliable online experimentation open and accessible to a wider range of technical abilities. To demonstrate the platform's aptitude for accessible, reliable, and scalable research, we administered a task with a range of participant groups (primary school children and adults), settings (without supervision, at home, and under supervision, in both schools and public engagement events), equipment (participant's own computer, computer supplied by the researcher), and connection types (personal internet connection, mobile phone 3G/4G). We used a simplified flanker task taken from the attentional network task (Rueda, Posner, & Rothbart, 2004). We replicated the "conflict network" effect in all these populations, demonstrating the platform's capability to run reaction-time-sensitive experiments. Unresolved limitations of running experiments online are then discussed, along with potential solutions and some future features of the platform.

Study · Top journal · Moderate

Massively parallel digital transcriptional profiling of single cells

Grace Zheng, Jessica M. Terry, Phillip Belgrader +30 more · Nature Communications · 2017 · 7,718 citations

Characterizing the transcriptome of individual cells is fundamental to understanding complex biological systems. We describe a droplet-based system that enables 3' mRNA counting of tens of thousands of single cells per sample. Cell encapsulation, of up to 8 samples at a time, takes place in ∼6 min, with ∼50% cell capture efficiency. To demonstrate the system's technical performance, we collected transcriptome data from ∼250k single cells across 29 samples. We validated the sensitivity of the system and its ability to detect rare populations using cell lines and synthetic RNAs. We profiled 68k peripheral blood mononuclear cells to demonstrate the system's ability to characterize large immune populations. Finally, we used sequence variation in the transcriptome data to determine host and donor chimerism at single-cell resolution from bone marrow mononuclear cells isolated from transplant patients.

Study · Top journal · Wiki · Moderate

A national experiment reveals where a growth mindset improves achievement

David S. Yeager, Paul Hanselman, Gregory M. Walton +22 more · Nature · 2019 · 1,424 citations

A single, 50-minute online session teaching that intelligence can be developed raised lower-achieving students' grades by 0.11 grade points (roughly a 3% improvement) and increased advanced math enrolment by 5 percentage points, but only in schools where peer norms already supported challenge-seeking.

Study · Moderate

Enrichr: a comprehensive gene set enrichment analysis web server 2016 update

Maxim V. Kuleshov, Matthew R. Jones, Andrew D. Rouillard +11 more · Nucleic Acids Research · 2016 · 11,578 citations

Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180,184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.

Study · Moderate

ImageJ2: ImageJ for the next generation of scientific image data

Curtis Rueden, Johannes Schindelin, Mark Hiner +4 more · BMC Bioinformatics · 2017 · 6,205 citations

BACKGROUND: ImageJ is an image analysis program extensively used in the biological sciences and beyond. Due to its ease of use, recordable macro language, and extensible plug-in architecture, ImageJ enjoys contributions from non-programmers, amateur programmers, and professional developers alike. Enabling such a diversity of contributors has resulted in a large community that spans the biological and physical sciences. However, a rapidly growing user base, diverging plugin suites, and technical limitations have revealed a clear need for a concerted software engineering effort to support emerging imaging paradigms, to ensure the software's ability to handle the requirements of modern science. RESULTS: We rewrote the entire ImageJ codebase, engineering a redesigned plugin mechanism intended to facilitate extensibility at every level, with the goal of creating a more powerful tool that continues to serve the existing community while addressing a wider range of scientific requirements. This next-generation ImageJ, called "ImageJ2" in places where the distinction matters, provides a host of new functionality. It separates concerns, fully decoupling the data model from the user interface. It emphasizes integration with external applications to maximize interoperability. Its robust new plugin framework allows everything from image formats, to scripting languages, to visualization to be extended by the community. The redesigned data model supports arbitrarily large, N-dimensional datasets, which are increasingly common in modern image acquisition. Despite the scope of these changes, backwards compatibility is maintained such that this new functionality can be seamlessly integrated with the classic ImageJ interface, allowing users and developers to migrate to these new methods at their own pace. CONCLUSIONS: Scientific imaging benefits from open-source programs that advance new method development and deployment to a diverse audience. 
ImageJ has continuously evolved with this idea in mind; however, new and emerging scientific requirements have posed corresponding challenges for ImageJ's development. The described improvements provide a framework engineered for flexibility, intended to support these requirements as well as accommodate future needs. Future efforts will focus on implementing new algorithms in this framework and expanding collaborations with other popular scientific software suites.

Study · Moderate

The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update

Enis Afgan, Dannon Baker, Bérénice Batut +17 more · Nucleic Acids Research · 2018 · 3,908 citations

Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven biomedical science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be reused and extended. During the last two years, the Galaxy team and the open-source community around Galaxy have made substantial improvements to Galaxy's core framework, user interface, tools, and training materials. Framework and user interface improvements now enable Galaxy to be used for analyzing tens of thousands of datasets, and >5500 tools are now available from the Galaxy ToolShed. The Galaxy community has led an effort to create numerous high-quality tutorials focused on common types of genomic analyses. The Galaxy developer and user communities continue to grow and be integral to Galaxy's development. The number of Galaxy public servers, developers contributing to the Galaxy framework and its tools, and users of the main Galaxy server have all increased substantially.

Study · Moderate

deepTools2: a next generation web server for deep-sequencing data analysis

Fidel Ramírez, Devon Ryan, Björn Grüning +6 more · Nucleic Acids Research · 2016 · 8,928 citations

We present an update to our Galaxy-based web server for processing and visualizing deeply sequenced data. Its core tool set, deepTools, allows users to perform complete bioinformatic workflows ranging from quality controls and normalizations of aligned reads to integrative analyses, including clustering and visualization approaches. Since we first described our deepTools Galaxy server in 2014, we have implemented new solutions for many requests from the community and our users. Here, we introduce significant enhancements and new tools to further improve data visualization and interpretation. deepTools continues to be open to all users and freely available as a web service at deeptools.ie-freiburg.mpg.de. The new deepTools2 suite can be easily deployed within any Galaxy framework via the toolshed repository, and we also provide source code for command line usage under Linux and Mac OS X. A public and documented API for access to deepTools functionality is also available.

Study · Moderate

The Ensembl Variant Effect Predictor

William McLaren, Laurent Gil, Sarah Hunt +5 more · Genome biology · 2016 · 8,659 citations

The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.

Study · Moderate

The STRING database in 2017: quality-controlled protein–protein association networks, made broadly accessible

Damian Szklarczyk, John H. Morris, Helen Cook +9 more · Nucleic Acids Research · 2016 · 7,429 citations

A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein-protein interactions, and importing known pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer of interaction knowledge between organisms based on gene orthology. In the latest version 10.5 of STRING, the biggest changes are concerned with data dissemination: the web frontend has been completely redesigned to reduce dependency on outdated browser technologies, and the database can now also be queried from inside the popular Cytoscape software framework. Further improvements include automated background analysis of user inputs for functional enrichments, and streamlined download options. The STRING resource is available online, at http://string-db.org/.

Study · Moderate

TrackMate: An open and extensible platform for single-particle tracking

Jean-Yves Tinévez, Nick Perry, Johannes Schindelin +6 more · Methods · 2016 · 3,713 citations

We present TrackMate, an open source Fiji plugin for the automated, semi-automated, and manual tracking of single particles. It offers a versatile and modular solution that works out of the box for end users, through a simple and intuitive user interface. It is also easily scriptable and adaptable, operating equally well on 1D over time, 2D over time, 3D over time, or other single and multi-channel image variants. TrackMate provides several visualization and analysis tools that aid in assessing the relevance of results. The utility of TrackMate is further enhanced through its ability to be readily customized to meet specific tracking problems. TrackMate is an extensible platform where developers can easily write their own detection, particle linking, visualization or analysis algorithms within the TrackMate environment. This evolving framework provides researchers with the opportunity to quickly develop and optimize new algorithms based on existing TrackMate modules without having to write de novo user interfaces, including visualization, analysis and exporting tools. The current capabilities of TrackMate are presented in the context of three different biological problems. First, we perform Caenorhabditis elegans lineage analysis to assess how light-induced damage during imaging impairs its early development. Our TrackMate-based lineage analysis indicates the lack of a cell-specific light-sensitive mechanism. Second, we investigate the recruitment of NEMO (NF-κB essential modulator) clusters in fibroblasts after stimulation by the cytokine IL-1 and show that photodamage can generate artifacts in the shape of TrackMate characterized movements that confuse motility analysis. Finally, we validate the use of TrackMate for quantitative lifetime analysis of clathrin-mediated endocytosis in plant cells.

Study · Moderate

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie, Belinda Phipson, Di Wu +4 more · Nucleic Acids Research · 2015 · 42,631 citations

limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Study · Moderate

SCANPY: large-scale single-cell gene expression data analysis

F. Alexander Wolf, Philipp Angerer, Fabian J. Theis · Genome Biology · 2018 · 9,151 citations

Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with Scanpy, we present AnnData, a generic class for handling annotated data matrices (https://github.com/theislab/anndata).

Study · Top journal · Moderate

Integrative analysis of 111 reference human epigenomes

Anshul Kundaje, Wouter Meuleman, Jason Ernst +92 more · Nature · 2015 · 7,081 citations

The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

Study · Moderate

A survey on Image Data Augmentation for Deep Learning

Connor Shorten, Taghi M. Khoshgoftaar · Journal of Big Data · 2019 · 12,157 citations

Deep convolutional neural networks have performed remarkably well on many Computer Vision tasks. However, these networks are heavily reliant on big data to avoid overfitting. Overfitting refers to the phenomenon in which a network learns a function with very high variance such as to perfectly model the training data. Unfortunately, many application domains do not have access to big data, such as medical image analysis. This survey focuses on Data Augmentation, a data-space solution to the problem of limited data. Data Augmentation encompasses a suite of techniques that enhance the size and quality of training datasets such that better Deep Learning models can be built using them. The image augmentation algorithms discussed in this survey include geometric transformations, color space augmentations, kernel filters, mixing images, random erasing, feature space augmentation, adversarial training, generative adversarial networks, neural style transfer, and meta-learning. The application of augmentation methods based on GANs is heavily covered in this survey. In addition to augmentation techniques, this paper will briefly discuss other characteristics of Data Augmentation such as test-time augmentation, resolution impact, final dataset size, and curriculum learning. This survey will present existing methods for Data Augmentation, promising developments, and meta-level decisions for implementing Data Augmentation. Readers will understand how Data Augmentation can improve the performance of their models and expand limited datasets to take advantage of the capabilities of big data.

RCT · High evidence score

A roadmap to using randomization in clinical trials

Vance W. Berger, Louis Bour, Kerstine Carter +12 more · BMC Medical Research Methodology · 2021 · 126 citations

BACKGROUND: Randomization is the foundation of any clinical trial involving treatment comparison. It helps mitigate selection bias, promotes similarity of treatment groups with respect to important known and unknown confounders, and contributes to the validity of statistical tests. Various restricted randomization procedures with different probabilistic structures and different statistical properties are available. The goal of this paper is to present a systematic roadmap for the choice and application of a restricted randomization procedure in a clinical trial. METHODS: We survey available restricted randomization procedures for sequential allocation of subjects in a randomized, comparative, parallel group clinical trial with equal (1:1) allocation. We explore statistical properties of these procedures, including balance/randomness tradeoff, type I error rate and power. We perform head-to-head comparisons of different procedures through simulation under various experimental scenarios, including cases when common model assumptions are violated. We also provide some real-life clinical trial examples to illustrate the thinking process for selecting a randomization procedure for implementation in practice. RESULTS: Restricted randomization procedures targeting 1:1 allocation vary in the degree of balance/randomness they induce, and more importantly, they vary in terms of validity and efficiency of statistical inference when common model assumptions are violated (e.g. when outcomes are affected by a linear time trend; measurement error distribution is misspecified; or selection bias is introduced in the experiment). Some procedures are more robust than others. Covariate-adjusted analysis may be essential to ensure validity of the results. Special considerations are required when selecting a randomization procedure for a clinical trial with very small sample size. 
CONCLUSIONS: The choice of randomization design, data analytic technique (parametric or nonparametric), and analysis strategy (randomization-based or population model-based) are all very important considerations. Randomization-based tests are robust and valid alternatives to likelihood-based tests and should be considered more frequently by clinical investigators.

Study · Moderate

scikit-image: image processing in Python

Stéfan van der Walt, Johannes L. Schönberger, Juan Nunez-Iglesias +5 more · PeerJ · 2014 · 6,862 citations

scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications. It is released under the liberal Modified BSD open source license, provides a well-documented API in the Python programming language, and is developed by an active, international team of collaborators. In this paper we highlight the advantages of open source to achieve the goals of the scikit-image library, and we showcase several real-world image processing applications that use scikit-image. More information can be found on the project homepage, http://scikit-image.org.

Study · Moderate

Commentary: The Materials Project: A materials genome approach to accelerating materials innovation

Anubhav Jain, Shyue Ping Ong, Geoffroy Hautier +8 more · APL Materials · 2013 · 12,469 citations

Accelerating the discovery of advanced materials is essential for human welfare and sustainable, clean energy. In this paper, we introduce the Materials Project (www.materialsproject.org), a core program of the Materials Genome Initiative that uses high-throughput computing to uncover the properties of all known inorganic materials. This open dataset can be accessed through multiple channels for both interactive exploration and data mining. The Materials Project also seeks to create open-source platforms for developing robust, sophisticated materials analyses. Future efforts will enable users to perform “rapid-prototyping” of new materials in silico, and provide researchers with new avenues for cost-effective, data-driven materials design.

Study · Moderate

Bioconductor: open software development for computational biology and bioinformatics

Robert Gentleman, Vincent J. Carey, Douglas M. Bates +22 more · Genome Biology · 2004 · 12,503 citations

The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

Study · Moderate

Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool

Edward Y. Chen, Christopher M. Tan, Yan Kou +5 more · BMC Bioinformatics · 2013 · 8,333 citations

BACKGROUND: System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement. RESULTS: Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. CONCLUSIONS: Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.

Study · Moderate

Characterising and justifying sample size sufficiency in interview-based studies: systematic analysis of qualitative health research over a 15-year period

Konstantina Vasileiou, Julie Barnett, Susan J. Thorpe +1 more · BMC Medical Research Methodology · 2018 · 3,054 citations

BACKGROUND: Choosing a suitable sample size in qualitative research is an area of conceptual debate and practical uncertainty. That sample size principles, guidelines and tools have been developed to enable researchers to set, and justify the acceptability of, their sample size is an indication that the issue constitutes an important marker of the quality of qualitative research. Nevertheless, research shows that sample size sufficiency reporting is often poor, if not absent, across a range of disciplinary fields. METHODS: A systematic analysis of single-interview-per-participant designs within three health-related journals from the disciplines of psychology, sociology and medicine, over a 15-year period, was conducted to examine whether and how sample sizes were justified and how sample size was characterised and discussed by authors. Data pertinent to sample size were extracted and analysed using qualitative and quantitative analytic techniques. RESULTS: Our findings demonstrate that provision of sample size justifications in qualitative health research is limited; is not contingent on the number of interviews; and relates to the journal of publication. Defence of sample size was most frequently supported across all three journals with reference to the principle of saturation and to pragmatic considerations. Qualitative sample sizes were predominantly - and often without justification - characterised as insufficient (i.e., 'small') and discussed in the context of study limitations. Sample size insufficiency was seen to threaten the validity and generalizability of studies' results, with the latter being frequently conceived in nomothetic terms. CONCLUSIONS: We recommend, firstly, that qualitative health researchers be more transparent about evaluations of their sample size sufficiency, situating these within broader and more encompassing assessments of data adequacy. 
Secondly, we invite researchers critically to consider how saturation parameters found in prior methodological studies and sample size community norms might best inform, and apply to, their own project and encourage that data adequacy is best appraised with reference to features that are intrinsic to the study at hand. Finally, those reviewing papers have a vital role in supporting and encouraging transparent study-specific reporting.

Study · Moderate

The CMS experiment at the CERN LHC

S. Chatrchyan, G. Hmayakyan, V. Khachatryan +97 more · Journal of Instrumentation · 2008 · 5,435 citations

The Compact Muon Solenoid (CMS) detector is described. The detector operates at the Large Hadron Collider (LHC) at CERN. It was conceived to study proton-proton (and lead-lead) collisions at a centre-of-mass energy of 14 TeV (5.5 TeV nucleon-nucleon) and at luminosities up to 10³⁴ cm⁻² s⁻¹ (10²⁷ cm⁻² s⁻¹). At the core of the CMS detector sits a high-magnetic-field and large-bore superconducting solenoid surrounding an all-silicon pixel and strip tracker, a lead-tungstate scintillating-crystals electromagnetic calorimeter, and a brass-scintillator sampling hadron calorimeter. The iron yoke of the flux-return is instrumented with four stations of muon detectors covering most of the 4π solid angle. Forward sampling calorimeters extend the pseudorapidity coverage to high values (|η| ≤ 5) assuring very good hermeticity. The overall dimensions of the CMS detector are a length of 21.6 m, a diameter of 14.6 m and a total weight of 12500 t.

Study · Top journal · Moderate

A new criterion for assessing discriminant validity in variance-based structural equation modeling

Jörg Henseler, Christian M. Ringle, Marko Sarstedt · Journal of the Academy of Marketing Science · 2014 · 32,833 citations

Discriminant validity assessment has become a generally accepted prerequisite for analyzing relationships between latent variables. For variance-based structural equation modeling, such as partial least squares, the Fornell-Larcker criterion and the examination of cross-loadings are the dominant approaches for evaluating discriminant validity. By means of a simulation study, we show that these approaches do not reliably detect the lack of discriminant validity in common research situations. We therefore propose an alternative approach, based on the multitrait-multimethod matrix, to assess discriminant validity: the heterotrait-monotrait ratio of correlations. We demonstrate its superior performance by means of a Monte Carlo simulation study, in which we compare the new approach to the Fornell-Larcker criterion and the assessment of (partial) cross-loadings. Finally, we provide guidelines on how to handle discriminant validity issues in variance-based structural equation modeling.

Study · Moderate

Identification of acquired antimicrobial resistance genes

Ea Zankari, Henrik Hasman, Salvatore Cosentino +5 more · Journal of Antimicrobial Chemotherapy · 2012 · 5,125 citations

OBJECTIVES: Identification of antimicrobial resistance genes is important for understanding the underlying mechanisms and the epidemiology of antimicrobial resistance. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available in routine diagnostic laboratories and is anticipated to substitute traditional methods for resistance gene identification. Thus, the current challenge is to extract the relevant information from the large amount of generated data. METHODS: We developed a web-based method, ResFinder that uses BLAST for identification of acquired antimicrobial resistance genes in whole-genome data. As input, the method can use both pre-assembled, complete or partial genomes, and short sequence reads from four different sequencing platforms. The method was evaluated on 1862 GenBank files containing 1411 different resistance genes, as well as on 23 de-novo-sequenced isolates. RESULTS: When testing the 1862 GenBank files, the method identified the resistance genes with an ID = 100% (100% identity) to the genes in ResFinder. Agreement between in silico predictions and phenotypic testing was found when the method was further tested on 23 isolates of five different bacterial species, with available phenotypes. Furthermore, ResFinder was evaluated on WGS chromosomes and plasmids of 30 isolates. Seven of these isolates were annotated to have antimicrobial resistance, and in all cases, annotations were compatible with the ResFinder results. CONCLUSIONS: A web server providing a convenient way of identifying acquired antimicrobial resistance genes in completely sequenced isolates was created. ResFinder can be accessed at www.genomicepidemiology.org. ResFinder will continuously be updated as new resistance genes are identified.

Study · Moderate

Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): Explanation and elaboration

Jan P. Vandenbroucke, Erik von Elm, Douglas G. Altman +6 more · International Journal of Surgery · 2014 · 2,860 citations

Much medical research is observational. The reporting of observational studies is often of insufficient quality. Poor reporting hampers the assessment of the strengths and weaknesses of a study and the generalisability of its results. Taking into account empirical evidence and theoretical considerations, a group of methodologists, researchers, and editors developed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) recommendations to improve the quality of reporting of observational studies. The STROBE Statement consists of a checklist of 22 items, which relate to the title, abstract, introduction, methods, results and discussion sections of articles. Eighteen items are common to cohort studies, case-control studies and cross-sectional studies and four are specific to each of the three study designs. The STROBE Statement provides guidance to authors about how to improve the reporting of observational studies and facilitates critical appraisal and interpretation of studies by reviewers, journal editors and readers. This explanatory and elaboration document is intended to enhance the use, understanding, and dissemination of the STROBE Statement. The meaning and rationale for each checklist item are presented. For each item, one or several published examples and, where possible, references to relevant empirical studies and methodological literature are provided. Examples of useful flow diagrams are also included. The STROBE Statement, this document, and the associated Web site (http://www.strobe-statement.org/) should be helpful resources to improve reporting of observational research. © 2014 The Authors.

Study · Moderate

XGBoost

Tianqi Chen, Carlos Guestrin · 2016 · 47,429 citations

Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

Observational · Moderate

Applied Regression Analysis and Other Multivariable Methods

Esteban Walker, David G. Kleinbaum, Lawrence L. Kupper +1 more · Technometrics · 1989 · 8,350 citations

1. CONCEPTS AND EXAMPLES OF RESEARCH. Concepts. Examples. Concluding Remarks. References.
2. CLASSIFICATION OF VARIABLES AND THE CHOICE OF ANALYSIS. Classification of Variables. Overlapping of Classification Schemes. Choice of Analysis. References.
3. BASIC STATISTICS: A REVIEW. Preview. Descriptive Statistics. Random Variables and Distributions. Sampling Distributions of t, χ², and F. Statistical Inference: Estimation. Statistical Inference: Hypothesis Testing. Error Rate, Power, and Sample Size. Problems. References.
4. INTRODUCTION TO REGRESSION ANALYSIS. Preview. Association versus Causality. Statistical versus Deterministic Models. Concluding Remarks. References.
5. STRAIGHT-LINE REGRESSION ANALYSIS. Preview. Regression with a Single Independent Variable. Mathematical Properties of a Straight Line. Statistical Assumptions for a Straight-line Model. Determining the Best-fitting Straight Line. Measure of the Quality of the Straight-line Fit and Estimate σ². Inferences About the Slope and Intercept. Interpretations of Tests for Slope and Intercept. Inferences About the Regression Line μ_{Y|X} = β₀ + β₁X. Prediction of a New Value of Y at X₀. Problems. References.
6. THE CORRELATION COEFFICIENT AND STRAIGHT-LINE REGRESSION ANALYSIS. Definition of r. r as a Measure of Association. The Bivariate Normal Distribution. r and the Strength of the Straight-line Relationship. What r Does Not Measure. Tests of Hypotheses and Confidence Intervals for the Correlation Coefficient. Testing for the Equality of Two Correlations. Problems. References.
7. THE ANALYSIS-OF-VARIANCE TABLE. Preview. The ANOVA Table for Straight-line Regression. Problems.
8. MULTIPLE REGRESSION ANALYSIS: GENERAL CONSIDERATIONS. Preview. Multiple Regression Models. Graphical Look at the Problem. Assumptions of Multiple Regression. Determining the Best Estimate of the Multiple Regression Equation. The ANOVA Table for Multiple Regression. Numerical Examples. Problems. References.
9. TESTING HYPOTHESES IN MULTIPLE REGRESSION. Preview. Test for Significant Overall Regression. Partial F Test. Multiple Partial F Test. Strategies for Using Partial F Tests. Tests Involving the Intercept. Problems. References.
10. CORRELATIONS: MULTIPLE, PARTIAL, AND MULTIPLE PARTIAL. Preview. Correlation Matrix. Multiple Correlation Coefficient. Relationship of R_{Y|X1, X2, …, Xk} to the Multivariate Normal Distribution. Partial Correlation Coefficient. Alternative Representation of the Regression Model. Multiple Partial Correlation. Concluding Remarks. Problems. References.
11. CONFOUNDING AND INTERACTION IN REGRESSION. Preview. Overview. Interaction in Regression. Confounding in Regression. Summary and Conclusions. Problems. References.
12. DUMMY VARIABLES IN REGRESSION. Preview. Definitions. Rule for Defining Dummy Variables. Comparing Two Straight-line Regression Equations: An Example. Questions for Comparing Two Straight Lines. Methods of Comparing Two Straight Lines. Method I: Using Separate Regression Fits to Compare Two Straight Lines. Method II: Using a Single Regression Equation to Compare Two Straight Lines. Comparison of Methods I and II. Testing Strategies and Interpretation: Comparing Two Straight Lines. Other Dummy Variable Models. Comparing Four Regression Equations. Comparing Several Regression Equations Involving Two Nominal Variables. Problems. References.
13. ANALYSIS OF COVARIANCE AND OTHER METHODS FOR ADJUSTING CONTINUOUS DATA. Preview. Adjustment Problem. Analysis of Covariance. Assumption of Parallelism: A Potential Drawback. Analysis of Covariance: Several Groups and Several Covariates. Comments and Cautions. Summary. Problems. Reference.
14. REGRESSION DIAGNOSTICS. Preview. Approaches to Diagnosing Problems in Data. Residual Analysis: Detecting Outliers and Violations of Model Assumptions. Strategies of Analysis. Collinearity. Scaling Problems. Diagnostics Example. An Important Caution. Problems. References.
15. POLYNOMIAL REGRESSION. Preview. Polynomial Models. Least-squares Procedure for Fitting a Parabola. ANOVA Table for Second-order Polynomial Regression. Inferences Associated with Second-order Polynomial Regression. Example Requiring a Second-order Model. Fitting and Testing Higher-order Model. Lack-of-fit Tests. Orthogonal Polynomials. Strategies for Choosing a Polynomial Model. Problems.
16. SELECTING THE BEST REGRESSION EQUATION. Preview. Steps in Selecting the Best Regression Equation. Step 1: Specifying the Maximum Model. Step 2: Specifying a Criterion for Selecting a Model. Step 3: Specifying a Strategy for Selecting Variables. Step 4: Conducting the Analysis. Step 5: Evaluating Reliability with Split Samples. Example Analysis of Actual Data. Issues in Selecting the Most Valid Model. Problems. References.
17. ONE-WAY ANALYSIS OF VARIANCE. Preview. One-way ANOVA: The Problem, Assumptions, and Data Configuration. for One-way Fixed-effects ANOVA. Regression Model for Fixed-effects One-way ANOVA. Fixed-effects Model for One-way ANOVA. Random-effects Model for One-way ANOVA. -comparison Procedures for Fixed-effects One-way ANOVA. a Multiple-comparison Technique. Orthogonal Contrasts and Partitioning an ANOVA Sum of Squares. Problems. References.
18. RANDOMIZED BLOCKS: SPECIAL CASE OF TWO-WAY ANOVA. Preview. Equivalent Analysis of a Matched-pairs Experiment. Principle of Blocking. Analysis of a Randomized-blocks Experiment. ANOVA Table for a Randomized-blocks Experiment. Models for a Randomized-blocks Experiment. Fixed-effects ANOVA Model for a Randomized-blocks Experiment. Problems. References.
19. TWO-WAY ANOVA WITH EQUAL CELL NUMBERS. Preview. Using a Table of Cell Means. General Methodology. F Tests for Two-way ANOVA. Regression Model for Fixed-effects Two-way ANOVA. Interactions in Two-way ANOVA. Random- and Mixed-effects Two-way ANOVA Models. Problems. References.
20. TWO-WAY ANOVA WITH UNEQUAL CELL NUMBERS. Preview. Problem with Unequal Cell Numbers: Nonorthogonality. Regression Approach for Unequal Cell Sample Sizes. Higher-way ANOVA. Problems. References.
21. THE METHOD OF MAXIMUM LIKELIHOOD. Preview. The Principle of Maximum Likelihood. Statistical Inference Using Maximum Likelihood. Summary. Problems.
22. LOGISTIC REGRESSION ANALYSIS. Preview. The Logistic Model. Estimating the Odds Ratio Using Logistic Regression. A Numerical Example of Logistic Regression. Theoretical Considerations. An Example of Conditional ML Estimation Involving Pair-matched Data with Unmatched Covariates. Summary. Problems. References.
23. POLYTOMOUS AND ORDINAL LOGISTIC REGRESSION. Preview. Why Not Use Binary Regression? An Example of Polytomous Logistic Regression: One Predictor, Three Outcome Categories. An Example: Extending the Polytomous Logistic Model to Several Predictors. Ordinal Logistic Regression: Overview. A Simple Hypothetical Example: Three Ordinal Categories and One Dichotomous Exposure Variable. Ordinal Logistic Regression Example Using Real Data with Four Ordinal Categories and Three Predictor Variables. Summary. Problems. References.
24. POISSON REGRESSION ANALYSIS. Preview. The Poisson Distribution. Example of Poisson Regression. Poisson Regression: General Considerations. Measures of Goodness of Fit. Continuation of Skin Cancer Data Example. A Second Illustration of Poisson Regression Analysis. Summary. Problems. References.
25. ANALYSIS OF CORRELATED DATA PART 1: THE GENERAL LINEAR MIXED MODEL. Preview. Examples. General Linear Mixed Model Approach. Example: Study of Effects of an Air Pollution Episode on FEV1 Levels. Summary: Analysis of Correlated Data, Part 1. Problems. References.
26. ANALYSIS OF CORRELATED DATA PART 2: RANDOM EFFECTS AND OTHER ISSUES. Preview. Random Effects Revisited. Results for Random Effects Models Applied to Air Pollution Study Data. Second Example: Analysis of Posture Measurement Data. Recommendations about Choice of Correlation Structure. Analysis of Data for Discrete Outcomes. Problems. References.
27. SAMPLE SIZE PLANNING FOR LINEAR AND LOGISTIC REGRESSION AND ANALYSIS OF VARIANCE. Preview. Review: Sample Size Calculations for Comparisons of Means and Proportions. Sample Size Planning for Linear Regression. Sample Size Planning for Logistic Regression. Power and Sample Size Determination for Linear Models: A General Approach. Sample Size Determination for Matched Case-control Studies with a Dichotomous Outcome. Practical Considerations and Cautions. Problems. References.
Appendix A. Appendix B. Appendix C. Solutions to Exercises. Index.