Introduction to Statistics: Frequentist and
Bayesian Approaches (for Non-Statisticians)
  • Ryung Suh, MD
  • Becker Associates Consulting, Inc.
  • Internal Staff Training
  • June 8, 2004

  • To provide a basic understanding of the terms and
    concepts that underlie statistical analyses of
    clinical trials data
  • To introduce Bayesian approaches and their
    application to FDA submissions

Table of Contents
  • Sources of Statistical Data
  • Frequentist Approaches
  • Bayesian Approaches
  • Insights from the Experts (from the Bayesian
    Approaches meeting, May 20-21, 2004)
  • Take-Aways and Strategic Insights
  • Corporate Resources

Sources of Data
  • Retrospective Studies: Design, Bias, Matching,
    Relative Risk, Odds Ratio
  • Prospective Studies: Design, Loss to Follow-up,
    Analysis, Relative Risk, Nonconcurrent
    Prospective Studies, Incidence, Prevalence
  • Randomized Controlled Trials: Design,
    Elimination of Bias, Placebo Effect, Analysis
  • Survival Analysis: Person-Time, Life-Tables,
    Proportional Hazard Models

Classical Frequentist
  • Hypothesis Testing: In order to draw a valid
    statistical inference that an independent
    variable has a statistically significant effect
    (not the same as clinically significant effect),
    it is important to rule out chance or random
    variability as an explanation for the effects
    seen in a sampling distribution.

Statistical Inference
  • Two inferential techniques
  • Hypothesis Testing
  • Confidence Intervals
  • Inference is the process of making statements
    (hypotheses) with a degree of statistical
    certainty about population parameters based on a
    sampling distribution

Hypothesis Testing Terms
  • Null Hypothesis (Ho): initially held to be true
    unless proven otherwise
  • e.g. there is NO difference between treatment and
    control
  • e.g. µ = 11, or µ2 − µ1 = 0
  • Akin to "the accused is innocent"
  • Alternative Hypothesis (Ha): the claim we
    usually want to prove
  • e.g. there is a difference between treatment and
    control
  • e.g. µ ≠ 11, or µ2 − µ1 ≠ 0
  • Akin to "the accused is guilty"
  • We assume innocence until proven guilty beyond a
    reasonable doubt; the same applies with Ho

Hypothesis Testing Decisions
  • Decision Options
  • Reject Ho (and assert Ha to be true)
  • Fail to Reject Ho (due to insufficient evidence)
  • Errors in Decisions

Level of Significance
  • Alpha: α = P(Type I Error) = P(Reject Ho | Ho is
    true)
  • Beta: β = P(Type II Error) = P(Fail to Reject Ho |
    Ho is false)
  • Power = 1 − β
  • We want both α and β to be small
  • but, for a fixed sample size, reducing one
    increases the other
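
The trade-off between α, β, and power can be made concrete with a small
calculation. The sketch below is only an illustration, assuming a one-sided
z-test with a known population standard deviation; the function name
z_test_power and all of the numbers are hypothetical and do not come from
the slides.

```python
import math
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tail z-test of Ho: mu = mu0 against Ha: mu = mu1 (mu1 > mu0)."""
    se = sigma / math.sqrt(n)  # standard error of the sample mean
    # Reject Ho when the sample mean exceeds this critical value.
    critical = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(fail to reject Ho | Ha true) = P(sample mean <= critical | mu = mu1)
    beta = NormalDist(mu1, se).cdf(critical)
    return 1 - beta

# Hypothetical numbers: true shift of 0.5 standard deviations.
power_n10 = z_test_power(mu0=0, mu1=0.5, sigma=1, n=10)
power_n50 = z_test_power(mu0=0, mu1=0.5, sigma=1, n=50)
power_strict = z_test_power(mu0=0, mu1=0.5, sigma=1, n=10, alpha=0.01)
print(power_n10, power_n50, power_strict)
```

With the sample size held fixed, tightening α (power_strict) lowers the
power, that is, raises β; a larger sample (power_n50) is what lets both
error rates shrink together.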

[Figure: sampling distributions under the Null
Hypothesis and the Alternative Hypothesis]
This example is a simplification to aid
understanding; the exact β tends to be
generally unknown, although a high β is frequently
due to sample sizes that are too small.

Sampling Distribution
  • Population Distribution: usually a normal
    distribution with a mean of µ and a variance of
    σ² (but tough to measure the entire population)
  • Sampling Distribution: a distribution of means
    from random samples drawn from the population; the
    sample mean (x̄) is a random variable, normally
    distributed with a mean of µ and a variance of
    σ²/n
  • Take random samples from the population and
    calculate a statistic
  • Describes the chance fluctuations of the
    statistic and the variability of sample averages
    around the population mean, for a given sample
    size (n)
  • Sample mean (x̄) serves as a point estimate for
    the population mean (µ)
  • Central Limit Theorem: as n → ∞, the sampling
    distribution approaches a normal distribution (and
    the estimate becomes more precise)
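
A quick simulation can illustrate the sampling distribution described
above. This is a rough sketch, not part of the original talk: the
population is assumed to be exponential with mean 1 and variance 1
(deliberately non-normal), and the function name sample_means and the
number of draws are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

def sample_means(n, draws=2000):
    """Means of `draws` random samples of size n from an exponential population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(draws)
    ]

results = {n: sample_means(n) for n in (5, 50)}
for n, means in results.items():
    # The mean of the sample means should sit near the population mean (1),
    # and their variance should sit near sigma^2 / n = 1 / n.
    print(n, statistics.fmean(means), statistics.variance(means))
```

The spread of the sample means shrinks roughly as 1/n, which is the
"estimate becomes more precise" point made on the slide.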

Determining the P(x̄ | µ)
  • Key Question: Does the sample mean reflect the
    population mean, given the effects of
  • If population standard deviation (σ) is known, we
    can standardize (mean = 0, s.d. = 1) and compare
  • Z = (x̄ − µ) / (σ / √n)
  • If σ is unknown, we can estimate it with the
    sample standard deviation (s) from the same set of
    sample data and compare with a normal
  • T = (x̄ − µ) / (s / √n)
  • a continuous distribution symmetric about zero
  • an infinite number of t-distributions indexed by
    degrees of freedom
  • as degrees of freedom (n − 1) increase,
    t-distributions approach standard normal
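
As a minimal sketch of the two statistics above, the following computes Z
(population standard deviation assumed known) and T (standard deviation
estimated from the sample) for a small made-up data set. The data, the
hypothesized mean of 11, and the value sigma = 0.6 are illustrative
assumptions only. The two-sided p-value is shown for Z alone, because the
Python standard library has no t-distribution.

```python
import math
from statistics import NormalDist, mean, stdev

def z_statistic(sample, mu0, sigma):
    """Z = (sample mean - mu0) / (sigma / sqrt(n)), with sigma known."""
    return (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))

def t_statistic(sample, mu0):
    """T = (sample mean - mu0) / (s / sqrt(n)), with s estimated from the sample."""
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean(sample) - mu0) / (s / math.sqrt(len(sample)))

sample = [12.1, 11.4, 10.9, 12.6, 11.8, 12.3, 11.1, 12.0]  # made-up data
z = z_statistic(sample, mu0=11, sigma=0.6)
t = t_statistic(sample, mu0=11)
# Two-sided p-value for Z under the standard normal distribution.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(z, t, p_two_sided)
```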

Normal versus t-distribution
T-distributions are flatter and have more area
in the tails compared to Normal distributions.
T-distributions approximate the Normal as
degrees of freedom (n − 1) increase.

Hypothesis Testing: More Terms
  • Test Statistic: the computed statistic used to
    make the decisions in hypothesis testing; relates
    to a probability distribution (e.g. Z, t, χ²)
  • Critical Region: contains the values of the test
    statistic such that Ho is rejected
  • Critical Value: the endpoint(s) of the critical
    region
  • One-tailed versus two-tailed tests: depends on
  • P-Value: the smallest value of α such that Ho
    will be rejected (a probability associated with
    the calculated value of the test statistic)

Steps in Hypothesis Testing: The
Classical/Frequentist Approach
  • Define parameter and specify Ho and Ha
  • Specify n (sample size), α (significance level),
    the test statistic, and the critical value(s) and
    critical regions
  • Take a sample and compute the value of the test
    statistic; compare to the relevant probability
  • Reject or fail to reject Ho and draw statistical
  • Remember: the P-value is NOT the probability of
    the null hypothesis being true (the null
    hypothesis is either true or not); it is the
    probability, assuming Ho is true, of a result at
    least as extreme as the one observed

Confidence Intervals
  • (1 − α)·100% CI = x̄ ± t(n−1, α/2) · (s / √n)
  • Provides CI for population mean (µ) at the chosen
    level of confidence (e.g. 90%, 95%, 99%)
  • Provides interval estimate of the population mean
    (vs. the point estimate that the sample mean
    provides)
  • Depends on the amount of variability in the data
  • Depends on the level of certainty we require
  • Increasing (1 − α) will increase the CI width
  • Increasing sample size (n) will decrease the CI
    width
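
The effect of confidence level and sample size on interval width can be
sketched as follows. Note the substitution: the slide's interval uses the
t critical value, but the Python standard library has no t-distribution,
so this sketch swaps in the normal critical value as a large-sample
approximation. For small n the real t-based interval would be somewhat
wider. The data and the function name approx_ci are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def approx_ci(sample, confidence=0.95):
    """Approximate CI for the mean, using a normal (not t) critical value."""
    n = len(sample)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # stands in for t(n-1, alpha/2)
    half_width = z * stdev(sample) / math.sqrt(n)
    centre = mean(sample)
    return centre - half_width, centre + half_width

def width(interval):
    return interval[1] - interval[0]

sample = [12.1, 11.4, 10.9, 12.6, 11.8, 12.3, 11.1, 12.0]  # made-up data
ci95 = approx_ci(sample, 0.95)
ci99 = approx_ci(sample, 0.99)
ci95_more_data = approx_ci(sample * 4, 0.95)  # same values, four times the n
print(ci95, ci99, ci95_more_data)
```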

Issues for Frequentists (and others)
  • Multiplicity: the chance of a Type I error when
    multiple hypotheses are tested is larger than the
    chance of a Type I error in each hypothesis test
  • Multiple Endpoints: Frequentists worry about the
    dimensions of the sample space (the Bayesian
    looks at the dimensions of the parameter
    space); both tend to be skeptical of believing
    what they think they see in high-dimensional
    problems (Permutt)
  • Multiple Looks: Trials are expensive, so
    sequential methods are attractive, but stopping
    rules tend to be fixed in frequentist approaches
  • Multiple Studies: Frequentist meta-analysis (to
    look at combined evidence from several studies)
    cannot rely simply on a fixed p-value (e.g.
    0.05); it must look at the entirety of the
    evidence and the strength of each piece
  • Garbage In, Garbage Out
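
The multiplicity point can be quantified with one line of arithmetic. The
sketch below assumes the tests are independent, which real endpoints often
are not, so treat it as an illustration rather than a rule. The function
name familywise_error is a hypothetical label.

```python
def familywise_error(alpha, k):
    """P(at least one Type I error) across k independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 10, 20):
    print(k, round(familywise_error(0.05, k), 3))
```

Under that independence assumption, ten tests at α = 0.05 give roughly a
40% chance of at least one false positive.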

Bayesian Statistics
  • Thomas Bayes (1702-1761): English theologian and
    mathematician; "Essay towards solving a problem
    in the doctrine of chances" (1763)
  • Bayesian methods: iterative processes that make
    better decisions based on learning from
  • combines a prior probability distribution for the
    states of nature with new sample information
  • the combined data gives a revised probability
    distribution about the states of nature, which is
    then used as a prior probability distribution
    with new (future) sample information
  • and so on and so on
  • Key feature: using an empirically derived
    probability distribution for a population
  • May use objective data or subjective opinions in
    specifying a prior distribution
  • Criticized for lack of objectivity in specifying
    prior probability distribution

A Bayesian example
  • From http//
  • 15 blue taxis, 85 black taxis; only 100 taxis in
    the entire town
  • Witness claims seeing a blue taxi in hit-and-run
  • Witness is given a random ordered test
  • successfully identifies 4/5 taxis correctly (80%)
  • If witness claims blue, how likely is she to
    have the color correct?
  • Blue taxis (15): at 80% accuracy, 12 are called
    blue and 3 are called black
  • Black taxis (85): at 80% accuracy, 68 are called
    black and 17 are called blue
  • In given sample space, 12/29 claims of blue are
    actually blue taxis (41%)
  • A claim of black would be correct 68/71 times (in
    the given sample space), or 96%
  • Bayesians take into account the rate of false
    positives for black taxis as well as for blue
    taxis (note that black taxis are in greater
    supply here)
  • Bayesian stats useful for calculating relatively
    small risks (e.g. rare disorders)
  • Bayesian stats useful in non-random distributions
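
The taxi arithmetic is a direct application of Bayes' theorem, and a short
sketch reproduces the 12/29 and 68/71 figures. It assumes, as the example
does, that the witness is equally accurate (80%) for both colors. The
function name posterior_correct is hypothetical.

```python
def posterior_correct(prior, accuracy):
    """P(taxi really is the claimed color | witness claims that color)."""
    true_claims = accuracy * prior               # right color, identified correctly
    false_claims = (1 - accuracy) * (1 - prior)  # other color, misidentified
    return true_claims / (true_claims + false_claims)

p_blue = posterior_correct(prior=0.15, accuracy=0.80)   # witness says "blue"
p_black = posterior_correct(prior=0.85, accuracy=0.80)  # witness says "black"
print(round(p_blue, 2), round(p_black, 2))
```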

Perspectives on Probability
  • Frequentist: probability is the relative
    frequency of an event, given the experiment is
    repeated an infinite number of times
  • Bayesian: probability is a degree of belief, or
    the likelihood of an event happening given what
    is known about the population

Bayesian Hypothesis Testing
  • Non-Bayesians: navigate the optimal tradeoff
    between the probabilities of a false alarm
    (Type I error) and a miss (Type II error)
  • One can compare the likelihood ratio of the data
    under the two hypotheses to a nonnegative
    threshold value (or the log likelihood ratio to
    an arbitrary real threshold value)
  • Increasing the threshold makes the test less
    sensitive (higher chance of a miss);
    decreasing the threshold makes the test more
    sensitive (but with a higher chance of a false
    alarm)
  • More data improves the limits of this tradeoff
    (the limit relation is often given as Stein's
    lemma, in which the error exponent approaches
    the Kullback-Leibler distance)
  • Bayesians: instead of optimizing a probability
    tradeoff, a miss event or false alarm event
    is assigned costs; additionally, we have prior
    probabilities
  • Decision function is based on the Bayes Risk, or
    expected costs
  • Threshold value is a function of costs and priors
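
How the threshold depends on costs and priors can be sketched with the
textbook minimum-Bayes-risk rule for two simple hypotheses. The rule is:
decide H1 when the likelihood ratio P(data | H1) / P(data | H0) exceeds
(cost of false alarm × P(H0)) / (cost of miss × P(H1)). This form assumes
correct decisions cost nothing. The function names and the numbers are
illustrative assumptions, not taken from the talk.

```python
def bayes_threshold(prior_h0, cost_false_alarm, cost_miss):
    """Likelihood-ratio threshold that minimizes expected cost (Bayes risk)."""
    prior_h1 = 1 - prior_h0
    return (cost_false_alarm * prior_h0) / (cost_miss * prior_h1)

def decide(likelihood_ratio, threshold):
    """Return the hypothesis favoured by the Bayes decision rule."""
    return "H1" if likelihood_ratio > threshold else "H0"

balanced = bayes_threshold(prior_h0=0.5, cost_false_alarm=1, cost_miss=1)
costly_miss = bayes_threshold(prior_h0=0.5, cost_false_alarm=1, cost_miss=10)
print(balanced, costly_miss)
print(decide(0.5, balanced), decide(0.5, costly_miss))
```

Making a miss ten times as costly lowers the threshold, so the same
evidence that was insufficient under balanced costs now triggers a
decision for H1: the test becomes more sensitive.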

Bayesian Parameter Estimation
  • Non-Bayesians: the probability of an event is
    estimated as the empirical frequency of the event
    in a data sample
  • Bayesians: include empirical prior
    information; as the data sample goes to
    infinity, the effects of the past trial wash out
  • If there is no empirical prior information, it
    is possible to create a prior distribution based
    on reasonable beliefs
  • We calculate the posterior distribution from the
    sample data and the prior distribution using
    Bayes' Theorem
  • P(A | B) = P(B | A) P(A) / P(B)
  • The posterior becomes the new prior distribution;
    with a conjugate prior, the posterior has the
    same form as the prior, which allows efficient
    sequential updating of the posterior
    distributions as the study proceeds
  • The output of the Bayesian analysis is the
    entire posterior distribution (not just a single
    point estimate); it summarizes ALL our
    information to date
  • As we get more data, the posterior distribution
    will become more sharply peaked about a single
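
Sequential updating is easiest to see with a conjugate pair. The sketch
below assumes a Beta prior on a success probability with binomial data, a
standard textbook pairing chosen here for illustration and not named on
the slides. The flat Beta(1, 1) starting prior and the batch counts are
made up.

```python
def update_beta(a, b, successes, failures):
    """Beta(a, b) prior + binomial data -> Beta(a + successes, b + failures) posterior."""
    return a + successes, b + failures

def beta_mean(a, b):
    return a / (a + b)

def beta_variance(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))

a, b = 1, 1                              # flat prior: no prior information
batches = [(7, 3), (12, 8), (30, 20)]    # (successes, failures) per interim look
history = [(a, b)]
for successes, failures in batches:
    # Yesterday's posterior is today's prior.
    a, b = update_beta(a, b, successes, failures)
    history.append((a, b))

variances = [beta_variance(x, y) for x, y in history]
print(history[-1], beta_mean(a, b), variances)
```

Updating batch by batch lands on the same posterior as a single update
with all the data, and the posterior variance shrinks at every step, which
is the "more sharply peaked" behaviour described above.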

Bayesian Sequential Analysis
  • Given no fixed number of observations, and the
    observations come in sequence (until we decide to
    stop)
  • Non-Bayesians: the sequential probability ratio
    test compares the running log likelihood ratio
    with threshold values to decide on outcome 1,
    outcome 2, or to keep collecting observations
  • Bayesians: use the sequential Bayes risk by
    assigning costs to false alarms and misses, plus
    a cost proportional to the number of observations
    prior to stopping; the goal is to minimize
    expected cost using a strategy of optimal stopping
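
The non-Bayesian sequential test can be sketched with Wald's sequential
probability ratio test for yes/no outcomes. The thresholds below are
Wald's usual approximations, log((1 − β)/α) and log(β/(1 − α)). The
response rates p0 and p1, the error targets, and the function name sprt
are illustrative assumptions.

```python
import math

def sprt(observations, p0, p1, alpha=0.05, beta=0.20):
    """Wald's SPRT for Bernoulli data: H0 says P(success) = p0, H1 says p1."""
    upper = math.log((1 - beta) / alpha)  # crossing above: accept H1
    lower = math.log(beta / (1 - alpha))  # crossing below: accept H0
    llr = 0.0                             # running log likelihood ratio
    for i, outcome in enumerate(observations, start=1):
        if outcome:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", i
        if llr <= lower:
            return "accept H0", i
    return "keep collecting", len(observations)

print(sprt([1, 1, 1, 1, 1, 1, 1, 1], p0=0.5, p1=0.8))
print(sprt([0, 0, 0, 0], p0=0.5, p1=0.8))
print(sprt([1, 0], p0=0.5, p1=0.8))
```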

Steve Goodman (Hopkins)
  • Medical Inference is inductive
  • Deductive (disease → signs/symptoms): traditional
    statistical methods
  • Inductive (signs/symptoms → disease): Bayesian
    approaches more appropriate
  • Bayes Theorem
  • prior odds × Bayes factor = posterior odds
  • Pretest odds × likelihood factor = posttest odds
  • P-Value = P(X being at least as extreme as the
    observed result, assuming null hypothesis to be
    true)
  • Does not represent the probability of observed
    data being true
  • Does not represent the probability of observed
    data being by chance
  • Does not represent the probability of the truth
    of the null hypothesis
  • If P(data | hypothesis) = p, then likelihood of
    (hypothesis | data) = cp, where c is an arbitrary
    constant
  • P(H0 | data) / P(Ha | data) = [g / (1 − g)] ×
    [P(data | H0) / P(data | Ha)], where g is the
    prior probability of H0
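
The odds form of Bayes' theorem is simple to compute. As a hedged
illustration, the sketch below converts a prior probability to odds,
multiplies by a Bayes factor, and converts back. The check reuses the taxi
example from earlier in the deck, where the prior for blue is 0.15 and an
80%-accurate witness gives a Bayes factor of 0.8 / 0.2 = 4. The function
name is hypothetical.

```python
def posterior_probability(prior_prob, bayes_factor):
    """posterior odds = prior odds x Bayes factor, returned as a probability."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

# Taxi example: prior P(blue) = 0.15, Bayes factor for a "blue" report = 0.8 / 0.2
p_blue_given_report = posterior_probability(0.15, 0.8 / 0.2)
print(round(p_blue_given_report, 2))
```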

Steve Goodman (Hopkins)
  P-Value                          Bayes Factor
  Noncomparative                   Comparative
  Observed + hypothetical data     Only observed data
  Implicit Ha                      Pre-defined explicit Ha
  Evidence can only be negative    Positive or negative evidence
  Sensitive to stopping rules      Insensitive to stopping rules
  No formal interpretation         Formal interpretation

P-Value asks you to look at the data only → then
make inferences later. Bayesian methods ask you
to ask the question first → and look at existing
that is evidence for the

Tom Louis (Hopkins)
  • Bayesian Inference
  • Specify the multi-level structure of prior
    probability distributions
  • Compute the joint posterior distribution for all
  • Compute the posterior distribution of quantities
    by integrating known conditions
  • Use the joint distribution to make inferences
  • Bayesian Advantages
  • Precision increases with more available
  • Repeated sampling gives information on the prior
  • More flexible when looking at partially related
    Gaussian distributions
  • Allows inclusion and structuring of historical
    data (allows a compromise between ignoring
    historical data (no weight) and data-pooling
    (full weight))
  • Captures relevant uncertainties
  • Structures complicated inferences
  • Adds flexibility in designs
  • Documents assumptions

Don Berry (M.D. Anderson)
  • Approaches to drug/device development
  • Fully Bayes → likelihood principle (for company
  • Bayesian tools for expanding the frequentist
    envelope (for designing and analyzing
    registration studies)
  • Bayesian advantages
  • Sequential learning is useful in study design
  • Predictive distributions (frequentists cannot
    emulate this)
  • Borrowing strength from historical data,
    concomitant trials, or from across patient and
    disease groups
  • Early data allows Adaptive Randomization
  • Ethical advantage: stop clearly harmful or
    ineffective drugs/devices early in the trial
  • Find nuggets quickly and with higher
  • Learn quickly, treat patients in trial more
    effectively, save resources
  • May save resources (base development on early
  • May test multiple experimental drugs (e.g. cancer
    drug cocktails)
  • Seamless transitions through clinical trial
    phases (e.g. do not stop accrual)
  • Increase statistical power with much smaller
    sample populations
  • Relates response and survival rates as well
  • Early decisions on treatment, and on ending a

Bob Temple (CDER)
  • FDA is nervous and inexperienced with regard
    to Bayesian analysis (perhaps with exception in
  • Strategy: should show both frequentist and
    Bayesian results (and show the difference)
  • Pitfalls: Bayesian approaches can sometimes be
    longer and more expensive for the company
  • Bottom line: Bayesian approaches are still new
    and need to be better understood by investigators
    and regulators

Larry Kessler (CDRH)
  • Bayesians at CDRH: Greg Campbell, Don Malec,
    Gene Pennello, Telba Irony
  • White Paper (1997) http//
  • Applications to devices
  • Devices tend to have a great deal of prior
    information (mechanism of action is physical and
    local, as opposed to pharmacokinetic and
  • Devices usually evolve in small steps
  • Studies gain strength by using quantitative
    prior information
  • Prediction models available for surrogate
  • Sensitivity analysis available for missing data
  • Adaptive trial designs often useful for decision
    theoretics, non-inferiority trials, and
    post-market surveillance
  • Helps determine sample size and interim-look
  • Risks and Challenges
  • Often a trade-off between clinical burden and
    computational burden
  • Can be more expensive (e.g. if the prior
    information is NOT predictive or is useless)
  • Beware of the regression-to-the-mean effect
  • Hierarchical structure is not good if too little
    (single prior study) or too much prior info

Larry Kessler (CDRH)
  • Considerations
  • Restrict to quantitative prior information
  • Need legal permission because companies tend to
    own prior studies and data
  • Published literature and SSEs often lack
    patient-level data
  • FDA/companies need to reach agreement on the
    validity of any prior info
  • Need new decision rules for the clinical study
  • Frequentist: statistically significant result
    for primary endpoint effectiveness
  • Bayesian: posterior probability exceeding some
    predetermined value (or some interval within
    which it behaves consistently)
  • Bayesian trials must be prospectively designed
    (no switching mid-stream)
  • Control group cannot be used as a source of prior
    info for the new device
  • Need new formats for Labeling and for the Summary
    of Safety and Effectiveness
  • Simulations are important (show that Type I
    error is well-controlled)
  • FDA review team plays role in choice of decision
    rules for success and for the exchangeability of
    prior studies in a hierarchical model
  • Recommendations
  • Prospectively planned, with legally available and
    valid prior information
  • Good communications with the FDA, with a good
    statistician, and proper electronic Data

Ralph D'Agostino (Boston Univ) (Advisory
Committee Member)
  • Randomized Controlled Trials need to keep
  • Challenge is that Bayesian methods can sometimes
    seem complex
  • Promise is that Bayesian methods can be made more
  • Should NOT use Bayesian methods to salvage
    studies that have failed frequentist approaches
  • Sometimes Bayesians are too optimistic about
    their ability to see validity across studies with
    different populations, different endpoints, and
    different analytical methods

Bob O'Neill (CDER)
  • Too many people misinterpret the p-value
  • We rely on statistical significance with little
    regard for effect size or magnitude
  • The FDA needs to develop more format and content
    guides about reporting Bayesian statistics
  • Dealing with missing data is essentially a
    Bayesian exercise (i.e. model-building)
  • Bayesian statistics cut both ways (may require
    more time, expenses, and data to reach required

Stacy Lindborg (Global Statistics) and Greg
Campbell (CDRH)
  • SL: Need validated computer software for
    Bayesian statistics and need a great deal of
    education to help regulators and clinicians
    understand the meaning of predictive posterior
    probabilities and to trust in Bayesian
  • SL: Great promise with regard to
  • Looking at data more comprehensively
  • Conducting trials more ethically
  • GC: Bayesian designs need to be done
  • CANNOT switch to Bayesian analysis to
    rescue/salvage studies that are not going well
  • GC: Bayesian methods have the potential to
    shorten study duration, cut costs (by reducing
    number of patients), and enhance product
  • GC: Between 1999 and 2003, there have been 14
    original PMAs and Supplements in which Bayesian
    estimation was the primary analysis; many more
    are in the works

Don Rubin (Harvard) and Jay Siegel (Centocor)
  • DR: Bayesian thinking is our natural way to look
    at the world
  • DR: Frequentist approaches need to work with
    Bayesian thinking (they are still just rules)
  • DR: Validation is needed to ensure that both the
    model and the analysis are appropriate
  • JS: Bayesian approaches (which rely on
    Predictive Value) and Frequentist approaches
    (which rely on Specificity) will converge to
    the extent that prior probabilities are similar
  • e.g. in adult use drugs/devices now applied to
    pediatric use
  • e.g. the same class of drug being applied to
    similar therapeutic uses
  • JS: Concerns about movement toward Bayesian
  • Shifts incentives toward non-innovative (more
    valid priors for existing therapies)
  • Priors constantly change during a trial (need
    predictable, prospective standards)
  • Legal concerns about using competitors' data

Susan Ellenberg (OBE, CBER) and Norris Alderson
  • SE: If Bayesian approaches are really a better
    mousetrap, they will spread and people will begin
    to demand them
  • NA: Bayesian is NOT a religion
  • NA: Incorporating a priori knowledge is useful,
    but we need frequentist checks at times (reality
  • NA: Clear guidelines on methods, formats,
    content, analysis, etc. are needed; FDA regulators
    will need to work with statisticians, clinicians,
    and industry to accomplish this
  • NA: Bayesian approaches still must deal with the
    common sources of bias found in frequentist

Statistical Terms and Concepts
  • Sources of Data
  • Statistical Inference
  • Frequentist Hypothesis Testing
  • Null and Alternative Hypotheses
  • Test Statistics and Sampling Distribution
  • Type I and Type II Errors; Power
  • P-Value and Significance Level (α)
  • Confidence Intervals
  • Bayesian Statistics
  • Prior probability distribution
  • Posterior (or Joint) probability distribution
  • Bayes Factor (or Likelihood Ratio)
  • Adaptive Randomization

Strategic FDA Insights
  • FDA (especially CDRH) favorable to Bayesian
  • Not effective in rescuing/salvaging troubled
    studies; must do prospectively
  • May lead to quicker, less expensive approvals
    (but may be longer, more expensive as well)
  • Useful in predictive models, sensitivity analysis
    for missing data, adaptive trial designs, and for
    looking at data more comprehensively (and perhaps
  • Need to use valid quantitative prior information
    (work with owners of data and with the FDA)
  • New decision rules, content, format, method,
    analysis, and reporting guidelines are needed (as
    well as new labeling and SSE)
  • A good statistician with both Bayesian and
    Frequentist credentials is perhaps our best
    advocate; many Bayesians already have good
    relationships with the FDA

Final Thoughts
  • Clinical versus Statistical Significance
  • Why p-values of 0.05?
  • Importance of the research question
  • Bayesian is not a religion, although some
    Bayesians seem to see it that way
  • The promise of new statistical approaches
  • Our need to understand (at least at a basic
    level) the statistical work we do for our clients

Corporate Resources
  • Carlos Alzola, MS
  • Aldo Crossa, MS
  • Campbell Tuskey, MSPH
  • Reine Lea Speed, MPH
  • Ryung Suh, MD
  • Expert Associates: Simon, D'Agostino, Rubin,
    HCRI, Hopkins
  • Firm Library and Statistical Literature

  • Bayesian Approaches, U.S. Food and Drug
    Administration. Meeting at Masur Auditorium,
    National Institutes of Health, May 20-21, 2004.
  • Morton, Richard F, J. Richard Hebel, and Robert
    J. McCarter. A Study Guide to Epidemiology and
    Biostatistics. 3rd ed. 1990.
  • Permutt, Thomas. Three Nonproblems in the
    Frequentist Approach to Clinical Trials, U.S.
    Food and Drug Administration.
  • Stockburger, David W. Introductory Statistics:
    Concepts, Models, and Applications.
  • Thornburg, Harvey. Introduction to Bayesian
    Statistics, CCRMA. Stanford University, Spring
  • Sampling Distribution Demonstration.