Practical Bayesian Statistics
Audrey Yeo
Thursday, March 27 08:00 DAGStat 2025
Berlin, Germany
Practical Bayesian Statistics?
A gentle refresher on probability theory, Bayesian Framework and Intuition for effective application in Biostatistics
Build a simple Bayesian Model
Test out different Priors
Infer from different Posteriors
Discussing ideas for your own work
Agenda
What’s covered
Proper Priors
Discrete endpoints
Credible Interval
Posterior Probability
Predictive Posterior Probability
if time permits, discussion on example applications
What’s not covered
Improper Priors
Continuous endpoints
Complex simulation
Comparison with hypothesis testing using Bayes Factors
About me
Career
Joined R&D at Roche in mid-2022:
Biostats Lead in Early Oncology Trials
Study Statistician for phase I-II, phase III trials
Led Roche/Genentech Dose Escalation on new SCE
Led the development of R package phase1b
Instructor for Julia Course Basel (Data Science, Quarto, ML modules)
Co-organiser and presenter of many methodology seminars
Upon visual inspection, increasing the sample size leads to
Better precision
Estimates and Credible Interval are more precise
These are benefits of the Bayesian paradigm
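The effect of sample size on precision can be sketched in base R. This is a minimal illustration, assuming a flat Beta(1, 1) prior and three hypothetical data sets that all share a 70% observed response rate; none of these numbers come from the talk.

```r
# Hypothetical data: same observed response rate (70%), growing sample size
x <- c(7, 70, 700)      # responders (assumed for illustration)
n <- c(10, 100, 1000)   # patients (assumed for illustration)

# Flat Beta(1, 1) prior (an assumption of this sketch)
prior_alpha <- 1
prior_beta <- 1

# Conjugate update: posterior is Beta(alpha + x, beta + n - x)
post_alpha <- prior_alpha + x
post_beta <- prior_beta + n - x

# Equal-tailed 95% credible intervals from the posterior quantiles
lower <- qbeta(0.025, post_alpha, post_beta)
upper <- qbeta(0.975, post_alpha, post_beta)

ci_by_n <- data.frame(n = n, lower = lower, upper = upper, width = upper - lower)
print(ci_by_n)
```

The `width` column shrinks as `n` grows, which is the gain in precision described above.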
Using priors to improve precision
Priors that incorporate higher \(\alpha\) and \(\beta\) parameters influence the posterior, given data stays the same (16 / 23 responders in likelihood)
Quiz
Increased sample size in prior information can improve precision of credible intervals
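A rough way to explore the quiz statement is to hold the data at 16 / 23 responders and vary only the prior. The three priors below are hypothetical choices for illustration, all centred on 0.5 and carrying increasing amounts of prior information (\(\alpha + \beta\)).

```r
# Data held fixed, as in the slide: 16 responders out of 23
x <- 16
n <- 23

# Hypothetical priors with increasing prior "sample size" alpha + beta
priors <- data.frame(alpha = c(1, 5, 20), beta = c(1, 5, 20))

# Conjugate update for each prior
post_alpha <- priors$alpha + x
post_beta <- priors$beta + n - x

# Equal-tailed 95% credible intervals
lower <- qbeta(0.025, post_alpha, post_beta)
upper <- qbeta(0.975, post_alpha, post_beta)

ci_by_prior <- data.frame(
  prior_n = priors$alpha + priors$beta,
  post_mean = post_alpha / (post_alpha + post_beta),
  lower = lower,
  upper = upper,
  width = upper - lower
)
print(ci_by_prior)
```

In this sketch the interval narrows as the prior gets stronger, while the posterior mean is pulled from the observed 16 / 23 toward the prior mean of 0.5, so the extra precision is only as trustworthy as the prior itself.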
Different perspectives: neither is wrong
| Example | Frequentist | Bayesian |
|---|---|---|
| \(\theta\) | An unknown but fixed quantity | A random variable we can infer from some distribution |
| Model or Distribution | Build a model, e.g. Likelihood using data | P(\(\theta\) \| data, prior) |
| Method | OLS, Maximum Likelihood | Prior multiplied by likelihood |
| Inference | Via hypothesis testing | P(\(\theta\) > \(\theta_{c}\) \| data, prior) |
| What is the probability of a fair coin toss? | Since we toss this n times, it would converge to 0.5 | It's 0 or 1 |
A Bayesian model looks like Bayes' Theorem, or Bayes' Rule
For Events A and B :
\[ 0 < P(A) < 1\]
\[ P(B) > 0\]
\[ P(B | A) = \frac{P(A|B)P(B)}{P(A)} \]
Law of Total Probability
\[P(A) = \sum_{i=1}^{n} P(A \cap B_i)\]
Using the definition of conditional probability, we can rewrite this as:
\[P(A) = \sum_{i=1}^{n} P(A | B_i) P(B_i)\]
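Bayes' Rule and the Law of Total Probability can be checked numerically with a small sketch. The probabilities below are made up for illustration: two mutually exclusive events \(B_1\) and \(B_2\) that together cover all outcomes, and an event \(A\) with a known conditional probability under each.

```r
# Hypothetical inputs (not from the talk)
p_b <- c(B1 = 0.3, B2 = 0.7)          # P(B_i), sums to 1
p_a_given_b <- c(B1 = 0.8, B2 = 0.4)  # P(A | B_i)

# Law of total probability: P(A) = sum_i P(A | B_i) P(B_i)
p_a <- sum(p_a_given_b * p_b)

# Bayes' Rule: P(B_i | A) = P(A | B_i) P(B_i) / P(A)
p_b_given_a <- p_a_given_b * p_b / p_a

print(p_a)
print(p_b_given_a)
```

The updated probabilities `p_b_given_a` sum to 1, which is a quick sanity check that the denominator was computed correctly.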
A single-arm trial of a novel therapeutic, with an assumed control response rate of at most 60%
| Example | Interim | Final |
|---|---|---|
| Responders | 16 | 23 |
| n | 23 | 40 |
| Response rate | 69.57 % | 57.5 % |
| Posterior probability* | ask phase1b | ask phase1b |
| Predictive posterior probability* | ask phase1b | - |
| Decision to develop molecule further : Go/Stop/Grey Zone | ask phase1b | ask phase1b |
Prior and Posterior of Beta Distribution for \(\pi\)
Conjugate Prior is \(f(\pi)\), where \(\pi \sim Beta(\alpha, \beta)\), the same family of distributions as the Posterior
The prior mean response rate (RR) is : \[E(\pi) = \frac{\alpha}{\alpha + \beta}\]
Likelihood is \(f(x|\pi)\), where \(x \sim Binomial(n, \pi)\)
The updated Posterior \(f( \pi | x )\) is again a \(Beta\) distribution (same family as prior) : \[ \pi \ | \ x \sim Beta(\alpha + x, \ \beta + n - x)\] where \(x\) is the number of responders in the current trial
Posterior Probability : \(P (\pi > 60 \% \ | \ \alpha + x, \beta + n - x )\)
Predictive Posterior Probability : \(P (success \ or \ failure \ at \ final)\)
try it yourself :
Parameters:
Historical trial showed 1 of 3 responders
we then set alpha = 1, beta = 2
expected mean for prior distribution = 1 / (1 + 2) = 1/3
Current experiment has 16 / 23 responders
expected mean for posterior distribution = (1 + 16) / (17 + 2 + 23 - 16) = 17 / 26 ≈ 0.65
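The update above can be reproduced in base R without any package. This is a minimal sketch using the Beta(1, 2) prior and the 16 / 23 interim data from this slide; the variable names and the 95% interval level are choices made for the sketch.

```r
# Prior from the historical trial: 1 responder out of 3
alpha_prior <- 1
beta_prior <- 2

# Current experiment
x <- 16   # responders
n <- 23   # patients

# Conjugate update: Beta(alpha + x, beta + n - x)
alpha_post <- alpha_prior + x
beta_post <- beta_prior + n - x

# Expected means of the prior and the posterior
prior_mean <- alpha_prior / (alpha_prior + beta_prior)
post_mean <- alpha_post / (alpha_post + beta_post)

# Posterior probability that the response rate exceeds 60%
post_prob <- pbeta(0.6, alpha_post, beta_post, lower.tail = FALSE)

# Equal-tailed 95% credible interval
cred_int <- qbeta(c(0.025, 0.975), alpha_post, beta_post)

print(c(prior_mean = prior_mean, post_mean = post_mean, post_prob = post_prob))
print(cred_int)
```

`post_prob` is the quantity written as \(P(\pi > 60\%)\) under the updated Beta distribution; a single-arm posterior probability function such as `phase1b::postprob` is meant to answer the same question.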
control <- 0.6
confidence_seventy <- 0.7
result <- phase1b::predprob(
  x = 16, n = 23, Nmax = 40, p = control,
  thetaT = confidence_seventy,
  parE = c(0.6, 0.4)
)
result$result
[1] 0.8211011
confidence_ninety <- 0.9
result_high_thetaT <- phase1b::predprob(
  x = 16, n = 23, Nmax = 40, p = control,
  thetaT = confidence_ninety,
  parE = c(0.6, 0.4)
)
result_high_thetaT$result
[1] 0.5655589
Predictive Posterior Probability (Lee & Liu, 2008)
\[ \sum_{i = 0}^{m} P( Y = i \ | \ x ) \cdot I\ ( Prob ( P > p_{0} \ | \ x, Y = i) > \theta_{T} ) \]
\[ = \sum_{i = 0}^{m} \{ P( Y = i \ | \ x ) \cdot I\ ( B_{i} > \theta_{T} ) \}
= \sum_{i = 0}^{m} P( Y = i \ | \ x ) \cdot I_{i} \]
Predictive Posterior Probability (PPP)
\[ PPP = \sum_{i = 0}^{m} \{ P( Y = i \ | \ x ) \cdot I\ ( B_{i} > \theta_{T} ) \} \]
With \(p_{0} = 60 \%\) and \(\theta_{T} = 70 \%\), the indicator equals 1 when \(P(RR \ at \ final > 60 \%) > 70 \%\)
# The original Lee and Liu (Table 1) example:
# Nmax = 40, x = 16, n = 23, beta(0.6,0.4) prior distribution,
# thetaT = 0.7. The control response rate is 60%:
results <- phase1b::predprob(
  x = 16, # current number of responders
  n = 23, # sample size at interim
  Nmax = 40, # max sample size
  p = 0.6, # control response rate
  thetaT = 0.7, # confidence
  parE = c(0.6, 0.4) # prior alpha and beta
)
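To see what the PPP sum is doing, it can be written out in base R. The helper `predprob_sketch` below is hypothetical and is not part of phase1b; it assumes a single Beta prior (no mixture), a beta-binomial predictive distribution for the \(m = Nmax - n\) future patients, and the same argument names as the call above.

```r
# Hypothetical helper, written for illustration only (not the phase1b code)
predprob_sketch <- function(x, n, Nmax, p, thetaT, parE) {
  m <- Nmax - n            # patients still to be enrolled
  i <- 0:m                 # possible numbers of future responders Y

  # Posterior at interim: Beta(a, b)
  a <- parE[1] + x
  b <- parE[2] + n - x

  # P(Y = i | x): beta-binomial predictive probabilities
  pred <- choose(m, i) * exp(lbeta(a + i, b + m - i) - lbeta(a, b))

  # B_i = P(response rate > p | x, Y = i) from the posterior at final
  post <- pbeta(p, a + i, b + m - i, lower.tail = FALSE)

  # Sum the predictive probabilities of outcomes that clear the threshold
  sum(pred * (post > thetaT))
}

pp_70 <- predprob_sketch(x = 16, n = 23, Nmax = 40, p = 0.6,
                         thetaT = 0.7, parE = c(0.6, 0.4))
pp_90 <- predprob_sketch(x = 16, n = 23, Nmax = 40, p = 0.6,
                         thetaT = 0.9, parE = c(0.6, 0.4))
print(c(thetaT_0.7 = pp_70, thetaT_0.9 = pp_90))
```

If the sketch matches the package's calculation, the two values should agree closely with the `phase1b::predprob` results shown earlier for `thetaT = 0.7` and `thetaT = 0.9`. Raising `thetaT` can only remove terms from the sum, so the predictive probability never increases.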
At n = 34, we have some useful results (we did not have to evaluate 40!)
1/3 of results are known at interim; most of these results are in the Gray Zone
About 1/2 of results are also in the Gray Zone at final
Expanded features
…. and wiggle room!
Feature columns: SOC uncertainty, single-arm, two-arm, simulation, plotting, boundaries
postprob ✔️
postprobDist ✔️ ✔️
predprob ✔️
predprobDist ✔️ ✔️
ocPostprob ✔️ ✔️
ocPostprobDist ✔️ ✔️ ✔️
ocPredprob ✔️ ✔️
ocPredprobDist ✔️ ✔️ ✔️
ocRctPostprobDist ✔️ ✔️ ✔️ ✔️
ocRctPredprobDist ✔️ ✔️ ✔️ ✔️
plotBeta ✔️ ✔️
plotDecision ✔️
plotOc ✔️
plotBounds ✔️
boundsPostprob ✔️
boundsPredprob ✔️
Some references
Held L, Sabanés Bové D (2020). Likelihood and Bayesian Inference: With Applications in Biology and Medicine, 2nd Edition.
Lesaffre E, Lawson A (2012). Bayesian Biostatistics, 1st Edition.
Thall P F, Simon R (1994). Practical Bayesian Guidelines for Phase IIB Clinical Trials. Biometrics, 50, 337-349.
Lee J J, Liu D D (2008). A Predictive Probability Design for Phase II Cancer Clinical Trials. Clinical Trials, 5(2), 93-106.
Yeo A T, Sabanés Bové D, Elze M, Pourmohamad T, Zhu J, Lymp J, Teterina A (2024). phase1b: Calculations for Decisions on Phase 1b Clinical Trials. R package version 1.0.0, https://genentech.github.io/phase1b
Zeileis A, Fisher J C, Hornik K, Ihaka R, McWhite C D, Murrell P, Stauffer R, Wilke C O (2020). colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes. Journal of Statistical Software.
Thanks DAGStat 2025
Audrey Yeo
M Nursing (Sydney)
M Sci Biostats (Zürich)