Multivariate Normal Probability that Margin is Maximum

James Totterdell ยท 2019-05-08

In response adaptive randomisation we are usually interested in weighting allocation ratios proportional to the probability that one treatment is the best out of a collection of treatments. For example, suppose we had \(P\) treatments, then for each treatment \(p\) we are interested in \[ \mathbb P(\beta_p = \text{max}\{\beta_{1:P}\}),\quad \text{for }p=1,...P. \] Equivalently, \[ \mathbb P(\beta_p - \beta_{1:P} > 0),\quad \text{for }p=1,...,P. \]

In practice, these probabiliies would be calculated based on MCMC draws from the posterior, however, for the purposes of assessing trial characteristics, we may resort to normal approximations for running thousands of trial simulations. In this case, the above probabilities may still be most efficiently calculated using Monte Carlo draws from the multivariate normal, however, we can also calculate them numerically.

Suppose \(\beta\sim N(\mu,\Sigma)\), then we are interested in the linear transformation \[ Z_p = A_p\beta_{1:P} \] where \[ A_p = \begin{pmatrix} -1 & 0 & 0 & \cdots & 1 & \cdots & 0 \\ 0 & -1 & 0 & \cdots & 1 & \cdots & 0 \\ 0 & 0 & -1 & \cdots & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots &\vdots &\vdots &\vdots \\ 0 & 0 & 0 & \cdots & 1 &\cdots & -1 \end{pmatrix}, \] that is, the \(p\)th column is ones and column cell \(i,i\) is -1 with the remaining cells zero.

Given that multivariate normal is invariant under affine transformations with linearly independent rows, we have \[ A\beta \sim N(A\mu, A\Sigma A^\top) \] and we just calculate the upper tail probability using the CDF, that is \(\Phi(0, \infty; A\mu, A\Sigma A^\top)\) where \[ \Phi(a, b;\mu,\Sigma) = \frac{1}{\sqrt{|\Sigma|(2\pi)^P}}\int_{a_1}^{b_1}\cdots\int_{a_P}^{b_P} e^{-\frac{1}{2}(x-\mu)\Sigma^{-1}(x-\mu)}dx_P...dx_1. \]

Example

For example,

library(brms)
library(varapproxr)
library(mvtnorm)

beta <- sort(rnorm(5, 0, 0.1))
n <- rep(20, 5)
y <- rbinom(5, n, plogis(beta))
x <- factor(1:5)
D <- data.frame(n = 20, y = y, x = x)
X <- model.matrix( ~ 0 + x)[,]
A <- matrix(c(1, 1, 1, 1,
             -1, 0, 0, 0,
              0,-1, 0, 0,
              0, 0, -1, 0,
              0, 0, 0, -1), 4, 5)
mod_mcmc <- brm(y | trials(n) ~ 0 + x, data = D, family = binomial, refresh = 0, iter = 1e4,
                prior = prior(normal(0, 1), class = b))
mod_vb <- vb_logistic_n(X, y, n, mu0 = rep(0, 5), Sigma0 = diag(1, 5),
                        mu_init = rep(0, 5), Sigma_init = diag(1, 5))

rbind(
  prop.table(table(factor(apply(as.matrix(mod_mcmc), 1, which.max), levels = 1:5))),
  c(pmvnorm(0, Inf, drop(A[,c(1,2,3,4,5)] %*% mod_vb$mu), 
            sigma = A[,c(1,2,3,4,5)] %*% mod_vb$Sigma %*% t(A[,c(1,2,3,4,5)]))[1],
    pmvnorm(0, Inf, drop(A[,c(2,1,3,4,5)] %*% mod_vb$mu), sigma = A[,c(2,1,3,4,5)] %*% mod_vb$Sigma %*% t(A[,c(2,1,3,4,5)]))[1],
    pmvnorm(0, Inf, drop(A[,c(2,3,1,4,5)] %*% mod_vb$mu), sigma = A[,c(2,3,1,4,5)] %*% mod_vb$Sigma %*% t(A[,c(2,3,1,4,5)]))[1],
    pmvnorm(0, Inf, drop(A[,c(2,3,4,1,5)] %*% mod_vb$mu), sigma = A[,c(2,3,4,1,5)] %*% mod_vb$Sigma %*% t(A[,c(2,3,4,1,5)]))[1],
    pmvnorm(0, Inf, drop(A[,c(2,3,4,5,1)] %*% mod_vb$mu), sigma = A[,c(2,3,4,5,1)] %*% mod_vb$Sigma %*% t(A[,c(2,3,4,5,1)]))[1])
)
##               1         2         3         4         5
## [1,] 0.04655000 0.1979000 0.1982500 0.3549500 0.2023500
## [2,] 0.05173443 0.1995843 0.1996116 0.3493869 0.1996634