Post
Kenneth Hung

Binomial ranking with SARS data

With the coronavirus spreading to many countries, Rebecca asked me a curious question: how does the US perform during SARS compared to other regions in terms of survival rate? While we can compute the survival rate of all infected regions and rank them accordingly, we are ignoring the sampling variability. For example, South Africa that has one case but also one death, does not necessarily perform worse than Indonesia where there were two cases but both patients survived.

Setup

Of course there are many other factors, but we can consider an idealized model where the number of patients, in each country is predetermined, but the number of deaths, is random and comes from a binomial draw:

where represent the chance of a patient dying in region . We would then want to provide simultaneous confidence intervals for the rank of region , , as defined in Al Mohamad, Goeman, van Zwet (2018):

There is an obvious Bayesian way to achieve this. By setting up a reasonable prior, we can perform posterior draws of and search for confidence intervals of ranks that covers of the posterior draws. Naturally this can be extended to an empirical Bayes way as well, as suggested in one of the comments in this Cross Validated thread. We want to focus on strict frequentist methods here.

Data

I have never done data scraping, so I am glad that this led me to learn rvest. We read in the table from the SARS page on Wikipedia. It looks like this:

region cases deaths fatality
China (mainland) 5327 349 0.06551530
Hong Kong 1755 299 0.17037037
Taiwan 346 37 0.10693642
Canada 251 43 0.17131474
Singapore 238 33 0.13865546
Vietnam 63 5 0.07936508
United States 27 0 0.00000000
Philippines 14 2 0.14285714

Method 1: Simultaneous confidence intervals

One way is to construct simultaneous confidence intervals for each of the region, and “project” to figure out the ranks. We construct UMAU simultaneous confidence intervals:

Simultaneous UMAU intervals for SARS fatality

These UMAU intervals are random by nature. Furthermore, the simultaneous coverage is achieved by Bonferroni correction here. A more fine-tuned analysis can be obtain by strategically distribute the type I error over the 29 regions, especially when some of these intervals are likely uninformative.

We can now compute the ranks for all parameters falling into the simultaneous confidence region:

region cases deaths fatality ci.lo ci.hi rank.lo rank.hi
China (mainland) 5327 349 0.06551530 0.0553903386 0.07671217 1 26
Hong Kong 1755 299 0.17037037 0.1432909366 0.19993250 2 29
Taiwan 346 37 0.10693642 0.0619888081 0.16667006 1 29
Canada 251 43 0.17131474 0.1042017481 0.25591583 2 29
Singapore 238 33 0.13865546 0.0795036998 0.21678338 2 29
Vietnam 63 5 0.07936508 0.0099142628 0.22994923 1 29
United States 27 0 0.00000000 0.0000000000 0.18652527 1 29
Philippines 14 2 0.14285714 0.0001578934 0.58405200 1 29

Method 2: Simultaneous pairwise tests

Binomial distributions is an exponential family, so contrasts like are amenable to UMPU tests. We can perform pairwise tests, with Bonferroni correction, and draw conclusions about the ranks.

region cases deaths fatality rank.lo rank.hi
China (mainland) 5327 349 0.06551530 1 27
Hong Kong 1755 299 0.17037037 2 29
Taiwan 346 37 0.10693642 1 29
Canada 251 43 0.17131474 2 29
Singapore 238 33 0.13865546 1 29
Vietnam 63 5 0.07936508 2 29
United States 27 0 0.00000000 1 29
Philippines 14 2 0.14285714 1 29

Here we are bounded to correct for all tests, instead of taking advantage of Tukey’s HSD, which incurs a much smaller penalty for multiple testing. Asymptotically we can always think of the likelihood as if it came from a Gaussian distribution and use Tukey’s HSD, but we most definitely are not in any reasonable asymptotic regime here.

Thoughts

Are there better, strictly frequentist methods for computing these rank confidence intervals? This alone seems hard, but there is a natural, even harder generalization of this problem: suppose we have a joint distribution of exponential families where the base measure does not need to be the same

is there a powerful method for ranking ?