How Reliable Are These Numbers?

Risk numbers disagree. The WHO and the Institute for Health Metrics and Evaluation (IHME) report malaria deaths as 550,000 and 760,000 respectively: a 38% gap between two estimates of the same underlying deaths. Our World in Data’s Deadliest Animals chart is visually compelling, but converting annual death counts to per-encounter micromorts is non-trivial. This vignette documents how we handle that uncertainty.

1. Why Risk Numbers Disagree

Three factors drive disagreement between sources:

  1. Numerator uncertainty: Death attribution varies with how cause of death is recorded (ICD-10 coding, verbal autopsy, hospital records).
  2. Denominator uncertainty: How many people were exposed? A “deaths per year” figure means nothing without knowing the exposure population.
  3. Temporal and geographic aggregation: A global annual average hides enormous regional and seasonal variation.

Our inclusion criteria: traceable numerator + defined denominator + reproducible calculation. We reject risks where we cannot identify both the death count and the population at risk.

2. The Confidence System

Every entry in atomic_risks() carries a confidence tier:

Confidence tiers with examples from the micromort dataset
| Tier | Criteria | Example | Source type |
|---|---|---|---|
| high | Peer-reviewed, large-N studies with defined denominators | Medical radiation (NRC dosimetry) | Regulatory agency |
| medium | Reputable sources, reasonable denominators, some extrapolation | Wikipedia micromort list, CDC injury data | Secondary compilation |
| low | Limited sources, regional uncertainty, or extrapolated denominators | Snake bite in rural Africa (WHO estimate) | Expert estimate |
| estimated | Derived by calculation from a model (e.g., LNT for radiation) | Annual cosmic radiation from LNT model | Model-derived |

Validation status (new)

Within each confidence tier, we now track how thoroughly the estimate has been cross-checked:

Validation status levels
| Status | Definition | Source count | Example |
|---|---|---|---|
| single_source | One citation, no cross-check | 1 | Most legacy entries from Wikipedia/micromorts.rip |
| corroborated | 2+ sources agree within 2x | 2+ | Flight risks (Boeing + NCRP + medical literature) |
| cross_validated | 3+ sources, range documented, outliers explained | 3+ | (Future: entries with systematic literature review) |

Current validation status across all entries

| confidence | corroborated | single_source |
|---|---|---|
| high | 29 | 9 |
| low | 3 | 0 |
| medium | 12 | 76 |
| estimated | 0 | 2 |
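
The status definitions above can be sketched as a small classifier. This is a minimal illustration in base R, not the package's own logic: `classify_validation()` and its `reviewed` argument are hypothetical names, the example estimates are made up, and the handling of sources that disagree by more than 2x is an assumption because the table does not cover that case.

```r
# Hypothetical helper, not part of the micromort package.
# estimates: point estimates (micromorts) for one risk, one per independent source.
# reviewed:  TRUE once the range is documented and outliers are explained (manual step).
classify_validation <- function(estimates, reviewed = FALSE) {
  estimates <- estimates[!is.na(estimates)]
  stopifnot(length(estimates) >= 1, all(estimates > 0))
  n_sources <- length(estimates)
  spread <- max(estimates) / min(estimates)  # fold difference between extremes

  if (n_sources == 1) return("single_source")
  if (n_sources >= 3 && reviewed) return("cross_validated")
  if (spread <= 2) return("corroborated")
  # Assumption: the table does not say how to label 2+ sources that disagree
  # by more than 2x, so this sketch leaves them unclassified.
  NA_character_
}

# Illustrative inputs only (not taken from the dataset)
classify_validation(0.06)                            # one source
classify_validation(c(0.05, 0.07))                   # two sources within 2x
classify_validation(c(5, 6.7, 10), reviewed = TRUE)  # three sources, reviewed
classify_validation(c(0.5, 18.5))                    # sources far apart
```

Keeping `reviewed` as an explicit flag reflects that "range documented, outliers explained" is a judgement a person makes, not something a ratio test can decide.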

3. Geographic and Health Profile Conditioning

Geography is the biggest source of variation in risk data: the same snake bite ranges from 0.5 micromorts (mm) in the US, with antivenom, to 18.5 mm in rural sub-Saharan Africa, a 37x difference. Health profile conditioning can shift an individual risk even further: a bee sting is 0.03 mm for the general population but 31 mm for someone with a known allergy (roughly 1,000x).
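
The fold differences quoted above follow directly from the point estimates. A minimal base R sketch, where `fold_difference()` is a hypothetical helper and the values are the ones given in the paragraph:

```r
# Point estimates quoted in the text, in micromorts
snake <- c(us_antivenom = 0.5, rural_ssa = 18.5)
bee   <- c(general = 0.03, allergic = 31)

# Hypothetical helper: ratio of the largest to the smallest estimate
fold_difference <- function(x) max(x) / min(x)

fold_difference(snake)  # 37
fold_difference(bee)    # about 1033, which the text rounds to 1,000x
```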

For the full analysis of how geography and demographics reshuffle risk rankings, including disease mortality by country (IHME GBD data) and age-conditioned confounders (bed falls, anaesthesia), see the Confounding Variables vignette.

The common_risks() function supports profile-based filtering:

# Default: returns high-income, all-ages estimates
common_risks()

# Geographic and health profile conditioning
common_risks(profile = list(country = "NG"))
common_risks(profile = list(health_profile = "allergic"))

4. Cross-Validation Methods

We use five methods to assess data reliability:

Source triangulation

Compare the same risk across independent sources. For wildlife risks, we cross-reference:

  • OWID (Our World in Data) annual death counts (numerator)
  • CDC (Centers for Disease Control and Prevention) injury surveillance (US denominator)
  • WHO (World Health Organization) fact sheets (global denominator)
  • ISAF (International Shark Attack File) species-specific data

Denominator audit

Missing denominators are the most common failure mode. For each source, check whether it reports both a numerator (deaths) and a denominator (exposures).

| Animal | Numerator available? | Denominator available? | Included? |
|---|---|---|---|
| Shark | Yes (ISAF) | Yes (~100M swims/yr) | Yes |
| Dog | Yes (CDC, WHO) | Yes (4.5M bites US) | Yes |
| Mosquito | Yes (WHO: 600k+) | No per-encounter rate | No |
| Crocodile | Yes (CrocBITE) | No exposure estimate | No |
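
The audit above reduces to a simple rule: an entry is included only when both a numerator and a denominator are available. A minimal base R sketch of that rule, with the table re-entered by hand; the column names are illustrative and are not taken from the package:

```r
# Audit table re-entered from the text; column names are assumptions
audit <- data.frame(
  animal          = c("Shark", "Dog", "Mosquito", "Crocodile"),
  has_numerator   = c(TRUE, TRUE, TRUE, TRUE),
  has_denominator = c(TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Inclusion requires both halves of the rate
audit$included <- audit$has_numerator & audit$has_denominator

audit$animal[audit$included]   # animals that pass the audit
audit$animal[!audit$included]  # animals rejected for a missing denominator
```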

Temporal stability

Has the number changed significantly across editions of the source? Stable estimates across 5+ years increase confidence.

Geographic consistency

Do US, UK, and global estimates agree within an order of magnitude? Large discrepancies suggest unmeasured confounders (see Confounding Variables).
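
One way to make "within an order of magnitude" concrete is to require that the largest and smallest estimates differ by no more than a factor of 10. The sketch below assumes that reading; `within_order_of_magnitude()` is a hypothetical helper, and the inputs are regional point estimates quoted elsewhere in this vignette rather than US, UK, and global figures.

```r
# Hypothetical helper: TRUE when all estimates lie within a factor of 10
within_order_of_magnitude <- function(estimates) {
  stopifnot(length(estimates) >= 2, all(estimates > 0))
  max(estimates) / min(estimates) <= 10
}

# Regional point estimates from this vignette, in micromorts
snake_bite <- c(us = 0.5, rural_ssa = 18.5)
dog_bite   <- c(us = 6.7, rabies_endemic = 160)

within_order_of_magnitude(snake_bite)  # FALSE: 37x apart
within_order_of_magnitude(dog_bite)    # FALSE: about 24x apart

# Illustrative pair of close estimates (made-up values)
within_order_of_magnitude(c(0.05, 0.07))  # TRUE
```

A FALSE result does not mean either estimate is wrong; here it reflects conditions such as antivenom access and rabies prevalence, which is why such entries are stored separately by region.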

Order-of-magnitude test

Is the number physically plausible? A micromort value that implies more deaths than the population can support is a red flag.
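
A plausibility check of this kind can be sketched by turning a micromort value back into implied deaths and comparing that with the size of the exposed population. Both function names are hypothetical, and the population figures are illustrative assumptions rather than values from the dataset.

```r
# Deaths implied by a per-encounter risk and a yearly encounter count
implied_deaths <- function(micromorts, encounters) {
  micromorts * encounters / 1e6
}

# Hypothetical check: a per-encounter risk cannot exceed 1e6 micromorts (certain
# death), and implied deaths should not exceed the exposed population
is_plausible <- function(micromorts, encounters, population) {
  micromorts <= 1e6 && implied_deaths(micromorts, encounters) <= population
}

# Shark entry from this vignette: 0.06 micromorts over ~100M ocean swims per year
implied_deaths(0.06, 100e6)                    # about 6 deaths per year
is_plausible(0.06, 100e6, population = 330e6)  # TRUE (population is an assumed figure)

# Made-up bad entry: 50,000 micromorts over 100M encounters implies 5M deaths
is_plausible(5e4, 100e6, population = 1e6)     # FALSE
```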

5. Worked Example: Animal Risks from OWID

Our World in Data reports annual deaths by animal. Converting to per-encounter micromorts requires:

$$\text{micromorts} = \frac{\text{deaths per year}}{\text{encounters per year}} \times 10^6$$

Converting Our World in Data (OWID) annual counts to per-encounter micromorts
| Animal | Annual deaths (approx) | Encounters/yr (approx) | Micromorts | Source for denominator | In dataset? |
|---|---|---|---|---|---|
| Shark | ~6 (US) | ~100M ocean swims | 0.06 | ISAF | Yes |
| Dog (US) | ~30 | ~4.5M bites | 6.7 | CDC | Yes |
| Bee/wasp (US) | ~62 | ~2M stings | 0.03 | CDC | Yes |
| Snake (US) | ~5 | ~10,000 bites | 0.5 | CDC | Yes |
| Snake (Africa) | ~100,000 | ~5.4M bites | 18.5 | WHO/Lancet | Yes |
| Mosquito | ~600,000+ | Unknown per-bite | | | No |
| Crocodile | ~1,000 | Unknown | | | No |
| Elephant | ~500 | Unknown | | | No |

Mosquito, crocodile, and elephant fail our inclusion criteria: there is no defensible per-encounter denominator. Mosquito bites are ubiquitous in endemic regions, making a per-bite risk meaningless. We cite OWID for context but do not include these as micromort entries.
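
As a reproducibility check, the formula above can be applied to the approximate inputs printed in the table and compared with the Micromorts column. This is a minimal base R sketch: the data frame is re-entered by hand from the table, the column names are illustrative, and the 2x tolerance for flagging a mismatch is an assumption.

```r
# Rows with a defined denominator, re-entered from the table above
rows <- data.frame(
  animal     = c("Shark", "Dog (US)", "Bee/wasp (US)", "Snake (US)", "Snake (Africa)"),
  deaths     = c(6, 30, 62, 5, 100000),        # approximate annual deaths
  encounters = c(100e6, 4.5e6, 2e6, 1e4, 5.4e6),  # approximate encounters per year
  tabulated  = c(0.06, 6.7, 0.03, 0.5, 18.5),  # Micromorts column as printed
  stringsAsFactors = FALSE
)

# micromorts = deaths per year / encounters per year * 1e6
rows$calculated <- rows$deaths / rows$encounters * 1e6

# Flag rows where calculated and tabulated values differ by more than 2x (assumed tolerance)
rows$ratio    <- rows$calculated / rows$tabulated
rows$mismatch <- rows$ratio > 2 | rows$ratio < 0.5

rows[, c("animal", "calculated", "tabulated", "mismatch")]
```

With the rounded inputs as printed, the shark and dog rows reproduce the tabulated values (0.06 and about 6.7). The bee/wasp and both snake rows come out at roughly 31, 500, and 18,500, about 1,000 times the Micromorts column. That suggests those rows were derived from different inputs or units than the approximations shown, so they are worth re-deriving from the cited sources before reuse.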

6. Estimate Ranges

For wildlife entries, we document plausible ranges reflecting source disagreement:

Estimate ranges for wildlife entries
| activity | micromorts | estimate_range | source_count | validation_status |
|---|---|---|---|---|
| Shark encounter (ocean swim) | 0.06 | 0.03-0.10 | 2 | corroborated |
| Dog bite (US) | 6.70 | 5-10 | 2 | corroborated |
| Dog bite (rabies-endemic) | 160.00 | 100-250 | 2 | corroborated |
| Bee/wasp sting (general) | 0.03 | 0.02-0.05 | 2 | corroborated |
| Bee/wasp sting (allergic) | 31.00 | 20-50 | 2 | corroborated |
| Snake bite (US, with antivenom) | 0.50 | 0.3-1.0 | 2 | corroborated |
| Snake bite (rural sub-Saharan Africa) | 18.50 | 10-30 | 2 | corroborated |

The range reflects uncertainty in both the numerator (death counts vary by year and reporting) and denominator (exposure estimates are often rough). The point estimate is our best central value; the range brackets the plausible minimum and maximum.
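
To work with these ranges programmatically, the `estimate_range` strings can be split into numeric bounds. The sketch below re-enters the table by hand in base R and assumes the "low-high" string format shown above; the derived column names (`low`, `high`, `inside`, `fold_width`) are illustrative.

```r
ranges <- data.frame(
  activity = c("Shark encounter (ocean swim)", "Dog bite (US)",
               "Dog bite (rabies-endemic)", "Bee/wasp sting (general)",
               "Bee/wasp sting (allergic)", "Snake bite (US, with antivenom)",
               "Snake bite (rural sub-Saharan Africa)"),
  micromorts = c(0.06, 6.7, 160, 0.03, 31, 0.5, 18.5),
  estimate_range = c("0.03-0.10", "5-10", "100-250", "0.02-0.05",
                     "20-50", "0.3-1.0", "10-30"),
  stringsAsFactors = FALSE
)

# Split "low-high" into two numeric columns (all bounds are positive, so "-" is safe)
bounds <- do.call(rbind, lapply(strsplit(ranges$estimate_range, "-", fixed = TRUE), as.numeric))
ranges$low  <- bounds[, 1]
ranges$high <- bounds[, 2]

# Does the point estimate sit inside its range, and how wide is the range?
ranges$inside     <- ranges$micromorts >= ranges$low & ranges$micromorts <= ranges$high
ranges$fold_width <- ranges$high / ranges$low

all(ranges$inside)        # TRUE: every point estimate lies within its range
range(ranges$fold_width)  # ranges span between 2x and about 3.3x
```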

7. What You Can Contribute

If you find a better source for an existing entry, or want to propose a new risk, open an issue at github.com/johngavin/micromort that includes:

  1. Numerator: Death count and source citation
  2. Denominator: Exposure count and source citation
  3. Geography/condition: Does the estimate apply globally, or to a specific population?
  4. Time period: When was the data collected?

Entries start at validation_status = "single_source" and get upgraded as more sources confirm them.

Reproducibility

sessionInfo()
R version 4.6.1 (2026-06-24)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 26.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.32.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_1.2.1     micromort_0.2.0

loaded via a namespace (and not attached):
 [1] base64url_1.4     jsonlite_2.0.0    compiler_4.6.1    tidyselect_1.2.1 
 [5] Rcpp_1.1.1-1.1    callr_3.8.0       yaml_2.3.12       fastmap_1.2.0    
 [9] R6_2.6.1          generics_0.1.4    igraph_2.3.3      knitr_1.51       
[13] backports_1.5.1   targets_1.12.0    tibble_3.3.1      units_1.0-1      
[17] maketools_1.3.2   rprojroot_2.1.1   pillar_1.11.1     rlang_1.2.0      
[21] xfun_0.59         sys_3.4.3         otel_0.2.0        cli_3.6.6        
[25] magrittr_2.0.5    ps_1.9.3          digest_0.6.39     processx_3.9.0   
[29] secretbase_1.3.0  lifecycle_1.0.5   prettyunits_1.2.0 vctrs_0.7.3      
[33] evaluate_1.0.5    glue_1.8.1        data.table_1.18.4 codetools_0.2-20 
[37] buildtools_1.0.0  rmarkdown_2.31    tools_4.6.1       pkgconfig_2.0.3  
[41] htmltools_0.5.9  

micromort 0.1.0 | Git 94d93d2 | R 4.5.2 | Built 2026-04-18 12:20:56