class: center, middle, inverse, title-slide

# Modeling Malaria: Methods and Applications

## SNRE Exit Seminar
(Slides available at: bit.ly/bentoh-exit)

### Kok Ben Toh

### 2020/09/21

---
class: inverse, center, middle

# Introduction to Malaria

---

# Introduction to Malaria

.pull-left1[
- Caused by the protozoan parasite *Plasmodium*
- 5 species cause human malaria
- Mosquito-borne disease
]

.pull-right1[
<img src="../assets/img/malaria-life-cycle.jpg" width="1065" style="display: block; margin: auto;" />

.source[*[Johns Hopkins Bloomberg School of Public Health](http://ocw.jhsph.edu)*]
]

---

# Stats and Geography

- 230 million cases worldwide in 2018; 400k deaths, 2/3 of them in children under 5 years old
- Distribution of disease
  - Tropics: Sub-Saharan Africa, South and Southeast Asia, Latin America
  - Correlated with environmental and socio-economic factors

<img src="../assets/img/world-malaria-map.png" width="95%" style="display: block; margin: auto;" />

.source[*(Weiss et al. 2019)*]

---

# Modeling Malaria

.pull-left[
- Integral part of understanding malaria
- Mathematical, e.g.
  - Compartmental deterministic models to understand disease transmission
- Statistical
  - Risk factor analysis
  - Mapping malaria risks
  - Clinical diagnosis
]

.pull-right[
<img src="../assets/img/malaria-compartment.jpg" width="1600" style="display: block; margin: auto;" />

.source[*(Mandal et al.
2011)*]
]

---

# Overview

- Improving National Level Spatial Mapping of Malaria through Alternative Spatial and Spatio-temporal Models

- Interlude

- Guiding placement of health facilities using malaria criteria and interactive tool

- Decision Support Tool to Predict Causes of Childhood Febrile Illness Using a Bayesian Model Approach

---
class: inverse, center, middle

# Improving National Level Spatial Mapping of Malaria through Alternative Spatial and Spatio-temporal Models

---

# Mapping malaria

- Malaria risk maps guide efficient resource allocation for intervention programs

<img src="../assets/img/malaria-risk-bf-intro.png" width="2125" />

---

# Choice of Model

- Most common approach: Gaussian Process Model
  - Points close to an observed data point should have similar prevalence

<img src="../assets/img/gp-correl.gif" width="80%" style="display: block; margin: auto;" />

- Time-consuming when the number of sampling points is large
- Commonly approximated using Stochastic Partial Differential Equation (SPDE) + Integrated Nested Laplace Approximation (INLA)

---

# Alternative model?

.pull-left[
<img src="../assets/img/spde-gamgbm-length.png" width="50%" style="display: block; margin: auto;" />
]

.pull-right[
- SPDE-INLA is complicated
- Can we use faster and simpler alternatives instead?
]

---

# Alternative model 1

- Spatial smoothing using Generalized Additive Model (GAM)
- An underlying smoothed "landscape" unexplained by the covariates

<img src="../assets/img/gam-smoothed.png" width="70%" style="display: block; margin: auto;" />

---

# Alternative model 2: Machine learning

- Decision-tree ensemble methods
  - e.g. random forests, gradient boosting tree/method (GBM)

<img src="../assets/img/gbm-schema3.png" width="50%" style="display: block; margin: auto;" />

---

# Borrow strength from the past?

- More and more countries now have multiple years of national malaria survey data.
  - e.g.
Burkina Faso 2010 vs 2014
- Spatial vs Spatiotemporal setting

<img src="../assets/img/temporal-choice.png" width="90%" style="display: block; margin: auto;" />

---

# Objectives

- To determine whether GAM and a state-of-the-art machine learning method (e.g. GBM), under both spatial and spatio-temporal settings, can be good alternatives to the more complicated SPDE method

- To determine whether including past data is beneficial in modeling the current spatial distribution of malaria prevalence at the national scale

---

# Data

- Demographic and Health Surveys (DHS) data from five countries
- Spatial covariates: free and publicly available remote sensing or GIS products

.pull-left[
<img src="../assets/img/map-chp1-5cty.png" width="1365" style="display: block; margin: auto;" />
]

.pull-right[
<img src="../assets/img/chp1-covars-layer.png" width="1365" style="display: block; margin: auto;" />
]

---

# Model comparison

- 5 countries × 4 models × 2 settings
- Spatial setting

<img src="../assets/img/chp1-schema-sp.png" width="2028" style="display: block; margin: auto;" />

- Spatiotemporal setting

<img src="../assets/img/chp1-schema-st.png" width="2029" style="display: block; margin: auto;" />

---

# Results

- GAM and SPDE

<img src="../assets/img/BF_LLMAE.png" width="60%" style="display: block; margin: auto;" />

---

# Results

- GAM and SPDE

<img src="../assets/img/UG_LLMAE.png" width="60%" style="display: block; margin: auto;" />

---

# Results

- Very small difference among top models (look at the axis)

<img src="../assets/img/NG_LLMAE.png" width="60%" style="display: block; margin: auto;" />

---

# Results

- GBM

<img src="../assets/img/ML_LLMAE.png" width="60%" style="display: block; margin: auto;" />

---

# Results

- GBM; GAM

<img src="../assets/img/MW_LLMAE.png" width="60%" style="display: block; margin: auto;" />

---

# Spatial vs Spatiotemporal setting

<img src="../assets/img/Diff-MAE-pres.png" width="60%" style="display: block; margin: auto;" />

---

# Discussions

- No single best
model; performance varies from setting to setting and country to country
- SPDE is consistent, but doesn't gain much from incorporating past data
  - Can deteriorate when past and present spatial dependencies are very different
- GAM is a good, fast and simple-to-use alternative, especially with more data
  - Dismal performance in irregularly shaped countries
  - High ratio of perimeter to square root of area: Malawi 8.7, Uganda 5.7
- GBM is unpredictable but generally fits well when more data are available
- Fit multiple models, or at least benchmark against GAM

---
class: inverse, center, middle

# Interlude

---

# Malaria treatment

- The most famous antimalarial drug is... Chloroquine

--

- But currently, the state-of-the-art drug is *artemisinin (Qinghaosu, 青蒿素)* and its derivatives.
  - First isolated in 1972 by Tu Youyou (屠呦呦) and members of *Project 523*
  - Awarded the 2015 Nobel Prize in Physiology or Medicine

<img src="../assets/img/Tu-Youyou-meme.jpg" width="45%" style="display: block; margin: auto;" />

.source[*Internet, creator unknown*]

---

# Artemisinin

- Isolated from *Artemisia annua*
- Traditional Chinese Medicine
  - Prescription for malaria in the 4th century AD

<img src="../assets/img/zhou-hou-fang-malaria.jpeg" width="50%" style="display: block; margin: auto;" />

.source[*[chinadaily.com.cn](http://china.chinadaily.com.cn/a/201901/27/WS5c4d2c9ba31010568bdc6a44.html)*]

---
class: inverse, center, middle

# Guiding placement of health facilities using malaria criteria and interactive tool

---

# Access to healthcare and malaria transmission

<img src="../assets/img/chp2-remote-schema1.png" width="60%" style="display: block; margin: auto;" />

---

# Access to healthcare and malaria transmission

<img src="../assets/img/chp2-remote-schema2.png" width="60%" style="display: block; margin: auto;" />

---

# Distance / Travel time to health facilities

- Early diagnosis and treatment reduce death and transmission
- Many factors contribute to access to healthcare
- Distance or travel time to health facilities is
an important predictor of malaria prevalence *(e.g. Schoeps et al. 2011, Kizito et al. 2012)*

---

# Bunkpurugu-Yunyoo District, Ghana

.pull-left1[
- 1200 km `\(^2\)`, 150K population
- 2 urban centers, 8 health facilities
- Multiple years of malaria surveys from 2010 to 2014
- Important predictors *(Amratia et al. 2019; Millar et al. 2018)*
  - Distance to health facilities (HF)
  - Distance to urban centers
]

.pull-right1[
<img src="../assets/img/byd-simplemap.png" width="1889" style="display: block; margin: auto;" />
]

---

# Expansion of health services?

- Community-based Health Planning and Services (CHPS)
  - Run by trained community health officers (CHOs)
  - Assigned to community health posts, with services at the doorstep
  - Treatment for malaria, acute respiratory diseases and other childhood illnesses
- Expand CHPS to all communities

---

# Prevalence vs Incidence

- Prevalence (0 to 1): probability that a person in the area has malaria at a point in time
- Incidence: number of cases during a given period in the population of interest
- Available data: prevalence in children under 5 years old
- Decision criteria: incidence (of all ages)
- Conversion from prevalence under five to incidence of all ages
  - Prevalence → Incidence is not always linear, especially for older children and adults

---

# Objectives

- Determine the optimal locations for new health facilities based on district-wide malaria criteria:
  1. Overall malaria prevalence of children under 5,
  2. Overall malaria incidence of all ages, and
  3. Average travel time to nearest health facilities

---

# Methods

- Three years of high transmission season data (2010 to 2012)
  - ~ 5000 children under five
  - 71 to 80 villages per year
- GAM with 5 predictors
  - Travel time to HF, distance to urban center, elevation, NDVI, log population density
- Use a Genetic Algorithm to find optimal locations given the number of health facilities and the criteria

---

# Results

- See the interactive visualizer and simulator.
  - http://bit.ly/ben-hf-app

### Pitch...
<img src="../assets/img/valle-conbio-grab.PNG" width="1888" style="display: block; margin: auto;" />

<!-- # Optimization -->

<!-- - Procedures: -->
<!-- - Add new facilit(ies) onto the map -->
<!-- - Recalculate travel time to nearest health facilities -->
<!-- - Use model to predict new sets of covariates -->
<!-- - Calculate district-wide metrics -->

<!-- - Specify number of health facilities and criteria -->
<!-- - Start in some random initial locations -->
<!-- - Use genetic algorithm to find locations that minimize metric -->
<!-- - Find one up to five optimal locations for each criteria -->

<!-- --- -->

---
class: inverse, center, middle

# Decision Support Tool to Predict Causes of Childhood Febrile Illness Using a Bayesian Model Approach

---

# Childhood Acute Febrile Illness (AFI)

.pull-left2[
- Malaria used to be the main cause of AFI in Africa
- With the decline in malaria transmission, other etiologies such as bacterial infection can become increasingly prominent
- Difficulty in diagnosis, lack of test kits and prioritization of malaria can drive up misdiagnosis
- Predicting infection status based on symptoms, demographic variables and hematological variables
]

.pull-right2[
<img src="../assets/img/fever-clipart.png" width="351" style="display: block; margin: auto;" />

.source[*[kissclipart.com](https://www.kissclipart.com/cartoon-thermometer-clipart-thermometer-fever-clip-p9mkzm/)*]
]

---

# Statistical decision support tool

- Interactive tools that inform health practitioners using predicted probability (of infection)
  - http://bit.ly/ben-afi-app

--

Ideal statistical decision support tool:

- Accommodate missing data
- User experience: Fast, small and flexible
- Accurate
- Informative and Interpretable

---

# Objectives

- Create a model to simultaneously
  - Impute missing variables
  - Make predictions
- Modified Bayesian Model Averaging (BMA)

---

# Data

- 2016/2017
- 1500 children 1 to 15 years old

<img src="../assets/img/afi-study-design.png" width="60%"
style="display: block; margin: auto;" />

---

# Data

- Model diagnostic test outcomes using the recorded predictors

.pull-left[
Diagnostic Tests

- Malaria
  - Rapid Diagnostic Test (RDT)
  - Microscopy
  - TaqMan Array Card (TAC; Subsample)
- Blood and Urine Culture if deemed necessary
]

.pull-right[
Recorded Predictors

- Age
- Time of visit (season)
- Body temperature
- 30 clinical symptoms (Binary)
- 10 Hematological parameters

Missing values in

- 31/43 predictors (Most < 5%)
- 17% of individuals
]

---

# Model training

<img src="../assets/img/mvp-bma-training-schema.png" width="100%" style="display: block; margin: auto;" />

---

# Model prediction

<img src="../assets/img/mvp-bma-prediction-schema.png" width="100%" style="display: block; margin: auto;" />

---

# Performance vs other methods

<img src="../assets/img/afi-msev3.png" width="100%" style="display: block; margin: auto;" />

---

# Key findings

- The modified BMA is as accurate as random forest without losing its interpretability
  - More accurate than stepwise logistic regression
- Provides predictor importance for each diagnostic test
- Can be incorporated into a relatively responsive interactive decision support tool

---

# Key findings

- Poor predictive performance with Malaria TAC
  - Low sample size
  - Submicroscopic infection is asymptomatic → Symptoms are not useful predictors
- Poor predictive performance with Blood culture
  - Low sample size
  - Multiple types of pathogens → diverse signs and symptoms

---

# References

- Schoeps, Anja, et al. "The Effect of Distance to Health-Care Facilities on Childhood Mortality in Rural Burkina Faso", American Journal of Epidemiology 173, 5 (2011), pp. 492--498.
- Kizito, James, et al. "Improving access to health care for malaria in Africa: a review of literature on what attracts patients", Malaria Journal 11, 1 (2012), pp. 55.
- Weiss, D.J. et al. 2019.
Mapping the global prevalence, incidence, and mortality of Plasmodium falciparum, 2000–17: a spatial and temporal modelling study. Lancet 394: 322–331.
- Mandal, S. et al. Mathematical models of malaria - a review. Malar J 10, 202 (2011).
- Millar, J. et al. 2018. Detecting local risk factors for residual malaria in northern Ghana using Bayesian model averaging. Malar. J. 17: 343.
- Amratia, P. et al. 2019. Characterizing local-scale heterogeneity of malaria risk: A case study in Bunkpurugu-Yunyoo district in northern Ghana. Malar. J. 18: 1–14.

---

# Acknowledgement

- Dr Denis Valle
- Committee members: Dr Nikolay Bliznyuk, Dr Rhoel Dinglasan, Dr Tom Hladish
- Collaborators:
  - Dr Gordon Awandare, Dr Justin Stoler, Dr Nicholas Amoako and WACCBIP;
  - Dr Benjamin Abuaku, Dr Paul Psychas, Noguchi Medical Research Institute and President's Malaria Initiative
- Labmates
- Family and friends

---
class: inverse, center, middle

# Thank you very much!

Feedback, comments and questions?

Email: kokbent [at] ufl.edu