Research Overview

The Manrai Lab is a team of machine learning scientists, clinicians, and biomedical data scientists working to improve medical decision making by developing computational approaches that incorporate rich and deep representations of clinical state and an individual's identity into care. Active projects include:

  1. Improving genetic variant classification and quantifying risk ("penetrance") in clinical genomics, with a focus on inherited heart disease (e.g. Manrai et al. NEJM 2016)
  2. Measuring "normal" variation for blood laboratory biomarkers across populations with a focus on creatinine and kidney disease (e.g. Manrai et al. JAMA 2018)
  3. Developing semi-supervised learning approaches with applications including medical imaging and text (e.g. Melas-Kyriazi & Manrai 2020)
  4. Modeling reproducibility in integrative biomedical studies using meta-science ("science of science") approaches (e.g. Manrai et al. AJE 2019)

The group's research has been published in the New England Journal of Medicine and JAMA, presented at the National Academy of Sciences, and featured in the New York Times, Wall Street Journal, and NPR.

Research Background

Arjun (Raj) Manrai is an Assistant Professor at Harvard Medical School and Faculty Member in the Computational Health Informatics Program (CHIP) at Boston Children’s Hospital. Manrai received an A.B. in Physics with Highest Honors from Harvard and earned his Ph.D. in Bioinformatics and Integrative Genomics from the Harvard-MIT Division of Health Sciences and Technology.

 

Publications

  1. Tackling algorithmic bias and promoting transparency in health datasets: the STANDING Together consensus recommendations. Lancet Digit Health. 2025 Jan; 7(1):e64-e88.
  2. Large Language Models and the Degradation of the Medical Record. N Engl J Med. 2024 Oct 31; 391(17):1561-1564.
  3. Medical Artificial Intelligence and Human Values. Reply. N Engl J Med. 2024 Sep 26; 391(12):1167-1168.
  4. Projected Changes in Statin and Antihypertensive Therapy Eligibility With the AHA PREVENT Cardiovascular Risk Equations. JAMA. 2024 09 24; 332(12):989-1000.
  5. Medical Artificial Intelligence and Human Values. N Engl J Med. 2024 May 30; 390(20):1895-1904.
  6. Discordance between a deep learning model and clinical-grade variant pathogenicity classification in a rare disease cohort. medRxiv. 2024 May 23.
  7. Implications of Race Adjustment in Lung-Function Equations. N Engl J Med. 2024 Jun 13; 390(22):2083-2097.
  8. Heterogeneity in elevated glucose and A1C as predictors of the prediabetes to diabetes transition: Framingham Heart Study, Multi-Ethnic Study on Atherosclerosis, Jackson Heart Study, and Atherosclerosis Risk In Communities. medRxiv. 2024 Apr 08.
  9. To do no harm - and the most good - with AI in health care. Nat Med. 2024 Mar; 30(3):623-627.
  10. Assessing the genetic contribution of cumulative behavioral factors associated with longitudinal type 2 diabetes risk highlights adiposity and the brain-metabolic axis. medRxiv. 2024 Jan 31.
  11. Decoding the exposome: data science methodologies and implications in exposome-wide association studies (ExWASs). Exposome. 2024; 4(1):osae001.
  12. Prediction and stratification of longitudinal risk for chronic obstructive pulmonary disease across smoking behaviors. Nat Commun. 2023 Dec 14; 14(1):8297.
  13. Artificial Intelligence vs Clinician Performance in Estimating Probabilities of Diagnoses Before and After Testing. JAMA Netw Open. 2023 12 01; 6(12):e2347075.
  14. Publisher Correction: Scientific discovery in the age of artificial intelligence. Nature. 2023 Sep; 621(7978):E33.
  15. Scientific discovery in the age of artificial intelligence. Nature. 2023 Aug; 620(7972):47-60.
  16. Prediction and stratification of longitudinal risk for chronic obstructive pulmonary disease across smoking behaviors. medRxiv. 2023 Apr 05.
  17. Artificial Intelligence in Medicine. N Engl J Med. 2023 Mar 30; 388(13):1220-1221.
  18. National Projections for Clinical Implications of Race-Free Creatinine-Based GFR Estimating Equations. J Am Soc Nephrol. 2023 02 01; 34(2):309-321.
  19. Positive Predictive Value of the Thumb-Palm Test for General Population Screening of Ascending Aortic Aneurysm. Am J Cardiol. 2021 12 15; 161:116-117.
  20. Leveraging vibration of effects analysis for robust discovery in observational biomedical data science. PLoS Biol. 2021 09; 19(9):e3001398.
  21. Data Mining Approaches to Reference Interval Studies. Clin Chem. 2021 09 01; 67(9):1175-1181.
  22. Foundational Considerations for Artificial Intelligence Using Ophthalmic Images. Ophthalmology. 2022 02; 129(2):e14-e32.
  23. Race-Free Equations for eGFR: Comparing Effects on CKD Classification. J Am Soc Nephrol. 2021 08; 32(8):1868-1870.
  24. Physicians, Probabilities, and Populations-Estimating the Likelihood of Disease for Common Clinical Scenarios. JAMA Intern Med. 2021 06 01; 181(6):756-757.
  25. Removing Race From Kidney Function Estimates-Reply. JAMA. 2021 05 18; 325(19):2018-2019.
  26. Association of 152 Biomarker Reference Intervals with All-Cause Mortality in Participants of a General United States Survey from 1999 to 2010. Clin Chem. 2021 03 01; 67(3):500-507.
  27. Harmonizing the Collection of Clinical Data on Genetic Testing Requisition Forms to Enhance Variant Interpretation in Hypertrophic Cardiomyopathy (HCM): A Study from the ClinGen Cardiomyopathy Variant Curation Expert Panel. J Mol Diagn. 2021 05; 23(5):589-598.
  28. Comparisons of Polyexposure, Polygenic, and Clinical Risk Scores in Risk Prediction of Type 2 Diabetes. Diabetes Care. 2021 04; 44(4):935-943.
  29. Clinical Implications of Removing Race From Estimates of Kidney Function. JAMA. 2021 Jan 12; 325(2):184-186.
  30. In Search of a Better Equation - Performance and Equity in Estimates of Kidney Function. N Engl J Med. 2021 Feb 04; 384(5):396-399.
  31. What about the environment? Leveraging multi-omic datasets to characterize the environment's role in human health. Pac Symp Biocomput. 2021; 26:309-315.
  32. Scalability and cost-effectiveness analysis of whole genome-wide association studies on Google Cloud Platform and Amazon Web Services. J Am Med Inform Assoc. 2020 09 01; 27(9):1425-1430.
  33. Prediction of chronological and biological age from laboratory data. Aging (Albany NY). 2020 05 05; 12(9):7626-7638.
  34. Challenges to the Reproducibility of Machine Learning Models in Health Care. JAMA. 2020 01 28; 323(4):305-306.
  35. Signals Among Signals: Prioritizing Nongenetic Associations in Massive Data Sets. Am J Epidemiol. 2019 05 01; 188(5):846-850.
  36. Author Correction: Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes. Nat Genet. 2019 04; 51(4):764-765.
  37. Potential Excessive Testing at Scale: Biomarkers, Genomics, and Machine Learning. JAMA. 2019 Feb 26; 321(8):739-740.
  38. Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes. Nat Genet. 2019 02; 51(2):327-334.
  39. Using Big Data to Determine Reference Values for Laboratory Tests-Reply. JAMA. 2018 10 09; 320(14):1496.
  40. In the Era of Precision Medicine and Big Data, Who Is Normal? JAMA. 2018 May 15; 319(19):1981-1982.
  41. Biomedical informatics and machine learning for clinical genomics. Hum Mol Genet. 2018 05 01; 27(R1):R29-R34.
  42. Adaptation and validation of the ACMG/AMP variant classification framework for MYH7-associated inherited cardiomyopathies: recommendations by ClinGen's Inherited Cardiomyopathy Expert Panel. Genet Med. 2018 03; 20(3):351-359.
  43. Association of Sex With Recurrence of Autism Spectrum Disorder Among Siblings. JAMA Pediatr. 2017 11 01; 171(11):1107-1112.
  44. Fostering reproducibility in industry-academia research. Science. 2017 08 25; 357(6353):759-761.
  45. Systematic correlation of environmental exposure and physiological and self-reported behaviour factors with leukocyte telomere length. Int J Epidemiol. 2017 02 01; 46(1):44-56.
  46. METHODS TO ENSURE THE REPRODUCIBILITY OF BIOMEDICAL RESEARCH. Pac Symp Biocomput. 2017; 22:117-119.
  47. Informatics and Data Analytics to Support Exposome-Based Discovery for Public Health. Annu Rev Public Health. 2017 Mar 20; 38:279-294.
  48. Genetic Misdiagnoses and the Potential for Health Disparities. N Engl J Med. 2016 Aug 18; 375(7):655-65.
  49. Clinical Genomics: From Pathogenicity Claims to Quantitative Risk Estimates. JAMA. 2016 Mar 22-29; 315(12):1233-4.
  50. REPRODUCIBLE AND SHAREABLE QUANTIFICATIONS OF PATHOGENICITY. Pac Symp Biocomput. 2016; 21:231-42.
  51. METHODS TO ENHANCE THE REPRODUCIBILITY OF PRECISION MEDICINE. Pac Symp Biocomput. 2016; 21:180-182.
  52. Development of exposome correlation globes to map out environment-wide associations. Pac Symp Biocomput. 2015; 231-42.
  53. Medicine's uncomfortable relationship with math: calculating positive predictive value. JAMA Intern Med. 2014 Jun; 174(6):991-3.
  54. Enriched protein screening of human bone marrow mesenchymal stromal cell secretions reveals MFAP5 and PENK as novel IL-10 modulators. Mol Ther. 2014 May; 22(5):999-1007.
  55. Urinary-cell mRNA and acute kidney-transplant rejection. N Engl J Med. 2013 11 07; 369(19):1859.
  56. CEAS: cis-regulatory element annotation system. Bioinformatics. 2009 Oct 01; 25(19):2605-6.
  57. Androgen receptor regulates a distinct transcription program in androgen-independent prostate cancer. Cell. 2009 Jul 23; 138(2):245-56.
  58. The geometry of multisite phosphorylation. Biophys J. 2008 Dec 15; 95(12):5533-43.
  59. Model-based analysis of two-color arrays (MA2C). Genome Biol. 2007; 8(8):R178.
