Sepsis and Machine Learning


Sepsis is a medical condition caused by an infection that leads to an immune response so vigorous it attacks the body itself, often spreading to the bloodstream, leading quickly – if unchecked – to death. Sepsis affects as many as 18 million people worldwide each year.[1] It causes 200,000 deaths annually in the United States, and is the number one killer of people who are hospitalized.[2] While many other diseases dominate the headlines of our health conversations, sepsis is “more common than heart attack, and claims more lives than any single cancer.”[3]

Sepsis is also expensive. It is the most costly condition treated in U.S. hospitals.[4] Add to this its high prevalence, co-morbidity with readmission and other critical conditions, fast onset, and diagnostic difficulty, and it is clear why sepsis tends to be an area of continuous focus for a health system. The United States Centers for Medicare & Medicaid Services (CMS), America’s largest funder of healthcare expenses, aggressively tracks hospital readmissions. Many do not realize that sepsis has a higher readmission rate than any of the clinical conditions that the CMS effort currently targets.[5]

In my travels I speak to over two dozen different health systems and academic medical centers a year. The outcome they are most likely to use machine learning to predict is readmissions, followed closely by sepsis. One could argue the order should probably be reversed. Regardless, the focus on those two outcomes dwarfs the prevalence of other prediction efforts, such as deciding where to put a rapid response team or predicting a specific adverse event. For sepsis, sometimes a health system’s prediction is focused on a specific patient cohort or time period; sometimes it is limited to drawing insights from reporting, retrospective analytics or statistics; and sometimes it is true machine learning in a batch or a continuous mode.

How should a health system predict sepsis most effectively? Sepsis sets such a high bar, with its rapid progression and its need for fast diagnosis, that it would be hard to expect nurses to continuously monitor the combination of symptoms in every patient. This makes sepsis a sort of ‘perfect’ case for using computers to monitor patients continuously and alert clinicians to early signs of the condition.

One success story of sepsis prediction at scale is Cerner’s HealtheIntent platform. HealtheIntent includes a real-time surveillance capability for sepsis.[6] The success story reveals, “The St. John Sepsis Surveillance Agent, developed by Cerner Corporation in 2010, draws from the best published evidence and uses cloud computing with big data analytics to screen and activate on high-risk patients early in their infectious process, while increasing precision in estimating mortality risk to enable medical decision.”[7] HealtheIntent relies on 2,000 nodes of Cloudera for Apache Hadoop, Apache Kafka and other open source technologies.[8] Cerner reports that with Cloudera, “we can now accurately determine the probability that a patient has a bloodstream infection.”[9]

Research into methods to best predict sepsis has been under way for decades. A search in Google Scholar for research since 2013 with the keyword term “predict sepsis” brings back over 800 results. Journals publishing these articles include Nature, New England Journal of Medicine, Lancet Respiratory Medicine, Critical Care, Annals of Intensive Care, and BMJ. Research in this timeframe showed that many factors have predictive power, including specific qualities of circulating granulocytes[10], trajectories of a patient’s past co-morbidities before the current hospital stay[11], C-reactive protein[12], procalcitonin[13], and (something easier to monitor) the heart-rate-to-systolic ratio[14], among others.
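To make the last of those factors concrete, here is a minimal Python sketch of how a heart-rate-to-systolic ratio could be computed from routine vital signs and used to flag readings for a closer look. It is only an illustration of the arithmetic, not the method used in the cited study: the `VitalSigns` type, the function names, and the threshold passed in by the caller are all hypothetical, and any real cutoff would have to come from clinical validation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VitalSigns:
    """One bedside reading (hypothetical structure for this sketch)."""
    heart_rate_bpm: float
    systolic_bp_mmhg: float


def heart_rate_to_systolic_ratio(reading: VitalSigns) -> float:
    """Return heart rate divided by systolic blood pressure."""
    if reading.systolic_bp_mmhg <= 0:
        raise ValueError("systolic blood pressure must be positive")
    return reading.heart_rate_bpm / reading.systolic_bp_mmhg


def flag_for_review(readings: List[VitalSigns], threshold: float) -> List[int]:
    """Return the indexes of readings whose ratio meets or exceeds the threshold.

    The threshold is supplied by the caller; this sketch does not
    assume any clinically validated value.
    """
    return [
        index
        for index, reading in enumerate(readings)
        if heart_rate_to_systolic_ratio(reading) >= threshold
    ]


if __name__ == "__main__":
    # Made-up readings and a placeholder threshold, for illustration only.
    sample = [VitalSigns(70, 120), VitalSigns(120, 100), VitalSigns(95, 95)]
    print(flag_for_review(sample, threshold=1.0))
```

Because the ratio needs only two values that are already charted at the bedside, a check like this is cheap enough to run on every new reading, which is what makes it attractive as one input to continuous monitoring.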

Multiple organizations are predicting sepsis with Cloudera today. There are almost as many ways to predict sepsis as there are teams doing it. Some are very large health systems, and some are smaller and regional. Some have data scientists, some do not. Some health systems prefer a do-it-yourself approach and others import talent and software. Of Cloudera’s more than 2,800 partners, a number add value to sepsis prediction, including SAS, H2O, DataRobot, ProKarma, Intel, and Docbox.

Our path toward the most real-time, continuous, actionable, and accurate sepsis prediction for every patient globally has already started. Someday, families will evaluate where they admit their loved ones based on whether that health system offers continuous and predictive monitoring for sepsis, so the risk of mortality is lower, and everyone—especially the patient—can sleep easier at night.


[10] Guérin, E., Orabona, M., Raquil, M. A., Giraudeau, B., Bellier, R., Gibot, S., … & Vignon, P. (2014). Circulating immature granulocytes with T-cell killing functions predict sepsis deterioration. Critical Care Medicine, 42(9), 2007-2018.
[11] Beck, M. K., Jensen, A. B., Nielsen, A. B., Perner, A., Moseley, P. L., & Brunak, S. (2016). Diagnosis trajectories of prior multi-morbidity predict sepsis mortality. Scientific Reports, 6.
[12] John, J., Chisthi, M. M., & Kuttanchettiyar, K. G. (2017). C-reactive protein: an early predictor of sepsis in patients with thermal burns. International Surgery Journal, 4(2), 628-632.
[13] Schuetz, P., Birkhahn, R., Sherwin, R., Jones, A. E., Singer, A., Kline, J. A., … & Gaieski, D. F. (2017). Serial Procalcitonin Predicts Mortality in Severe Sepsis Patients: Results From the Multicenter Procalcitonin MOnitoring SEpsis (MOSES) Study. Critical Care Medicine, 45(5), 781.
[14] Danner, O. K., Hendren, S., Santiago, E., Nye, B., & Abraham, P. (2017). Physiologically-based, predictive analytics using the heart-rate-to-systolic-ratio significantly improves the timeliness and accuracy of sepsis prediction compared to SIRS. The American Journal of Surgery.

