This course explores using machine learning to address high healthcare costs and improve patient outcomes. The instructor shares personal stories highlighting the need for earlier diagnoses and better disease management. While AI is part of the solution, systemic changes are also crucial. The course will cover AI/ML basics, examples of healthcare transformation, and the unique challenges of applying ML to healthcare data (e.g., data scarcity, safety, fairness, causality). The availability of electronic medical records and advancements in ML algorithms offer new opportunities. The course will utilize datasets like MIMIC and potentially Truven MarketScan, focusing on practical application and addressing ethical considerations.

Lecture outline:
- The Question: How Can Machine Learning Transform Healthcare?
- The Problem with Healthcare
- The Need for Change
- Personal Stories and Motivations
- Data Availability and Examples
- The Changing Landscape of Healthcare Data
- The Time is Right for AI in Healthcare
- Breakthroughs in Machine Learning
- Standardizing Health Data
- What's Unique About Machine Learning in Healthcare?
- Examples of AI Transforming Healthcare

Challenges that prevented the translation of AI algorithms into clinical care in the past:
- Poor fit into clinical workflows: the models, although effective, did not integrate well into existing clinical workflows.
- Data scarcity: it was difficult to obtain sufficient training data due to the manual effort involved in collection.
- Lack of generalizability: algorithms developed and validated at one institution often did not perform well when applied at other institutions.
- Lack of data standardization: decades of work have focused on standardizing health data, indicating that this was a significant issue.
- Manual effort: the need for manual data collection made it challenging to gather enough data to train effective models.

We just talked about data, but data alone is not nearly enough. The other major change is that there have been decades' worth of work on standardizing health data. For example, I mentioned that when you go to a doctor's office and they send a bill, that bill is associated with a diagnosis, and that diagnosis is coded in a system called ICD-9 or ICD-10, a standardized system where for many (not all, but many) diseases there is a corresponding code. ICD-10, which was recently rolled out nationwide about a year ago, is much more detailed than the previous coding system and includes some interesting categories. For example, there is a code for "bitten by a turtle," one for "bitten by a sea lion," one for "struck by a macaw." So it's starting to get really detailed here, which has its benefits and its disadvantages when it comes to research using that data. But certainly we can do more with detailed data than we could with less detailed data. Laboratory test results are standardized, in the United States, using a system called LOINC, where again almost every lab test has a code associated with it. I just want to point out briefly that the values associated with a lab test are less standardized. For pharmacy, National Drug Codes should be very familiar to you. If you take any medication that you've been prescribed and you look carefully, you'll see a number on it, the NDC. That number is unique to that medication; in fact, it's even unique to the brand of that medication. And there's an associated taxonomy with it. And so one can really understand in a very structured way what medications the patient is on and how those medications relate to one another.
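To make the idea of standardized vocabularies concrete, here is a minimal sketch in Python of what a code lookup might look like. The specific ICD-10-CM and LOINC codes shown are real, but the tiny dictionaries are purely illustrative; real systems query full terminology services with many thousands of codes.

```python
# Minimal illustration of standardized clinical vocabularies.
# The handful of entries below are illustrative; real systems query
# full terminology services (ICD-10-CM, LOINC, NDC).

ICD10_CM = {
    "W59.21XA": "Bitten by turtle, initial encounter",
    "W56.11XA": "Bitten by sea lion, initial encounter",
    "W61.12XA": "Struck by macaw, initial encounter",
    "E11.9":    "Type 2 diabetes mellitus without complications",
}

LOINC = {
    "2160-0": "Creatinine [Mass/volume] in Serum or Plasma",
    "718-7":  "Hemoglobin [Mass/volume] in Blood",
}

def describe(system: dict, code: str) -> str:
    """Look up a code's human-readable description, if known."""
    return system.get(code, f"unknown code: {code}")

if __name__ == "__main__":
    print(describe(ICD10_CM, "W59.21XA"))  # Bitten by turtle, ...
    print(describe(LOINC, "2160-0"))       # serum creatinine
```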
A lot of medical data is found not in this structured form, but in free text: notes written by doctors. These notes often contain many mentions of symptoms and conditions, and one can try to standardize those by mapping them to what's called the Unified Medical Language System (UMLS), an ontology with millions of different medical concepts in it. I'm not going to go into much more detail; it's again an area where I think policy can make a big difference. But luckily, here at MIT the story is going to be a bit different: thanks to the MIT-IBM Watson AI Lab, MIT has a close relationship with IBM, and, fingers crossed, it looks like we'll get access to this database (the Truven MarketScan database) for our homework and projects this semester. Now, there are a lot of other initiatives creating large datasets. A really important example here in the U.S. is President Obama's Precision Medicine Initiative, which has since been renamed the All of Us initiative. This initiative is creating a dataset of one million patients, drawn in a representative manner from across the United States to capture patients both poor and rich, healthy and chronically ill, with the goal of creating a research database where people both inside and outside the US could do research to make medical discoveries. This will include data from a baseline health exam, where the typical vitals are taken and blood is drawn, and it will combine data of the previous two types I've mentioned, from both electronic medical records and health insurance claims. A lot of this work is happening here in Boston: right across the street, at the Broad Institute, there's a team creating all of the software infrastructure to accommodate this data, and there are a large number of recruitment sites in the broader Boston area where patients, or any one of you, really, could go and volunteer to be part of this study. I got a letter in the mail last week inviting me to go, and I was really excited to see that.

Another example from the emergency department has to do with imaging: in some places, radiology consults can take days, depending on the urgency of the condition. So this is an area where data is quite standardized. In fact, MIT just released last week a dataset of 300,000 chest X-rays with associated labels, and one could ask the question: could we build machine learning algorithms, using the convolutional neural network techniques that we've seen play a big role in object recognition, to try to understand what's going on with a patient? For example, in this case the prediction is that the patient has pneumonia, based on the chest X-ray. Such systems could both reduce the load of radiology consults and allow us to translate these algorithms to settings that are much more resource-poor, for example in developing nations.
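A minimal sketch, assuming PyTorch/torchvision (a recent version), of how such a chest X-ray classifier might be set up. The data loading is a placeholder with random tensors; a real pipeline would read the images and labels from a dataset like the one just described.

```python
# Minimal sketch of a CNN-based chest X-ray classifier, assuming
# PyTorch/torchvision. Random tensors stand in for real studies and
# labels (e.g., pneumonia vs. no finding).

import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained backbone -- the same object
# recognition advances mentioned above -- and replace the final
# layer with a binary "pneumonia" head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # pneumonia vs. not

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One supervised update on a batch of (image, label) pairs."""
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative call with synthetic batches:
images = torch.randn(8, 3, 224, 224)   # 8 X-rays, resized to 224x224
labels = torch.randint(0, 2, (8,))     # 1 = pneumonia, 0 = no finding
print(train_step(images, labels))
```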
Yet another example from the ER has to do not with how we care for the patient today, but with how we get better data, which will then result in taking better care of the patient tomorrow. One example of that, which my group deployed at Beth Israel Deaconess (and it's still running there, in the emergency department), has to do with getting higher quality chief complaints. The chief complaint is usually a very short two- or three-word quantity, like "left knee pain," "rectal pain," or "right upper quadrant (RUQ) abdominal pain." It's a very quick summary of why the patient came into the ER today, and despite being so few words, it plays a huge role in the care of a patient. The big screens in the ER that summarize which patients are in which beds show the chief complaint next to each patient. Chief complaints are used as criteria for enrolling patients in clinical trials, and as criteria for retrospective quality research on how we care for patients of a particular type. So it plays a very big role. But unfortunately, the data we had been getting was crap. That's because it was free text, and it was sufficiently high-dimensional that attempting to standardize it with a big drop-down menu, like you see over here, would have killed the clinical workflow: it would have taken way too much time for clinicians to find the relevant entry, and so it just wouldn't have been used. And that's where some very simple machine learning algorithms turn out to be really valuable. For example, we changed the workflow altogether: rather than the chief complaint being the first thing the triage nurse assigns when the patient comes in, it's the last thing. First the nurse takes the vital signs (the patient's temperature, heart rate, blood pressure, respiratory rate, and oxygen saturation), talks to the patient, and writes a 10- to 30-word note about what's going on with the patient. Here it says: 69-year-old male patient with severe intermittent right upper quadrant pain that begins soon after eating; also a heavy drinker. So quite a bit of information in that. We take that note and use a supervised machine learning algorithm to predict a set of chief complaints drawn from a standardized ontology. We show the 5 most likely ones, and the clinician, in this case the nurse, can just click one of them to enter it. We also allow the nurse to type in part of a chief complaint, but rather than just doing text matching to find entries that match what's being typed, we do contextual autocomplete: we use our predictions to prioritize the most likely chief complaints containing that sequence of characters, as illustrated in the sketch below. That way it's much faster to enter the relevant information, and what we found over time is that we got much higher quality data out. This is something we'll be talking about in one of our lectures in this course.
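A minimal sketch of the contextual-autocomplete idea. The tiny ontology and the probabilities are made up for illustration; in the deployed system the probabilities would come from the supervised model applied to the triage note and vital signs.

```python
# Minimal sketch of contextual autocomplete for chief complaints.
# Ontology entries and model probabilities are invented; a real
# system would use a standardized ontology and a trained model.

ONTOLOGY = [
    "Abdominal pain", "Abdominal distension", "Ankle injury",
    "Alcohol intoxication", "Altered mental status", "Knee pain",
]

def autocomplete(prefix: str, probs: dict, k: int = 5) -> list:
    """Rank ontology entries matching `prefix`, highest model
    probability first -- rather than plain alphabetical matching."""
    matches = [c for c in ONTOLOGY if prefix.lower() in c.lower()]
    return sorted(matches, key=lambda c: probs.get(c, 0.0),
                  reverse=True)[:k]

# Hypothetical model output for the RUQ-pain note in the lecture:
model_probs = {
    "Abdominal pain": 0.62,
    "Alcohol intoxication": 0.21,
    "Abdominal distension": 0.05,
}

# Typing just "ab" already surfaces "Abdominal pain" first, because
# the note makes it the most likely chief complaint.
print(autocomplete("ab", model_probs))
```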
So I just gave you a few examples of how machine learning and artificial intelligence will transform the provider space. But now I want to jump up a level and think through not how we treat a patient today, but how we think about the progression of a patient's chronic disease over a period of years, maybe 10 or 20 years. This question of how we manage chronic disease affects all aspects of the healthcare ecosystem: it'll be used by providers, payers, and also by patients themselves. So consider a patient with chronic kidney disease. Chronic kidney disease typically only gets worse: you might start with the patient being healthy, then having some increased risk; eventually they have some kidney damage; over time they reach kidney failure; and once they reach kidney failure, they typically need dialysis or a kidney transplant. But understanding when each of these things is going to happen for a patient is really, really challenging right now. We have one way of trying to stage patients: the standard approach, known as eGFR (estimated glomerular filtration rate), is derived predominantly from the patient's creatinine, a blood test result, and their age, and it gives you a number. From that number you can get some sense of where the patient is in this trajectory, but it's really coarse-grained, and it's not at all predictive of when the patient is going to progress to the next stage of the disease. Now, other conditions, for example some cancers like the one I'll tell you about next, don't follow that linear trajectory. Rather, the patient's condition, and the disease burden, which is what I'm showing on the y-axis here, might get worse, then better, then worse again, better again, worse again, and so on. That is of course a function of the treatment the patient is on and other things that are going on with them. Understanding what influences how a patient's disease is going to progress, and when that progression is going to happen, could be enormously valuable for many of those different parts of the healthcare ecosystem. So one concrete example of how that type of prediction could be used would be in precision medicine. Returning to the example I mentioned at the very beginning of today's lecture, multiple myeloma, of which I said my mother died: there are a large number of existing treatments for multiple myeloma, and we don't really know which treatments work best for whom. But imagine a day where we have algorithms that could take what you know about the patient at one point in time. That might include, for example, blood test results; it might include RNA-seq, which gives you some sense of the gene expression for the patient, in this case derived from a sample taken from the patient's bone marrow. You could take that data and try to predict what would happen to the patient under two different scenarios: the blue scenario I'm showing you here, if you give them treatment A, or this red scenario here, where you give them treatment B. And of course these aren't even just one-time treatments; they're treatment strategies, repeated treatments across time with some intervals in between. And if your algorithm says that under treatment B this is what's going to happen, then the clinician might think: okay, treatment B is probably the way to go here; it's going to control the patient's disease burden best in the long term. This is an example of a causal question, because we want to know how to cause a change in the patient's disease trajectory, and we can try to answer this now using data. For example, one of the datasets that's available for you to use in your course projects is from the Multiple Myeloma Research Foundation. It's an example of a disease registry, just like the disease registry I talked to you about earlier for rheumatoid arthritis, and it follows about a thousand patients with multiple myeloma across time: what treatments they're getting, what their symptoms are, and, at a couple of different stages, very detailed biological data about their cancer, in this case RNA-seq. One can attempt to use that data to learn models to make predictions like this, but such predictions are fraught with errors.
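As a toy illustration of the two-scenario idea, here is a minimal sketch of one simple approach sometimes called a T-learner: fit a separate outcome model per treatment on observational data, then compare the predicted outcomes for a new patient. The feature matrix and outcomes are synthetic placeholders, and, as the next paragraph explains, interpreting such predictions causally requires far more care (this naive version ignores confounding entirely).

```python
# Minimal "T-learner" sketch: fit one outcome model per treatment,
# then compare predicted outcomes for a new patient. Synthetic data;
# real features would be blood tests, RNA-seq, etc. NOTE: this naive
# approach ignores confounding -- the course covers why that matters.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))          # patient features
treatment = rng.integers(0, 2, 500)     # 0 = treatment A, 1 = B
y = rng.normal(size=500)                # disease burden at follow-up

model_a = RandomForestRegressor().fit(X[treatment == 0], y[treatment == 0])
model_b = RandomForestRegressor().fit(X[treatment == 1], y[treatment == 1])

x_new = rng.normal(size=(1, 10))        # a new patient
burden_a = model_a.predict(x_new)[0]    # predicted burden under A
burden_b = model_b.predict(x_new)[0]    # predicted burden under B
print(f"Predicted burden: A={burden_a:.2f}, B={burden_b:.2f}")
```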
And one of the things that Pete and I will be teaching you in this course is that there's a very big difference between prediction and prediction for the purpose of making causal statements, and the way you interpret the data you have, when your goal is treatment suggestion or optimization, is going to be very different from what you are taught in an introductory machine learning class. Other ways we could try to better manage patients with chronic disease include early diagnosis, for example of patients with Alzheimer's disease, where there have been some really interesting results in just the last few years, or new modalities altogether, for example liquid biopsies, which are able to do early diagnosis of cancer without having to biopsy the tumor itself. We can also think about how to better track and measure chronic disease. One example, shown on the left here, is from Dina Katabi's lab here at MIT, in CSAIL, where they've developed a system called Emerald that uses wireless signals, the same wireless signals we have in this room today, to track patients; it can actually see behind walls, which is quite impressive. Using this wireless signal, you could install what looks like just a regular wireless router in an elderly person's home and detect if that elderly patient falls. And of course, if an elderly patient has fallen, it might be very hard for them to get back up; they might have a broken hip, for example. One could then alert the caregivers and, if necessary, bring in emergency support, and that could really change the long-term outcome for this patient. So this is an example of what I mean by better tracking patients with chronic disease. Another example comes from patients who have type 1 diabetes. Type 1 diabetes, as opposed to type 2 diabetes, generally develops in patients at a very early age; it's usually diagnosed in children, and it's typically managed with an insulin pump, which is attached to the patient and can give injections of insulin on the fly as necessary. But there's a really challenging control problem there: if you give the patient too much insulin, you could kill them, and with too little insulin, you could really hurt them. And how much insulin you give them is going to be a function of their activity, of what food they're eating, and of various other factors. So this is a question the control theory community has been thinking through for a number of years, and there are a number of sophisticated algorithms present in today's products; I wouldn't be surprised if one or two people in the room today have one of these. But it also presents a really interesting opportunity for machine learning, because right now we're not doing a very good job at predicting future glucose levels, which is essential for figuring out how to regulate insulin. Imagine algorithms that could, for example, take a picture of the food a patient is eating with the patient's phone, automatically feed that into an algorithm that predicts its caloric content and how quickly it will be processed by the body, and then, based on this patient's metabolic system, figure out when you should start increasing insulin levels, and by how much. That could have a huge impact on the quality of life of these patients.
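As a toy illustration of the forecasting piece, here is a minimal autoregressive sketch that predicts the next glucose reading from the last few readings. The data, sampling rate, and window size are all invented; real systems condition on continuous glucose monitor streams, meals, insulin doses, and activity, with far more careful models.

```python
# Toy autoregressive glucose forecaster: predict the next continuous
# glucose monitor (CGM) reading from the previous `window` readings.
# Synthetic data for illustration only.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
t = np.arange(288)  # one day of CGM readings at 5-minute intervals
glucose = 120 + 30 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 5, t.size)

window = 6  # use the last 30 minutes of readings as features
X = np.array([glucose[i:i + window] for i in range(t.size - window)])
y = glucose[window:]

model = LinearRegression().fit(X, y)
next_reading = model.predict(glucose[-window:].reshape(1, -1))[0]
print(f"Predicted next glucose reading: {next_reading:.0f} mg/dL")
```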
So finally, we've talked a lot about how we manage healthcare, but equally important is discovery. The same data that we could use to change how care is delivered could be used to think through what new treatments might look like and to make new discoveries about disease subtypes. At one point later in the semester we'll be talking about disease progression modeling, and we'll talk about how to use data-driven approaches to discover different subtypes of disease. On the left here I'm showing an example of a really nice study from back in 2008 that used a k-means clustering algorithm to discover subtypes of asthma. One could also use machine learning to try to make discoveries about which proteins, for example, are important in regulating disease: how could we differentiate, at a biological level, which patients will progress quickly and which patients will respond to treatments? That, of course, will then suggest new drug targets for new pharmaceutical efforts. Another direction, also studied here at MIT by quite a few labs, has to do with drug creation or discovery: one could use machine learning algorithms to try to predict what would make a good antibody for binding to a particular target. So that's all for my overview.

So censoring, which we'll talk about in two weeks, is what happens when you have data only for small windows of time. For example, suppose you have a dataset where your goal is to predict survival: you want to know how long until a person dies, but for a particular person you only have data up to January 2009, and they haven't yet died by January 2009. Then that individual is censored: you don't know what would have happened, and you don't know when they would have died. But that doesn't mean you should throw away that data point, and in fact we'll talk about learning algorithms that can learn from censored data very effectively.
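To illustrate how censored observations still carry information, here is a minimal pure-Python Kaplan-Meier estimator, one classical way to estimate a survival curve from right-censored data. The tiny dataset is made up; the course's survival-modeling lectures cover this and model-based alternatives in depth.

```python
# Minimal Kaplan-Meier estimator for right-censored survival data.
# Each observation is (time, event): event=1 means the death was
# observed at `time`; event=0 means the patient was censored then
# (still alive when last seen). Data below are made up.

def kaplan_meier(observations):
    """Return [(time, S(time))] at each observed event time."""
    # Sort by time; at tied times, deaths are processed before
    # censorings, per the usual convention.
    obs = sorted(observations, key=lambda o: (o[0], -o[1]))
    n_at_risk = len(obs)
    survival, curve = 1.0, []
    for time, event in obs:
        if event:  # an observed death: the survival estimate drops
            survival *= (n_at_risk - 1) / n_at_risk
            curve.append((time, survival))
        n_at_risk -= 1  # either way, this patient leaves the risk set
    return curve

# (months, event): censored patients contribute to the risk set up
# to their censoring time instead of being thrown away.
data = [(2, 1), (3, 0), (5, 1), (8, 0), (11, 1), (12, 0)]
for t, s in kaplan_meier(data):
    print(f"S({t}) = {s:.2f}")
```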
There are also a number of logistical challenges to doing machine learning in healthcare. I talked about how important access to data is. But one of the reasons (there are others) why getting large amounts of data into the public domain is challenging is that it's so sensitive, and removing identifiers like names and Social Security numbers from data that includes free-text notes can be very difficult. As a result, when we do research here at MIT, it typically takes us anywhere from a few months (which has essentially never happened) to two years (the usual situation) to negotiate a data sharing agreement to get health data to MIT to do research on. And of course, my students then write code which we're very happy to open-source under an MIT license, but that code is close to useless to others, because no one can reproduce our results on the same data: they don't have access to it. So that's a major challenge for this field. Another challenge is the difficulty of deploying machine learning algorithms, due to the challenge of integration. You build a good algorithm and you want to deploy it at your favorite hospital. But guess what: that hospital has Epic or Cerner or Athena or some other commercial electronic medical record system, and that EMR system is not built for your algorithm to plug into. So there's a large amount of difficulty in getting your algorithms into production systems, which we'll talk about as well during the semester.

All right. So what's unique about machine learning in healthcare? I've already given you some hints. First, healthcare is ultimately, unfortunately, about life-or-death decisions, so we need robust algorithms that don't screw up. A prime example of this, which I'll tell you a little more about towards the end of the semester, is a major software error that occurred some 30 years ago in a radiation therapy device, where an overwhelming amount of radiation was delivered to patients because of a software bug, an overflow problem. That resulted in a number of patients dying. So that was a software error from decades ago, with no machine learning in the loop, and it, along with similar disasters in the space industry, in aviation, and so on, led to a whole area of research in computer science called formal methods: how do we design algorithms that can check that a piece of software will do what it's supposed to do, and that there are no bugs in it? But now that we're starting to bring data and machine learning algorithms into the picture, we are really suffering from a lack of good tools for doing similar checking of our algorithms and their behavior. This is going to be really important in the coming decade, as machine learning gets deployed not just in settings like healthcare, but in other settings where we have life-and-death decisions, such as autonomous driving, and it's something we'll touch on throughout the semester. For example, as one deploys machine learning algorithms, we need to be thinking about whether they are safe, but also about how to check for safety long-term: what checks and balances should we put into the deployment of an algorithm to make sure it's still working as intended?

We also need fair and accountable algorithms, because machine learning results are increasingly being used to direct resources in healthcare settings. An example that I'll discuss in about a week and a half, when we talk about risk stratification, is that algorithms are being used by payers to risk-stratify patients: for example, to figure out which patients are likely to be readmitted to the hospital in the next 30 days, are likely to have undiagnosed diabetes, or are likely to progress quickly in their diabetes. Based on those predictions, they carry out a number of interventions: they might send nurses to the patient's home, or they might offer their members access to a weight loss program. Each of these interventions has a cost associated with it, so you can't do them for everyone, and so one uses machine learning algorithms to prioritize who gets those interventions (a small sketch of such a risk model appears below). But because health is so intimately tied to socioeconomic status, one has to think about what happens if these algorithms are not fair. It could have really long-term implications for our society, and it's something that we're going to talk about later in the semester as well.
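Here is a minimal sketch of the kind of risk stratification model just described: a logistic regression predicting 30-day readmission, with the intervention budget spent on the highest-risk members. Feature names and data are synthetic stand-ins for what would be derived from claims or EMR data.

```python
# Minimal risk stratification sketch: predict 30-day readmission
# and prioritize a budget-limited intervention. Synthetic data.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 1000
# Columns: age, number of prior admissions, number of chronic
# conditions, number of active medications (all synthetic).
X = np.column_stack([
    rng.normal(65, 10, n),
    rng.poisson(1.0, n),
    rng.poisson(2.0, n),
    rng.poisson(5.0, n),
])
# Synthetic 30-day readmission labels, loosely tied to the features.
logit = 0.05 * (X[:, 0] - 65) + 0.8 * X[:, 1] - 2.0
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
risk = model.predict_proba(X)[:, 1]

# Budget-limited intervention: send nurses to the 25 highest-risk
# members rather than to everyone.
top_members = np.argsort(risk)[::-1][:25]
print("Highest-risk member indices:", top_members[:5])
```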
Now, I mentioned earlier that many of the questions we need to study in this field don't have good labeled data: in cases where we know what we want to predict as a supervised prediction problem, often we just don't have labels for that thing. But there are also many situations where we're not interested in just predicting something; we're interested in discovery. For example, with disease subtyping or disease progression, it's much harder to quantify what you're looking for, and so unsupervised learning algorithms are going to be really important in what we do. And finally, I already mentioned that many of the questions we want to answer are causal in nature, particularly when you want to think about treatment strategies. So we'll have two lectures on causal inference, and we'll have two lectures on reinforcement learning, which is increasingly being used to learn treatment policies in healthcare. All of these different problems mean we have to rethink how we do machine learning in this setting. For example, because deriving labels for supervised prediction is very hard, one has to think through how we can automatically build algorithms to do what's called electronic phenotyping: to figure out automatically what the relevant labels are for a set of patients, which one could then attempt to predict in the future. And because we often have very little data (for some rare diseases there might be only a few hundred or a few thousand people in the nation with that disease, and some common diseases present in very diverse ways, so each presentation is, in essence, very rare), you may have only a small number of patient samples even if you had all of the data in the right place. So we need to think through how we can bring in domain knowledge, and how we can bring together data from other areas and other diseases, to learn something that we could then refine for the foreground question of interest. Finally, there's a ton of missing data in healthcare.

Summary notes:

The problem with healthcare:
- Healthcare in the United States is expensive: currently about $3 trillion is spent annually, and it is not delivering optimal outcomes.
- Chronic diseases are often diagnosed late and poorly managed, even with world-class clinicians.
- Medical errors are frequent, leading to preventable deaths and disease worsening.
- Healthcare impacts everyone: almost everyone has a personal story of a family member, friend, or themselves suffering from a health condition.
  - Alzheimer's disease: the speaker's grandfather was diagnosed late, highlighting the need for early detection.
  - Multiple myeloma: the speaker's mother's cancer was diagnosed in its early stages, but she died due to complications that were not detected early enough.
- AI is only one piece of the puzzle: systemic changes are also necessary in the healthcare system, but understanding the potential of the AI elements is essential. Machine learning (ML) and artificial intelligence (AI) can be used as part of a larger solution to improve healthcare.

Healthcare data:
- Traditional sources: EMRs, insurance claims, lab tests, vital signs, imaging data.
- Non-traditional sources: social media, mobile phone data.
- ICD-10: a standardized coding system for diagnoses.
- LOINC: a standardized coding system for lab tests.
- NDC: a standardized coding system for medications.
- UMLS: a standardized ontology mapping medical concepts found in free-text notes.
- FHIR and OMOP: common data models for exchanging health data.

Breakthroughs in machine learning:
- ImageNet competition: demonstrated the rapid progress in object recognition using deep learning.
- Algorithmic advances:
  - Learning with high-dimensional features: support vector machines (SVMs) and L1 regularization (see the sketch at the end of this section).
  - Stochastic gradient descent: efficiently solving convex optimization problems.
  - Unsupervised and semi-supervised learning: handling limited labeled data.
  - Deep learning: convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

Why the time is right for AI in healthcare:
- Data availability: electronic medical records (EMRs) are becoming more prevalent, allowing for large-scale data collection.
- Standardization of health data: systems like ICD-10, LOINC, and NDC provide standardized ways to encode diagnoses, lab tests, and medications.
- Breakthroughs in ML algorithms: recent advances in deep learning, unsupervised learning, and causal inference have enhanced the capabilities of AI.
- Industry interest: there is growing investment in AI for healthcare by companies like DeepMind Health, IBM Watson, and Flatiron Health.

Key datasets:
- PhysioNet and MIMIC databases: publicly available EMR datasets from MIT.
- Truven MarketScan database: acquired by IBM; contains data from insurance claims.
- All of Us initiative: a research database of 1 million patients with diverse backgrounds.

What's unique about machine learning in healthcare:
- Life-or-death decisions: robustness and safety are critical.
- Fair and accountable algorithms: addressing potential biases and ensuring equitable access to care.
- Limited labeled data: requiring reliance on unsupervised and semi-supervised learning.
- Causal inference: understanding the impact of treatments and interventions.
- Logistical challenges:
  - Data access: sensitivity and privacy concerns.
  - Integration with existing systems: compatibility with EMRs and other healthcare infrastructure.

Examples of AI transforming healthcare:
- Provider space (emergency department):
  - Reasoning about patient data: similar to Internist-1, but leveraging EMR data.
  - Triage and early detection: identifying patients who need urgent attention.
  - Clinical decision support: surfacing relevant guidelines and pathways.
  - Anticipating clinician needs: pre-populating orders based on patient data.
  - Reducing specialist consults: using AI for image interpretation.
  - Improving data quality: automated chief complaint generation.
- Chronic disease management:
  - Disease progression modeling: predicting future disease trajectories.
  - Precision medicine: tailoring treatments based on patient-specific factors.
  - Early diagnosis: detecting diseases like Alzheimer's earlier.
  - Tracking and measuring disease: using wearable sensors and other technologies.
  - Automated insulin management: improving control for type 1 diabetes.
- Discovery:
  - Disease subtyping: identifying distinct subtypes of diseases.
  - Drug discovery: predicting effective drug targets.
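As referenced in the "learning with high-dimensional features" item above, here is a minimal sketch of training a linear classifier with stochastic gradient descent and an L1 penalty on sparse, high-dimensional features, the regime typical of EMR-derived data. The data are synthetic, and scikit-learn/scipy are assumed.

```python
# Minimal sketch: SGD-trained linear SVM (hinge loss) with an L1
# penalty on sparse, high-dimensional features -- the regime typical
# of EMR data (thousands of codes, few active per patient).
# Synthetic data; assumes scikit-learn and scipy.

import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(3)
n_patients, n_codes = 2000, 5000
# Each row is a patient; ~0.2% of the 5000 code-features are active.
X = sparse_random(n_patients, n_codes, density=0.002, random_state=3,
                  data_rvs=lambda k: np.ones(k))
y = rng.integers(0, 2, n_patients)  # synthetic outcome label

clf = SGDClassifier(loss="hinge", penalty="l1", alpha=1e-4, max_iter=20)
clf.fit(X, y)

# The L1 penalty drives most weights to exactly zero, selecting a
# small set of informative codes out of thousands of candidates.
n_nonzero = np.count_nonzero(clf.coef_)
print(f"{n_nonzero} of {n_codes} features have nonzero weight")
```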