Augmented intelligence (AI) and related branches, such as machine learning and natural language processing, hold great promise for health care. But how can physicians and other health professionals distinguish clinically safe, useful innovations from hot air?
That question is at the heart of a recent JAMA Pediatrics editorial that outlines rules of thumb to help doctors tell hype from reliable research on machine learning in medicine.
New health care AI policy adopted at the 2019 AMA Annual Meeting provides that AI should advance the quadruple aim—meaning that it “should enhance the patient experience of care and outcomes, improve population health, reduce overall costs for the health care system while increasing value, and support the professional satisfaction of physicians and the health care team.” The AMA House of Delegates also adopted policy on the use of AI in medical education and physician training.
This built on the foundation of the AMA’s initial AI policies, adopted last year, which emphasized that physicians’ perspectives must be heard as the technology continues to develop.
Machine learning, a form of AI, has a learner algorithm that analyzes data and automates analytical model building. This allows the system to “learn,” identify patterns and make decisions “with minimal human intervention,” says the editorial written by Joseph Zorc, MD, with Children’s Hospital of Philadelphia; James Chamberlain, MD, with George Washington University School of Medicine; and Lalit Bajaj, MD, with Children’s Hospital of Colorado.
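To make that idea concrete, here is a minimal sketch of a “learner algorithm” building a model from data. It assumes Python and the scikit-learn library, and uses a synthetic dataset; none of these details come from the editorial, which describes the concept only in general terms.

```python
# Illustrative sketch only: a toy "learner algorithm" building a model from data.
# The synthetic dataset and the scikit-learn classifier are assumptions made for
# this example; they are not drawn from the editorial.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic "patient" records: 1,000 rows, 10 numeric predictors, binary outcome.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The learner analyzes the training data and builds the predictive model on its own.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```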
When it comes to assessing the validity of studies reporting the latest findings on machine learning, there are three elements to look for, the physicians write.
The validity of a study’s methods according to standard published references. For this, the authors recommend consulting the “Users’ Guides to the Medical Literature: XXII. How to Use Articles About Clinical Decision Rules,” published in JAMA. Another choice is the “Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD)” guidelines published in the Annals of Internal Medicine.
“These guides include ensuring that all important predictors are included in an unbiased fashion in a population representing a wide spectrum of disease severity and that outcomes are assessed fully and independently,” the editorial says. “The methods need to provide full details on how the model was developed and allow the reader to assess how the decision is being made and reproduce it as much as possible.”
Whether researchers developing machine-learning models provide the information clinicians need to use those models in practice. Otherwise, a model may have high overall predictive ability yet offer little practical help with specific clinical questions, a gap illustrated in the code sketch below. This also means asking what data served as “ground truth” and which patient populations the learner algorithm drew on to develop the model.
Whether researchers apply machine learning tools and test them in actual clinical practice. AI allows more complex decision-support tools to be developed, which can be part of a high-performance computing platform that supplies needed data at the point of care. Also, integration into the electronic health record can allow automated processes to occur that might not otherwise be possible.
But, the editorial adds, the implementation and informatics knowledge needed to take these tools from concept to practice “are research disciplines in themselves,” and physicians need to be aware that putting them into practice requires staff and resources.
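The second point above, that strong overall performance can mask how little a model helps with a particular clinical decision, can be shown with a small sketch. The synthetic outcomes, model scores and the 95% sensitivity target are invented for this illustration and are not taken from the editorial.

```python
# Illustrative sketch only: a model can look strong overall (high AUC) while
# still answering a specific clinical question poorly. The synthetic labels,
# scores and the 95% sensitivity target are invented for this example.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                 # "ground truth" outcomes
scores = np.clip(0.4 + 0.35 * y_true + rng.normal(0, 0.2, size=1000), 0, 1)

print(f"Overall AUC: {roc_auc_score(y_true, scores):.2f}")

# A rule-out decision needs very high sensitivity at a workable threshold.
fpr, tpr, thresholds = roc_curve(y_true, scores)
idx = int(np.argmax(tpr >= 0.95))  # first threshold reaching 95% sensitivity
print(f"Specificity at 95% sensitivity: {1 - fpr[idx]:.2f}")
```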
“By following these principles, clinicians can harness the ever-growing power of computer science and bring it as a useful tool to the patient bedside,” the editorial states.
Drs. Zorc, Chamberlain and Bajaj also warned of the difficulties physicians may face in an environment of shared decision-making, when they will be called on to counsel patients about recommendations perceived as coming from “black box” AI systems, in which it may be unclear how, and with what information, an algorithm generated a score or recommendation.
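One way to push back on that opacity, at least for simple models, is to surface how much each predictor contributed to an individual patient’s score. The brief sketch below assumes a logistic regression model and made-up predictor names; more complex, genuinely black-box systems would need dedicated explanation tools, which the editorial does not prescribe.

```python
# Illustrative sketch only: for a simple linear model, the contribution of each
# predictor to a patient's score can be surfaced directly, one way to counter
# "black box" opacity. Predictor names and the fitted model are assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

predictors = [f"predictor_{i}" for i in range(5)]
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Per-predictor contribution to one patient's log-odds score, largest first.
patient = X[0]
contributions = model.coef_[0] * patient
for name, value in sorted(zip(predictors, contributions), key=lambda p: -abs(p[1])):
    print(f"{name}: {value:+.2f}")
```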
“Machine learning and artificial intelligence have the potential to revolutionize clinical decision-making, but also the potential to introduce confusion and mistrust between clinicians and patients that may limit their potential utility,” the editorial states.