In medicine, the cautionary tales about the unintended effects of artificial intelligence are already legendary. There was the program meant to predict when patients would develop sepsis, a deadly bloodstream infection, that triggered a litany of false alarms. Another, intended to improve follow-up care for the sickest patients, appeared to deepen troubling health disparities.
Wary of such flaws, physicians have kept A.I. working on the sidelines: assisting as a scribe, as a casual second opinion, and as a back-office organizer. But the field has gained investment and momentum for uses in medicine and beyond.
Within the Food and Drug Administration (FDA), which plays a key role in approving new medical products, A.I. is a hot topic. It is helping to discover new drugs. It could pinpoint unexpected side effects. And it is even being discussed as an aid to staff who are overwhelmed with repetitive, rote tasks.
Yet in one crucial way, the FDA’s role has been subject to sharp criticism: how carefully it vets and describes the programs it approves to help doctors detect everything from tumors to blood clots to collapsed lungs.
“We’re going to have a lot of choices. It’s exciting,” Dr. Jesse Ehrenfeld, president of the American Medical Association, a leading doctors’ lobbying group, said in an interview. “But if physicians are going to incorporate these things into their workflow, if they’re going to pay for them and if they’re going to use them – we’re going to have to have some confidence that these tools work.”
President Biden issued an executive order on Monday that calls for regulations across a broad spectrum of agencies to try to manage the security and privacy risks of A.I., including in health care. The order seeks more funding for A.I. research in medicine and also for a safety program to gather reports on harm or unsafe practices. A meeting with world leaders to discuss the topic is set for later this week.
At an event on Monday, Mr. Biden said it was important to oversee A.I. development and safety and build systems that people can trust.
“For example, to protect patients, we will use A.I. to develop cancer drugs that work better and cost less,” Mr. Biden said. “We will also launch a safety program to make sure A.I. health systems do no harm.”
No single U.S. agency governs the entire landscape. Senator Chuck Schumer, Democrat of New York and the majority leader, summoned tech executives to Capitol Hill in September to discuss ways to nurture the field and also identify pitfalls.
Google has already drawn attention from Congress with its pilot of a new chatbot for health workers. Called Med-PaLM 2, it is designed to answer medical questions but has raised concerns about patient privacy and informed consent.
How the FDA will oversee such “large language models,” or programs that mimic expert advisers, is just one area where the agency lags behind rapidly evolving advances in the A.I. field. Agency officials have only begun to talk about reviewing technology that would continue to “learn” as it processes thousands of diagnostic scans. And the agency’s existing rules encourage developers to focus on one problem at a time – like a heart murmur or a brain aneurysm – a contrast to A.I. tools used in Europe that scan for a range of problems.
The agency’s reach is limited to products being approved for sale. It has no authority over programs that health systems build and use internally. Large health systems like Stanford, Mayo Clinic, and Duke – as well as health insurers – can build their own A.I. tools that affect care and coverage decisions for thousands of patients with little to no direct government oversight.
Still, doctors are raising more questions as they attempt to deploy the roughly 350 software tools that the FDA has cleared to help detect clots, tumors, or a hole in the lung. They have found few answers to basic questions: How was the program built? How many people was it tested on? Is it likely to identify something a typical doctor would miss?
The lack of publicly available information, perhaps paradoxical in a realm replete with data, is causing doctors to hang back, wary that technology that sounds exciting can lead patients down a path to more biopsies, higher medical bills, and toxic drugs without significantly improving care.
Dr. Eric Topol, author of a book on A.I. in medicine, is a nearly unflappable optimist about the technology’s potential. But he said the FDA had fumbled by allowing A.I. developers to keep their “secret sauce” under wraps and failing to require careful studies to assess any meaningful benefits.
“You have to have really compelling, great data to change medical practice and to exude confidence that this is the way to go,” said Dr. Topol, executive vice president of Scripps Research in San Diego. Instead, he added, the FDA has allowed “shortcuts.”
Large studies are beginning to tell more of the story: One found benefits in using A.I. to detect breast cancer, and another highlighted flaws in an app meant to identify skin cancer, Dr. Topol said.
Dr. Jeffrey Shuren, the chief of the FDA’s medical device division, has acknowledged the need for continuing efforts to ensure that A.I. programs deliver on their promises after his division clears them. While drugs and some devices are tested on patients before approval, the same is not typically required of A.I. software programs.
One new approach could be building labs where developers could access vast amounts of data and build or test A.I. programs, Dr. Shuren said during the National Organization for Rare Disorders conference on Oct. 16.
“If we really want to assure that right balance, we’re going to have to change federal law because the framework in place for us to use for these technologies is almost 50 years old,” Dr. Shuren said. “It really was not designed for A.I.”
Other forces complicate efforts to adapt machine learning for major hospital and health networks. Software systems don’t talk to each other. No one agrees on who should pay for them.
By one estimate, about 30 percent of radiologists, who work in a field where A.I. has made deep inroads, are using A.I. technology. Simple tools that might sharpen an image are an easy sell. But higher-risk ones, like those selecting whose brain scans should be given priority, concern doctors if they do not know, for instance, whether the program was trained to catch the maladies of a 19-year-old versus a 90-year-old.
Aware of such flaws, Dr. Nina Kottler is leading a multiyear, multimillion-dollar effort to vet A.I. programs. She is the chief medical officer for clinical A.I. at Radiology Partners, a Los Angeles-based practice that reads roughly 50 million scans annually for about 3,200 hospitals, free-standing emergency rooms, and imaging centers in the United States.
She knew diving into A.I. would be delicate with the practice’s 3,600 radiologists. After all, Geoffrey Hinton, known as the “godfather of A.I.,” roiled the profession in 2016 when he predicted that machine learning would replace radiologists altogether.
Dr. Kottler said she began evaluating approved A.I. programs by quizzing their developers and then tested some to see which programs missed relatively obvious problems or pinpointed subtle ones.
She rejected one approved program that did not detect lung abnormalities beyond the cases her radiologists found – and missed some obvious ones.
Another program that scanned images of the head for aneurysms, a potentially life-threatening condition, proved impressive, she said. Though it flagged many false positives, it detected about 24 percent more cases than radiologists had identified. More people with an apparent brain aneurysm received follow-up care, including a 47-year-old with a bulging vessel in an unexpected corner of the brain.
At the end of a telehealth appointment in August, Dr. Roy Fagan realized he was having trouble speaking to the patient. Suspecting a stroke, he hurried to a hospital in rural North Carolina for a CT scan.
The image went to Greensboro Radiology, a Radiology Partners practice, where it set off an alert in a stroke-triage A.I. program. A radiologist didn’t have to sift through cases ahead of Dr. Fagan’s or click through more than 1,000 image slices; the slice showing the brain clot popped up immediately.
The radiologist had Dr. Fagan transferred to a larger hospital that could rapidly remove the clot. He woke up feeling normal.
“It doesn’t always work this well,” said Dr. Sriyesh Krishnan, of Greensboro Radiology, who is also the director of innovation development at Radiology Partners. “But when it works this well, it’s life-changing for…