The background

In January, Utah became the first state in the nation to allow an AI system to autonomously handle routine prescription refills for patients with chronic conditions. The pilot intends to reduce delays and friction in the prescription refill process, which can be a major barrier to medication adherence.

Earlier this month, however, researchers said they found flaws in a chatbot made by New York-based startup Doctronic, the same company Utah is partnering with for its pilot. The report critiquing Doctronic's AI was published by London-based Mindgard AI, a cybersecurity and research company born out of Lancaster University. Mindgard sells AI vulnerability tools and specializes in stress-testing AI systems for safety and security vulnerabilities. In the report, Mindgard detailed how it tricked the system into producing dangerous medical guidance and altering prescription doses.

However, both Doctronic and Utah's Office of AI Policy say that the vulnerabilities Mindgard uncovered do not reflect the AI system currently managing patient prescriptions in the state, noting that the AI bot involved in the pilot operates under strict safeguards.

Separating fear from reality

In its report, Mindgard showed that Doctronic's AI could be jailbroken by exploiting flaws in its system prompts, the hidden instructions that govern its behavior. By tricking the AI bot into reciting and then rewriting these instructions, the researchers were able to make it generate unsafe clinical guidance, including wildly incorrect medication doses and instructions for illegal drugs.

Doctronic's co-CEOs, Matt Pavelle and Dr. Adam Oskowitz, said Mindgard didn't uncover any new risks, noting that the kinds of prompt-manipulation vulnerabilities the report demonstrated are already well understood in the AI community.

"It is absolutely impossible for the chatbot to change the rest of the code to modify a prescription or prescribe a drug that's not in our formulary.
A researcher might convince the chatbot to say it will do it, because I can convince a chatbot to say that red is green, but it's not actually doing it," Pavelle said. "I suppose that you never know, as far as people trying to get [improper doses of drugs on the formulary], but I don't know that there's a large black market for statins."

Utah's response

"We understand why reports like this raise questions, and we take them seriously. Independent red-teaming can surface cases that are not encountered in ordinary use, and that kind of stress-testing is valuable as these systems mature," read a statement emailed to MedCityNews from Utah's Office of AI Policy.

The office also said it was aware of these types of risks before the pilot began. That is why it structured the program with layered safeguards, escalation pathways, reporting requirements, physician oversight and physician review phases.

All things considered, Mindgard's report does raise a relevant policy question. The question is not whether edge cases exist (they do, across all large language models) but whether tech developers, providers and regulators are exercising the necessary diligence as they venture into uncharted territory: medication refills without a human in the loop.

Doctronic and Utah's Office of AI Policy say that for their refill bot pilot, the answer is yes. Both organizations maintain that the use of this bot does not put patients in harm's way. And until real-world evidence shows otherwise, they see no reason to slow the rollout.

By Katie Adams