Why radiologists don’t trust AI but use it anyway 🤯
When perceptions and usage don’t match
Hello and welcome to Careviser by Marie Loubiere, the weekly newsletter that cuts through the healthcare noise with a single focus: productization of the latest research and tech breakthroughs.
“Radiologists will soon be replaced by AI” is a claim often found in click-bait media articles. The reality is that AI, even in radiology, is still in its infancy. AI solutions are not yet ready to be FDA-cleared for autonomous diagnosis. At this stage, their intended use is to support physicians’ decision-making. But what happens when radiologists and AI have to work hand in hand? Are radiologists ready to adopt these new tools? That’s what we will discuss today.
Do as AI say: susceptibility in deployment of clinical decision-aids by Gaube, S., Suresh, H., Raue, M. et al.
🗝️ Why it matters: AI is slowly changing healthcare, especially radiology. Numerous studies have shown that it can make reliable diagnostic recommendations on medical images. This raises the question of its implementation in real-world clinical environments. Physicians need to trust AI, but also remain critical to prevent medical errors. Such a balance is hard to strike.
🔎 The study: The objective was to assess how physicians perceived diagnostic advice based on
Whether it was labeled as coming from an AI, or from an experienced radiologist
Whether perception differed with the level of expertise of the physician receiving the advice. The study included experts (radiologists) and physicians with less expertise (internal/emergency medicine physicians)
They were asked to (1) evaluate the quality of the advice through a series of questions, and (2) make a final diagnosis. This made it possible to check whether their perception was consistent with how they acted on the advice.
All the advice had been formulated by experienced radiologists, even when labeled as coming from the AI. Some of the diagnostic advice was purposely inaccurate.
✅ The results: The good news is that most physicians were able to spot inaccurate advice and gave such advice a lower rating.
Radiologists were even better at flagging inaccurate advice, which is consistent with their level of expertise. They also exhibited stronger “algorithmic aversion” than their non-expert peers, meaning they gave lower ratings to inaccurate advice attributed to an AI than to similar advice attributed to a human.
The final diagnosis made by the physician was affected by the advice received. Diagnostic performance was 37 to 40% better when they had received accurate advice, which shows that the quality of the advice does affect the quality of the final diagnosis. This raises a serious question about the potential impact of receiving inaccurate advice from an AI.
It is also interesting to note that the final diagnosis was not affected by the source of the advice (AI or human). Even though physicians expressed less trust in the AI, they followed its advice in the same proportion as when it came from a human physician.
🚀 Challenges and opportunities for AI implementation:
The study shows that diagnostic advice influences clinical decision-making, even when it comes from an AI that physicians claim not to trust.
Physicians with less expertise can be even more influenced by inaccurate advice (through confirmation bias), meaning that the evaluation of an AI’s quality needs to be stringent. If the AI performs better than humans, then all should be well.
AI tends to give firm diagnostic answers, without the nuance that comes from two humans reviewing a case. Humans are more likely to trust advice that conveys a degree of confidence, a feature that could support the adoption of AI products.
There is a striking paradox: high-expertise physicians expressed distrust towards the AI system, yet trusted it as much as humans when taking its advice into account for the final diagnosis. This means that even if they end up using it in clinical practice, it may be hard for AI vendors to get a foot in the door at first.
Gleamer, founded in 2017 in France, is an AI startup for radiology.
💊 The product: Their main product, BoneView, is a class IIa medical device: an AI assistant for bone trauma X-rays. It enables radiologists to double-check that they have not missed tiny lesions. It also comes with a smart worklist that prioritizes at-risk patients. This is an interesting twist that lets the product fit seamlessly into the regular clinical workflow, as radiologists use the worklist all the time. It is sold on a subscription model priced by hospital size.
📈 Progress: They claim to have 800 users in over 50 hospitals. One of the main benefits of BoneView is that it is another “shield against potential litigation procedures”: if both the radiologist and the AI miss a lesion, it is an additional way to back the physician and argue that it could not be detected. Overall, Gleamer says they reduce missed fractures by 30%. It would be interesting to see a savings figure associated with this claim, e.g., how much time they save and what impact they have on outcomes, readmissions, etc.
They raised a $9m series A last September, led by French VC Xange, with 37 radiologists backing them as business angels. It is one of the first times I have heard of a startup backed at a later stage by physicians, and it is an inspiring move.
🚀 Next steps: They have focused on Europe so far, as their AI is not yet FDA-cleared. They have launched a clinical study and aim to be FDA-cleared by the end of the year. Go-to-market in the US is always a challenge for European healthcare startups, so it will be interesting to see how they make it through.
They would also like to expand the scope of the AI so that it can be used in other radiology indications.
Deepc addresses one of the key challenges of radiology AI: multiple startups have launched AI solutions that cover a specific indication (e.g., detection of a specific type of lesion or cancer). It is challenging for providers to integrate all these solutions. Deepc is building an operating system that brings together all the relevant AI solutions in one platform.
💊 The product: The deepcOS platform covers more than 20 indication fields. It can be installed in a day and, according to the company, delivers a 7x return on investment. They have done all the interfacing work between the AI algorithms, and the platform is interoperable with all the main PACS and hosted in the cloud. They claim to have built a user-friendly interface.
📈 Progress: Deepc started as a spin-off from a university project at the Ludwig Maximilian University in Germany in 2019. They raised a seed round at the end of 2020. They are CE marked and have some of the leading AI solutions (including Gleamer) on their platform. They just launched the product and are currently hiring a sales team to get their first paid customers.
🚀 Next steps: Deepc’s initial go-to-market strategy seems to be focused on Germany and then Europe. Germany is a large healthcare IT market in Europe, but one with limited penetration of innovative solutions and entry-level pricing. However, deepc would like to enable its hospital customers to receive public funding for its solutions under the Hospital Future Act (KHZG).
That’s a wrap for today! Don’t hesitate to reply to this email with comments, I read and answer all emails :)