In the UK, ‘algorithm’ has become a dirty word as the fiasco of the A level and GCSE results has played out. Deemed to be unfair, unethical and unjust, the grades announced following use of the algorithm were overturned in favour of professional expertise and judgment – the teacher estimated grades.
These events have shown everyone the wider concerns about artificial intelligence (AI) and algorithms that we in the radiology community have recognised since AI was first mooted as the future radiologist. We are keen to prevent such a debacle occurring in healthcare. It is not the algorithm that is at fault but the people who decide which algorithm to apply, the size and quality of the data on which it is trained and tested, and how it is evaluated and used in practice. Admittedly, 2020 has been an extraordinary year and arguably not the best time to launch an untried algorithm, which appears to have had inherent bias programmed in.
The algorithm produced grade inflation for less academic students from high-performing schools while failing to allow for the individual brilliance of students from schools with a poor track record in exams. Winners and losers were decided by factors beyond their control or influence. In radiology training programmes there is often a trend in exam pass rates from year to year. Individual circumstances have a significant effect on exam performance, no matter how well prepared you are. Performance over a few hours on a single day can make the difference between pass and fail, with significant potential to change life choices and outcomes. Exams are an all-consuming feature of doctors’ lives, from GCSEs to the final exams in their chosen specialty. So much so that, following the FRCR, I promised myself I would never sit another exam; the MA(Ed) was assignment based!
What are the implications for AI in healthcare? The fiasco highlights the importance of rigorous development and testing before cautious, regulated and evaluated implementation into practice. The datasets used for training and testing may not match the demographics and disease profiles of the population. This is well known in the drug world, where real-world results rarely match trial outcomes. Piecemeal implementation of AI, without careful evaluation, carries the risk that results may be difficult to corroborate against test data. Small-scale roll-out will therefore be unlikely to uncover any systematic bias in an algorithm; equally, it will be less likely to grab the headlines. Just as post-marketing surveillance is deemed essential for drugs and devices, so it should be for AI.
Innovate UK and the AI accelerators are developing the infrastructure and test beds/sandboxes, utilising NHS patient data. The events of the past few weeks should be taken as a lesson, and NHS bodies should ensure that, before widespread deployment of AI, appropriate testing and evaluation in practice have been undertaken and regulation is in place. Clinicians need to bring rigour along the pathway to clinical utility. In the interests of transparency and of patient and clinician trust, the use of an algorithm and its place within healthcare must be explainable. There is a lack of understanding of data and artificial intelligence among the general public, who will now be more concerned about its implementation within healthcare; there also appears to be a similar lack of understanding among politicians and policymakers. As clinicians, we need to be aware of and engage in the ethical implementation of AI into our practice, ensuring that it does ‘what it says on the tin’, as its use to augment rather than replace radiologists will leave us with the responsibility, and likely the liability, if it goes wrong.
Dr Caroline Rubin
Vice-President, Clinical Radiology