AI -- it’s everywhere. It’s there every time a politician pronounces on how to transform productivity in all industries, healthcare included, each time you open a newspaper or watch TV, in conversations over coffee, and in advertising and culture. AI, however ambiguously defined, is the new “white heat of technology” -- a phrase coined in 1963 by former British Prime Minister Harold Wilson.
In her excellent book, “Artificial Intelligence: A Guide for Thinking Humans,” Melanie Mitchell discusses the cycles of AI enthusiasm, from gushing AI boosterism to disappointment, rationalization, or steady and considered incorporation. She likens this cycle to the passing of the seasons -- AI spring followed by an inevitable AI winter.
The recent successes of AI, and in particular the rapid development of large language models like ChatGPT, have resulted in a sustained period of AI spring, with increasingly ambitious claims made for the technology, fuelled by the hubris of the “Bitter Lesson” -- that any human problem might be solvable not by thought, imagination, innovation, or collaboration but simply by throwing enough computing power at it.
These seem like exaggerated claims. Like many technologies, AI may be excellent for some things and not so good for others, and we have not yet learned to tell the difference.
Most human problems come with a panoply of complexities that prevent wholly rational solutions. Personal (or corporate) values, prejudices, experience, intuition, emotion, playfulness, and a whole host of other intangible human traits factor into their management. For example, AI is great at transcribing speech (voice recognition), but understanding spoken meaning is an altogether different problem, laden with glorious human ambiguity. When a British English speaker says, “not bad,” that can mean anything from amazing to deeply disappointing.
Daily clinical routine
In our work as radiologists, we live this issue of problem mischaracterization every day. We understand there is a world of difference between the simple question “What’s on this scan?” and the much more challenging “Which of the multiple findings on this scan is relevant to my patient in the context of their clinical presentation, and what does this mean for their care?” That’s why we call ourselves “clinical radiologists” and why we have multidisciplinary team meetings.
Again, what seems like a simple problem may in fact be hugely complex. To suggest (as some have) that certain professions will be rendered obsolete by AI is to utterly misunderstand those professions and the nature of the problems their human practitioners apply themselves to.
Why do we struggle to separate AI reality from hubristic overreach? Partly this is due to inevitable marketing and investor hype, but I also think the influence of literature and popular culture plays an important role.
Manufactured sentient agents are a common fictional device: from Frankenstein’s monster via HAL 9000 to the Cyberdyne T-800 or Ash of modern science fiction. But we speak about actual AI using the same language as these fictional characters (and they are characters -- that’s the point), imbuing it with anthropomorphic talents and motivations that are far removed from today’s reality. We describe it as learning, as knowing, but we have no idea what this means. We are beguiled by its ability to mimic our language, but we don’t question whether any underlying thought is actually there.
In short, we think of AI systems more like people than like tools limited in purpose and role. To steal a quote, we forget that these systems know everything about what they know, and nothing about anything else (there it is again: “know”?). Because we can solve complex problems, we assume AI can too, and in the same way.
The example of chest x-rays
In studies of AI image interpretation, neural networks “learn” from a “training” dataset. Is this training and learning in the way we understand it?
Think about how you train a radiologist to interpret a chest x-ray (CXR). After embedding the routine habit of demographic checking, you teach the principles of x-ray absorption in different tissues, then move on to helping them understand the silhouette sign and how the image findings follow inevitably, even beautifully, from the pathological processes present in the patient. It’s true that over time, and with enough experience, a radiologist develops “gestalt” or pattern recognition, meaning they don’t have to follow each of the steps to compose the report; they just “know.” But, occasionally, the gestalt fails, and they need to fall back on first principles.
What we do not do is give a trainee 100,000 CXRs, each tagged with the diagnosis, and ask them to make up their own scheme for interpreting them. Yet this is exactly how we train an AI system: We give it a stack of labeled data and away it goes. There is no pedagogy, mentoring, understanding, explanation, or derivation of first principles. There is merely the development of a statistical model in the hidden layers of the software’s neural network, which may or may not produce the same output as a human. Is this learning?
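For the technically curious, what that training amounts to can be sketched in a few lines. The example below is illustrative only, written in Python with the PyTorch and torchvision libraries; the dataset folder, the network (a small ResNet), and the settings are assumptions made for the sake of the sketch, not a description of any real product. The point is what is absent: no physics, no silhouette sign, no explanation, only a loop that adjusts internal weights until outputs statistically match labels.

```python
# Illustrative sketch of supervised "training" on labeled images.
# The folder name, network choice, and hyperparameters are assumptions;
# any image set arranged as cxr_training_set/<diagnosis>/<image>.png would do.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # CXRs are single-channel
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# A stack of images, each tagged with a diagnosis, and nothing else.
train_data = datasets.ImageFolder("cxr_training_set/", transform=transform)
loader = DataLoader(train_data, batch_size=32, shuffle=True)

model = models.resnet18(num_classes=len(train_data.classes))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)  # how far off were the outputs?
        loss.backward()                        # which way to nudge each weight
        optimizer.step()                       # nudge them, slightly
# No first principles anywhere: just a statistical model whose weights now
# correlate pixel patterns with the labels it was given.
```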
In her book, Mitchell provides some examples of how an AI’s learning is different from (and, I would say, inferior to) human understanding.
She describes “adversarial attacks,” in which the output of a system designed to interpret an image can be rendered wholly inaccurate by altering a single pixel within it, a change invisible to a human observer. More illustratively, she describes a system designed to identify whether an image contained a bird, trained on a vast number of images with and without birds. What the system actually “learned” was not to identify a feathered animal but to identify a blurred background, because it turns out that most photos of birds are taken with a long lens and a shallow depth of field, producing strong bokeh. So the system associated the bokeh with the tag “bird.” Why wouldn’t it, without the helping hand of a parent, a teacher, or a guide to point out its mistake?
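Mitchell’s adversarial point is worth making concrete, because it is not exotic. The sketch below uses the widely known “fast gradient sign” method rather than a literal single-pixel change, and it assumes access to the trained model and loss function from the earlier sketch; it is an illustration, not a recipe tied to any specific system. Each pixel is moved by a tiny amount, far below what a human eye can perceive, in exactly the direction that makes the model more wrong.

```python
# Illustrative adversarial perturbation (fast gradient sign method).
# `model` and `loss_fn` are assumed to be the classifier and loss from the
# previous sketch; `image` is a single correctly classified input tensor.
import torch

def adversarial_example(model, loss_fn, image, true_label, epsilon=0.003):
    """Return a copy of `image` nudged so the model is more likely to err."""
    model.eval()
    perturbed = image.clone().detach().requires_grad_(True)
    loss = loss_fn(model(perturbed.unsqueeze(0)), true_label.unsqueeze(0))
    loss.backward()  # gradient of the loss with respect to every pixel
    with torch.no_grad():
        # Step each pixel by epsilon in the direction that increases the loss.
        adversarial = perturbed + epsilon * perturbed.grad.sign()
        adversarial = adversarial.clamp(0.0, 1.0)  # keep valid pixel values
    return adversarial.detach()
# The change is bounded by epsilon per pixel, invisible to a human observer,
# yet often enough to flip the model's output entirely.
```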
Is a machine developed like this actually learning in the way we use the term? I’d argue it isn’t, and to suggest so implies much more than the system calibration that is actually going on. Would you expect the same from a “self-calibrating neural network” as you would from a “learning machine”? Language matters: using less anthropomorphic terms allows us to think of AI systems as tools, not as entities.
The best tool for the job
We are used to deciding on the best tool for a given purpose. Considering AI more instrumentally, as a tool, gives us the space to articulate more clearly the problem we want to solve, where an AI system would usefully be deployed, and what other options might be available.
For example, improving the immediate interpretation of CXRs by patient-facing (nonradiology) clinicians might be best served by an AI support tool, an education program, a brief induction refresher, an increase in reporting capacity, or a combination of all four. Which of those things should a department invest in? Framing the question in this way at least encourages us to consider all alternatives, human and machine, and to weigh up the governance and economic risks of each more objectively. But how often does such an assessment happen? I’d venture, rarely. Rather, the technocratic allure of the new toy wins out, and alternatives are either ignored or, at best, incompletely explored.
So to me, AI is a tool, like any other. My suspicion of it derives from my observation that what is promised for AI goes way beyond what is likely to be deliverable, that our language about it inappropriately imbues it with human traits, and that it crowds out human solutions, which are rarely given equal consideration.
Melanie Mitchell concludes her book with a simple example, a question so basic that it seems laughable. What does “it” refer to in the following sentence:
The table won’t fit through the door: It is too big.
AI struggles with questions like this. We can answer this because we know what a table is and what a door is: concepts derived from our lived experience and our labeling of that experience. We know that doors don’t go through tables but that tables may sometimes be carried through doors. This knowledge is not predicated on assessment of a thousand billion sentences containing the words door and table and the likelihood of the words appearing in a certain order. It’s based on what we term “common sense.”
No matter how reductionist your view of the mind as a product of the human brain, to reduce intelligence to a mere function of the number of achievable teraflops ignores that past experience of the world, nurture, relationships, personality, and many other traits legitimately shape our common sense, thinking, decision-making, and problem-solving.
AI systems are remarkable achievements, but there is some way to go before I’ll set aside my scepticism and regard them as anything other than tools, to be deployed judiciously and alongside other, human, solutions.
Dr. Chris Hammond is a consultant vascular radiologist and medical director at Leeds Teaching Hospitals NHS Trust, Leeds, U.K.
The comments and observations expressed herein do not necessarily reflect the opinions of AuntMinnieEurope.com, nor should they be construed as an endorsement or admonishment of any particular vendor, analyst, industry consultant, or consulting group.