Vishal Gupta: Worried that ChatGPT is coming for your job? An old assessment tool may have the answer

From the commentary: Using Bloom’s Taxonomy, we can see that effective human-AI collaboration will largely mean delegating lower-level cognitive tasks so that we can focus our energy on more complex cognitive tasks.

FILE PHOTO: A smartphone with a displayed ChatGPT logo is placed on a computer motherboard in this illustration taken February 23, 2023.
REUTERS/Dado Ruvic/Illustration/File Photo

“AI passes U.S. medical licensing exam.” “ChatGPT passes law school exams despite ‘mediocre’ performance.” “Would ChatGPT get a Wharton MBA?”

Headlines such as these have recently touted (and often exaggerated) the successes of ChatGPT, an artificial intelligence tool capable of writing sophisticated text responses to human prompts. They follow a long tradition of comparing an AI’s ability to that of human experts, such as Deep Blue’s chess victory over Garry Kasparov in 1997, IBM Watson’s “Jeopardy!” victory over Ken Jennings and Brad Rutter in 2011, and AlphaGo’s victory in the game Go over Lee Sedol in 2016.

The implied subtext of these recent headlines is more alarmist: AI is coming for your job. It’s as smart as your doctor, your lawyer and that consultant you hired. It heralds an imminent, pervasive disruption to our lives.

But sensationalism aside, does comparing AI with human performance tell us anything practically useful? How should we effectively use an AI that passes the U.S. medical licensing exam? Could it reliably and safely collect medical histories during patient intake? What about offering a second opinion on a diagnosis? Knowing that an AI performs comparably to a human on the medical licensing exam does not answer these kinds of questions.

The problem is that most people have little AI literacy, meaning an understanding of when and how to use AI tools effectively. What we need is a straightforward, general-purpose framework for assessing the strengths and weaknesses of AI tools that everyone can use. Only then can the public make informed decisions about incorporating those tools into daily life.

To meet this need, my research group turned to an old idea from education: Bloom’s Taxonomy. First published in 1956 and later revised in 2001, Bloom’s Taxonomy is a hierarchy describing levels of thinking in which higher levels represent more complex thought. Its six levels are: 1) Remember — recall basic facts, 2) Understand — explain concepts, 3) Apply — use information in new situations, 4) Analyze — draw connections between ideas, 5) Evaluate — critique or justify a decision or opinion, and 6) Create — produce original work.

These six levels are intuitive, even for non-experts, but specific enough to make meaningful assessments. Moreover, Bloom’s Taxonomy isn’t tied to a particular technology — it applies to cognition broadly. We can use it to assess the strengths and limitations of ChatGPT or other AI tools that manipulate images, create audio, or pilot drones.

My research group has begun assessing ChatGPT through the lens of Bloom’s Taxonomy by asking it to respond to variations on a prompt, each targeting a different level of cognition.

For example, we asked the AI: “Suppose demand for COVID vaccines this winter is forecasted to be 1 million doses plus or minus 300,000 doses. How much should we stock to meet 95% of demand?” This is an Apply task. We then modified the question, asking it to “Discuss the pros and cons of ordering 1.8 million vaccines,” which is an Evaluate-level task. Then we compared the quality of the two responses and repeated this exercise for all six levels of the taxonomy.

Preliminary results are instructive. ChatGPT generally does well with Remember, Understand and Apply tasks but struggles with the more complex Analyze and Evaluate tasks. With the first prompt, ChatGPT responded well by applying and explaining a formula to suggest a reasonable vaccine quantity (albeit making a small arithmetic mistake in the process).

With the second, however, ChatGPT waffled unconvincingly about having too much or too little vaccine. It made no quantitative assessment of these risks, did not account for the logistical challenges of cold storage for such an immense quantity and did not warn of the possibility that a vaccine-resistant variant might arise.
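To make the two prompts concrete, here is a minimal sketch of one textbook way to answer them. It is not necessarily the formula ChatGPT or the research group used. It rests on assumptions the prompt does not state: that demand is normally distributed, that “plus or minus 300,000” is one standard deviation, and that “meet 95% of demand” means a 95% chance that the stock covers all demand. The function names are hypothetical.

```python
from statistics import NormalDist

# Assumptions (not stated in the article): demand is normally distributed,
# "plus or minus 300,000" is one standard deviation, and "meet 95% of demand"
# means a 95% chance that the stock covers all demand.
MEAN_DEMAND = 1_000_000
STD_DEMAND = 300_000
SERVICE_LEVEL = 0.95

demand = NormalDist(mu=MEAN_DEMAND, sigma=STD_DEMAND)


def stock_for_service_level(level: float) -> float:
    """Stock level that demand stays at or below with probability `level`."""
    return demand.inv_cdf(level)


def service_level_for_stock(stock: float) -> float:
    """Probability that demand does not exceed `stock`."""
    return demand.cdf(stock)


if __name__ == "__main__":
    # Apply-level question: how much to stock for 95% coverage.
    print(f"Stock for 95%: {stock_for_service_level(SERVICE_LEVEL):,.0f} doses")
    # One quantitative input to the Evaluate-level question about 1.8 million doses.
    print(f"Coverage at 1.8M doses: {service_level_for_stock(1_800_000):.1%}")
```

Under these assumptions the first answer is roughly 1.49 million doses, and a stock of 1.8 million would cover demand with a probability well above 99%. That is the kind of quantitative starting point an Evaluate-level answer could weigh against storage and wastage costs.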

We are seeing similar behavior for different prompts across these taxonomy levels. Thus, Bloom’s Taxonomy allows us to draw more nuanced assessments of the AI technology than a raw human-versus-AI comparison.

As for our doctor, lawyer and consultant, Bloom’s Taxonomy also provides a more nuanced view of how AI might someday reshape, rather than replace, these professions. Although AI may excel at Remember and Understand tasks, few people consult their doctor to inventory all possible symptoms of a disease or ask their lawyer to recite case law verbatim or hire a consultant to explain the theory of Porter’s Five Forces.

But we turn to experts for higher-level cognitive tasks. We value our doctor’s clinical judgment in weighing the benefits and risks of a treatment plan, our lawyer’s ability to synthesize precedent and advocate on our behalf, and a consultant’s ability to identify an out-of-the-box solution no one else thought of. These skills are Analyze, Evaluate and Create tasks, levels of cognition where AI technology currently falls short.

Using Bloom’s Taxonomy, we can see that effective human-AI collaboration will largely mean delegating lower-level cognitive tasks so that we can focus our energy on more complex cognitive tasks. Thus, instead of dwelling on whether an AI can compete with a human expert, we should be asking how well an AI’s capabilities can be used to help foster human critical thinking, judgment and creativity.

Of course, Bloom’s Taxonomy has its own limitations. Many complex tasks involve multiple levels of the taxonomy, frustrating attempts at categorization. And Bloom’s Taxonomy does not directly address issues of bias or racism, a major concern in large-scale AI applications. But while imperfect, Bloom’s Taxonomy remains useful. It is simple enough for everyone to grasp, general-purpose enough to apply to a broad range of AI tools, and structured enough to ensure we ask a consistent, thorough set of questions of those tools.

Much as the rise of social media and fake news requires us to develop better media literacy, tools such as ChatGPT demand that we develop our AI literacy. Bloom’s Taxonomy offers a way to think about what AI can do, and what it can’t, as this type of technology becomes embedded in more parts of our lives.

Vishal Gupta is an associate professor of data sciences and operations at the USC Marshall School of Business and holds a courtesy appointment in the department of industrial and systems engineering. This commentary is the columnist's opinion. Send feedback to: opinion@wctrib.com.

©2023 Los Angeles Times. Visit at latimes.com. Distributed by Tribune Content Agency, LLC.

______________________________________________________

This story was written by one of our partner news agencies. Forum Communications Company uses content from agencies such as Reuters, Kaiser Health News, Tribune News Service and others to provide a wider range of news to our readers.
