What is Generative AI Good At?
I hate that I ended that title with a preposition. It’s not incorrect, but it feels lazy. I’m pretty good at rewriting sentences to not end in prepositions and not sound absurd, but this question is a hard one to rewrite. If it bothers you, add “, Asshole” to the end of it, before the question mark. Which brings me to an important point about my style. I am a thinker and a comedian. I interject levity and even profanity into very serious points to make them clearer. If that bothers you, feel free to add “, asshole” to the end of any snarky sentence you find below, before the period, of course.
TL;DR, but you should read the rest carefully: Generative AI is bad at the things you hope it would be good at. And it’s good at things where you exploit what everyone else sees as problems. In short, making shit up. Read on for applications where that’s actually very valuable.
So, What is Generative AI Bad At?
Let’s start with this question, which also ends in a preposition. Very easy answer: pretty much everything we hope it’s good at (, asshole). If you understand that Large Language Models (LLMs) are probabilistic models that string word sequences together, and you’ve ever had trouble expressing a complicated thought, perhaps in a second language, you will know in your gut that “intelligent” is a terrible misnomer for what they do. Maybe it also takes a classical computer science education and 35 years in the industry, with all its hype and unrealized promises mixed in with mind-blowing real progress. Well, I know. And I knew.
Let me itemize a few things.
- Software development. If you pay for top 10% software development talent, which is, let’s call it, 40x as productive as median talent any way you want to measure it, and which is required for the most research-oriented tasks in R&D, you would love it if you didn’t have to pay 2x to 3x for that talent. I get it. You tried outsourcing to Indian companies with hundreds of computer science Ph.D.s working for fry-cook wages, and you couldn’t match top talent. That has to be really frustrating. Going back to the late 1980s, CASE tools were going to solve the disparity by lifting everyone up. Agile sprints and morning standups, based not terribly loosely on the Synanon Game, would solve the communication problem. Tayloristic management would make everyone equally productive! Now with AI, we can do all of that at scale for the price of computing resources that cost less (amortized) than the cheapest capable paid developer. It turns out AI tools are most helpful to the bottom 25% and get in everyone else’s way. Bummer.
- Chatting with your dead father. This applies to all other chatbots, for that matter, if you’re seeking anything more than a rep of chatting with a chatbot. That rep might be useful, but it’s not your Dad. First off, he’s not going to divulge previously unknown family secrets to you unless he wrote them down in whatever becomes the training corpus. Who does that? Not your Dad. More than likely, if you really want to chat with him, he was quiet and reserved about personal things, even if he was prolific in his professional writings. A more serious point was brought up by Russ Roberts in a recent episode of EconTalk. He had little interest in chatbotting with his recently deceased father, who was old at the end and his mind afflicted by the things that afflict old people. But he would have loved to chatbot with his Dad in his 40s, at the top of his career and family life.
Here’s a chatbot idea sure to brighten your day, inspired by Roberts’ wish. How about this: if you’re a man going through a divorce, you could agree to give up the house and half of your 401K for a chatbot of your soon-to-be ex-wife from 20 years ago, when she loved you and the two of you were still kind of hot?
I joke and poke fun, but I definitely understand the appeal to some people. Here’s something that has helped me tremendously in dealing, both short and long term, with the loss of family and friends: get together with someone else who knew them and share memories. Those memories, even colored by time, are very real and not limited to what was written down.
- Data analysis and decision making. The biggest “advantage” of GenAI is that when you give it a question, it will give you an answer, and it will give you that answer quickly. That does not mean it will give you a right answer. I have been calling that “artificial certitude”. People seem very hungry for answers, often where they don’t exist or aren’t easy.
Recently, I listened in horror as an associate recounted taking data from a commissioned web survey, pasting it into ChatGPT, asking for a summary, and then pointing out mundane (alleged) connections that happen every time you do a large survey, as if they meant anything. Group attention and decision making went predictably downhill from there.
If you want to do useful analysis leading to a good decision, you can use the LLM to summarize arguments for both (or all known) sides based on the data, then be the human in the middle who makes the decision. That process is fairly defensible, assuming the summaries check out. It at least matches a trusted process of “looking at both sides”, absent the AI tool.
- Advice. This could be numbered 3b. I worked briefly to help the same company from the previous item, which had some investor funding, to create a “report” for a particular piece of personal data uploaded by participating users. I programmed the system from the ground up, based on a kind-of-functional custom GPT demo in the ChatGPT environment. I used ChatGPT to process the document and give its analysis based on a “prompt” that had to be extensively tuned to give an “interesting” spread of results. Its results on the exact same training documents varied wildly, both from run to run and from day to day. I rendered the “results” into a uniform report that would look like a familiar, branded format for each user. The “results” were often unparsable, even though I asked for them in a particular, parsable format: YAML with specific key/value pairs, to be precise. When they weren’t formatted correctly, I asked again, and almost always got a materially different analysis. There’s a minimal sketch of that parse-and-retry loop after this list. I describe the process because it gets to the heart of how bullshitty analysis and advice are in the GenAI world. Palm reading is easier and more accurate.
Side note: the business model behind offering free AI advice is collecting your name, email address, and personal data. Participate in any of these schemes with caution.
- Truth. See Google Gemini, Black George Washington, Woke AI, etc. Practitioners often call such mistakes “hallucinations”, meaning that, given the underlying data, the LLM basically made something up and passed it off as true. Notice how the AI industry anthropomorphizes everything, but has specific definitions for the terms that aren’t quite what you’d expect.
The best examples of hallucinations in the practitioner sense are legal industry LLMs that make up cases and even URLs. “Not good at truth.” The problems previously mentioned with Google Gemini were not “hallucinations” in the practitioner sense of the word. They were trained into the models. Google Gemini was by no means a failure. It was, in fact, an amazing demonstration to governments and regulators that Google could deliver AI with whatever truth governments and regulators wanted.
For the record, I am down with Black George Washington. I just want to know that he has wooden teeth.
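Since I brought up the parse-and-retry loop, here’s roughly what it looked like. I no longer have the original code, so `ask_llm`, the key names, and the prompt wording are all stand-ins, not the client’s actual system:

```python
import yaml  # PyYAML

# Hypothetical report fields; the real keys were client-specific.
REQUIRED_KEYS = {"summary", "score", "recommendations"}

def get_report(document_text: str, ask_llm, max_attempts: int = 3) -> dict:
    """Ask the model for a YAML report and retry until it parses.

    ask_llm is a stand-in for whatever chat-completion client you use:
    it takes a prompt string and returns the model's text response.
    """
    prompt = (
        "Analyze the following document and respond ONLY with YAML "
        f"containing exactly these keys: {sorted(REQUIRED_KEYS)}.\n\n"
        + document_text
    )
    for _ in range(max_attempts):
        raw = ask_llm(prompt)
        try:
            result = yaml.safe_load(raw)
        except yaml.YAMLError:
            continue  # not even valid YAML; ask again
        if isinstance(result, dict) and REQUIRED_KEYS <= result.keys():
            return result  # parsable at last, but see the caveat below
    raise ValueError(f"no parsable report after {max_attempts} attempts")
```

The catch, as described above: every trip through that loop is a fresh roll of the dice. The retry fixes the formatting problem and silently swaps out the analysis underneath it.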
What is Generative AI Good At?
This is the perfect spot to switch gears and get to my thesis. It’s like Calculatus Eliminatus from the Cat in the Hat.
Pick the things that GenAI turns out to be really bad at. Now ask yourself, “Is there an application where this kind of failure could be valuable?”
Pfizer took a similar approach in 1989 with a drug, Sildenafil, that it hoped would treat, wait for it, acute angina. Damn, this joke writes itself. If you’re not familiar, Pfizer discovered an interesting side effect in trials and later marketed the drug as Viagra. The rest is history. I’ll leave the punchline to you.
Medium doesn’t do tables, so I’ll lay out my Calculatus Eliminatus in pairs.
Not good at: Writing good code.
Might be good at: Writing bad code to train humans to debug code.
Generating reps will be a theme. Humans need lots of reps to develop proficiency. Often, there are only a handful of textbook reps available.
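To make the bad-code idea concrete, here’s a minimal sketch, again with a hypothetical `ask_llm` helper standing in for your LLM client of choice:

```python
# The prompt is the whole trick: ask for realistic, single-fault bugs.
BUG_DRILL_PROMPT = """Take the working function below and introduce exactly one
realistic bug: an off-by-one error, a wrong comparison operator, a mutated
default argument, or similar. Return only the modified code, no explanation.

{working_code}
"""

def make_debugging_drill(working_code: str, ask_llm) -> str:
    """Produce one buggy variant of known-good code for a trainee to fix.

    ask_llm is a stand-in: it takes a prompt string and returns the
    model's text response. Because the model fails differently every
    time, one textbook example becomes an endless supply of reps.
    """
    return ask_llm(BUG_DRILL_PROMPT.format(working_code=working_code))
```

The very unreliability that sinks it as a code generator is the feature here: you want a different plausible mistake every run.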
Not good at: Chatting with specific people and characters, advice.
Might be good at: Role playing with difficult customers.
Air Canada had it bass-ackwards with its chatbot rollout. What if, instead, it had trained a chatbot to come up with an endless supply of difficult customer requests to continually train its human reps to handle hard cases efficiently and correctly? Again, generating realistic reps.
Not good at: Complying with copyright, respecting trademarks.
Might be good at: Humorously remixing public domain content.
“Make up a story about Paul Bunyan and Eeyore saving the Oakland A’s from a bathroom disaster.” Pick your favorite large enough LLM. That prompt will be hilarious.
Not good at: Truth. Makes things up.
Might be good at: Developing scenarios for military and emergency preparedness training.
Again, reps. FICINT (fictional intelligence) is a thing. Now imagine prompting with the basic plot of the exercise and getting countless significant variations. Huge thanks to John Sullivan for the article link. It’s what galvanized my thinking on this whole topic.
My goal here was not to be contrarian. Nor was it to get a few cheap laughs. My goal was to reframe the discussion.
My main criticism of generative AI has been around artificial certitude. All of us should be more comfortable with the idea of questions that are unanswerable or difficult/costly to answer. We most certainly are not, and that leaves us vulnerable to buying into any available answer. My main hope for AI has been delight. Despite a $700B entertainment industry, delight is seen as unserious and unworthy of this great technology. Delight is also easily faked, for example, with promises of resurrecting dead people or broken marriages.
I hope I have reframed the discussion for you in a more useful light. I hope you will share this article with others who are trying to make a living in this space.