Let’s Reframe Large Language Models (LLMs)
The biggest challenge I see in the market for artificial intelligence (AI) products and services is aligning customer and user expectations with what those products and services actually deliver.
I don’t mean “align” in that mysterious and bullshitty C-suite sense where you just know some weird power play is coming. I mean “align” like the front wheels of your car. If they are out of alignment by too much, your car will be difficult to drive, maybe even dangerous. Similarly, if expectations of AI are too high, people will be disappointed. If expectations are too low, business opportunities will be left on the table.
But it’s not just one-dimensional high and low expectations. It’s understanding exact capabilities. If your application needs something that can travel underwater, and I’m selling the best airplanes in the world, I can’t help you. Making (or worse, letting) you believe that my airplanes are also great submarines doesn’t help either of us. You’re going to be disappointed when you figure out that airplanes don’t work well underwater. If we are going to do business, I need to convince you that flying is an alternative. Then, and only then, will your requirements align with my product’s capabilities.
When we talk about AI today, we’re usually talking about large language models (LLMs). Let’s ignore speculative talk about the holy grail of artificial general intelligence (AGI). We don’t have anything like that now. And let’s focus on text models. Modalities like voice and video are cool and deserve discussions of their own, but the general “truth” principles from text apply to them as well.
A data scientist can show and explain to you, in excruciating detail with beautifully illustrated examples, how neural networks work. This video series from the YouTube channel 3Blue1Brown is amazing and pretty accessible.
I don’t remember if they explicitly mention it in the videos, but the key takeaway that should inform and amaze you is this: when you train these neural networks, the training data itself isn’t stored in the network. A second takeaway is that the network becomes a complicated, hard-to-explain statistical representation of the training data. The narrator tries to illustrate this with shape components of numerals, but that’s a hand wave; in his example, those shapes don’t correlate tightly with the actual nodes and weights that arise during training.
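Here is that first takeaway in miniature. This toy fits a straight line to ten points with ordinary least squares; it is not a neural network, but the principle carries over: after training, we keep a couple of learned numbers, and the training points themselves are discarded.

```python
# Toy "training": fit a line to ten points, keep only two numbers.
xs = list(range(10))
ys = [2.0 * x + 1.0 + (0.1 if x % 2 else -0.1) for x in xs]  # noisy line

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

# The ten training points are gone; the "model" is just these two weights.
print(f"model: y = {slope:.2f} * x + {intercept:.2f}")
```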
LLMs contain very large neural networks. Think millions of nodes and billions of edges (or weights). They also contain something called a transformer, an architecture Google researchers introduced in 2017. The transformer takes a running start with your question (or prompt) and uses the neural network to create an answer, word by word, based on all those weights. It is a random process steered by the weights.
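If code helps, here is a toy sketch of that word-by-word randomness. The five-word vocabulary and the scores are invented for illustration; in a real transformer, the weights would produce scores over tens of thousands of tokens.

```python
import math
import random

# Invented vocabulary and made-up scores ("logits") standing in for what
# a real transformer's weights would produce for the next word.
vocab = ["barn", "horse", "roof", "zoning", "budget"]
logits = [2.1, 1.8, 0.4, 0.2, 1.3]

# Softmax turns scores into a probability distribution.
total = sum(math.exp(v) for v in logits)
probs = [math.exp(v) / total for v in logits]

# The next word is sampled at random, steered by those probabilities:
# likely words win most of the time, but not every time.
for _ in range(5):
    print(random.choices(vocab, weights=probs, k=1)[0])
```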
Let’s verify the randomness by letting an LLM be creative for us.
- Open Bing Copilot (← Click to open in a new tab.)
- In the Ask me anything field at the bottom, type (or copy/paste):
  Make up a 3 paragraph story about Tigger and Eeyore starring in the new Starsky and Hutch movie.
- Open Bing Copilot again in another new tab.
- In the Ask me anything field at the bottom, type:
  Make up a 3 paragraph story about Tigger and Eeyore starring in the new Starsky and Hutch movie.
- Compare the stories.
Notice anything? They should be different stories. If you’re lucky, they will have different plotlines, and certainly different details. Bing Copilot is presenting us with “a” story, not “the” story.
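If you would rather script this experiment than click through tabs, here is a minimal sketch. Bing Copilot doesn’t expose a scripting interface like this, so the sketch assumes the OpenAI Python client and an API key, with any chat model standing in for Copilot.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment
prompt = ("Make up a 3 paragraph story about Tigger and Eeyore "
          "starring in the new Starsky and Hutch movie.")

# Two independent requests with the identical prompt.
for attempt in (1, 2):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model will demonstrate the point
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- Story {attempt} ---")
    print(response.choices[0].message.content)
```

Run it and the two stories will almost certainly differ, because each word is sampled from a probability distribution.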
Let’s try something where we need the answer to be more truthy. Type or copy/paste this prompt into two tabs with Bing Copilot. Be sure to start a new topic if you’re using the tabs from the previous example.
I need to build a barn. Tell me the three most important things to consider.
My first try: (1) Purpose and Design, (2) Location and Land Assessment (focus on zoning), (3) Materials and Budget (includes materials choice).
My second try: (1) Purpose and Design, (2) Land and Location (no mention of zoning), (3) Budget and Resources (all financial).
Both answers are plausible. Perhaps two barn building experts would disagree too. But Bing Copilot is one entity acting as a barn building expert, responding twice to a very specific question. It gave two plausible answers that were noticeably different.
This is not a formal proof that Bing Copilot operates by making up plausible answers from a combination of its weights and input. It does, but this doesn’t prove that. However, the examples support that working hypothesis. More specifically, Bing Copilot does not think. It does not analyze. It doesn’t even remember. But here is the real magic. Even though Bing Copilot doesn’t think, either of those two plausible answers was good enough to get you started thinking about your barn project.
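That “doesn’t even remember” claim is easy to demonstrate if you have API access. A sketch, under the same OpenAI-client assumption as before: each request carries its own message list, and the model sees only what you send.

```python
from openai import OpenAI

client = OpenAI()
model = "gpt-4o-mini"

# Call 1: tell the model something. The "conversation" lives in this list.
history = [{"role": "user", "content": "My barn will be 40 feet wide."}]
first = client.chat.completions.create(model=model, messages=history)
history.append({"role": "assistant",
                "content": first.choices[0].message.content})

# Call 2, WITHOUT the history: the model has no memory of the barn.
fresh = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "How wide did I say my barn is?"}],
)
print(fresh.choices[0].message.content)  # it can only guess or ask

# Chat products feel like they remember only because the client quietly
# resends the transcript with every turn.
```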
A common criticism of LLMs is that they make stuff up. It follows that they are not good at analysis, truth, customer support, law, or programming. Oh, but you’ve heard they are good at those things, right? Turns out they are not. Expensive lessons and uncovered deceptions abound. But surely, if we spend more money, we’ll fix these little problems. Nope. Because those use cases do not align with what LLMs do.
Let me try a different way of explaining what neural networks do, and thus what LLMs do at large scale. As a computer user, you are probably familiar with .zip (and 7-Zip .7z) files. They can take a whole bunch of files and folders and combine them into one file that you can archive or share. They also compress those files and folders so that they take up less space in your archives or can be shared faster. Compressed size might be as low as 20% of the original for typical files we produce on computers. The most important thing about this compression is that it is lossless. That means you can recreate the exact originals from the archives.
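You can watch lossless compression keep its promise with Python’s standard zlib module, which implements the same DEFLATE algorithm family used inside .zip files. Nothing here is hypothetical beyond the made-up sample text.

```python
import zlib

original = b"The quick brown fox jumps over the lazy dog. " * 100
compressed = zlib.compress(original)

print(len(original), "bytes before,", len(compressed), "bytes after")

# Lossless: decompression returns the exact original, byte for byte.
assert zlib.decompress(compressed) == original
```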
Those of you who work with pictures and websites are familiar with JPEG files. These are pictures that have been compressed with the JPEG algorithm. The byte size of the original full-quality picture is typically width in pixels x height in pixels x 4 bytes, allowing 8 bits each for the red, green, and blue components of each pixel, plus 8 bits for transparency (or just for byte alignment and easy calculation). That’s a mouthful, but a 1920 x 1080 picture would be 8.3 MB uncompressed. It might end up around 2 MB compressed into a lossless format like zip. That same image might be 100 KB (compressed to roughly 1/80th its original size) when compressed by JPEG.
JPEG is a lossy compression scheme. This means that when you have an original image, you compress it with a JPEG encoder, and then decompress that encoded JPEG, you will end up with — pixel for pixel — a different image than the one you started with. But that’s acceptable, because the new image looks similar enough to the original. Your eyes might not be able to tell the difference. Imagine all the disk space and network traffic this saves!
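Here is the lossy version of the same round trip, sketched with the Pillow imaging library (an assumption: you have Pillow installed). The gradient image is synthetic, so the example needs no input file.

```python
import io
from PIL import Image

# A synthetic 256 x 256 RGB gradient stands in for a photograph.
img = Image.new("RGB", (256, 256))
img.putdata([(x, y, (x + y) // 2) for y in range(256) for x in range(256)])

# Encode to JPEG in memory, then decode it right back.
buf = io.BytesIO()
img.save(buf, format="JPEG", quality=75)
decoded = Image.open(io.BytesIO(buf.getvalue()))

# Lossy: the round trip changes pixels, just not enough for eyes to care.
changed = sum(a != b for a, b in zip(img.getdata(), decoded.getdata()))
print(f"{changed} of {256 * 256} pixels changed")
```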
So let’s consider an LLM. We have billions of web pages of training data totaling terabytes, maybe even petabytes of raw data. We’re going to compress that data using our LLM training procedures. We end up with a model representing terabytes or petabytes that is a mere few gigabytes on disk. Amazing compression! The downside? We can’t recover the original files.
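Some back-of-envelope arithmetic shows just how extreme the ratio is. Every number below is an illustrative assumption, not any particular model’s real statistics.

```python
# Illustrative assumptions only, not any real model's numbers.
training_data_bytes = 10e12        # ~10 TB of training text, assumed
parameters = 7e9                   # a 7-billion-parameter model, assumed
bytes_per_parameter = 2            # 16-bit weights

model_bytes = parameters * bytes_per_parameter
ratio = training_data_bytes / model_bytes

print(f"model on disk: {model_bytes / 1e9:.0f} GB")
print(f"roughly {ratio:.0f}x smaller than its training data")
```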
What we can generate, through prompts, are new files (responses) informed by all of the files we ingested. We should call this super lossy compression. And we know, from our experiments above, that it can be pretty useful, even if it doesn’t give us our source material or an easy way to trace through and understand the responses it generates. It simply is what it is and does what it does.
A quick Google search tells me this isn’t an original concept, per se. But I will bet I’m one of the first to try to explain it to non-practitioners. How did I do? Are you ready to realign your expectations of AI with super lossy compression?
Drop by Brad-GPT.com and reach out to me on the form. Or, drop me a note at brad@Brad-GPT.com.
Finally, as I write this article, I am still looking for my next professional adventure. If my thinking matches a need in your organization, I’d love to hear from you.