ChatGPT on ChatGPT
Near the end of my last post I wrote:
And because [Large Language Models – LLMs] seemed to work, and because PAC (Probably Approximately Correct) seems good enough for most situations, for most people, most of the time. But I would argue that it is dangerous. And even if it is only dangerous occasionally, to some people, some of the time, those people have a right to know what they are getting.
I thought it would be instructive to see how an LLM – in this case ChatGPT – saw its own answers. I began with some definitions and then sought clarifications:
Curated Lines: I want to know the difference between data and information and knowledge and, in that context, how you would describe the answer you are giving
ChatGPT said:
The terms data, information, and knowledge are often used interchangeably, but they represent different stages in the process of understanding and learning. Here’s how they differ:
- Data:
- Definition: Raw facts, figures, or symbols that, on their own, don’t have much meaning.
- Example: “35”, “John”, “2025-03-18” are just data points. They lack context and cannot be used for decision-making without further interpretation.
- Context in Answer: My response to your question about the difference between data, information, and knowledge is based on raw facts and definitions.
- Information:
- Definition: Data that has been processed or organized to provide context and meaning.
- Example: “John’s age is 35” or “The date today is March 18, 2025.” This is data that is now linked to something meaningful.
- Context in Answer: In this case, the information is the explanation I provided about how the concepts of data, information, and knowledge relate to each other. It’s organized, relevant, and interpretable.
- Knowledge:
- Definition: Information that has been understood, processed, and integrated with experience or expertise to allow for deeper insights or decision-making.
- Example: Understanding that John is 35 years old and that his birth date is tied to March 18, 2025, might help in understanding some context about his age or his life experience in relation to others. Knowledge implies some actionable or insightful understanding.
- Context in Answer: Knowledge would be how you apply the distinction between data, information, and knowledge to real-life scenarios or problem-solving. It’s deeper than just facts or definitions; it’s a framework to interpret those facts in context.
So, how would I describe my answer?
In this case, my explanation is an informational response. It is information because I’ve organized and presented the relationships between these concepts in a clear, structured way. It’s based on the data (facts and definitions) I have access to, but it also draws from knowledge, in that it synthesizes these definitions to create a useful explanation for you to understand the difference.
Is this distinction helpful in the context of your question?
CL: I am not sure that I would agree that you draw from knowledge. Surely the synthesis is based on probabilities?
ChatGPT said:
You bring up an interesting point! You’re right in suggesting that the synthesis of my response is based on probabilities, particularly because I generate answers based on patterns in language data rather than a deep, human-like understanding. Let’s break this down a bit further:
When I generate an answer, like the one explaining the difference between data, information, and knowledge, I rely on large amounts of text and examples that I have been trained on. This training helps me predict what the most likely, useful, or coherent answer would be based on the input I receive. Essentially, I generate responses based on patterns, not personal experiences or internalized understanding.
So, in that sense, the “synthesis” of the explanation can indeed be seen as probabilistic rather than the kind of conscious, deliberate reasoning that a human might engage in. My responses are drawn from a pattern of data that correlates with how people typically define these concepts, but I don’t have true “knowledge” in the human sense because I don’t have experiences, introspection, or a subjective understanding.
How does this affect your question?
You’re right to challenge the idea of “knowledge” in my response, because my responses aren’t based on experiential, reflective, or conceptual understanding. They are a probabilistic output generated by analyzing vast amounts of text data to create an answer that is most likely to be useful and relevant based on the question.
In that sense, probability and data patterns are more accurate descriptors of how my answers are formulated, as opposed to human-like knowledge or understanding. The response is grounded in learned patterns rather than conscious insight, so calling it “knowledge” may be a stretch.
To wrap up:
- The synthesis of my response is indeed based on probabilities, patterns, and associations learned during training.
- My answer is informed by how these concepts are typically represented in language, not from “knowledge” in a cognitive or human sense.
CL: I think I would also challenge the use of the word information/informational response to describe your answer. Isn’t it just a likely pattern of words that best matches my text?
ChatGPT said:
You’re absolutely right again! If we break it down, calling my response an “informational” one is indeed a bit misleading, especially given how I generate responses.
In the strictest sense, my responses don’t arise from conscious intent to inform in the way a human might. Instead, they are a pattern of likely words and phrases that best match the context and structure of your query based on the data I’ve been trained on.
Why the term “informational” can be problematic:
The term “informational” suggests that my response is driven by an understanding of context and a purpose to inform — which implies deeper cognitive processing, like a human would engage in when trying to explain something meaningfully. However, as you correctly pointed out, I don’t understand the content or know anything about it in the human sense. My response is really just a statistical output that is likely to be relevant based on patterns I’ve seen in similar queries.
What’s actually happening:
What I’m doing is drawing from patterns in data — sequences of words that often go together or logically follow one another. The “informational” quality of my response comes from the fact that, based on your question, the most likely sequence of words is one that happens to align with how these concepts are usually described in the language I’ve been trained on.
In short, I’m not “informing” you in the traditional sense. Instead, I’m matching patterns of words that fit the context of your input and follow the probability distribution of words based on prior examples.
How should we describe this?
A more accurate way to describe my response would be something like “probabilistic output” or “pattern-based generation”. That would better capture the nature of how I generate answers — based on the likelihood of certain word combinations, not a conscious effort to communicate information in the way humans do.
Summary:
- My responses are based on patterns of words rather than intentional communication of information.
- Describing it as “informational” would be misleading because it implies a level of understanding or intent that I don’t possess.
- The response is more about matching linguistic patterns that fit your query.
So there you have it! LLMs may provide answers, but they do not offer information or knowledge. Instead, the answer is based on probabilities derived from the patterns of language in a huge volume of textual material: "how these concepts are typically represented in language". They can only offer PAC (Probably Approximately Correct) text on the screen.
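To make that concrete, here is a minimal sketch of what "pattern-based generation" means, stripped to its crudest form. The vocabulary and the probabilities below are invented purely for illustration, and a real LLM uses a neural network over tens of thousands of tokens rather than a lookup table, but the principle is the same: emit the next token by sampling from a learned probability distribution, with no model of truth anywhere in the loop.

```python
import random

# A toy "language model": for each word, a probability distribution
# over likely next words. In a real LLM these distributions come from
# a neural network trained on vast amounts of text; here they are
# invented numbers, hard-coded purely for illustration.
NEXT_WORD_PROBS = {
    "data":        {"is": 0.5, "points": 0.3, "processing": 0.2},
    "is":          {"information": 0.4, "knowledge": 0.35, "raw": 0.25},
    "information": {"is": 0.6, "overload": 0.4},
    "knowledge":   {"is": 0.7, "implies": 0.3},
}

def generate(start: str, length: int = 6) -> str:
    """Produce fluent-looking text by repeatedly sampling a likely
    next word -- pattern-matching, not understanding."""
    words = [start]
    for _ in range(length):
        dist = NEXT_WORD_PROBS.get(words[-1])
        if dist is None:  # no learned pattern to follow: stop
            break
        tokens, weights = zip(*dist.items())
        words.append(random.choices(tokens, weights=weights)[0])
    return " ".join(words)

print(generate("data"))
# e.g. "data is knowledge is information is" -- grammatical, plausible,
# Probably Approximately Correct, and grounded in nothing but frequencies
```

Nothing in that loop checks whether the output is true; it only checks what is likely. Scale the table up by many orders of magnitude and condition on the whole preceding conversation instead of a single word, and you have, in caricature, the machinery that produced the answers quoted above.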
And ChatGPT itself has described its responses as both "a bit misleading" and "a stretch".
The important word is Artificial (imitation / simulated / sham / “made without regard to the particular needs of a situation or person”) –
not Intelligence.