Why does viewing AI as a model help us avoid so many detours?

Understanding the nature of AI models helps us avoid unnecessary misunderstandings and detours.
Core content:
1. Why large language models are misunderstood as "talkers" or "experts"
2. The role and nature of mathematical models in simplifying the complex world
3. An analysis of the structure and biases of large language models as parametric modeling tools
This title may seem a bit strange to many readers:
“Isn’t AI (here specifically referring to the large language model) just a mathematical model? Is it necessary to talk about something so obvious?”
This is necessary because we often forget the essence of large language models.
In today's context, because of the enormous success of large language models, we tend to regard them as "talking people", "omniscient experts", or even "awakened consciousnesses".
We begin to endow them with will, emotion, and judgment, and we begin to fear being replaced, manipulated, or losing control. But in fact, a large language model is nothing more than an enormous mathematical function, a high-dimensional model of language behavior.
I do not think this perspective is cold; rather, it is a rational, restrained, and pragmatic way of thinking. This article explores that point.
Models are simplified expressions of a complex world
The role of a mathematical model is to simplify reality without compromising its value for decision making.
From differential equations in classical physics, to optimization models in economics, to network models in the social sciences, their common feature is a formal, structured expression of a complex system that captures the key relationships needed for deduction and intervention.
Every mathematical model is a tool defined by purpose, structure, and simplification:
It has a specific modeling goal (explanation, prediction, or optimization);
It selects some variables to express and necessarily ignores other factors;
It establishes relationships between the variables in the form of logical rules or statistical patterns.
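To make these three elements concrete, here is a deliberately tiny sketch (the data and variable names are invented for illustration): a linear regression whose goal is prediction, which selects a single explanatory variable and ignores everything else, and which expresses the relationship as a statistical pattern estimated from data.

```python
import numpy as np

# Invented toy data: advertising spend (x) versus sales (y).
# The model deliberately ignores every other factor that drives sales.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Structure: assume a linear relationship y ≈ a*x + b (a simplification).
a, b = np.polyfit(x, y, deg=1)

# Purpose: prediction within the range the data supports.
print(f"fitted model: y = {a:.2f}*x + {b:.2f}")
print("prediction at x = 6:", a * 6 + b)  # extrapolating beyond the data is already riskier
```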
In this sense, a large language model (LLM) is a parametric modeling tool for human language behavior. It is not "a person thinking"; it is "a model fitting" the probabilities and structural logic of how language unfolds.
The structural nature of large language models
We can abstract the language model into a mathematical function:
The input is a prompt composed of natural language, and the output is the stretch of language the model "considers" most likely to continue it. The core of this function is a parameter space: a neural network made up of hundreds of billions of weights, tuned to maximize the "likelihood" of the generated language by optimizing a loss function.
This is like an extremely complex regression or classification model, except that the thing being predicted is not a numerical value or a label but the most plausible next word, sentence, or even paragraph. The model does not know what it is saying; it is merely reproducing the "co-occurrence probabilities" present in the language data it was fitted to.
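A deliberately tiny sketch can make this "function" view concrete. Nothing below resembles a real LLM implementation; the vocabulary, the bigram score table standing in for the parameter space, and the greedy decoding loop are all invented for illustration. It shows only the shape of the computation: turn learned co-occurrence scores into a probability distribution over the next word and emit the most likely continuation.

```python
import math

# Toy "parameters": bigram co-occurrence scores from an imaginary corpus.
# A real LLM replaces this table with a neural network holding hundreds of
# billions of learned weights.
BIGRAM_SCORES = {
    "the": {"cat": 2.0, "mat": 1.0, "sat": 0.1},
    "cat": {"sat": 2.5, "the": 0.2, "mat": 0.3},
    "sat": {"on": 3.0, "cat": 0.1, "the": 0.5},
    "on":  {"the": 2.8, "mat": 0.4, "cat": 0.2},
    "mat": {"the": 0.5, "sat": 0.3, "cat": 0.2},
}

def next_word_distribution(last_word: str) -> dict[str, float]:
    """Softmax over the scores: the 'likelihood' of each candidate next word."""
    scores = BIGRAM_SCORES[last_word]
    z = sum(math.exp(s) for s in scores.values())
    return {w: math.exp(s) / z for w, s in scores.items()}

def generate(prompt: str, steps: int = 4) -> str:
    """Greedy decoding: repeatedly append the highest-probability continuation."""
    words = prompt.split()
    for _ in range(steps):
        probs = next_word_distribution(words[-1])
        words.append(max(probs, key=probs.get))
    return " ".join(words)

print(generate("the cat"))  # prints "the cat sat on the cat": fluent-looking,
                            # yet the model has no idea what a cat is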
When we say that large language models "make up stories", "have logical loopholes", or "talk nonsense with a straight face", we are actually describing the typical symptoms of a failure in approximate function fitting: the input deviates from the training distribution, the objective function contains no logical constraints, and there is no mechanism for verification against the real world.
Errors are model biases, not cognitive failures
Why do we emphasize that "it is just a model"? Because doing so helps us correctly identify the sources of its errors instead of misreading them as problems of intention, ability, or morality.
Here are some common examples:
Factual errors: it is not that the model is "lying", but that its training data is inconsistent with reality, or the correct contextual patterns were not activated;
Incoherent logic: it is not that the model is "confused", but that the objective function never required logical consistency;
Semantic vagueness: it is not that the model is "being evasive", but that language itself is highly ambiguous, and the model can only choose the highest-probability interpretation from that statistical ambiguity.
These are not failures of "intelligence" but deviations of "modeling". Traditional mathematical modeling runs into the same problems: "residuals", "extrapolation failure", and "overfitting". Language models simply carry these errors into text generation.
Using "model error" rather than "thinking defects" to explain the performance of language models is the premise for us to remain rational and control risks.
Prompts as boundary conditions
The key to mathematical modeling lies not only in the model structure itself, but also in how boundary conditions and input controls are imposed. The prompt of a large language model plays exactly this role:
It controls the initial state;
It activates a region of the contextual space;
It fine-tunes the direction of the model's output.
This means that when we use large language models, we are essentially doing a kind of "interactive modeling":
Every prompt you enter is actually "setting boundary conditions";
The examples and format you provide are "constraining the solution space";
Your corrections and feedback on the output are "optimizing the objective function".
This requires us to understand the model's structure, adjust how we feed it input, and evaluate the boundaries of its output, just as we would when operating a complex control system.
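As an illustration of this "interactive modeling" view, here is a small sketch of assembling a prompt the way one sets up boundary conditions. The function and field names are invented and not tied to any particular model or API: the role and task constrain the initial state, the examples constrain the solution space, and the format requirement steers the output direction.

```python
def build_prompt(task: str, examples: list[tuple[str, str]], output_format: str) -> str:
    """Assemble a prompt as a set of 'boundary conditions' on the model."""
    parts = [
        # Initial state: who the model should act as and what the task is.
        f"You are a careful assistant. Task: {task}",
        # Constrained solution space: examples narrow the range of acceptable answers.
        *(f"Example input: {q}\nExample output: {a}" for q, a in examples),
        # Output direction: an explicit format requirement.
        f"Answer strictly in this format: {output_format}",
    ]
    return "\n\n".join(parts)

prompt = build_prompt(
    task="Classify the sentiment of a product review.",
    examples=[("Great battery life!", "positive"), ("Broke after two days.", "negative")],
    output_format="a single word: positive, negative, or neutral",
)
print(prompt)
```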
“Assumptions of use” must be made explicit
An important but often overlooked issue is that before using a model, you must set out the assumptions that underlie its use.
In traditional modeling, we explicitly assume premises such as "linear relationship", "variable independence", and "controllable observation errors".
Similarly, when we use language models for tasks such as writing, translation, question answering, or decision support, we must also spell out the corresponding assumptions (a small sketch of how they might be recorded follows this list):
Does the model’s output require manual review or secondary verification?
Should the model be applied only in specific contexts, such as non-critical decision-making scenarios?
Does the task require factual accuracy or logical consistency as a basis?
Does the model carry value biases or ethical risks that call for additional constraint mechanisms?
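One possible way to make such assumptions operational, offered only as a sketch (the fields and the rule below are illustrative, not a prescribed standard), is to record them explicitly and check them before a model's output is allowed to flow downstream:

```python
from dataclasses import dataclass

@dataclass
class UsageAssumptions:
    """Explicit 'assumptions of use' attached to one LLM-assisted task."""
    task: str
    requires_human_review: bool      # must a person verify the output?
    critical_decision: bool          # is the downstream decision high-stakes?
    requires_factual_accuracy: bool  # does the task depend on facts being right?
    extra_constraints: list[str]     # e.g. ethical or policy guardrails

def may_use_output_directly(a: UsageAssumptions) -> bool:
    """Only let unreviewed output through when the stated assumptions allow it."""
    return not (a.requires_human_review or a.critical_decision or a.requires_factual_accuracy)

draft_email = UsageAssumptions(
    task="draft a routine internal email",
    requires_human_review=False,
    critical_decision=False,
    requires_factual_accuracy=False,
    extra_constraints=[],
)
print(may_use_output_directly(draft_email))  # True: low stakes, no factual dependence
```

The point is not these particular fields but the habit: the assumptions exist in writing before the model is used, so that review requirements are a property of the task rather than an afterthought.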
Only after clarifying these assumptions can we use the large language model as a "controllable system". Otherwise, we easily fall into one of two misunderstandings:
The first is “technological faith”, which deifies the model, endows it with capabilities it does not possess, and even equates it with expert judgment.
The second is "technological nihilism", which completely denies the value of language models because of their errors and misses the efficiency improvement brought by human-computer collaboration.
Both attitudes are irrational. The rational attitude is to set clear boundaries for the model based on an understanding of its nature, so that it operates within an expected framework.
De-anthropomorphizing is a necessary awakening
Much of the current public misunderstanding of large language models stems from an anthropomorphic style of narration. We say the model "understands", "knows", "remembers", "thinks", even "has biases" or "has consciousness"; all of these are semantically misleading analogies.
A language model has no real "understanding"; it only captures a degree of coherence through the statistical regularities of language.
It has no "knowledge"; it only stores and compresses a large number of textual correlations and can invoke those patterns to generate content.
Still less does it have "consciousness" or "judgment"; it can neither understand the intentions behind human behavior nor bear the consequences of its own actions.
Anthropomorphizing it only leads to inflated expectations, cognitive mismatch, and ethical confusion. When we return to the perspective of the "mathematical model", we can answer three key questions more precisely:
What did it do right?
What did it do wrong?
How can it be improved or controlled?
In other words, only when we strip away the projection of human will can we see a clear "functional body": a complex and powerful generator that can help us process information, produce language, and improve efficiency, but one that must operate within a controlled range.
The first step toward rationality is clarity of language. The large language model is not a "who" but an "it": not a conscious being but a body of parameters; not a moral subject but a functional structure.
What we need to do is not worship it, fear it, or personify it, but understand it, control it, and tune it.
The large language model is a groundbreaking technology, but our attitude toward it should be correspondingly mature, restrained, and rational.
Viewing it as a mathematical model is not meant to belittle its capabilities, but to help us make better use of them.
The strengths of a model are that its structure is explicit, analyzable, and adjustable; its limitations are that it is constrained by data, assumptions, and expressive power. Both apply equally to large language models.
In the coming era of artificial intelligence, we will face ever more complex model systems. Whether we possess the mindset to "model" them will determine whether we control our tools or are controlled by them.