Kinds of AI

Motivation

The field of AI contains algorithms that look very different from one another — logical rule systems, neural networks, search procedures, bandit algorithms. Before studying any of them in depth it is useful to understand the landscape: how the major paradigms differ in what they assume, what problems they are suited to, and how they relate. This article organizes that landscape along three axes — the degree to which behavior is specified versus learned, the amount of internal state an agent maintains, and the source of the learning signal.

The Spectrum from Rules to Learning

Artificial intelligence systems can be organized along a spectrum from fully hand-specified behavior to fully learned behavior (Russell and Norvig 2020).

Symbolic AI encodes knowledge as explicit symbols and rules: logical axioms, production rules, ontologies. The system’s behavior is directly inspectable and provably correct relative to its rules. Symbolic systems excel when the problem is well formalized and rules can be written down explicitly. Classical examples include production systems with forward- and backward-chaining reasoners; planning systems; and SAT solvers based on the Davis-Putnam-Logemann-Loveland (DPLL) procedure.
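
To make the rule-based style concrete, here is a minimal sketch of forward chaining over propositional Horn rules in Python; the facts and rules are invented for illustration. The loop fires any rule whose premises are all known, repeating until no new facts appear (a fixed point).

    # Forward chaining over propositional Horn rules (illustrative rules/facts).
    rules = [
        ({"sparrow"}, "bird"),                 # IF sparrow THEN bird
        ({"bird", "healthy"}, "can_fly"),      # IF bird AND healthy THEN can_fly
        ({"can_fly", "hungry"}, "hunts"),
    ]
    facts = {"sparrow", "healthy", "hungry"}

    changed = True
    while changed:                             # repeat until a fixed point
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)          # fire the rule
                changed = True

    print(sorted(facts))
    # ['bird', 'can_fly', 'healthy', 'hungry', 'hunts', 'sparrow']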

Learning-based AI derives its behavior from data. Rather than specifying how to behave, the designer specifies a model family and an objective; the system learns its behavior by adjusting parameters to minimize that objective on training examples. Neural networks are the dominant representation: a multilayer perceptron is a composition of affine maps and nonlinearities with learnable weights. The strength of this approach is that it handles raw perception and irregular domains where rules are hard to write; the weakness is that behavior depends on the training distribution and is often opaque.
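
The phrase “composition of affine maps and nonlinearities” translates directly into code. Below is a minimal two-layer perceptron in NumPy; the layer sizes and random weights are arbitrary choices for illustration, and no training is shown.

    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(16, 4)), np.zeros(16)   # affine map: R^4 -> R^16
    W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)    # affine map: R^16 -> R^1

    def relu(z):
        return np.maximum(z, 0.0)                     # elementwise nonlinearity

    def mlp(x):
        h = relu(W1 @ x + b1)                         # affine map, then nonlinearity
        return W2 @ h + b2                            # final affine map

    print(mlp(rng.normal(size=4)))                    # one forward pass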

Hybrid / neurosymbolic AI combines both. A learned component handles perception or low-level control; a symbolic component handles planning or constraint satisfaction. AlphaGo is a canonical example: neural networks evaluate positions and guide exploration while Monte Carlo tree search handles planning. Formal verification applied to neural networks is another example.
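
The division of labor can be shown with a toy: symbolic game-tree search whose leaf evaluations come from a learned model. Here the game is Nim (take 1-3 stones; taking the last stone wins) and the “learned” evaluator is a stub standing in for a trained value network; AlphaGo’s actual combination of policy and value networks with Monte Carlo tree search is far more elaborate.

    # Hybrid sketch: depth-limited negamax (symbolic) + value estimate (learned).
    def evaluate(stones):
        # Placeholder for a trained value network's estimate of a position.
        return 0.0

    def negamax(stones, depth):
        if stones == 0:
            return -1.0                    # opponent took the last stone and won
        if depth == 0:
            return evaluate(stones)        # defer to the learned evaluator
        return max(-negamax(stones - take, depth - 1)
                   for take in (1, 2, 3) if take <= stones)

    print(negamax(10, depth=4))            # value estimate for the player to move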

Agent Types

A finer decomposition categorizes agents by how much internal state they maintain and what they optimize (Russell and Norvig 2020):

  • Simple reflex agent. Maps observations directly to actions via condition-action rules. No memory. Correct only in fully observable environments.
  • Model-based reflex agent. Maintains an internal state that tracks world evolution. Applies rules to that state. Works in partially observable environments.
  • Goal-based agent. Selects actions that lead to goal states. Requires search or planning.
  • Utility-based agent. Selects actions that maximize expected utility — a scalar measure of outcome quality. Handles tradeoffs among goals and uncertainty.
  • Learning agent. Has a learning element that improves the performance element over time, a critic that evaluates current performance, and a problem generator that suggests exploration.

The agent types are nested: a utility-based agent generalizes goal-based agents; a learning agent can wrap any of the others.
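
A sketch of the first two types makes the role of internal state visible; the vacuum-world percepts, actions, and rules below are invented for the example.

    # Simple reflex: condition-action rules over the current percept only.
    def simple_reflex_agent(percept):
        return "suck" if percept == "dirty" else "move_right"

    # Model-based reflex: internal state lets it act under partial observability.
    class ModelBasedReflexAgent:
        def __init__(self):
            self.clean = set()             # model: squares known to be clean

        def act(self, percept, location):
            if percept == "dirty":
                return "suck"
            self.clean.add(location)       # update the model from the percept
            # Head toward a square the model does not yet know to be clean.
            return "move_left" if (location + 1) in self.clean else "move_right"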

Supervised, Unsupervised, and Reinforcement Learning

Within learning-based AI, the three canonical paradigms differ in what signal the learner receives (a sketch contrasting all three follows the list):

  • Supervised learning. Training provides input-output pairs \((x, y)\). The agent learns a function \(f: x \mapsto y\) by minimizing a loss on the training examples; success is measured by the loss on held-out data. Used for classification, regression, and translation.
  • Unsupervised learning. Only inputs \(x\) are provided. The agent learns structure: clusters, latent factors, generative models. Examples include expectation-maximization, autoencoders, and variational autoencoders.
  • Reinforcement learning. The agent acts in an environment and receives scalar reward signals. It learns from trial and error, without labeled examples of the correct action. The Markov decision process is the standard formalism.
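
Here is a compact sketch contrasting the three signals, with all numbers and problem sizes invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    # Supervised: a labeled pair (x, y) drives a gradient step on squared loss.
    w, x, y = 0.0, 2.0, 5.0
    w -= 0.1 * 2 * (w * x - y) * x                     # d/dw (w*x - y)^2

    # Unsupervised: inputs only; one k-means assignment/update step.
    X = rng.normal(size=(20, 2))
    centers = X[:2].copy()
    assign = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([X[assign == k].mean(axis=0) for k in (0, 1)])

    # Reinforcement: a scalar reward after acting; tabular Q-learning update.
    Q = np.zeros((3, 2))                               # 3 states, 2 actions
    s, a, r, s_next = 0, 1, 1.0, 2                     # one observed transition
    Q[s, a] += 0.1 * (r + 0.9 * Q[s_next].max() - Q[s, a])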

A fourth category, self-supervised learning, uses the structure of unlabeled data to construct surrogate supervised tasks. Masked-token prediction in BERT and next-token prediction in GPT are self-supervised objectives — technically supervised (the label comes from the data itself) but requiring no human annotation.
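
The way self-supervision manufactures labels is easy to show. The snippet below builds both kinds of training pair from a raw sentence; the whitespace tokenizer is a stand-in for a real subword tokenizer.

    text = "the cat sat on the mat"
    tokens = text.split()                  # naive tokenization, for illustration

    # Next-token prediction (GPT-style): each prefix predicts the next token.
    causal_pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

    # Masked-token prediction (BERT-style): hide a token, predict it from context.
    i = 2
    masked_pair = (tokens[:i] + ["[MASK]"] + tokens[i + 1:], tokens[i])

    print(causal_pairs[0])  # (['the'], 'cat')
    print(masked_pair)      # (['the', 'cat', '[MASK]', 'on', 'the', 'mat'], 'sat')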

References

Russell, Stuart, and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach. 4th ed. Pearson. https://www.pearson.com/en-us/subject-catalog/p/artificial-intelligence-a-modern-approach/P200000003500/9780137505135.