What is a Neural Network?
A computing system inspired by the biological neural networks of the human brain. Neural networks are the fundamental building block of modern artificial intelligence, powering everything from image recognition to language models.
The Biological Inspiration
Your brain contains roughly 86 billion neurons, each connected to thousands of others through synapses. When you see a face, hear a voice, or recall a memory, electrical signals travel across these vast networks of neurons, strengthening some connections and weakening others.
An artificial neural network mimics this concept digitally. Instead of biological neurons, it uses mathematical functions called nodes (or artificial neurons). Instead of synapses, it uses numerical weights. And instead of electrical signals, it passes numbers from one layer of nodes to the next.
Think of it this way:
A neural network is like an assembly line of decision-makers. Each worker (neuron) receives information, applies their own judgment (weights and bias), and passes a decision onward. Individually, each worker is simple. Together, they can solve extraordinarily complex problems.
Anatomy of a Neural Network
Every neural network has three types of layers.
The Three Types of Layers
Input Layer
Receives the raw data. Each node represents one feature of the input. For an image, each node might hold the brightness of one pixel. For tabular data, each node holds one column value.
Hidden Layer(s)
Where the learning happens. Each hidden neuron computes a weighted sum of its inputs, adds a bias, then passes the result through an activation function. Networks with many hidden layers are called "deep" -- hence deep learning.
Output Layer
Produces the final result. For classification, each node represents a class (e.g., "cat" or "dog"). For regression, a single node outputs a continuous value. The output format depends on the task.
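To make the three layers concrete, here is a minimal sketch in plain Python of the parameters a tiny network would hold. The layer sizes (4 inputs, 3 hidden neurons, 2 outputs) are invented for illustration; note that weights belong to the connections between layers, while only the hidden and output neurons carry biases.

```python
import random

n_input, n_hidden, n_output = 4, 3, 2  # illustrative layer sizes

# One weight per connection between adjacent layers; one bias per neuron
# that actually computes something (hidden and output layers only).
w_hidden = [[random.uniform(-1, 1) for _ in range(n_input)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_output = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_output)]
b_output = [0.0] * n_output

total_params = (n_input * n_hidden + n_hidden      # hidden weights + biases
                + n_hidden * n_output + n_output)  # output weights + biases
print(total_params)  # 4*3 + 3 + 3*2 + 2 = 23
```

Counting parameters this way shows why networks grow so quickly: every new neuron adds a weight for each neuron in the previous layer.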
The Building Blocks: Weights, Biases, and Activation Functions
Every neuron relies on three critical mathematical components.
Weights
Each connection between neurons has a weight -- a number that determines how much influence one neuron has on the next. A large weight means a strong connection; a weight near zero means almost no influence. Learning is fundamentally the process of finding the right weights.
Biases
Each neuron has a bias -- an extra number added after the weighted sum. Think of it as the neuron's "default inclination." It allows the network to shift the activation function, giving it more flexibility to fit the data.
Activation Functions
After computing the weighted sum plus bias, the neuron passes the result through an activation function. This introduces non-linearity, allowing the network to learn complex patterns. Without activation functions, a neural network could only compute linear functions of its input, no matter how many layers it had: stacked linear layers collapse into a single linear transformation.
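Putting the three building blocks together, a single neuron's computation fits in a few lines of plain Python. The input values, weights, and bias below are made up for illustration:

```python
def relu(x):
    # ReLU activation: negative inputs become zero, positives pass through.
    return max(0.0, x)

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, plus the bias, then the activation.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(z)

# Weighted sum: 0.5*0.4 + (-1.0)*0.3 + 2.0*0.2 + 0.1, which is about 0.4
print(neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.2], 0.1))
```

Everything a trained network "knows" lives in those weights and biases; the function itself never changes.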
Common Activation Functions
ReLU (Rectified Linear Unit)
The most widely used. If the input is negative, output zero; otherwise, pass it through. Simple, fast, and effective.
Sigmoid
Squashes values between 0 and 1. Useful for binary classification outputs (probability of yes/no).
Tanh
Squashes values between -1 and 1. Centers data around zero, which can help training converge faster.
Softmax
Converts a vector of numbers into probabilities that sum to 1. Used in the output layer for multi-class classification.
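All four functions can be written with only the standard library. A minimal sketch:

```python
import math

def relu(x):
    return max(0.0, x)                  # negative -> 0, positive passes through

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))   # squashes to (0, 1)

def tanh(x):
    return math.tanh(x)                 # squashes to (-1, 1), zero-centered

def softmax(values):
    # Subtract the max before exponentiating for numerical stability,
    # then normalize so the outputs sum to 1.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(relu(-2.0), sigmoid(0.0), tanh(0.0))  # 0.0 0.5 0.0
print(sum(softmax([1.0, 2.0, 3.0])))        # sums to 1 (within float error)
```

The max-subtraction trick in softmax does not change the result mathematically, but it prevents overflow when the inputs are large.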
How Neural Networks Learn
A neural network learns by repeatedly adjusting its weights and biases to minimize errors. This process involves two key phases.
1. Forward Pass
Data flows from input to output. Each neuron computes its weighted sum, applies the activation function, and passes the result forward. At the end, the network produces a prediction.
Input → Weights → Activation → ... → Output
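The forward pass above can be sketched end-to-end in plain Python. The layer sizes and weights here are invented for illustration: two input features, one hidden layer of two ReLU neurons, and a single sigmoid output node.

```python
import math

def layer(inputs, weights, biases, activation):
    # Each row of `weights` belongs to one neuron in this layer.
    return [activation(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]

relu = lambda x: max(0.0, x)
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))

x = [1.0, 0.5]                                        # input layer: 2 features
hidden = layer(x, [[0.2, -0.4], [0.7, 0.1]], [0.0, -0.1], relu)
output = layer(hidden, [[0.5, 0.5]], [0.0], sigmoid)  # single output node
print(output)  # one prediction between 0 and 1
```

Because sigmoid squashes its input to (0, 1), the final number can be read directly as a probability, which is what the output layer of a binary classifier does.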
2. Backpropagation
The prediction is compared to the correct answer using a loss function. The error is then propagated backward through the network, layer by layer. Using calculus (the chain rule), the network calculates how much each weight contributed to the error and adjusts it accordingly.
Error → Gradients → Update Weights → Repeat
This forward-backward cycle repeats thousands or millions of times during training, and each cycle makes the network slightly more accurate. One full pass over the entire training dataset is called an "epoch." The learning rate controls how big each adjustment is -- too large and the network overshoots; too small and training takes forever.
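The whole cycle can be seen in a toy example: a single linear neuron learning the relationship y = 2x with squared-error loss and gradient descent. Every number here (dataset, learning rate, epoch count) is illustrative, not a recipe:

```python
# Toy dataset: the target relationship is y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w, b = 0.0, 0.0           # start from arbitrary parameters
lr = 0.05                 # learning rate: step size for each update

for epoch in range(200):  # one epoch = one pass over the whole dataset
    for x, y in data:
        pred = w * x + b              # forward pass
        error = pred - y              # compare prediction to the answer
        # Backward pass: gradients of the squared error (pred - y)^2
        # are 2 * error * x for w, and 2 * error for b (chain rule).
        w -= lr * 2 * error * x
        b -= lr * 2 * error

print(round(w, 2), round(b, 2))  # w approaches 2, b approaches 0
```

Try setting `lr = 1.0` to watch the parameters overshoot and diverge, or `lr = 0.0001` to see why a tiny learning rate makes training crawl.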
Types of Neural Networks
Different architectures are designed for different types of data and tasks.
Feedforward (FNN)
The simplest type. Data flows in one direction -- input to output, no loops. Good for tabular data and basic classification tasks.
Convolutional (CNN)
Specialized for grid-like data such as images. Uses convolutional filters to detect spatial features like edges, textures, and objects.
Recurrent (RNN)
Processes sequential data by maintaining a "memory" of previous inputs. Used for time series and language, though largely replaced by Transformers.
Transformer
Uses self-attention to process all inputs in parallel. The architecture behind GPT, BERT, Claude, and virtually all modern large language models.
Real-World Applications
Image Recognition
CNNs power facial recognition, medical image diagnosis, autonomous vehicles, and quality inspection in manufacturing.
Language Understanding
Transformer-based networks enable chatbots, translation services, search engines, and AI writing assistants.
Voice and Audio
Neural networks power speech recognition (Siri, Alexa), voice synthesis, music generation, and audio transcription.
Scientific Discovery
Neural networks predict protein structures (AlphaFold), discover new drugs, model climate patterns, and simulate physical systems.
Ready for the Full Deep Dive?
This lexicon entry covers the fundamentals. For a comprehensive guide with visual walkthroughs, mathematical details, and code examples, read our full neural networks article.
Read the Full Neural Networks Guide →