Feedforward Neural Network
The simplest neural network architecture: information flows in one direction, from input to output, with no cycles or loops.
Structure
Consists of an input layer, one or more hidden layers, and an output layer. Each layer is fully connected to the next. Each neuron computes a weighted sum of its inputs, typically plus a bias term, and passes the result through a nonlinear activation function.
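The layer-by-layer computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the layer sizes, the random weights, the bias terms, and the choice of ReLU as the activation are all assumptions made for the example, and the names relu and feedforward are hypothetical.

```python
import numpy as np

def relu(x):
    # Rectified linear unit, one common choice of activation function.
    return np.maximum(0.0, x)

def feedforward(x, layers):
    """Pass x through a list of (W, b) layers in order.

    ReLU is applied after every layer except the last, which is left linear.
    """
    h = x
    for i, (W, b) in enumerate(layers):
        z = h @ W + b  # weighted sum of the inputs plus a bias
        h = relu(z) if i < len(layers) - 1 else z
    return h

rng = np.random.default_rng(0)
# Assumed sizes for illustration: 3 inputs -> 4 hidden units -> 2 outputs.
layers = [
    (rng.normal(size=(3, 4)), np.zeros(4)),
    (rng.normal(size=(4, 2)), np.zeros(2)),
]
x = np.array([1.0, -2.0, 0.5])
y = feedforward(x, layers)
print(y.shape)  # (2,)
```

Because each layer only reads the output of the previous one, a single loop over the layers is enough; there is no state carried between calls.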
In Transformers
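Every transformer block contains a feedforward network (FFN) that is applied independently, with the same weights, to each token position after the attention operation. These FFN layers are thought to be where much of the model's 'knowledge' is stored. Their inner layer is typically 4x as wide as the model's hidden dimension (for example, 2048 versus 512 in the original Transformer).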
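A rough sketch of such a position-wise FFN, again in NumPy, may make the "applied independently to each token position" part concrete. The sizes (d_model = 8, seq_len = 5), the random weights, and the ReLU activation are assumptions for illustration; many models use other activations. The function name position_wise_ffn is hypothetical.

```python
import numpy as np

def position_wise_ffn(x, W1, b1, W2, b2):
    """x has shape (seq_len, d_model); each row is one token position.

    The same weights are used for every row, so positions do not interact.
    """
    hidden = np.maximum(0.0, x @ W1 + b1)  # expand to d_ff, then ReLU
    return hidden @ W2 + b2                # project back to d_model

d_model = 8           # assumed hidden dimension
d_ff = 4 * d_model    # inner layer is 4x the hidden dimension
seq_len = 5           # assumed number of token positions

rng = np.random.default_rng(0)
W1 = rng.normal(size=(d_model, d_ff)) * 0.1
b1 = np.zeros(d_ff)
W2 = rng.normal(size=(d_ff, d_model)) * 0.1
b2 = np.zeros(d_model)

x = rng.normal(size=(seq_len, d_model))
out = position_wise_ffn(x, W1, b1, W2, b2)

# Running a single position on its own gives the same result as its row
# in the full output, which is what "applied independently" means here.
single = position_wise_ffn(x[2:3], W1, b1, W2, b2)
assert np.allclose(out[2], single[0])
```

The output has the same shape as the input, (seq_len, d_model), so the block can be stacked with attention layers without any reshaping.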