A neural network is a class of computational models designed to recognize patterns, approximate functions, and make decisions by adjusting connections between simple processing units. Inspired by the organization of biological brains, artificial neural networks (ANNs) represent information as layers of interconnected nodes (sometimes called neurons) and adapt their connection strengths through experience or training.

Core structure and components

Typical neural networks consist of an input layer, one or more hidden layers, and an output layer. Each node computes a weighted sum of its inputs and passes the result through a non-linear activation function. Key elements include weights, biases, activation functions (such as sigmoid, ReLU, or tanh) and a loss function that quantifies the difference between predicted and target outputs.

Learning and training

Networks learn by adjusting weights to reduce loss on training data. Supervised learning uses labeled examples and gradient-based optimization, most commonly backpropagation combined with stochastic gradient descent or its variants. Unsupervised, semi-supervised and reinforcement learning are alternative paradigms for different kinds of tasks.

Types and variations

  • Feedforward networks (simple multilayer perceptrons)
  • Convolutional neural networks (CNNs) for spatial data like images
  • Recurrent neural networks (RNNs) and transformers for sequences
  • Autoencoders and generative models for representation learning

These architectures are often combined and adapted to specific domains.

History and development

The field began with early neuron models and perceptrons in mid-20th century and advanced through the rediscovery of backpropagation and the scaling of models, data, and compute, which enabled deep learning breakthroughs in the 21st century.

Applications and limitations

Neural networks power image and speech recognition, language models, recommendation systems, scientific modeling and many other applications. Limitations include need for large labeled datasets, sensitivity to adversarial inputs, interpretability challenges and computational cost. Responsible deployment requires attention to bias, robustness and resource use.

Notable distinctions: neural networks are a subset of machine learning and differ from rule-based systems by learning representations from data rather than relying on hand-coded rules.