Overview

FLOPS stands for Floating Point Operations per Second. It is a unit of throughput that expresses how many floating‑point arithmetic operations a processor or system can carry out in one second. Because many scientific, engineering and machine‑learning workloads rely heavily on floating‑point math, FLOPS has become a common way to compare raw numerical capability across CPUs, GPUs and entire clusters.

What it measures and common scales

FLOPS counts arithmetic operations on floating‑point numbers, such as additions and multiplications; a fused multiply‑add (FMA) is conventionally counted as two operations. Rates are quoted with the standard metric prefixes, for example:

  • MFLOPS: 10^6 (millions of) floating‑point operations per second
  • GFLOPS: 10^9 (billions)
  • TFLOPS: 10^12 (trillions)
  • PFLOPS: 10^15 (quadrillions)
  • EFLOPS: 10^18 (quintillions)
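
As a small illustration of these scales, the sketch below converts a raw operations‑per‑second figure into the largest prefix that fits. The helper name format_flops and the one‑decimal rounding are assumptions made for this example, not part of any standard.

```python
# Hypothetical helper: express a raw FLOPS figure with the largest fitting prefix.
PREFIXES = [
    (1e18, "EFLOPS"),
    (1e15, "PFLOPS"),
    (1e12, "TFLOPS"),
    (1e9, "GFLOPS"),
    (1e6, "MFLOPS"),
]


def format_flops(flops: float) -> str:
    """Return a human-readable string such as '19.5 TFLOPS'."""
    for scale, name in PREFIXES:
        if flops >= scale:
            return f"{flops / scale:.1f} {name}"
    # Below one million operations per second, no prefix is applied.
    return f"{flops:.1f} FLOPS"


print(format_flops(2.5e9))    # 2.5 GFLOPS
print(format_flops(1.95e13))  # 19.5 TFLOPS
```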

Precision, operations and counting

Different precisions (single, double, half, bfloat16) affect both throughput and numerical behavior. Processors often have separate FLOPS ratings for single‑precision and double‑precision operations, because hardware typically executes lower‑precision math faster. Counting conventions also vary: some measurements include complex operations, and vectorized instructions perform multiple floating‑point calculations in a single instruction, so a quoted figure depends on what is counted as one operation.
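
To make the effect of counting conventions concrete, the sketch below counts the operations in a dense matrix multiplication. It assumes the widely used 2·m·n·k convention, which slightly overcounts because each output element needs only k − 1 additions. The function name matmul_flops and its fma_counts_as parameter are hypothetical and exist only for this illustration.

```python
def matmul_flops(m: int, k: int, n: int, fma_counts_as: int = 2) -> int:
    """Approximate operation count for C = A @ B, with A (m x k) and B (k x n).

    Each of the m * n output elements needs about k multiply-add pairs.
    With fma_counts_as=2, each pair counts as two operations (one multiply,
    one add); with fma_counts_as=1, each fused multiply-add counts once.
    """
    return m * n * k * fma_counts_as


# The same 1024 x 1024 x 1024 multiplication under the two conventions.
print(matmul_flops(1024, 1024, 1024))                   # 2147483648
print(matmul_flops(1024, 1024, 1024, fma_counts_as=1))  # 1073741824
```

The factor of two between the results shows why a FLOPS figure is only comparable to another when both use the same counting rule.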

Measurement: peak vs sustained

Reported FLOPS figures commonly distinguish theoretical peak performance from sustained (measured) performance. Theoretical peak is derived from hardware specifications (number of cores, clock speed, vector width), while sustained performance is obtained by running benchmarks. Widely used benchmarks for reporting and ranking numerical throughput, such as LINPACK, stress matrix and linear‑algebra kernels; they reveal how memory, interconnects and software limit real‑world throughput.
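
The difference between the two figures can be sketched in a few lines. The first function multiplies out hardware specifications to get a theoretical peak; the second times a naive pure‑Python matrix multiplication and divides the operation count by the elapsed time. All names and the example hardware numbers are assumptions for illustration. The measured rate says more about the Python interpreter than about the hardware, which is itself a reminder of how much software limits sustained throughput.

```python
import time


def theoretical_peak_flops(cores: int, clock_hz: float,
                           flops_per_cycle_per_core: int) -> float:
    """Peak rate from specs: cores x clock x operations per cycle per core."""
    return cores * clock_hz * flops_per_cycle_per_core


def measure_sustained_flops(n: int = 60) -> float:
    """Time a naive n x n matrix multiply and return the achieved rate."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    # The product is computed only to generate work; its value is discarded.
    _ = [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)]
         for i in range(n)]
    elapsed = time.perf_counter() - start
    # Uses the 2 * n^3 counting convention for matrix multiplication.
    return 2 * n ** 3 / elapsed


# Hypothetical machine: 8 cores at 3.0 GHz, 16 operations per cycle per core.
peak = theoretical_peak_flops(8, 3.0e9, 16)
print(f"theoretical peak: {peak / 1e9:.0f} GFLOPS")  # 384 GFLOPS
print(f"sustained (naive Python): {measure_sustained_flops() / 1e6:.1f} MFLOPS")
```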

Applications and importance

High FLOPS ratings are important in high‑performance computing (HPC), scientific simulation, weather modeling and deep learning. Graphics processors and specialized accelerators often achieve very high FLOPS on parallel workloads. Architects design systems to balance FLOPS against memory bandwidth, latency and energy efficiency, because raw FLOPS alone does not guarantee that an application runs fast.
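
One simple way to reason about this balance is a roofline‑style bound, in which a kernel's attainable rate is capped either by peak FLOPS or by memory bandwidth multiplied by the kernel's arithmetic intensity (operations per byte moved). The sketch below is a simplified model under that assumption, and the two machines and their numbers are hypothetical.

```python
def attainable_flops(peak_flops: float, mem_bandwidth_bytes_per_s: float,
                     flops_per_byte: float) -> float:
    """Roofline-style upper bound: the lower of the compute and memory limits."""
    return min(peak_flops, mem_bandwidth_bytes_per_s * flops_per_byte)


# Two hypothetical machines with the same 10 TFLOPS peak but different
# memory bandwidth: 200 GB/s versus 800 GB/s.
PEAK = 1e13
machine_a_bw = 2e11
machine_b_bw = 8e11

# A memory-bound kernel performing 2 operations per byte moved.
print(attainable_flops(PEAK, machine_a_bw, 2.0))  # 400000000000.0
print(attainable_flops(PEAK, machine_b_bw, 2.0))  # 1600000000000.0

# A compute-bound kernel performing 100 operations per byte: both hit the peak.
print(attainable_flops(PEAK, machine_a_bw, 100.0))  # 10000000000000.0
print(attainable_flops(PEAK, machine_b_bw, 100.0))  # 10000000000000.0
```

In this model the two machines differ by a factor of four on the memory‑bound kernel despite identical peak FLOPS, and are indistinguishable on the compute‑bound one.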

Distinctions and limitations

FLOPS measures only floating‑point arithmetic throughput; it says nothing about instruction mix, integer performance, I/O, or software efficiency. Two machines with similar FLOPS ratings may perform differently on a particular application if one has superior memory bandwidth or better compiler support. For background, see general material on computers; to learn more about floating‑point arithmetic itself, consult materials on floating point.