Overview
A multiset, often called a bag in computer science, is a collection in which members may appear more than once. Unlike a conventional set, where each element is either present or absent, a multiset keeps track of how many copies of each element occur. Multisets appear throughout combinatorics, databases, programming languages and everyday contexts (for example, the items in a shopping bag, several of which may be identical), and they are treated as a formal object in mathematics and theoretical computer science.
Definition and notation
A multiset can be described in two equivalent ways. One view lists elements with repetition, for example {a, a, b, b, b, c}. The other uses a multiplicity function m that assigns to each possible element x a nonnegative integer m(x) indicating how many copies of x are present. In the example above, m(a)=2, m(b)=3 and m(c)=1. The multiplicity function is sometimes called the count, weight, or frequency of an element. Notation varies: people write multisets using braces with repetition, or as formal sums, or with a bar showing multiplicities. The term bag emphasizes unordered membership with counts rather than order of appearance.
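The two descriptions above can be checked against each other in code. The following is a minimal sketch, assuming Python's standard collections.Counter as a stand-in for the multiplicity function m; the helper name multiplicity is hypothetical and used only for illustration.

```python
from collections import Counter

# The multiset {a, a, b, b, b, c}, built from a listing with repetition.
bag = Counter(["a", "a", "b", "b", "b", "c"])

# The same multiset, built directly from its multiplicity function.
same_bag = Counter({"a": 2, "b": 3, "c": 1})


def multiplicity(ms, x):
    """Return m(x); an element that is absent has multiplicity 0."""
    return ms[x]


print(multiplicity(bag, "a"))  # 2
print(multiplicity(bag, "b"))  # 3
print(multiplicity(bag, "z"))  # 0, since z does not occur
print(bag == same_bag)         # True: both views describe one multiset
```

A Counter returns 0 for a missing key instead of raising an error, which matches the convention that m(x) = 0 for elements not in the multiset.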
Basic operations and relations
Operations on multisets generalize those on ordinary sets but must account for multiplicities. Common operations include:
- Union (multiset union): often defined so the multiplicity of an element is the maximum of its multiplicities in the operands.
- Sum (disjoint union or addition): adds multiplicities, useful when combining counts from distinct sources.
- Intersection: multiplicity is the minimum of the operand multiplicities.
- Difference: subtract multiplicities and floor at zero so counts do not become negative.
Which variant is used depends on context: algebraic constructions often use addition of multiplicities, while lattice-theoretic treatments favor max/min rules. A finite multiset can also be viewed as an unordered tuple, that is, a tuple in which repetition is kept but position is ignored; formal treatments compare these two viewpoints.
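The four operations listed above can be tried out directly. This is an illustrative sketch, assuming Python's collections.Counter, whose |, +, & and - operators follow the max, sum, min and floored-subtraction rules respectively; the multisets A and B are made-up examples.

```python
from collections import Counter

A = Counter({"a": 2, "b": 3})
B = Counter({"a": 1, "b": 4, "c": 1})

union = A | B         # maximum of multiplicities
total = A + B         # sum of multiplicities
intersection = A & B  # minimum of multiplicities
difference = A - B    # subtraction, floored at zero

print(dict(union))         # {'a': 2, 'b': 4, 'c': 1}
print(dict(total))         # {'a': 3, 'b': 7, 'c': 1}
print(dict(intersection))  # {'a': 1, 'b': 3}
print(dict(difference))    # {'a': 1}
```

In the difference, b and c would have negative counts (3 - 4 and 0 - 1), so they are dropped from the result, which is the floor-at-zero behavior described in the list.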
Counting and the multiset coefficient
Multisets are central to counting problems where repetition is allowed. If one has n distinct types of element and wants to choose r elements allowing repeats and ignoring order, the number of distinct r-element multisets equals the number of combinations with repetition. A standard closed form for this count is "n + r - 1 choose r", often written C(n + r - 1, r). This quantity is called the multiset coefficient and is sometimes written with specialized notation such as ((n r)), a doubled-parenthesis analogue of the binomial coefficient. These formulas are used in problems ranging from distributing identical objects into distinct boxes to choosing a multiset of letters for constructing words.
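The closed form can be compared against a brute-force enumeration for small cases. The sketch below assumes Python's math.comb and itertools.combinations_with_replacement; the function name multiset_coefficient and the sample types are hypothetical choices for illustration.

```python
from itertools import combinations_with_replacement
from math import comb


def multiset_coefficient(n, r):
    """Number of r-element multisets drawn from n distinct types."""
    if n < 0 or r < 0:
        raise ValueError("n and r must be nonnegative")
    if n == 0:
        # With no types available, only the empty multiset can be formed.
        return 1 if r == 0 else 0
    return comb(n + r - 1, r)


types = ["a", "b", "c"]  # n = 3 distinct types
r = 2

# Each tuple is one multiset, listed in a canonical (sorted) order.
enumerated = list(combinations_with_replacement(types, r))
print(enumerated)
# [('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]

print(len(enumerated))                      # 6
print(multiset_coefficient(len(types), r))  # 6, i.e. C(3 + 2 - 1, 2) = C(4, 2)

# Distributing 5 identical objects into 3 distinct boxes:
print(multiset_coefficient(3, 5))           # 21, i.e. C(7, 5)
```

The boxes example uses the same formula because a distribution is determined by how many objects each box receives, which is a multiset of size 5 over 3 box labels.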
History, representation and visualization
The idea of counting repeated items is ancient in combinatorics, though the explicit notion of a multiset and the term bag became more systematic in the 20th century as set theory and algebraic structures were extended. Multisets can be represented compactly by listing distinct elements with their multiplicities or by histogram-style diagrams that show frequency counts; a histogram or bar chart is a common visual aid for making multiplicities clear. In computer implementations, multisets are often stored as associative arrays or maps from element to count.
Uses, examples and notable distinctions
Multisets model many practical situations: inventory where multiple identical items exist, multisets of prime factors of integers, bag semantics in database query results where duplicates are preserved, and multiset rewriting systems in formal language theory and chemical reaction modelling. They differ from sequences and tuples because order is ignored; they differ from simple sets because multiplicity matters. When reasoning about equality one can either require identical multiplicities for all elements (strict equality) or consider equivalence up to some normalization depending on the application. For more technical discussions and formal properties, readers can consult introductory references on combinatorics and on algebraic data types.
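The prime-factor example and the distinctions from sets and sequences can be made concrete. The following is a small sketch, assuming trial division and Python's collections.Counter; the function name prime_factor_multiset is hypothetical, and the approach is meant for small integers only.

```python
from collections import Counter
from math import prod


def prime_factor_multiset(n):
    """Return the multiset of prime factors of n (n >= 1) by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1  # record one more copy of the prime d
            n //= d
        d += 1
    if n > 1:
        factors[n] += 1      # whatever remains is itself prime
    return factors


f360 = prime_factor_multiset(360)  # 360 = 2 * 2 * 2 * 3 * 3 * 5
f84 = prime_factor_multiset(84)    # 84  = 2 * 2 * 3 * 7
print(dict(f360))  # {2: 3, 3: 2, 5: 1}

# Multiset intersection keeps the shared prime factors; their product
# is the greatest common divisor.
common = f360 & f84
print(prod(common.elements()))  # 12

# Order is ignored, unlike in a sequence ...
print(Counter("listen") == Counter("silent"))  # True
# ... but multiplicity matters, unlike in a set.
print(set("aab") == set("ab"))          # True
print(Counter("aab") == Counter("ab"))  # False
```

The last three comparisons use strict equality, meaning identical multiplicities for every element.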
Further reading
Multisets are a small but flexible generalization of sets. They provide a simple way to record counts while retaining much of the familiar intuition of set-based thinking. For mathematical background, algorithmic implementations, and applications across domains, follow introductory resources and surveys that treat multisets both combinatorially and algebraically.