Most commonly used numbers: statistics, psycholinguistics, and Benford's Law
Introduction: The number as a unit of information and a cultural marker
The question of which numbers occur most often seems simple, but its analysis lies at the intersection of mathematical statistics, the psychology of perception, linguistics, and information theory. It is important to distinguish between the natural frequency with which numbers occur in real-world numerical data and their subjective frequency in human practice (in numbers, prices, elections). What is most surprising is that these distributions are neither random nor uniform: they follow deep regularities that matter for data analysis, fraud detection, and understanding cognitive distortions.
1. Benford's Law: unexpected asymmetry in the world of numbers
The most powerful and counter-intuitive fact about the frequency of numbers is described by Benford's Law (the first-digit law). It states that in many naturally occurring sets of numerical data (from electricity bills and mountain heights to molecular weights and stock market quotations), the probability that the first significant digit equals d, where d runs from 1 to 9, is given by the formula: P(d) = log₁₀(1 + 1/d).
This gives the following distribution of probabilities for the first digit:
1 appears in approximately 30.1% of cases.
2 appears in about 17.6%.
3 appears in about 12.5%.
The frequency keeps decreasing from there: 9 occurs in only about 4.6% of cases.
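The percentages above follow directly from the formula. As a minimal sketch (the names benford_probability and benford_table are illustrative and do not come from the article), the full distribution for digits 1 to 9 can be computed like this:

```python
import math

def benford_probability(d: int) -> float:
    """Probability that the first significant digit equals d (1..9)."""
    if not 1 <= d <= 9:
        raise ValueError("d must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Full first-digit distribution predicted by Benford's Law.
benford_table = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford_table.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities sum to 1, because the product of the terms (1 + 1/d) for d = 1..9 telescopes to 10 and log₁₀(10) = 1.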
Reason: The law holds for data that span many orders of magnitude (from units to millions) and describe processes of growth or multiplication, for example city populations, stock prices, and lake areas. The digit 1 leads because a value must increase by 100% to move from 1 to 2, but by only 12.5% to move from 8 to 9. A steadily growing quantity therefore “sticks” to numbers starting with 1 for longer.
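The "sticking" effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 1% growth rate, the 20,000 steps, and the names first_digit and simulate_growth are arbitrary choices for illustration, not part of the article. A quantity that grows by a fixed percentage at every step is sampled after each step, and the share of each leading digit is compared with the Benford prediction:

```python
from collections import Counter
import math

def first_digit(x: float) -> int:
    """Return the first significant digit of a positive number."""
    # Scientific notation always starts with the leading digit, e.g. "3.2...e+05".
    return int(f"{x:.15e}"[0])

def simulate_growth(rate: float = 0.01, steps: int = 20_000) -> dict:
    """Grow a value by `rate` per step and record leading-digit frequencies."""
    value = 1.0
    counts = Counter()
    for _ in range(steps):
        value *= 1 + rate  # multiplicative growth; rate is an assumed parameter
        counts[first_digit(value)] += 1
    return {d: counts[d] / steps for d in range(1, 10)}

observed = simulate_growth()
for d in range(1, 10):
    expected = math.log10(1 + 1 / d)
    print(f"{d}: simulated {observed[d]:.3f}, Benford {expected:.3f}")
```

Because the value spans dozens of orders of magnitude over the run, the simulated shares should land close to the theoretical ones, with 1 the most frequent leading digit and 9 the least.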
Application: Tax and financial authorities around the world use Benford's Law to detect suspicious reports and falsified data, as a person inventing numbers intuitively tends t ...