Roboculator
© 2026 Roboculator. All rights reserved.
Hypergeometric Distribution Calculator

Calculator

Results

P(X = k): 0.55170273
Mean: 1
Variance: 0.734694
Standard Deviation: 0.857143
Minimum Feasible k: 0
Maximum Feasible k: 5
Valid Input Flag: 1

The Hypergeometric Distribution Calculator computes the probability mass function (PMF), mean, variance, and standard deviation for the hypergeometric distribution. This distribution models the number of successes in a sample drawn without replacement from a finite population, making it fundamentally different from the binomial distribution, which assumes sampling with replacement (or from an infinite population).

The hypergeometric distribution arises whenever you draw a sample from a finite collection containing two types of items (successes and failures) without putting items back. Classic examples include drawing cards from a deck (how many aces in a 5-card hand?), quality control inspection (how many defectives in a sample from a finite lot?), ecological capture-recapture studies (how many tagged fish in a recaptured sample?), and committee selection (how many women on a randomly chosen committee from a mixed pool?).

The distribution is characterized by three parameters: N (total population size), K (number of success states in the population), and n (number of draws). The random variable X counts the successes observed in the draw, and k denotes a particular value of X. The key difference from the binomial is that each draw changes the composition of the remaining population, so successive draws are not independent. As a result, the hypergeometric variance equals the corresponding binomial variance multiplied by (N-n)/(N-1), the finite population correction, which is less than 1 whenever n > 1.

This calculator uses Stirling's approximation for the log-factorial computation, which allows it to handle population sizes up to 1000 without numerical overflow. The approximation is least precise for small factorial arguments, so results involving small values of N, K, n, or k are estimates rather than exact probabilities. The exact PMF combines three binomial coefficients, C(K,k) * C(N-K, n-k) / C(N, n), each of which can be astronomically large; working with logarithms keeps the intermediate values manageable.

Understanding the hypergeometric distribution is essential in quality assurance (acceptance sampling plans), genetics (Fisher's exact test for independence), ecology (population estimation via capture-recapture), card game probability, lotteries, and any scenario involving finite-population sampling without replacement.

In industrial quality control, acceptance sampling plans use the hypergeometric distribution to determine whether a batch of products meets quality standards based on a sample inspection. Sampling standards such as Military Standard 105E and its civilian equivalent ANSI/ASQ Z1.4 formalize this kind of lot-by-lot inspection. In genomics, the hypergeometric test is widely used for gene set enrichment analysis, determining whether a set of genes of interest is overrepresented in a particular biological pathway. In lottery mathematics, the hypergeometric distribution gives the exact probability of matching k numbers from a drawn set, which is fundamental to prize structure design and expected value calculations.
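The acceptance-sampling and enrichment uses described above both come down to summing PMF terms over a range of k. The following is a minimal Python sketch, not the calculator's own code. The lot of 50 items with 3 defectives, the sample of 5, and the acceptance number of 0 are hypothetical figures chosen for illustration and do not come from any published sampling plan.

```python
from math import comb

def hypergeom_pmf(N: int, K: int, n: int, k: int) -> float:
    """Exact P(X = k); returns 0.0 when k is outside the feasible range."""
    if k < max(0, n + K - N) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def hypergeom_cdf(N: int, K: int, n: int, c: int) -> float:
    """P(X <= c), e.g. the chance a lot passes with acceptance number c."""
    return sum(hypergeom_pmf(N, K, n, k) for k in range(0, c + 1))

def hypergeom_sf(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k), the upper tail used in over-representation tests."""
    return sum(hypergeom_pmf(N, K, n, j) for j in range(k, min(n, K) + 1))

# Hypothetical lot: 50 items, 3 defective, inspect 5, accept only if none are defective.
p_accept = hypergeom_cdf(50, 3, 5, 0)
p_at_least_one = hypergeom_sf(50, 3, 5, 1)
print(f"P(accept lot) = {p_accept:.4f}")
print(f"P(X >= 1)     = {p_at_least_one:.4f}")
```

The lower tail answers "how often does a lot with this many defectives pass?", while the upper tail is the quantity an enrichment test reports as its p-value.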

How It Works

The hypergeometric PMF gives the probability of observing exactly k successes when drawing n items without replacement from a population of N items containing K successes:

$$P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}$$

This formula counts: (ways to choose k successes from K) × (ways to choose n-k failures from N-K) / (total ways to choose n from N). To avoid numerical overflow, we compute in log-space using Stirling's approximation:

$$\ln(m!) \approx m \ln(m) - m + \frac{1}{2}\ln(2\pi m)$$
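The log-space computation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the calculator's actual implementation, which the page does not show. The function names are hypothetical, and `math.lgamma` is brought in here only as a reference value to compare against.

```python
import math
from typing import Callable

def ln_factorial_stirling(m: int) -> float:
    """Stirling's approximation to ln(m!); ln(0!) = 0 is handled explicitly."""
    if m == 0:
        return 0.0
    return m * math.log(m) - m + 0.5 * math.log(2 * math.pi * m)

def ln_factorial_exact(m: int) -> float:
    """Reference value: ln(m!) = lgamma(m + 1)."""
    return math.lgamma(m + 1)

def hypergeom_pmf_log(N: int, K: int, n: int, k: int,
                      ln_fact: Callable[[int], float]) -> float:
    """PMF computed as exp(ln C(K,k) + ln C(N-K,n-k) - ln C(N,n))."""
    if k < max(0, n + K - N) or k > min(n, K):
        return 0.0  # infeasible combination

    def ln_choose(a: int, b: int) -> float:
        return ln_fact(a) - ln_fact(b) - ln_fact(a - b)

    log_p = ln_choose(K, k) + ln_choose(N - K, n - k) - ln_choose(N, n)
    return math.exp(log_p)

# Illustrative inputs (an assumption, not taken from the page).
approx = hypergeom_pmf_log(1000, 100, 50, 5, ln_factorial_stirling)
reference = hypergeom_pmf_log(1000, 100, 50, 5, ln_factorial_exact)
print(f"Stirling: {approx:.6f}  reference: {reference:.6f}")
```

Passing the log-factorial as a parameter makes it easy to see how much of the final error comes from the approximation itself rather than from the log-space arithmetic.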

The key statistics are:

$$E[X] = \frac{nK}{N}, \quad \text{Var}(X) = n \cdot \frac{K}{N} \cdot \frac{N-K}{N} \cdot \frac{N-n}{N-1}$$

The last factor (N-n)/(N-1) is the finite population correction, which reduces the variance compared to the binomial.
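A short Python sketch of these summary statistics, with the binomial variance shown alongside for comparison. The function name and the inputs N = 50, K = 5, n = 10 are illustrative assumptions, and the formula requires N > 1.

```python
import math

def hypergeom_stats(N: int, K: int, n: int) -> dict:
    """Mean, variance, and standard deviation of the hypergeometric (needs N > 1)."""
    p = K / N                       # population success fraction
    fpc = (N - n) / (N - 1)         # finite population correction
    binomial_variance = n * p * (1 - p)
    variance = binomial_variance * fpc
    return {
        "mean": n * p,
        "variance": variance,
        "std_dev": math.sqrt(variance),
        "fpc": fpc,
        "binomial_variance": binomial_variance,
    }

stats = hypergeom_stats(50, 5, 10)
for name, value in stats.items():
    print(f"{name}: {value:.6f}")
```

Here the correction factor is 40/49, so the hypergeometric variance is about 18% smaller than the binomial variance for the same n and p.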

Understanding Your Results

The PMF P(X = k) is the probability of drawing exactly k success items from the population. The mean nK/N is the expected number of successes: the sample size multiplied by the fraction of successes in the population. The variance is less than the binomial variance np(1-p), where p = K/N, because of the finite population correction factor. Note: the calculator returns PMF = 0 for impossible combinations (e.g., k > K, k > n, or n-k > N-K). Stirling's approximation is highly accurate for large values but has noticeable errors for very small factorials such as 1! and 2!, and the formula is undefined at 0!, which must be treated as a special case.

Worked Examples

Drawing Aces from a Deck

Inputs

N pop: 52
K succ: 4
n draw: 5
k obs: 2

Results

pmf: 0.03992982
mean val: 0.3846
variance: 0.3272
std dev: 0.5720

The probability of getting exactly 2 aces in a 5-card poker hand is about 3.99%. On average, a 5-card hand contains 0.385 aces. The exact probability is C(4,2)*C(48,3)/C(52,5).
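The expression C(4,2)*C(48,3)/C(52,5) can be evaluated exactly with integer arithmetic. This is a minimal Python check using only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

ways_two_aces = comb(4, 2)        # choose 2 of the 4 aces
ways_three_others = comb(48, 3)   # choose 3 of the 48 non-aces
total_hands = comb(52, 5)         # all 5-card hands

p_two_aces = Fraction(ways_two_aces * ways_three_others, total_hands)
print(f"{ways_two_aces} * {ways_three_others} / {total_hands} = {float(p_two_aces):.8f}")
```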

Quality Control: Defects in Sample

Inputs

N pop: 100
K succ: 8
n draw: 10
k obs: 1

Results

pmf: 0.40149713
mean val: 0.8
variance: 0.6691
std dev: 0.8180

From a lot of 100 items with 8 defective, drawing 10 and finding exactly 1 defective has a probability of about 40.1%. The expected number of defectives in the sample is 0.8.

Frequently Asked Questions

Use the hypergeometric when sampling without replacement from a finite population. The binomial assumes either replacement or an infinite population. As a rule of thumb, if the sample size n is less than about 5-10% of the population N, the binomial is a good approximation. Once n/N exceeds that range, the finite population correction becomes significant and the binomial noticeably overstates the variance, so the hypergeometric should be used.
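The rule of thumb can be checked numerically by holding the success fraction fixed and varying the population size. The sketch below uses hypothetical inputs (p = 0.5, n = 10, k = 5) picked only to make the contrast visible.

```python
from math import comb

def hypergeom_pmf(N: int, K: int, n: int, k: int) -> float:
    """Exact hypergeometric P(X = k); 0.0 for infeasible k."""
    if k < max(0, n + K - N) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(n: int, p: float, k: int) -> float:
    """Binomial P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

binom = binom_pmf(10, 0.5, 5)
small_pop = hypergeom_pmf(20, 10, 10, 5)           # n/N = 0.5
large_pop = hypergeom_pmf(20_000, 10_000, 10, 5)   # n/N = 0.0005

print(f"binomial:                {binom:.6f}")
print(f"hypergeometric, N=20:    {small_pop:.6f}")
print(f"hypergeometric, N=20000: {large_pop:.6f}")
```

With half the population sampled, the two distributions disagree substantially; with a tiny sampling fraction they are nearly indistinguishable.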

The factor (N-n)/(N-1) reduces the hypergeometric variance compared to the binomial. When the sample is a large fraction of the population, each draw significantly changes the remaining composition, reducing variability. When N is much larger than n, this factor approaches 1 and the hypergeometric converges to the binomial.

Fisher's exact test uses the hypergeometric distribution to test for association in 2×2 contingency tables. It computes the exact probability of observing the given table (or more extreme) under the null hypothesis of no association. Unlike chi-squared tests, it is valid for small sample sizes and is the gold standard for testing independence in small samples.
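A one-sided version of the test can be sketched directly from the hypergeometric PMF. The function names and the example table below are illustrative assumptions, and the sketch covers only the "greater" alternative, not the two-sided test.

```python
from math import comb

def hypergeom_pmf(N: int, K: int, n: int, k: int) -> float:
    """Exact hypergeometric P(X = k); 0.0 for infeasible k."""
    if k < max(0, n + K - N) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided p-value for the table [[a, b], [c, d]].

    With all margins fixed, the top-left cell is hypergeometric with
    N = grand total, K = first row total, n = first column total.
    """
    N = a + b + c + d
    K = a + b
    n = a + c
    return sum(hypergeom_pmf(N, K, n, x) for x in range(a, min(n, K) + 1))

# Hypothetical 2x2 table: 3 and 1 in the first row, 1 and 3 in the second.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided p-value = {p_value:.6f}")
```

For this table the sum has two terms, the observed table and the single more extreme one, giving 16/70 + 1/70 = 17/70.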

Stirling's approximation ln(m!) ≈ m·ln(m) - m + 0.5·ln(2πm) is very accurate for large m: the error in ln(m!) is less than 1/(12m), which corresponds to a relative error of roughly 1/(12m) in m! itself. For m = 10, that relative error is about 0.8%. For m = 50, it is about 0.17%. For m = 1 and m = 2 the approximation is less precise, and at m = 0 the formula is undefined, so the calculator handles the m = 0 case (ln(0!) = 0) explicitly.
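These error figures can be reproduced with a few lines of Python, using `math.lgamma(m + 1)` as the reference value for ln(m!). The helper names are hypothetical.

```python
import math

def ln_factorial_stirling(m: int) -> float:
    """Stirling's approximation to ln(m!), valid for m >= 1."""
    return m * math.log(m) - m + 0.5 * math.log(2 * math.pi * m)

def stirling_errors(m: int) -> tuple[float, float]:
    """Return (absolute error in ln(m!), relative error in m!)."""
    exact = math.lgamma(m + 1)
    approx = ln_factorial_stirling(m)
    abs_err_ln = exact - approx                       # Stirling underestimates
    rel_err_factorial = 1 - math.exp(approx - exact)  # error in m! itself
    return abs_err_ln, rel_err_factorial

for m in (1, 2, 10, 50):
    abs_err, rel_err = stirling_errors(m)
    print(f"m={m:>2}: error in ln(m!) = {abs_err:.6f}, "
          f"relative error in m! = {rel_err:.4%}, bound 1/(12m) = {1 / (12 * m):.6f}")
```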

The parameters must satisfy: 0 ≤ K ≤ N (success states cannot exceed population), 1 ≤ n ≤ N (cannot draw more than population), and max(0, n+K-N) ≤ k ≤ min(n, K) (observed successes must be feasible). The calculator returns PMF = 0 for infeasible combinations.
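The constraints above translate into a small validation helper. This is a sketch with hypothetical function names; raising `ValueError` for bad parameters is a design choice made here, not the calculator's documented behavior.

```python
def feasible_range(N: int, K: int, n: int) -> tuple[int, int]:
    """Return (k_min, k_max) for valid parameters, or raise ValueError."""
    if not 0 <= K <= N:
        raise ValueError("K must satisfy 0 <= K <= N")
    if not 1 <= n <= N:
        raise ValueError("n must satisfy 1 <= n <= N")
    k_min = max(0, n + K - N)  # draws left over once all failures are taken
    k_max = min(n, K)          # limited by both sample size and successes
    return k_min, k_max

def is_feasible(N: int, K: int, n: int, k: int) -> bool:
    """True when k successes can actually occur for these parameters."""
    k_min, k_max = feasible_range(N, K, n)
    return k_min <= k <= k_max

print(feasible_range(52, 4, 5))   # card example: 0 to 4 aces
print(feasible_range(10, 7, 6))   # only 3 failures exist, so k >= 3
print(is_feasible(52, 4, 5, 5))   # five aces is impossible
```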

In capture-recapture studies, N is the unknown total population, K is the number of tagged animals from the first capture, n is the second capture sample size, and k is the number of tagged animals recaptured. The hypergeometric distribution models k, and the maximum likelihood estimate of N is approximately nK/k. This is the Lincoln-Petersen method for population estimation.
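As an illustration, the estimate nK/k can be compared against a brute-force search for the population size that maximizes the hypergeometric likelihood. The survey figures below (100 tagged, 60 in the second sample, 13 of them tagged) are hypothetical, as is the search limit of 2000.

```python
from math import comb

def lincoln_petersen(K: int, n: int, k: int) -> float:
    """Lincoln-Petersen estimate of population size, nK/k."""
    if k <= 0:
        raise ValueError("at least one tagged animal must be recaptured")
    return n * K / k

def likelihood(N: int, K: int, n: int, k: int) -> float:
    """Hypergeometric probability of k tagged recaptures if the population is N."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

K, n, k = 100, 60, 13   # hypothetical survey figures
estimate = lincoln_petersen(K, n, k)

# The population must hold at least the K tagged animals plus the
# n - k untagged animals seen in the second sample.
smallest_N = K + (n - k)
mle = max(range(smallest_N, 2001), key=lambda N: likelihood(N, K, n, k))

print(f"nK/k estimate:            {estimate:.2f}")
print(f"likelihood-maximizing N:  {mle}")
```

The integer that maximizes the likelihood is nK/k rounded down, which is why the ratio is described as an approximate maximum likelihood estimate.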

Sources & Methodology

Johnson, N.L., Kemp, A.W., and Kotz, S. Univariate Discrete Distributions, 3rd Edition, Wiley, 2005.
Agresti, A. Categorical Data Analysis, 3rd Edition, Wiley, 2013.
Rice, J.A. Mathematical Statistics and Data Analysis, 3rd Edition, Cengage, 2006.
Krebs, C.J. Ecological Methodology, 2nd Edition, Benjamin Cummings, 1999.

Roboculator Team

The Roboculator Team explains calculations, planning tools, and practical formulas in clear language for real-life situations.
