The Hypergeometric Distribution Calculator computes the probability of drawing exactly $$k$$ successes in $$n$$ draws from a finite population without replacement. Unlike the binomial distribution, which assumes independent trials, the hypergeometric distribution accounts for the changing composition of the population as items are drawn.
Consider a population of $$N$$ items containing $$K$$ success states. If $$n$$ items are drawn without replacement, the probability of observing exactly $$k$$ successes is:
$$P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}$$
where $$\binom{a}{b}$$ denotes the binomial coefficient "a choose b." The valid range of $$k$$ is $$\max(0, n + K - N) \leq k \leq \min(n, K)$$.
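The formula and its valid range translate directly into a few lines of Python. This is a minimal sketch using exact integer binomial coefficients from the standard library; the function name `hypergeom_pmf` is illustrative and is not taken from the calculator itself.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k): k successes in n draws without replacement
    from a population of N items containing K successes."""
    # Outside the valid range of k the probability is zero.
    if k < max(0, n + K - N) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Sanity check: the probabilities over the whole valid range sum to 1.
N, K, n = 20, 7, 5
support = range(max(0, n + K - N), min(n, K) + 1)
total = sum(hypergeom_pmf(k, N, K, n) for k in support)
```

Because `math.comb` works on arbitrary-precision integers, this version cannot overflow, although it becomes slow for very large parameters.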
The mean of the hypergeometric distribution is $$E[X] = \frac{nK}{N}$$, which is the same as the binomial mean $$np$$ where $$p = K/N$$. The variance is:
$$\text{Var}(X) = n \cdot \frac{K}{N} \cdot \frac{N-K}{N} \cdot \frac{N-n}{N-1}$$
The factor $$\frac{N-n}{N-1}$$ is called the finite population correction (FPC). For $$n > 1$$ it is less than 1, so the hypergeometric variance is smaller than the corresponding binomial variance $$np(1-p)$$ with $$p = K/N$$. As $$N \to \infty$$ with $$K/N$$ held fixed, the FPC approaches 1 and the hypergeometric converges to the binomial.
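The mean, variance, and FPC can be evaluated directly from their closed forms. The sketch below assumes $$N > 1$$; the helper name `hypergeom_moments` is hypothetical.

```python
def hypergeom_moments(N: int, K: int, n: int):
    """Return (mean, variance, fpc, binomial_variance); assumes N > 1."""
    p = K / N                      # fraction of successes in the population
    mean = n * p                   # same as the binomial mean n*p
    binom_var = n * p * (1 - p)    # variance if draws were independent
    fpc = (N - n) / (N - 1)        # finite population correction
    return mean, binom_var * fpc, fpc, binom_var

mean, var, fpc, binom_var = hypergeom_moments(N=50, K=10, n=10)
```

For these inputs the FPC is 40/49, about 0.816, so the variance drops from the binomial value of 1.6 to about 1.31.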
The key distinction from the binomial distribution is that draws are made without replacement. Each draw changes the composition of the remaining population, so trials are not independent. The hypergeometric distribution is the exact model for this scenario, while the binomial is an approximation valid when $$N$$ is much larger than $$n$$.
The hypergeometric distribution is used in quality control (acceptance sampling), card games (probability of a specific hand), ecological capture-recapture studies, clinical trial design, Fisher's exact test in statistics, and gene set enrichment analysis in bioinformatics.
Enter the population size $$N$$, number of success states $$K$$, number of draws $$n$$, and observed successes $$k$$. The calculator computes the PMF, using Stirling's approximation for large factorials, along with the mean, variance, standard deviation, and mode.
The calculator computes the hypergeometric PMF using logarithmic factorials (Stirling's approximation) to handle large binomial coefficients without overflow: $$\ln P = \ln\binom{K}{k} + \ln\binom{N-K}{n-k} - \ln\binom{N}{n}$$, then exponentiates the result. The mean, variance, and mode use closed-form expressions.
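The log-space calculation can be sketched as follows. As an assumption, this version uses Python's `math.lgamma` to obtain $$\ln(n!)$$ rather than an explicit Stirling series, since the calculator's own routine is not shown; the function names are illustrative.

```python
from math import exp, lgamma

def log_comb(a: int, b: int) -> float:
    """ln of the binomial coefficient 'a choose b', via ln(m!) = lgamma(m + 1)."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def hypergeom_pmf_log(k: int, N: int, K: int, n: int) -> float:
    """Hypergeometric PMF computed as exp(ln P) to avoid huge intermediates."""
    if k < max(0, n + K - N) or k > min(n, K):
        return 0.0
    log_p = log_comb(K, k) + log_comb(N - K, n - k) - log_comb(N, n)
    return exp(log_p)

# A case whose binomial coefficients would be astronomically large.
p_large = hypergeom_pmf_log(500, 10**6, 5 * 10**5, 1000)
```

Only the final `exp` leaves log space, so the intermediate values stay within floating-point range even when the binomial coefficients themselves would not.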
The PMF gives the exact probability of drawing $$k$$ successes in $$n$$ draws from a population of $$N$$ with $$K$$ success states. If the PMF returns 0, the combination of parameters may be invalid (e.g., requesting more successes than available). The mean $$nK/N$$ represents the expected number of successes, proportional to the fraction of successes in the population.
In a 52-card deck with 4 aces, the probability of drawing exactly 2 aces in a 5-card hand is about 4.0%. The expected number of aces is 0.385.
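Written out with $$N = 52$$, $$K = 4$$, $$n = 5$$, and $$k = 2$$, the arithmetic behind these figures is:

$$P(X = 2) = \frac{\binom{4}{2} \binom{48}{3}}{\binom{52}{5}} = \frac{6 \times 17296}{2598960} = \frac{103776}{2598960} \approx 0.0399$$

$$E[X] = \frac{nK}{N} = \frac{5 \times 4}{52} \approx 0.385$$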
From a lot of 50 items with 10 defectives, sampling 10 items yields exactly 3 defectives with probability $$\binom{10}{3}\binom{40}{7} / \binom{50}{10} \approx 0.218$$, or about 21.8%. The expected number of defectives in the sample is $$10 \times 10 / 50 = 2$$.
Use the hypergeometric distribution when sampling without replacement from a finite population. If the population is large relative to the sample (commonly $$N > 20n$$), the binomial is a good approximation. For smaller populations, the hypergeometric gives exact probabilities.
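To see how the binomial approximation improves as the population grows, one can compare the two PMFs at a fixed success fraction. This is an illustrative sketch; the chosen values ($$n = 10$$, $$k = 3$$, $$K = N/5$$) are arbitrary assumptions, not calculator defaults.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    if k < max(0, n + K - N) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k = 10, 3
gaps = []
for N in (50, 200, 1000, 10000):
    K = N // 5                      # keep the success fraction at 20%
    gap = abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, K / N))
    gaps.append(gap)
    print(f"N = {N:>5}: |hypergeometric - binomial| = {gap:.6f}")
```

The printed gap shrinks as $$N$$ increases, which is the behavior the $$N > 20n$$ rule of thumb relies on.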
The FPC is $$\frac{N-n}{N-1}$$, which multiplies the binomial-like variance. It accounts for the reduced variability when a larger fraction of the population is sampled. When $$n$$ is small relative to $$N$$, FPC is close to 1 and the hypergeometric variance is similar to the binomial variance.
Fisher's exact test evaluates the significance of association in a 2×2 contingency table by computing the probability of the observed table (and more extreme tables) under the null hypothesis of independence, with the row and column totals held fixed. Under that conditioning, the count in any one cell follows the hypergeometric distribution.
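A one-sided version of the test can be sketched as a hypergeometric tail sum. The mapping used here is an assumption made for illustration: for a table with rows `(a, b)` and `(c, d)`, take $$N = a+b+c+d$$, $$K = a+b$$, $$n = a+c$$, and $$k = a$$, and call tables with a larger top-left cell "more extreme".

```python
from math import comb

def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(top-left cell >= a) with all row and column totals fixed."""
    N = a + b + c + d      # grand total
    K = a + b              # first-row total ("success states")
    n = a + c              # first-column total ("draws")
    upper = min(n, K)      # largest value the top-left cell can take
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(a, upper + 1))
    return tail / comb(N, n)

# Example table: rows (3, 1) and (1, 3).
p_value = fisher_one_sided(3, 1, 1, 3)
```

For this table the tail contains two terms, $$\binom{4}{3}\binom{4}{1} = 16$$ and $$\binom{4}{4}\binom{4}{0} = 1$$, out of $$\binom{8}{4} = 70$$ possible tables, giving $$17/70 \approx 0.243$$.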
We need $$0 \leq K \leq N$$, $$1 \leq n \leq N$$, and $$k$$ must satisfy $$\max(0, n+K-N) \leq k \leq \min(n, K)$$. If these constraints are violated, the probability is 0.
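These constraints can be checked before any probability is computed. The helper below is a hypothetical sketch, not the calculator's own validation code; it reports which constraints fail instead of silently returning 0.

```python
def constraint_violations(N: int, K: int, n: int, k: int) -> list:
    """Return a list of violated constraints (empty if the inputs are valid)."""
    problems = []
    if not 0 <= K <= N:
        problems.append("K must satisfy 0 <= K <= N")
    if not 1 <= n <= N:
        problems.append("n must satisfy 1 <= n <= N")
    lo, hi = max(0, n + K - N), min(n, K)
    if not lo <= k <= hi:
        problems.append(f"k must satisfy {lo} <= k <= {hi}")
    return problems

ok = constraint_violations(52, 4, 5, 2)      # valid card-hand inputs
bad = constraint_violations(10, 8, 5, 1)     # at least 3 successes are forced
```

In the second call only 2 of the 10 items are failures, so 5 draws must contain at least 3 successes and $$k = 1$$ is impossible.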
Yes. The multivariate hypergeometric distribution extends the concept to populations with more than two categories (not just success/failure). It gives the probability of drawing specific counts from each category.
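A minimal sketch of the multivariate PMF follows: the product of one binomial coefficient per category, divided by the number of ways to choose the whole sample. The function name and argument layout are assumptions made for illustration.

```python
from math import comb, prod

def multi_hypergeom_pmf(sizes, draws) -> float:
    """P(drawing draws[i] items from category i, for every i),
    where sizes[i] is the number of items in category i."""
    if len(sizes) != len(draws):
        raise ValueError("sizes and draws must have the same length")
    if any(d < 0 or d > s for s, d in zip(sizes, draws)):
        return 0.0                       # impossible count in some category
    N, n = sum(sizes), sum(draws)
    return prod(comb(s, d) for s, d in zip(sizes, draws)) / comb(N, n)

# Four suits of 13 cards: two cards of one suit and one of each other suit.
p_suits = multi_hypergeom_pmf([13, 13, 13, 13], [2, 1, 1, 1])
```

With exactly two categories this reduces to the ordinary hypergeometric PMF, for example `multi_hypergeom_pmf([4, 48], [2, 3])` for two aces in a five-card hand.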
Binomial coefficients for large values involve very large factorials that can overflow standard numeric types. Stirling's approximation evaluates $$\ln(n!)$$ directly in log space, to high accuracy for large $$n$$ and without overflow. The final probability is then obtained by exponentiation.
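One common form of Stirling's series for $$\ln(n!)$$ is sketched below, with two correction terms. Which variant the calculator actually uses is not stated, so the number of terms here is an assumption; `math.lgamma` serves only as a reference value.

```python
from math import lgamma, log, pi

def stirling_ln_factorial(n: int) -> float:
    """Approximate ln(n!) with Stirling's series (two correction terms)."""
    if n < 2:
        return 0.0                       # ln(0!) = ln(1!) = 0 exactly
    return (n * log(n) - n + 0.5 * log(2 * pi * n)
            + 1 / (12 * n) - 1 / (360 * n**3))

# Compare against the library value ln(n!) = lgamma(n + 1).
for n in (5, 20, 1000):
    err = abs(stirling_ln_factorial(n) - lgamma(n + 1))
    print(f"n = {n:>4}: absolute error = {err:.2e}")
```

The absolute error shrinks rapidly as $$n$$ grows, which is why the approximation is well suited to the large factorials that would otherwise overflow.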
Roboculator Team
The Roboculator Team explains calculations, planning tools, and practical formulas in clear language for real-life situations.