Statistics Summary
Quick mean, median, SD, range, and min/max snapshot of any dataset.
What Is Descriptive Statistics?
Descriptive statistics reduce a dataset to a handful of numbers that describe its centre, spread, shape, and tail behaviour. Centre is captured by the mean, median, and mode; spread by the variance, standard deviation, and range; shape by skewness and kurtosis; and tail behaviour by quartiles, percentiles, and outlier fences. A complete descriptive summary shows what a dataset typically looks like on a single screen, and it is the entry point to inferential work: confidence intervals, hypothesis tests, regression, and machine-learning baselines all start by summarising what you have.
This Statistics Calculator bundles six analysis modes onto a single dataset: a Quick Summary that returns every classic statistic in one click; a Descriptive Statistics table with formulas and confidence intervals; a Frequency Distribution with histogram and cumulative-frequency chart; a Percentile & Outlier view with quartiles, percentiles, and an SVG box plot; a Distribution Shape mode with normal-curve overlay and 68-95-99.7 rule; and a Probability mode with per-value Z-scores and interval probabilities. It pairs naturally with the standard deviation calculator, the sample size calculator, the z-score calculator, and the probability calculator.
How the Calculator Works
Paste, type, or upload your dataset
Drop a list of numbers separated by commas, spaces, tabs, semicolons, or newlines. Scientific notation, negatives, and decimals are all parsed. You can also upload a CSV, TSV, or TXT file straight from your machine — the calculator extracts every numeric token and flags any that fail to parse.
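The parsing step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the calculator's actual code: the function name `parse_numbers` and the choice to reject `inf` and `nan` are assumptions made for the example.

```python
import math
import re

# Delimiters named in the text: commas, semicolons, and any whitespace
# (spaces, tabs, newlines).
_DELIMITERS = re.compile(r"[,;\s]+")

def parse_numbers(text):
    """Return (values, rejected): finite floats and the tokens that failed."""
    values, rejected = [], []
    for token in _DELIMITERS.split(text):
        if not token:
            continue  # empty strings appear at the edges of the split
        try:
            number = float(token)  # handles negatives, decimals, and 1e3-style notation
        except ValueError:
            rejected.append(token)
            continue
        if math.isfinite(number):
            values.append(number)
        else:
            rejected.append(token)  # assumption: "inf" and "nan" are not valid data

    return values, rejected

values, rejected = parse_numbers("15, 22\t-3.5; 1e3\nabc 7")
print(values)    # [15.0, 22.0, -3.5, 1000.0, 7.0]
print(rejected)  # ['abc']
```

Returning the rejected tokens alongside the parsed values mirrors the "flags any that fail to parse" behaviour, so nothing is dropped silently.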
Pick population or sample
Statistics that involve variance change formula based on whether you have the entire population (divide by n, σ²) or a sample drawn from one (divide by n − 1, s²). The toggle at the top of the input card switches both the variance and standard-deviation outputs site-wide.
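To see how much the population/sample choice changes the output, here is a small sketch using Python's standard `statistics` module on the dataset from Example 1 in the Worked Examples section. It illustrates the two formulas only; how the calculator implements the toggle internally is not shown in this page.

```python
import statistics

data = [15, 22, 18, 31, 27, 19, 24, 29, 35, 20]  # dataset from Example 1

pop_var = statistics.pvariance(data)    # divides by n
sample_var = statistics.variance(data)  # divides by n - 1 (Bessel's correction)
pop_sd = statistics.pstdev(data)        # square root of the population variance
sample_sd = statistics.stdev(data)      # square root of the sample variance

print(round(pop_var, 3), round(sample_var, 3))  # 36.6 40.667
print(round(pop_sd, 3), round(sample_sd, 3))    # 6.05 6.377
```

The sample figures are always the larger of the two, and the gap shrinks as n grows.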
One click computes 20+ statistics
Press Calculate Statistics and the calculator returns mean, median, mode, range, midrange, variance, standard deviation, standard error, coefficient of variation, geometric mean, harmonic mean, RMS, mean absolute deviation, skewness, excess kurtosis, quartiles, percentiles, IQR, outliers, and 90/95/99/99.9% confidence intervals.
Six views of the same dataset
Switch tabs to drill into different perspectives: frequency tables, histogram, cumulative-frequency chart, box plot, normal-curve overlay, 68-95-99.7 rule, Z-scores, and probability lookups. Every view stays in sync with the dataset you entered.
6 Ways to Use the Statistics Calculator
Get a one-screen overview of any dataset
The Summary tab returns 16 statistics in a single grid — perfect for a quick read on a homework dataset, A/B test sample, or sensor batch. No scrolling, no exporting to a spreadsheet.
Write a methods section with confidence intervals
The Descriptive tab includes a four-row CI table at 90%, 95%, 99%, and 99.9%. Copy the row that matches your paper's reporting standard straight into the manuscript.
Build a frequency table for a presentation
The Frequency tab returns value, count, relative frequency, cumulative count, percentage, and cumulative percentage — ready to drop into a deck or report.
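The columns listed above are simple to derive from raw counts. The sketch below is one possible way to build them; the function name `frequency_table`, the dictionary keys, and the sample ratings are all made up for illustration.

```python
from collections import Counter

def frequency_table(values):
    """One row per distinct value, in ascending order, with running totals."""
    counts = Counter(values)
    n = len(values)
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append({
            "value": value,
            "count": counts[value],
            "relative": counts[value] / n,        # relative frequency (0 to 1)
            "cumulative_count": running,
            "percentage": 100 * counts[value] / n,
            "cumulative_pct": 100 * running / n,
        })
    return rows

ratings = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]  # illustrative 1-5 survey ratings
for row in frequency_table(ratings):
    print(row["value"], row["count"], row["cumulative_count"], row["cumulative_pct"])
```

The last row's cumulative count should always equal n, which is a handy check that no observation was lost.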
Hunt for outliers before modelling
The Percentile tab applies both the 1.5·IQR and 3·IQR fences automatically. Mild outliers appear amber, extreme outliers rose — see exactly which observations might bias a regression.
Sanity-check normality assumptions
The Distribution tab overlays a fitted normal curve on the histogram and reports skewness and excess kurtosis with plain-English interpretation — leptokurtic, mesokurtic, or platykurtic.
Look up a probability for a value
The Probability tab converts any value into a Z-score and reports tail probabilities P(X < x), P(X > x), and the interval P(a < X < b) — no separate Z-table needed.
Best Practices for Reporting Descriptive Statistics
Always report the centre and the spread together. A mean without a standard deviation is uninterpretable — you don't know whether the value is typical or coincidental. The conventional report is x̄ ± s or x̄ (s) for sample data; the calculator displays both side by side so you can copy them directly.
Pick the right measure of centre for the shape of your data. The mean is the natural choice for symmetric distributions; the median is far more robust for skewed data such as incomes, response times, and home prices, where a few extreme values pull the mean away from where most observations actually sit. The calculator prints both, and the Distribution tab tells you which is more representative.
Decide whether your data is a sample or the entire population before you read the spread. The sample standard deviation s uses Bessel's correction (divides by n − 1) and is the right choice for almost every real research dataset. The population formula σ (divides by n) is only correct when you genuinely have every member of the population — a rare situation outside textbook exercises.
Flag and report outliers — don't silently delete them. Use the IQR fence as a first pass, look at each flagged value, and decide whether it's a true reading, a data-entry error, or a meaningful tail event. Drop only the values you can defend dropping, and document the decision in your methods section.
Why Descriptive Statistics Matter
Foundation for inference
Every hypothesis test, confidence interval, and machine-learning baseline reads in the same descriptive summary as a first step. Getting the mean, SD, and outliers right protects every downstream conclusion you draw.
Communication of uncertainty
Reporting only a mean implies there is no spread — a deceptive simplification. A median, range, IQR, and confidence interval together communicate how much the underlying value could realistically be, given the data you collected.
Quality control and safety
Manufacturing, healthcare, and clinical settings rely on out-of-control signals defined in standard deviations. Calculating SD and Z-scores correctly is what separates a controlled process from a recall.
Honest data storytelling
Skewness, kurtosis, and outliers are exactly the features that visual summaries hide. Quoting them alongside the mean and median makes a chart honest rather than misleading.
Where Descriptive Statistics Get Tricky
Heavily skewed data
For income, runtime, and reaction-time data the mean lies far above the median because the right tail is long. Report both — and consider a log transform or the geometric mean for multiplicative effects. The calculator returns the geometric mean automatically when all values are positive.
Small samples (n < 10)
Skewness, kurtosis, and outlier fences become unstable for tiny samples — a single extra observation can flip the verdict. Treat the descriptive summary as preliminary and lean on the median, range, and confidence interval rather than the moment-based statistics.
Heaping and rounding
Datasets where every value is rounded to the nearest 5 or 10 will look bimodal or multimodal even when the underlying process is unimodal. Check the frequency table for suspicious clumping before reading into the modes.
Mixed populations
Two genuinely different groups (e.g. weekend and weekday traffic) mashed into one dataset will produce a bimodal distribution and a misleading single mean. Use the Frequency and Histogram tabs to spot multimodality, then split the data before summarising.
The Core Statistics Formulas
Mean (arithmetic average)
x̄ = Σx ÷ n
Sum of every value divided by the number of values.
Median
P50 = middle value of sorted x
Average of the two middle values when n is even.
Mode
argmax frequency(x)
Value (or values) that occurs most often.
Range
R = max(x) − min(x)
Crudest measure of spread; affected only by extremes.
Population variance
σ² = Σ(x − x̄)² ÷ n
Use only when you have every member of the population.
Sample variance
s² = Σ(x − x̄)² ÷ (n − 1)
Bessel correction — the unbiased estimator for σ².
Standard deviation
σ = √σ² (population); s = √s² (sample)
Reported in the same units as the underlying data.
Standard error of mean
SEM = s ÷ √n
Standard deviation of the sampling distribution of x̄.
Coefficient of variation
CV = s ÷ |x̄| × 100
Spread expressed as a percentage of the mean.
Quartiles (linear interpolation)
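Qp = sorted[(p/100)·(n − 1)], using a 0-based index and interpolating linearly between neighbours when the index is fractional
Linear interpolation between order statistics, used here for all percentiles. This is the default rule in NumPy and R (Hyndman and Fan type 7); it is not the same as Tukey's hinges, which take the medians of the lower and upper halves.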
Interquartile range
IQR = Q3 − Q1
Robust measure of spread that ignores outliers.
IQR outlier fence
Q1 − 1.5·IQR … Q3 + 1.5·IQR
Tukey fence — values outside are flagged as mild outliers.
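The quartile, IQR, and fence formulas above chain together naturally. Here is a minimal Python sketch of that chain, applied to the dataset from Example 1; the helper names `percentile` and `iqr_fences` are hypothetical and the code assumes a non-empty list.

```python
import math

def percentile(values, p):
    """Linear interpolation at the 0-based fractional index (p/100)*(n-1)."""
    s = sorted(values)
    pos = (p / 100) * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)  # clamp so p = 100 does not run off the end
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences; k=1.5 for mild, k=3 for extreme outliers."""
    q1, q3 = percentile(values, 25), percentile(values, 75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [15, 22, 18, 31, 27, 19, 24, 29, 35, 20]  # dataset from Example 1
q1, q3 = percentile(data, 25), percentile(data, 75)
low, high = iqr_fences(data)
outliers = [x for x in data if x < low or x > high]
print(q1, q3)     # 19.25 28.5
print(low, high)  # 5.375 42.375
print(outliers)   # []
```

Passing `k=3` to the same function gives the wider fence used for extreme outliers.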
Skewness (moment)
g₁ = m₃ ÷ σ³
Positive = right-tailed; negative = left-tailed.
Excess kurtosis
g₂ = m₄ ÷ σ⁴ − 3
Positive = heavier tails than normal; negative = lighter.
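The two shape formulas above can be sketched directly from central moments. One assumption to flag: the code takes σ to be the population standard deviation √m₂ (dividing by n), which is the usual convention for moment-based g₁ and g₂; the function names are illustrative.

```python
def central_moment(values, k):
    """k-th central moment: mean of (x - mean)**k, dividing by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def skewness(values):
    """g1 = m3 / m2**1.5 (assumes sigma = sqrt(m2))."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5

def excess_kurtosis(values):
    """g2 = m4 / m2**2 - 3 (assumes sigma = sqrt(m2))."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 10]
print(skewness(symmetric))          # 0.0
print(skewness(right_tailed) > 0)   # True
print(excess_kurtosis(symmetric))   # negative: flatter than a normal curve
```

Both functions divide by a power of m₂, so they fail on a constant dataset where the variance is zero.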
Z-score
z = (x − x̄) ÷ s
Standard deviations away from the mean for any value.
Confidence interval (mean)
x̄ ± Z* · s ÷ √n
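Standard normal critical value Z*; for 95%, Z* ≈ 1.960. For small samples, a Student-t critical value with n − 1 degrees of freedom gives a wider and more appropriate interval.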
Geometric mean
GM = (∏ xᵢ)^(1/n)
Right metric for ratios and rates of return; positive values only.
Harmonic mean
HM = n ÷ Σ(1 ÷ xᵢ)
Right metric for averaging rates per unit time.
Root mean square
RMS = √(Σx² ÷ n)
Used in physics and signal processing for magnitudes.
Mean absolute deviation
MAD = Σ|x − x̄| ÷ n
Less sensitive to extreme values than the standard deviation, because deviations are not squared.
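The four formulas above (geometric mean, harmonic mean, RMS, and mean absolute deviation) are short enough to write out by hand. The following Python sketch does so with illustrative function names; the geometric mean is computed through logarithms to avoid overflow on long products.

```python
import math

def geometric_mean(values):
    """(product of x_i)**(1/n), computed via logs; positive values only."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(x) for x in values) / len(values))

def harmonic_mean(values):
    """n divided by the sum of reciprocals; values must be non-zero."""
    return len(values) / sum(1 / x for x in values)

def root_mean_square(values):
    """Square root of the mean of squares."""
    return math.sqrt(sum(x * x for x in values) / len(values))

def mean_absolute_deviation(values):
    """Mean of absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

print(round(geometric_mean([1, 4, 16]), 6))       # 4.0
print(round(harmonic_mean([40, 60]), 6))          # 48.0
print(round(root_mean_square([3, 4]), 4))         # 3.5355
print(round(mean_absolute_deviation([1, 2, 3, 4, 5]), 6))  # 1.2
```

The harmonic-mean line is the classic speed example: driving the same distance at 40 and then at 60 gives an average speed of 48, not 50.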
Standard normal CDF
Φ(z) = ½ [1 + erf(z/√2)]
Maps any Z-score to a cumulative probability.
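As a quick check on the CDF formula above, the sketch below evaluates Φ with Python's built-in `math.erf` and uses it to recover the 68-95-99.7 rule. Note that it relies on the library's own `erf`, not on any particular polynomial approximation; the function names are illustrative.

```python
import math

def phi(z):
    """Standard normal CDF: 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def interval_probability(a, b, mean=0.0, sd=1.0):
    """P(a < X < b) for a normal variable with the given mean and sd."""
    return phi((b - mean) / sd) - phi((a - mean) / sd)

for k in (1, 2, 3):
    print(k, round(interval_probability(-k, k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The same `interval_probability` call works for raw values once you pass the dataset's mean and standard deviation.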
Common Mistakes to Avoid
1. Confusing mean and median for skewed data
For salaries, response times, or home prices the mean sits above the median because of a long right tail. Quoting the mean alone for skewed data overstates a typical observation. Report the median (and the IQR) for skewed distributions.
2. Using the population SD on sample data
If you only collected a slice of the underlying population, divide by n − 1 (sample SD), not n (population SD). Using the population formula understates the true variability and produces confidence intervals that are too narrow.
3. Ignoring outliers because they're inconvenient
Outliers can be data-entry errors, sensor faults, or legitimate tail events. Never delete them silently — the calculator's IQR fence shows them so you can investigate. Document any deletions in writing.
4. Reading variance in the wrong units
Variance is reported in squared units of the underlying data — $² for dollars, ms² for milliseconds. Take the square root to get the standard deviation, which is reported in the original units and is comparable to the mean.
5. Treating a 95% CI as a probability statement about the parameter
A 95% confidence interval is a procedure that captures the true mean 95 times out of 100 across repeated samples — it's not the probability that this particular interval contains the true mean. The distinction matters for Bayesian vs frequentist reporting.
6. Misinterpreting percentile rank
Being at the 90th percentile means 90% of values are ≤ you, not that you scored 90%. The calculator's percentile table makes this distinction explicit.
7. Assuming normality without checking
A lot of intro-stats results — z-tests, naive confidence intervals, parametric ANOVAs — require approximate normality. Use the Distribution tab's histogram + normal overlay and the skewness/kurtosis numbers to check the assumption before relying on it.
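The percentile-rank point in the list above is easy to demonstrate. This small sketch uses the "share of values at or below x" definition quoted there; the function name is illustrative and the scores are the Example 1 dataset.

```python
def percentile_rank(values, x):
    """Percentage of observations that are less than or equal to x."""
    return 100 * sum(1 for v in values if v <= x) / len(values)

scores = [15, 22, 18, 31, 27, 19, 24, 29, 35, 20]
print(percentile_rank(scores, 31))  # 90.0
```

A raw score of 31 sits at the 90th percentile here because nine of the ten values are at or below it, which has nothing to do with scoring 90%.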
Worked Examples
Example 1 — full descriptive summary for a tiny dataset
Dataset: {15, 22, 18, 31, 27, 19, 24, 29, 35, 20}
1. Sort: {15, 18, 19, 20, 22, 24, 27, 29, 31, 35}.
2. Sum = 240, n = 10, so x̄ = 240 / 10 = 24.
3. Median = (22 + 24) / 2 = 23 (average of the two middle values for even n).
4. Mode: every value appears once, so there is no mode.
5. Range = 35 − 15 = 20; midrange = (35 + 15) / 2 = 25.
6. Σ(x − x̄)² = 9² + 6² + 5² + 4² + 2² + 0² + 3² + 5² + 7² + 11² = 81 + 36 + 25 + 16 + 4 + 0 + 9 + 25 + 49 + 121 = 366.
7. Sample variance s² = 366 / 9 ≈ 40.667, so s ≈ 6.377.
8. SEM = s / √n = 6.377 / √10 ≈ 2.017; 95% CI = 24 ± 1.96·2.017 ≈ [20.05, 27.95].
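The steps of Example 1 can be reproduced with Python's standard library as a cross-check. This is a sketch of the same arithmetic, not the calculator's code; it uses the normal critical value from `statistics.NormalDist`, matching the Z*-based interval used in the example.

```python
import math
import statistics

data = [15, 22, 18, 31, 27, 19, 24, 29, 35, 20]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
s = statistics.stdev(data)                       # sample SD, divides by n - 1
sem = s / math.sqrt(n)                           # standard error of the mean
z_star = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
ci = (mean - z_star * sem, mean + z_star * sem)

print(mean, median)                      # 24 23.0
print(round(s, 3), round(sem, 3))        # 6.377 2.017
print(tuple(round(v, 2) for v in ci))    # (20.05, 27.95)
```

Changing `0.975` to `0.995` gives the 99% interval by the same route.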
Example 2 — IQR outlier detection on a wage dataset
Dataset: {55, 62, 70, 58, 95, 64, 72, 68, 75, 60, 88, 66, 73, 71, 67, 245} ($k salaries)
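1. Sort: {55, 58, 60, 62, 64, 66, 67, 68, 70, 71, 72, 73, 75, 88, 95, 245}; n = 16.
2. Find the quartiles by linear interpolation at index (p/100)·(n − 1), counting from 0. Q1 sits at index 3.75, so Q1 = 62 + 0.75·(64 − 62) = 63.5. Q3 sits at index 11.25, so Q3 = 73 + 0.25·(75 − 73) = 73.5. IQR = 73.5 − 63.5 = 10.
3. Lower fence = 63.5 − 1.5·10 = 48.5; upper fence = 73.5 + 1.5·10 = 88.5.
4. Two values lie above the upper fence and are flagged: 95 and 245. The value 88 falls just inside it.
5. Apply the 3·IQR test: extreme upper fence = 73.5 + 3·10 = 103.5. The value 95 stays below it, so it is a mild outlier; 245 is far above it, so it is an extreme outlier.
6. Decision: investigate the 245k entry (and glance at the 95k one) before computing the mean and SD; consider reporting the median ($69k) alongside or instead of the mean.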
Example 3 — 95% confidence interval for a measurement series
Dataset: {12.4, 12.6, 12.3, 12.5, 12.7, 12.4, 12.5, 12.6, 12.4, 12.5, 12.6, 12.3, 12.5, 12.4, 12.5} mm
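1. Sum = 187.2; n = 15; x̄ = 12.48 mm.
2. Σ(x − x̄)² = 0.184, so the sample standard deviation is s = √(0.184 / 14) ≈ 0.115 mm.
3. SEM = s / √15 ≈ 0.0296 mm.
4. 95% CI = 12.48 ± 1.96·0.0296 ≈ [12.422, 12.538] mm.
5. Report: mean diameter 12.48 mm, 95% CI [12.42, 12.54] mm. The interval for the mean sits inside a 12.5 ± 0.1 mm tolerance band, but that says nothing about individual parts: three of the 15 readings (12.3, 12.3, and 12.7 mm) fall outside the band.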
Example 4 — Z-score probability for a single observation
Dataset: x̄ = 75, s = 8, query value x = 90
1. z = (90 − 75) / 8 = 1.875.
2. P(X < 90) = Φ(1.875) ≈ 0.9696, so about 96.96% of values lie at or below 90, assuming the data are approximately normal.
3. P(X > 90) = 1 − Φ(1.875) ≈ 0.0304, so about 3.04% of values lie above 90.
4. Interpretation: a score of 90 sits in roughly the top 3% of the distribution.
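Example 4 can be checked with `statistics.NormalDist` from the Python standard library. The sketch assumes, as the example does, that the values follow a normal distribution with the stated mean and standard deviation.

```python
from statistics import NormalDist

dist = NormalDist(mu=75, sigma=8)  # mean and SD from Example 4
x = 90

z = (x - dist.mean) / dist.stdev   # standard deviations above the mean
p_below = dist.cdf(x)              # P(X < 90)
p_above = 1 - p_below              # P(X > 90)

print(z)                  # 1.875
print(round(p_below, 4))  # 0.9696
print(round(p_above, 4))  # 0.0304
```

For an interval probability, subtract two `cdf` calls, for example `dist.cdf(b) - dist.cdf(a)`.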
Real-World Applications
Business analytics
Daily-revenue dashboards almost always quote a mean and a standard deviation. Add the median and IQR to spot whether a recent revenue dip is a single bad day or a sustained pattern.
Market research
Survey data lives or dies by descriptive statistics — mean satisfaction scores, percentage in each segment, modal preference. Use the frequency table for categorical-style rating scales.
Academic research
Methods sections begin with descriptive statistics for every key variable: mean, SD, range, and n at minimum. Confidence intervals are increasingly expected for any primary outcome.
Machine learning
Feature engineering starts with descriptive statistics — outliers, skewness, and scale all change how a model behaves. Run every numeric column through this calculator before you scale, transform, or fit.
Healthcare
Vitals are reported as means with confidence intervals; outliers are clinically actionable. The IQR fence is the standard rule of thumb for flagging measurements worth a second look.
Finance
Return distributions are notoriously fat-tailed, so descriptive stats with skewness and kurtosis are the entry point to risk modelling. The coefficient of variation is a common way to compare relative volatility across instruments.
Manufacturing
Statistical process control sets control limits at x̄ ± 3·σ; specification limits are a separate matter and come from the design requirements. Daily descriptive summaries are how engineers spot a drifting process before it produces scrap.
Sports analytics
Per-game stats, season averages, percentile rankings, and outlier games all rely on the same descriptive toolkit. The frequency table is perfect for shot-distance distributions.
Quality control
Acceptance sampling and Six Sigma compute defect rates, process capability indices, and out-of-control rules in terms of mean and SD — the calculator returns every input those formulas need.
Government statistics
Census data, economic indicators, and public-health releases publish full descriptive distributions — quartiles, percentiles, IQR — because the median and IQR are robust to the extreme tails of real-world data.
Statistics Reference Guide
Mean
The arithmetic average, x̄ = Σx ÷ n. Uses every observation, so a single extreme value pulls it noticeably. The natural measure of centre for roughly symmetric data.
Median
The middle value of the sorted dataset (or the average of the two middle values for even n). Unaffected by extreme values, so the right choice for skewed distributions like incomes or response times.
Mode
The value (or values) that appears most often. A dataset can be unimodal, bimodal, multimodal, or have no mode at all when every observation is unique.
Variance
Mean of the squared deviations from the mean. Reported in squared units of the underlying data — variance of dollars is in dollars². Take the square root to get the standard deviation.
Standard deviation
Square root of the variance. Reported in the same units as the data, so it's directly comparable to the mean. The most widely-used measure of spread.
Sample standard deviation
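Uses Bessel's correction: divides Σ(x − x̄)² by n − 1 instead of n. This makes s² an unbiased estimator of the population variance. Its square root s still slightly underestimates σ on average, but it is the standard estimate when you only have a sample.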
Range
Maximum minus minimum. Trivial to compute but driven entirely by the two most extreme values — use it as a quick sanity check, not as a serious measure of spread.
Quartiles
Q1 (25th percentile), Q2 (median, 50th percentile), and Q3 (75th percentile) split the sorted data into four equal pieces.
Percentiles
The value below which a given percentage of observations falls — the 90th percentile means 90% of values are ≤ that point. Linear interpolation between order statistics is the standard method.
Outliers
Values flagged by the IQR fence (Q1 − 1.5·IQR, Q3 + 1.5·IQR). Extreme outliers use the 3·IQR fence. Always investigate before deleting.
Skewness
Third standardised moment. It is zero for a symmetric distribution; positive means a long right tail; negative means a long left tail. As a rule of thumb, an absolute value above 1 indicates strong asymmetry.
Kurtosis
Fourth standardised moment minus 3 (excess kurtosis). Zero matches the normal distribution; positive indicates fatter tails; negative indicates lighter tails.
Confidence interval
A range computed from the data that captures the true population parameter a known fraction of the time across repeated samples. 95% is the conventional reporting standard.
Normal distribution
The bell-shaped continuous distribution determined entirely by its mean and standard deviation. Many natural processes are approximately normal — and many statistical tests assume it.
Z-score
The number of standard deviations a value lies away from the mean. Standardises across datasets — a Z of 2 means "two SD above the mean" regardless of units.
Probability
Long-run proportion of observations expected at or below a value (left tail), above (right tail), or between two values, under the assumed distribution.
Methodology you can verify
Quartile and percentile values use linear interpolation between order statistics at index (p/100)·(n − 1), the rule NumPy and R use by default (Hyndman and Fan type 7). Variance and standard deviation report both the population (divide by n) and sample (divide by n − 1, Bessel correction) formulas. Skewness and excess kurtosis are computed from the third and fourth central moments. Normal-distribution probabilities use the Abramowitz & Stegun erf approximation, accurate to ≈ 1.5 × 10⁻⁷. Read more on the methodology and editorial policy pages.
Related Statistics Calculators
Pair the statistics calculator with related statistics and probability tools.
- Mean Median Mode Range Calculator: Full descriptive statistics (mean, median, mode, range, quartiles, variance, standard deviation, percentiles, skewness, outliers) with charts and step-by-step working.
- Sample Size Calculator: Required sample size, margin of error, confidence interval, confidence level, and finite population correction; five survey tools with charts.
- Z-Score Calculator: Z-score, percentile rank, tail probabilities, Z ↔ probability conversion, and probability between any two Z-scores with bell-curve visualisation.
- Probability Calculator: Two-event probabilities, unions, intersections, complements, normal distribution, confidence intervals, and step-by-step solutions across five integrated tools.
- Average Calculator: Mean, median, mode, range, standard deviation, variance, and geometric and harmonic mean from any list of numbers.