notes · 08

Distributions

A distribution is one of the first places where probability becomes visual. This page is built to help you move from definition to intuition: what a distribution is, how families differ, how parameters change shape, and why those choices matter in modelling.

Start from the idea, then move through families, then use the explorer. The goal is not memorising formulas — it is building shape intuition.


start here

What a distribution actually tells you

A distribution is the full description of uncertainty for a variable. It tells you what values are possible, how likely they are, and what overall shape those possibilities create.

The simple version

Imagine observing the same random process again and again. A distribution is the shape that repeated outcomes settle into.

It is not just a single number. It is the entire structure behind the numbers: centre, spread, asymmetry, tails, and support.

Distribution = the geometry of uncertainty.

In practice, this means distributions help you answer questions like: where do outcomes usually land, how variable are they, and how much mass sits in extreme outcomes?
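One way to see "the shape that repeated outcomes settle into" is to simulate a process many times and compare the observed frequencies with the exact probabilities. The sketch below is only an illustration: the two-dice process and the function names are assumptions chosen for this example, not something the page prescribes.

```python
import random
from collections import Counter

def empirical_distribution(n_draws, seed=0):
    """Roll two fair dice n_draws times; return the relative frequency of each sum."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_draws))
    return {total: counts[total] / n_draws for total in range(2, 13)}

def exact_distribution():
    """Exact probabilities for the sum of two fair dice."""
    return {total: (6 - abs(total - 7)) / 36 for total in range(2, 13)}

if __name__ == "__main__":
    exact = exact_distribution()
    for n in (100, 10_000, 100_000):
        freq = empirical_distribution(n)
        worst_gap = max(abs(freq[t] - exact[t]) for t in exact)
        print(f"n={n:>7}  largest gap from exact probability: {worst_gap:.4f}")
```

As the number of draws grows, the gap between observed frequencies and exact probabilities should shrink, which is the sense in which outcomes "settle" into a shape.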

Why this matters before modelling

Before fitting a model, you should ask what kind of variable you are dealing with. Is it bounded between 0 and 1? Is it strictly positive? Is it a count? Is it symmetric or heavily skewed?

Distributional thinking is what stops modelling from becoming mechanical. It makes you ask whether the mathematical family matches the data-generating behaviour.
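The questions above can be turned into a rough first check on a sample. This is a minimal sketch under stated assumptions: the function name, the labels, and the suggested families are illustrative only, and observed data can hint at the true support but never prove it.

```python
def describe_support(values):
    """Return a rough label for the observed range of a sample.

    The family suggestions are illustrative assumptions, not rules.
    """
    all_integers = all(float(v).is_integer() for v in values)
    lowest, highest = min(values), max(values)
    if all_integers and lowest >= 0:
        return "non-negative counts: consider Binomial, Poisson, Negative Binomial"
    if lowest >= 0 and highest <= 1:
        return "bounded in [0, 1]: consider Beta"
    if lowest >= 0:
        return "non-negative continuous: consider Exponential, or Log-Normal if zeros cannot occur"
    return "real line: consider Normal, Student-t"

if __name__ == "__main__":
    print(describe_support([0, 3, 1, 7]))
    print(describe_support([0.12, 0.90, 0.45]))
    print(describe_support([1.5, 20.3, 4.2]))
    print(describe_support([-1.2, 0.4, 3.3]))
```

The check only narrows the list of candidates. Symmetry, skew, and tail behaviour still have to be read from the data.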

first split

Continuous vs discrete families

The first useful distinction is not Normal vs Binomial. It is continuous vs discrete. That single split already tells you what kind of object you are modelling.

Continuous distributions

Values move across an interval or the real line. We usually describe them with a probability density function (PDF).

Normal Log-Normal Beta Exponential Student-t Uniform

Good for measurements, rates, proportions, times, losses, positive severities, and anything that is not naturally a count.

Discrete distributions

Values are countable: 0, 1, 2, 3, … We usually describe them with a probability mass function (PMF).

Bernoulli Binomial Poisson Geometric Negative Binomial

Good for counts, events, successes across trials, and rare-event frequency questions.

reading distributions

How to read the shape of a distribution

Before naming the family, learn to look at shape. These are the six things worth checking first.

Centre

Where does the mass live? Mean, median, and mode are three different ways of describing the typical region.

Spread

How far do values wander from the centre? Variance and standard deviation make the width visible.

Skew

Is one side longer or heavier than the other? Right-skew and left-skew already narrow the list of plausible families.

Tails

How much probability sits far from the centre? Heavy tails matter because models often fail in the extremes.

Support

What values are even possible? Support is often the simplest clue: counts, only positive values, or bounded values.

Parameter sensitivity

Small parameter changes can produce very different shapes. Understanding that movement is the real point of the explorer below.
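The shape checks above (centre, spread, skew, tails, support) can be computed directly from a sample. The following is a small sketch using only the Python standard library. The function name and the choice of moment-based skewness and excess kurtosis are assumptions made for illustration, and other definitions exist.

```python
import statistics

def shape_summary(sample):
    """Centre, spread, skew, tail, and range summaries for a list of numbers."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.pstdev(sample)            # population standard deviation
    z = [(x - mean) / sd for x in sample]     # standardised values
    return {
        "mean": mean,
        "median": statistics.median(sample),
        "std_dev": sd,
        "skewness": sum(v ** 3 for v in z) / n,             # > 0: longer right tail
        "excess_kurtosis": sum(v ** 4 for v in z) / n - 3,  # > 0: heavier tails than Normal
        "min": min(sample),                                 # observed range hints at support
        "max": max(sample),
    }

if __name__ == "__main__":
    for name, data in [("symmetric", [-2, -1, 0, 1, 2]), ("right-skewed", [1, 1, 1, 2, 10])]:
        summary = shape_summary(data)
        print(name, {key: round(value, 3) for key, value in summary.items()})
```

A mean well above the median together with positive skewness is a quick signal of a right-skewed variable.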

learning sequence

A useful way to learn distributions

Start with support

Ask what values are possible. Real line? Positive only? 0 to 1? Integer counts? This immediately rules families in or out.

Then look at shape

Check symmetry, skew, spread, and tail behaviour. This is where Normal, Log-Normal, Beta, and Student-t begin to separate.

Then touch the parameters

Use sliders. Watch what happens when you move μ, σ, α, β, λ, or ν. Intuition comes from moving the shape, not only reading the formula.

Only then care about formulas

Formulas matter, but they become much easier once you already know what the family is trying to do geometrically.

interactive explorer

Move the parameters and watch the shape change

Use the tabs below as a guided progression. Start with Normal, then move to positive skew (Log-Normal), bounded space (Beta), waiting-time logic (Exponential), heavy tails (Student-t), and finally discrete counts (Binomial / Poisson).

Normal is the cleanest starting point: symmetric, centred, and easy to standardise. Use it to understand what “mean moves location” and “standard deviation changes width” really look like.

Parameters

μ (mean): 0

σ (std dev): 1

f(x) = 1 / (σ√(2π)) · e^{−(x−μ)² / (2σ²)}

What to notice

Mean: 0

Variance: 1

Skewness: 0

Kurtosis: 3

Move μ left and right: the whole bell shifts. Increase σ: the bell spreads and flattens.
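The two slider effects can be checked numerically. This sketch writes the Normal density by hand and cross-checks it against `statistics.NormalDist` from the Python standard library. The helper name `normal_pdf` and the example parameter values are assumptions for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Shifting mu moves the peak without changing its height.
peak_standard = normal_pdf(0.0, mu=0.0, sigma=1.0)
peak_shifted = normal_pdf(2.0, mu=2.0, sigma=1.0)

# Doubling sigma halves the peak height: the bell spreads and flattens.
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)

# Cross-check the hand-written formula against the standard library.
library_value = NormalDist(mu=0.0, sigma=1.0).pdf(1.5)

if __name__ == "__main__":
    print(f"peak at sigma=1: {peak_standard:.4f}, peak at sigma=2: {peak_wide:.4f}")
```

Because the total area must stay equal to 1, a wider bell has to be a lower bell.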

Log-Normal

Parameters

μ (log-mean): 0

σ (log-std): 0.5

f(x) = 1 / (xσ√(2π)) · e^{−(ln x−μ)² / (2σ²)}, x > 0

What to notice

Mean: 1.133

Median: 1.000

Skewness: positive

This is a “positive only” family. As σ grows, the right tail stretches hard.
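The readout values for the Log-Normal follow from closed-form expressions, sketched here with the standard library. The function name is an assumption, while the formulas are the standard ones for a Log-Normal with log-mean μ and log-std σ.

```python
import math

def lognormal_summary(mu, sigma):
    """Closed-form summaries of a Log-Normal with log-mean mu and log-std sigma."""
    spread = math.exp(sigma ** 2)
    return {
        "mean": math.exp(mu + sigma ** 2 / 2),
        "median": math.exp(mu),
        "mode": math.exp(mu - sigma ** 2),
        "skewness": (spread + 2) * math.sqrt(spread - 1),  # always positive
    }

if __name__ == "__main__":
    for sigma in (0.25, 0.5, 1.0):
        summary = lognormal_summary(0.0, sigma)
        print(sigma, {key: round(value, 3) for key, value in summary.items()})
```

The median stays at e^μ while the mean and skewness grow with σ, which is the numerical counterpart of the right tail stretching.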

Beta

Parameters

α (alpha): 2

β (beta): 5

f(x) = x^{α−1} (1−x)^{β−1} / B(α,β), x ∈ [0,1]

What to notice

Mean: 0.286

Mode: 0.200

Variance: 0.026

Beta is one of the best examples of “bounded shape flexibility.” It can be flat, U-shaped, left-skewed, right-skewed, or mound-shaped.
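A short sketch can reproduce the Beta readout and show two of the shapes mentioned. The function names are assumptions. The density uses B(α,β) = Γ(α)Γ(β) / Γ(α+β), and the mode formula applies only when both parameters exceed 1.

```python
import math

def beta_pdf(x, alpha, beta):
    """Beta density on (0, 1), normalised by B(alpha, beta)."""
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / norm

def beta_summary(alpha, beta):
    """Mean, mode, and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return {
        "mean": alpha / total,
        # Interior mode exists only when alpha > 1 and beta > 1.
        "mode": (alpha - 1) / (total - 2) if alpha > 1 and beta > 1 else None,
        "variance": alpha * beta / (total ** 2 * (total + 1)),
    }

if __name__ == "__main__":
    print(beta_summary(2, 5))
    # Beta(1, 1) is flat; Beta(0.5, 0.5) is U-shaped (higher near the edges).
    print(beta_pdf(0.3, 1, 1), beta_pdf(0.05, 0.5, 0.5), beta_pdf(0.5, 0.5, 0.5))
```

Evaluating `beta_pdf` on a grid for a few (α, β) pairs is a quick way to see the flat, U-shaped, skewed, and mound-shaped cases without the sliders.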

Student-t

Parameters

ν (degrees of freedom): 3

Lower ν = heavier tails. Higher ν = closer to Normal.

What to notice

Mean: 0 (for ν > 1)

Variance: 3.000

Normal limit: as ν → ∞

Student-t is a good lesson in tail risk: the centre can look familiar while extremes behave very differently.
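To put numbers on the tail difference, the sketch below compares two-sided tail probabilities at the same cut-off for a standard Normal and a Student-t with ν = 3. It relies on a closed-form CDF that holds only for ν = 3. The function names and the cut-off values are assumptions for illustration. Note that the comparison is at equal x, not at equal numbers of standard deviations, since the t with ν = 3 has variance 3.

```python
import math
from statistics import NormalDist

def t3_cdf(x):
    """CDF of Student-t with nu = 3 (closed form, valid for nu = 3 only)."""
    u = x / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi

def two_sided_tail_t3(k):
    """P(|X| > k) for Student-t with nu = 3."""
    return 2 * (1 - t3_cdf(k))

def two_sided_tail_normal(k):
    """P(|Z| > k) for a standard Normal."""
    return 2 * (1 - NormalDist().cdf(k))

if __name__ == "__main__":
    for k in (2, 3, 5):
        print(f"k={k}: Normal {two_sided_tail_normal(k):.6f}, Student-t(3) {two_sided_tail_t3(k):.6f}")
```

Near the centre the two curves look alike, but the ratio of tail probabilities grows quickly as the cut-off moves outward.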

Binomial

Parameters

n (trials): 20

p (probability): 0.3

P(X=k) = C(n,k) · p^k · (1−p)^{n−k}

What to notice

Mean: 6.0

Variance: 4.2

Std Dev: 2.05

This is the cleanest way to think about “number of successes out of n attempts.”
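The Binomial readout can be recovered directly from the PMF. This is a minimal sketch with the standard library. The function name is an assumption, and the mean and variance are computed by summing over the support rather than from the shortcuts np and np(1−p), so the two routes can be compared.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 20, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total_mass = sum(pmf)                                     # should be 1
mean = sum(k * mass for k, mass in enumerate(pmf))        # matches n * p
variance = sum((k - mean) ** 2 * mass for k, mass in enumerate(pmf))  # matches n * p * (1 - p)

if __name__ == "__main__":
    print(f"mass={total_mass:.6f} mean={mean:.3f} variance={variance:.3f} sd={sqrt(variance):.3f}")
```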

Uniform

Parameters

a (min): 0

b (max): 5

f(x) = 1 / (b−a), for a ≤ x ≤ b

What to notice

Mean: 2.500

Variance: 2.083

Entropy: ln(b−a) ≈ 1.609, the maximum for this support (flat density)

Uniform is useful as a baseline: equal weight across an interval, no preference inside the support.
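Because the density is constant, every Uniform quantity reduces to lengths. The sketch below is an illustration with assumed function names: probabilities are overlap length divided by total width, and the summaries follow from a and b alone.

```python
import math

def uniform_summary(a, b):
    """Density height, mean, variance, and differential entropy of Uniform(a, b)."""
    width = b - a
    return {
        "density": 1 / width,
        "mean": (a + b) / 2,
        "variance": width ** 2 / 12,
        "entropy": math.log(width),
    }

def uniform_probability(lo, hi, a, b):
    """P(lo <= X <= hi) for X ~ Uniform(a, b): overlap length / total width."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

if __name__ == "__main__":
    print(uniform_summary(0, 5))
    print(uniform_probability(1, 2, 0, 5))
```

Any two sub-intervals of equal length inside [a, b] get the same probability, which is what "no preference inside the support" means in practice.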

compare shapes

Overlay multiple families on one chart

Comparison makes intuition sharper. Turn families on and off to compare symmetry, boundedness, skew, and tail thickness.

Toggle overlay

Normal (0, 1)

Student-t (ν=3)

Log-Normal (0, 0.5)

Exponential (λ=1)

Beta (2, 5)

Uniform (0, 4)

modelling context

Where distributions show up in risk and modelling

Distributions are not just exam material. They are modelling choices. They shape what kinds of outcomes your model can represent.

Distribution	Type	Support	Shape intuition	Typical use
Normal	Continuous	(−∞, +∞)	Symmetric, bell-shaped, light tails	Standardisation, latent-variable thinking, CLT approximations
Log-Normal	Continuous	(0, +∞)	Positive, right-skewed	Positive severity-type variables, skewed magnitudes
Beta	Continuous	[0,1]	Flexible bounded shape	Recovery rates, proportions, bounded probabilities
Exponential	Continuous	[0,+∞)	Fast decay, memoryless	Waiting time intuition, hazard-style reasoning
Student-t	Continuous	(−∞,+∞)	Heavier tails than Normal	Small-sample caution, extreme outcome sensitivity
Binomial	Discrete	{0,1,…,n}	Count of successes	Simple default-count thinking across fixed trials
Poisson	Discrete	{0,1,2,…}	Rare-event frequency	Event counts, low-frequency environments
Uniform	Continuous	[a,b]	Flat support, no preferred region	Baseline simulation logic, simple uncertainty bounds

foundations

Key concepts worth keeping

PDF vs PMF

Density vs mass

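Continuous variables use density: probability comes from the area under the curve over an interval, and any single point has probability zero. Discrete variables assign mass to exact outcomes. Same modelling idea, different mathematical object.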
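The difference shows up as soon as values are computed. In this sketch (example parameters are assumptions), a narrow Normal has a density value above 1, which would be impossible for a probability, while a PMF consists of probabilities that sum to 1.

```python
from math import comb
from statistics import NormalDist

# Continuous: a density value is not a probability and can exceed 1.
narrow = NormalDist(mu=0.0, sigma=0.1)
peak_density = narrow.pdf(0.0)

# Probability comes from area: here the area under the curve over [-0.1, 0.1].
prob_within_one_sigma = narrow.cdf(0.1) - narrow.cdf(-0.1)

# Discrete: each PMF value is itself a probability (Binomial with n = 4, p = 0.5).
pmf = [comb(4, k) * 0.5 ** 4 for k in range(5)]

if __name__ == "__main__":
    print(f"peak density: {peak_density:.3f}")
    print(f"P(-0.1 <= X <= 0.1): {prob_within_one_sigma:.4f}")
    print(f"PMF values: {pmf}, total mass: {sum(pmf)}")
```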

CDF

Cumulative probability

The CDF asks how much probability has accumulated by the time you reach x: F(x) = P(X ≤ x). It is the “at or below x” view.
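The accumulation view works the same way for both kinds of variable. The sketch below uses the standard library's `NormalDist.cdf` for a continuous case and a running sum of Poisson masses for a discrete case. The Poisson helper names and the rate λ = 2 are assumptions for illustration.

```python
import math
from statistics import NormalDist

# Continuous: interval probabilities are differences of CDF values.
standard = NormalDist()
prob_between = standard.cdf(1.0) - standard.cdf(-1.0)   # P(-1 < X <= 1)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): the running total of the PMF up to and including k."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1))

if __name__ == "__main__":
    print(f"P(-1 < Z <= 1) = {prob_between:.4f}")
    for k in range(6):
        print(f"P(X <= {k}) = {poisson_cdf(k, 2.0):.4f}")
```

In the discrete case the CDF is a staircase that jumps by the PMF value at each count, while in the continuous case it rises smoothly.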

Support

Possible values matter

The support of the variable often tells you more than the formula. Counts, bounded variables, and positive-only variables need different families.

Moments

Mean, variance, skew, tails

These are compact summaries of shape. They help compare families without reading every point on the curve.

MLE

Fitting a family

Maximum likelihood estimation asks which parameter values make the observed data most plausible under the assumed family.
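As a concrete instance, the Exponential family has a closed-form maximum likelihood estimate: the rate that maximises the log-likelihood n·ln λ − λ·Σx is λ̂ = n / Σx. The sketch below simulates data with an assumed true rate of 2 and checks that nearby rates score lower. The function names, seed, and sample size are illustrative assumptions.

```python
import math
import random

def exponential_log_likelihood(lam, data):
    """Log-likelihood of rate lam for i.i.d. Exponential data."""
    return len(data) * math.log(lam) - lam * sum(data)

def exponential_mle(data):
    """Closed-form maximum likelihood estimate of the rate: n / sum(x)."""
    return len(data) / sum(data)

rng = random.Random(42)
data = [rng.expovariate(2.0) for _ in range(5000)]   # simulated with true rate 2.0

lam_hat = exponential_mle(data)
ll_at_mle = exponential_log_likelihood(lam_hat, data)

if __name__ == "__main__":
    print(f"estimated rate: {lam_hat:.3f}")
    for factor in (0.8, 0.9, 1.0, 1.1, 1.2):
        ll = exponential_log_likelihood(lam_hat * factor, data)
        print(f"rate {lam_hat * factor:.3f}: log-likelihood {ll:.1f}")
```

For families without a closed form, the same log-likelihood would be maximised numerically. The idea is unchanged.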

Model risk

Wrong family, wrong story

If the distributional assumption is wrong, tail behaviour, calibration, uncertainty estimates, and decisions built on them can all drift off-course.
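A small experiment makes this tangible. The sketch below is an illustration under assumed settings (simulated Log-Normal data with μ = 0 and σ = 0.5, a fixed seed, and a Normal fitted by sample mean and standard deviation). It shows two symptoms of a wrong family: probability assigned to impossible negative values, and an understated upper quantile.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(7)
# Strictly positive, right-skewed data (simulated; parameters are assumptions).
losses = [rng.lognormvariate(0.0, 0.5) for _ in range(20_000)]

# Wrong family: fit a Normal using the sample mean and standard deviation.
fitted = NormalDist.from_samples(losses)

# Symptom 1: the fitted Normal puts mass on negative values that cannot occur.
p_negative = fitted.cdf(0.0)

# Symptom 2: the fitted Normal understates the 99th percentile.
normal_q99 = fitted.inv_cdf(0.99)
true_q99 = math.exp(0.0 + 0.5 * NormalDist().inv_cdf(0.99))   # Log-Normal quantile

if __name__ == "__main__":
    print(f"P(X < 0) under fitted Normal: {p_negative:.4f}")
    print(f"99th percentile: fitted Normal {normal_q99:.3f}, true Log-Normal {true_q99:.3f}")
```

The fitted mean and variance can be close to the truth while the tail and the support are both wrong, which is why moments alone do not validate a family.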

summary

What to leave this page with

A distribution is not just a formula. It is a claim about how uncertainty is structured.

The most useful learning path is: first understand support, then read shape, then move parameters, then connect the family to a modelling use case.

Once that becomes intuitive, the formulas stop looking abstract — they start looking like compressed descriptions of behaviour.