Gamma Model Tutorial#
The Gamma model is for continuous positive outcomes with right-skewed distributions.
When to Use#
Use the Gamma model when:
Outcome is continuous and strictly positive
Data is right-skewed (long right tail)
Variance increases with the mean
Examples: insurance claims, healthcare costs, time durations, income
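The first two criteria above can be checked directly on an outcome vector before fitting. The following is a minimal sketch using only NumPy; the helper name `gamma_suitability_checks` and the synthetic data are illustrative assumptions and are not part of `deep_inference`.

```python
import numpy as np

def gamma_suitability_checks(y):
    """Return simple diagnostics for whether a Gamma model is plausible."""
    y = np.asarray(y, dtype=float)
    centered = y - y.mean()
    # Sample skewness: third central moment divided by the cubed standard deviation
    skewness = np.mean(centered**3) / np.std(y)**3
    return {
        "strictly_positive": bool(np.all(y > 0)),
        "skewness": float(skewness),
        "right_skewed": bool(skewness > 0),
    }

# Synthetic, illustrative outcomes (hypothetical data, not from the tutorial)
rng = np.random.default_rng(0)
y_demo = rng.gamma(shape=2.0, scale=50.0, size=5000)
checks = gamma_suitability_checks(y_demo)
print(checks)
```

A positive skewness together with strictly positive values suggests the Gamma family is worth trying. Whether the variance grows with the mean is easier to judge after grouping outcomes by predicted mean.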
Mathematical Setup#
Data Generating Process#
\(Y \mid X, T \sim \text{Gamma}(k, \mu/k)\) (shape \(k\), scale \(\mu/k\))
where: \(\mu = \exp(\alpha(X) + \beta(X) \cdot T)\)
And:
\(k\) is the shape parameter (controls variance)
\(\mu\) is the mean: \(E[Y] = \mu\)
\(\text{Var}[Y] = \mu^2 / k\)
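The mean and variance relations above can be verified by simulation. This sketch assumes the shape/scale parameterization used by NumPy, with scale set to \(\mu / k\); the values of `k` and `mu` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 2.0    # shape parameter
mu = 10.0  # target mean (illustrative value)

# NumPy's gamma takes (shape, scale); scale = mu / k gives E[Y] = mu
y = rng.gamma(shape=k, scale=mu / k, size=200_000)

sample_mean = y.mean()
sample_var = y.var()
theory_var = mu**2 / k  # Var[Y] = mu^2 / k

print(f"mean: {sample_mean:.2f} (theory {mu:.2f})")
print(f"var:  {sample_var:.2f} (theory {theory_var:.2f})")
```

Increasing `k` while holding `mu` fixed shrinks the variance without moving the mean.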
Estimand#
The estimand is \(\mu^* = E[\beta(X)]\): the average effect of \(T\) on the log-mean across the covariate distribution.
Loss Function#
Gamma deviance loss (up to constants).
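One explicit form of this loss, written here as an assumption that is consistent with the residual and Hessian weight listed under Influence Score Components, drops the terms of the Gamma deviance that do not depend on \(\theta = (\alpha, \beta)\). With \(\eta = \alpha + \beta T\) and \(\mu = \exp(\eta)\):

```latex
\begin{aligned}
\ell(Y, \theta) &= \frac{Y}{\mu} + \log \mu = Y e^{-\eta} + \eta, \\
\frac{\partial \ell}{\partial \eta} &= 1 - \frac{Y}{\mu} = r, \\
\frac{\partial^2 \ell}{\partial \eta^2} &= \frac{Y}{\mu} = W, \\
\nabla_\theta \ell &= r \cdot [1, T]^\top .
\end{aligned}
```

The unit deviance \(2\left[(Y - \mu)/\mu - \log(Y/\mu)\right]\) equals \(2\,\ell(Y, \theta)\) minus \(2(1 + \log Y)\), which is constant in \(\theta\).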
Influence Score Components#
| Component | Formula |
|---|---|
| Residual | \(r = 1 - Y/\mu\) |
| Hessian weight \(W\) | \(Y/\mu\) |
| Score \(\nabla\ell\) | \(r \cdot [1, T]\) |
Note: The Hessian depends on \(\theta\) through \(\mu = \exp(\alpha + \beta T)\).
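The components above can be checked numerically. The sketch below assumes the per-observation loss \(Y/\mu + \log\mu\) (the Gamma deviance up to constants) and treats \(\alpha\) and \(\beta\) as scalars for a single observation. The function names are illustrative and are not part of `deep_inference`.

```python
import numpy as np

def gamma_loss(y, t, alpha, beta):
    """Per-observation loss Y/mu + log(mu) with mu = exp(alpha + beta * t)."""
    mu = np.exp(alpha + beta * t)
    return y / mu + np.log(mu)

def gamma_score(y, t, alpha, beta):
    """Gradient with respect to (alpha, beta): r * [1, T] with r = 1 - Y/mu."""
    mu = np.exp(alpha + beta * t)
    r = 1.0 - y / mu
    return r * np.array([1.0, t])

def gamma_hessian(y, t, alpha, beta):
    """Hessian with respect to (alpha, beta): W * [1, T][1, T]^T with W = Y/mu."""
    mu = np.exp(alpha + beta * t)
    w = y / mu
    v = np.array([1.0, t])
    return w * np.outer(v, v)

# Illustrative values, not taken from the tutorial
y, t, alpha, beta = 3.0, 0.7, 0.4, 0.5
eps = 1e-6

# Central finite differences of the loss in alpha and beta
fd_grad = np.array([
    (gamma_loss(y, t, alpha + eps, beta) - gamma_loss(y, t, alpha - eps, beta)) / (2 * eps),
    (gamma_loss(y, t, alpha, beta + eps) - gamma_loss(y, t, alpha, beta - eps)) / (2 * eps),
])
score = gamma_score(y, t, alpha, beta)
print("score matches finite differences:", np.allclose(fd_grad, score, atol=1e-6))

# The Hessian changes when theta changes, because W = Y/mu depends on theta
h_a = gamma_hessian(y, t, alpha, beta)
h_b = gamma_hessian(y, t, alpha + 0.5, beta)
print("Hessian depends on theta:", not np.allclose(h_a, h_b))
```

The second check illustrates the note above: shifting \(\alpha\) changes \(\mu\), and with it the weight \(W\).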
Complete Example#
import numpy as np
from deep_inference import structural_dml

# Generate synthetic data
np.random.seed(42)
n = 2000
X = np.random.randn(n, 10)
T = np.random.randn(n)

# True parameters
alpha_true = 2.0 + 0.3 * X[:, 0]
beta_true = 0.5 + 0.2 * X[:, 0]  # Heterogeneous effect
mu_true = beta_true.mean()  # Target estimand mu* = E[beta(X)]

# Generate Gamma outcomes
mu = np.exp(alpha_true + beta_true * T)  # Conditional mean of Y
shape = 2.0
Y = np.random.gamma(shape, mu / shape, size=n)  # scale = mu / shape, so E[Y] = mu
print(f"True mu* = {mu_true:.6f}")

# Run inference
result = structural_dml(
    Y=Y, T=T, X=X,
    family='gamma',
    hidden_dims=[64, 32],
    epochs=100,
    n_folds=50,
    lr=0.01
)
print(result.summary())
Expected Results#
From Eval 01: Parameter Recovery:
| Family | Corr(α) | Corr(β) | Status |
|---|---|---|---|
| gamma | 0.993 | 0.990 | PASS |
The influence function correction produces valid confidence intervals. See Validation for full results.
Real-World Applications#
Healthcare Costs#
Estimate the effect of a treatment on medical expenditure:
# Y = medical costs ($)
# T = treatment indicator
# X = (age, comorbidities, insurance type, ...)
# Target: E[beta(X)] = average effect on log-cost
result = structural_dml(Y, T, X, family='gamma')
Insurance Claims#
Estimate how policy features affect claim amounts:
# Y = claim amount
# T = deductible level
# X = (policyholder demographics, ...)
# Target: E[beta(X)] = average effect on log expected claim per unit of deductible (a semi-elasticity)
result = structural_dml(Y, T, X, family='gamma')
Key Takeaways#
Right-skewed positive data: Gamma is ideal when outcomes are strictly positive and variance increases with the mean
Hessian depends on theta: Requires three-way splitting (automatic in structural_dml)
Log-link interpretation: \(\beta\) represents the effect on the log-mean, so \(\exp(\beta)\) is the multiplicative effect on the mean per unit of \(T\)
Shape parameter: Higher shape = lower variance relative to the mean
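The last two takeaways can be made concrete with a little arithmetic. This sketch uses an illustrative single value of \(\beta\) and a few illustrative shape values; the coefficient of variation follows from \(\text{Var}[Y] = \mu^2 / k\).

```python
import numpy as np

# Log-link interpretation: a one-unit increase in T multiplies the mean by exp(beta)
beta = 0.5  # illustrative value
multiplier = np.exp(beta)
percent_change = 100.0 * (multiplier - 1.0)
print(f"exp(beta) = {multiplier:.4f}, i.e. a {percent_change:.1f}% change in the mean")

# Shape parameter: sd / mean = sqrt(mu^2 / k) / mu = 1 / sqrt(k), independent of mu
shapes = [0.5, 2.0, 8.0]
cvs = [1.0 / np.sqrt(k) for k in shapes]
for k, cv in zip(shapes, cvs):
    print(f"shape k = {k}: coefficient of variation = {cv:.3f}")
```

When \(\beta(X)\) varies across units, \(\exp(E[\beta(X)])\) is generally not the same as the average of \(\exp(\beta(X))\), so the multiplicative reading applies most directly to a given value of \(\beta\).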