Poisson Model Tutorial#
The Poisson model handles count data with heterogeneous treatment effects.
When to Use#
Use the Poisson model when:
Outcome is a non-negative integer (0, 1, 2, …)
Variance approximately equals the mean (conditional on covariates)
Examples: patent counts, doctor visits, accidents
Mathematical Setup#
Data Generating Process#
\(Y \mid X, T \sim \text{Poisson}(\lambda)\)
Where: \(\lambda = \exp(\alpha(X) + \beta(X) \cdot T)\)
The log-link ensures \(\lambda > 0\).
Estimand#
The average treatment effect on the log-rate: \(\mu^* = E[\beta(X)]\).
Loss Function#
Poisson negative log-likelihood (up to constants): \(\ell = \lambda - Y \log \lambda\), with \(\lambda = \exp(\alpha(X) + \beta(X) \cdot T)\).
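A minimal NumPy sketch of this loss, assuming the per-observation form \(\lambda - y \log \lambda\) with the constant \(\log(y!)\) dropped. The helper `poisson_nll` is a hypothetical name used for illustration; it is not claimed to be part of `deep_inference`.

```python
import numpy as np

def poisson_nll(y, t, alpha, beta):
    """Per-observation Poisson negative log-likelihood, dropping log(y!)."""
    eta = alpha + beta * t          # linear index on the log scale
    return np.exp(eta) - y * eta    # lambda - y * log(lambda)

# Sanity check: with beta = 0 the loss is minimized where exp(alpha) = y,
# i.e. the fitted rate equals the observed count.
y, t = 3.0, 1.0
alphas = np.linspace(-1.0, 3.0, 4001)
losses = poisson_nll(y, t, alphas, beta=0.0)
alpha_best = alphas[np.argmin(losses)]
print(f"argmin alpha = {alpha_best:.3f}, log(y) = {np.log(y):.3f}")
```

The grid search is only there to make the minimizer visible; a real fit would minimize the average of this loss over the data with a gradient-based optimizer.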
Influence Score Components#
| Component | Formula |
|---|---|
| Residual \(r\) | \(Y - \lambda\) |
| Hessian weight \(W\) | \(\lambda\) |
| Score \(\nabla\ell\) | \(-r \cdot [1, \tilde{T}]\) |
Note: The Hessian weight \(W = \lambda\) means observations with a high expected count get more weight.
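The components above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the library's internal code: `poisson_components` is a hypothetical helper, and \(\tilde{T}\) is taken to be whatever treatment value enters the model (the array `t` here). The finite-difference check confirms that the score is the gradient of \(\lambda - y \log \lambda\) with respect to \((\alpha, \beta)\).

```python
import numpy as np

def poisson_components(y, t, alpha, beta):
    """Residual, Hessian weight, score and Hessian for each observation."""
    lam = np.exp(alpha + beta * t)
    r = y - lam                                        # residual
    W = lam                                            # Hessian weight
    design = np.stack([np.ones_like(t), t], axis=-1)   # rows are [1, T]
    score = -r[:, None] * design                       # gradient of the loss
    hess = W[:, None, None] * design[:, :, None] * design[:, None, :]
    return r, W, score, hess

rng = np.random.default_rng(0)
t = rng.normal(size=5)
y = rng.poisson(2.0, size=5).astype(float)
alpha, beta = 0.4, 0.3
r, W, score, hess = poisson_components(y, t, alpha, beta)

def loss(a, b):
    eta = a + b * t
    return np.exp(eta) - y * eta

# Central finite differences of the loss in alpha and beta
eps = 1e-6
fd_alpha = (loss(alpha + eps, beta) - loss(alpha - eps, beta)) / (2 * eps)
fd_beta = (loss(alpha, beta + eps) - loss(alpha, beta - eps)) / (2 * eps)
print("score matches d/d alpha:", np.allclose(score[:, 0], fd_alpha, atol=1e-5))
print("score matches d/d beta: ", np.allclose(score[:, 1], fd_beta, atol=1e-5))
```

Each per-observation Hessian is \(W \cdot [1, T]^\top [1, T]\), so its top-left entry equals \(\lambda\) itself.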
Complete Example#
import numpy as np
from deep_inference import structural_dml
# Generate count data
np.random.seed(42)
n = 2000
X = np.random.randn(n, 10)
T = np.random.randn(n)
# True structural functions
alpha_true = 1.0 + 0.2 * X[:, 0]
beta_true = 0.3 + 0.1 * X[:, 0]
lam = np.exp(alpha_true + beta_true * T)
Y = np.random.poisson(lam).astype(float)
mu_true = beta_true.mean()
print(f"True mu* = {mu_true:.6f}")
print(f"Mean count = {Y.mean():.2f}")
print(f"Max count = {Y.max()}")
# Run inference
result = structural_dml(
    Y=Y, T=T, X=X,
    family='poisson',
    hidden_dims=[64, 32],
    epochs=100,
    n_folds=50,
    lr=0.01
)
print(result.summary())
Interpreting Coefficients#
With the log-link, \(\beta\) represents a semi-elasticity:
A unit increase in \(T\) changes \(E[Y]\) by approximately \(100 \cdot \beta\)% when \(\beta\) is small; the exact change is \(100 \cdot (e^{\beta} - 1)\)%.
Example Interpretation#
If \(\hat{\mu} = 0.05\), then on average a 1-unit increase in treatment increases the expected count by about 5%.
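How good the "100 times beta percent" reading is depends on the size of the coefficient. The short sketch below compares it with the exact multiplicative change \(e^{\beta} - 1\) implied by the log-link; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def approx_pct_change(beta):
    """Semi-elasticity approximation: 100 * beta percent."""
    return 100.0 * beta

def exact_pct_change(beta):
    """Exact change in E[Y] for a one-unit increase in T under a log-link."""
    return 100.0 * (np.exp(beta) - 1.0)

for beta in [0.01, 0.05, 0.3, 1.0]:
    print(
        f"beta = {beta:4.2f}: "
        f"approx {approx_pct_change(beta):7.2f}%, "
        f"exact {exact_pct_change(beta):7.2f}%"
    )
```

For a coefficient of 0.05 the two readings nearly coincide (the exact change is roughly 5.1%), while for coefficients near 1 the approximation understates the effect substantially, so the exact formula is the safer one to report.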
Real-World Applications#
Patent Counts#
# Y = number of patents filed
# T = R&D spending (log)
# X = (firm size, industry, prior patents, ...)
# Target: E[beta(X)] = average R&D elasticity of patenting
result = structural_dml(Y, T, X, family='poisson')
Doctor Visits#
# Y = number of doctor visits per year
# T = insurance generosity
# X = (age, health status, income, ...)
# Target: E[beta(X)] = average effect of insurance on utilization
result = structural_dml(Y, T, X, family='poisson')
Traffic Accidents#
# Y = number of accidents at intersection
# T = speed limit
# X = (traffic volume, weather, road design, ...)
# Target: E[beta(X)] = average effect of speed on accidents
result = structural_dml(Y, T, X, family='poisson')
Poisson vs Negative Binomial#
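If your count data shows overdispersion (variance exceeds the mean, conditional on covariates), consider the Negative Binomial model instead. The raw comparison here is only a rough screen, because the marginal variance also exceeds the mean whenever \(\lambda\) varies across observations: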
# Check for overdispersion
print(f"Mean: {Y.mean():.2f}")
print(f"Variance: {Y.var():.2f}")
if Y.var() > 1.5 * Y.mean():
    print("Consider using NegBin model")
    result = structural_dml(Y, T, X, family='negbin')
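The raw variance-to-mean ratio can flag data that are in fact conditionally Poisson. The sketch below simulates counts with the same design as the Complete Example and compares the raw ratio with a Pearson dispersion statistic, the mean of \((Y - \lambda)^2 / \lambda\), which should sit near 1 when the Poisson model is correct. It uses the true \(\lambda\) from the simulation as a stand-in for fitted rates; with real data you would need fitted values, and how to obtain them from `structural_dml` is not shown here.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 10))
T = rng.normal(size=n)

# Same structural functions as in the Complete Example
alpha_true = 1.0 + 0.2 * X[:, 0]
beta_true = 0.3 + 0.1 * X[:, 0]
lam = np.exp(alpha_true + beta_true * T)
Y = rng.poisson(lam).astype(float)

# Raw check: marginal variance over marginal mean
raw_ratio = Y.var() / Y.mean()

# Conditional check: Pearson dispersion, using lam in place of fitted rates
pearson = np.mean((Y - lam) ** 2 / lam)

print(f"Raw variance/mean:  {raw_ratio:.2f}")
print(f"Pearson dispersion: {pearson:.2f}")
```

In this simulation the raw ratio comes out above 1 even though the data are Poisson given the covariates, while the Pearson statistic stays close to 1. A Pearson statistic well above 1 is the stronger signal that a Negative Binomial model is worth trying.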
Key Takeaways#
Log-link interpretation: Coefficients are semi-elasticities
Weight = lambda: Observations with high expected counts get more influence
Check for overdispersion: Use NegBin if variance >> mean
Count data is common: Many economic outcomes are counts