Gallery
Eight validated examples demonstrating deep-inference across model families.
At a Glance

| # | Model | Outcome | Treatment | Covariates | Result |
|---|---|---|---|---|---|
| 1 | Linear | Log wages | Experience (years) | Job embeddings (64-dim) | CI covers true |
| 2 | Logit | Purchase (0/1) | Discount (%) | Product embeddings (64-dim) | CI covers true |
| 3 | Poisson | Citations | Open Access (0/1) | Abstract embeddings (64-dim) | CI covers true |
| 4 | Tobit | Donation ($) | Match ratio | Donor demographics | CI covers true |
| 5 | Gamma | Claim amount ($) | Deductible ($) | Policyholder features | CI covers true |
| 6 | Weibull | Months to churn | Discount offered | Customer profile | CI covers true |
| 7 | Multinomial Logit | Transport mode | Travel time (min) | Commuter attributes | CI covers true |
| 8 | Gaussian | Part diameter (mm) | Machine speed | Sensor readings | CI covers true |
All eight achieve valid 95% CI coverage with influence function correction.
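To make "valid 95% CI coverage" concrete, the following is a minimal Monte Carlo sketch that is independent of deep-inference. It checks how often a normal-approximation interval for a simple sample mean contains the true value. The estimator, sample sizes, and the function name `coverage_rate` are illustrative assumptions, not the package's validation code.

```python
import math
import random
from statistics import NormalDist


def coverage_rate(true_mean=1.0, n=200, n_sims=1000, seed=0):
    """Fraction of simulated 95% CIs for a sample mean that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        est = sum(sample) / n
        var = sum((x - est) ** 2 for x in sample) / (n - 1)
        se = math.sqrt(var / n)
        # Count the replication as a hit if the interval covers the truth
        if est - z * se <= true_mean <= est + z * se:
            hits += 1
    return hits / n_sims


rate = coverage_rate()
print(f"Empirical coverage: {rate:.3f}")
```

A well-calibrated interval should give an empirical rate close to 0.95; rates well below that indicate standard errors that are too small.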
1. Linear: Wage Returns to Experience
Question: How does experience affect wages? Does the effect vary by job type?
import numpy as np  # used by the post-processing snippets in later examples
from deep_inference import structural_dml

result = structural_dml(
    Y=wages, T=experience, X=job_embeddings,
    family='linear',
    epochs=200, n_folds=50
)
print(result.summary())
Validation: Corr(true, estimated) = 0.985
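The validation metric compares the true per-observation effects with the estimated ones, which is only possible on simulated data where the truth is known. As a rough sketch of that computation, the snippet below builds hypothetical true effects, perturbs them to stand in for estimates, and computes the Pearson correlation. The arrays and noise levels are made up for illustration and do not reproduce the reported figure.

```python
import math
import random


def pearson_corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
# Hypothetical heterogeneous effects beta(X_i) from a simulation
beta_true = [0.5 + 0.3 * rng.gauss(0, 1) for _ in range(1000)]
# Stand-in for model output: truth plus small estimation noise
beta_est = [b + 0.05 * rng.gauss(0, 1) for b in beta_true]

corr = pearson_corr(beta_true, beta_est)
print(f"Corr(true, estimated) = {corr:.3f}")
```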
2. Logit: Discount Effectiveness
Question: Do discounts increase purchases? Which products respond most?
result = structural_dml(
    Y=purchased, T=discount_pct, X=product_embeddings,
    family='logit',
    epochs=200, n_folds=50
)
print(result.summary())
# Who should get discounts?
beta_hat = result.theta_hat[:, 1]
high_responders = beta_hat > np.median(beta_hat)
Validation: Corr(true, estimated) = 0.421
3. Poisson: Open Access Citation Advantage
Question: Does Open Access increase citations? Which papers benefit most?
result = structural_dml(
    Y=citations, T=open_access, X=abstract_embeddings,
    family='poisson',
    epochs=200, n_folds=50
)
print(result.summary())
# Citation multiplier
print(f"OA multiplier: {np.exp(result.mu_hat):.2f}x")
Validation: Corr(true, estimated) = 0.709
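The multiplier reading of the coefficient follows from a log link: if the expected count is exp(alpha + beta * T), switching a binary treatment from 0 to 1 scales the expected count by exp(beta). The sketch below checks this with hypothetical values of `alpha` and `beta`. It assumes a log-link Poisson mean and does not call the package.

```python
import math


def expected_count(alpha, beta, treated):
    """Expected count under a log link: exp(alpha + beta * treated)."""
    return math.exp(alpha + beta * treated)


alpha, beta = 1.2, 0.25  # hypothetical intercept and Open Access effect

baseline = expected_count(alpha, beta, 0)
with_oa = expected_count(alpha, beta, 1)
multiplier = with_oa / baseline  # equals exp(beta), independent of alpha

print(f"Baseline: {baseline:.2f}, with OA: {with_oa:.2f}, multiplier: {multiplier:.2f}x")
```

When effects vary across papers, the exponential of an average coefficient is not the same as the average of per-paper multipliers, so the two summaries should not be used interchangeably.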
4. Tobit: Charitable Donations
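Question: Does employer matching increase donations, and by how much once the mass of zero donations is accounted for?
Many donors give $0. The Tobit model treats each observed donation as a latent giving propensity censored at zero, so the point mass at $0 is modeled explicitly rather than ignored.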
result = structural_dml(
    Y=donation_amount, T=match_ratio, X=donor_features,
    family='tobit',
    epochs=200, n_folds=50
)
print(result.summary())
# Fraction of donors at zero
print(f"Zero mass: {(donation_amount == 0).mean():.1%}")
Validation: eval_01 recovery PASS (RMSE < 0.15, Corr > 0.8). See Tobit tutorial.
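For readers unfamiliar with censoring, a textbook Tobit log-likelihood with censoring from below at zero can be sketched as follows. Censored observations contribute the probability that the latent outcome is at or below zero, and uncensored ones contribute the usual normal density. The function name and parameterization are assumptions for illustration; the package's internal loss may differ.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def tobit_loglik(y, mu, sigma):
    """Log-likelihood of one observation left-censored at zero.

    mu is the latent mean (for example alpha + beta * T) and sigma > 0
    is the latent standard deviation.
    """
    if y <= 0:
        # Censored: P(latent <= 0) = Phi(-mu / sigma).
        # Note: this can underflow to log(0) for extreme mu / sigma.
        return math.log(_STD_NORMAL.cdf(-mu / sigma))
    # Uncensored: normal density of y around mu with scale sigma
    z = (y - mu) / sigma
    return math.log(_STD_NORMAL.pdf(z)) - math.log(sigma)


print(tobit_loglik(0.0, mu=0.0, sigma=1.0))  # censored observation
print(tobit_loglik(3.0, mu=1.0, sigma=2.0))  # positive donation
```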
5. Gamma: Insurance Claims
Question: Do higher deductibles reduce claim severity? Which policyholders are most price-sensitive?
Claim amounts are strictly positive and right-skewed, which matches the support and shape of the Gamma family.
result = structural_dml(
    Y=claim_amount, T=deductible, X=policyholder_features,
    family='gamma',
    epochs=200, n_folds=50
)
print(result.summary())
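# Average effect on the log expected claim per unit of deductible (a semi-elasticity,
# assuming a log link); multiply by 100 for an approximate percent change
print(f"Semi-elasticity: {result.mu_hat:.4f}")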
Validation: eval_01 recovery PASS. See Gamma tutorial.
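To express a deductible coefficient as a percent change per $100, one option is the conversion below. It assumes a log link and a coefficient measured per dollar of deductible; neither is confirmed here for the package's Gamma family, and the coefficient value is hypothetical.

```python
import math


def pct_change_per_step(beta, step=100.0):
    """Percent change in the expected outcome when the treatment rises by
    `step` units, assuming a log link: 100 * (exp(beta * step) - 1)."""
    return 100.0 * (math.exp(beta * step) - 1.0)


beta_per_dollar = -0.0005  # hypothetical effect per $1 of deductible
pct = pct_change_per_step(beta_per_dollar)
print(f"Expected claim changes by {pct:.2f}% per $100 of deductible")
```

For small coefficients the exact conversion is close to the linear approximation 100 * beta * step, but the two diverge as the step grows.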
6. Weibull: Customer Churn
Question: Does offering a discount extend customer lifetime? Who benefits most from retention offers?
Subscription durations are positive and often right-skewed, and the churn hazard can rise or fall with tenure. The Weibull family captures a hazard that changes monotonically over time.
result = structural_dml(
    Y=months_subscribed, T=discount_offered, X=customer_profile,
    family='weibull',
    epochs=200, n_folds=50
)
print(result.summary())
# Who retains longest with discount?
beta_hat = result.theta_hat[:, 1]
best_targets = beta_hat > np.percentile(beta_hat, 75)
Validation: eval_01 recovery PASS. See Weibull tutorial.
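The statement that hazard rates change over time can be illustrated with the standard Weibull hazard, h(t) = (k / lambda) * (t / lambda)^(k - 1). A shape k below 1 gives a falling hazard, k equal to 1 a constant one, and k above 1 a rising one. The shape/scale parameterization and the numbers below are assumptions for illustration and may not match how the package parameterizes the family.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


months = [1, 6, 12, 24]
for shape in (0.7, 1.0, 1.5):  # falling, constant, rising hazard
    hazards = [weibull_hazard(m, shape, scale=12.0) for m in months]
    print(f"shape={shape}: " + ", ".join(f"{h:.4f}" for h in hazards))
```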
7. Multinomial Logit: Transportation Mode Choice
Question: How does travel time affect mode choice (car, bus, train)? Which commuters are most responsive?
With J=3 alternatives and K=2 attributes (time, cost), the model estimates heterogeneous preferences.
result = structural_dml(
    Y=chosen_mode, T=alternative_attributes, X=commuter_features,
    family='multinomial_logit',
    n_alternatives=3, n_attributes=2,
    epochs=300, patience=50, n_folds=50
)
print(result.summary())
Validation: eval_09 coverage 98% (SE ratio 0.97). See Multinomial tutorial.
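As a sketch of what J=3 alternatives and K=2 attributes mean in a conditional-logit setup, the snippet below computes choice probabilities for one commuter from a time coefficient and a cost coefficient. The utility specification, coefficient values, and attribute values are hypothetical and are not taken from the package.

```python
import math


def choice_probabilities(times, costs, beta_time, beta_cost):
    """Softmax choice probabilities over alternatives for one commuter."""
    utilities = [beta_time * t + beta_cost * c for t, c in zip(times, costs)]
    u_max = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical commuter facing car, bus, train
probs = choice_probabilities(
    times=[20.0, 35.0, 30.0],  # minutes
    costs=[6.0, 2.5, 4.0],     # dollars
    beta_time=-0.08,
    beta_cost=-0.30,
)
for mode, p in zip(["car", "bus", "train"], probs):
    print(f"{mode}: {p:.3f}")
```

In a heterogeneous model the two coefficients would vary with commuter features, so each commuter gets a different set of probabilities from the same attributes.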
8. Gaussian: Manufacturing Quality Control
Question: Does machine speed affect part diameter? Is the variance also heterogeneous?
Unlike family='linear', the Gaussian family estimates a covariate-dependent noise scale \(\sigma(X)\) as a third parameter, so the variance \(\sigma(X)^2\) can be heteroscedastic.
result = structural_dml(
    Y=part_diameter, T=machine_speed, X=sensor_readings,
    family='gaussian',
    epochs=200, n_folds=50
)
print(result.summary())
# Estimated noise level
sigma_hat = np.exp(result.theta_hat[:, 2])
print(f"Mean sigma: {sigma_hat.mean():.4f}")
Validation: eval_01 recovery PASS (sigma recovery RMSE < 0.05). See the full family list.
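The snippet above exponentiates the third parameter, which suggests the noise scale is stored on the log scale. Assuming that parameterization, a per-observation Gaussian negative log-likelihood can be sketched as below. The function is illustrative and not the package's loss.

```python
import math


def gaussian_nll(y, mu, log_sigma):
    """Negative log-likelihood of y under Normal(mu, exp(log_sigma) ** 2).

    Working with log_sigma keeps the standard deviation positive without
    constraining the optimizer.
    """
    sigma = math.exp(log_sigma)
    z = (y - mu) / sigma
    return 0.5 * math.log(2 * math.pi) + log_sigma + 0.5 * z * z


# Same residual, two noise levels: a larger sigma penalizes the miss less
# but pays a higher log_sigma term.
print(gaussian_nll(3.0, mu=1.0, log_sigma=0.0))
print(gaussian_nll(3.0, mu=1.0, log_sigma=math.log(2.0)))
```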
Run It Yourself
# Full gallery with validation output
python tutorials/06_multimodal_gallery.py
Individual tutorials are available for each family: Linear | Logit | Poisson | Tobit | Gamma | Weibull | Multinomial | Gumbel | NegBin
See Multimodal Tutorial for detailed code with real embedding examples (BERT, ResNet, CLIP).