Flowchart#
Use this decision tree to select the right model family, target functional, and inference regime for your data.
Interactive Decision Flowchart#
flowchart TD
subgraph START [" "]
A([I have data Y, T, X])
end
A --> Q1{What type of<br/>outcome Y?}
%% Outcome type branching
Q1 -->|Continuous<br/>unbounded| LINEAR[family='linear']
Q1 -->|Binary 0/1| LOGIT[family='logit']
Q1 -->|Count 0,1,2...| COUNT{Overdispersed?}
Q1 -->|Positive<br/>continuous| POS{Distribution shape?}
Q1 -->|Censored| TOBIT[family='tobit']
COUNT -->|No| POISSON[family='poisson']
COUNT -->|Yes| NEGBIN[family='negbin']
POS -->|Multiplicative errors| GAMMA[family='gamma']
POS -->|Extreme values| GUMBEL[family='gumbel']
POS -->|Time-to-event| WEIBULL[family='weibull']
%% Target selection
LINEAR --> T1{Target functional?}
LOGIT --> T2{Target functional?}
POISSON --> T3[target='beta']
NEGBIN --> T3
GAMMA --> T3
GUMBEL --> T3
WEIBULL --> T3
TOBIT --> T4{Target functional?}
T1 -->|Average parameter| BETA_LIN[target='beta']
T1 -->|Custom function| CUSTOM[target_fn=...]
T2 -->|Average parameter| BETA_LOG[target='beta']
T2 -->|Average marginal effect| AME[target='ame']
T2 -->|Custom function| CUSTOM
T4 -->|Latent Y*| LATENT[target='latent']
T4 -->|Observed Y| OBS[target='observed']
%% Regime detection
BETA_LIN --> R_B[Regime B<br/>Analytic Lambda]
BETA_LOG --> R1{Treatment<br/>randomized?}
AME --> R1
T3 --> R1
LATENT --> R_C
OBS --> R_C
CUSTOM --> R_C[Regime C<br/>Estimate Lambda]
R1 -->|Yes + known distribution| R_A[Regime A<br/>Compute Lambda]
R1 -->|No / observational| R_C
%% Final API calls
R_A --> API_A["inference(...,<br/>is_randomized=True,<br/>treatment_dist=Normal())"]
R_B --> API_B["structural_dml(...,<br/>family='linear')"]
R_C --> API_C["structural_dml(...,<br/>family='logit')"]
%% Styling
classDef question fill:#fff3cd,stroke:#856404
classDef model fill:#d4edda,stroke:#155724
classDef target fill:#cce5ff,stroke:#004085
classDef regime fill:#f8d7da,stroke:#721c24
classDef api fill:#e2e3e5,stroke:#383d41
class Q1,COUNT,POS,T1,T2,T4,R1 question
class LINEAR,LOGIT,POISSON,NEGBIN,GAMMA,GUMBEL,WEIBULL,TOBIT model
class BETA_LIN,BETA_LOG,AME,LATENT,OBS,T3,CUSTOM target
class R_A,R_B,R_C regime
class API_A,API_B,API_C api
Decision 1: Model Family#
Choose based on your outcome variable type:
Outcome Type |
Family |
Link Function |
Example Use Cases |
|---|---|---|---|
Continuous (unbounded) |
|
Identity |
Wages, prices, test scores |
Binary (0/1) |
|
Logit |
Purchase decisions, clicks, conversions |
Count (0, 1, 2, …) |
|
Log |
Website visits, order counts |
Overdispersed count |
|
Log |
Insurance claims, rare events |
Positive continuous |
|
Log |
Expenditures, durations |
Extreme values |
|
Identity |
Maximum temperatures, flood levels |
Time-to-event |
|
Log |
Survival times, equipment failure |
Censored |
|
Identity |
Demand with stockouts |
How to Check for Overdispersion#
For count data, compare the variance to the mean:
variance = np.var(Y)
mean = np.mean(Y)
overdispersion_ratio = variance / mean
if overdispersion_ratio > 1.5:
print("Use family='negbin'")
else:
print("Use family='poisson'")
Decision 2: Target Functional#
The target functional determines what quantity you’re estimating:
target='beta' (Default)#
Estimates the average structural parameter: \(\mu^* = \mathbb{E}[\beta(X)]\)
result = structural_dml(Y, T, X, family='logit', target='beta')
# result.mu_hat estimates E[beta(X)]
target='ame' (Average Marginal Effect)#
For binary outcomes, estimates the effect on probability scale:
result = inference(Y, T, X, model='logit', target='ame', t_tilde=0.0)
# result.mu_hat estimates the AME at T=0
target_fn=... (Custom Target)#
Define any target functional with automatic Jacobian computation:
import torch
def my_target(x, theta, t_tilde):
"""Custom target: probability at treatment level t_tilde."""
alpha, beta = theta[..., 0], theta[..., 1]
eta = alpha + beta * t_tilde
return torch.sigmoid(eta)
result = inference(Y, T, X, model='logit', target_fn=my_target, t_tilde=0.5)
Decision 3: Regime & Lambda Strategy#
The regime determines how \(\Lambda(x) = \mathbb{E}[\ell_{\theta\theta}|X=x]\) is obtained:
Regime |
Condition |
Lambda Strategy |
Cross-Fitting |
|---|---|---|---|
A |
RCT with known \(F_T\) |
Compute via MC integration |
2-way |
B |
Linear model |
Analytic closed-form |
2-way |
C |
Observational, nonlinear |
Estimate with ridge regression |
3-way |
Regime A: Randomized Experiments#
When treatment is randomized and you know the distribution:
from deep_inference import inference
from deep_inference.lambda_.compute import Normal
result = inference(
Y, T, X,
model='logit',
target='beta',
is_randomized=True,
treatment_dist=Normal(mean=0, std=1)
)
Regime B: Linear Models#
For linear models, Lambda has a closed-form solution:
result = structural_dml(Y, T, X, family='linear')
# Automatically uses analytic Lambda
Regime C: Observational Studies#
For nonlinear models with observational data (default):
result = structural_dml(Y, T, X, family='logit')
# Uses ridge regression to estimate Lambda (default)
Common Paths#
Path 1: Logistic Demand (Observational)#
Binary purchase decisions with heterogeneous price sensitivity:
import numpy as np
from deep_inference import structural_dml
# Data: Y = purchase (0/1), T = price, X = customer features
result = structural_dml(
Y=Y, T=T, X=X,
family='logit',
hidden_dims=[64, 32],
epochs=100,
n_folds=50
)
print(result.summary())
# mu_hat = average price sensitivity E[beta(X)]
Path through flowchart: Binary outcome → logit → target='beta' → Observational → Regime C → structural_dml()
Path 2: RCT with Binary Outcome#
A/B test where treatment assignment is known:
from deep_inference import inference
from deep_inference.lambda_.compute import Bernoulli
result = inference(
Y=Y, T=T, X=X,
model='logit',
target='ame',
t_tilde=0.0,
is_randomized=True,
treatment_dist=Bernoulli(p=0.5)
)
print(result.summary())
# mu_hat = average marginal effect on P(Y=1)
Path through flowchart: Binary outcome → logit → target='ame' → Randomized → Regime A → inference()
Path 3: Linear Model#
Continuous outcome with linear structure:
result = structural_dml(
Y=Y, T=T, X=X,
family='linear',
hidden_dims=[64, 32],
epochs=100,
n_folds=50
)
print(result.summary())
# mu_hat = E[beta(X)], the average treatment effect
Path through flowchart: Continuous outcome → linear → target='beta' → Regime B → structural_dml()
Path 4: Custom Loss + Custom Target#
For non-standard models, define your own loss and target:
import torch
from deep_inference import inference
def custom_loss(y, t, theta):
"""Custom negative log-likelihood."""
alpha, beta = theta[..., 0], theta[..., 1]
mu = alpha + beta * t
return (y - mu) ** 2 # Example: squared error
def custom_target(x, theta, t_tilde):
"""Custom target functional."""
return theta[..., 1] # Just return beta
result = inference(
Y=Y, T=T, X=X,
loss_fn=custom_loss,
target_fn=custom_target,
theta_dim=2
)
Path through flowchart: Custom → target_fn=... → Regime C → inference()
Quick Reference#
Parameter Cheat Sheet#
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
str |
|
Model family: linear, logit, poisson, etc. |
|
str |
|
Target functional: beta, ame, latent, observed |
|
callable |
None |
Custom target function |
|
list |
|
Neural network hidden layer sizes |
|
int |
100 |
Training epochs |
|
int |
50 |
Number of cross-fitting folds |
|
str |
|
Lambda estimation: ridge, lgbm, aggregate |
|
bool |
False |
Whether treatment is randomized |
|
object |
None |
Treatment distribution for Regime A |
|
float |
0.0 |
Evaluation point for AME |
API Summary#
# Legacy API (recommended for most use cases)
from deep_inference import structural_dml
result = structural_dml(Y, T, X, family='logit', ...)
# New API (for custom targets and regimes)
from deep_inference import inference
result = inference(Y, T, X, model='logit', target='ame', ...)
See Also#
Getting Started - Installation and first steps
Tutorials - Detailed worked examples
Theory - Mathematical foundations
Algorithm - Implementation details
API Reference - Full API documentation