Models Module#

Neural network architectures for structural estimation.

Network Classes#

StructuralNet#

The main neural network architecture for structural parameter estimation.

import torch
from deep_inference.models import StructuralNet

# Create network
net = StructuralNet(
    input_dim=10,           # Number of covariates
    hidden_dims=[64, 32],   # Hidden layer sizes
    theta_dim=2,            # Number of parameters (alpha, beta)
    dropout=0.1,            # Dropout rate
)

# Forward pass
X = torch.randn(100, 10)  # 100 observations, 10 covariates
theta = net(X)            # Shape: (100, 2)

Network Architecture#

Input (d features)
    |
Linear(d, hidden_dims[0])
    |
ReLU + Dropout
    |
Linear(hidden_dims[0], hidden_dims[1])
    |
ReLU + Dropout
    |
...
    |
Linear(hidden_dims[-1], theta_dim)
    |
Output (theta_dim parameters per observation)
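
The layer stack above can be read as a chain of (in, out) dimension pairs. The following is a minimal sketch in plain Python, not part of `deep_inference`: the helper names `layer_shapes` and `count_parameters` are hypothetical, and the parameter count assumes every `Linear` layer carries a bias term.

```python
def layer_shapes(input_dim, hidden_dims, theta_dim):
    """Return the (in_features, out_features) pair of each Linear layer."""
    dims = [input_dim, *hidden_dims, theta_dim]
    return list(zip(dims[:-1], dims[1:]))


def count_parameters(shapes):
    """Count weights plus biases, assuming each Linear layer has a bias."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


shapes = layer_shapes(input_dim=10, hidden_dims=[64, 32], theta_dim=2)
print(shapes)                    # [(10, 64), (64, 32), (32, 2)]
print(count_parameters(shapes))  # 2850
```

Under these assumptions the default `[64, 32]` architecture with 10 covariates has 2,850 trainable parameters, which gives a rough sense of scale when comparing it against the sample size.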

Usage with structural_dml#

The structural_dml function creates and trains the network internally:

from deep_inference import structural_dml

# Y, T, X: outcome, treatment, and covariate arrays, assumed already loaded
result = structural_dml(
    Y=Y, T=T, X=X,
    family='linear',
    hidden_dims=[64, 32],  # Network architecture
    epochs=100,            # Training epochs
    lr=0.01                # Learning rate
)

# Access estimated parameters
theta_hat = result.theta_hat  # (n, theta_dim) numpy array
alpha_hat = theta_hat[:, 0]
beta_hat = theta_hat[:, 1]

Training Configuration#

Parameter	Default	Description
`hidden_dims`	`[64, 32]`	Hidden layer sizes
`epochs`	`100`	Training epochs
`lr`	`0.01`	Learning rate
`batch_size`	`64`	Mini-batch size
`weight_decay`	`1e-4`	L2 regularization
`dropout`	`0.1`	Dropout rate

Architecture Guidelines#

Sample Size	Recommended Architecture
n < 1,000	`[32, 16]`
1,000 ≤ n < 10,000	`[64, 32]`
10,000 ≤ n < 100,000	`[128, 64, 32]`
n ≥ 100,000	`[256, 128, 64]`
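
The table can be encoded as a small lookup. This is a sketch only: `recommend_hidden_dims` is a hypothetical helper, not a function provided by `deep_inference`, and it assumes each boundary value (1,000; 10,000; 100,000) belongs to the larger architecture.

```python
def recommend_hidden_dims(n):
    """Map a sample size to the hidden-layer sizes from the guideline table.

    Assumption: a sample size exactly on a boundary gets the larger network.
    """
    if n < 1_000:
        return [32, 16]
    if n < 10_000:
        return [64, 32]
    if n < 100_000:
        return [128, 64, 32]
    return [256, 128, 64]


print(recommend_hidden_dims(5_000))  # [64, 32]
```

The result can be passed as the `hidden_dims` argument shown earlier.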

MultinomialLogitModel#

For multinomial logit (conditional logit / McFadden) choice models:

from deep_inference.models.multinomial import MultinomialLogitModel

model = MultinomialLogitModel(n_alternatives=3, n_attributes=2)
# theta_dim = (J-1) + K = 4
# theta = [alpha_1, ..., alpha_{J-1}, beta_1, ..., beta_K]
# Hessian: Fisher information (no Y dependence, depends on theta)
# Requires 3-way cross-fitting
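
To make the parameter layout concrete, the following sketch computes multinomial logit choice probabilities for a single observation in plain Python. It is not the library's implementation: `choice_probabilities` is a hypothetical name, and the code assumes that the first alternative is the base (intercept fixed at 0) and that each alternative has its own row of K attribute values.

```python
import math


def choice_probabilities(theta, attributes):
    """Multinomial logit choice probabilities for one observation.

    theta:      [alpha_1, ..., alpha_{J-1}, beta_1, ..., beta_K]
    attributes: J rows of K alternative-specific attribute values
    Assumption: the first alternative is the base, with intercept 0.
    """
    n_alternatives = len(attributes)
    alphas = [0.0, *theta[:n_alternatives - 1]]
    betas = theta[n_alternatives - 1:]
    utilities = [
        alpha + sum(b * x for b, x in zip(betas, row))
        for alpha, row in zip(alphas, attributes)
    ]
    # Subtract the maximum utility before exponentiating for numerical stability.
    u_max = max(utilities)
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# J = 3 alternatives, K = 2 attributes, so theta has (J - 1) + K = 4 entries.
theta = [0.5, -0.2, 1.0, -0.5]
attributes = [[1.0, 2.0], [0.5, 1.0], [0.0, 0.0]]
probs = choice_probabilities(theta, attributes)
```

Here the utilities are 0, 0.5, and -0.2, so the second alternative receives the highest probability and the three probabilities sum to one.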

CombinatorialModel#

For multi-treatment combinatorial experiments with binary treatment vectors T ∈ {0,1}^m.

from deep_inference.models.combinatorial import CombinatorialModel

model = CombinatorialModel(n_treatments=3, link='gen_sigmoid_ii')
# theta_dim = m + 2 = 5 for gen_sigmoid_ii
# theta = [θ₀, θ₁, θ₂, θ₃, θ₄]

Four link functions:

Link	Formula	θ_dim	Description
`multiplicative`	θ₀ · ∏(1 + θ_k · t_k)	m+1	Product interaction
`sigmoid`	a/(1+exp(-(θ₀ + Σ θ_k·t_k))) + b	m+1	Bounded response with fixed scale
`gen_sigmoid_i`	θ_{m+1} · σ(Σ θ_k·t_k)	m+1	Flexible scale, no intercept
`gen_sigmoid_ii`	θ_{m+1} · σ(θ₀ + Σ θ_k·t_k)	m+2	Most flexible (recommended)
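
The four formulas can be written out directly. The sketch below uses plain Python lists and is not the `CombinatorialModel` implementation; the ordering of entries inside `theta` for each link and the default values of the fixed constants `a` and `b` are assumptions made for illustration.

```python
import math


def sigma(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def multiplicative(theta, t):
    # Assumed layout: theta = [θ0, θ1, ..., θm]
    return theta[0] * math.prod(1.0 + th * tk for th, tk in zip(theta[1:], t))


def sigmoid(theta, t, a=1.0, b=0.0):
    # Assumed layout: theta = [θ0, θ1, ..., θm]; a and b are fixed, not estimated
    return a * sigma(theta[0] + sum(th * tk for th, tk in zip(theta[1:], t))) + b


def gen_sigmoid_i(theta, t):
    # Assumed layout: theta = [θ1, ..., θm, θ_{m+1}]; no intercept
    return theta[-1] * sigma(sum(th * tk for th, tk in zip(theta[:-1], t)))


def gen_sigmoid_ii(theta, t):
    # Assumed layout: theta = [θ0, θ1, ..., θm, θ_{m+1}]
    return theta[-1] * sigma(theta[0] + sum(th * tk for th, tk in zip(theta[1:-1], t)))


t = [1, 0, 1]  # treatments 1 and 3 active, m = 3
print(multiplicative([2.0, 0.5, 0.5, 0.5], t))       # 4.5
print(gen_sigmoid_ii([0.0, 0.0, 0.0, 0.0, 3.0], t))  # 1.5
```

In `gen_sigmoid_ii` the last entry scales the response, so the prediction is bounded by θ_{m+1} rather than by a constant chosen in advance.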

Hessian properties:

Uses Fisher information: 2·G_θ·G_θᵀ (does not depend on y)
hessian_depends_on_theta = True (G_θ depends on θ)
hessian_depends_on_y = False (Fisher information)
Requires 3-way cross-fitting (Regime C)
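
One way to see why y drops out is the short derivation below. It assumes a squared-error loss (suggested by the factor of 2 but not stated above), writes g(θ, t) for the link function, and takes G_θ to be the gradient of g with respect to θ.

```latex
\begin{aligned}
\ell(y, t, \theta) &= \bigl(y - g(\theta, t)\bigr)^2,
\qquad G_\theta = \nabla_\theta\, g(\theta, t), \\
\nabla_\theta^2\, \ell &= 2\, G_\theta G_\theta^\top
  - 2\,\bigl(y - g(\theta, t)\bigr)\, \nabla_\theta^2\, g(\theta, t), \\
\mathbb{E}\bigl[\nabla_\theta^2\, \ell \mid x, t\bigr] &= 2\, G_\theta G_\theta^\top
\quad \text{when } \mathbb{E}[y \mid x, t] = g(\theta, t).
\end{aligned}
```

Under that assumption the residual term has conditional mean zero, leaving only the outer product, which varies with θ through G_θ but not with y.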

Lambda computation for randomized experiments:

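# Compute Λ(x) via Monte Carlo for Regime A
# theta: fitted parameter tensor, assumed to be available from a trained network
import torch
t_samples = torch.randint(0, 2, (1000, 3)).float()  # 1000 uniform draws from {0,1}^3
Lambda = model.compute_lambda_integral(theta, t_samples)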
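
As a rough illustration of what such an integral might compute, the sketch below averages 2·G·Gᵀ over treatment vectors in plain Python. It is not the body of `compute_lambda_integral`: the helper names are hypothetical, G is assumed to be the gradient of the `gen_sigmoid_ii` link, and each treatment is assumed to be assigned independently with probability 0.5. With m = 3 there are only 8 treatment vectors, so the Monte Carlo average can be compared with exact enumeration.

```python
import itertools
import math
import random


def grad_gen_sigmoid_ii(theta, t):
    """Gradient of θ_{m+1}·σ(θ0 + Σ θk·tk) with respect to theta."""
    z = theta[0] + sum(th * tk for th, tk in zip(theta[1:-1], t))
    s = 1.0 / (1.0 + math.exp(-z))
    ds = theta[-1] * s * (1.0 - s)  # derivative through the sigmoid
    return [ds, *(ds * tk for tk in t), s]


def fisher_hessian(theta, t):
    """2·G·Gᵀ as a nested list; y does not enter."""
    g = grad_gen_sigmoid_ii(theta, t)
    return [[2.0 * gi * gj for gj in g] for gi in g]


def average_hessian(theta, treatments):
    """Average the Hessian over a collection of treatment vectors."""
    treatments = list(treatments)
    dim = len(theta)
    total = [[0.0] * dim for _ in range(dim)]
    for t in treatments:
        h = fisher_hessian(theta, t)
        for i in range(dim):
            for j in range(dim):
                total[i][j] += h[i][j]
    return [[v / len(treatments) for v in row] for row in total]


theta = [0.1, 0.4, -0.3, 0.2, 1.5]  # [θ0, θ1, θ2, θ3, θ4] for m = 3
rng = random.Random(0)
t_samples = [[rng.randint(0, 1) for _ in range(3)] for _ in range(1000)]

lambda_mc = average_hessian(theta, t_samples)  # Monte Carlo estimate
lambda_exact = average_hessian(theta, itertools.product([0, 1], repeat=3))  # all 8 vectors
```

Because the assignment distribution is known in a randomized experiment, the average needs only θ(x) and simulated treatment draws, not observed outcomes.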

Usage with MultiTreatmentATE:

from deep_inference.models.combinatorial import CombinatorialModel
from deep_inference.targets import MultiTreatmentATE

model = CombinatorialModel(n_treatments=3, link='gen_sigmoid_ii')
target = MultiTreatmentATE(model=model, treatment=[1, 0, 1])

Reference: Ye et al. (2025, Management Science) — DeDL: Debiased Deep Learning for Combinatorial Experiments

Custom Network Usage#

For advanced users who want to use the network directly:

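import torch
from deep_inference.models import StructuralNet
from deep_inference import LinearFamily

# Placeholder data; the shapes are assumptions: X is (n, 10), Y and T are (n,)
n = 100
X_tensor = torch.randn(n, 10)
T_tensor = torch.randn(n)
Y_tensor = torch.randn(n)

# Create network and family
net = StructuralNet(input_dim=10, hidden_dims=[64, 32], theta_dim=2)
family = LinearFamily()
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)

# Full-batch training loop
net.train()  # Enable dropout during training
for epoch in range(100):
    theta = net(X_tensor)  # (n, 2): per-observation (alpha, beta)
    loss = family.loss(Y_tensor, T_tensor, theta).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()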