# Models Module

Neural network architectures for structural estimation.

## Network Classes

### StructuralNet

The main neural network architecture for structural parameter estimation.

```python
import torch
from deep_inference.models import StructuralNet

# Create network
net = StructuralNet(
    input_dim=10,          # Number of covariates
    hidden_dims=[64, 32],  # Hidden layer sizes
    theta_dim=2,           # Number of parameters (alpha, beta)
    dropout=0.1            # Dropout rate
)

# Forward pass
X = torch.randn(100, 10)
theta = net(X)  # (100, 2)
```

## Network Architecture

```
Input (d features)
    |
Linear(d, hidden_dims[0])
    |
ReLU + Dropout
    |
Linear(hidden_dims[0], hidden_dims[1])
    |
ReLU + Dropout
    |
   ...
    |
Linear(hidden_dims[-1], theta_dim)
    |
Output (theta_dim parameters per observation)
```

## Usage with structural_dml

The `structural_dml` function creates and trains the network internally:

```python
from deep_inference import structural_dml

# Y, T, X are assumed to be existing arrays (outcome, treatment, covariates)
result = structural_dml(
    Y=Y, T=T, X=X,
    family='linear',
    hidden_dims=[64, 32],  # Network architecture
    epochs=100,            # Training epochs
    lr=0.01                # Learning rate
)

# Access estimated parameters
theta_hat = result.theta_hat  # (n, theta_dim) numpy array
alpha_hat = theta_hat[:, 0]
beta_hat = theta_hat[:, 1]
```

## Training Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `hidden_dims` | `[64, 32]` | Hidden layer sizes |
| `epochs` | `100` | Training epochs |
| `lr` | `0.01` | Learning rate |
| `batch_size` | `64` | Mini-batch size |
| `weight_decay` | `1e-4` | L2 regularization |
| `dropout` | `0.1` | Dropout rate |

## Architecture Guidelines

| Sample Size | Recommended Architecture |
|-------------|-------------------------|
| n < 1,000 | `[32, 16]` |
| 1,000 ≤ n < 10,000 | `[64, 32]` |
| 10,000 ≤ n < 100,000 | `[128, 64, 32]` |
| n ≥ 100,000 | `[256, 128, 64]` |

### MultinomialLogitModel

For multinomial logit (conditional logit / McFadden) choice models:

```python
from deep_inference.models.multinomial import MultinomialLogitModel

model = MultinomialLogitModel(n_alternatives=3, n_attributes=2)
# theta_dim = (J-1) + K = 4
# theta = [alpha_1, ..., alpha_{J-1}, beta_1, ..., beta_K]
# Hessian: Fisher information (no Y dependence, depends on theta)
# Requires 3-way cross-fitting
```

### CombinatorialModel

For multi-treatment combinatorial experiments with binary treatment vectors T ∈ {0,1}^m.

```python
from deep_inference.models.combinatorial import CombinatorialModel

model = CombinatorialModel(n_treatments=3, link='gen_sigmoid_ii')
# theta_dim = m + 2 = 5 for gen_sigmoid_ii
# theta = [θ₀, θ₁, θ₂, θ₃, θ₄]
```

**Four link functions:**

| Link | Formula | θ_dim | Description |
|------|---------|-------|-------------|
| `multiplicative` | θ₀ · ∏(1 + θ_k · t_k) | m+1 | Product interaction |
| `sigmoid` | a/(1+exp(-(θ₀ + Σ θ_k·t_k))) + b | m+1 | Bounded response with fixed scale |
| `gen_sigmoid_i` | θ_{m+1} · σ(Σ θ_k·t_k) | m+1 | Flexible scale, no intercept |
| `gen_sigmoid_ii` | θ_{m+1} · σ(θ₀ + Σ θ_k·t_k) | m+2 | Most flexible (recommended) |

**Hessian properties:**

- Uses Fisher information: 2·G_θ·G_θ' (does NOT depend on y)
- `hessian_depends_on_theta = True` (G_θ depends on θ)
- `hessian_depends_on_y = False` (Fisher information)
- Requires 3-way cross-fitting (Regime C)

**Lambda computation for randomized experiments:**

```python
# Compute Λ(x) via Monte Carlo for Regime A
# `model` is the CombinatorialModel created above; `theta` is assumed to be
# an existing tensor of parameter estimates
import torch

t_samples = torch.randint(0, 2, (1000, 3)).float()  # 1000 random draws from {0,1}^3
Lambda = model.compute_lambda_integral(theta, t_samples)
```

**Usage with MultiTreatmentATE:**

```python
from deep_inference.models.combinatorial import CombinatorialModel
from deep_inference.targets import MultiTreatmentATE

model = CombinatorialModel(n_treatments=3, link='gen_sigmoid_ii')
target = MultiTreatmentATE(model=model, treatment=[1, 0, 1])
```

*Reference: Ye et al. (2025, Management Science), DeDL: Debiased Deep Learning for Combinatorial Experiments*

---

## Custom Network Usage

For advanced users who want to use the network directly:

```python
import torch
from deep_inference.models import StructuralNet
from deep_inference import LinearFamily

# X_tensor, T_tensor, Y_tensor are assumed to be float tensors prepared
# beforehand; X_tensor has shape (n, 10) to match input_dim

# Create network and family
net = StructuralNet(input_dim=10, hidden_dims=[64, 32], theta_dim=2)
family = LinearFamily()
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)

# Training loop (full-batch gradient steps)
for epoch in range(100):
    theta = net(X_tensor)                                # (n, 2) parameters per observation
    loss = family.loss(Y_tensor, T_tensor, theta).mean() # average per-observation loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
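
### Link Function Sketch

The four link functions in the `CombinatorialModel` table can be checked by hand with a small plain-Python sketch. This is an illustration only, not the library's implementation: the helper name `combinatorial_link`, the parameter layouts described in its docstring, and the defaults `a=1.0`, `b=0.0` for the fixed-scale `sigmoid` link are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def combinatorial_link(theta, t, link="gen_sigmoid_ii", a=1.0, b=0.0):
    """Evaluate one link from the table for a binary treatment vector t.

    Hypothetical helper, not part of deep_inference. Assumed layouts:
      multiplicative, sigmoid: theta = [theta_0, theta_1, ..., theta_m]
      gen_sigmoid_i:           theta = [theta_1, ..., theta_m, theta_{m+1}]
      gen_sigmoid_ii:          theta = [theta_0, theta_1, ..., theta_m, theta_{m+1}]
    a and b are the fixed scale and offset of the plain sigmoid link.
    """
    m = len(t)
    expected = m + 2 if link == "gen_sigmoid_ii" else m + 1
    if len(theta) != expected:
        raise ValueError(f"{link} expects {expected} parameters, got {len(theta)}")

    if link == "multiplicative":
        # theta_0 * prod_k (1 + theta_k * t_k)
        out = theta[0]
        for theta_k, t_k in zip(theta[1:], t):
            out *= 1.0 + theta_k * t_k
        return out
    if link == "sigmoid":
        # a * sigma(theta_0 + sum_k theta_k * t_k) + b
        index = theta[0] + sum(th * tk for th, tk in zip(theta[1:], t))
        return a * sigmoid(index) + b
    if link == "gen_sigmoid_i":
        # theta_{m+1} * sigma(sum_k theta_k * t_k), no intercept
        index = sum(th * tk for th, tk in zip(theta[:m], t))
        return theta[m] * sigmoid(index)
    if link == "gen_sigmoid_ii":
        # theta_{m+1} * sigma(theta_0 + sum_k theta_k * t_k)
        index = theta[0] + sum(th * tk for th, tk in zip(theta[1:m + 1], t))
        return theta[m + 1] * sigmoid(index)
    raise ValueError(f"unknown link: {link}")


# m = 3 treatments, arm (1, 0, 1), gen_sigmoid_ii with m + 2 = 5 parameters
y = combinatorial_link([0.0, 0.0, 0.0, 0.0, 4.0], [1, 0, 1])  # 4 * sigma(0) = 2.0
```

The length check mirrors the θ_dim column of the table: every link takes m+1 parameters except `gen_sigmoid_ii`, which adds both an intercept and a free scale and so takes m+2.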