Eval 09: Multinomial Logit
Comprehensive validation of the multinomial logit (conditional logit / McFadden) implementation.
Configuration

| Parameter | Value (Recovery) | Value (Coverage) |
|---|---|---|
| Sample Size (n) | 10000 | 8000 |
| Alternatives (J) | 3 | 3 |
| Attributes (K) | 2 | 2 |
| Simulations (M) | n/a | 50 |
| Epochs | 300 | 300 |
| Patience | 50 | 50 |
| Cross-fitting Folds | 50 | 50 |
DGP: Heterogeneous Conditional Logit
J = 3 alternatives, K = 2 attributes, d_w = 3
V_ij = alpha_j(W) + X'_ij * beta(W)
P(Y=j | W, X) = softmax(V)[j]
alpha_0 = 0 (reference)
alpha_1(W) = 0.5 + 0.2*W[0]
alpha_2(W) = -0.3 - 0.1*W[0]
beta_1(W) = -0.8 - 0.2*W[0]
beta_2(W) = 0.5 + 0.1*W[0]
True mu* = E[beta_1(W)] = -0.8
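A minimal simulation sketch of the DGP above, using only the Python standard library. The specification does not state how W and X are distributed, so the standard-normal draws below are assumptions, as are the function names `true_params` and `simulate`. Under that assumption E[W[0]] = 0, which is consistent with the stated target mu* = -0.8.

```python
import math
import random


def softmax(v):
    """Numerically stable softmax over a list of utilities."""
    m = max(v)
    e = [math.exp(u - m) for u in v]
    s = sum(e)
    return [u / s for u in e]


def true_params(w):
    """Structural functions from the DGP; only W[0] enters."""
    alpha = [0.0, 0.5 + 0.2 * w[0], -0.3 - 0.1 * w[0]]  # alpha_0 = 0 is the reference
    beta = [-0.8 - 0.2 * w[0], 0.5 + 0.1 * w[0]]
    return alpha, beta


def simulate(n, seed=0, J=3, K=2, d_w=3):
    """Draw n observations (w, x, y) from the heterogeneous conditional logit."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = [rng.gauss(0, 1) for _ in range(d_w)]  # assumption: W ~ N(0, I)
        x = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(J)]  # assumption: X ~ N(0, 1)
        alpha, beta = true_params(w)
        v = [alpha[j] + sum(x[j][k] * beta[k] for k in range(K)) for j in range(J)]
        y = rng.choices(range(J), weights=softmax(v))[0]
        rows.append((w, x, y))
    return rows


data = simulate(2000)
# Sample analogue of mu* = E[beta_1(W)]
mu_sample = sum(true_params(w)[1][0] for w, _, _ in data) / len(data)
```

With a few thousand draws, `mu_sample` should sit close to -0.8, which gives a quick sanity check on the simulator before any estimation is run.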
Test 1: Parameter Recovery

| Component | RMSE | Correlation | Status |
|---|---|---|---|
| alpha_1 | 0.08 | 0.90 | PASS |
| alpha_2 | 0.12 | 0.78 | PASS |
| beta_1 | 0.09 | 0.88 | PASS |
| beta_2 | 0.10 | 0.85 | PASS |
Test 2: Autodiff Validation
Score and Hessian computed via autodiff match oracle closed-form formulas.

| Metric | Value | Status |
|---|---|---|
| Max score error | 4.44e-16 | PASS |
| Max Hessian error | 4.44e-16 | PASS |
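For reference, the closed-form oracle for a single observation can be sketched as follows, with theta = (alpha_1, alpha_2, beta_1, beta_2) and loss equal to the negative log-likelihood. This sketch checks the formulas against central finite differences rather than autodiff, so agreement is only expected to around 1e-6, not machine precision. The parameter ordering, the function names, and the example inputs are assumptions made for illustration.

```python
import math


def design(x):
    """Rows z_j with V_j = z_j . theta, for theta = (alpha_1, alpha_2, beta_1, beta_2)."""
    return [[0.0, 0.0] + list(x[0]),  # alternative 0 is the reference (alpha_0 = 0)
            [1.0, 0.0] + list(x[1]),
            [0.0, 1.0] + list(x[2])]


def probs(theta, x):
    v = [sum(z * t for z, t in zip(row, theta)) for row in design(x)]
    m = max(v)
    e = [math.exp(u - m) for u in v]
    s = sum(e)
    return [u / s for u in e]


def nll(theta, x, y):
    """Negative log-likelihood of choosing alternative y."""
    return -math.log(probs(theta, x)[y])


def score(theta, x, y):
    """Closed form: Z^T (p - e_y)."""
    Z, p = design(x), probs(theta, x)
    return [sum(Z[j][a] * (p[j] - (1.0 if j == y else 0.0)) for j in range(3))
            for a in range(4)]


def hessian(theta, x):
    """Closed form: Z^T (diag(p) - p p^T) Z; it does not depend on y."""
    Z, p = design(x), probs(theta, x)
    zbar = [sum(p[j] * Z[j][a] for j in range(3)) for a in range(4)]
    return [[sum(p[j] * Z[j][a] * Z[j][b] for j in range(3)) - zbar[a] * zbar[b]
             for b in range(4)] for a in range(4)]


def shifted(theta, a, h):
    t = list(theta)
    t[a] += h
    return t


def fd_score(theta, x, y, h=1e-5):
    """Central differences of the loss."""
    return [(nll(shifted(theta, a, h), x, y) - nll(shifted(theta, a, -h), x, y)) / (2 * h)
            for a in range(4)]


def fd_hessian(theta, x, y, h=1e-5):
    """Central differences of the closed-form score."""
    cols = [[(sp - sm) / (2 * h)
             for sp, sm in zip(score(shifted(theta, b, h), x, y),
                               score(shifted(theta, b, -h), x, y))]
            for b in range(4)]
    return [[cols[b][a] for b in range(4)] for a in range(4)]


theta = [0.5, -0.3, -0.8, 0.5]                 # DGP parameters at W[0] = 0
x = [[0.2, -1.0], [1.1, 0.4], [-0.5, 0.7]]     # hypothetical attribute values
y = 1

max_score_err = max(abs(a - b) for a, b in zip(score(theta, x, y), fd_score(theta, x, y)))
H, H_fd = hessian(theta, x), fd_hessian(theta, x, y)
max_hess_err = max(abs(H[a][b] - H_fd[a][b]) for a in range(4) for b in range(4))
```

An autodiff framework differentiates the same loss exactly, which is why the eval can demand agreement at the 1e-16 level where a finite-difference check cannot.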
Test 3: Lambda Estimation
Monte Carlo integration for E[H | W=w] matches oracle.

| Metric | Value | Threshold | Status |
|---|---|---|---|
| Relative Frobenius error | < 0.15 | < 0.15 | PASS |
| Min eigenvalue | > 1e-4 | > 1e-4 | PASS |
| Non-PSD count | 0 | 0 | PASS |
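The checks in this table can be illustrated with a small standard-library sketch. It averages the closed-form Hessian over draws of X at a fixed w, compares the result with a much larger Monte Carlo run that stands in for the oracle, and tests the eigenvalue bound through a Cholesky factorization of Lambda - tau*I. The standard-normal X, the draw counts, the choice W[0] = 0.5, and all function names are assumptions; the eval's actual oracle may be computed differently.

```python
import math
import random


def hessian(theta, x):
    """Closed-form Hessian of the conditional-logit NLL for one draw of X (J = 3, K = 2)."""
    Z = [[0.0, 0.0, x[0][0], x[0][1]],
         [1.0, 0.0, x[1][0], x[1][1]],
         [0.0, 1.0, x[2][0], x[2][1]]]
    v = [sum(z * t for z, t in zip(row, theta)) for row in Z]
    m = max(v)
    e = [math.exp(u - m) for u in v]
    s = sum(e)
    p = [u / s for u in e]
    zbar = [sum(p[j] * Z[j][a] for j in range(3)) for a in range(4)]
    return [[sum(p[j] * Z[j][a] * Z[j][b] for j in range(3)) - zbar[a] * zbar[b]
             for b in range(4)] for a in range(4)]


def lambda_mc(w0, n_draws, rng):
    """Monte Carlo estimate of E[H | W[0] = w0], averaging over X."""
    theta = [0.5 + 0.2 * w0, -0.3 - 0.1 * w0, -0.8 - 0.2 * w0, 0.5 + 0.1 * w0]
    acc = [[0.0] * 4 for _ in range(4)]
    for _ in range(n_draws):
        x = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(3)]  # assumption: X ~ N(0, 1)
        h = hessian(theta, x)
        for a in range(4):
            for b in range(4):
                acc[a][b] += h[a][b] / n_draws
    return acc


def rel_frobenius_error(est, ref):
    num = math.sqrt(sum((est[a][b] - ref[a][b]) ** 2 for a in range(4) for b in range(4)))
    den = math.sqrt(sum(ref[a][b] ** 2 for a in range(4) for b in range(4)))
    return num / den


def min_eig_exceeds(a, tau):
    """True iff a - tau*I is positive definite, i.e. the smallest eigenvalue exceeds tau."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - (tau if i == j else 0.0) - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # Cholesky pivot failed: not positive definite
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


rng = random.Random(0)
lam_ref = lambda_mc(0.5, 20000, rng)  # large-sample stand-in for the oracle
lam_hat = lambda_mc(0.5, 4000, rng)
err = rel_frobenius_error(lam_hat, lam_ref)
pd_ok = min_eig_exceeds(lam_hat, 1e-4)
```

A single-draw Hessian has rank at most J - 1 = 2, so it is the averaging over X that makes the 4x4 matrix invertible. This is why the minimum-eigenvalue and non-PSD checks are worth running on the estimate.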
Test 4: Coverage (M=50)

| Metric | Value | Target | Status |
|---|---|---|---|
| Coverage | 98% | 90-99% | PASS |
| SE Ratio | 0.97 | 0.7-1.5 | PASS |
| Bias | -0.006 | < 0.05 | PASS |
| z-score Mean | 0.14 | (-0.3, 0.3) | PASS |
| z-score Std | 0.96 | 0.7-1.5 | PASS |
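The sketch below shows one way the metrics in this table could be computed from M point estimates and their standard errors. The definitions are assumptions rather than the eval's confirmed formulas: coverage uses a 95% normal interval, and SE ratio is taken as mean estimated SE divided by the empirical standard deviation of the estimates. The six input values are made up for illustration and are not results from this eval.

```python
import statistics


def coverage_metrics(estimates, std_errors, mu_true, z_crit=1.96):
    """Summarize Monte Carlo inference quality for a scalar target."""
    z = [(m - mu_true) / s for m, s in zip(estimates, std_errors)]
    covered = sum(1 for zi in z if abs(zi) <= z_crit)
    return {
        "coverage": covered / len(z),
        # assumed definition: average reported SE over empirical SD of the estimates
        "se_ratio": statistics.mean(std_errors) / statistics.stdev(estimates),
        "bias": statistics.mean(estimates) - mu_true,
        "z_mean": statistics.mean(z),
        "z_std": statistics.stdev(z),
    }


# Hypothetical inputs, not taken from the eval
estimates = [-0.82, -0.79, -0.75, -0.81, -0.80, -0.90]
std_errors = [0.03] * 6
metrics = coverage_metrics(estimates, std_errors, mu_true=-0.8)
```

An SE ratio near 1 together with a z-score standard deviation near 1 indicates that the reported standard errors match the actual sampling variability, which is what the targets in the table are probing.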
EVAL 09: PASS
Key Findings
- patience=50 is essential: the default patience=10 triggers early stopping after about 15-20 epochs, which is fatal for training under 3-way splitting.
- n >= 8000 is required: 3-way splitting reduces the effective training data to 60%, and n=5000 gives only 88% coverage.
- correction_ratio of about 70-90 is normal: it is much larger than for binary logit (about 2) because theta is higher-dimensional.
- alpha_2 is the hardest component to recover: it has the weakest signal (slope -0.1) and needs the most data for reliable recovery.
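The patience finding can be illustrated with a generic early-stopping rule. The sketch assumes the common convention of stopping once the validation loss has failed to improve for `patience` consecutive epochs; the training loop's actual rule is not shown in this report, and the loss curve below is hypothetical.

```python
def stop_epoch(val_losses, patience):
    """Return the 0-based epoch at which patience-based early stopping halts,
    or the last epoch if it never triggers."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical 300-epoch curve: quick initial drop, a 30-epoch plateau, then slow improvement
curve = ([1.0 - 0.02 * e for e in range(6)]
         + [0.95] * 30
         + [0.9 - 0.001 * e for e in range(264)])

stop_short = stop_epoch(curve, patience=10)
stop_long = stop_epoch(curve, patience=50)
```

On a curve like this, a short patience halts inside the plateau and never sees the later improvement, while a longer patience rides it out and trains through the final epoch.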
Run Command
python3 -m evals.eval_09_multinomial 2>&1 | tee evals/reports/eval_09_$(date +%Y%m%d_%H%M%S).txt