Validation#

This page summarizes the eval suite, which validates each mathematical component of the influence function methodology.

Eval Suite Overview#

The package includes 8 evals in evals/, numbered 01 to 07 and 09, that together validate Theorem 2.

Eval	Component	Tests	Result	Details
01	Parameter Recovery θ̂(x)	12 families × 3 seeds	12/12 PASS	→
02	Autodiff vs Calculus	Score + Hessian	31/31 PASS	→
03	Lambda Estimation Λ̂(x)	5 methods	9/9 PASS	→
04	Target Jacobian H_θ	Autodiff vs oracle	92/92 PASS	→
05	Influence Function ψ	Assembly + coverage	4/4 PASS	→
06	Frequentist Coverage	Monte Carlo M=50	PASS	→
07	End-to-End	Full workflow	7/7 PASS	→
09	Multinomial Logit	Recovery + Coverage	98% coverage PASS	→

Total: 228+ individual checks, all passing.

Quick Summary#

Eval 01: Parameter Recovery#

Neural networks recover θ(x) = [α(x), β(x)] across all 12 families with Corr(β) > 0.94. Details →

Eval 02: Autodiff Accuracy#

PyTorch autodiff matches calculus formulas to machine precision (error < 1e-14). Details →
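
The idea behind this check can be sketched without PyTorch. The example below is a minimal illustration, not the package's eval: it swaps in a small forward-mode dual-number class for PyTorch's autodiff, uses a logit loss as one hypothetical family, and compares the resulting score with the closed-form calculus expression. The class, the function names, and the test points are all assumptions made for illustration.

```python
import math


class Dual:
    """Number val + der * eps with eps**2 == 0; der carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val, self.val * other.der + self.der * other.val)

    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, other):
        return self + (-Dual.lift(other))


def softplus(z):
    """log(1 + exp(z)) for a Dual; its derivative is the sigmoid."""
    sig = 1.0 / (1.0 + math.exp(-z.val))
    return Dual(math.log1p(math.exp(z.val)), sig * z.der)


def logit_loss(alpha, beta, y, t):
    """Negative log-likelihood of a logit model with index alpha + beta * t."""
    index = alpha + beta * t
    return softplus(index) - y * index


def autodiff_score(alpha, beta, y, t):
    """Gradient w.r.t. (alpha, beta): one forward pass per coordinate."""
    d_alpha = logit_loss(Dual(alpha, 1.0), Dual(beta, 0.0), y, t).der
    d_beta = logit_loss(Dual(alpha, 0.0), Dual(beta, 1.0), y, t).der
    return d_alpha, d_beta


def calculus_score(alpha, beta, y, t):
    """Closed form: (sigmoid(index) - y) * (1, t)."""
    resid = 1.0 / (1.0 + math.exp(-(alpha + beta * t))) - y
    return resid, resid * t


# Hypothetical (alpha, beta, y, t) test points.
points = [(0.3, -1.2, 1.0, 0.7), (-0.5, 0.8, 0.0, 2.0), (1.5, 0.1, 1.0, -1.3)]
max_error = max(
    abs(a - c)
    for p in points
    for a, c in zip(autodiff_score(*p), calculus_score(*p))
)
print(f"max |autodiff - calculus| = {max_error:.2e}")
```

Because both routes evaluate the same expression in floating point, any discrepancy should be at the level of rounding error, which is the kind of agreement the eval reports.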

Eval 03: Lambda Estimation#

The MLP method achieves Corr = 0.997 with the true Λ(x); the aggregate method ignores heterogeneity and achieves Corr = 0.000. Details →

Eval 04: Target Jacobian#

∂H/∂θ computed correctly for all targets and families (92/92 tests). Details →

Eval 05: Influence Functions#

The complete ψ assembly is validated, with 88% coverage and an SE ratio of 0.87. Details →
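
To make "assembly" concrete, here is a rough sketch of how the pieces named on this page (the score, Λ(x), and the target Jacobian H_θ) could combine into ψ. It assumes the form ψ = H − H_θ Λ(x)⁻¹ ℓ_θ, a linear model with squared-error loss, the target E[β(x)], and oracle values of θ(x) and Λ(x) in place of neural network estimates. The data-generating functions and every name are hypothetical, and this is not the package's eval code.

```python
import math
import random
import statistics


def alpha(x):  # hypothetical intercept function
    return 1.0 + x


def beta(x):  # hypothetical slope function; E[beta(x)] = 1 for x ~ U(0, 1)
    return 2.0 * x


def t_mean(x):  # hypothetical conditional mean of the treatment t
    return 0.5 * x


def solve_2x2(mat, vec):
    """Solve mat @ out = vec for a 2x2 matrix via Cramer's rule."""
    (a, b), (c, d) = mat
    det = a * d - b * c
    return ((d * vec[0] - b * vec[1]) / det, (a * vec[1] - c * vec[0]) / det)


def psi(y, t, x):
    """Assumed form: psi = H - H_theta' Lambda(x)^{-1} l_theta, with H(theta) = beta."""
    resid = y - alpha(x) - beta(x) * t
    score = (-resid, -resid * t)  # gradient of 0.5 * resid**2 in (alpha, beta)
    m = t_mean(x)
    lam = ((1.0, m), (m, m * m + 1.0))  # E[(1, t)(1, t)' | x] when Var(t | x) = 1
    h_theta = (0.0, 1.0)  # Jacobian of H(theta) = beta w.r.t. (alpha, beta)
    lam_inv_score = solve_2x2(lam, score)
    correction = h_theta[0] * lam_inv_score[0] + h_theta[1] * lam_inv_score[1]
    return beta(x) - correction


rng = random.Random(0)
n = 5000
values = []
for _ in range(n):
    x = rng.random()
    t = rng.gauss(t_mean(x), 1.0)
    y = alpha(x) + beta(x) * t + rng.gauss(0.0, 1.0)
    values.append(psi(y, t, x))

# The point estimate is the sample mean of psi; the SE comes from its spread.
estimate = statistics.fmean(values)
std_error = statistics.stdev(values) / math.sqrt(n)
print(f"estimate = {estimate:.3f}, SE = {std_error:.3f}")
print(f"95% CI = [{estimate - 1.96 * std_error:.3f}, {estimate + 1.96 * std_error:.3f}]")
```

In this toy setup the true target is 1.0, so the interval can be checked against a known value. With estimated rather than oracle inputs, the same assembly step applies, but the quality of θ̂(x) and Λ̂(x) then matters for coverage.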

Eval 06: Frequentist Coverage#

Monte Carlo (M=50, n=5000) confirms valid CIs with z-scores ~ N(0,1). Details →
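
The structure of such a coverage check can be sketched as follows. This is a simplified stand-in, assuming a plain sample mean as the estimator rather than the influence-function estimator; only M=50 and n=5000 are taken from the text above, and the seed, the true value, and the function name are hypothetical.

```python
import math
import random
import statistics


def one_replication(rng, n, true_mean):
    """Return (z-score, covered?) for a 95% CI on a sample mean."""
    sample = [rng.gauss(true_mean, 2.0) for _ in range(n)]
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    z = (estimate - true_mean) / std_error
    return z, abs(z) <= 1.96


rng = random.Random(2024)
M, n, true_mean = 50, 5000, 1.0
results = [one_replication(rng, n, true_mean) for _ in range(M)]
z_scores = [z for z, _ in results]
coverage = sum(covered for _, covered in results) / M

# Valid inference implies coverage near 95% and z-scores with mean ~0, sd ~1.
print(f"coverage = {coverage:.0%}")
print(f"z mean = {statistics.fmean(z_scores):.2f}, z sd = {statistics.stdev(z_scores):.2f}")
```

With only M=50 replications, the Monte Carlo standard error of a coverage rate near 95% is roughly 3 percentage points, so observed coverage a few points away from nominal is within simulation noise.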

Eval 07: End-to-End#

Runs the full analyst workflow. Comparing the Oracle, Bootstrap, and NN approaches shows that the influence function (IF) correction is essential. Details →

Eval 09: Multinomial Logit#

Multinomial logit (conditional logit) validated with 98% coverage (M=50, n=8000). Recovery, autodiff, Lambda, and coverage all PASS. Details →

Running Evals#

# Run all evals and save a timestamped log (tee needs evals/reports/ to exist)
mkdir -p evals/reports
python3 -m evals.run_all 2>&1 | tee evals/reports/run_all_$(date +%Y%m%d_%H%M%S).txt

# Run individual evals
python3 -m evals.eval_01_theta
python3 -m evals.eval_02_autodiff
python3 -m evals.eval_03_lambda
python3 -m evals.eval_04_jacobian
python3 -m evals.eval_05_psi
python3 -m evals.eval_06_coverage
python3 -m evals.eval_07_e2e
python3 -m evals.eval_09_multinomial

References#

Farrell, Liang, Misra (2021): “Deep Neural Networks for Estimation and Inference”, Econometrica
Farrell, Liang, Misra (2025): “Deep Learning for Individual Heterogeneity”, Working Paper
Verification Against FLM2: comparison with the original implementation