Validation#

The eval suite validates each mathematical component of the influence function methodology, from parameter recovery through frequentist coverage.


Eval Suite Overview#

The package includes 8 evals in evals/ (numbered 01 through 07 and 09) that validate Theorem 2.

| Eval | Component | Tests | Result |
|------|-----------|-------|--------|
| 01 | Parameter Recovery θ̂(x) | 12 families × 3 seeds | 12/12 PASS |
| 02 | Autodiff vs Calculus | Score + Hessian | 31/31 PASS |
| 03 | Lambda Estimation Λ̂(x) | 5 methods | 9/9 PASS |
| 04 | Target Jacobian H_θ | Autodiff vs oracle | 92/92 PASS |
| 05 | Influence Function ψ | Assembly + coverage | 4/4 PASS |
| 06 | Frequentist Coverage | Monte Carlo M=50 | PASS |
| 07 | End-to-End | Full workflow | 7/7 PASS |
| 09 | Multinomial Logit | Recovery + Coverage | 98% coverage PASS |

Total: 228+ individual checks, all passing.


Quick Summary#

Eval 01: Parameter Recovery#

Neural networks recover θ(x) = [α(x), β(x)] across all 12 families with Corr(β) > 0.94.

Eval 02: Autodiff Accuracy#

PyTorch autodiff matches the closed-form score and Hessian to machine precision (error < 1e-14).
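The kind of comparison this eval makes can be sketched without any dependencies. The block below is not the package's code and does not use PyTorch: it swaps in a tiny forward-mode autodiff built from dual numbers, applies it to a binary logit loss chosen here only for illustration, and compares the result with the closed-form score (p - y)·(1, t). The names `Dual`, `logit_loss`, `autodiff_score`, `calculus_score`, and `max_score_error` are hypothetical, and the sketch covers the score only, not the Hessian.

```python
import math


class Dual:
    """Number a + b*eps with eps**2 == 0; the eps part carries a derivative."""

    def __init__(self, val, eps=0.0):
        self.val = val
        self.eps = eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)

    __rmul__ = __mul__


def log1pexp(z):
    """log(1 + exp(z)) for a Dual z; its derivative is the sigmoid of z."""
    sig = 1.0 / (1.0 + math.exp(-z.val))
    return Dual(math.log1p(math.exp(z.val)), sig * z.eps)


def logit_loss(alpha, beta, y, t):
    """Negative log-likelihood of a binary outcome y with index alpha + beta*t."""
    z = alpha + beta * t
    return log1pexp(z) + (-y) * z


def autodiff_score(alpha, beta, y, t):
    """Seed each parameter in turn to read off the two partial derivatives."""
    d_alpha = logit_loss(Dual(alpha, 1.0), Dual(beta, 0.0), y, t).eps
    d_beta = logit_loss(Dual(alpha, 0.0), Dual(beta, 1.0), y, t).eps
    return d_alpha, d_beta


def calculus_score(alpha, beta, y, t):
    """Closed form: (p - y) * (1, t) with p = sigmoid(alpha + beta*t)."""
    p = 1.0 / (1.0 + math.exp(-(alpha + beta * t)))
    return (p - y), (p - y) * t


def max_score_error(points):
    """Largest absolute gap between the two scores over (alpha, beta, y, t) points."""
    worst = 0.0
    for alpha, beta, y, t in points:
        auto = autodiff_score(alpha, beta, y, t)
        exact = calculus_score(alpha, beta, y, t)
        worst = max(worst, abs(auto[0] - exact[0]), abs(auto[1] - exact[1]))
    return worst


# Arbitrary illustrative evaluation points (alpha, beta, y, t).
points = [(0.3, -1.2, 1.0, 0.5), (-0.7, 0.4, 0.0, 2.0), (1.5, 2.0, 1.0, -1.0)]
print(max_score_error(points) < 1e-14)
```

Because both routes evaluate the same expression up to the order of floating-point operations, the gap is on the order of rounding error, which is the sense in which "machine precision" is used above.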

Eval 03: Lambda Estimation#

The MLP method achieves Corr=0.997 with the true Λ(x); the aggregate method ignores heterogeneity in x (Corr=0.000).

Eval 04: Target Jacobian#

∂H/∂θ from autodiff matches the oracle Jacobian for all targets and families (92/92 tests).

Eval 05: Influence Functions#

The complete ψ assembly is validated, with 88% coverage and an SE ratio of 0.87.
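To make "assembly" concrete, here is a minimal sketch that combines the pieces validated in evals 01 to 04 into ψ. It assumes the usual form ψ = H - H_θ Λ⁻¹ ℓ_θ, a toy linear model y = α(x) + β(x)·t + ε with squared-error loss, the target H = β(x), and a known Λ. None of these choices is taken from the package, and `inv2x2`, `psi`, and `simulate` are hypothetical names. The first stage is deliberately biased so that the effect of the correction term is visible.

```python
import random


def inv2x2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def psi(y, t, theta_hat, lam):
    """Assemble psi = H - H_theta * Lambda^{-1} * score for the target H = beta."""
    alpha, beta = theta_hat
    resid = y - alpha - beta * t
    score = (-resid, -resid * t)          # gradient of 0.5 * resid**2 in (alpha, beta)
    lam_inv = inv2x2(lam)
    h_theta = (0.0, 1.0)                  # Jacobian of H(theta) = beta
    # H_theta @ Lambda^{-1} @ score, written out for the 2x2 case
    correction = sum(h_theta[i] * lam_inv[i][j] * score[j]
                     for i in range(2) for j in range(2))
    return beta - correction


def simulate(n=20000, bias=0.2, seed=0):
    """Return (plug-in mean of beta_hat, mean of psi) on simulated data."""
    rng = random.Random(seed)
    # With t ~ N(0.5, 1) independent of x, Lambda = E[(1, t)(1, t)'] is known.
    lam = ((1.0, 0.5), (0.5, 1.25))
    plug_in, corrected = 0.0, 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        t = rng.gauss(0.5, 1.0)
        alpha, beta = 0.5 * x, 1.0 + x    # true theta(x); E[beta] = 1
        y = alpha + beta * t + rng.gauss(0.0, 1.0)
        theta_hat = (alpha, beta + bias)  # deliberately biased first stage
        plug_in += theta_hat[1] / n
        corrected += psi(y, t, theta_hat, lam) / n
    return plug_in, corrected


plug_in, corrected = simulate()
print(f"plug-in mean of beta: {plug_in:.3f}")    # close to 1 + bias
print(f"mean of psi:          {corrected:.3f}")  # close to the true value 1
```

In this toy setup the plug-in average inherits the first-stage bias, while the average of ψ is centered on the true E[β] = 1, because the expected correction term equals the bias.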

Eval 06: Frequentist Coverage#

A Monte Carlo study (M=50 replications, n=5000) confirms valid confidence intervals, with z-scores approximately N(0,1).
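The structure of such a coverage check can be sketched with a toy estimator. The block below uses a plain sample mean as a stand-in for the package's influence-function estimator, so the data-generating process and the name `coverage_experiment` are assumptions made only for illustration. Each of the M replications draws n observations, forms a 95% interval, records whether it covers the truth, and stores the z-score.

```python
import math
import random
import statistics


def coverage_experiment(m=50, n=5000, true_mean=2.0, seed=0):
    """Return (coverage rate of 95% CIs, list of z-scores) over m replications."""
    rng = random.Random(seed)
    z_scores = []
    covered = 0
    for _ in range(m):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        est = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        z = (est - true_mean) / se
        z_scores.append(z)
        covered += abs(z) <= 1.96          # CI covers truth iff |z| <= 1.96
    return covered / m, z_scores


coverage, z_scores = coverage_experiment()
print(f"coverage:     {coverage:.2f}")
print(f"z-score mean: {statistics.fmean(z_scores):.2f}")
print(f"z-score sd:   {statistics.stdev(z_scores):.2f}")
```

With M=50, the Monte Carlo standard error of an estimated 95% coverage rate is about sqrt(0.95 × 0.05 / 50) ≈ 0.03, so reported coverage figures are only precise to roughly three percentage points.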

Eval 07: End-to-End#

The full analyst workflow compares Oracle, Bootstrap, and NN results and shows that the influence function (IF) correction is essential.

Eval 09: Multinomial Logit#

The multinomial logit (conditional logit) is validated with 98% coverage (M=50, n=8000). Recovery, autodiff, Lambda, and coverage all PASS.


Running Evals#

# Run all evals
python3 -m evals.run_all 2>&1 | tee evals/reports/run_all_$(date +%Y%m%d_%H%M%S).txt

# Run individual evals
python3 -m evals.eval_01_theta
python3 -m evals.eval_02_autodiff
python3 -m evals.eval_03_lambda
python3 -m evals.eval_04_jacobian
python3 -m evals.eval_05_psi
python3 -m evals.eval_06_coverage
python3 -m evals.eval_07_e2e
python3 -m evals.eval_09_multinomial

References#

  • Farrell, Liang, Misra (2021): “Deep Neural Networks for Estimation and Inference”, Econometrica.

  • Farrell, Liang, Misra (2025): “Deep Learning for Individual Heterogeneity”, Working Paper.

  • Verification Against FLM2 - comparison with original implementation