Metrics Module#
Helper functions for computing inference quality metrics.
Main Functions#
compute_coverage#
from deep_inference import compute_coverage
# Check if true value falls within CI
covered = compute_coverage(mu_true, ci_lower, ci_upper)
compute_se_ratio#
from deep_inference import compute_se_ratio
# Compare estimated SE to empirical SE
se_ratio = compute_se_ratio(estimated_se, empirical_se)
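The library source is not reproduced on this page, so the following is only a minimal NumPy sketch of what these two helpers plausibly compute. The names `coverage_indicator` and `se_ratio_sketch` are hypothetical and are not part of `deep_inference`; the real functions may accept different argument types or return different values.

```python
import numpy as np

def coverage_indicator(mu_true, ci_lower, ci_upper):
    """Hypothetical stand-in: True where [ci_lower, ci_upper] contains mu_true."""
    ci_lower = np.asarray(ci_lower, dtype=float)
    ci_upper = np.asarray(ci_upper, dtype=float)
    return (ci_lower <= mu_true) & (mu_true <= ci_upper)

def se_ratio_sketch(estimated_se, empirical_se):
    """Hypothetical stand-in: mean estimated SE divided by the empirical SE."""
    return float(np.mean(estimated_se)) / float(empirical_se)

# Two intervals: the first contains 0.5, the second does not.
covered = coverage_indicator(0.5, [0.3, 0.6], [0.7, 0.9])
# Estimated SEs average 0.11, which matches the empirical SE of 0.11.
ratio = se_ratio_sketch([0.10, 0.12], 0.11)
print(covered, round(ratio, 3))
```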
Usage Example#
from deep_inference import structural_dml
import numpy as np
# Run multiple simulations
results = []
for seed in range(100):
    np.random.seed(seed)
    # Generate data...
    result = structural_dml(Y, T, X, family='linear')
    results.append({
        'mu_hat': result.mu_hat,
        'se': result.se,
        'ci_lower': result.ci_lower,
        'ci_upper': result.ci_upper
    })
# Compute metrics
mu_true = 0.5  # known ground truth
mu_hats = [r['mu_hat'] for r in results]
ses = [r['se'] for r in results]
# Coverage
covered = [(r['ci_lower'] <= mu_true <= r['ci_upper']) for r in results]
coverage = np.mean(covered)
print(f"Coverage: {coverage:.1%}")  # Target: 95%
# SE ratio
empirical_se = np.std(mu_hats)
mean_estimated_se = np.mean(ses)
se_ratio = mean_estimated_se / empirical_se
print(f"SE Ratio: {se_ratio:.2f}")  # Target: 1.0
Key Metrics#
| Metric | Formula | Target |
|---|---|---|
| Bias | \(E[\hat\mu] - \mu^*\) | 0 |
| Variance | \(\text{Var}(\hat\mu)\) | - |
| RMSE | \(\sqrt{\text{Bias}^2 + \text{Var}}\) | Small |
| Empirical SE | \(\sqrt{\text{Var}}\) | - |
| SE Ratio | \(\hat{SE} / SE_{emp}\) | 1.0 |
| Coverage | \(P(\mu^* \in CI)\) | 95% |
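The quantities in the table above can be computed from per-simulation outputs in a few lines of NumPy. The sketch below is illustrative only: `summarize_metrics` is a hypothetical helper, not part of `deep_inference`, and the inputs are synthetic draws standing in for real `structural_dml` results. It uses the population variance (`ddof=0`), which matches the default of `np.std` used in the usage example.

```python
import numpy as np

def summarize_metrics(mu_hats, ses, ci_lowers, ci_uppers, mu_true):
    """Hypothetical helper: compute the Key Metrics from simulation outputs."""
    mu_hats = np.asarray(mu_hats, dtype=float)
    ses = np.asarray(ses, dtype=float)
    ci_lowers = np.asarray(ci_lowers, dtype=float)
    ci_uppers = np.asarray(ci_uppers, dtype=float)

    bias = mu_hats.mean() - mu_true
    variance = mu_hats.var()              # ddof=0, same convention as np.std
    empirical_se = np.sqrt(variance)
    rmse = np.sqrt(bias**2 + variance)
    se_ratio = ses.mean() / empirical_se
    coverage = np.mean((ci_lowers <= mu_true) & (mu_true <= ci_uppers))
    return {
        "bias": float(bias),
        "variance": float(variance),
        "rmse": float(rmse),
        "empirical_se": float(empirical_se),
        "se_ratio": float(se_ratio),
        "coverage": float(coverage),
    }

# Synthetic stand-in for 500 simulation runs: unbiased estimates with a
# correctly specified SE of 0.1 and nominal 95% normal intervals.
rng = np.random.default_rng(0)
true_se = 0.1
mu_hats = 0.5 + true_se * rng.standard_normal(500)
ses = np.full(500, true_se)
metrics = summarize_metrics(
    mu_hats, ses, mu_hats - 1.96 * ses, mu_hats + 1.96 * ses, mu_true=0.5
)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With `ddof=0`, the RMSE computed this way equals the root of the mean squared error of the estimates around the true value, which is a convenient sanity check.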
Validation Targets#
| Metric | Valid Range | Interpretation |
|---|---|---|
| Coverage | 93-97% | CI contains true value |
| SE Ratio | 0.9-1.2 | SE is properly calibrated |
| min(lambda) | > 1e-4 | Hessian is well-conditioned |
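The ranges above can be turned into an automated check. The following is a sketch under two assumptions: `min(lambda)` is read as the smallest eigenvalue of a symmetric Hessian matrix, and the Hessian is available as a NumPy array. `check_targets` is a hypothetical name, not a `deep_inference` function.

```python
import numpy as np

def check_targets(coverage, se_ratio, hessian):
    """Hypothetical helper: compare metrics against the validation ranges."""
    # eigvalsh assumes a symmetric matrix and returns real eigenvalues.
    min_eig = float(np.linalg.eigvalsh(hessian).min())
    return {
        "coverage": 0.93 <= coverage <= 0.97,
        "se_ratio": 0.9 <= se_ratio <= 1.2,
        "min_eigenvalue": min_eig > 1e-4,
    }

# A well-behaved case: all three checks pass.
good = check_targets(0.95, 1.02, np.array([[2.0, 0.1], [0.1, 1.0]]))

# A failing case: undercoverage, underestimated SE, and a singular Hessian.
bad = check_targets(0.30, 0.27, np.array([[1.0, 1.0], [1.0, 1.0]]))

print(good)
print(bad)
```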
Interpreting Results#
Good Results#
Coverage: 95.0% [PASS - in 93-97% range]
SE Ratio: 1.02 [PASS - close to 1.0]
RMSE: 0.032 [Low bias and variance]
Warning Signs#
Coverage: 30% [FAIL - severe undercoverage]
SE Ratio: 0.27 [FAIL - SE underestimated roughly 4x]
Common causes of poor coverage:
- Naive method (no influence-function (IF) correction)
- Too few folds (K < 20)
- Insufficient training epochs
- Model misspecification
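One way to separate an SE problem from a bias problem is to ask what coverage the SE ratio alone would imply. If the estimator is unbiased and approximately normal, and the estimated SE equals the true SE times a factor `r`, then a nominal 95% interval covers with probability \(2\Phi(1.96\,r) - 1\). The sketch below evaluates that expression; it is a back-of-the-envelope diagnostic under those stated assumptions, not something the library computes, and `implied_coverage` is a hypothetical name.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def implied_coverage(se_ratio, z=1.96):
    """Coverage of a nominal 95% CI for an unbiased, normal estimator
    whose estimated SE is se_ratio times the true SE."""
    return 2.0 * normal_cdf(z * se_ratio) - 1.0

calibrated = implied_coverage(1.00)   # SE ratio on target
underestimated = implied_coverage(0.27)  # SE ratio from the warning example
print(f"{calibrated:.3f} {underestimated:.3f}")
```

Under these assumptions an SE ratio of 0.27 on its own corresponds to coverage of roughly 40%. Observed coverage well below that level may indicate that bias is contributing in addition to an underestimated SE.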
Monte Carlo Validation#
For rigorous validation, run Monte Carlo simulations:
import numpy as np
from deep_inference import structural_dml
M = 100   # number of simulations
N = 2000  # sample size
MU_TRUE = 0.5
results = []
for m in range(M):
    np.random.seed(m)
    # Generate data with known DGP
    X = np.random.randn(N, 10)
    T = np.random.randn(N)
    Y = X[:, 0] + MU_TRUE * T + np.random.randn(N)
    result = structural_dml(Y, T, X, family='linear', verbose=False)
    covered = result.ci_lower <= MU_TRUE <= result.ci_upper
    results.append({
        'mu_hat': result.mu_hat,
        'se': result.se,
        'covered': covered
    })
# Summary
coverage = np.mean([r['covered'] for r in results])
se_ratio = np.mean([r['se'] for r in results]) / np.std([r['mu_hat'] for r in results])
print(f"Coverage: {coverage:.1%}")
print(f"SE Ratio: {se_ratio:.2f}")
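The coverage estimate is itself a sample proportion, so it carries Monte Carlo error that depends on the number of simulations. The sketch below uses the standard binomial formula \(\sqrt{p(1-p)/M}\) to quantify that error and to estimate how many simulations are needed for a given precision. The function names are hypothetical and the calculation is independent of `deep_inference`.

```python
import math

def coverage_mc_se(p=0.95, n_sims=100):
    """Monte Carlo standard error of an estimated coverage rate."""
    return math.sqrt(p * (1.0 - p) / n_sims)

def sims_needed(half_width, p=0.95, z=1.96):
    """Simulations needed so that a z-level interval around the estimated
    coverage has the requested half-width."""
    return math.ceil(p * (1.0 - p) * (z / half_width) ** 2)

se_100 = coverage_mc_se(0.95, 100)
m_for_2pp = sims_needed(0.02)
print(f"MC standard error with M=100: {se_100:.4f}")
print(f"Simulations for a +/-2 point half-width: {m_for_2pp}")
```

With M = 100 the standard error of the coverage estimate is about 2.2 percentage points, so a correctly calibrated method can land outside the 93-97% range by chance alone. Increasing M tightens the estimate before a run is judged against that range.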