4. The Influence Function Correction
The influence function (IF) correction of Farrell, Liang, and Misra (2025, Theorem 2) removes the regularization bias derived on the previous page. For each observation \(i\), define
\[
\psi_i = H_i - H_{\theta,i}\,\Lambda_i^{-1}\,\ell_{\theta,i},
\]
where
\(H_i = H(X_i, \hat{\theta}(X_i), \tilde{t})\) is the target evaluated at the estimated parameters,
\(H_{\theta,i} = \left.\partial H / \partial \theta\right|_{\hat{\theta}(X_i)} \in \mathbb{R}^{1 \times d_\theta}\) is the target Jacobian,
\(\Lambda_i = \left.\mathbb{E}[\nabla_\theta^2 \ell(Y, T, \theta) \mid X = X_i]\right|_{\hat{\theta}(X_i)} \in \mathbb{R}^{d_\theta \times d_\theta}\) is the expected Hessian of the loss conditional on covariates,
\(\ell_{\theta,i} = \nabla_\theta \ell(Y_i, T_i, \hat{\theta}(X_i)) \in \mathbb{R}^{d_\theta}\) is the score (gradient of the loss at the observed data point).
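The debiased estimator and its standard error are
\[
\hat{\mu} = \bar{\psi} = \frac{1}{n}\sum_{i=1}^{n}\psi_i,
\qquad
\widehat{\mathrm{se}}(\hat{\mu}) = \sqrt{\frac{1}{n}\cdot\frac{1}{n}\sum_{i=1}^{n}\left(\psi_i - \hat{\mu}\right)^2}.
\]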
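As a rough illustration of how these quantities combine, the following NumPy sketch computes the per-observation values and the resulting estimate. It assumes the form \(\psi_i = H_i - H_{\theta,i}\Lambda_i^{-1}\ell_{\theta,i}\) and a standard error equal to the sample standard deviation of \(\psi_i\) divided by \(\sqrt{n}\) (with a \(1/n\) variance normalization, which is an assumption here). The function names `influence_scores` and `debiased_estimate` and the array layout are hypothetical, not part of any package API.

```python
import numpy as np


def influence_scores(H, H_theta, Lam, score):
    """Per-observation psi_i = H_i - H_theta_i @ inv(Lam_i) @ score_i.

    Assumed (hypothetical) array layout:
      H:       (n,)       target at the fitted parameters
      H_theta: (n, d)     rows of the target Jacobian
      Lam:     (n, d, d)  conditional expected Hessians of the loss
      score:   (n, d)     loss gradients at the observed data
    """
    # Solve Lam_i z_i = score_i for every i instead of forming explicit inverses.
    z = np.linalg.solve(Lam, score[..., None])[..., 0]   # shape (n, d)
    # Row-wise inner product H_theta_i . z_i gives the scalar correction.
    correction = np.einsum("nd,nd->n", H_theta, z)
    return H - correction


def debiased_estimate(psi):
    """Mean of psi and its standard error, sd(psi) / sqrt(n)."""
    n = psi.shape[0]
    mu_hat = psi.mean()
    se = psi.std(ddof=0) / np.sqrt(n)   # 1/n normalization is an assumption
    return mu_hat, se
```

Solving the linear systems rather than inverting each \(\Lambda_i\) is a numerical-stability choice; when some \(\Lambda_i\) are near-singular, a real implementation would likely need regularization, which this sketch omits.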
Intuition
The correction term \(H_{\theta} \Lambda^{-1} \ell_\theta\) is an “adjustment for what the neural net got wrong.”
The score \(\ell_\theta\) measures how far observation \(i\)’s data is from the fitted model.
The inverse Hessian \(\Lambda^{-1}\) converts this into parameter-space error, as in a Newton step.
The Jacobian \(H_\theta\) maps this to the target.
When the neural net fits perfectly (scores near zero), the correction vanishes. When it makes systematic errors (regularization bias), the correction kicks in.
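A toy simulation can make this intuition concrete. The sketch below is not the neural-network setting: it assumes a hypothetical scalar model \(Y_i = \theta^*(X_i) + \varepsilon_i\) with squared-error loss, target \(H = \theta\), and a deliberately shrunken first stage standing in for regularization bias. It also assumes the correction enters as \(\psi_i = H_i - H_{\theta,i}\Lambda_i^{-1}\ell_{\theta,i}\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical toy model: Y = theta*(X) + noise, loss l = 0.5 * (y - theta)^2,
# target H = theta, so the estimand is mu = E[theta*(X)] = 2.
x = rng.uniform(-1.0, 1.0, size=n)
theta_star = 2.0 + x
y = theta_star + rng.normal(scale=1.0, size=n)

# Stand-in for a regularized first stage: shrink the true function toward zero.
theta_hat = 0.8 * theta_star

H = theta_hat              # target evaluated at the fitted parameters
H_theta = np.ones(n)       # dH/dtheta = 1
Lam = np.ones(n)           # second derivative of the squared-error loss
score = theta_hat - y      # dl/dtheta at the observed data

# Influence-function values; in this toy model they reduce to y itself.
psi = H - H_theta * score / Lam

plug_in = H.mean()         # biased by the shrinkage
debiased = psi.mean()      # correction restores the sample mean of y
se = psi.std() / np.sqrt(n)

print(f"plug-in: {plug_in:.3f}, debiased: {debiased:.3f} (se {se:.3f})")
```

In this setup the plug-in average sits near 1.6 because of the 20% shrinkage, while the corrected average is centered on the true value 2: the nonzero average score is exactly the signal that the first stage is systematically off.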
Neyman orthogonality
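The influence function \(\psi\) satisfies the Neyman orthogonality condition: for any admissible perturbation \(\theta\) of the true parameter function \(\theta^*\),
\[
\left.\frac{\partial}{\partial r}\,\mathbb{E}\!\left[\psi\big(Y, T, X;\ \theta^* + r(\theta - \theta^*),\ \Lambda\big)\right]\right|_{r=0} = 0.
\]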
The correction term \(H_\theta \Lambda^{-1} \ell_\theta\) is constructed precisely so that first-order perturbations in \(\hat{\theta}\) around \(\theta^*\) cancel. This is why the debiased estimator \(\hat{\mu} = \bar{\psi}\) achieves \(\sqrt{n}\)-consistency despite the neural network’s slow convergence rate: the estimation error in \(\hat{\theta}\) enters only through a second-order remainder, which is negligible at the \(\sqrt{n}\) scale. (The formal rate conditions are stated on the Theorems and Convergence Rates page.)
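The cancellation can be checked directly. The following sketch assumes \(\psi = H - H_\theta \Lambda^{-1} \ell_\theta\), that \(\theta^*\) satisfies the first-order condition \(\mathbb{E}[\ell_\theta \mid X] = 0\), and that \(\Lambda\) is held at its true value. The perturbation direction \(\delta(X)\) and the target's second derivative \(H_{\theta\theta}\) are notation introduced here only for illustration. Differentiating along \(\theta^* + r\delta\) at \(r = 0\) and taking expectations conditional on \(X\) gives
\[
\begin{aligned}
\left.\frac{\partial}{\partial r}\,\mathbb{E}\!\left[\psi(\theta^* + r\delta) \mid X\right]\right|_{r=0}
&= H_\theta\,\delta
 \;-\; \delta^\top H_{\theta\theta}\,\Lambda^{-1}\,\underbrace{\mathbb{E}[\ell_\theta \mid X]}_{=\,0}
 \;-\; H_\theta\,\Lambda^{-1}\,\underbrace{\mathbb{E}[\nabla_\theta^2 \ell \mid X]}_{=\,\Lambda}\,\delta \\
&= H_\theta\,\delta - H_\theta\,\delta \;=\; 0.
\end{aligned}
\]
The first term is the plug-in's sensitivity to first-stage error; the last term, produced by the correction, offsets it exactly, and the middle term drops out because the score has conditional mean zero at \(\theta^*\).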
Connection to classical statistics
The IF \(\psi\) is a semiparametric influence function in the tradition of Hampel (1974) and Newey (1994). Its specific form arises from the “one-step correction” or “debiasing” approach used widely in semiparametric inference (Chernozhukov et al., 2018). Neyman orthogonality is central to the broader program of orthogonal statistical learning (Foster & Syrgkanis, 2023); related automatic debiasing approaches include RieszNet (Chernozhukov et al., 2022). The contribution of Farrell, Liang, and Misra (2021, 2025) is to show that this correction remains valid when the first-stage nuisance (the structural parameters) is estimated by a deep neural network.