# Tobit Model Tutorial

The Tobit model handles censored data where outcomes pile up at a boundary.

## When to Use

Use the Tobit model when:
- Outcome is censored at a boundary (typically 0)
- You observe the censored value, not the latent value
- Examples: labor supply (hours >= 0), expenditure, donations

## Mathematical Setup

### Data Generating Process

$$Y^* = \alpha(X) + \beta(X) \cdot T + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$$

$$Y = \max(0, Y^*)$$

Where $Y^*$ is the latent (unobserved) variable and $Y$ is the observed (censored) outcome.

### Estimand

$$\mu^* = E[\beta(X)]$$

The average effect on the **latent** outcome.

### Loss Function

The Tobit likelihood has two parts:

1. **Censored observations** ($Y = 0$): Probability of $Y^* \leq 0$
2. **Uncensored observations** ($Y > 0$): Normal density

$$L = -\sum_{Y_i=0} \log \Phi\left(-\frac{\mu_i}{\sigma}\right) - \sum_{Y_i>0} \left[\log \phi\left(\frac{Y_i - \mu_i}{\sigma}\right) - \log \sigma\right]$$

### Influence Score Components

| Component | Formula |
|-----------|---------|
| Residual $r$ | Mills ratio (censored) or $(Y-\mu)/\sigma$ (uncensored) |
| Hessian weight $W$ | $1 - \Phi(-\mu/\sigma)$ |
| Score $\nabla\ell$ | Varies by censoring status |

The **Mills ratio** is $\phi(z)/\Phi(z)$ where $z = -\mu/\sigma$.

## Complete Example

```python
import numpy as np
from deep_inference import structural_dml

# Generate censored data
np.random.seed(42)
n = 2000
X = np.random.randn(n, 10)
T = np.random.randn(n)

# True structural functions
alpha_true = 0.5 + 0.3 * X[:, 0]
beta_true = 0.3 + 0.2 * X[:, 0]
sigma = 1.0
Y_star = alpha_true + beta_true * T + sigma * np.random.randn(n)
Y = np.maximum(0, Y_star)  # Censoring at 0
mu_true = beta_true.mean()

# Check censoring rate
censored_pct = (Y == 0).mean() * 100
print(f"True mu* = {mu_true:.6f}")
print(f"Censored at 0: {censored_pct:.1f}%")

# Run inference
result = structural_dml(
    Y=Y, T=T, X=X,
    family='tobit',
    hidden_dims=[64, 32],
    epochs=100,
    n_folds=50,
    lr=0.01
)

print(result.summary())
```

## Alternative Targets

### Latent Effect (Default)

```python
# Default target is effect on latent Y*
result = structural_dml(Y, T, X, family='tobit')
# mu* = E[beta(X)] = effect on latent Y*
```

### Observed Effect

For observed effect, use the TobitFamily class directly:

```python
from deep_inference import TobitFamily, structural_dml

# Create family with observed target
family = TobitFamily(target='observed')
result = structural_dml(Y, T, X, family=family)
# mu* = E[beta(X) * Phi(mu/sigma)] = effect on observed E[Y]
```

The observed effect accounts for the probability of being uncensored.

## Parameter Structure

The Tobit model estimates **three** parameters per observation:

$$\theta(X) = [\alpha(X), \beta(X), \gamma(X)]$$

Where $\sigma(X) = \exp(\gamma(X))$ is the conditional variance.

## Real-World Applications

### Labor Supply

```python
# Y = hours worked (>= 0)
# T = wage rate
# X = (education, family size, non-labor income, ...)
# Target: E[beta(X)] = average labor supply elasticity

result = structural_dml(Y, T, X, family='tobit')
```

### Charitable Donations

```python
# Y = donation amount (>= 0)
# T = match rate offered
# X = (income, past giving, solicitation type, ...)
# Target: E[beta(X)] = average matching effect

result = structural_dml(Y, T, X, family='tobit')
```

### Durable Goods Expenditure

```python
# Y = spending on cars (many zeros)
# T = income change
# X = (current car age, household size, ...)
# Target: E[beta(X)] = average income effect on car spending

result = structural_dml(Y, T, X, family='tobit')
```

## Handling Different Censoring

### Left-censoring at 0 (Default)

```python
result = structural_dml(Y, T, X, family='tobit')  # Assumes Y >= 0
```

### Right-censoring

```python
# For data censored from above (e.g., top-coded income)
# Transform: Y_new = upper_bound - Y
# Then use standard Tobit
```

### Two-sided censoring

```python
# For data censored at both ends
# Requires custom implementation
```

## Key Takeaways

1. **Latent vs observed**: Choose target based on research question
2. **Mills ratio**: Key ingredient for censored observations
3. **Joint sigma estimation**: Model estimates conditional variance
4. **Check censoring rate**: Very high (>50%) or low (<10%) censoring can cause issues
5. **Three parameters**: alpha, beta, and gamma (log-sigma)