#### Discover more from A Causal Affair

# Weekly Collection of Interesting Things

### A weekly attempt to collate interesting ('Metrics + Economics + Finance) things I've found

I haven’t finished writing up my first few posts here, but I thought I would try to have a (hopefully) weekly or biweekly collection of interesting links, content, and papers I have found over the past week (mainly through Twitter, but other places as well):

## Links

Esther Duflo and Ben Olken at MIT have put their Ph.D. Development Economics course material online.

The American Finance Association is announcing two finance versions of AER: Insights and the Journal of Economic Perspectives (JEP), named Journal of Finance: Insights and Journal of Finance: Perspectives (if it ain’t broke, why fix it).

The journal will limit the length of published papers. Manuscripts without exhibits may not exceed 7,000 words. Each exhibit reduces the maximum allowance by 200 words (e.g., a manuscript with 5 exhibits has a maximum of 6,000 words). The maximum number of exhibits (figures or tables) is five. These guidelines apply to both Insights and Perspectives articles.

The first editorial response after submission will be either a reject decision or a “conditional acceptance” that limits expositional changes. The objective of this quick turnaround is to reduce the ex ante uncertainty about potential publication and to limit the number of revisions.
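The length rule quoted above is simple arithmetic, and a minimal sketch makes it easy to check a draft against it. The function name and constants below are my own shorthand for the quoted numbers, not anything published by the journal.

```python
BASE_WORD_LIMIT = 7000     # limit for a manuscript with no exhibits
WORDS_PER_EXHIBIT = 200    # allowance lost per figure or table
MAX_EXHIBITS = 5           # maximum number of exhibits allowed


def max_word_count(num_exhibits: int) -> int:
    """Return the word limit implied by the quoted rule (hypothetical helper)."""
    if not 0 <= num_exhibits <= MAX_EXHIBITS:
        raise ValueError(f"exhibits must be between 0 and {MAX_EXHIBITS}")
    return BASE_WORD_LIMIT - WORDS_PER_EXHIBIT * num_exhibits


# The announcement's own example: 5 exhibits leaves 6,000 words.
print(max_word_count(5))
```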

Peter Hull has posted his notes from his econometrics course.

5. Clustering

6. IV Mechanics

The Journal of Financial Economics has announced that it will also have a data editor:

https://twitter.com/J_Fin_Economics/status/1641505742392905729?s=20

Effective 4/3/23, all manuscripts that receive an accept pending uploads decision must follow the new data and code policy, which is identical to the old policy with two additions.

1. The data editor will review all code and data packets to ensure that the packets are complete.

2. In those cases in which the actual data cannot be disclosed, a pseudo data set must be provided. Articles will not be accepted outright until the data editor approves them.

## Papers I’m Reading

Uncertainty Quantification in Synthetic Controls with Staggered Treatment Adoption by Cattaneo, Feng, Palomba and Titiunik, thanks to **Giuseppe Cavaliere**

We propose principled prediction intervals to quantify the uncertainty of a large class of synthetic control predictions or estimators in settings with staggered treatment adoption, offering precise non-asymptotic coverage probability guarantees. From a methodological perspective, we provide a detailed discussion of different causal quantities to be predicted, which we call causal predictands, allowing for multiple treated units with treatment adoption at possibly different points in time. From a theoretical perspective, our uncertainty quantification methods improve on prior literature by (i) covering a large class of causal predictands in staggered adoption settings, (ii) allowing for synthetic control methods with possibly nonlinear constraints, (iii) proposing scalable robust conic optimization methods and principled data-driven tuning parameter selection, and (iv) offering valid uniform inference across post-treatment periods. We illustrate our methodology with a substantive empirical application studying the effects of economic liberalization in the 1990s on GDP for emerging European countries. Companion general-purpose software packages are provided in Python, R and Stata.

Log-like? Identified ATEs defined with zero-valued outcomes are (arbitrarily) scale-dependent by Chen and Roth

Economists frequently estimate average treatment effects (ATEs) for transformations of the outcome that are well-defined at zero but behave like log(y) when y is large (e.g., log(1+y), arcsinh(y)). We show that these ATEs depend arbitrarily on the units of the outcome, and thus should not be interpreted as percentage effects. In line with this result, we find that estimated treatment effects for arcsinh-transformed outcomes published in the American Economic Review change substantially when we multiply the units of the outcome by 100 (e.g., convert dollars to cents). To help delineate alternative approaches, we prove that when the outcome can equal zero, there is no average treatment effect of the form E(g(Y(1), Y(0))) that is point-identified and unit-invariant. We conclude by discussing sensible alternative target parameters for settings with zero-valued outcomes that relax at least one of these requirements.
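To see the units problem concretely, here is a small sketch of my own (not code from the paper). It takes a made-up sample in which treatment moves one unit off zero, then computes the difference in means of the transformed outcome with the outcome measured in dollars and again in cents. The data and the helper name `diff_in_means` are hypothetical.

```python
import math
from statistics import mean

# Hypothetical outcomes in dollars (made up for illustration, not from the paper).
# 60% of control units and 80% of treated units have a positive outcome.
CONTROL = [0, 0, 10, 20, 50]
TREATED = [0, 15, 30, 40, 80]


def diff_in_means(treated, control, transform, scale=1.0):
    """Difference in mean transformed outcomes after multiplying units by `scale`."""
    treated_mean = mean(transform(scale * y) for y in treated)
    control_mean = mean(transform(scale * y) for y in control)
    return treated_mean - control_mean


for name, transform in [("arcsinh(y)", math.asinh), ("log(1+y)", math.log1p)]:
    in_dollars = diff_in_means(TREATED, CONTROL, transform, scale=1)
    in_cents = diff_in_means(TREATED, CONTROL, transform, scale=100)
    print(f"{name}: dollars={in_dollars:.3f}, cents={in_cents:.3f}")
```

The estimate grows when the units shrink. Rescaling by 100 adds roughly log(100) to every positive transformed value and nothing to the zeros, so the estimate shifts by about the difference in the share of positive outcomes (0.2 here) times log(100).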

Wage Garnishment in the United States: New Facts from Administrative Payroll Records by DeFusco, Enriquez, and Yellen

Wage garnishment allows creditors to deduct money directly from workers’ paychecks to repay defaulted debts. We document new facts about wage garnishment between 2014–2019 using data from a large payroll processor who distributes paychecks to approximately 20% of U.S. private-sector workers. As of 2019, over one in every 100 workers was being garnished for delinquent debt. The average garnished worker experiences garnishment for five months, during which approximately 11% of gross earnings is remitted to their creditor(s). The beginning of a new garnishment is associated with an increase in job turnover rates but no intensive margin change in hours worked.

Empirical Asset Pricing via Machine Learning by Gu, Kelly and Xiu

We perform a comparative analysis of machine learning methods for the canonical problem of empirical asset pricing: measuring asset risk premia. We demonstrate large economic gains to investors using machine learning forecasts, in some cases doubling the performance of leading regression-based strategies from the literature. We identify the best performing methods (trees and neural networks) and trace their predictive gains to allowance of nonlinear predictor interactions that are missed by other methods. All methods agree on the same set of dominant predictive signals which includes variations on momentum, liquidity, and volatility. Improved risk premium measurement through machine learning simplifies the investigation into economic mechanisms of asset pricing and highlights the value of machine learning in financial innovation.

Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference by Xiong and Pelger

This paper develops the inferential theory for latent factor models estimated from large dimensional panel data with missing observations. We propose an easy-to-use all-purpose estimator for a latent factor model by applying principal component analysis to an adjusted covariance matrix estimated from partially observed panel data. We derive the asymptotic distribution for the estimated factors, loadings and the imputed values under an approximate factor model and general missing patterns. The key application is to estimate counterfactual outcomes in causal inference from panel data. The unobserved control group is modeled as missing values, which are inferred from the latent factor model. The inferential theory for the imputed values allows us to test for individual treatment effects at any time under general adoption patterns where the units can be affected by unobserved factors.

On Robust Inference in Time Series Regression by Baillie, Diebold, Kapetanios, and Kim, thanks to **Giuseppe Cavaliere**

Least squares regression with heteroskedasticity and autocorrelation consistent (HAC) standard errors has proved very useful in cross section environments. However, several major difficulties, which are generally overlooked, must be confronted when transferring the HAC estimation technology to time series environments. First, most economic time series have strong autocorrelation, which renders HAC regression parameter estimates highly inefficient. Second, strong autocorrelation similarly renders HAC conditional predictions highly inefficient. Finally, the structure of most popular HAC estimators is ill-suited to capture the autoregressive autocorrelation typically present in economic time series, which produces large size distortions and reduced power in hypothesis testing, in all but the largest sample sizes. We show that all three problems are largely avoided by the use of a simple dynamic regression (DynReg), which is easily implemented and also avoids possible problems concerning strong exogeneity. We demonstrate the advantages of DynReg with detailed simulations covering a range of practical issues.
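A rough sketch of the dynamic regression idea, under my own assumptions rather than the paper's simulation design: if y_t = beta * x_t + u_t with AR(1) errors u_t = rho * u_{t-1} + e_t, then substituting out u_t gives y_t = beta * x_t + rho * y_{t-1} - rho * beta * x_{t-1} + e_t, which has serially uncorrelated errors. The code simulates such a series and fits that regression by plain least squares. The helpers `ols` and `simulate`, and the parameter values, are hypothetical.

```python
import random


def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(T, beta=1.0, rho=0.8, seed=0):
    """Simulate y_t = beta * x_t + u_t with AR(1) errors u_t (assumed design)."""
    rng = random.Random(seed)
    x, y, u = [], [], 0.0
    for _ in range(T):
        u = rho * u + rng.gauss(0, 1)
        xt = rng.gauss(0, 1)
        x.append(xt)
        y.append(beta * xt + u)
    return x, y


x, y = simulate(2000)

# Static regression: y_t on a constant and x_t (errors remain autocorrelated).
static_coefs = ols([[1.0, xt] for xt in x], y)

# Dynamic regression: add one lag of y and one lag of x.
dynamic_coefs = ols(
    [[1.0, x[t], y[t - 1], x[t - 1]] for t in range(1, len(y))],
    y[1:],
)

print("static  [const, x_t]:", [round(c, 3) for c in static_coefs])
print("dynamic [const, x_t, y_{t-1}, x_{t-1}]:", [round(c, 3) for c in dynamic_coefs])
```

In this design the dynamic regression coefficients should land near beta on x_t, rho on y_{t-1}, and -rho * beta on x_{t-1}. In practice the lag lengths would have to be chosen, which this sketch does not attempt.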

## Code

Synthetic Diff-in-Diff Code:

R: https://github.com/synth-inference/synthdid

Stata: https://github.com/Daniel-Pailanir/sdid

Synthetic Control methods:

R+Stata+Python: https://nppackages.github.io/scpi/

Matrix completion methods from **Matrix Completion Methods for Causal Panel Data Models**:

https://github.com/susanathey/MCPanel