**February 2022.**

Is healthcare employment recession proof? We examine the hypothesis that healthcare employment is stable across the business cycle. We explicitly distinguish between negative aggregate demand and supply shocks in studying how healthcare employment responds to recessions, and show that this response depends largely on the type of exogenous shock triggering the recession. We find that healthcare employment responds procyclically to demand-induced recessions, with the reduction driven by layoffs and discharges rather than voluntary quits. In evaluating additional mechanisms, we find evidence of a reduction in real personal healthcare expenditures resulting from an adverse demand shock. By contrast, we find that healthcare employment is fairly stable and even responds countercyclically to supply-induced recessions, suggesting compositional changes such as downskilling, particularly in nursing sectors. Our findings establish that employment responses during economic downturns are heterogeneous across healthcare sub-sectors. More generally, by isolating the impact of the structural demand shock from the supply shock on healthcare employment, we provide new empirical evidence that healthcare employment is not recession proof.

**February 2022.**

This paper examines the econometric causal model for policy analysis developed from the seminal ideas of Ragnar Frisch and Trygve Haavelmo. We compare the econometric causal model with two popular causal frameworks: the Neyman-Holland causal model and the do-calculus. The Neyman-Holland causal model is based on the language of potential outcomes and was largely developed by statisticians. The do-calculus, developed by Judea Pearl and co-authors, relies on Directed Acyclic Graphs (DAGs) and is a popular causal framework in computer science. We make the case that economists who uncritically use these approximating frameworks often discard the substantial benefits of the econometric causal model to the detriment of more informative economic policy analyses. We illustrate the versatility and capabilities of the econometric framework using causal models that are frequently studied by economists.

**February 2022.**

Economists are obsessed with rankings of institutions, journals, or scholars according to the value of some feature of interest. These rankings are invariably computed using estimates rather than the true values of such features. As a result, there may be considerable uncertainty concerning the ranks. In this paper, we consider the problem of accounting for such uncertainty by constructing confidence sets for the ranks. We consider both the problem of constructing marginal confidence sets for the rank of, say, a particular journal as well as simultaneous confidence sets for the ranks of all journals. We apply these confidence sets to draw inferences about uncertainty in the ranking of economics journals and universities by impact factors.

**February 2022.**

This paper presents a framework for how to incorporate prior sources of information into the design of a sequential experiment. These sources can include previous experiments, expert opinions, or the experimenter's own introspection. We formalize this problem using a multi-prior Bayesian approach that maps each source to a Bayesian model. These models are aggregated according to their associated posterior probabilities. We evaluate a broad class of policy rules according to three criteria: whether the experimenter learns the parameters of the payoff distributions, the probability that the experimenter chooses the wrong treatment when deciding to stop the experiment, and the average rewards. We show that our framework exhibits several nice finite-sample properties, including robustness to any source that is not externally valid.
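The aggregation step described above can be sketched in a minimal Beta-Bernoulli setting. This is not the paper's model; the two prior sources, their Beta parameters, and the data are hypothetical, chosen only to show how posterior model probabilities downweight a source that conflicts with the observed payoffs.

```python
from math import lgamma, exp

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_likelihood(a, b, successes, failures):
    # Beta-Bernoulli marginal likelihood of the data under a Beta(a, b) prior
    return log_beta(a + successes, b + failures) - log_beta(a, b)

# Two hypothetical prior sources for a treatment's success probability:
# source "A" is optimistic, source "B" is pessimistic.
priors = {"A": (8.0, 2.0), "B": (2.0, 8.0)}
successes, failures = 80, 20  # observed experimental outcomes

# Posterior probability of each source's model (equal prior model weights).
log_ml = {k: log_marginal_likelihood(a, b, successes, failures)
          for k, (a, b) in priors.items()}
m = max(log_ml.values())
unnorm = {k: exp(v - m) for k, v in log_ml.items()}
total = sum(unnorm.values())
post_prob = {k: v / total for k, v in unnorm.items()}

# Aggregate posterior mean: mixture of the per-source posterior means.
post_mean = sum(post_prob[k] * (a + successes) / (a + b + successes + failures)
                for k, (a, b) in priors.items())
print(post_prob, post_mean)
```

With data this informative, nearly all posterior weight lands on the source consistent with the observed success rate, which is the sense in which the aggregation is robust to a source that is not externally valid.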

**February 2022.**

One strand of the literature in labor economics, household finance, and macroeconomics has studied whether individual earnings volatility has risen or fallen in the U.S. over the last several decades. There are disagreements in the empirical literature on this question, with some suggestions that the differences are the result of using flawed survey data instead of more accurate administrative data. This paper summarizes the results of a project to reconcile these findings with four different data sets and six different data series--three survey and three administrative data series, including two which match survey respondent data to their administrative data. Four of the six series show no significant trend in male earnings volatility over the last 20-to-30+ years when differences across the data sets are properly accounted for. A fifth shows a positive net trend, but one that is small in magnitude. A sixth shows no net trend from 1998 to 2011 and only a small decline thereafter. The remaining differences across data series can be largely explained by differences in the left tail of their cross-sectional earnings distributions. We conclude that the data sets we have analyzed show little evidence of any significant trend in male earnings volatility since the mid-1980s.

**February 2022.**

We study regressions with period and group fixed effects and several treatment variables. Under a parallel trends assumption, the coefficient on each treatment identifies the sum of two terms. The first term is a weighted sum of the effect of that treatment in each group and period, with weights that may be negative and sum to one. The second term is a sum of the effects of the other treatments, with weights summing to zero. Accordingly, coefficients in those regressions are not robust to heterogeneous effects, and may be contaminated by the effect of other treatments. We propose alternative differences-in-differences estimators. To estimate, say, the effect of the first treatment, our estimators compare the outcome evolution of a group whose first treatment changes while its other treatments remain unchanged, to control groups whose treatments all remain unchanged, and with the same baseline treatments or treatment history as the switching group. Those carefully selected comparisons are robust to heterogeneous effects, and do not suffer from the contamination problem.

**January 2022.**

Linear instrumental variable estimators, such as two-stage least squares (TSLS), are commonly interpreted as estimating positively weighted averages of causal effects, referred to as local average treatment effects (LATEs). We examine whether the LATE interpretation actually applies to the types of TSLS specifications that are used in practice. We show that if the specification includes covariates, which most empirical work does, then the LATE interpretation does not apply in general. Instead, the TSLS estimator will in general reflect treatment effects for both compliers and always/never-takers, and some of the treatment effects for the always/never-takers will necessarily be negatively weighted. We show that the only specifications that have a LATE interpretation are "saturated" specifications that control for covariates nonparametrically, implying that such specifications are both sufficient and necessary for TSLS to have a LATE interpretation, at least without additional parametric assumptions. This result is concerning because, as we document, empirical researchers almost never control for covariates nonparametrically, and rarely discuss or justify parametric specifications of covariates. We develop a decomposition that quantifies the extent to which the usual LATE interpretation fails. We apply the decomposition to four empirical analyses and find strong evidence that the LATE interpretation of TSLS is far from accurate for the types of specifications actually used in practice.

**January 2022.**

A prominent challenge when drawing causal inference using observational data is the ubiquitous presence of endogenous regressors. The classical econometric method to handle regressor endogeneity requires instrumental variables that must satisfy the stringent condition of exclusion restriction, making it infeasible to use in many settings. We propose new instrument-free methods using copulas to address the endogeneity problem. The existing copula correction method focuses only on the endogenous regressors and may yield biased estimates when exogenous and endogenous regressors are correlated. Furthermore, (nearly) normally distributed endogenous regressors cause model non-identification or poor finite-sample performance. Our proposed two-stage copula endogeneity correction (2sCOPE) method simultaneously overcomes the two key limitations and yields consistent causal-effect estimates with correlated endogenous and exogenous regressors as well as normally distributed endogenous regressors. 2sCOPE employs generated regressors derived from existing regressors to control for endogeneity, and is straightforward to use and broadly applicable. Moreover, we prove that exploiting correlated exogenous regressors can address the problem of insufficient regressor non-normality, relax identification requirements and improve estimation precision (by as much as ∼50% in empirical evaluation). Overall, 2sCOPE can greatly increase the ease of use and broaden the applicability of instrument-free methods for dealing with regressor endogeneity. We demonstrate the performance of 2sCOPE via simulation studies and an empirical application.]]>
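A minimal sketch of the generated-regressor idea may help fix intuition. This is not the paper's 2sCOPE procedure but the single-regressor copula correction it builds on: map the endogenous regressor through its empirical CDF and the inverse normal CDF, and add the result as a control. All data and parameter values below are simulated and hypothetical.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 20000

# Simulate an endogenous, non-normally distributed regressor:
# p = exp(z1), where z1 is correlated with the structural error e.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.7], [0.7, 1.0]], size=n)
p = np.exp(z[:, 0])          # endogenous regressor (lognormal, non-normal)
e = z[:, 1]                  # structural error, correlated with p
y = 1.0 + 2.0 * p + e        # true coefficient on p is 2

# Copula-generated regressor: empirical CDF ranks of p, pushed through
# the inverse normal CDF, included as an additional control.
p_star = norm.ppf((p.argsort().argsort() + 1) / (n + 1))

X_ols = np.column_stack([np.ones(n), p])
X_cop = np.column_stack([np.ones(n), p, p_star])
b_ols = np.linalg.lstsq(X_ols, y, rcond=None)[0]
b_cop = np.linalg.lstsq(X_cop, y, rcond=None)[0]
print(b_ols[1], b_cop[1])    # OLS is biased upward; the correction is near 2
```

Because p is a monotone transform of z1, the generated regressor approximately recovers z1, so conditioning on it removes the correlation between p and the error. When p is (nearly) normal, p and p_star become collinear, which is the identification failure the abstract describes.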

**January 2022.**

Statistically significant results are more rewarded than insignificant ones, so researchers have the incentive to pursue statistical significance. Such p-hacking reduces the informativeness of hypothesis tests by making significant results much more common than they are supposed to be in the absence of true significance. To address this problem, we construct critical values of test statistics such that, if these values are used to determine significance, and if researchers optimally respond to these new significance standards, then significant results occur with the desired frequency. Such incentive-compatible critical values allow for p-hacking, so they are larger than classical critical values. Using evidence from the social and medical sciences, we find that the incentive-compatible critical value for any test and any significance level is the classical critical value for the same test with approximately one fifth of the significance level—a form of Bonferroni correction. For instance, for a z-test with a significance level of 5%, the incentive-compatible critical value is 2.31 instead of 1.65 if the test is one-sided and 2.57 instead of 1.96 if the test is two-sided.
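The one-fifth rule above is easy to check numerically. The sketch below applies it literally (classical critical value at exactly one fifth of the significance level); since the abstract's factor is only approximate, the computed values come out close to, but not exactly at, the reported 2.31 and 2.57.

```python
from scipy.stats import norm

def incentive_compatible_z(alpha, two_sided=False):
    # Rule of thumb from the abstract: use the classical critical value
    # at roughly one fifth of the nominal significance level.
    a = alpha / 5.0
    return norm.ppf(1.0 - a / 2.0) if two_sided else norm.ppf(1.0 - a)

print(incentive_compatible_z(0.05))                  # ~2.33, vs classical 1.645
print(incentive_compatible_z(0.05, two_sided=True))  # ~2.58, vs classical 1.96
```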

**January 2022.**

Linear regressions with period and group fixed effects are widely used to estimate policies' effects: 26 of the 100 most cited papers published by the American Economic Review from 2015 to 2019 estimate such regressions. It has recently been shown that those regressions may produce misleading estimates if the policy's effect is heterogeneous between groups or over time, as is often the case. This survey reviews a fast-growing literature that documents this issue, and that proposes alternative estimators robust to heterogeneous effects.
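A tiny deterministic example shows how such regressions can mislead. The numbers are hypothetical: two groups adopt a treatment at different dates, per-period effects are 1 in one group and 3 in the other, and there is no noise or trend, yet the two-way fixed effects coefficient differs from the average effect among treated cells because some cells receive negative weight.

```python
import numpy as np

# Hypothetical 2-group, 3-period panel. Group A is treated from period 2,
# group B only in period 3. Per-period treatment effects: 1 in A, 3 in B.
groups  = np.array([0, 0, 0, 1, 1, 1])     # A = 0, B = 1
periods = np.array([0, 1, 2, 0, 1, 2])
D       = np.array([0, 1, 1, 0, 0, 1])     # treatment indicator
effect  = np.where(groups == 0, 1.0, 3.0)
Y       = effect * D                       # no trends, no noise

# Two-way fixed effects regression: Y on group dummy, period dummies, and D.
X = np.column_stack([
    np.ones(6),
    groups,
    (periods == 1).astype(float),
    (periods == 2).astype(float),
    D,
])
beta = np.linalg.lstsq(X, Y, rcond=None)[0][-1]

avg_effect = Y[D == 1].mean()   # average effect among treated cells: 5/3
print(beta, avg_effect)         # beta = 2.0, not 5/3
```

Working through the two-way demeaning by hand shows the early-adopting group's later treated cell gets weight -0.5, which is what pushes the coefficient away from the average effect.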

**January 2022.**

Using Dutch time-diary data from 1975-2005 covering over 10,000 respondents for 7 consecutive days each, we show that individuals’ sleep time exhibits both variability and volatility characterized by stationary autoregressive conditional heteroscedasticity: The absolute values of deviations from a person’s average sleep on one day are positively correlated with those on the next day. Sleep is more variable on weekends and among people who have less education, are younger, and do not have young children at home. Volatility is greater among parents with young children, slightly greater among men than women, but independent of other demographics. A theory of economic incentives to minimize the dispersion of sleep predicts that higher-wage workers will exhibit less dispersion, a result demonstrated using extraneous estimates of earnings equations to impute wage rates. Volatility in sleep spills over onto volatility in other personal activities, with no reverse causation onto sleep. The results illustrate a novel dimension of economic inequality and could be applied to a wide variety of human behavior and biological processes.

**January 2022.**

Economists routinely make functional form assumptions about consumer demand to obtain welfare estimates. How sensitive are welfare estimates to these assumptions? We answer this question by providing bounds on welfare that hold for families of demand curves commonly considered in different literatures. We show that commonly chosen functional forms, such as linear, exponential and CES demand, are extremal in different families: they yield either the highest or lowest welfare estimate among all demand curves in those families. To illustrate our approach, we apply our results to the welfare analysis of trade tariffs, income taxation, and energy subsidies.
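The sensitivity at issue can be made concrete with a toy calculation. Under a hypothetical calibration, a linear and a constant-elasticity demand curve pass through the same point with the same elasticity, yet imply different consumer-surplus losses from the same price increase; the functional-form choice alone moves the welfare estimate.

```python
from scipy.integrate import quad

p0, q0, eps = 1.0, 100.0, -2.0   # hypothetical calibration point and elasticity
p1 = 1.2 * p0                    # evaluate a 20% price increase

def q_linear(p):
    # Linear demand through (p0, q0) with elasticity eps at that point.
    return max(q0 * (1.0 + eps * (p - p0) / p0), 0.0)

def q_isoelastic(p):
    # Constant-elasticity demand through the same point.
    return q0 * (p / p0) ** eps

# Consumer-surplus loss from the price increase: area under demand on [p0, p1].
loss_lin, _ = quad(q_linear, p0, p1)
loss_iso, _ = quad(q_isoelastic, p0, p1)
print(loss_lin, loss_iso)   # 16.0 vs 16.67: same calibration, different welfare
```

The convex isoelastic curve lies above the linear one beyond the calibration point, so it gives the larger loss; within a family of curves bracketed by these two shapes, the two functional forms deliver the extreme welfare estimates.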

**January 2022.**

This chapter analyzes the implications of the unexpected 2020-2021 COVID-19 pandemic for work and retirement in the U.S. The pandemic induced the greatest loss of jobs in the shortest period of time in U.S. history. A slow economic recovery would surely have endangered work longer/retire later policies that seek to adjust the finances of Social Security retirement to an aging population. Boosted by the huge CARES (March 2020) and ARPA (April 2021) rescue packages, the early recovery from the COVID-19 recession was faster and stronger than the recovery from the 2007-2009 Great Recession. Even so, the pandemic greatly altered the job market, with workers suffering from long COVID having difficulty returning to work and more workers working from home. In its immediate effect and potential long-run impact, the pandemic recession/recovery is a wake-up call to the danger that shocks from the natural world pose to work and retirement. Realistic planning for the future of work and retirement should go beyond analyzing socioeconomic trends to analyzing expected unexpected changes from the natural world as well.

**January 2022.**

Do urban children live more segregated lives than urban adults? Using cellphone location data and following the ‘experienced isolation’ methodology of Athey et al. (2021), we compare the isolation of students over the age of 16—whom we identify based on their time spent at a high school—and adults. We find that students in cities experience significantly less integration in their day-to-day lives than adults. The average student experiences 27% more isolation outside of the home than the average adult. Even when comparing students and adults living in the same neighborhood, exposure to devices associated with a different race is 20% lower for students. Looking at broader measures of urban mobility, we find that students spend more time at home, more time closer to home when they do leave the house, and less time at school than adults spend at work. Finally, we find correlational evidence that neighborhoods with more geographic mobility today also had more intergenerational income mobility in the past. We hope future work will more rigorously test the hypothesis that different geographic mobility patterns for children and adults can explain why urban density appears to boost adult wages but reduce intergenerational income mobility.

**January 2022.**

We propose methods for constructing regularized mixtures of density forecasts. We explore a variety of objectives and regularization penalties, and we use them in a substantive exploration of Eurozone inflation and real interest rate density forecasts. All individual inflation forecasters (even the ex post best forecaster) are outperformed by our regularized mixtures. From the Great Recession onward, the optimal regularization tends to move density forecasts' probability mass from the centers to the tails, correcting for overconfidence.
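A regularized mixture of this kind can be sketched in a few lines. This is not the paper's estimator; it is one plausible instance of the idea, with hypothetical data: mixture weights on the simplex chosen to maximize the average log score, with an L2 penalty shrinking the weights toward the equal-weight combination.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.5, size=200)   # hypothetical realized outcomes

# Three individual density forecasts: overconfident, diffuse, and biased.
forecasts = [norm(0.0, 1.0), norm(0.0, 3.0), norm(1.0, 1.5)]
dens = np.column_stack([f.pdf(y) for f in forecasts])

def objective(w, lam=0.5):
    # Negative average log score of the mixture density, plus an L2
    # penalty shrinking the weights toward the equal-weight mixture.
    mix = np.clip(dens @ w, 1e-300, None)
    return -np.mean(np.log(mix)) + lam * np.sum((w - 1.0 / len(w)) ** 2)

k = dens.shape[1]
res = minimize(
    objective,
    x0=np.full(k, 1.0 / k),
    bounds=[(0.0, 1.0)] * k,
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
)
print(res.x)   # regularized mixture weights
```

Raising the penalty weight pulls the solution toward equal weights; lowering it lets the log score dominate, which is the trade-off the paper's regularization schemes navigate.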

**January 2022.**

This paper explores methods for inferring the causal effects of treatments on choices by combining data on real choices with hypothetical evaluations. We propose a class of estimators, identify conditions under which they yield consistent estimates, and derive their asymptotic distributions. The approach is applicable in settings where standard methods cannot be used (e.g., due to the absence of helpful instruments, or because the treatment has not been implemented). It can recover heterogeneous treatment effects more comprehensively, and can improve precision. We provide proof of concept using data generated in a laboratory experiment and through a field application.

**December 2021.**

We evaluate the impact of government mandated proof of vaccination requirements for access to public venues and non-essential businesses on COVID-19 vaccine uptake. Using the variation in the timing of these measures across Canadian provinces in a difference-in-differences approach, we find that the announcement of a mandate is associated with a rapid and significant surge in new vaccinations (more than a 60% increase in weekly first doses). Time-series analysis for each province and for France, Italy and Germany corroborates this finding, and we estimate cumulative gains of up to 5 percentage points in provincial vaccination rates and 790,000 or more first doses for Canada as a whole as of October 31, 2021 (5 to 13 weeks after the provincial mandate announcements). We also find large vaccination gains in France (3 to 5 million first doses), Italy (around 6 million) and Germany (around 3.5 million) 11 to 16 weeks after the proof of vaccination mandate announcements.

**December 2021.**

The COVID-19 pandemic brought unprecedented policy responses and a large literature evaluating their impacts. This paper re-examines this literature and investigates the role of researchers' degrees-of-flexibility on the estimated effects of mobility-reducing policies on social-distancing behavior. We find that two-way fixed effects estimates are not robust to minor changes in usually-unexplored dimensions of the degree-of-flexibility space. While standard robustness tests based on the sequential addition of covariates are very stable, small changes in the outcome variable and its transformation lead to large and sometimes contradictory changes in the estimates, where the same policy can be found to significantly increase or decrease mobility. Yet, due to the large number of degrees-of-flexibility, one can focus on a set of results that appears stable, while ignoring problematic ones. We show that recently developed heterogeneity-robust difference-in-differences estimators only partially mitigate these issues, and discuss how a strategy of identifying the point at which a sequence of ever more-stringent robustness tests eventually fails could increase the credibility of policy evaluations.

**December 2021.**

We evaluate how nonresponse affects conclusions drawn from survey data and consider how researchers can reliably test and correct for nonresponse bias. To do so, we examine a survey on labor market conditions during the COVID-19 pandemic that used randomly assigned financial incentives to encourage participation. We link the survey data to administrative data sources, allowing us to observe a ground truth for participants and nonparticipants. We find evidence of large nonresponse bias, even after correcting for observable differences between participants and nonparticipants. We apply a range of existing methods that account for nonresponse bias due to unobserved differences, including worst-case bounds, bounds that incorporate monotonicity assumptions, and approaches based on parametric and nonparametric selection models. These methods produce bounds (or point estimates) that are either too wide to be useful or far from the ground truth. We show how these shortcomings can be addressed by modeling how nonparticipation can be both active (declining to participate) and passive (not seeing the survey invitation). The model makes use of variation from the randomly assigned financial incentives, as well as the timing of reminder emails. Applying the model to our data produces bounds (or point estimates) that are narrower and closer to the ground truth than the other methods.
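The worst-case bounds mentioned above are the simplest of the methods compared, and a quick sketch shows why they tend to be too wide to be useful. The numbers are hypothetical, for a binary outcome bounded in [0, 1].

```python
def worst_case_bounds(mean_respondents, response_rate, y_min=0.0, y_max=1.0):
    """Manski-style worst-case bounds on the population mean of a bounded
    outcome when nonrespondents' outcomes are completely unknown."""
    r = response_rate
    lower = mean_respondents * r + y_min * (1.0 - r)
    upper = mean_respondents * r + y_max * (1.0 - r)
    return lower, upper

# Hypothetical survey: 40% response rate, respondents' employment rate 0.7.
lo, hi = worst_case_bounds(0.7, 0.4)
print(lo, hi)   # (0.28, 0.88): a 60-point-wide interval
```

With a 40% response rate the bounds span 60 percentage points, which is the motivation for the tighter, model-based bounds the paper develops.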

**December 2021.**

We resuscitated the mixed-frequency vector autoregression (MF-VAR) developed in Schorfheide and Song (2015, JBES) to generate macroeconomic forecasts for the U.S. during the COVID-19 pandemic in real time. The model combines eleven time series observed at two frequencies: quarterly and monthly. We deliberately did not modify the model specification in view of the COVID-19 outbreak, except for the exclusion of crisis observations from the estimation sample. We compare the MF-VAR forecasts to the median forecast from the Survey of Professional Forecasters (SPF). While the MF-VAR performed poorly during 2020:Q2, subsequent forecasts were on par with the SPF forecasts. We show that excluding a few months of extreme observations is a promising way of handling VAR estimation going forward, as an alternative to sophisticated modeling of outliers.

**December 2021.**

Using a highly stylized dynamic microsimulation model, we project the labor force of the United States up to the year 2060 and contrast these projections with projections for Germany to assess differential effects on outcomes. The projections are consistent with the U.S. Census Bureau’s and Eurostat’s demographic projections. Our modeling approach allows us to show and quantify how policy changes the future size of the labor force, which we assess with a series of what-if scenarios.

Both the US and Germany are expected to undergo demographic aging, but their demographic fundamentals differ starkly. This has strong implications for their labor force developments. According to our microsimulation, the US labor force will, despite population aging, increase by 16.2 percent in the age groups 15 to 74 (corresponding to 25.2 million workers) between 2020 and 2060, while Germany will experience a decline by 10.7 percent (4.4 million workers). In these baseline projections, improvements in the education structure will add about two million persons to the US labor force and about half a million persons to the German labor force by 2060.

In the what-if scenarios, we examine the implications of improvements in the educational structure of the population and of policies which address the health impediments for labor force participation. Of the educational scenarios that we evaluate, increasing the number of persons who achieve more than lower education has the strongest positive impact on labor force participation, relative to the number of additional years of schooling implied by the various scenarios. Shifting people from intermediate to higher education levels also increases labor force participation in higher age groups; however, this is partially offset by lock-in effects at younger ages.

Our projections highlight that improvements in the labor market integration of people with health limitations provide a particularly promising avenue to increase labor force participation rates and thus help to address the challenges posed by demographic aging. If the health gap in participation rates in the United States were similar to that currently observed in Sweden, the labor force in 2060 would be larger by about 14.9 million persons.

**December 2021.**

We propose an approach to modeling and estimating discrete choice demand that allows for a large number of zero sale observations, rich unobserved heterogeneity, and endogenous prices. We do so by modeling small market sizes through Poisson arrivals. Each of these arriving consumers then solves a standard discrete choice problem. We present a Bayesian IV estimation approach that addresses sampling error in product shares and scales well to rich data environments. The data requirements are traditional market-level data and measures of consumer search intensity. After presenting simulation studies, we consider an empirical application of air travel demand where product-level sales are sparse. We find considerable variation in demand over time. Periods of peak demand feature both larger market sizes and consumers with higher willingness to pay. This amplifies cyclicality. However, observed frequent price and capacity adjustments offset some of this compounding effect.
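The arrival-then-choice structure can be sketched as a short simulation. This is only the data-generating side, not the paper's Bayesian IV estimator, and the arrival rate, utilities, and prices are hypothetical; with a small Poisson arrival rate, zero-sale observations arise naturally.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_market(lam, delta, price, alpha):
    """Poisson consumer arrivals; each arrival solves a logit discrete
    choice over J products plus an outside option (utility normalized to 0)."""
    m = rng.poisson(lam)                    # realized market size
    v = delta - alpha * price               # mean utilities of the J products
    expv = np.exp(np.append(v, 0.0))        # include the outside option
    shares = expv / expv.sum()              # logit choice probabilities
    choices = rng.multinomial(m, shares)    # product sales + outside count
    return m, choices[:-1]

# Hypothetical 2-product market: a small arrival rate makes zero sales common.
m, sales = simulate_market(lam=5.0, delta=np.array([1.0, 0.5]),
                           price=np.array([2.0, 1.0]), alpha=0.8)
print(m, sales)
```

Simulating many such markets produces the sparse, zero-heavy sales data the abstract describes, while the underlying choice probabilities remain smooth, which is what separates sampling error in shares from true demand.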

**November 2021.**

It is common to rank different categories by means of preferences that are revealed through data on choices. A prominent example is the ranking of political candidates or parties using the estimated share of support each one receives in surveys or polls about political attitudes. Since these rankings are computed using estimates of the share of support rather than the true share of support, there may be considerable uncertainty concerning the true ranking of the political candidates or parties. In this paper, we consider the problem of accounting for such uncertainty by constructing confidence sets for the rank of each category. We consider both the problem of constructing marginal confidence sets for the rank of a particular category as well as simultaneous confidence sets for the ranks of all categories. A distinguishing feature of our analysis is that we exploit the multinomial structure of the data to develop confidence sets that are valid in finite samples. We additionally develop confidence sets using the bootstrap that are valid only approximately in large samples. We use our methodology to rank political parties in Australia using data from the 2019 Australian Election Survey. We find that our finite-sample confidence sets are informative across the entire ranking of political parties, even in Australian territories with few survey respondents and/or with parties that are chosen by only a small share of the survey respondents. In contrast, the bootstrap-based confidence sets may sometimes be considerably less informative. These findings motivate us to compare these methods in an empirically-driven simulation study, in which we conclude that our finite-sample confidence sets often perform better than their large-sample, bootstrap-based counterparts, especially in settings that resemble our empirical application.
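To see what a finite-sample confidence set for ranks looks like, here is a deliberately crude, conservative construction, not the paper's method: Bonferroni-adjusted exact Clopper-Pearson intervals for each share, then rank bounds from pairwise comparisons of the intervals. The poll counts are hypothetical.

```python
import numpy as np
from scipy.stats import beta

def rank_confidence_sets(counts, alpha=0.05):
    """Conservative simultaneous confidence sets for ranks (1 = largest share):
    Bonferroni-adjusted exact Clopper-Pearson intervals for each share,
    then rank bounds from pairwise comparisons of the intervals."""
    counts = np.asarray(counts)
    n, k = counts.sum(), len(counts)
    a = alpha / k                            # Bonferroni adjustment
    lo = beta.ppf(a / 2, counts, n - counts + 1)
    hi = beta.ppf(1 - a / 2, counts + 1, n - counts)
    lo = np.nan_to_num(lo, nan=0.0)          # ppf is nan when count == 0
    hi = np.nan_to_num(hi, nan=1.0)          # or when count == n
    # Category j's rank is at least 1 + (# categories surely ahead of it)
    # and at most 1 + (# categories that could still be ahead of it).
    rank_lo = np.array([1 + np.sum(lo > hi[j]) for j in range(k)])
    rank_hi = np.array([1 + np.sum((hi > lo[j]) & (np.arange(k) != j))
                        for j in range(k)])
    return rank_lo, rank_hi

# Hypothetical poll: four parties, 1000 respondents.
rank_lo, rank_hi = rank_confidence_sets([500, 300, 150, 50])
print(rank_lo, rank_hi)
```

With well-separated shares the rank sets collapse to singletons; with close shares or few respondents they widen, which is the uncertainty the paper's (tighter) multinomial construction quantifies.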

**November 2021.**

We study how organizational boundaries affect pricing decisions using comprehensive data from a large U.S. airline. We document that the firm's advanced pricing algorithm, utilizing inputs from different organizational teams, is subject to multiple biases. To quantify the impacts of these biases, we estimate a structural demand model using sales and search data. We recover the demand curves the firm believes it faces using forecasting data. In counterfactuals, we show that correcting the biases introduced by organizational teams individually has little impact on market outcomes, but coordinating organizational outcomes leads to higher prices/revenues and increased deadweight loss in the markets studied.

**November 2021.**

We document substantial variation in the effects of a highly-effective literacy program in northern Uganda. The program increases test scores by 1.40 SDs on average, but standard statistical bounds show that the impact standard deviation exceeds 1.0 SD. This implies that the variation in effects across our students is wider than the spread of mean effects across all randomized evaluations of developing country education interventions in the literature. This very effective program does indeed leave some students behind. At the same time, we do not learn much from our analyses that attempt to determine which students benefit more or less from the program. We reject rank preservation, and the weaker assumption of stochastic increasingness leaves wide bounds on quantile-specific average treatment effects. Neither conventional nor machine-learning approaches to estimating systematic heterogeneity capture more than a small fraction of the variation in impacts given our available candidate moderators.
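One standard bound of the kind referred to above uses only the marginal outcome distributions: by Cauchy-Schwarz on the covariance of potential outcomes, the SD of individual effects is at least the gap between the treated and control outcome SDs. The data below are simulated and hypothetical, not the Uganda sample.

```python
import numpy as np

def impact_sd_bounds(y_treat, y_ctrl):
    """Bounds on the SD of individual treatment effects from the marginal
    outcome distributions alone (Cauchy-Schwarz on Cov(Y1, Y0)):
        |sd(Y1) - sd(Y0)|  <=  sd(effect)  <=  sd(Y1) + sd(Y0)."""
    s1, s0 = np.std(y_treat), np.std(y_ctrl)
    return abs(s1 - s0), s1 + s0

# Hypothetical test-score data, in control-group SD units.
rng = np.random.default_rng(3)
y_ctrl = rng.normal(0.0, 1.0, size=1000)
y_treat = rng.normal(1.4, 2.2, size=1000)   # treated scores far more dispersed
lo, hi = impact_sd_bounds(y_treat, y_ctrl)
print(lo, hi)   # large lower bound: effects must be highly heterogeneous
```

When the treatment visibly widens the outcome distribution, as here, even this weakest bound forces substantial effect heterogeneity, without any assumption about how individual outcomes are paired across treatment states.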

**November 2021.**

Two-stage least squares estimates in heavily over-identified instrumental variables (IV) models can be misleadingly close to the corresponding ordinary least squares (OLS) estimates when many instruments are weak. Just-identified (just-ID) IV estimates using a single instrument are also biased, but the importance of weak-instrument bias in just-ID IV applications remains contentious. We argue that in microeconometric applications, just-ID IV estimators can typically be treated as all but unbiased and that the usual inference strategies are likely to be adequate. The argument begins with contour plots for confidence interval coverage as a function of instrument strength and explanatory variable endogeneity. These show undercoverage in excess of 5% only for endogeneity beyond that seen even when IV and OLS estimates differ by an order of magnitude. Three widely cited microeconometric applications are used to explain why endogeneity is likely low enough for IV estimates to be reliable. We then show that an estimator that’s unbiased given a population first-stage sign restriction has bias exceeding that of IV when the restriction is imposed on the data. But screening on the sign of the estimated first stage is shown to halve the median bias of conventional IV without reducing coverage. To the extent that sign-screening is already part of empirical workflows, reported IV estimates enjoy the minimal bias of sign-screened just-ID IV.
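The sign-screening procedure itself is mechanically simple, as this Monte Carlo sketch shows. The design (sample size, first-stage strength, error correlation) is hypothetical and the sketch only implements the mechanics; it makes no attempt to reproduce the paper's bias-halving or coverage results.

```python
import numpy as np

rng = np.random.default_rng(4)

def iv_monte_carlo(reps=4000, n=100, pi=0.2, beta=1.0, rho=0.8):
    """Just-identified IV with a weak first stage (slope pi) and
    correlated structural and first-stage errors (correlation rho)."""
    estimates, first_stages = [], []
    for _ in range(reps):
        z = rng.normal(size=n)
        u = rng.normal(size=n)
        e = rho * u + np.sqrt(1 - rho**2) * rng.normal(size=n)
        x = pi * z + u                 # first stage
        y = beta * x + e               # structural equation
        estimates.append(np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1])
        first_stages.append(np.polyfit(z, x, 1)[0])
    return np.array(estimates), np.array(first_stages)

est, fs = iv_monte_carlo()
screened = est[fs > 0]   # keep draws whose estimated first stage has the right sign
print(np.median(est) - 1.0, np.median(screened) - 1.0)
```

Draws in which the estimated first stage has the wrong sign are exactly the ones an empiricist would discard in practice, which is why the paper argues that reported just-ID IV estimates already enjoy the bias properties of the screened estimator.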

**November 2021.**

There is growing concern that the increasing use of machine learning and artificial intelligence-based systems may exacerbate health disparities through discrimination. We provide a hierarchical definition of discrimination consisting of algorithmic discrimination arising from predictive scores used for allocating resources and human discrimination arising from allocating resources by human decision-makers conditional on these predictive scores. We then offer an overarching statistical framework of algorithmic discrimination through the lens of measurement errors, which is familiar to the health economics audience. Specifically, we show that algorithmic discrimination exists when measurement errors exist in either the outcome or the predictors, and there is endogenous selection for participation in the observed data. The absence of any of these phenomena would eliminate algorithmic discrimination. We show that although equalized odds constraints can be employed as bias-mitigating strategies, such constraints may increase algorithmic discrimination when there is measurement error in the dependent variable.

**November 2021.**

We review approaches to identification and inference on models in Industrial Organization with partial identification and/or moment inequalities. Often, such approaches are intentionally built directly on assumptions of optimizing behavior that are credible in Industrial Organization settings, while avoiding the use of strong modeling and measurement assumptions that may not be warranted. The result is an identified set for the object of interest, reflecting what the econometrician can learn from the data and assumptions. The chapter formally defines identification, reviews the assumptions underlying the identification argument, and provides examples of their use in Industrial Organization settings. We then discuss the corresponding statistical inference problem, paying particular attention to practical implementation issues.

**October 2021.**

We study the performance of many traditional and novel, text-based variables for in-sample and out-of-sample forecasting of oil spot, futures, and energy company stock returns, and changes in oil volatility, production, and inventories. After controlling for small-sample biases, we find evidence of in-sample predictability. Our text measures, derived using energy news articles, hold their own against traditional variables. While we cannot identify ex-ante rules for selecting successful out-of-sample forecasters, an analysis of all possible two-variable models reveals out-of-sample performance above that expected under random variation. Our findings provide new directions for identifying robust forecasting models for oil markets, and beyond.

**October 2021.**

The proposed change under the American Families Plan (AFP) to the Tax Cuts and Jobs Act (TCJA) Child Tax Credit (CTC) would increase maximum benefit amounts to $3,000 or $3,600 per child (up from $2,000 per child) and make the full credit available to all low- and middle-income families regardless of earnings or income. We estimate the anti-poverty, targeting, and labor supply effects of the expansion by linking survey data with administrative tax and government program data, which form part of the Comprehensive Income Dataset (CID). Initially ignoring any behavioral responses, we estimate that the expansion of the CTC would reduce child poverty by 34% and deep child poverty by 39%. The expansion of the CTC would have a larger anti-poverty effect on children than any existing government program, though at a higher cost per child raised above the poverty line than any other means-tested program. Relatedly, the CTC expansion would allocate a smaller share of its total dollars to families at the bottom of the income distribution—as well as families with the lowest levels of long-term income, education, or health—than any existing means-tested program with the exception of housing assistance. We then simulate anti-poverty effects accounting for labor supply responses. By replacing the TCJA CTC (which contained substantial work incentives akin to the Earned Income Tax Credit) with a universal basic income-type benefit, the CTC expansion reduces the return to working at all by at least $2,000 per child for most workers with children. Relying on elasticity estimates consistent with mainstream simulation models and the academic literature, we estimate that this change in policy would lead 1.5 million workers (constituting 2.6% of all working parents) to exit the labor force. The decline in employment and the consequent earnings loss would mean that child poverty would only fall by 22% and deep child poverty would not fall at all with the CTC expansion.

**October 2021.**

We examine how the net worth of billionaires relates to their looks, as rated by 16 people of different genders and ethnicities. Surprisingly, their financial assets are unrelated to their beauty; nor are they related to their educational attainment. As a group, however, billionaires are both more educated and better-looking than average for their age. Men, people who reside in Western countries, and those who inherited substantial wealth are wealthier than other billionaires. The results do not arise from measurement error or nonrandom sample selectivity. They are consistent with econometric theory about the impact of truncating a sample to include observations only from the extreme tail of the dependent variable. The point is underscored by comparing estimates of earnings equations using all employees in the 2018 American Community Survey to those using a sample of the top 0.1 percent. The findings suggest the powerful role of luck within the extremes of the distributions of economic outcomes.
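The truncation point can be checked with a quick simulation (hypothetical data, not the paper's): when a predictor genuinely matters, restricting the sample to the extreme upper tail of the dependent variable drives the OLS slope toward zero, because units with low values of the predictor appear in the tail only when their error draws are unusually large.

```python
import random

# Stylized check of truncation-induced attenuation: a predictor with a
# true unit slope looks nearly irrelevant once the sample is truncated
# to the extreme upper tail of the outcome.
random.seed(1)
n = 200_000
beta = 1.0
data = []
for _ in range(n):
    x = random.gauss(0, 1)                 # e.g., rated looks
    y = beta * x + random.gauss(0, 1)      # e.g., log net worth
    data.append((x, y))

def ols_slope(pairs):
    m = len(pairs)
    mx = sum(p[0] for p in pairs) / m
    my = sum(p[1] for p in pairs) / m
    cov = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    var = sum((p[0] - mx) ** 2 for p in pairs)
    return cov / var

slope_full = ols_slope(data)

# Keep only the top 1% of y -- the analogue of studying only billionaires.
data.sort(key=lambda p: p[1], reverse=True)
slope_top = ols_slope(data[: n // 100])

print(f"slope, full sample : {slope_full:.2f}")
print(f"slope, top 1% of y : {slope_top:.2f}")
```

The full-sample slope recovers the true coefficient while the tail-sample slope collapses most of the way to zero, consistent with the paper's interpretation of null beauty and education effects among billionaires.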

**October 2021.**

This paper extends my research applying statistical decision theory to treatment choice with sample data, using maximum regret to evaluate the performance of treatment rules. The specific new contribution is to study as-if optimization using estimates of illness probabilities in clinical choice between surveillance and aggressive treatment. Beyond its specifics, the paper sends a broad message. Statisticians and computer scientists have addressed conditional prediction for decision making in indirect ways, the former applying classical statistical theory and the latter measuring prediction accuracy in test samples. Neither approach is satisfactory. Statistical decision theory provides a coherent, generally applicable methodology.
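A stripped-down version of the maximum-regret calculation may help fix ideas (the loss values here are illustrative, not the paper's clinical calibration). An as-if optimization rule plugs the estimated illness probability into the known-probability decision rule; its maximum regret is the worst-case expected loss shortfall relative to the best action, over all true illness probabilities.

```python
from math import comb

# Sketch of maximum regret for an "as-if optimization" treatment rule.
# Two actions: surveillance (expected loss p * h if the patient is ill)
# or aggressive treatment (fixed loss c). With p known, treating is
# optimal iff p > c / h. The as-if rule plugs in the estimate k / n,
# where k ~ Binomial(n, p) counts ill patients in a sample of size n.
h, c = 1.0, 0.3
p_star = c / h

def max_regret(n, grid=200):
    """Maximum regret of the rule 'treat iff k/n > p_star' over a grid of p."""
    worst = 0.0
    for i in range(grid + 1):
        p = i / grid
        # probability the sample-based rule chooses 'treat'
        p_treat = sum(comb(n, k) * p**k * (1 - p) ** (n - k)
                      for k in range(n + 1) if k / n > p_star)
        loss_rule = p_treat * c + (1 - p_treat) * p * h
        loss_best = min(c, p * h)           # loss of the optimal action at p
        worst = max(worst, loss_rule - loss_best)
    return worst

print(f"max regret, n=10 : {max_regret(10):.3f}")
print(f"max regret, n=100: {max_regret(100):.3f}")
```

Maximum regret shrinks with sample size and peaks at illness probabilities near the decision threshold, where the sample estimate most often points to the wrong action; this is the kind of uniform performance guarantee that neither classical statistical theory nor test-sample accuracy delivers.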

**October 2021.**

Identification in VARs has traditionally relied mainly on second moments. Some researchers have considered using higher moments as well, but there are concerns about the strength of the identification obtained in this way. In this paper, we propose refining existing identification schemes by augmenting sign restrictions with a requirement that rules out shocks whose higher moments significantly depart from independence. This approach does not assume that higher moments help with identification; it is robust to weak identification. In simulations, we show that it controls coverage well, in contrast to approaches that assume that the higher moments deliver point identification. However, it requires large sample sizes and/or considerable non-normality to reduce the width of confidence intervals by much. We consider some empirical applications. We find that the approach can reject many possible rotations. The resulting confidence sets for impulse responses may be non-convex, corresponding to disjoint parts of the space of rotation matrices. We show that in this case, augmenting sign and magnitude restrictions with an independence requirement can yield bigger gains.
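A toy two-shock example shows what higher moments can deliver when non-normality is strong (our illustration, and one that assumes non-Gaussianity point-identifies the rotation; the paper's proposal is deliberately robust to that assumption failing). Independent non-Gaussian structural shocks are mixed by an unknown rotation, and dependence between the squares of candidate unmixed shocks pins down the rotation angle.

```python
import math
import random

# Two independent, heavily non-Gaussian structural shocks are mixed by a
# rotation R(theta0); we scan candidate rotations and pick the one whose
# unmixed shocks have the least dependent squares (a higher-moment check,
# since any rotation matches the second moments).
random.seed(2)
n = 20_000
theta0 = 0.6  # true rotation angle

# Centered exponential shocks: mean zero, strongly non-normal.
e1 = [random.expovariate(1.0) - 1.0 for _ in range(n)]
e2 = [random.expovariate(1.0) - 1.0 for _ in range(n)]
c0, s0 = math.cos(theta0), math.sin(theta0)
u1 = [c0 * a - s0 * b for a, b in zip(e1, e2)]   # reduced-form residuals
u2 = [s0 * a + c0 * b for a, b in zip(e1, e2)]

def corr(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def dependence(theta):
    """Dependence between squared candidate shocks v = R(theta)' u."""
    c, s = math.cos(theta), math.sin(theta)
    v1 = [c * a + s * b for a, b in zip(u1, u2)]
    v2 = [-s * a + c * b for a, b in zip(u1, u2)]
    return abs(corr([x * x for x in v1], [y * y for y in v2]))

angles = [i * 0.02 for i in range(79)]   # grid over [0, pi/2)
best = min(angles, key=dependence)
print(f"true angle {theta0:.2f}, recovered angle {best:.2f}")
```

With shocks this non-normal and a sample this large, the dependence criterion is minimized close to the true angle; with near-Gaussian shocks or small samples the criterion flattens out, which is exactly the weak-identification concern that motivates treating the independence requirement as a refinement of sign restrictions rather than a point-identifying assumption.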

**September 2021.**

Demand elasticities and other features of demand are critical determinants of the answers to most positive and normative questions about market power or the functioning of markets in practice. As a result, reliable demand estimation is an essential input to many types of research in Industrial Organization and other fields of economics. This chapter presents a discussion of some foundational issues in demand estimation. We focus on the distinctive challenges of demand estimation and strategies one can use to overcome them. We cover core models, alternative data settings, common estimation approaches, the role and choice of instruments, and nonparametric identification.

**September 2021.**

We report the labor market effects of the Jamaica Early Childhood Stimulation intervention at age 31. The study is a small-sample randomized early childhood stimulation intervention targeting stunted children living in the poor neighborhoods of Kingston, Jamaica. Implemented in 1987-1989, treatment consisted of a two-year home-based intervention designed to improve nutrition and the quality of mother-child interactions in order to foster cognitive, language, and psycho-social skills. The original sample is 127 stunted children between 9 and 24 months old. We were able to track and interview 75% of the original sample 30 years after the intervention, including participants still living in Jamaica and those who had migrated abroad. We find large and statistically significant effects on income and schooling; the treatment group had 43% higher hourly wages and 37% higher earnings than the control group. This is a substantial increase over the treatment effect estimated at age 22, where we observed a 25% increase in earnings. The Jamaican study is a rare case of a long-term follow-up of an early childhood development (ECD) intervention implemented in a less-developed country. Our results confirm large economic returns to an early childhood intervention that targeted disadvantaged families living in poverty in the poor neighborhoods of Jamaica. The Jamaican intervention is being replicated around the world, and our analysis provides justification for expanding ECD interventions targeting disadvantaged children in poor countries.

**September 2021.**

Students who attend different colleges in the U.S. end up with vastly different economic outcomes. We study the role of relative value-added across colleges within student choice sets in producing these outcome disparities. Linking high school, college, and earnings registries spanning the state of Texas, we identify relative college value-added by comparing the outcomes of students who apply to and are admitted by the same set of institutions, as this approach strikingly balances observable student potential across college treatments and renders our extensive set of covariates irrelevant as controls. Methodologically, we develop a framework for identifying and interpreting value-added under varying assumptions about match effects and sorting gains. Empirically, we estimate a relatively tight, though non-degenerate, distribution of relative value-added across the wide diversity of Texas public universities. Selectivity poorly predicts value-added within student choice sets, with only a fleeting selectivity earnings premium fading to zero after a few years in the labor market. Non-peer college inputs like instructional spending more strongly predict value-added, especially conditional on selectivity. Colleges that boost BA completion, especially in STEM majors, also tend to boost earnings. Finally, we probe the potential for (mis)match effects by allowing value-added schedules to vary by student characteristics.