The goal of this book is to figure out at least some characteristics of the best possible tax system. This problem is a difficult one even to pose. The amount of tax that a typical citizen pays is a function of many economic variables. A far from exhaustive list includes labor earnings, interest income, dividend income, consumption, and money-holdings (via inflation). The dependence of collected taxes on these variables may be quite complicated. Moreover, taxes depend on asset incomes and asset holdings, and these represent the outcomes of decisions about how much wealth to transfer from one period to another. The problem of designing a good tax system that includes asset income taxes is intrinsically a dynamic one.
At the end of the 1990s, most of the research on optimal taxation in multiperiod settings was being done by macroeconomists (as opposed to specialists in public finance). Following an approach pioneered by Chamley (1986), the research made some rather strong assumptions: it restricted taxes to be linear and (generally) assumed all agents are identical. The resulting research program is extremely tractable. Unfortunately, it is also deeply flawed. Its key economic trade-off is that the government would like to make the taxes nonlinear but cannot. This basic tension is really irrelevant in the actual design of taxes, because governments can (and do) use nonlinear taxes.
In response to this conceptual problem, the new dynamic public finance (NDPF) thinks about how to design optimal taxes using the fundamentally different approach pioneered by Mirrlees (1971). The NDPF explicitly allows taxes to be nonlinear and allows for heterogeneity among people in the economy. The heterogeneity comes from a rather natural source. People's labor earnings depend on their choices of labor inputs (how hard or how long to work). Increasing the size of this input causes them disutility, but generates more labor income. As Mirrlees (1971) originally did, the NDPF presumes that people differ in their skills, that is, in how much labor input they need to generate a given level of labor income. By way of extension to Mirrlees's baseline analysis, the NDPF allows for the possibility that these skills evolve over time stochastically (so that people may gain or lose skills over time in a surprising fashion).
In the NDPF, the government commits itself ex ante to a tax schedule that maximizes a (possibly weighted) average of agents' utilities. The only restriction on this schedule is that taxes can only depend on incomes, and not directly on people's skills. This restriction immediately translates into the main trade-off that the government faces when designing its optimal tax schedule. On the one hand, the benevolent government wants to provide insurance. People can turn out to be high skilled or low skilled at the beginning of their lives or over the course of their lives. The government would like to insure them against this skill risk. This force leads the government to favor high taxes on income. On the other hand, the government would like to motivate the high-skilled people to produce more income than the low-skilled people. This force leads the government to favor low taxes. The government's problem is to figure out how to resolve this tension in various dates and states.
I have made no explicit mention of private information in describing the NDPF. However, the government's inability to condition taxes directly on skills ends up implying that it has to treat agents as being privately informed about their productivities. It follows that the optimal tax problem in the NDPF is isomorphic to a dynamic contracting problem between a risk-neutral principal and a risk-averse agent who is privately informed about productivities. There is a large literature on such dynamic principal–agent problems (including work by Rogerson (1985), Spear and Srivastava (1987), Green (1987), and Atkeson and Lucas (1995)), and the NDPF exploits its technical insights in many ways.
In the remainder of this introduction, I discuss the scope of the book. I lay out four main lessons of the new dynamic public finance. Finally, I describe the structure of the book.
This book is normative. It is interesting and important to figure out why we have the taxes that we have, but this book does not seek to answer that question. Instead, it tries to figure out what taxes we should have. It follows that the actual specification of taxes is irrelevant for the purposes of this book, except to indicate the range of taxation possibilities available to the government. Here's an analogy that might be helpful. The existence of agricultural subsidies and tariffs means that the government has the ability to levy these taxes. But the existence of these taxes does not mean that economists are wrong to recommend their elimination. In the same vein, if taxes recommended by the NDPF differ from the taxes that are actually used, there is no logical reason to conclude that there is something wrong with the NDPF.
This argument does not imply that normative economics in general or the NDPF in particular is disconnected from reality. The ultimate goal of the NDPF is to provide relatively precise recommendations as to what taxes should be. These recommendations will depend on a host of model parameters, and we will need to use data to obtain these parameters. As yet, the NDPF has not made much progress in obtaining good measures of the necessary inputs. This book reflects this weakness, but in chapter 7 I provide some ideas about how more progress can be made.
The normative focus means that I am not going to discuss two recent and related literatures. One such literature is on time-consistency. (More technically, it focuses on the structure of sequential equilibrium taxes when governments choose those taxes periodically.) The other literature is on dynamic political economy. (It focuses on the structure of sequential equilibrium outcomes when taxes are determined by periodic voting.) These literatures examine the properties of equilibrium outcomes of particular dynamic games. Hence, they are trying to model the actual behavior of governments. They are not normative in nature and so lie outside the scope of this book.
As the remainder of this book shows, we have learned a great deal in a short time from the NDPF. However, I think that there are four particularly important lessons that are worth emphasizing. The first three require preferences to exhibit separability between consumption and leisure. The last does not.
1.2.1 Lesson 1: Optimality of Asset Income Taxes
The first lesson concerns the design of optimal asset income taxes. It is valid regardless of the data-generation process for skills. Consider a risk-averse person at date t who faces skill risk at date (t + 1). Under an optimal tax system, the person's shadow interest rate from period t to period (t + 1) must be less than the market interest rate. This result immediately implies that an optimal tax system must confront such a person with a nonzero asset income tax that deters him from saving.
Intuitively, when preferences are separable between consumption and leisure, leisure is a normal good. Normality of leisure means that agents with a large amount of accumulated wealth in period (t + 1) are harder to motivate in that period. Hence, on the margin, good tax systems deter wealth accumulation from period t to period (t + 1) to provide people with better incentives to work in the latter period.
This result was originally derived by Diamond and Mirrlees (1978) in the context of a model of endogenous retirement. However, Diamond and Mirrlees restricted attention to a specific data-generation process for skills (a two-point Markov chain with an absorbing state). The contribution of the NDPF (and specifically of Golosov et al. (2003)) is to show that Diamond and Mirrlees's finding applies to all data-generation processes for skills, and can in fact be extended to models in which skills are endogenous.
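The logic of this first lesson can be sketched in symbols. The notation here is my own shorthand for illustration and is not fixed by the rest of the book: $u$ is the utility of consumption, $\beta$ the discount factor, $R$ the gross market interest rate, $c_t$ consumption at date $t$, and $E_t$ the expectation conditional on date-$t$ information. With separable preferences, optimal allocations in this class of models satisfy a "reciprocal" Euler equation, and Jensen's inequality then delivers the gap between the shadow and market interest rates whenever date-$(t+1)$ consumption is risky.

```latex
% Reciprocal Euler equation at an optimum (separable preferences assumed):
\frac{1}{u'(c_t)} \;=\; \frac{1}{\beta R}\, E_t\!\left[\frac{1}{u'(c_{t+1})}\right]

% Jensen's inequality (x \mapsto 1/x is strictly convex for x > 0), with c_{t+1} risky:
E_t\!\left[\frac{1}{u'(c_{t+1})}\right] \;>\; \frac{1}{E_t\!\left[u'(c_{t+1})\right]}

% Combining the two lines:
u'(c_t) \;<\; \beta R\, E_t\!\left[u'(c_{t+1})\right]

% Define the shadow rate by u'(c_t) = \beta R^{s}\, E_t[u'(c_{t+1})]; then
R^{s} \;<\; R
```

The last line says that, at the optimum, the person would like to save more at the market rate than the optimal allocation permits, which is why some deterrent to saving is needed.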
1.2.2 Lesson 2: An Optimal Asset Income Tax System
The first lesson implies that any optimal tax system features nonzero asset income taxes. The second lesson is about the structure of these nonzero asset income taxes, and is best divided into two parts. The first part is that in many settings, the optimal tax on a person's asset income in period (t + 1) must be a nontrivial function of his labor income in period (t + 1). People's decisions about asset holdings in period t depend on their labor input plans in period (t + 1), and optimal asset income taxes must take this intertemporal connection into account. (This conclusion was originally reached in work by Albanesi and Sleet (2006) and Golosov and Tsyvinski (2006).)
The second part of this lesson is that there is an optimal tax system in which taxes are linear functions of asset income in every period. In this system, given the information available at period t, period (t + 1) asset income taxes are negative for people with surprisingly high labor income in period (t + 1) and positive for people with surprisingly low labor income. The cross-sectional average asset income tax rate and total asset income tax revenue are always zero, regardless of the aggregate state of the world. Thus, the tax system deters investment not through the level of asset income taxes, but through the negative covariance of these taxes with skill realizations. (This conclusion was originally reached in work by Kocherlakota (2005).)
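A small numerical sketch may help fix ideas about how a tax with zero average can still deter saving. Everything in it is an assumption made for illustration: log utility, two equally likely skill outcomes, the specific consumption levels, the values of `beta` and `R`, and the convention that the linear tax applies to the gross asset return. It is one construction consistent with the properties described above, not a statement of the general formula.

```python
# Illustrative two-state example; all parameter values are hypothetical.
beta = 0.95                          # discount factor (assumed)
R = 1.05                             # gross market return (assumed)
probs = {"high": 0.5, "low": 0.5}    # probability of each skill outcome
c_next = {"high": 1.2, "low": 0.8}   # date-(t+1) consumption in each outcome


def marginal_utility(c):
    """u'(c) for log utility, u(c) = ln(c)."""
    return 1.0 / c


# Date-t consumption chosen to satisfy the reciprocal Euler equation:
# 1/u'(c_t) = E[1/u'(c_{t+1})] / (beta * R). With log utility, 1/u'(c) = c.
c_today = sum(p * c_next[s] for s, p in probs.items()) / (beta * R)

# State-contingent linear tax rate on the gross asset return (assumed form):
# 1 - tax_s = u'(c_t) / (beta * R * u'(c_{t+1, s})).
tax = {
    s: 1.0 - marginal_utility(c_today) / (beta * R * marginal_utility(c_next[s]))
    for s in probs
}

# Average tax rate and covariance of the tax with next-period consumption.
mean_tax = sum(probs[s] * tax[s] for s in probs)
mean_c = sum(probs[s] * c_next[s] for s in probs)
cov_tax_c = sum(probs[s] * (tax[s] - mean_tax) * (c_next[s] - mean_c) for s in probs)

# Marginal value of saving one more unit, without and with the tax.
value_untaxed = beta * R * sum(probs[s] * marginal_utility(c_next[s]) for s in probs)
value_taxed = beta * R * sum(
    probs[s] * (1.0 - tax[s]) * marginal_utility(c_next[s]) for s in probs
)

print("tax rates by outcome:", tax)
print("average tax rate:", mean_tax)
print("covariance of tax with consumption:", cov_tax_c)
print("u'(c_t):", marginal_utility(c_today))
print("value of saving, untaxed vs taxed:", value_untaxed, value_taxed)
```

In this example the tax is a subsidy in the high outcome and a levy in the low outcome, and its average is zero. Without the tax, the marginal value of saving exceeds the marginal utility of consuming today, so the person would want to save more; with the tax, the two are equal. The deterrent works because the levy falls in the outcome where an extra unit of wealth is most valuable.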
1.2.3 Lesson 3: Optimal Bequest Taxes and Intergenerational Transmission
Some of the most exciting work in the NDPF concerns the optimal taxation of bequests (see, in particular, Phelan 2006; Farhi and Werning 2007). There are two main results. The first has nothing to do with incentives: even if parents are altruistic, in most Pareto optimal tax systems, optimal bequest taxes are negative. The intuition is simple. In any Pareto optimum in which society puts positive weight on all people, society cares about a child in two ways: through its ancestors and directly as a person. It follows that society always puts more weight on a given child than its ancestors do, and so society wants to subsidize parent–child transfers.
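The double-counting argument can be written out in a two-generation sketch. The notation is hypothetical and chosen only for this illustration: $c_p$ and $c_k$ are the consumptions of parent and child, $\beta$ is the weight an altruistic parent places on the child's utility, and $\alpha > 0$ is the direct Pareto weight that society places on the child.

```latex
% Altruistic parent's utility:
V_p \;=\; u(c_p) + \beta\, u(c_k)

% Social objective with direct weight \alpha > 0 on the child:
W \;=\; V_p + \alpha\, u(c_k) \;=\; u(c_p) + (\beta + \alpha)\, u(c_k)

% Relative weight on the child's consumption:
\underbrace{\beta + \alpha}_{\text{society}} \;>\; \underbrace{\beta}_{\text{parent}}
```

Because society's relative weight on the child exceeds the parent's, the parent facing undistorted prices would transfer less than society prefers, and a subsidy on the transfer closes the gap.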
The second result is a characterization of a particular optimal bequest tax system and is connected to incentives and insurance. If parents are altruistic, it is optimal for a person's after-tax outcomes to depend on his/her parents' labor earnings. This dependence is a good way to motivate parents to work hard. On the other hand, society does want to insure children somewhat against their parents' outcomes. As a result, it is optimal to subsidize bequests at a higher rate for poor parents than for rich parents.
1.2.4 Lesson 4: Individual Ricardian Equivalence and Social Security
In the NDPF, a person's labor income taxes at a given date are allowed to be a function of that person's full history of labor earnings. This kind of generality mimics the flexibility that governments actually enjoy. For example, in the United States, social security transfers are a function of the full history of one's labor earnings.
The fourth lesson is that, with this degree of flexibility, optimality considerations only pin down the present value of labor income taxes as a function of a person's labor earnings. Thus, if a person owes $10,000 in taxes at age 25, the government could instead defer half of that amount until age 60 (with appropriate interest charges) without affecting individual decisions at all. This indeterminacy is essentially an individual-level version of Ricardian equivalence. (See Bassetto and Kocherlakota (2004) for a discussion.)
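The arithmetic behind the $10,000 example is simple enough to check directly. The sketch below assumes a constant annual interest rate of 3 percent, which is a hypothetical number chosen only for illustration, and compares paying the full amount at age 25 with paying half at 25 and the other half, with accumulated interest, at age 60.

```python
def present_value(payments, rate, base_age):
    """Discount a {age: amount} schedule back to base_age at a constant annual rate."""
    return sum(
        amount / (1.0 + rate) ** (age - base_age)
        for age, amount in payments.items()
    )


rate = 0.03  # hypothetical constant annual interest rate

# Schedule A: the full tax is paid at age 25.
pay_now = {25: 10_000.0}

# Schedule B: half is paid at 25; the other half is deferred to 60 with interest.
deferred_amount = 5_000.0 * (1.0 + rate) ** (60 - 25)
pay_split = {25: 5_000.0, 60: deferred_amount}

pv_now = present_value(pay_now, rate, base_age=25)
pv_split = present_value(pay_split, rate, base_age=25)

print("amount due at age 60 under schedule B:", round(deferred_amount, 2))
print("present value, schedule A:", round(pv_now, 2))
print("present value, schedule B:", round(pv_split, 2))
```

The two schedules have the same present value at age 25, so a person who can borrow and lend at the same rate faces the same lifetime budget under either one. That is the sense in which the timing of collections is indeterminate.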
The government can exploit this indeterminacy to simplify the structure of labor income taxes. In particular, there is an optimal tax system in which the government imposes a flat tax on labor earnings while people are working, and then bases post-retirement social security transfers on the full history of labor earnings. Intuitively, all that matters for incentives and insurance is the dependence of the present value of labor income taxes on the history of labor incomes. Any required dependence can be fully encoded into the structure of post-retirement transfers, as long as agents can borrow against these transfers. (This argument is explained more fully in Grochulski and Kocherlakota (2008).)
The remainder of the book is divided into six chapters. The second chapter of the book concerns the Ramsey (that is, linear tax) approach to dynamic optimal taxation. The chapter derives the classic Chamley (1986) result concerning long-run capital income taxes. The chapter also contains a discussion of the limitations of the Ramsey approach and motivates the alternative Mirrleesian approach that informs the rest of the book.
As discussed above, the NDPF is closely linked to the problem of optimal resource allocation in dynamic economies with private information. Chapter 3 provides an analysis of such problems, including a discussion of the "reciprocal" Euler equation and the long-run properties of optimal allocations. Relative to other treatments, its novelty is that it allows for general specifications of data-generation processes for individual skills. This generality rules out the recursive approaches used by, among others, Atkeson and Lucas (1992). Instead, I employ classical perturbation methods similar to Rogerson (1985). These methods are both more general and (I believe) more intuitive.
In chapter 4, I develop the implications of the NDPF for macroeconomists. I set up a canonical optimal nonlinear taxation problem in a dynamic economy with heterogeneous agents. I show how, in terms of quantities, the solution to this problem is the same as the solution to the private information allocation problem in chapter 3. I use this connection to derive general properties of optimal taxes, and discuss the properties of a particular optimal tax system.
Chapter 5 extends the analysis to bequest taxes. Mathematically, the chapter is similar to the previous one. However, the results differ in important ways, because the societal objective puts more weight on descendants than parents do. This difference affects both the sign of bequest taxes and their dependence on the income levels of parents.
The analysis in chapters 2–5 is entirely qualitative. In chapter 6, I set forth recursive methods that in principle allow one to find approximate solutions to the basic nonlinear taxation problem when skills follow a Markov chain. This literature is an old one (dating back at least twenty years), but progress has been slow: much remains to be done. I then solve for optimal taxes in a simple numerical example. The example is purely illustrative, but it is nonetheless suggestive.
In chapter 7, I discuss possible paths for future research. This chapter is probably the most important but it is also necessarily the most speculative.
I should add a final warning about notation. In terms of their economic lessons, the various chapters are certainly cumulative. However, the chapters use rather distinct models to derive these lessons. For this reason, I have made no attempt to ensure that the notation is consistent across chapters, although it is consistent within chapters.
Chapter Two: The Ramsey Approach and Its Problems
The Ramsey approach was the dominant approach to dynamic optimal taxation (and, indeed, for discussions of much of macroeconomic policy) in the late twentieth century. The approach begins with the premise that taxes are distorting. It captures this distortion in the simplest possible fashion by assuming that all taxes are linear functions of current variables. It then chooses those tax rates to optimize social welfare (measured in some fashion). As we shall see, the Ramsey approach is remarkably tractable, which is one of its main attractions.