Learning Dynamics with Private and Public Signals
Abstract
This paper studies the evolution of firms' beliefs in a dynamic model of technology adoption. Firms play a simple variant of the classic two-armed bandit problem, where one arm represents a known, deterministic production technology and the other arm an unknown, stochastic technology. Firms learn about the unknown technology by observing both private and public signals. I find that because of the externality associated with the public signal, the evolution of beliefs under a market equilibrium can differ significantly from that under a planner. In particular, firms experiment earlier under the planner than they do under the market equilibrium and thus firms under the planner generate more information at the start of the model. This intertemporal effect brings about the unusual result that, on a per period basis, there exist cases where firms in a market equilibrium over-experiment relative to the planner in the latter periods of the model.
Adam Copeland†
Board of Governors of the Federal Reserve System
December 6, 2004

* I would like to thank Matt Mitchell and Tom Holmes for their advice and encouragement. I would also like to thank Thor Koeppl, Cyril Monnet, John Stevens, and Jason Cummins for their helpful comments. This paper is the second chapter of my dissertation. The views expressed herein are my own and not necessarily those of the Federal Reserve Board.
† I can be reached at adam.m.copeland@frb.gov
1 Introduction

When a new technology is introduced into the marketplace, firms typically are unsure about its true quality. Over time, firms learn through direct and indirect means how the new technology compares with the old, existing one. The technology adoption literature has studied the diffusion of information under restrictive informational constraints. Firms either all hold the same beliefs about the new technology, or they hold heterogeneous beliefs but learn and act in isolation. As shown in the herding literature, however, the combination of both heterogeneous beliefs and learning from others creates interesting and potent effects on the sequential generation of information in an economy. This paper adds to the literature by constructing a dynamic model of technology adoption that incorporates both heterogeneous beliefs and social learning and by studying the speed of adoption within such an environment.

The model in this paper embeds the classic two-armed bandit problem into a finite-period equilibrium model of technology adoption.¹ The adoption decision that firms face is to determine which technology, or arm, to use to produce output each period. One arm represents the "old" technology and has a known, deterministic output distribution common to all players. The second arm represents the "new" technology and has an unknown, stochastic output distribution common to all players. The mean output of the stochastic technology, however, can be either higher or lower than that of the deterministic technology. There are no costs to operating either technology, nor are there costs to switching from one technology to the other between periods.

This model departs from the usual two-armed bandit analysis through its informational structure. I assume that agents can observe only their own individual output and a noisy signal of aggregate output.
Experimenting with the new technology and observing individual output conveys the usual learning-by-doing effect. As firms can observe only their own output, this learning-by-doing effect is a private signal that generates heterogeneous beliefs among firms. This effect encourages firms to adopt the new technology, as information has positive value.

¹ Two good sources on one- and two-armed bandit problems are Rothschild (1974) and Berry and Fristedt (1985).

In contrast to the individual, private aspect of experimentation, the noisy signal of aggregate output is a public signal observed by all agents. Aggregate quantity contains useful information as its value depends upon the true mean of the stochastic technology. Intuitively, a signal of unexpectedly high aggregate quantity is more likely to occur if the new, stochastic technology is superior to the old technology. Because it is observed by all firms regardless of their actions, the public signal creates an informational externality. As such, this signal generates the usual free-rider effect found in models with public signals. Unlike the typical bandit problem, however, the public signal is not an ad hoc specification but a function of the model equilibrium and the distribution of firms' beliefs. As the informational value of the public signal varies with the measure of firms that experiment with the new technology, the learning interaction among firms is tied to economic fundamentals. A key feature of the public signal is the increase in its informational value as the measure of adopting firms rises.

To solve the firm's problem in this rich informational environment, I define an anonymous sequential game. Under this equilibrium, the following cutoff rule applies. Firms holding beliefs greater than the cutoff belief adopt the stochastic technology; firms holding beliefs below the cutoff belief choose the deterministic technology; and firms holding the cutoff belief employ a mixed strategy.

To determine whether the speed with which firms adopt the stochastic technology is efficient, I define and solve the social planner's problem. Because firms hold heterogeneous beliefs and there is aggregate uncertainty, I cannot formulate a classic planner's problem that maximizes expected aggregate quantity.
So instead, I define an uninformed social planner's problem, where the planner seeks to maximize each firm's expected output conditional on the firm being no worse off relative to the market equilibrium outcome. In essence, an uninformed planner coordinates firms' actions so as to implement a Pareto-optimal outcome that Pareto dominates the market outcome. The uninformed social planner is the relevant point of comparison, as this formulation maintains the same informational constraints that firms face under a market equilibrium.

Firms' expected output increases under a social planner, because the planner can coordinate firms' experimentation strategies and take full advantage of the informational externality associated with the public signal. It is easy to show that the standard result holds: ceteris paribus, the planner induces a higher level of adoption among firms in a given period than the market equilibrium does. More interestingly, this model highlights how the planner shifts the generation of information to earlier periods in the model because information is more valuable the earlier it is generated. As a result of this intertemporal substitution, the planner induces a lower level of experimentation than firms under a market equilibrium in later periods of the model. This finding is at odds with models with more restrictive informational frameworks, which predict that firms always under-experiment relative to the social planner.

The under-adoption result is driven by two separate forces. First, the informational externality of the public signal results in the planner's use of a different experimentation strategy relative to the market equilibrium. Consequently, the sequential generation of information differs across these two regimes, as does, therefore, the evolution of aggregate beliefs. Second, the existence of private signals ensures that firms hold heterogeneous beliefs. This heterogeneity and the nonmonotonic return on information across beliefs imply that firms vary in the way they value additional information. In the extreme, firms that believe that the risky technology is good or bad with 100 percent probability do not value additional information. Hence, I can construct a case in which the social planner, relative to the market equilibrium, generates more information through higher experimentation in the initial periods of the model. As a consequence, in the later periods of the model, firms under the planner place less value on additional information and, as a result, initiate less experimentation than firms in a market equilibrium.

This paper builds on three bodies of literature. The first is the field of social learning.
I use a framework close to Bolton and Harris (1999), who examine a multiperiod, multiplayer two-armed bandit problem and explore the strategic interactions among agents. Like them, I try to answer basic questions about the speed and diffusion of information among agents, but I do so using an environment with heterogeneous beliefs and without strategic effects. I consider an economy with a measure 1 of agents, resulting in perfect competition in information, as no single agent can influence the aggregate outcome. This simplification allows for a more complex modeling of the public signal and makes an environment with both private and public signals tractable. This additional complexity allows us to analyze the efficient dissemination of information in a world with heterogeneous beliefs.

Second is the literature on technology adoption under uncertainty. Using a two-armed bandit framework where each arm has an unknown output distribution, Jensen (1983) analyzes the belief dynamics of firms when they have access to private signals. He demonstrates that a simple learning-by-doing mechanism will generate heterogeneous beliefs, resulting in an ogive-shaped diffusion curve. Unlike this paper, Jensen (1983) does not incorporate informational externalities and so predicts that firms learn efficiently. Rob (1991) establishes that, in an ad hoc model of entry, a public signal can also generate gradual learning even when beliefs are homogeneous. In addition, he shows that by internalizing the informational externality associated with the public signal, the social planner will induce a higher level of entry relative to firms in a market equilibrium, ceteris paribus. Furthermore, he proves that the planner will always choose a higher level of entry throughout all periods of the model. The present paper builds upon Rob's result by highlighting a subtle inefficiency in the rate of adoption in competitive markets: the slower generation of information. I show how the planner wants not only to induce more adoption to take advantage of the informational externality of the public signal but also to make firms adopt earlier in the model because information is more valuable the sooner it is revealed.

Third is the work on herding and information cascades by Banerjee (1992), Bikhchandani, Hirshleifer, and Welch (1992), and more recently Dasgupta (2002). As is done in the present paper, this field of research studies the sequential generation of information and its effects on agents' adoption, or entry, decisions. In addition, the previous papers study agent behavior when both private and public signals exist.
Unlike my paper, however, the herding literature focuses on certain inefficiencies in the generation of information that do not exist within this paper's framework.

2 A Three Period Example

In this section I describe a three period example of the model. After outlining the parameter values of the model, I intuitively describe the firm's problem and present the example's results.
2.1 Environment

There is a measure 1 of firms producing one good, where each firm seeks to maximize output. The model has three periods, in each of which each firm decides which of two production technologies to use. A firm's objective is to produce the most output over all three periods. The first technology is "safe", always producing $Y$. The second technology is "risky", producing $\mu \in \{\underline{\mu}, \overline{\mu}\}$, where $\underline{\mu} < Y < \overline{\mu}$. The probability distribution over $\mu$ depends upon a parameter $s \in \{L, H\}$. Thus, the output of the deterministic technology is known and common to all firms, whereas the output of the stochastic technology is unknown and common to all firms.

I assume that $Y = 0.5$, $\overline{\mu} = 1$, and $\underline{\mu} = 0$. I impose symmetry on the distribution of $s$, and assume that $\Pr(\overline{\mu} \mid s = H) = \Pr(\underline{\mu} \mid s = L) = 0.7$. Letting $Y_s$ denote the conditional expected output from using the stochastic technology, the parameter choices imply that $Y_L = 0.3$ and $Y_H = 0.7$. In seeking to maximize output, the firm needs to determine which production technology is better. Under the chosen parameters, the safe technology is better if $s = L$ and worse if $s = H$. There are no costs involved with production nor with switching technologies.

Aggregate uncertainty is present in this environment as the parameter $s$ is unknown. This parameter has the same value over all three periods but is unobserved by the firm. Firms have priors, denoted $\gamma$, over $s \in \{L, H\}$. As $s$ can take only two values, let $\gamma = \Pr(s = H)$.

Firms learn about $s$ through two channels. First, they can experiment with the stochastic technology and observe their individual output. Firms can observe only their own output, and so this information is a private signal of $s$. Firms use Bayes rule to update their beliefs over $s$. Given the prior belief $\gamma$, let $\overline{\gamma}(\gamma)$ denote a firm's posterior belief after observing $\overline{\mu}$.
Similarly, let $\underline{\gamma}(\gamma)$ denote a firm's posterior belief after observing $\underline{\mu}$, and note that Bayes rule implies that $\underline{\gamma} < \gamma < \overline{\gamma}$ for $\gamma \in (0, 1)$.

The second channel of learning occurs through a noisy signal of aggregate quantity, $\tilde{Q}$, which is observed at the end of the period. Regardless of their actions, all firms see the same signal, making $\tilde{Q}$ a public signal of $s$. As described in more detail later, $\tilde{Q} = Q + \varepsilon$, where $Q$ is the actual aggregate quantity produced, and $\varepsilon \sim U[-b, b]$, where $b > 0$. The uniform distribution of $\varepsilon$ makes this model tractable, as it reduces the informational gain of the public signal to two extremes: either firms learn the true value of $s$ with certainty after observing $\tilde{Q}$, or they learn nothing after observing $\tilde{Q}$. In this example, the bound of the uniform distribution, $b$, is equal to 0.6. The interval $[-0.6, 0.6]$ is large enough to ensure that the probability of learning the true value of $s$ from the public signal, denoted $\tilde{p}$, is less than one. The probability that a firm will learn from the public signal is a function of all firms' actions. As section 3 will show, the more firms that adopt the stochastic technology, the higher the probability that $\tilde{Q}$ reveals the true value of $s$. In other words, the strength of the public signal is increasing in the measure of adopting firms.

With the public signal, the firm's problem is a mixture of learning-by-doing and social learning. The major forces at work, however, can be described intuitively by using the framework of the firm's static problem. This problem is

$$V(\gamma) = \max_{e \in [0,1]} \left\{ (1 - e)\,Y + e \left( \gamma Y_H + (1 - \gamma) Y_L \right) \right\}, \tag{1}$$

where $e$ is the adoption strategy of the firm. Here, the firm compares the output from the deterministic technology against the expected output from the stochastic technology. The decision rule is a cutoff belief, above which the firm uses the stochastic technology with probability 1. At the cutoff belief, the firm is indifferent between using either technology. Below the cutoff, firms adopt the stochastic technology with probability 0.

In a multiperiod problem with learning, it is significant that $V$ is increasing and convex in $\gamma$. Attaching a signal of $s$ to the stochastic technology means that the adopting firm's next-period belief will be either $\overline{\gamma}$ or $\underline{\gamma}$. Bayes rule implies that $\underline{\gamma} < \gamma < \overline{\gamma}$ and $\underline{\gamma}\,[1 - \alpha(\gamma)] + \overline{\gamma}\,\alpha(\gamma) = \gamma$, where $\alpha(\gamma)$ is the prior probability of observing $\overline{\mu}$.
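The private-signal updating and the static problem described above can be checked numerically. The following is a minimal sketch, assuming the example's parameter values ($Y = 0.5$, $Y_L = 0.3$, $Y_H = 0.7$, and a 0.7 probability of the high output when $s = H$). The function names `prob_high`, `posterior`, `static_value`, and `expected_next_value` are illustrative and do not come from the paper.

```python
# Illustrative sketch of private-signal updating and the static problem (1).
# Parameter values are those of the three-period example; all names are hypothetical.

Y_SAFE = 0.5           # output of the deterministic technology
Y_L, Y_H = 0.3, 0.7    # expected output of the stochastic technology given s
P_HIGH_GIVEN_H = 0.7   # Pr(high output | s = H) = Pr(low output | s = L)


def prob_high(gamma):
    """Prior probability alpha(gamma) of observing the high output."""
    return gamma * P_HIGH_GIVEN_H + (1 - gamma) * (1 - P_HIGH_GIVEN_H)


def posterior(gamma, high):
    """Bayes-rule posterior Pr(s = H) after one private signal."""
    like_h = P_HIGH_GIVEN_H if high else 1 - P_HIGH_GIVEN_H  # Pr(signal | s = H)
    like_l = 1 - like_h                                      # Pr(signal | s = L), by symmetry
    return gamma * like_h / (gamma * like_h + (1 - gamma) * like_l)


def static_value(gamma):
    """V(gamma): the objective is linear in e, so the maximum is at e = 0 or e = 1."""
    return max(Y_SAFE, gamma * Y_H + (1 - gamma) * Y_L)


def expected_next_value(gamma):
    """Expected static value after observing one private signal."""
    a = prob_high(gamma)
    return a * static_value(posterior(gamma, True)) + (1 - a) * static_value(posterior(gamma, False))


if __name__ == "__main__":
    g = 0.45
    hi, lo, a = posterior(g, True), posterior(g, False), prob_high(g)
    print(round(hi, 2), round(lo, 2))              # posteriors after a high / low output
    print(round(lo * (1 - a) + hi * a, 10))        # beliefs average back to the prior
    print(static_value(g), expected_next_value(g)) # value before and after a signal
```

Because the objective in (1) is linear in $e$, the sketch evaluates only the two corner strategies. With these parameters the static cutoff belief is 0.5, and a firm at $\gamma = 0.45$ has a strictly higher expected static value after one private signal than before it.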
By Jensen's inequality, the convexity of $V$ tells us that the sum of $V(\underline{\gamma})$ and $V(\overline{\gamma})$, weighted by the probability of observing $\underline{\mu}$ and $\overline{\mu}$, is greater than or equal to $V(\gamma)$; information (weakly) increases the next period's expected output. Although information is valuable, its value is not monotonic in $\gamma$. Those firms that have "strong" beliefs, with $\gamma$ close to 0 or 1, gain less from additional information than firms with $\gamma$ close to 0.5.

The public signal blunts the return from the stochastic technology. This signal reveals the true value of $s$ with some positive probability. Thus, the learning benefits from observing the stochastic technology accrue only when the public signal is uninformative. Through the public signal, the adoption decisions of all firms influence the decision of an individual firm. This influence complicates the notation required to formally describe the firm's problem. The formal description adds little intuition, and so it appears in the latter part of the paper. Consequently, in this example I will not lay out the firm's full problem and will only touch upon the equilibrium concept used in the model.

The equilibrium concept used in this model is an anonymous sequential game (ASG), introduced by Jovanovic and Rosenthal (1988). In an ASG, all firms pursue output-maximizing strategies that are best responses to all other firms' optimal strategies. The equilibrium is of an anonymous nature in that firms care only about the distribution of firms' beliefs, not about the individual identities of the firms on the distribution. The state space for the firm's problem, then, is the aggregate distribution of firms' beliefs.

In the first period of the problem, the distribution of firms' beliefs is simply the initial prior that all firms hold, which I assume is a point mass known to all firms. The distribution of beliefs in the second period, however, depends upon all firms' actions as well as upon the value of $s$. In equilibrium, firms can rationally deduce all other firms' adoption decisions. There is, however, uncertainty over whether 70 percent of adopting firms observed $\overline{\mu}$ or $\underline{\mu}$. Conditional on $s$, all firms agree on the aggregate distribution of firms' beliefs. Firms disagree, however, on the probability that $s = H$. This disagreement can be sustained in equilibrium, as each firm is too small to affect the aggregate distribution of beliefs. As a result, the state space for the firm's problem consists of three elements: the firm's belief about the value of $s$, the distribution of firms' beliefs conditional on $s = L$, and the distribution of firms' beliefs conditional on $s = H$. Later in the paper, I fully specify the firm's problem and fully define an ASG.
Here one needs only to know that there is a unique equilibrium and that in every period the firm's policy rule is a cutoff function.

2.2 Solving the Model

The first step in solving the model is defining the set of feasible beliefs that firms can hold in all periods of the model. I assume that firms start the model holding the same prior belief, $\gamma = 0.45$, and, for simplicity, do not discount.²

Table 1: Set of Feasible Beliefs, Pr(s = H)

1st Period   2nd Period   3rd Period
             1            1
                          .82
             .66          .66
.45          .45          .45
             .26          .26
                          .13
             0            0

As discussed before, the timing of the model is such that firms choose which technology to use, produce output, and then observe their individual output and the noisy signal of aggregate output. After observing the signals of $s$, firms update their beliefs and enter the next period. Given an initial belief, firms could hold five possible beliefs in the second period, and seven in the third. Table 1 displays the set of feasible beliefs in all three periods.

At the end of the first period, all firms observe a public signal that may reveal the true value of $s$. Hence, firms' beliefs at the beginning of the second period could equal 0 or 1. When the public signal is uninformative, the firms that adopt the stochastic technology learn about $s$ by observing their individual output. Under Bayes rule, a firm's possible posterior beliefs are either 0.66 or 0.26, depending on whether the firm observed $\overline{\mu}$ or $\underline{\mu}$, respectively.³ The firms that use the safe, deterministic technology learn nothing and so continue to hold their prior belief of 0.45. In the third period, the set of feasible beliefs continues to expand in this fashion. This set increases only by two elements, as under Bayes rule those firms that observe both $\overline{\mu}$ and $\underline{\mu}$ end up with a posterior belief equal to their initial prior belief, regardless of the order in which those outputs are observed.

Using the set of possible beliefs and the assumed parameters, I solve the firm's problem through backward induction. Solving the firm's problem is difficult because the second period's state space, the two distributions of aggregate beliefs conditional on $s$, is large. Standard computational techniques, however, can surmount this problem. These details are described in appendix A.1.

² There is a range of prior beliefs for which the main results of this paper hold. With all other parameters held fixed, the use of any prior belief with a point mass in the range of 0.442 to 0.456 will generate the main over-adoption results of this paper. In addition, the main results of this paper can be generated for the case in which firms discount the future.
³ The formula used to derive these posterior beliefs is later provided in equation 5.

2.3 Results

The first period's cutoff belief is 0.45, the prior belief that all firms hold. All firms employ the same mixed strategy of adopting the stochastic technology with 94 percent probability. Given an uninformative public signal, this adoption strategy results in the second-period conditional distributions of aggregate beliefs charted in figure 1. As shown in this figure, measure 0.94 of firms adopt, receive a private signal, and update their prior beliefs. For the case in which $s = L$, 30 percent of adopting firms observe $\overline{\mu}$. As a result, measure $0.28 = 0.3 \times 0.94$ of firms hold the posterior belief $\overline{\gamma} = 0.66$. The remaining 70 percent of adopting firms observe $\underline{\mu}$ and hold the posterior $\underline{\gamma} = 0.26$. Finally, firms that do not adopt, and so learn nothing, enter the second period with their initial prior, $\gamma = 0.45$. The symmetry in the model causes the distribution of beliefs when $s = H$ to be the mirror image of the distribution of beliefs when $s = L$.

Figure 1: Second Period Distribution of Firms' Beliefs under an ASG. (Two panels, "True Value of s Is Low" and "True Value of s Is High", showing the measure of firms at each second-period belief.)

In the second period, the conditional distributions of beliefs described above result in the optimal policy rule that all firms with $\gamma > 0.44$ adopt the stochastic technology. Thus all but those firms with $\underline{\gamma} = 0.26$ adopt the stochastic technology. As shown in table 2, a measure of $0.34 = 0.06 + 0.28$ firms adopt the stochastic technology in the second period, given $s = L$. For $s = H$ the measure of adopting firms is $0.72 = 0.06 + 0.66$.
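The feasible beliefs in Table 1 and the second-period measures quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the example's parameters (prior 0.45, signal accuracy 0.7), takes the first-period adoption probability of 0.94 and the second-period cutoff of 0.44 as given from the text rather than solving for them, and uses hypothetical function names.

```python
# Illustrative reconstruction of Table 1 and the second-period ASG distribution.
# The equilibrium objects (0.94 and the 0.44 cutoff) are inputs taken from the text.

P = 0.7  # Pr(high output | s = H) = Pr(low output | s = L)


def posterior(gamma, high):
    """Bayes-rule posterior Pr(s = H) after one private signal."""
    like_h = P if high else 1 - P
    like_l = 1 - like_h
    return gamma * like_h / (gamma * like_h + (1 - gamma) * like_l)


def feasible_beliefs(prior, periods):
    """Beliefs a firm could hold at the start of each period (rounded for display)."""
    current = {round(prior, 6)}
    table = [sorted(round(g, 2) for g in current)]
    for _ in range(periods - 1):
        nxt = set(current) | {0.0, 1.0}  # no update, or the public signal reveals s
        for g in current:
            nxt.add(round(posterior(g, True), 6))
            nxt.add(round(posterior(g, False), 6))
        current = nxt
        table.append(sorted(round(g, 2) for g in current))
    return table


def second_period_distribution(prior, adopt_prob, s_high):
    """Belief -> measure of firms after an uninformative first-period public signal."""
    p_high = P if s_high else 1 - P
    return {
        posterior(prior, True): adopt_prob * p_high,
        posterior(prior, False): adopt_prob * (1 - p_high),
        prior: 1 - adopt_prob,
    }


def measure_adopting(dist, cutoff):
    """Measure of firms whose belief is strictly above the cutoff."""
    return sum(mass for belief, mass in dist.items() if belief > cutoff)


if __name__ == "__main__":
    for period, beliefs in enumerate(feasible_beliefs(0.45, 3), start=1):
        print(period, beliefs)
    for s_high in (False, True):
        dist = second_period_distribution(0.45, 0.94, s_high)
        print("s = H" if s_high else "s = L", round(measure_adopting(dist, 0.44), 2))
```

Beliefs are rounded to six decimals before being stored so that a firm observing one high and one low output is recognized as returning to its prior despite floating-point error.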
Table 2: Measure of Firms Adopting

         Period   ASG    SP     Difference
s = L    1        .94    1.00   -.06
         2        .34    .30     .04
         3        .10    .09     .01
s = H    1        .94    1.00   -.06
         2        .72    .70     .02
         3        .50    .49     .01

Note: ASG stands for anonymous sequential game, and SP stands for social planner.

In the third and final period, the firm's problem reduces to a static problem. Given the symmetry of the parameter assumptions, the cutoff belief is $\gamma = 0.5$. Given firms' adoption strategies in the second period, this policy rule results in measure 0.1 of firms adopting the risky technology in the third period if $s = L$. If $s = H$, measure 0.5 of firms adopt.

2.4 Social Planner's Problem

To analyze the efficiency of firms' adoption decisions in an ASG, a comparable social planner's problem must be defined. The standard social planner's problem, however, is not applicable in this environment. As firms hold heterogeneous beliefs and the true aggregate distribution of beliefs is unknown, it is unclear which belief the planner should hold and how that belief should be updated. If a planner can observe all firms' individual output, then for any positive measure of adopting firms the planner will learn the true value of $s$ after one period. To make the planner's problem an interesting comparison to the firm's problem under an ASG, I define an uninformed social planner's problem. As detailed later, this type of planner's problem reduces to coordinating firms' adoption decisions so that the resulting outcome is Pareto-optimal and Pareto dominates the outcome under an ASG. Under an uninformed social planner, firms are able to pursue adoption strategies that take advantage of the externalities inherent in models with public signals.
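The adoption measures in Table 2 follow mechanically from the cutoff rules once the public signal is assumed to be uninformative in every period. The sketch below propagates the conditional belief distribution forward and is illustrative only: the ASG inputs (first-period adoption probability 0.94, second-period cutoff 0.44, third-period cutoff 0.5) are taken from the text, while the planner column is generated under the assumption that all firms adopt in the first period, as Table 2 reports, with a second-period cutoff of 0.5 chosen here purely for illustration. All function names are hypothetical.

```python
# Illustrative forward simulation of the measures in Table 2.
# Assumes the public signal is uninformative in every period.

P = 0.7        # Pr(high output | s = H) = Pr(low output | s = L)
PRIOR = 0.45   # common first-period belief


def posterior(gamma, high):
    """Bayes-rule posterior Pr(s = H) after one private signal."""
    like_h = P if high else 1 - P
    like_l = 1 - like_h
    return gamma * like_h / (gamma * like_h + (1 - gamma) * like_l)


def step(dist, adopt_prob, s_high):
    """One period: returns (measure adopting, next-period belief distribution)."""
    p_high = P if s_high else 1 - P
    adopting, nxt = 0.0, {}
    for belief, mass in dist.items():
        e = adopt_prob(belief)
        adopting += e * mass
        transitions = (
            (belief, (1 - e) * mass),                            # did not adopt
            (posterior(belief, True), e * mass * p_high),        # saw high output
            (posterior(belief, False), e * mass * (1 - p_high)), # saw low output
        )
        for new_belief, m in transitions:
            if m > 0:
                key = round(new_belief, 9)  # merge beliefs equal up to float error
                nxt[key] = nxt.get(key, 0.0) + m
    return adopting, nxt


def adoption_path(s_high, first_prob, second_cutoff, third_cutoff=0.5):
    """Measures of firms adopting in periods 1 to 3, rounded to two decimals."""
    rules = [
        lambda g: first_prob,
        lambda g: 1.0 if g > second_cutoff else 0.0,
        lambda g: 1.0 if g > third_cutoff else 0.0,
    ]
    dist, path = {PRIOR: 1.0}, []
    for rule in rules:
        adopting, dist = step(dist, rule, s_high)
        path.append(round(adopting, 2))
    return path


if __name__ == "__main__":
    for label, s_high in (("s = L", False), ("s = H", True)):
        print(label, "ASG", adoption_path(s_high, 0.94, 0.44))
        print(label, "SP ", adoption_path(s_high, 1.00, 0.50))
```

Any second-period cutoff strictly between the two posteriors (about 0.26 and 0.66) yields the same planner column, because under full first-period adoption no firm remains at the prior.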
Figure 2: Second Period Distribution of Firms’ Beliefs under an SP True Value of s Is Low True Value of s Is High 1 1 .7 .7 .3 .3 0 0 0 0 γ γ γ γ γ γ The optimal adoption strategies under this planner’s problem are, as in an ASG, cuto(cid:11) beliefs. As before, when (cid:12)rms hold the cuto(cid:11) belief, their optimal strategy is a mixed strategy (see theorem 2 later in the paper). As with an ASG, I solve for the planner’s adoption strategies by using backward induction. An additional twist to the planner’s problem is the lack of a unique equilibrium; however, for the parameters chosen in this example, there is a unique solution.4 Under a social planner, the optimal adoption strategy is for (cid:12)rms to adopt the risky technology with probability 1. Assuming an uninformative public signal, this adoption strategy results in the second-period conditional aggregate distribution of beliefs charted in (cid:12)gure 2, which shows all (cid:12)rms massed at γ = 0:26 and γ = 0:66. Not surprisingly, the optimal adoption strategy in the second period is for those (cid:12)rms with the belief γ to adopt the stochastic technology with probability 1, and for those holding γ to use the safe technology. As shown in table 2, this adoption strategy implies that the measure of (cid:12)rms adopting the risky technology in the second period is 0.3 if s = L. For s = H, this measure of adopting (cid:12)rms is 0.7. As shown in table 2, both measures of adoption are lower than those under the ASG equilibrium. Hence, the planner, in the (cid:12)rst period, induces a higher level of experimentation in the (cid:12)rst period, but in the second period dictates a lower level of experimentation for the case in which the public signal is uninformative. In essence, 4Thereis aunique solutioninthis examplebecause in the (cid:12)rstperiodthe optimaladoptionstrategy is a corner solution | everyone adopts with probability one. See appendix A.1 for more details. 12
the planner has (cid:12)rms experimenting earlier to generate more information about s at the beginning of the model. Relative to the ASG, this earlier experimentation not only increases thestrengthofthepublicsignalinthe(cid:12)rst periodbut alsoraises thevarianceof the second period aggregate distribution of beliefs. Under a social planner, no (cid:12)rms hold the initial prior | a belief where the return on observing the private signal is relatively high. As a consequence, (cid:12)rms in the second period strongly believe either that s = H or that s = L. This strong belief results in less overall experimentation in the second period, than under an ASG. With no dynamics in the last period of the model, the planner uses the same cuto(cid:11) belief of 0.5 that (cid:12)rms use under an ASG. Under this policy, the planner continues to underexperiment relative to (cid:12)rms in an ASG (see table 2). Summing over all three periods, the planner induces a higher measure of (cid:12)rms to adopt the stochastic technology. In addition, by having signi(cid:12)cantly more (cid:12)rms adopt in the (cid:12)rst period, the planner increases the odds that the public signal will be more informative in the initial period. When the public signal is uninformative, however, the generation of information under the planner results in (cid:12)rms experimenting less in later periods of the model relative the ASG equilibrium. The remainder of this paper formally describes the model. 3 The Model In this section, I formally describe the environment of the model and state the (cid:12)rm’s technology-adoption problem. I then de(cid:12)ne an equilibrium. 3.1 The Environment Given the environment in the example, the following generalities apply. The model has T periods, and the probability distribution over (cid:22) is 1. P((cid:22)js = H) = , 2. P((cid:22)js = L) = , and 13
3. 0:5 < < 1, where P(A) is the probability of the occurrence of event A occurring. These assumptions imply that when s = H, a (cid:12)rm is more likely to produce (cid:22) than (cid:22). In de(cid:12)ning Y = P s (cid:22)(cid:1)P((cid:22)js), I assume values of (cid:22) and (cid:22) such that (cid:22)2f(cid:22);(cid:22)g 4. Y < Y < Y , and L H 5. (Y −Y ) = (Y −Y). L H Thus the stochastic technology has an expected output higher than that of the safe technology only if s = H. The symmetry simpli(cid:12)es solving the model. Aggregateuncertaintyexistsinthisenvironment becausetheparametersisunknown. Firms have priors, denoted γ, over s 2 S. As S is a two-element set, I let γ = P(s = H). The aggregate distribution of (cid:12)rms’ beliefs is denoted (cid:14), and the distribution of (cid:12)rms’ beliefs conditional on s is (cid:14) . s 3.2 The Firm’s Problem The (cid:12)rm’s problem is to maximize totaldiscounted quantity produced over T periods. In each period, (cid:12)rms make their adoption decisions, produce output, and then receive their private and public signals. In all but the last period, learning has value in as much as it provides information about the stochastic technology that is useful for future adoption decisions. In this section, I (cid:12)rst explain how (cid:12)rms learn and then describe the (cid:12)rm’s dynamic technology adoption problem. 3.2.1 Learning The central feature of the (cid:12)rm’s problem is in updating the (cid:12)rm’s beliefs about the parameter s. As detailed earlier, (cid:12)rms learn by using the risky technology and by observing a public signal of aggregate quantity. To model how a (cid:12)rm learns in this environment, I use a variation of the ASG framework. Firms begin the model knowing the initial distribution of aggregate beliefs, (cid:14)^ , which I assume is a point mass. Other (cid:12)rms’ beliefs and actions are inputs into the (cid:12)rm’s 14
problem, as they influence the strength of the public signal. Over the course of the model, firms observe a sequence of public signals and receive an individual sequence of private signals.

To track the evolution of the aggregate distribution of beliefs, a firm needs to infer which firms are experimenting with the stochastic technology. Let the distributional strategy $\tau_t(\gamma; \hat{\delta}, \tau^t, \tilde{Q}^t) \in [0,1]$ provide such a description, where $\tau^t = \{\tau_0, \tau_1, \ldots, \tau_{t-1}\}$ and $\tilde{Q}^t = \{\tilde{Q}_0, \tilde{Q}_1, \ldots, \tilde{Q}_{t-1}\}$. Given the initial distribution of beliefs, the distributional strategy details the adoption strategy of a firm with belief $\gamma$ given the history of past adoption strategies and of past public signals. When $\tau_t$ is equal to 0, the firm's strategy is to adopt the stochastic technology with probability 0.

Two sequences, $(\tau^t, \tilde{Q}^t)$, characterize the state space at time $t$, given $\hat{\delta}$. These sequences allow each firm to deduce how the conditional aggregate distributions of firms' beliefs evolve, given the initial distribution of firms' beliefs. Hence, an alternative characterization of the state space is the pair of aggregate distributions of firms' beliefs conditional on $s$, $(\delta_L, \delta_H)$. Naturally, as firms differ on the probability that $s = H$, they disagree on the unconditional distribution of beliefs. Notationally, it is simpler to use the pair of conditional distributions of beliefs to describe the state space rather than the sequences of $\tau$ and $\tilde{Q}$. Let $\{\delta_s\} = (\delta_L, \delta_H)$ and then redefine the distributional strategy as $\tau(\gamma; \{\delta_s\})$, dropping the dependence on $\hat{\delta}$.

Using the distributional strategy, I can define a firm's expectations of aggregate quantity and so specify how firms learn from the public signal.
Aggregate output, conditional on $s$, is given by

$$
Q_s(\tau_t, \{\delta_s\}) = Y \int_0^1 \left[1 - \tau_t(x; \{\delta_s\})\right] \delta_s(x)\,dx + Y_s \int_0^1 \tau_t(x; \{\delta_s\})\, \delta_s(x)\,dx. \tag{2}
$$

This function sums up each firm's expected output to arrive at a deterministic aggregate quantity, conditional on $s$ and $(\delta_L, \delta_H)$. The first part of $Q_s$ aggregates the output of firms using the deterministic technology, and the second part sums up the expected output of those using the stochastic technology. Conditional on $s$, the large number of firms in the economy ensures that the expected output of the stochastic technology is equal to the actual quantity produced. With $Q_s$ defined, I can describe how beliefs are updated through the public signal.
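As an illustration of equation 2, the sketch below evaluates aggregate output when the distribution of beliefs is a finite set of mass points, so the integrals reduce to sums. The function names, the numerical values, and the use of one belief distribution for both states (as would hold in the first period under a common prior) are assumptions made for this example only.

```python
from typing import Callable, Dict


def aggregate_output(
    y_safe: float,
    y_state: float,
    beliefs: Dict[float, float],
    strategy: Callable[[float], float],
) -> float:
    """Discrete analogue of equation 2.

    beliefs maps each belief gamma to the mass of firms holding it (masses
    sum to one); strategy(gamma) is the probability that a firm with belief
    gamma adopts the stochastic technology.
    """
    adopting = sum(mass * strategy(gamma) for gamma, mass in beliefs.items())
    return y_safe * (1.0 - adopting) + y_state * adopting


def cutoff_rule(cutoff: float) -> Callable[[float], float]:
    """Adopt with probability one strictly above the cutoff, zero otherwise."""
    return lambda gamma: 1.0 if gamma > cutoff else 0.0


# Illustrative numbers only (hypothetical, not taken from the paper).
Y, Y_L, Y_H = 1.0, 0.5, 1.5
beliefs = {0.3: 0.25, 0.5: 0.5, 0.7: 0.25}
tau = cutoff_rule(0.5)

q_low = aggregate_output(Y, Y_L, beliefs, tau)   # aggregate output if s = L
q_high = aggregate_output(Y, Y_H, beliefs, tau)  # aggregate output if s = H
print(q_low, q_high)
```

With a quarter of firms adopting, the two conditional quantities differ only through the output of the adopters, which is what makes aggregate quantity informative about $s$.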
Recall that the public signal is defined as

$$
Q^* = Q + \varepsilon, \tag{3}
$$

where $Q$ denotes actual aggregate quantity and $\varepsilon \sim U(-b, b)$ for $b > 0$. Exogenous noise is essential to prevent firms from always learning the true value of $s$ with complete certainty. Without $\varepsilon$, firms could simply invert $Q_s(\tau_t, \{\delta_s\})$ and determine the true value of $s$.⁵ Using the uniform distribution with a finite support simplifies the learning process, as it implies that a firm either learns nothing from the public signal or learns the true value of $s$ completely. As shown in figure 3, the support of $Q^*$ differs depending on the true value of $s$, when $Q_L(\tau_t, \{\delta_s\}) \neq Q_H(\tau_t, \{\delta_s\})$.⁶ For values of the public signal that fall in the support of $Q^*$ for both values of $s$, a firm learns nothing because these noisy signals are equally likely to have been generated in an economy where $s = L$ as in one where $s = H$. For values of $Q^*$ that are possible only given one of the values of $s$, a firm immediately learns the true value of $s$.

The probability that firms learn the true value of $s$ from the public signal is simply the probability of drawing an $\varepsilon$ that produces a $Q^*$ that is uniquely identified with a particular underlying value of $s$. This probability is equal to

$$
\tilde{p}(\tau_t, \{\delta_s\}) = \frac{Q_H(\tau_t, \{\delta_s\}) - Q_L(\tau_t, \{\delta_s\})}{2b}. \tag{4}
$$

A firm is more likely to learn the true value of $s$ as the difference in the conditional aggregate quantities increases. Also, as the support of the error term $\varepsilon$ increases or as the signal gets noisier, the probability of learning anything from the public signal decreases. As expectations of $Q_s$ do not depend on individual beliefs, if firms agree on $(\tau_t, \{\delta_s\})$, they all have the same probability of learning everything from the public signal.
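The all-or-nothing nature of the public signal can be made concrete with a short sketch. It computes the probability in equation 4 and classifies a realized signal by checking which of the two supports contains it. The function names and numbers are hypothetical, and the code assumes $Q_H \geq Q_L$.

```python
from typing import Optional


def learn_probability(q_low: float, q_high: float, b: float) -> float:
    """Equation 4, capped at one when the two supports do not overlap."""
    return min(1.0, (q_high - q_low) / (2.0 * b))


def classify_signal(
    q_star: float, q_low: float, q_high: float, b: float
) -> Optional[str]:
    """Return 'H' or 'L' if the signal reveals s, or None if it is uninformative."""
    in_low = abs(q_star - q_low) <= b    # possible when s = L
    in_high = abs(q_star - q_high) <= b  # possible when s = H
    if in_low and in_high:
        return None                      # overlap region: nothing is learned
    if in_high:
        return "H"
    if in_low:
        return "L"
    raise ValueError("signal lies outside both supports")


# Illustrative numbers only (hypothetical).
q_low, q_high, b = 0.875, 1.125, 0.5
p_tilde = learn_probability(q_low, q_high, b)
print(p_tilde)
print(classify_signal(1.0, q_low, q_high, b))
print(classify_signal(1.5, q_low, q_high, b))
print(classify_signal(0.4, q_low, q_high, b))
```

Because the noise is uniform, every signal in the overlap region has the same density under both states, so the likelihood ratio is one and beliefs do not move.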
⁵ As equation 4 makes clear, $b > \frac{Q_H(\tau_t, \{\delta_s\}) - Q_L(\tau_t, \{\delta_s\})}{2}$ must hold in order for the noisy component of $Q^*$ to prevent a firm from completely learning $s$.

⁶ $Q_H = Q_L$ only if the measure of adopting firms equals zero.

Figure 3: Conditional Support of $Q^*$. [Figure: the supports of $Q^* = Q_H + \varepsilon$ and $Q^* = Q_L + \varepsilon$ drawn on a common axis, with the points $Q_L - b$, $Q_H - b$, $Q_L$, $Q_H$, $Q_L + b$, and $Q_H + b$ marked.]

Private signals inform firms through a learning-by-doing mechanism. The private signals that firms observe are their individual outputs, $\{\underline{\mu}, Y, \bar{\mu}\}$. Using Bayes' rule, a firm's posterior belief, given a prior belief $\gamma$, is

$$
\begin{aligned}
\underline{\gamma}(\gamma) &= \frac{(1-\psi)\gamma}{(1-\psi)\gamma + \psi(1-\gamma)}, && \text{if observe } \underline{\mu}, \text{ or} \\
\bar{\gamma}(\gamma) &= \frac{\psi\gamma}{\psi\gamma + (1-\psi)(1-\gamma)}, && \text{if observe } \bar{\mu}, \text{ or} \\
&\phantom{=}\ \gamma, && \text{if observe } Y.
\end{aligned}
\tag{5}
$$

Note that $0 \leq \underline{\gamma} \leq \gamma \leq \bar{\gamma} \leq 1$, where strict inequalities hold for $\gamma \in (0,1)$. Because the safe technology is deterministic, firms that observe $Y$ learn nothing about $s$. In cases in which there is no confusion over the prior belief, I suppress the posterior functions' dependence on $\gamma$.
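The updating rule in equation 5 can be sketched in a few lines. Here `psi` is a placeholder name for the probability that the stochastic technology yields the high output when $s = H$ (items 1 to 3 of the environment), and the signal labels `"high"`, `"low"`, and `"safe"` are hypothetical names for observing $\bar{\mu}$, $\underline{\mu}$, and $Y$.

```python
def prob_high_output(gamma: float, psi: float) -> float:
    """Probability of the high output given belief gamma = P(s = H)."""
    return psi * gamma + (1.0 - psi) * (1.0 - gamma)


def posterior(gamma: float, psi: float, signal: str) -> float:
    """Bayes update of gamma after one private signal (equation 5)."""
    if signal == "safe":
        return gamma  # the deterministic technology reveals nothing about s
    a = prob_high_output(gamma, psi)
    if signal == "high":
        return psi * gamma / a
    if signal == "low":
        return (1.0 - psi) * gamma / (1.0 - a)
    raise ValueError(f"unknown signal: {signal!r}")


# Illustrative numbers only (hypothetical).
gamma, psi = 0.5, 0.75
up = posterior(gamma, psi, "high")
down = posterior(gamma, psi, "low")
a = prob_high_output(gamma, psi)
print(down, gamma, up)
# The posteriors average back to the prior.
print(a * up + (1.0 - a) * down)
```

The last line checks that beliefs form a martingale: weighting the two posteriors by the probability of each output recovers the prior.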
With the flow of public and private signals of $s$, the conditional aggregate distributions of beliefs evolve deterministically. To represent the case in which the public signal is informative, I let $\delta_0$ and $\delta_1$ denote a point mass of beliefs at $\gamma = 0$ and $\gamma = 1$, respectively. When the public signal is uninformative, I let $\tilde{\delta}(\tau_t, \{\delta_s\})$ return the pair of posterior conditional distributions of aggregate beliefs, given the distributional strategy and the prior conditional distributions.

3.2.2 Value Function

The firm's problem is to pick a strategy over the choice of technologies to use in each period. Its objective is to maximize total discounted output, which involves computing the returns that learning has on future output. The firm's problem is easiest to solve by going backwards. In the last period, $T$, the firm's problem given a belief $\gamma$ is

$$
V_T(\gamma; \tau_T, \{\delta_s\}) = \max_{e \in [0,1]} \Big\{ (1-e)Y + e\big(Y_H \gamma + Y_L (1-\gamma)\big) \Big\}. \tag{6}
$$

If a firm chooses not to use the stochastic technology, that is $e = 0$, then it produces $Y$ with certainty. If it experiments, however, then it expects $Y_H$ with probability $\gamma$ and $Y_L$ with probability $(1-\gamma)$. Because the distribution of firms' beliefs and the set of distributional strategies matter only through their influence on how a firm learns, these two state variables do not matter in the last period of the model. Solving the problem is straightforward and results in firms following a cutoff rule, where only firms who hold a $\gamma > 0.5$ adopt the stochastic technology. This cutoff point is a result of the assumptions of symmetry on $\{Y_L, Y, Y_H\}$.
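For completeness, the last-period cutoff can be derived in one step from the symmetry assumption $(Y - Y_L) = (Y_H - Y)$ together with $Y_L < Y_H$. The following is a worked restatement of that argument, not an additional result:

```latex
\begin{align*}
Y_H\gamma + Y_L(1-\gamma) > Y
  &\iff \gamma\,(Y_H - Y_L) > Y - Y_L \\
  &\iff \gamma > \frac{Y - Y_L}{Y_H - Y_L}
      = \frac{Y - Y_L}{(Y_H - Y) + (Y - Y_L)}
      = \frac{Y - Y_L}{2\,(Y - Y_L)}
      = \frac{1}{2}.
\end{align*}
```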
For each period except the last, $t = 1, 2, \ldots, T-1$, the firm's problem is

$$
\begin{aligned}
V_t(\gamma; \tau, \{\delta_s\}) = \max_{e \in [0,1]} \Big\{ & (1-e)Y + e\big(Y_H \gamma + Y_L (1-\gamma)\big) \\
& + \beta \Big( \tilde{p}(\tau_t, \{\delta_s\}) \big[ V_{t+1}(1; \tau, \delta_1)\gamma + V_{t+1}(0; \tau, \delta_0)(1-\gamma) \big] \\
& + [1 - \tilde{p}(\tau_t, \{\delta_s\})] \big[ (1-e) V_{t+1}(\gamma; \tau, \tilde{\delta}(\tau_t, \{\delta_s\})) \\
& + e \big( V_{t+1}(\bar{\gamma}; \tau, \tilde{\delta}(\tau_t, \{\delta_s\}))\alpha(\gamma) + V_{t+1}(\underline{\gamma}; \tau, \tilde{\delta}(\tau_t, \{\delta_s\}))[1 - \alpha(\gamma)] \big) \big] \Big) \Big\},
\end{aligned}
\tag{7}
$$

where $\alpha(\gamma)$ is the probability of observing $\bar{\mu}$ given the firm's belief $\gamma$ and $\tau = \{\tau_t\}_{t=1}^{T}$. The first line of equation 7 is the firm's current-period output, and the remaining lines are the expected output over the rest of the model horizon. The second line defines the case in which the public signal is informative. This case occurs with probability $\tilde{p}(\tau_t, \{\delta_s\})$ and results in the aggregate distribution of firms' beliefs equaling $\delta_0$ (point mass at 0) or $\delta_1$ (point mass at 1). Using its beliefs, the firm assigns probability $\gamma$ to $s = H$, and $(1-\gamma)$ to $s = L$. The third and fourth lines represent the case in which the public signal is not informative. If the firm chooses not to adopt the stochastic technology, then it learns nothing from its individual output and so will have the same belief in the next period that it had in the current period. In this case, the firm's continuation value is $V_{t+1}(\gamma; \tau, \tilde{\delta}(\tau_t, \{\delta_s\}))$. In contrast, if the firm adopts, then it learns by observing its output and enters the next period either with the posterior belief $\bar{\gamma}$ or with $\underline{\gamma}$. The probability of observing $\bar{\mu}$ is given by $\alpha$ and is a function of $\gamma$.

Because a firm has measure 0, its entry decision has no effect on the evolution of aggregate beliefs.
However, the distribution of firms' beliefs and the distributional strategy play an important role in the firm's problem through $\tilde{p}$. When deciding whether to experiment with the stochastic technology, the firm takes into account the probability of learning everything through the public signal. As is clear in equation 7, a higher $\tilde{p}$ devalues the informational gain from the private signal, making private and public signals substitutable sources of information.
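The substitutability of the two signals can be illustrated with a minimal backward-induction sketch of equation 7. It makes a strong simplifying assumption: the probability of an informative public signal, `p`, is held constant and treated as exogenous, whereas in the model it is determined by equation 4. The parameter values, the name `PSI` for the signal precision, and all function names are hypothetical.

```python
# Illustrative parameters only (hypothetical, chosen to satisfy the symmetry
# assumption Y - Y_L = Y_H - Y).
Y, Y_L, Y_H = 1.0, 0.5, 1.5
PSI, BETA = 0.75, 0.95


def alpha(gamma: float) -> float:
    """Probability of observing the high output given belief gamma."""
    return PSI * gamma + (1.0 - PSI) * (1.0 - gamma)


def branch_values(gamma: float, t: int, horizon: int, p: float):
    """Return (V0, V1): value of not experimenting / experimenting at t."""
    safe_now = Y
    risky_now = Y_H * gamma + Y_L * (1.0 - gamma)
    if t == horizon:  # last period: no value of learning (equation 6)
        return safe_now, risky_now

    def nxt(g: float) -> float:
        return value(g, t + 1, horizon, p)

    a = alpha(gamma)
    up = PSI * gamma / a                       # posterior after high output
    down = (1.0 - PSI) * gamma / (1.0 - a)     # posterior after low output
    informed = nxt(1.0) * gamma + nxt(0.0) * (1.0 - gamma)
    stay = nxt(gamma)
    learn = nxt(up) * a + nxt(down) * (1.0 - a)
    v0 = safe_now + BETA * (p * informed + (1.0 - p) * stay)
    v1 = risky_now + BETA * (p * informed + (1.0 - p) * learn)
    return v0, v1


def value(gamma: float, t: int, horizon: int, p: float) -> float:
    """V_t(gamma): the better of the two branches."""
    return max(branch_values(gamma, t, horizon, p))


def experimentation_gain(gamma: float, t: int, horizon: int, p: float) -> float:
    """V1 - V0: positive when the firm prefers to experiment."""
    v0, v1 = branch_values(gamma, t, horizon, p)
    return v1 - v0


# A firm with belief 0.45 in the first of two periods.
gain_quiet = experimentation_gain(0.45, 1, 2, p=0.0)  # no public learning
gain_loud = experimentation_gain(0.45, 1, 2, p=0.8)   # strong public signal
print(gain_quiet, gain_loud)
```

In this two-period example the firm experiments when the public signal never reveals $s$ but prefers the safe technology when the public signal is likely to reveal it, which is the substitution effect described above.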
3.3 Equilibrium

The equilibrium concept used in this model is a variation on an anonymous sequential game.⁷ In this variation, for any belief, a firm's optimal effort decision has to match the prediction given by the distributional strategy $\tau = \{\tau_t\}_{t=1}^{T}$. To state this equilibrium concept formally, I first define $V_t^e(\gamma; \tau, \{\delta_s\})$ as the expected output of a firm in period $t = 1, 2, \ldots, T$, taking action $e \in [0,1]$, with belief $\gamma$, given the pair $\{\delta_s\}$.

A distributional strategy $\tau$ is then an equilibrium if it satisfies, $\forall \{\delta_s\}$, $\forall e \in [0,1]$, and for $t = 1, 2, \ldots, T$,

$$
V_t^{\tilde{e}_t(\gamma; \{\delta_s\})}(\gamma; \tau, \{\delta_s\}) \geq V_t^{e}(\gamma; \tau, \{\delta_s\}), \tag{8}
$$

where $\tilde{e}_t(\gamma; \{\delta_s\}) = \tau_t(\gamma; \{\delta_s\})$. This condition is an optimality restriction, requiring that $\tau_t$ assigns to a firm the decision $e$ that maximizes that firm's objective function, given the state variables.

⁷ Mitchell (1997) extends the equilibrium concept of Jovanovic and Rosenthal (1988) for economies with aggregate uncertainty, such as the one used in this paper.

4 Analysis of the Firm's Problem

In this section, I show how the gain from additional information is nonmonotonic in beliefs. I then prove that the optimal policy rule is a cutoff rule.

Intuitively, the firm's problem can be reduced to a comparison between the net present value of the expected output of the firm over the model horizon given that it experiments with the stochastic technology in the current period versus the case where it does not
experiment and uses the deterministic technology. We know from equation 7 that

$$
\begin{aligned}
V_t^0(\gamma; \tau, \{\delta_s\}) ={}& Y + \beta \tilde{p}(\tau_t, \{\delta_s\}) \big[ V_{t+1}(1; \tau, \delta_1)\gamma + V_{t+1}(0; \tau, \delta_0)(1-\gamma) \big] \\
& + \beta [1 - \tilde{p}(\tau_t, \{\delta_s\})]\, V_{t+1}(\gamma; \tau, \tilde{\delta}), \\
V_t^1(\gamma; \tau, \{\delta_s\}) ={}& Y_H \gamma + Y_L(1-\gamma) + \beta \tilde{p}(\tau_t, \{\delta_s\}) \big[ V_{t+1}(1; \tau, \delta_1)\gamma + V_{t+1}(0; \tau, \delta_0)(1-\gamma) \big] \\
& + \beta [1 - \tilde{p}(\tau_t, \{\delta_s\})] \big[ V_{t+1}(\bar{\gamma}; \tau, \tilde{\delta})\alpha(\gamma) + V_{t+1}(\underline{\gamma}; \tau, \tilde{\delta})[1 - \alpha(\gamma)] \big],
\end{aligned}
\tag{9}
$$

where $\tilde{\delta}$'s dependence on $\tau_t$ and $\{\delta_s\}$ has been suppressed. Setting $V_t^1 - V_t^0$ equal to zero results in the indifference condition

$$
\begin{aligned}
0 ={}& Y_H \gamma + Y_L(1-\gamma) - Y \\
& + \beta [1 - \tilde{p}(\tau_t, \{\delta_s\})] \big[ V_{t+1}(\bar{\gamma}; \tau, \tilde{\delta})\alpha(\gamma) + V_{t+1}(\underline{\gamma}; \tau, \tilde{\delta})[1 - \alpha(\gamma)] - V_{t+1}(\gamma; \tau, \tilde{\delta}) \big].
\end{aligned}
\tag{10}
$$

The first line of the indifference condition compares the return on current-period output from experimentation, and the second line compares the future return. By assumption, the term in the first line is negative for $\gamma = 0$ and positive for $\gamma = 1$, and has a positive, linear slope in $\gamma$. Because of the properties of Bayes' rule, the second line is nonmonotonic. For $\gamma \in \{0, 1\}$, $\underline{\gamma} = \gamma = \bar{\gamma}$ and so the second line is equal to zero at these two points. For intermediate values of $\gamma$, as $\underline{\gamma} < \gamma < \bar{\gamma}$, the value of the second line depends on the curvature of $V$. The following lemma proves that $V$ is strictly convex. Because the posterior beliefs average back to the prior, $\alpha(\gamma)\bar{\gamma} + [1 - \alpha(\gamma)]\underline{\gamma} = \gamma$, Jensen's inequality then implies that the second line is strictly positive for all $\gamma \in (0,1)$. Thus, private signals are valuable as they increase future expected output, but the informational gain is nonmonotonic in $\gamma$.

Lemma 1. For all $t = 1, 2, \ldots, T$, $V_t$ is increasing and strictly convex in $\gamma$.

Proof. Through induction, it is easy to show that $V_t$ is increasing in $\gamma$. Similarly, I use induction to show that $V_t$ is strictly convex in $\gamma$.
I prove convexity using the fact that $V_t = \max\{V_t^0, V_t^1\}$ and that $V_T$ is strictly convex. Because $V_{t-1}^0$ is a linear transformation of $V_t$, it is convex. In addition, dropping the dependence on $(\tau, \{\delta_s\})$, and letting $F_t(\gamma) = V_t(\bar{\gamma})\alpha(\gamma) + V_t(\underline{\gamma})[1 - \alpha(\gamma)]$, shows that $V_{t-1}^1$ is strictly convex for all $t = 1, 2, \ldots, T$, as

$$
\begin{aligned}
F_t(\bar{\gamma})\alpha(\gamma) + F_t(\underline{\gamma})[1 - \alpha(\gamma)] &> V_t(\bar{\gamma})\alpha(\gamma) + V_t(\underline{\gamma})[1 - \alpha(\gamma)] \\
&= F_t(\gamma).
\end{aligned}
\tag{11}
$$

Despite the nonmonotonicity of the return to receiving a private signal, the optimal policy rule for firms remains a cutoff rule. Theorem 1 proves this result.

Theorem 1. Under an anonymous sequential game, the distributional strategy $\tau$ is a cutoff rule. Hence for every $\{\delta_s\}$ and $t = 1, 2, \ldots, T$, there exists a cutoff belief $\hat{\gamma}$ such that

$$
\tau_t(\gamma; \{\delta_s\}) = 0 \text{ for all } \gamma < \hat{\gamma}, \text{ and} \qquad \tau_t(\gamma; \{\delta_s\}) = 1 \text{ for all } \gamma > \hat{\gamma}.
$$

Proof. See appendix A.2 for details of the proof.

This result is in line with the results in classic two-armed bandit problems, in which information is also nonmonotonic in beliefs (see Jensen (1983) and Bolton and Harris (1999), for example).

Now that I have defined the firm's problem and can solve for the optimal policy rule under an ASG, I consider the efficiency of the rate of adoption. Naturally, to evaluate the speed with which firms adopt the stochastic technology, I need to solve for the rate of adoption under a social planner. The next section details the planner's problem.

5 Social Planner

In this section, I define the planner's problem, prove that a cutoff rule solves the planner's problem, and compare this solution to the policy rule under an ASG. I show that the standard result holds: Firms underexperiment relative to the planner in the current
period, ceteris paribus. I then examine the sequential generation of information under a planner.

5.1 The Social Planner's Problem

As detailed earlier, the standard social-planner problem of maximizing aggregate quantity is not applicable to this model. Therefore, I construct an uninformed planner's problem in which the planner, who has no information himself, acts as a simple coordinating device and specifies strategies for agents as a function of their own beliefs.

The planner's objective is to implement a distributional strategy that results in a Pareto-optimal outcome, where firms are not worse off relative to the outcome from an ASG. I let $\tau$ be the distributional strategy in an ASG, denote $\nu_t$ as the planner's distributional strategy in period $t$, and let $\nu = \{\nu_t\}_{t=1}^{T}$. Then the planner's problem is to choose the $\nu_t$ that solves, for all $\gamma$,

$$
\begin{aligned}
V_t(\gamma; \nu, \{\delta_s\}) = \max_{\nu_t} \Big\{ & [1 - \nu_t(\gamma)]Y + \nu_t(\gamma)\big[Y_H \gamma + Y_L(1-\gamma)\big] \\
& + \beta \tilde{p}(\nu_t, \{\delta_s\}) \big[ V_{t+1}(1; \nu, \delta_1)\gamma + V_{t+1}(0; \nu, \delta_0)(1-\gamma) \big] \\
& + \beta [1 - \tilde{p}(\nu_t, \{\delta_s\})]\, E_{\nu_t}[V_{t+1}] \Big\}, \\
\text{s.t. } V_t(\gamma; \nu, \{\delta_s\}) \geq{}& V_t(\gamma; \tau, \{\delta_s\}),
\end{aligned}
\tag{12}
$$

where

$$
\begin{aligned}
E_{\nu_t}[V_{t+1}] ={}& [1 - \nu_t(\gamma)]\, V_{t+1}(\gamma; \nu, \tilde{\delta}) \\
& + \nu_t(\gamma) \big[ V_{t+1}(\bar{\gamma}; \nu, \tilde{\delta})\alpha(\gamma) + V_{t+1}(\underline{\gamma}; \nu, \tilde{\delta})[1 - \alpha(\gamma)] \big].
\end{aligned}
\tag{13}
$$

This problem is almost identical to the firm's problem, except that the planner explicitly takes into account how the measure of adopting firms affects the strength of the public signal. As can be seen in the problem above, $\nu_t$ appears both in the expected output of the firm and in the argument of $\tilde{p}$. In the firm's problem, the firm took $\tilde{p}$ as given and chose an experimentation strategy.
By setting $\nu \equiv \tau$, the planner can mimic the outcome obtained under an ASG. I show, however, that the planner can make all firms better off by inducing more experimentation. Having more firms adopt the stochastic technology benefits all firms through a higher $\tilde{p}$, even those marginal firms that otherwise would not adopt under an ASG. The following theorem proves that any solution to the social planner's problem is a cutoff rule.

Theorem 2. Any solution to the planner's problem is a cutoff rule $\hat{\nu}$, such that all firms with beliefs greater than $\hat{\nu}$ enter with probability 1 and all firms with beliefs less than $\hat{\nu}$ enter with probability 0. Firms holding the cutoff belief employ a mixed strategy.

Proof. The social planner's problem can be divided into two stages. First, the planner picks the measure of firms that will adopt the risky technology. Conditional on this measure of entry, the planner then chooses the strategy that maximizes firms' expected output, such that each firm is weakly better off relative to the outcome in an ASG. From the proof of theorem 1, we know that cutoff rules maximize firms' expected output. Hence, for any measure of entry that the planner implements, a cutoff rule is the optimal strategy.

Unlike in an ASG, typically there is not a unique solution to the social planner's problem. However, given that cutoff rules solve the planner's problem, I can compare the cutoff rules that solve the planner's problem with the distributional strategies in an ASG. The following proposition shows that the planner, in the current period, always induces a weakly greater measure of firms to experiment relative to the outcome in an ASG, ceteris paribus. The planner experiments strictly more for "interior" solutions.
Within this environment, I define an interior solution as one where the cutoff belief is a mass point in the distribution of firms' beliefs and those firms at the cutoff belief employ a mixed strategy.

Proposition 1. Compared with the measure of adopting firms under an ASG, a social planner chooses a weakly higher level of experimentation in the current period, given the same set of state variables. Further, given that $\tau_t$ is an interior solution to the firm's problem under an ASG, all cutoff rules that solve the equivalent planner's problem are
strictly less than $\tau_t$. Hence, for interior solutions, the planner induces a strictly higher level of experimentation relative to the measure of adopting firms under an ASG.

Proof. Because $\tilde{p}$ is an increasing function of the measure of firms that adopt the stochastic technology, the planner must have at least the same measure of firms experiment as occurs in an ASG. This implies that any cutoff rule used by the planner either has a lower cutoff belief, or employs the same cutoff belief but uses a weakly higher probability of entry compared with the cutoff rule in an ASG.

I now show that, given that $\tau_t$ is an interior solution, the planner has a strictly higher measure of firms experiment with the stochastic technology. Because $\tau_t$ is an interior solution, I can evaluate the derivative of $V_t$ from the planner's problem at the cutoff belief. If $\hat{\gamma}$ denotes the cutoff belief, then the derivative is

$$
\begin{aligned}
\frac{dV_t}{d\nu_t} ={}& -Y + Y_H \gamma + Y_L(1-\gamma) \\
& + \beta \frac{d\tilde{p}}{d\nu_t} \big[ V_{t+1}(1; \nu, \delta_1)\gamma + V_{t+1}(0; \nu, \delta_0)(1-\gamma) - E_{\nu_t}[V_{t+1}] \big] \\
& + \beta [1 - \tilde{p}(\nu_t, \{\delta_s\})] \frac{dE_{\nu_t}[V_{t+1}]}{d\nu_t}.
\end{aligned}
\tag{14}
$$

Because of the envelope condition, the first and third lines of this derivative are equal to the firm's indifference condition, shown by equation 10. We know that in an ASG the firm holding the cutoff belief is indifferent between the two technologies. The second line of equation 14, the return from the public signal, is negative: with $\nu_t$ read as the cutoff belief, raising it lowers the measure of adopting firms, so $\frac{d\tilde{p}}{d\nu_t} < 0$, and the term in the brackets is positive for $\gamma \in (0,1)$. Thus for the marginal firm under an ASG, $\left.\frac{dV_t}{d\tau_t}\right|_{\hat{\gamma}} < 0$. Hence the optimal cutoff point for the planner is strictly less than $\hat{\gamma}$, the cutoff belief under an ASG.

This result is in line with the literature and not unusual given the externalities associated with the public signal.
Its importance lies in showing that all the optimal cutoff rules a planner might implement are weakly less than $\tau_t$. This proposition also has significant implications for the generation of information within the economy. In the first period of the model, the state variables, or initial conditions, are the same for the planner's problem and for the firms in an ASG. By proposition 1, we know that, except in the case
of corner solutions, the planner will induce a higher level of adoption than will firms in a market equilibrium. This difference in strategies implies not only that $\tilde{p}$ is higher under a planner but also that firms receive more private signals. As shown in the example at the beginning of the paper, this faster generation of information is particularly significant in that it lowers the return on information in later periods of the model. In certain cases, this devaluation of information leads to the planner underadopting relative to firms in the corresponding ASG. The following proposition formally states this result:

Proposition 2. Relative to an uninformed social planner, firms do not always underadopt the stochastic technology in all periods. Rather, parameter values exist such that firms overadopt the risky technology in each period of the model except the first.

The three-period model considered earlier provides an example of this result, which stands in contrast to the literature. In Rob (1991), the planner always weakly experiments more than firms do in a market equilibrium. Further, Bolton and Harris (1999) show that firms working as a team experiment to a larger extent than firms acting individually at any point in time. This proposition builds upon these results by demonstrating that firms underexperiment relative to the planner in the initial period and when summing over all periods of the model. This underadoption by firms, however, does not necessarily hold in each period of the model.

6 Conclusion

In this paper I examine the speed of adoption by firms in an environment with private and public signals. The standard result holds: All else being equal, the planner chooses a higher level of experimentation relative to firms in a market equilibrium. More interestingly, I study the generation of information over the model's lifetime.
I find that the planner seeks to exploit the gains of experimenting earlier and so generates more information in the initial periods of the model. Generating more private signals at the beginning of the model decreases the value of additional information to firms later in the model. Hence firms, under a planner, might underadopt relative to firms in a market equilibrium in the latter periods of the model.
A Appendix

A.1 Computational Details

This section of the appendix details the computational techniques used to solve both the firm's problem and the planner's problem. I then show why the solution to the planner's problem in the last period is unique.

The large dimension of the second period's state space, all possible $\{\delta_s\}$, makes implementing the standard backward induction algorithm difficult. Therefore, I solve for the optimal adoption strategies under an ASG equilibrium and under an uninformed social planner by using a two-step optimization algorithm. I first explain this algorithm within the context of solving the model as an ASG.

In the third period of the model, the static nature of the firm's problem implies that the optimal cutoff rule is one-half. In the second period, a bisection routine can be used to solve for the optimal cutoff rule by using the firm's indifference condition, equation 10, for a given $\{\delta_s\}$. The two relevant conditional distributions of firms' beliefs in the second period, however, depend upon firms' adoption strategies in the first period. Because the optimal adoption strategy in the first period depends on adoption strategies in the second period, I employ a pair of bisection routines to compute the optimal adoption strategies in the first and second periods. The outer bisection routine solves the firm's indifference condition in the first period. To compute firms' expected utility in the second and third periods, however, this outer routine calls the second bisection routine. This second, inner routine solves for the optimal cutoff belief in the second period, given a guess of the policy rule in the first period. In this manner, I simultaneously numerically solve for the optimal cutoff rules in the first and second periods of the model in an ASG.

I use a similar algorithm to solve the planner's problem.
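The inner routine described above can be sketched as a bisection on the indifference condition, equation 10, in the second-to-last period, where the continuation value is the closed-form last-period value. This is a simplified illustration rather than the code used for the paper: the probability `p` of an informative public signal is passed in as a given number (standing in for a given $\{\delta_s\}$), and the parameter values and names such as `PSI` are hypothetical. An outer routine would call `cutoff_belief` inside its own bisection loop.

```python
# Illustrative parameters only (hypothetical).
Y, Y_L, Y_H = 1.0, 0.5, 1.5
PSI, BETA = 0.75, 0.95


def v_last(gamma: float) -> float:
    """Last-period value: the better of the safe and risky technologies."""
    return max(Y, Y_H * gamma + Y_L * (1.0 - gamma))


def indifference_gap(gamma: float, p: float) -> float:
    """Right-hand side of equation 10 in the second-to-last period."""
    a = PSI * gamma + (1.0 - PSI) * (1.0 - gamma)  # P(high output | gamma)
    up = PSI * gamma / a
    down = (1.0 - PSI) * gamma / (1.0 - a)
    now = Y_H * gamma + Y_L * (1.0 - gamma) - Y
    future = v_last(up) * a + v_last(down) * (1.0 - a) - v_last(gamma)
    return now + BETA * (1.0 - p) * future


def cutoff_belief(p: float, tol: float = 1e-10) -> float:
    """Bisect on [0, 1] for the belief at which the gap changes sign."""
    lo, hi = 0.0, 1.0  # the gap is negative at 0 and positive at 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if indifference_gap(mid, p) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


for p in (0.0, 0.8, 1.0):
    print(p, cutoff_belief(p))
```

Bisection is appropriate here because the proof of theorem 1 establishes that the gap between experimenting and not experimenting is strictly increasing in $\gamma$. In this example the cutoff rises toward one-half as the public signal becomes more likely to reveal $s$.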
The planner's problem has an additional twist as there are multiple solutions. I choose to find the optimal cutoff belief that is closest to the cutoff rule used in an ASG. To find this particular solution to the planner's problem, I once again use a two-step optimization routine. In place of the bisection routines, however, I employ an iterative search over the set of possible cutoff beliefs and mixed strategies. This iterative search starts with a cutoff belief of one-half
and a mixed strategy in which no firm holding the cutoff belief enters.⁸ Every firm's expected utility is computed under this policy rule. Then, I consider a policy rule that involves slightly more firms experimenting. If a mass of firms exists at the cutoff point, I increase the probability of those firms entering. Otherwise, I lower the cutoff belief to the next lowest mass point of firms' beliefs and consider a low probability of adoption. Firms' expected utilities are then re-evaluated and compared with the results under the previous cutoff rule. If all firms are weakly better off, this new policy rule Pareto dominates the previous one. I then try a policy rule with even more experimentation. At some point, the outcome under the new policy rule will not Pareto dominate the previous outcome. When this occurs, the previous policy rule must generate a Pareto-optimal outcome. By construction, it is not Pareto dominated by the outcome associated with a policy rule with a higher cutoff belief. Further, from the proof of theorem 1, we know that if an outcome associated with a rule is not Pareto dominated by the outcome generated by a cutoff belief slightly below it, then this outcome is not Pareto dominated by any outcome generated by policy rules with lower cutoff beliefs.⁹ This policy rule is also the "highest" cutoff rule that solves the planner's problem, or is the one closest to the outcome under an ASG.

To check that I solved for the correct policy rule under the planner, I compare each firm's outcome under each regime and check that the planner's rule induces weakly more adoption and that each firm under the planner has weakly higher expected utility.

In the three-period example, the planner's policy rule is unique. Given the solution algorithm I used, other possible policy rules must use cutoff beliefs below the ones I reported.
This is not possible in the first period, where all firms adopt (a corner solution). Further, in the second period, inducing any positive measure of firms with the lower belief to adopt does not satisfy the planner's constraint that the planner's rule Pareto dominate the equivalent ASG outcome. Hence, the reported solutions to the planner's problem in the example are unique.

⁸ It is easy to show that the cutoff rule in ASG is strictly less than one-half in all periods except the last. So by proposition 1 we know that all solutions to the planner's problem are less than one-half.

⁹ This result follows from the fact that as $\gamma$ approaches zero, the expectation of current-period output under the risky technology falls. In addition, the return from learning-by-doing falls as more firms enter, decreasing the future output gains from experimenting.
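The control flow of the iterative search for the planner's rule can be sketched generically. The code below walks through candidate policy rules ordered from least to most experimentation and stops at the last rule that weakly Pareto dominates its predecessor. The candidate list, the `(cutoff, entry probability)` encoding, and the utility numbers are all made up for illustration; in the actual computation each firm's expected utility would come from solving the model under the candidate rule.

```python
from typing import Callable, List, Sequence, Tuple

Rule = Tuple[float, float]  # (cutoff belief, entry probability at the cutoff)


def weakly_dominates(new: Sequence[float], old: Sequence[float]) -> bool:
    """True if every firm is at least as well off under the new rule."""
    return all(n >= o for n, o in zip(new, old))


def pareto_search(
    candidates: List[Rule],
    utilities: Callable[[Rule], Sequence[float]],
) -> Rule:
    """Step through rules with ever more experimentation until one fails
    to weakly dominate the rule before it; return that previous rule."""
    current = candidates[0]
    current_u = utilities(current)
    for rule in candidates[1:]:
        rule_u = utilities(rule)
        if not weakly_dominates(rule_u, current_u):
            break
        current, current_u = rule, rule_u
    return current


# Hypothetical per-firm utilities for two belief types under each rule.
toy_utilities = {
    (0.5, 0.0): [1.00, 1.20],
    (0.5, 0.5): [1.05, 1.20],
    (0.5, 1.0): [1.10, 1.25],
    (0.4, 0.5): [1.08, 1.30],  # the first type is worse off here
}
candidates = list(toy_utilities)  # already ordered by experimentation
chosen = pareto_search(candidates, lambda rule: toy_utilities[rule])
print(chosen)
```

Stopping at the first failure relies on the property stated in the text: if a rule is not dominated by the rule with slightly more experimentation, it is not dominated by any rule with a lower cutoff belief.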
A.2 Proof of Theorem 1

Proof. This is a proof by induction that closely follows the proof laid out for the first theorem in Jensen (1983). In this proof, I suppress functions' dependence on $(\tau, \{\delta_s\})$. Referring to the notation laid out in equation 9, I know that the assumptions on the production technologies imply that $\forall\, t = 1, 2, \ldots, T$,

$$
V_t^0(0) > V_t^1(0), \qquad V_t^1(1) > V_t^0(1).
$$

Let $D_t(\gamma) = V_t^1(\gamma) - V_t^0(\gamma)$. To prove the theorem, it suffices to show that $D_t$ is strictly increasing in $\gamma$ for all $t$. I know that $D_T(\gamma) = Y_H \gamma + Y_L(1-\gamma) - Y$, which is strictly increasing in $\gamma$. By induction, I assume that $D_{t+1}$ is strictly increasing in $\gamma$. Further, for $t = 1, 2, \ldots, T-1$, define $V_t^{\{1,0\}}$ as the expected value of experimenting in the current period, not experimenting in the next period, and then continuing optimally thereafter. Let $V_t^{\{0,1\}}$ be similarly defined. Finally, let $D_t^1 = V_t^{\{1,0\}} - V_t^1$ and $D_t^0 = V_t^{\{0,1\}} - V_t^0$. It is straightforward to show that $V_t^{\{1,0\}}(\gamma) - V_t^{\{0,1\}}(\gamma) = (1-\beta)D_T(\gamma)$ using the properties of Bayes' rule. Then I can show that

$$
D_t(\gamma) = D_t^0(\gamma) - D_t^1(\gamma) + (1-\beta)D_T(\gamma), \tag{15}
$$

where I know that $D_T$ is strictly increasing in $\gamma$. Turning to the first two terms, I know that

$$
\begin{aligned}
D_t^1(\gamma) &= V_t^{\{1,0\}}(\gamma) - V_t^1 && (16) \\
&= \beta\left(V_{t+1}^0 - V_{t+1}\right) && (17)
\end{aligned}
$$

as the current-period output in both terms is the same. Depending on the private signal received, the right-hand side equals either 0 or $\beta\left(V_{t+1}^0 - V_{t+1}^1\right)$. By induction, $V_{t+1}^0(\gamma) - V_{t+1}^1(\gamma)$ is decreasing in $\gamma$. Similarly, it is easy to show that $D_t^0(\gamma)$ is increasing in $\gamma$. Hence, $D_t$ is strictly increasing in $\gamma$.
References

Banerjee, A. (1992): "A Simple Model of Herd Behaviour," Quarterly Journal of Economics, 107, 797-817.

Berry, D., and B. Fristedt (1985): Bandit Problems. Chapman and Hall, New York.

Bikhchandani, S., D. Hirshleifer, and I. Welch (1992): "A Theory of Fads, Fashion, Custom and Cultural Change as Informational Cascades," Journal of Political Economy, 100, 992-1026.

Bolton, P., and C. Harris (1999): "Strategic Experimentation," Econometrica, 67, 349-374.

Dasgupta, A. (2002): "Coordination, Learning, and Delay," FMG Discussion Paper No. 494.

Jensen, R. (1983): "Innovation Adoption and Diffusion When There are Competing Innovations," Journal of Economic Theory, 29, 161-171.

Jovanovic, B., and R. Rosenthal (1988): "Anonymous Sequential Games," Journal of Mathematical Economics, 17, 77-87.

Mitchell, M. (1997): "Bayesian Learning from Others in Competitive Equilibrium," University of Minnesota mimeo.

Rob, R. (1991): "Learning and Capacity Expansion under Demand Uncertainty," Review of Economic Studies, 58, 655-675.

Rothschild, M. (1974): "A Two-Armed Bandit Theory of Market Pricing," Journal of Economic Theory, 9, 185-202.
Cite this document
Adam Copeland (2004). Learning Dynamics with Private and Public Signals (FEDS 2004-67). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2004-67
@techreport{wtfs_feds_2004_67,
author = {Adam Copeland},
title = {Learning Dynamics with Private and Public Signals},
type = {Finance and Economics Discussion Series},
number = {2004-67},
institution = {Board of Governors of the Federal Reserve System},
year = {2004},
url = {https://whenthefedspeaks.com/doc/feds_2004-67},
abstract = {This paper studies the evolution of firms' beliefs in a dynamic model of technology adoption. Firms play a simple variant of the classic two-armed bandit problem, where one arm represents a known, deterministic production technology and the other arm an unknown, stochastic technology. Firms learn about the unknown technology by observing both private and public signals. I find that because of the externality associated with the public signal, the evolution of beliefs under a market equilibrium can differ significantly from that under a planner. In particular, firms experiment earlier under the planner than they do under the market equilibrium and thus firms under the planner generate more information at the start of the model. This intertemporal effect brings about the unusual result that, on a per period basis, there exist cases where firms in a market equilibrium over-experiment relative to the planner in the latter periods of the model.},
}