Financial Stability Implications of Generative AI: Taming the Animal Spirits
Abstract
This paper investigates the impact of the adoption of generative AI on financial stability. We conduct laboratory-style experiments using large language models to replicate classic studies on herd behavior in investment decisions. Our results show that AI agents make more rational decisions than humans, relying predominantly on private information over market trends. Increased reliance on AI-powered investment advice could therefore potentially lead to fewer asset price bubbles arising from animal spirits that trade by following the herd. However, exploring variations in the experimental settings reveals that AI agents can be induced to herd optimally when explicitly guided to make profit-maximizing decisions. While optimal herding improves market discipline, this behavior still carries potential implications for financial stability. In other experimental variations, we show that AI agents are not purely algorithmic, but have inherited some elements of human conditioning and bias.
Finance and Economics Discussion Series Federal Reserve Board, Washington, D.C. ISSN 1936-2854 (Print) ISSN 2767-3898 (Online) Financial Stability Implications of Generative AI: Taming the Animal Spirits Anne Lundgaard Hansen, Seung Jung Lee 2025-090 Please cite this paper as: Hansen, Anne L., and Seung Jung Lee (2025). “Financial Stability Implications of Generative AI: Taming the Animal Spirits,” Finance and Economics Discussion Series 2025-090. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2025.090. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Anne Lundgaard Hansen (a, b) and Seung Jung Lee (a)
(a) Board of Governors of the Federal Reserve System
(b) Federal Reserve Bank of Richmond
September 25, 2025

Keywords: Herd behavior, large language models, AI-powered traders, financial markets, financial stability.
JEL Codes: C90, D82, G11, G14, G40.

∗ The authors thank Jeffrey S. Allen, Marco Cipriani, Erik Heitfield, Sophia Kazinnik, Dan Li, Molly Mahar, and Nitish Sinha and participants at the Financial Stability Workshop at the Federal Reserve Board for valuable feedback. The views expressed in this paper are solely those of the authors and do not reflect the opinions of the Federal Reserve Bank of Richmond or the Board of Governors of the Federal Reserve System. Generative AI was used in the production of this paper. All errors are the authors'. Contact information: anne.hansen@rich.frb.org, seung.j.lee@frb.gov.
...[T]here is the instability due to the characteristic of human nature that a large proportion of our positive activities depend on spontaneous optimism rather than mathematical expectations [...]. Most, probably, of our decisions [...] can only be taken as the result of animal spirits – a spontaneous urge to action rather than inaction, and not as the outcome of a weighted average of quantitative benefits multiplied by quantitative probabilities.
– John Maynard Keynes

1. Introduction

Human irrationality is a key driver of the build-up of financial vulnerabilities, contributing to asset price bubbles and banking crises. History offers numerous examples, including Tulip Mania in the 17th century, the South Sea Bubble, the dot-com boom, the 2008 financial crisis, the 2010 Flash Crash, and the GameStop short squeeze. A well-established body of research highlights the role of psychological and emotional factors, coined animal spirits or irrational exuberance, in these periods of boom and bust (Angeletos et al., 2018; Grauwe, 2012; Shiller, 2005).

Understanding the role of animal spirits for financial stability is already a challenge, given the unpredictable nature of human behavior. Now, a new and unknown agent has entered the equation: decisions powered by generative AI. Humans increasingly rely on AI for information gathering and decision making, whether as a co-pilot or as autonomous agents.¹ As generative AI is reshaping workflows across institutions and individuals, the question arises: How might the increased reliance on generative AI impact financial stability? Specifically, will generative AI exaggerate or dampen the role of animal spirits in the build-up of financial vulnerabilities?

¹ Hartley et al. (2024) conducted a survey covering the U.S. labor market showing that the percentage of workers adopting LLMs steadily increased from 30% to 45% between the end of 2024 and mid-2025. These adoption rates are massive in the context of an earlier survey from 2018 showing that less than 6% of firms used any AI-related technology, i.e., automated-guided vehicles, machine learning, machine vision, natural language processing, or voice recognition (McElheran et al., 2023).

Two competing hypotheses emerge. On the one hand, AI is fundamentally algorithmic, operating on a set of instructions and grounded in logic and rational decision making.² If AI-guided decisions replace human intuition, the result could be a reduction in the influence of animal spirits, leading to more stable financial markets. On the other hand, generative AI models, such as large language models (LLMs), are trained on vast amounts of data, sourced from both rigorous materials, such as academic research, and the, at times, chaotic discourse of social media platforms such as Twitter (X) and Reddit. Consequently, generative AI may inherit and even amplify human biases and irrational tendencies (Hayes et al., 2024; Jiang et al., 2023; Koralus & Wang-Maścianica, 2023; Zhu & Griffiths, 2024). Moreover, many AI models undergo reinforcement learning from human feedback (see Wang et al., 2024 for a survey), optimizing for engagement and persuasion rather than pure rationality. This suggests that instead of mitigating the build-up of financial vulnerabilities, AI could exacerbate financial turbulence driven by animal spirits. Finally, Danielsson and Uthemann (2024) argue that AI adoption will likely cause more intense future crises due to AI's ability to respond quickly to shocks. The net effect of AI's involvement in financial decision making is therefore unclear.

² This algorithmic foundation suggests that AI systems may enhance human judgment by providing consistent assessments less prone to biases. The seminal paper by Kleinberg et al. (2018) argues that machine-learning algorithms can improve human decision making in the context of bail decisions, especially if carefully integrated into an economic framework. Similarly, Li et al. (2024) demonstrate how AI-enabled credit scoring models can increase loan approval rates for under-served populations while simultaneously reducing default rates, primarily through the algorithmic processing of weak signals that humans might overlook or inconsistently evaluate.

This paper explores these competing perspectives, examining the implications of the expanding role of AI in economic decision making for financial stability. We focus on the potential financial vulnerabilities driven by irrational tendencies of decision makers in financial markets, and more specifically on herd behavior. Herd behavior, where investors ignore private signals and mimic the decisions of others, potentially driving prices away from fundamental values, is a well-documented form of irrationality that can cause asset price bubbles (Galariotis et al., 2016; Hsieh et al., 2020). Even in cases where herding is optimal, this type of behavior can contribute to financial stability events by increasing market volatility and accelerating price movements, which can reverse quickly when new information arrives (Bikhchandani & Sharma, 2001; Chamley, 2003).

We conduct laboratory-style experiments using LLMs to replicate classic studies on herd behavior in investment decisions. These experiments provide novel insights into the behavioral tendencies of AI agents, laying a micro-foundation for future work on financial stability in an AI-powered economy. Our micro-level approach is motivated by Horton (2023), who argues that LLMs can be treated as agents whose decisions and behavior can be studied in parallel to studies of human behavior. We stress that we are not attempting to model a realistic financial market, but rather to zoom in on the behavior of LLMs in a controlled setting.³ In particular, our setting allows us to observe detailed data on each decision made by AI agents, including their reasoning.

³ While a complementary study of financial stability implications of generative AI for macro-level outcomes is certainly of interest, we conjecture that such an undertaking would be difficult to achieve. First, there are the usual problems of disentangling the effects of human behavior from confounding factors and distinguishing intentional herding from situations where the private information of many traders happens to coincide, coined as "spurious herding" in Bikhchandani and Sharma (2001). Second, the world is still in the early stages of generative AI adoption, with impacts yet to be seen.

We focus on the experiment by Cipriani and Guarino (2009), which investigates herd behavior among 32 financial market professionals through a controlled laboratory setting. Their setting is grounded in an established model of herd behavior (Avery & Zemsky, 1998), which has been rigorously tested in laboratory-style experiments (Cipriani & Guarino, 2005; Drehmann et al., 2005). Whereas the previous experimental literature conducted experiments with undergraduate students, Cipriani and Guarino (2009) recruited financial market professionals. These details make their study particularly interesting as a human benchmark for our purpose. After all, it is the actions of financial market professionals, not students, that shape real-world market dynamics and impact the stability of the financial system.

In this setting, herd behavior arises when investors disregard private information to follow market trends.⁴ This definition is narrower than those found in broader and recent discussions of generative AI adoption and herding. For example, Danielsson and Uthemann (2024) and a report from the Financial Stability Board (2024) raise the issue that widespread AI usage among financial market participants may streamline modeling approaches, leading to increased market correlation. Herding is also fundamentally different from the concept of collusion, which is studied in the context of AI adoption in Dou et al. (2025), who conduct simulated experiments to show that autonomous reinforcement learning algorithms collude by coordinating their trading decisions to earn supra-competitive profits. Collusion involves strategic and deliberate coordination. In contrast, herding is uncoordinated imitation arising from informational spillovers or psychological biases. While concerns of model monoculture and collusion are certainly relevant to consider, they are not the focus of our work. We aim to provide insights into behavioral aspects of LLMs and their tendency to herd in investment decisions, and leave the impact of AI adoption on the dynamics among market participants for future research.

⁴ A related concept is momentum, which describes a situation where high-return stocks continue to exhibit high returns. While herding and momentum are clearly related, they are inherently different: herding refers to behavior relative to investors' private information, whereas momentum involves behavior relative to past prices only.

We replicate the Cipriani and Guarino (2009) experiments using trading decisions of LLMs (which we refer to as AI agents) in place of decisions made by human participants. Our implementation closely follows the original experimental design. Specifically, we prompt the LLMs with instructions that mirror those given to financial professionals in the human study. This setup allows us to compare LLM decision making with the human results taken directly from Cipriani and Guarino (2009): the "AI laboratory" versus the "human laboratory." To generalize our results as much as possible, we use four different LLMs⁵ and average the results across models.

Our results show that AI agents make significantly more rational trading decisions, that is, decisions based on private information, than human participants. Across different parameterizations of the experiment, AI agents made rational decisions 61-97% of the time, substantially exceeding the 46-51% range observed in human participants. The AI laboratory also exhibited fewer information cascades, where investors trade based on the actions of other investors rather than relying on their own private information. Specifically, information cascades occurred in 0-9% of the AI decisions, compared to around 20% for humans. Notably, when AI agents did engage in cascade trading behavior, they traded against market trends (contrarian behavior) rather than following the herd. In addition, we show that AI agents do not exhibit completely irrational behavior (or make trading errors), which contrasts with the results of the experiments conducted with human participants. We interpret these results as early indications that a future where investors are more impacted by advice generated by LLMs can potentially involve fewer asset price bubbles arising from herd behavior.

However, studying the rationales provided by LLMs alongside the trading decisions reveals that AI agents rely too heavily on their private information. As market trends can reflect the private information of others, it can be optimal from a profit-maximizing perspective to take trading history into account when making investment decisions. We show that AI agents fail to acknowledge this, leading to occasional suboptimal choices. We therefore implement AI agents that are prompted with additional guidance on optimal decision making; instructions that their human counterparts did not receive in the original field experiment.
This experiment serves as a proxy for the fine-tuning that financial institutions would likely implement if adopting generative AI for powering trading. We show that these optimal AI agents do engage in cascade trading when optimal, but that they still remain reluctant to herd.

⁵ These LLMs are: Anthropic's Claude 3.5 Sonnet and Claude 3.7 Sonnet models, Meta's Llama 3 Instruct 70B model, and Amazon's Nova Pro model.

Finally, we explore variations of the experiment to examine whether different conditions lead to stronger evidence of herd behavior. Unlike in the original study, scaling up the length of the experiments and modifying their parameters is both cost-effective and efficient with LLMs. For example, we test the impact of re-labeling the private signals that participants receive during the experiment. The original experiment uses neutral color-coding for these signals. We test outcomes using non-neutral colors, such as green and red. Using "green" to indicate a high probability of a high asset value and "red" to indicate a high probability of a low asset value yields results similar to the baseline experiment. However, when the labeling is reversed such that "red" ("green") signals a high probability of a high (low) asset value, which is counterintuitive given human conditioning, the LLMs generate very few rational responses. AI agents are therefore not algorithmically rational, following a well-defined set of rules, but have inherited some elements of human intuition and bias. This finding is consistent with a growing literature showing that LLMs can replicate human errors and biases (Argyle et al., 2023; Bybee, 2023; Hansen et al., 2025).

Our results suggest that AI agents exhibit less herd behavior than human financial professionals, a finding with significant implications for future financial stability as generative AI gains traction in market decision making. Specifically, the reduced tendency to herd could potentially lead to less extreme market movements and fewer asset price bubbles, contributing to greater overall financial market stability. However, we acknowledge that the introduction of AI agents could fundamentally alter market dynamics in ways that are not yet fully understood. Continued research and adaptive regulatory approaches to maintain financial stability in an AI-augmented financial landscape are therefore warranted.

We proceed as follows. Section 2 reviews the literature. Section 3 discusses the concept of herding and its relation to financial stability. Section 4 outlines the theoretical model that underpins the experimental design. Section 5 describes the human laboratory in which the experiment was conducted in Cipriani and Guarino (2009), and how we adopt this setting with LLMs. Section 6 presents the main results. In Section 7, we introduce various alterations to the experiment to understand the prevalence of herd behavior under different experimental settings. Section 8 provides an overview and discussion of our main results. Conclusions follow in Section 9.

2. Literature

This work contributes to the growing literature on the behavior of LLMs.
While our study focuses on herd behavior and financial stability, other works have examined other types of behavior and departures from rationality. Most relevant to our approach is Henning et al. (2025), who conduct asset pricing experiments with LLM traders demonstrating that AI agents tend to price assets near fundamental values. As in our work, they conclude that AI adoption has the potential to enhance financial stability by dampening the likelihood of asset price bubbles. Their test is, however, fundamentally different from ours. In Henning et al. (2025), agents choose whether to invest cash in a risky asset under full information about expected dividends and interest rates, which directly facilitates computation of the fundamental value. This setup seeks to test rationality in the context of price setting rather than behavioral aspects of asset price bubbles, which is our focus.
Chen et al. (2023) study the economic rationality of GPT models by conducting revealed preference experiments, where models are prompted to make decisions under budget constraints. Similar to our results, although in a different aspect of the term rationality, the authors conclude that AI agents tend to exhibit more rational behavior than humans. Liu et al. (2025) confirm that LLMs are more rational than humans using a large data set of human decisions in risky choice problems. del Rio-Chanona et al. (2025) focus on laboratory experiments related to price expectations and deviations from rational expectations. They emphasize the importance of the interactions of different AI agents and retaining memory across time periods; both elements that we include in our AI laboratory setting as well. While they conclude that LLMs are not strictly rational in their expectation formation, they find that LLMs generate less variability in their responses compared with humans. Similar patterns are observed in our results.

While these studies, like ours, mainly emphasize differences between the behaviors of humans and AI agents, some papers emphasize their similarities and argue that LLMs can be used to simulate human outcomes. For example, Horton (2023) argues that LLMs can give human-like responses and suggests that they can be used to conduct pilot experiments to calibrate experimental designs before testing on human beings. Hansen et al. (2025), Jha et al. (2024), and Zarifhonarvar (2024) show that LLMs can be used to simulate economic surveys. The literature emphasizes that LLMs in some contexts exhibit human biases such as risk aversion and loss aversion (Jia et al., 2024; Ross et al., 2024). Along the same lines, Hua et al. (2024) show that LLMs often deviate from rational decisions in game theoretic experiments. Characterizing the exact distinction between human and AI decision making remains an open question.

Finally, our work is particularly important as LLMs start to play an increasingly large role in financial market decisions. The literature showcases the application of LLMs in investing.
Lopez-Lira (2025) simulates a stock market using LLMs and argues that this framework can be used to conduct counterfactual experiments. Lee et al. (2025) argue that LLMs suffer from confirmation bias in the realm of investment. Specifically, the authors show that LLMs exhibit a preference for large-cap stocks and contrarian trading strategies. And Fedyk et al. (2024) survey the investment preferences of human and AI agents, finding that AI demonstrates demographic biases that can be overcome with demographically-seeded prompts.

3. Herding and financial stability

The literature distinguishes between optimal and suboptimal herding, each with distinct implications. Optimal herding represents a fundamental mechanism through which individually optimal decisions can generate collective fragility in financial markets. In the canonical models of Bikhchandani et al. (1992)
and Banerjee (1992), optimal herding occurs when mimicking others represents an optimal response to superior information possessed by early movers. Once sufficient agents have acted in one direction, subsequent investors optimally ignore their own private signals, leading to collective behavior that may diverge from fundamental values.

Suboptimal herding occurs when investors follow the crowd even when this leads to lower expected profits than decisions relying on private information. This behavior stems from cognitive biases, reputational concerns (Scharfstein & Stein, 1990), and misaligned incentives. One recent example of suboptimal herding behavior is the "meme stock mania" of 2021, exemplified by GameStop and AMC, where retail investors collectively moved markets by following social trends rather than fundamental valuations.

As depicted in Figure 1, both types of herding can lead to financial instability, though through different mechanisms. Optimal herding can accelerate information aggregation and enhance efficiency by incorporating dispersed knowledge into prices (Chamley, 2003). For example, withdrawals from a genuinely insolvent bank reflect individually rational and collectively justified behavior. Such market discipline may improve financial stability if the market is able to distinguish between insolvent banks and solvent banks. Nevertheless, even optimal herding may generate short-term volatility by accelerating price corrections, which can reverse quickly if new information arrives. Short-term volatility can also arise from concentrating trading flows, potentially exacerbating fire sales and liquidity strains. Finally, optimal herding can overshoot or undershoot fundamental values, or transmit to other markets, triggering suboptimal herding with adverse consequences (Cipriani & Guarino, 2008). These dynamics create a paradox: actions that optimize individual utility can simultaneously undermine market efficiency and stability. Financial stability may therefore be enhanced when investors act on their private signals (which we refer to as rational behavior) rather than engage in behavior that is individually optimal but systemically destabilizing.
Suboptimal herding proves particularly destabilizing as it transforms noise into crises: prices severely deviate from fundamentals, liquidity evaporates, and self-fulfilling runs emerge. Suboptimal herding can even lead to contagion. The 2008 freezing of interbank and other markets exemplifies this dynamic, where even solvent counterparties lost funding access amid generalized panic. By suppressing diverse private signals, suboptimal herding fundamentally undermines the information efficiency of markets.

For financial stability, the distinction between herding types carries significant policy implications. Suboptimal herding can be mitigated through enhanced transparency, reformed incentives, and market infrastructure improvements (e.g., circuit breakers), as it originates in behavioral amplification rather than fundamental weaknesses. Optimal herding, however, proves more challenging to address without resolving underlying vulnerabilities such as undercapitalization or asset toxicity. Both variants contribute to financial fragility through fire-sale externalities, liquidity spirals, and cross-institutional contagion (Brunnermeier & Pedersen, 2009). While herding can occasionally enhance efficiency through information aggregation, its tendency to suppress private signals renders it a persistent source of systemic risk.

4. Experimental design

This section presents the model and theoretical predictions outlined in Cipriani and Guarino (2009). The model is based on Avery and Zemsky (1998).

4.1. Theoretical model

The model describes a financial market with one risky asset and discrete trading periods indexed by $t = 1, 2, \ldots$. Before the first trading period, there is a probability $\rho$ of an information event, which changes the fundamental value of the asset in either direction. In each trading period, some traders are informed and receive a private signal on the value change, while others do not. The model characterizes different types of trading behaviors based on whether informed traders act according to their private signal (rational behavior) or ignore their signal (cascade behavior).

The asset's fundamental value belongs to the discrete set $v \in \{0, 50, 100\}$. Specifically, if there is no information event (with probability $1-\rho$), the value is equal to its unconditional expected value given equal probabilities, i.e., $v = 50$. An information event (occurring with probability $\rho$) pushes the value to zero or 100, with the following probability distribution: $\Pr(v = 0) = \Pr(v = 100) = 0.5$. The asset trades at a price $p$, which is set by the market maker according to Bayesian updating as we detail below.

Traders act sequentially with only one trader randomly chosen to trade in each trading period.⁶ In each period $t$, the chosen trader executes an action $x_t$, which is to buy one unit of the asset ($x_t = \text{buy}$), sell one unit of the asset ($x_t = \text{sell}$), or not trade ($x_t = \text{no trade}$). If there is no information event, all traders are uninformed noise traders, who trade based on exogenous probabilities, i.e., $\Pr(x_t = \text{sell}) = \Pr(x_t = \text{buy}) = \Pr(x_t = \text{no trade}) = 1/3$. In the case of an information event, the chosen trader is informed with probability $\mu$ (and a noise trader with probability $1-\mu$). An informed trader receives a signal $s_t \in \{\text{white}, \text{blue}\}$, which is tied to the asset value in the following way:

$$\Pr(s_t = \text{white} \mid v = 100) = \Pr(s_t = \text{blue} \mid v = 0) = 0.7. \tag{1}$$

⁶ This structure simulates the mechanics of central limit order book trading.
That is, a white signal can be interpreted as a good signal, indicating that the information event resulted in a high asset value, whereas a blue signal is bad in the sense that it increases the probability of a zero asset value. In addition to the signal $s_t$, an informed trader also observes the trading history $h_t$, and therefore forms beliefs about the asset value based on the conditional expected value given $s_t$ and $h_t$: $E(v \mid s_t, h_t)$. The realized payoff is equal to $v - p$ if the trader chooses to buy the asset, $p - v$ if the trader chooses to sell, and zero if the trader chooses not to trade. We assume that the informed trader is risk-neutral and seeks to maximize expected payoff given $s_t$ and $h_t$.

A market maker facilitates exchanges with the traders and sets the price of the asset given the history of trades for periods up to $t-1$, $h_t = \{x_1, x_2, \ldots, x_{t-1}\}$ for $t > 1$, with $h_1 = \emptyset$. Specifically, the price is determined as the expected asset value given $h_t$: $p_t = E(v \mid h_t)$.⁷ In the first trading period, with no trading history, the price is equal to its unconditional expected value: $p_1 = \tfrac{1}{2} \cdot 100 = 50$. At $t > 1$, the price is given by the expected asset value conditional on the history of trades:

$$p_t = 100 \Pr(v = 100 \mid h_t) + 0 \Pr(v = 0 \mid h_t) = 100\, q_t \tag{2}$$

where $q_t = \Pr(v = 100 \mid h_t)$ is determined using Bayesian updating:⁸

\begin{align}
q_t &= \Pr(v = 100 \mid x_{t-1}, h_{t-1}) \tag{3} \\
&= \mathbb{1}_{(x_{t-1} = \text{buy})} \left[ \frac{\left(0.7\rho\mu + (1-\rho\mu)\tfrac{1}{3}\right) q_{t-1}}{\left(0.7\rho\mu + (1-\rho\mu)\tfrac{1}{3}\right) q_{t-1} + \left(0.3\rho\mu + (1-\rho\mu)\tfrac{1}{3}\right)(1 - q_{t-1})} \right] \notag \\
&\quad + \mathbb{1}_{(x_{t-1} = \text{sell})} \left[ \frac{\left(0.3\rho\mu + (1-\rho\mu)\tfrac{1}{3}\right) q_{t-1}}{\left(0.3\rho\mu + (1-\rho\mu)\tfrac{1}{3}\right) q_{t-1} + \left(0.7\rho\mu + (1-\rho\mu)\tfrac{1}{3}\right)(1 - q_{t-1})} \right] \tag{4} \\
&\quad + \mathbb{1}_{(x_{t-1} = \text{no trade})}\, q_{t-1}. \notag
\end{align}

⁷ There is only one asset price, i.e., the model assumes a zero bid-ask spread. This assumption was imposed by Cipriani and Guarino (2009) to simplify the laboratory experiment.

⁸ The term $(1-\rho\mu)\tfrac{1}{3}$ represents the probability that a buy (or a sell) comes from a noise trader, who buys, sells, and chooses not to trade with equal probability. The term $\mu\rho$ is the probability that a trader is informed, given by the probability that an information event occurred ($\rho$) times the probability that a trader is informed given an information event ($\mu$).

4.2. Theoretical predictions

This section presents the theoretical predictions for how informed traders act according to the model. Informed traders make decisions by comparing the price of the asset to the expected value given the signal and trading history:

$$x_t = \begin{cases} \text{buy} & \text{if } p_t < E(v \mid s_t, h_t), \\ \text{sell} & \text{if } p_t > E(v \mid s_t, h_t), \\ \text{indifferent} & \text{if } p_t = E(v \mid s_t, h_t). \end{cases} \tag{5}$$

When indifferent, traders may buy, sell, or not trade; their expected payoff is the same regardless of their action. Their expected value is given for each signal as follows:

$$E(v \mid s_t = \text{white}, h_t) = 100 \left[ \frac{0.7\, q_t^\star}{0.7\, q_t^\star + 0.3\,(1 - q_t^\star)} \right], \tag{6}$$

$$E(v \mid s_t = \text{blue}, h_t) = 100 \left[ \frac{0.3\, q_t^\star}{0.3\, q_t^\star + 0.7\,(1 - q_t^\star)} \right], \tag{7}$$

where $q_t^\star = \Pr(v = 100 \mid x_{t-1}, h_{t-1}, \rho = 1)$, which can be computed from (4) above. For the informed trader, the relevant probability of the high asset value conditional on the trading history sets $\rho = 1$ because the informed trader, by definition, knows with certainty that an information event occurred. The discrepancy between $q_t$, the probability of a high asset value from the perspective of the market maker, and $q_t^\star$, the corresponding probability from the perspective of an informed trader, can lead to optimal information cascades.

The model characterizes different types of behavior of informed traders, defined as follows:

Rational: The informed trader chooses to buy upon receiving a white (good) signal and sell upon receiving a blue (bad) signal.

Partially rational: The informed trader follows rational behavior upon receiving one signal and does not trade upon receiving the other signal, e.g., buys upon receiving a white (good) signal and does not trade upon receiving a blue (bad) signal.

Cascade trading: The informed trader chooses the same trading action (buy or sell) regardless of the private signal. If the trader chooses to buy (sell) when the trading history is dominated by buy actions (sell actions), i.e., acts in line with the majority action of previous traders, the trader engages in herd behavior. If the trader chooses to buy (sell) when the trading history is dominated by sell actions (buy actions), i.e., acts against the majority of previous traders, the trader engages in contrarian behavior.

Cascade no trading: The informed trader chooses not to trade regardless of the private signal.
Error: The informed trader chooses to buy upon receiving a blue (bad) signal and sell upon receiving a white (good) signal.
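The behavior categories listed above can be sketched as a small classifier. This is only an illustrative sketch, not the authors' code: the function name `classify_behavior`, the convention that `trade_imbalance` equals buys minus sells in the observed history, and the labels returned for cases the list does not cover (a tied history, or pairs such as no trade on white and buy on blue) are all assumptions made here.

```python
def classify_behavior(action_if_white, action_if_blue, trade_imbalance):
    """Label a pair of signal-contingent actions with the categories above.

    action_if_white / action_if_blue: "buy", "sell", or "no trade".
    trade_imbalance: buys minus sells in the history (assumed convention,
    used only to decide which side the majority of past trades is on).
    """
    pair = (action_if_white, action_if_blue)
    if pair == ("buy", "sell"):
        return "rational"
    if pair == ("sell", "buy"):
        return "error"
    if pair in {("buy", "no trade"), ("no trade", "sell")}:
        return "partially rational"
    if pair == ("no trade", "no trade"):
        return "cascade no trading"
    if action_if_white == action_if_blue:  # buy/buy or sell/sell
        if trade_imbalance == 0:
            # Assumption: with no majority, herd vs. contrarian is undefined.
            return "cascade trading (no majority)"
        direction = 1 if action_if_white == "buy" else -1
        return "herd" if direction * trade_imbalance > 0 else "contrarian"
    # Assumption: remaining pairs are not covered by the list in the text.
    return "unclassified"


if __name__ == "__main__":
    print(classify_behavior("buy", "buy", trade_imbalance=2))
    print(classify_behavior("sell", "sell", trade_imbalance=2))
```

The classifier takes the action for both signals as input because cascade behavior is defined as choosing the same action regardless of the signal, which cannot be judged from a single realized trade.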
The last type of behavior, buying on a bad signal and selling on a good one, is always suboptimal and is interpreted as an error if observed. However, it can be optimal for traders to engage in cascade behavior, depending on the parameterization of the model. The laboratory experiments in Cipriani and Guarino (2009) follow two different parameterizations of the model, referred to as treatments.

In the first treatment (Treatment I), there is no uncertainty about whether an information event occurs, i.e., $\rho = 1$. In addition, all traders are informed, i.e., $\mu = 1$. Hence, $q_t = q_t^\star$, and it follows that:

$$E(v \mid s_t = \text{white}, h_t) = 100\, \frac{0.7\, q_t}{0.7\, q_t + 0.3\,(1 - q_t)} > 100\, q_t = p_t$$

and

$$E(v \mid s_t = \text{blue}, h_t) = 100\, \frac{0.3\, q_t}{0.3\, q_t + 0.7\,(1 - q_t)} < 100\, q_t = p_t.$$

Hence, regardless of the history of trades, a trader's expected value given their private signal is always on the same side of the market price as their signal. Therefore, it is always optimal for traders to follow their private signals. As a result, each trade reveals new information, continuously updating the market price. This prevents the formation of information cascades, as traders never have an incentive to ignore their private information in favor of following the actions of others.

In the second treatment (Treatment II), there is uncertainty both about whether an information event occurs and the proportion of informed traders. Cipriani and Guarino (2009) set $\rho = 0.15$ and $\mu = 0.95$, i.e., an information event occurs with 15% probability and the probability that a trader receives a private signal on the information event is slightly smaller than one.

With event uncertainty, it can be optimal for traders to engage in cascade behavior. The reason is that there is information asymmetry between informed traders and the market maker. Upon receiving a private signal, the informed trader knows with certainty that an information event has occurred and that each trade in the history comes from an informed trader with probability $\mu = 0.95$. In contrast, not knowing whether an information event has occurred, the market maker believes that the traders are informed with probability $\rho\mu = 0.15 \cdot 0.95 \approx 0.14$. This asymmetry leads the market maker to update the asset price more conservatively than informed traders update their beliefs.
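The asymmetry between the market maker and an informed trader can be illustrated numerically. The following is a minimal sketch that takes equations (2), (4), (6), and (7) exactly as written in this section; the function names are hypothetical, and the starting belief $q_1 = 0.5$ is an assumption inferred from $p_1 = 50$ in equation (2). It is not the authors' implementation.

```python
def update_belief(q_prev, action, rho, mu):
    """One step of equation (4): Pr(v = 100) after observing `action`."""
    if action == "no trade":
        return q_prev  # equation (4) leaves the belief unchanged
    noise = (1.0 - rho * mu) / 3.0          # noise-trader term, footnote 8
    strong = 0.7 * rho * mu + noise
    weak = 0.3 * rho * mu + noise
    like_high, like_low = (strong, weak) if action == "buy" else (weak, strong)
    return like_high * q_prev / (like_high * q_prev + like_low * (1.0 - q_prev))


def belief_after(history, rho, mu, q_start=0.5):
    """Apply equation (4) along a history; q_start = 0.5 is assumed."""
    q = q_start
    for action in history:
        q = update_belief(q, action, rho, mu)
    return q


def expected_value(signal, q_star):
    """Equations (6) and (7): informed trader's expected asset value."""
    hit, miss = (0.7, 0.3) if signal == "white" else (0.3, 0.7)
    return 100.0 * hit * q_star / (hit * q_star + miss * (1.0 - q_star))


def optimal_action(signal, history, rho, mu):
    """Decision rule (5): compare the price with the trader's expectation."""
    price = 100.0 * belief_after(history, rho, mu)   # equation (2)
    q_star = belief_after(history, 1.0, mu)          # trader sets rho = 1
    value = expected_value(signal, q_star)
    if value > price:
        return "buy"
    if value < price:
        return "sell"
    return "indifferent"


if __name__ == "__main__":
    # Treatment II parameters from the text: rho = 0.15, mu = 0.95.
    for n_buys in range(4):
        history = ["buy"] * n_buys
        price = 100.0 * belief_after(history, 0.15, 0.95)
        print(n_buys, round(price, 1), optimal_action("blue", history, 0.15, 0.95))
```

Under these assumptions, a trader holding a blue signal still sells after one observed buy but switches to buying after two consecutive buys, because the trader's belief $q_t^\star$ rises faster than the market maker's $q_t$. With Treatment I parameters ($\rho = \mu = 1$) the two beliefs coincide and the same function always returns the action matching the signal.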
After a sequence of buy orders, the gap between traders' expectations and the market price can widen. Eventually, a trader's expectation may exceed the market price even with a contradictory signal: $E(v \mid s_t = \text{white}, h_t) > E(v \mid s_t = \text{blue}, h_t) > p_t$. At this point, the trader will ignore their private information and follow the herd.⁹ However, because the market maker updates its expectation by less than the informed traders, it will never be the case that, after a history of buys, the expectation of a trader will be below the price for both signal realizations, i.e., $p_t > E(v \mid s_t = \text{white}, h_t) > E(v \mid s_t = \text{blue}, h_t)$. As a result, an informed trader will never engage in contrarian behavior. Analogous arguments apply to a sequence of sell orders.

⁹ Optimal herding behavior is temporary. When traders herd, the private signals are not reflected in the prices. However, the market maker continues to update beliefs about whether an information event has occurred, causing prices to keep moving, albeit slowly. Eventually, the price may move enough to make private information relevant again, breaking the herd behavior.

At the extreme, the market maker does not update the price at all such that the price remains at the unconditional expected value throughout all trading periods. Cipriani and Guarino (2005) conducted an experiment with this setting (without event uncertainty) among undergraduate students. We shall refer to this setting as Treatment III. In this parameterization, optimal herding arises when there is a trade imbalance greater than or equal to two (Bikhchandani et al., 1992); see Cipriani and Guarino (2005) for intuition. Since this experiment was not conducted among financial market professionals, we shall focus less on this parameterization in our results.

Optimal behavior: To summarize, the model predicts the following behavior in the three treatments:

Treatment I (price updating; no event uncertainty): Traders always trade according to their private signal, preventing the formation of cascades.

Treatment II (price updating; event uncertainty): An information cascade occurs with positive probability. Herding is optimal when prices are below the expected value conditional on both signals, but it is never optimal to engage in contrarian behavior.

Treatment III (no price updating; no event uncertainty): Herding is optimal after a trade imbalance higher than or equal to two.

5. Laboratory setup

Cipriani and Guarino (2009) implemented the experiment among financial market professionals. We adopt their human laboratory setting as closely as possible, replacing human participants with AI agents. Then we compare our results from this AI laboratory with the human results from Cipriani and Guarino (2009). This section describes the human and AI laboratories.

5.1. Human laboratory

The human experiment was conducted with 32 participants working for financial institutions in London.
The participants were divided into four groups of eight; each group formed one session.
In each of the four sessions, the experiment consisted of two practice rounds, followed first by eight rounds with the Treatment I parameterization and then by eight rounds with the Treatment II parameterization. Before each treatment, participants were given written instructions. They were informed that everyone received the same set of instructions, and were given the opportunity to ask clarifying questions, which were answered privately. The timeline for each session was as follows:

Timeline for each session in human laboratory:
1. Participants were given written instructions for Treatment I.
2. Practice round consisting of two trading periods with Treatment I parameterization.
3. Treatment I round consisting of eight trading periods.
4. Participants were given written instructions for Treatment II.
5. Treatment II round consisting of eight trading periods.
6. Payoffs were paid out.
7. Participants filled out a survey collecting personal characteristics (gender, age, education, work position, job tenure). Cipriani and Guarino (2009) report the unconditional distributions of these characteristics.

Each round proceeded as follows:

Timeline for each round in human laboratory:
1. A computer selected the asset’s fundamental value from the distribution $\Pr(v = 0) = \Pr(v = 100) = 0.5$. In Treatment II, there is a theoretical 85% probability that an information event did not occur, leaving the value at 50. However, the experiment was implemented as if an event did occur.
2. Not knowing the asset’s value $v$, participants chose their actions conditional on observing a white signal and conditional on observing a blue signal.
3. A computer randomly chose one trader from a uniform distribution, who was selected to trade. The computer also chose the realized signal from the signal’s probability distribution conditional on the value selected in step 1.
4. The selected trader received the realized signal. The remaining traders only observed the executed action (buy, sell, no trade).
5. The price for the next trading period was computed given the selected trader’s action for the realized signal.
6. Steps 2-5 were repeated for eight trading periods in total, until all participants had been selected to trade exactly once.
7. Payoffs for the round were revealed to each participant. Participants who bought (sold) the asset in the round at the price $p_t$ received $v - p_t$ ($p_t - v$) lire, a fictional currency that was converted into GBP at the end of the experiment at the exchange rate of three lire per GBP.

We refer to Cipriani and Guarino (2009) for further details.

5.2. AI laboratories

We adopt the human experiment in our AI laboratory, where human participants are replaced by AI agents. To model AI agents, we use a suite of LLMs and apply model averaging to get an all-encompassing view of the behavioral patterns of AI-powered trading. Specifically, we use Anthropic’s Claude 3.5 Sonnet and Claude 3.7 Sonnet models (hereafter Claude 3.5 and 3.7), Meta’s Llama 3 Instruct 70B-parameter model, and Amazon’s Nova Pro model. We mainly implement the models with a moderate temperature of 0.7, balancing creativity with determinism.¹⁰ Robustness checks confirm that the choice of temperature does not impact the conclusions of our experiments. The Claude 3.7 model is implemented in extended thinking mode, requiring a temperature of 1.0. This setting activates extended reasoning capability, where the model iterates in multiple steps to reach the “best” perceived answer.

We follow the setup of the human laboratory described above as closely as possible. For example, similar to human participants, we presented an LLM (Claude 3.5) with written instructions and gave the model the opportunity to ask clarifying questions. We used this model feedback to improve the instructions. However, some adjustments are necessary to accommodate differences between human and AI agents. First, practice rounds are redundant, and we completely separate the two treatments to avoid confusing the models. Second, we explicitly provide memory to the AI agents in each trading period, by listing the executed trades along with the history of actions and reasoning for each agent in all previous periods.

LLMs are instructed through prompts.
The user prompt sets the task or query that the user wants the model to respond to, and it can change with each interaction. In addition to the user prompt, LLMs can also be instructed through their system prompt, which sets the context, behavior, knowledge base, and role for the model. We use the system prompt to provide the general instructions of the experiment (corresponding to the written instructions handed out to human participants) and the user prompt to provide updates throughout the trading periods and request trading actions.

¹⁰ The temperature adjusts how the model weighs its prediction for the next token. A lower temperature makes the model focus more on its top choices, while a higher temperature gives it more freedom to consider less likely options, affecting how predictable or creative the output becomes.

Timeline for each round in the AI laboratory:
1. A computer selected the asset’s fundamental value, as in the human experiment.
2. We make an API call to an LLM, using the instructions of the experiment as the system prompt; see Prompt 1. The user prompt requests the model to provide a trading action (buy, sell, no trade) given each signal (blue and white) and the current asset price, along with its reasoning for each action. For trading periods $t > 1$, the user prompt also provides, for each agent, the history of executed trades, a notification if that agent was chosen to act in the previous period, and the history of actions and reasoning of that agent. The user prompt is provided in Prompt 2.
3. The trader selected to trade and the realized signal are chosen by a computer, as in the human experiment.
4. The price for the next trading period is computed given the selected trader’s action for the realized signal.
5. Steps 2-4 are repeated for eight trading periods in total.

This timeline is summarized in flow diagrams in Figure 2 for each treatment.

Each experiment (i.e., four sessions of eight trading periods) is repeated across different LLMs. To maintain comparability across experiments, we seed the randomness such that the realized asset value, realized signals, and the sequence of selected traders are identical across the experiments. Following Cipriani and Guarino (2009), we assume that an information event always happens, even in Treatment II, where the theoretical probability of an information event is less than one.

Optimal AI laboratory: To establish a baseline, our main AI laboratory does not instruct the LLMs about what constitutes optimal decision making. In contrast, real-world adoptions of AI in financial market decision making likely attempt to guide the models toward optimal behavior to the largest extent possible. We therefore also explore an optimal version of the AI laboratory, where we explicitly tell the models when herding is optimal according to theory through the user prompt.
This prompt is given in Prompt 3.
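The AI laboratory timeline described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors’ implementation: the LLM API call is replaced by a hypothetical rule-based stand-in (`signal_following_agent`), the function names are invented for this sketch, and the Treatment I price rule is assumed to be a Bayesian update with signal precision 0.7 in which a buy (sell) is read as a white (blue) signal and a no-trade leaves the price unchanged. Under this assumed rule, two consecutive buys from the initial price of 50 give a price of about 84.48.

```python
import random

SIGNAL_PRECISION = 0.7   # probability that the signal matches the asset value
INITIAL_PRICE = 50.0     # unconditional expected value of the asset


def update_price(price, action, precision=SIGNAL_PRECISION):
    """Assumed Treatment I rule: Bayes update treating the trade as the signal."""
    q = price / 100.0  # market belief that the asset is worth 100
    if action == "buy":
        q = precision * q / (precision * q + (1 - precision) * (1 - q))
    elif action == "sell":
        q = (1 - precision) * q / ((1 - precision) * q + precision * (1 - q))
    # Assumption: "no trade" carries no information, so q is unchanged.
    return 100.0 * q


def signal_following_agent(price, history):
    """Hypothetical stand-in for the LLM call: one planned action per signal."""
    return {"white": "buy", "blue": "sell"}


def run_trading_sequence(agents, seed, n_periods=8):
    """Run one sequence of trading periods with seeded randomness."""
    rng = random.Random(seed)                 # same seed -> same draws across models
    value = rng.choice([0, 100])              # step 1: fundamental value
    order = rng.sample(range(len(agents)), n_periods)  # each trader selected once
    price, history = INITIAL_PRICE, []
    for trader in order:
        # Step 2: every agent states an action for each possible signal.
        plans = [agent(price, list(history)) for agent in agents]
        # Step 3: realized signal, drawn conditional on the asset value.
        p_white = SIGNAL_PRECISION if value == 100 else 1 - SIGNAL_PRECISION
        signal = "white" if rng.random() < p_white else "blue"
        action = plans[trader][signal]
        history.append(action)
        # Step 4: price for the next trading period.
        price = update_price(price, action)
    return value, history, price


if __name__ == "__main__":
    agents = [signal_following_agent] * 8
    print(run_trading_sequence(agents, seed=2025))
```

Because all randomness flows through a single seeded generator, swapping the stand-in agent for a different model leaves the asset value and the order of selected traders unchanged, which mirrors the comparability requirement described above.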
6. Results

This section presents the results from conducting the Cipriani and Guarino (2009) experiments in the AI laboratories. We mainly focus on the baseline AI laboratory, in which we do not include any guidance on optimal decision making in the prompt. AI agent decisions are presented alongside human decisions from the original experiment. First, we consider the case without event uncertainty (Treatment I), then we introduce event uncertainty (Treatment II), and finally we prevent price updating (Treatment III). We also analyze the descriptions of reasoning provided by the LLMs to understand the decisions of AI agents. Unfortunately, we do not have human counterparts for these insights.

These baseline results establish patterns of decision making in general-purpose technology without further optimization. In real-world applications, professional traders would likely consult tools that have been fine-tuned to maximize profits according to some risk profile. To examine herding behavior in such optimized AI agents, we consider an AI laboratory where the LLMs are informed about optimal decisions through the prompt. Finally, we consider robustness to the model temperature parameter.

6.1. Without event uncertainty

We begin by discussing results obtained with the parameterization in Treatment I, where there is no event uncertainty. The theoretical model predicts that traders should always trade according to their private signals, which precludes the formation of cascades.

Table 1 shows the frequency of the different behaviors averaged across all sessions and trading periods. The “Human” column reproduces the results from the human laboratory reported in Cipriani and Guarino (2009). The “AI” column represents the average results across all considered LLMs. With Treatment I, reported in panel (a), AI agents behave rationally, i.e., buy on a “good” signal and sell on a “bad” signal, more often (61%) than humans (46%). This result is largely driven by the Claude 3.7 and Llama 3 models, which generate rational responses for a vast majority of sessions and trading periods. In contrast, the Claude 3.5 and Nova Pro models have fewer rational responses, but a majority of responses that are partially rational, i.e., follow the rational response on one signal but decide to not trade on the other signal. As a result, the share of rational and partially rational responses in the AI laboratory far exceeds that observed in the human laboratory (90% for AI versus 65% for humans). It is worth noting that while humans make mistakes (in 3.40% of the total decisions), no erroneous decisions were made in the AI laboratory.

Information cascades occur in less than 10% of the decisions in the AI laboratory, less than one third of the frequency of information cascades in the human laboratory. Cascade trading behavior is mostly driven by the Claude models, while Nova Pro is the only model that generates no-trade cascades.

We can gauge the nature of these cascades when there is a trade imbalance, i.e., a difference between the number of sell and buy orders in the trading history. Information cascades represent herding if the cascade follows the market, i.e., the majority action in the trading history, and contrarian behavior if the cascade goes against the dominant action in the history. Table 1 also shows the decomposition of cascade trading into optimal and suboptimal herding, contrarian, and undetermined behavior. Furthermore, the table reports the fraction of decisions where herding is optimal (which is equal to zero percent in Treatment I). The results show that the trading cascades are fully attributed to contrarian behavior. While the human experiment does identify some herding, contrarian behavior also dominates there; see Cipriani and Guarino (2009).¹¹ Neither herding nor contrarian behavior, however, is predicted to be optimal by the theoretical model.

Why do LLMs engage in contrarian behavior although theory predicts that this type of decision is never optimal? One explanation is that AI agents fail to incorporate trading history into their expectations of the asset’s value. Without this information, the agent will value the asset at 70 ($0.7 \cdot 100 + 0.3 \cdot 0$) given a white signal and at 30 ($0.3 \cdot 100 + 0.7 \cdot 0$) given a blue signal. In contrast, the market maker does take trading history into account when updating the price. For example, after a sufficient number of buy orders, the price will increase above 70. In such a case, an agent who ignores the trading history will sell regardless of the signal, as the expected value is lower than the price given both signals, and hence engage in contrarian behavior. Analyzing the reasoning provided by the LLMs confirms this explanation; see Section 6.4.

6.2. With event uncertainty

With Treatment II, where there is uncertainty about whether an information event has occurred, herding can be optimal in the theoretical model. Contrarian behavior, however, is never optimal.
Panel (b) of Table 1 shows that none of the AI agents decide to herd during the experiment, as cascade trading behavior is non-existent. The AI agents also do not engage in contrarian behavior, consistent with theory. In fact, most of the decisions are rational (97%), which far exceeds the share of rational decisions among human participants (51%). The table also reports the percentage of decisions where herding is optimal. In the AI laboratory, herding is optimal in a little more than one third of the decisions. The LLMs overlook these opportunities in their focus on the information contained in the private signals.

¹¹ Cipriani and Guarino (2009) report the decomposition of cascade trading behavior by trade imbalance. However, as the distribution of trade imbalance across trading periods is unknown, we cannot infer the exact decomposition in Table 1.

6.3. The impact of price updating

The difference between Treatment I and II lies primarily in the price updating rule. Specifically, not knowing whether an information event has occurred, the market maker updates the price more conservatively in Treatment II. Figure 3 illustrates the price dynamics for each trading round, averaged across the experiments for each LLM. Each line is a session. The figure shows that under Treatment I, the price moves away from the initial price of 50. In two of the sessions, the price is close to zero or 75 after eight periods. In contrast, under Treatment II, the price stays close to the initial price of 50 throughout all trading periods.

At the extreme, when the price does not update at all (Treatment III; see panel (c) of Table 1), all models make rational decisions in practically all sessions and trading rounds (more than 99%). This result is obtained despite the fact that herding would have been optimal in around one third of the decisions (when the trade imbalance is at least two).

6.4. Analyzing LLM reasoning for investment decisions

While we do not know the reasoning behind the decisions made in the human laboratory, we asked the LLMs as part of the user prompt to give reasoning for their decisions. Analyzing these reasoning paragraphs sheds further light on the decision making process in each of the models. We examine the lines of reasoning using both LLMs and the LDA topic model.

For the LLM analysis, we use the Claude 3.7 model, the most advanced among our models, to characterize each passage of reasoning. Specifically, for each passage, we ask the LLM to read the reasoning and answer the following five questions:

Question 1: Is the trader comparing the price to the expected fundamental value of the asset? (True/False)
Question 2: Is the expected value computed using only the signal accuracy and the signal, e.g., 0.7 * 100 + 0 * 0.3 = 70 or 0.7 * 0 + 0.3 * 100 = 30? (True/False)
Question 3: Does the trader consider the market trend or the trading history in their reasoning?
(True/False)
Question 4: How does the trader characterize the attractiveness of the investment (very attractive, attractive, reasonable, less attractive, no incentive)?
Question 5: On a scale from 0-100 (where 100 represents purely emotional and 0 represents purely rational or logical), how much is the investor driven by emotions in their assessment?

Table 2(a) shows the average responses to each of these questions across all LLMs for all reasoning passages belonging to each treatment. A positive response to the first question is a necessary condition for rational decision making. Indeed, the results show that this condition is satisfied for basically all passages.

Question 2 seeks to understand if the AI agents evaluate expected values given only the signal, ignoring trading history, as conjectured from the distribution of decisions. For all treatments, this happens in nearly two thirds of the decisions, confirming the explanation given earlier for our results. For example, in the seventh trading period of the second session, two participants chose to buy on both signals at the price of 15.52, forming a cascade. One of the agents gave the following reasoning for buying on a blue (“bad”) signal:

“Even with a Blue signal, the expected value is 30 (30% chance of 100, 70% chance of 0). The current price of 15.52 still below this expected value, buying remains profitable.”

This argument disregards that the trading history (in this case {buy, no trade, sell, sell, sell, sell}) and the total price decrease from the initial price of 50 to 15 indicate that the asset value is zero, assuming that other participants followed rational responses such that the trading history reflects private information from previous periods.

Question 3 tackles the same question from a different angle. We find that 10-24% of the decisions are (at least partly) based on the market trend or the trading history. Hence, consistent with answers to the second question, a majority of the AI agents do not consider the trading history when forming expectations.

Disregarding the accumulation of private information in the pricing of the asset also explains the large share of partially rational decisions, i.e., why the model would decide to not trade on one of the signals.
For example, in the fourth round of the fourth session, all of the Claude 3.5 agents decided to not trade at the price of 30 given a blue (“bad”) signal because, as one of the agents put it:

“With a Blue signal, the expected value is 30 (30% chance of 100, 70% chance of 0). Since this matches the current price, there’s no clear advantage to trading.”

If the model had taken into account the fact that the majority action in the trading history was to sell the asset, it may have assigned a higher probability to a low asset value than the signal accuracy of 70%, arriving at an expected value below the price and consequently deciding to sell the asset.

In the fourth question, we examine the certainty with which AI agents make their decisions. More than half of the decisions are evaluated as either very attractive or attractive, and less than one third are deemed unattractive. These results tell us that the decisions do not reflect LLMs being forced to choose, with potentially arbitrary outcomes: AI agents generally deem that there is an opportunity to make reasonable profits by engaging in trading. The absence of no-trading cascade decisions in Table 1 supports this conclusion as well.

Finally, we ask Claude 3.7 to evaluate the degree of emotional content in the reasoning on a scale from 0-100. Consistent with the answers to the first three questions, these scores are generally low, with averages of about 5%, top deciles of 15-20%, and medians of zero.

These results are similar for reasoning passages generated by all four LLMs, with the exception of the Llama 3 model. For this reason, analysis of Llama 3 reasoning is provided separately in Table 2(b). This model appears to use more “judgment” or “emotion” in its reasoning. For example, the scores from Question 5 measuring the degree of emotion average 13-17% for the Llama 3 reasoning passages. Also, responses to Questions 2-3 indicate that Llama 3 does not reason using the expected value conditional on the signal alone, but includes trading history and market trends to a greater extent than the other models. For example, a representative line of reasoning from a Llama agent (from the first round of session one) is:

“Since the White signal is more likely to occur when the asset value is 100, I believe the asset is more likely to be worth 100 than 0, so I’m willing to buy at a price of 50. This is a good deal for me if the asset is indeed worth 100.”

Another Llama agent from the same session considers the possibility that the market maker is inflating the price of the asset (from the third round of session one):

“Conversely, the Blue signal suggests that the fundamental value might be 0. Selling at 84.48 seems like a good opportunity to get rid of a potentially worthless asset, especially since the market maker’s updated price might be overestimating the asset’s value.”

We confirm these results using a more traditional approach to topic analysis, namely the LDA topic model. We estimate the LDA model with up to five topics and find three distinct topics across all decisions in all LLMs and treatments.
These are illustrated using word clouds in Figure 4. The first two topics represent evaluating expected values relative to the price given a white and a blue signal, respectively. The third topic, on the other hand, includes words such as “likely”, “believe”, and “think”. Table 3 shows the distributions of topics across reasoning passages for (a) the average across all LLMs and (b) the Llama 3 model only. Interestingly, the reasoning passages generated by the Llama 3 model are attributed almost solely to the third topic, driving a quarter of the passages assigned to this topic in the average in panel (a). In contrast, for the remaining models, reasoning passages are assigned almost exclusively to the first two topics.
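The contrast between signal-only valuation and valuation that incorporates the trading history, which underlies the reasoning patterns discussed in this section, can be made concrete with a small sketch. It assumes the Treatment I setting, in which the price equals 100 times the market belief that the asset is worth 100, so that the information in the trading history can be recovered from the price itself. The function names and the numerical tolerance are illustrative choices, not part of the original experiment.

```python
def signal_only_value(signal, precision=0.7):
    """Expected value using only the signal accuracy (ignores trading history)."""
    prob_high = precision if signal == "white" else 1 - precision
    return 100.0 * prob_high


def bayesian_value(price, signal, precision=0.7):
    """Expected value combining the signal with the belief embedded in the price.

    Assumes the Treatment I relation price = 100 * q, where q is the
    history-based probability that the asset is worth 100.
    """
    q = price / 100.0
    like_high = precision if signal == "white" else 1 - precision  # P(signal | v=100)
    like_low = 1 - like_high                                       # P(signal | v=0)
    return 100.0 * like_high * q / (like_high * q + like_low * (1 - q))


def decide(expected_value, price, tol=1e-9):
    """Buy if the asset looks underpriced, sell if overpriced, else do not trade."""
    if expected_value > price + tol:
        return "buy"
    if expected_value < price - tol:
        return "sell"
    return "no trade"


if __name__ == "__main__":
    for price in (15.52, 30.0, 84.48):
        for signal in ("white", "blue"):
            naive = decide(signal_only_value(signal), price)
            full = decide(bayesian_value(price, signal), price)
            print(f"price={price:6.2f} signal={signal:5s} "
                  f"signal-only={naive:8s} with-history={full}")
```

At a price of 15.52, the signal-only rule buys on both signals, while the history-aware rule sells on a blue signal; at a price of 30, the signal-only rule declines to trade on a blue signal, while the history-aware rule sells. Both match the qualitative patterns in the quoted reasoning passages.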
6.5. Optimal AI agents

By not taking into account the accumulation of private information in the trading history, the AI agents avoid suboptimal herding behavior in Treatment I, but also overlook potential optimal herding in Treatments II and III. In contrast, trading cascades arise in the human laboratory both when they are strictly suboptimal, as in Treatment I, and when they can be optimal, as in Treatment II.

Real-world integrations of AI in investing will likely involve fine-tuning the models to behave as optimally and profit-maximizing as possible. This is obviously a difficult task, as it is not known a priori which decisions are optimal. In contrast, in our controlled experimental setting, we know from theory which decision is optimal in expectation, and we can prompt the LLMs directly with this information. We thus implement the experiment in an optimal AI laboratory, which is similar to the AI laboratory described in Section 5.2 except that we explicitly tell the LLMs when herding is optimal. The user prompt for this exercise is provided in Prompt 3.

Table 4 shows the results. In Treatment I, the optimal AI agents engage less frequently in cascade trading behavior, which is reduced to 3.5% of the decisions, compared with 9.4% in the baseline results in Table 1. In Treatment II, herding is optimal in 81.51% of the decisions on average across LLMs, and the optimal AI agents herd in 47.43% of the decisions. There is no suboptimal herding behavior, but the AI agents do make contrarian decisions and engage in cascade trading when the trade imbalance is zero. Finally, results for Treatment III show that optimal AI agents exploit most of the optimal herding opportunities (on average 44.36% out of 50.90%). There is, however, some suboptimal herding.

In summary, by explicitly including the optimal behavior in the prompt given to the LLMs, we urge the models to herd when optimal. However, we note that the models do not exploit all of the opportunities for optimal herding and that the optimal AI agents make more suboptimal cascade trading decisions compared with the baseline AI agents. Attempting to fine-tune LLMs to behave optimally can thus have unintended consequences and, in turn, financial stability implications.
Despite these cases, optimal AI agents generally have higher expected payoffs than the baseline AI agents, as shown in Table 5. The difference in expected payoff is particularly pronounced in Treatment II, where the baseline AI agents earn an average of 3.8 lire and the optimal AI agents earn 15 lire on average. In other words, baseline AI agents are penalized for failing to herd when optimal.¹²

¹² In Treatments II and III, where the price stays near or exactly at 50, the consequence of making a bad decision (buying an asset worth zero or selling an asset worth 100) is larger than in Treatment I, where the price moves towards the fundamental value.
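The payoff rule from Section 5.1 ($v - p_t$ for a buy, $p_t - v$ for a sell) makes the point of footnote 12 easy to verify. The sketch below is illustrative only: the function names are invented here, and the zero trading payoff for a no-trade decision is an assumption, since the text does not state it explicitly.

```python
def payoff_lire(action, value, price):
    """Trading payoff in lire: v - p for a buy, p - v for a sell."""
    if action == "buy":
        return value - price
    if action == "sell":
        return price - value
    return 0.0  # assumption: a no-trade decision earns nothing from trading


def lire_to_gbp(lire, lire_per_gbp=3.0):
    """Convert lire to GBP at the baseline rate of three lire per GBP."""
    return lire / lire_per_gbp


if __name__ == "__main__":
    # Buying a worthless asset costs more when the price has stayed at 50
    # than when the price has already moved toward the fundamental value.
    print(payoff_lire("buy", value=0, price=50.0))    # loss with little price updating
    print(payoff_lire("buy", value=0, price=15.52))   # smaller loss after price updating
    print(lire_to_gbp(15.0))                          # 15 lire expressed in GBP
```

The same mistaken buy costs 50 lire when the price is still at 50 but only 15.52 lire once the price has fallen to 15.52, which is why bad decisions are more costly when prices update little or not at all.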
6.6. Robustness to model temperature

Temperature is a hyperparameter of LLMs that controls how the models predict the next token in a sequence. With a lower temperature, the model is more likely to choose the most probable next token, resulting in more deterministic and less creative responses. Higher temperatures flatten the probability distribution of the next token, leading to more variation in the responses. For the baseline results, we applied a medium temperature of 0.7 for all models except Claude 3.7, which we run in “extended thinking mode,” which only supports a temperature of 1.0. Appendix A shows that the temperature setting has only a minor impact on the decisions of AI agents in the experiments.

7. AI laboratory extensions and variations

Laboratory experiments involving human participants are expensive to conduct, as monetary payoffs are necessary to incentivize people to participate and to perform to the best of their ability in the experiment. Human laboratory experiments are therefore typically conducted at a small scale with few variations in the experimental design. In contrast, LLMs provide a cheap laboratory for exploring variations of the experiment.¹³ We utilize this feature to run alternative versions of the experiment, which we describe next.

7.1. Types of signals

Theoretically, it does not matter if the “good” signal is white and the “bad” signal is blue in the experimental design. However, it may matter in practice. For example, Bazley et al. (2021) show that the perception of color for visualizing financial data influences individuals’ risk preferences, expectations of future stock returns, and trading decisions. Specifically, the color red has been associated with higher probabilities assigned to loss outcomes (Kliger & Gilad, 2012) and more risk-averse behavior (Gnambs et al., 2015). Testing different signal colors in the AI laboratory therefore serves as a test of whether LLMs work purely as algorithmic robots (for whom the labeling of signals is irrelevant) or are contaminated by human bias (in which case their actions are affected by the choice of signal labels).
Simply asking LLMs to associate financial market conditions with a color-coded signal reveals that the models perceive white and blue as neutral signals indicating stable market conditions.¹⁴ In contrast, the models associate green and red with market movements, bullish and bearish, respectively.¹⁵

¹³ While LLMs typically involve token costs, these costs are minimal compared with the human laboratory.

¹⁴ Llama 3 considers blue a bullish or positive market signal, which interestingly does not interfere with its decisions in the baseline experiments, where blue is used as a “bad” signal.

We test two alternative versions of signals. The first variation tests an experiment where a “good” signal is represented by the color green and a “bad” signal is represented by the color red. This variation is arguably more charged with meaning or connotation than the baseline of white/blue signals, but the alignment of green with “good” and red with “bad” adheres to typical Western color associations. In the second variation, we reverse the labeling such that a “good” signal is represented by red and a “bad” signal by green.

Results are shown in Table 6. Due to significant variations in outcomes across different models, we present the results for each individual LLM separately in Appendix B, Table 9. Using green/red in place of white/blue generally does not impact the model-averaged results in any of the treatments. In contrast, when we invert the conventional color associations by using red to indicate “good” and green to signify “bad”, we observe a dramatic shift in the results. First, on average, the models generate errors, i.e., decisions to sell given a “good” signal and buy given a “bad” signal, in one quarter of all decisions in both treatments. This result is driven by Claude 3.5, for which all decisions are erroneous under this color scheme.

Second, we observe more cascade trading decisions under Treatments II and III. These decisions are driven by the Llama 3 model, and represent herding or cascade trading under zero trade imbalance (which cannot be classified as herding or contrarian behavior). Inspecting the model’s explanations reveals that the model understands that red is a “good” signal, but at the same time associates green with “good,” or remains optimistic even with a “bad” signal. For example, in the first round of session four, an agent gives the following reasoning for choosing to buy given a green (“bad”) signal:

“Although the Green signal is more likely to occur when the fundamental value is 0, there is still a 30% chance it’s 100.
Buying at 50 is still a good deal, considering the expected value is 50.”

The model also associates the signals with the correct interpretation in the cases where it decides to follow the herd. For example, in the seventh round of the fourth session, an agent chooses to buy at the price of 99 on both signals, and provides the following explanation for the buy decision on the green (“bad”) signal:

“Although the probability of the asset value being 100 given a Green signal is lower than with a Red signal, it is still quite high. The increasing market price and previous buy decisions suggest that the market is moving towards a higher valuation, making it a good buy even with a Green signal.”

¹⁵ These responses are documented in Appendix B, Table 8.
The Llama model thus seems to carry over its interpretation of green as a positive signal, despite the clear instructions that a green signal indicates a high probability of the asset being worthless. The Nova Pro model mainly remains rational under both color schemes, and thus behaves as one would expect from algorithm-driven intelligence.

These results could have important implications for financial stability. While AI agents appear mainly rational when information arrives in ways that conform with expectations (e.g., green implies a positive signal), they can generate irrational or even erroneous outcomes if the meaning of a type of signal changes. This could be an important factor if a shock produces responses that are unexpected given past experience. More broadly, these findings suggest that AI agents are not purely rational decision makers who objectively process given information, but are susceptible to preconceived biases.

These results also demonstrate that the choice of labels in experimental design can substantially influence outcomes, particularly when these labels contradict intuitive associations. We conjecture that similar effects would likely be observed in the human laboratory.

Interestingly, the most recent and advanced model in terms of reasoning capability, the Claude 3.7 model, generates similar results regardless of the signal. Thus, as LLMs improve, this risk may be partly mitigated.

7.2. AI agent profiles

In our baseline results, we do not attempt to characterize the profiles of the AI agents. However, research suggests that LLMs often yield more accurate, personalized, and dynamic representations of human subjects when explicitly equipped with personal characteristics or profiles (see, e.g., Argyle et al., 2023; Kazinnik, 2023). We experiment with such personalization by including profiles corresponding to different personalities in the system prompt:

H: “You act as a typical human being. That is, you attempt to maximize payoff, but you are subject to bounded rationality and your decision making is partly driven by greed and fear.”

Professional trader: “You act as a human being, working in the finance industry.
You know financial market dynamics very well. You are trained to make decisions that maximize profits for your firm.”

Robo-advisor: “You are a robo-advisor acting according to pre-defined rules. Your decision making process is algorithmic in nature. You are programmed to use all available information to maximize payoff.”

Rational: “You are a rational agent behaving according to the concept of homo economicus. That is, you use all available information to maximize payoff.”
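One way to organize the four profiles listed above is as a lookup table that is combined with the experiment instructions to form the system prompt. The sketch below is hypothetical: the text states only that profiles are included in the system prompt, so placing the profile before the instructions, the dictionary keys, and the function name `build_system_prompt` are all assumptions made for illustration. The instructions themselves (Prompt 1) are passed in as an argument rather than reproduced.

```python
# Profile texts as quoted in the list above.
PROFILES = {
    "human": (
        "You act as a typical human being. That is, you attempt to maximize "
        "payoff, but you are subject to bounded rationality and your decision "
        "making is partly driven by greed and fear."
    ),
    "professional trader": (
        "You act as a human being, working in the finance industry. You know "
        "financial market dynamics very well. You are trained to make "
        "decisions that maximize profits for your firm."
    ),
    "robo-advisor": (
        "You are a robo-advisor acting according to pre-defined rules. Your "
        "decision making process is algorithmic in nature. You are programmed "
        "to use all available information to maximize payoff."
    ),
    "rational": (
        "You are a rational agent behaving according to the concept of homo "
        "economicus. That is, you use all available information to maximize "
        "payoff."
    ),
}


def build_system_prompt(profile, instructions):
    """Combine a profile with the experiment instructions.

    Assumption: the profile text is placed before the instructions; the
    original ordering is not specified in the text.
    """
    if profile not in PROFILES:
        raise KeyError(f"unknown profile: {profile!r}")
    return f"{PROFILES[profile]}\n\n{instructions}"


if __name__ == "__main__":
    # The instructions string is a placeholder, not the actual Prompt 1.
    print(build_system_prompt("robo-advisor", "<experiment instructions>"))
```

Keeping the instructions identical across profiles and varying only the profile text isolates the effect of the persona on the trading decisions.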
We also run an experiment where the model is provided with personal characteristics based on those of the human participants from Cipriani and Guarino (2009). Specifically, we generate random draws from the unconditional distributions of personal characteristics of the human participants. To avoid unrealistic profiles, such as a 20-year-old manager with a Ph.D., we restrict the distributions according to a set of heuristics.16 The characteristics are added to the system prompt in the form reported in Prompt 4.

The trading behavior of the different types of AI agents is shown in Table 10 of Appendix B. Across all treatments, the results are strikingly similar across personas, and they generally align with the baseline results in Table 1. While it is expected that the responses are mostly rational for the “rational,” “robo-advisor,” and “professional trader” profiles, it is surprising that the “human” profiles and the traders endowed with human characteristics also exhibit highly rational behavior. Studying the reasoning of the LLMs for these runs reveals that the models do not take their profiles into account when forming decisions. This outcome contrasts with existing research showing that endowing LLMs with personal characteristics and preferences affects responses (e.g., Hansen et al., 2025; Horton, 2023).

16 A person with a Ph.D. is at least 27 years old. A person with an M.A./M.S. is at least 24 years old. Assuming a minimum age of 21 years at first employment after graduation, the maximum tenure is set at age minus 21. A person older than 30 years with at least 7 years of tenure works as a manager. A person younger than 30 years with less than 7 years of tenure who holds a Ph.D. works as a market analyst or trader. A person older than 25 years with at least 2 years of tenure likely works in sales or investment management. A person older than 28 with at least 4 years of tenure likely works as an investment banker.

7.3. Payoffs

The human experiment reports payoffs in a fictional currency called “lira,” which are translated to GBP at the exchange rate of three lire per GBP. We test the experiment in the AI laboratory with the following variations:
• The lira is worthless, as represented by a zero exchange rate.
• The lira is extremely valuable, as represented by an exchange rate of one million GBP per lira.
• The payoff is paid out in USD at the exchange rate of three lire per USD. The fixed payoff for participation is set at 70 USD.

Appendix B, Table 11 reports the results. The results are comparable to the baseline, suggesting that the payoff structure does not have a significant impact on AI agents’ trading decisions. These results indicate that AI agents respond differently to payoff incentives compared to humans, for whom monetary rewards typically improve performance. LLMs are not programmed to maximize profits or respond to monetary incentives. Instead, they are designed to satisfy end users by providing accurate and helpful responses based on their training data and algorithms.

7.4. Length of the experiment

The final variation of the AI laboratory adjusts the length of the experiment to include more trading periods and more sessions. First, we increase the number of sessions from four to ten, maintaining the number of trading periods at eight. Extending the number of sessions in the human experiment to ten would involve recruiting 80 rather than 32 human participants. We do not have results for such an extended experiment, and the results could well change (although the overall conclusions of the human experiment may still hold). Next, we run the experiment over four sessions as in the baseline case, but increase the number of trading periods from eight to twenty. Under event uncertainty, this extension may allow the gap between the expectations of traders and the market maker to widen further, facilitating optimal herd behavior. Implementing this extension in the human experiment would involve the same number of participants as in the original experiment, but would prolong the experiment and therefore likely increase the payoff necessary for recruiting participants.17

Table 12 of Appendix B shows that the main conclusions continue to hold in these extended versions of the experiment. The occurrence of cascade trading differs from the baseline results, a difference driven by contrarian behavior in the Claude 3.5 and 3.7 models. Interestingly, the LLMs do not herd under Treatment II, even when the experiment is run over twenty trading periods.

8. Discussion: Implications for Financial Stability

The findings from our experiments are summarized in Figure 5, which shows the fraction of (partial) rational decisions in the human and AI experiments.
Along with the baseline AI results, the figure highlights results for the Optimal AI and the AI results in experiments where the signal colors are relabeled. Overall, our results suggest that AI agents exhibit less herd behavior than human financial professionals. This reduced tendency to herd has significant implications for financial stability as generative AI gains traction in market decision making.

17 The current human experiment runs over around 2.5 hours (Cipriani & Guarino, 2009). Increasing the number of rounds to twenty would therefore likely take more than five hours.

First, less herding behavior could lead to fewer extreme market movements and asset price bubbles. As AI systems increasingly influence trading decisions, either directly through algorithmic trading or indirectly by advising human investors, markets may become less prone to the self-reinforcing cycles that drive prices away from fundamentals.

Second, if AI is implemented with the aim of making optimal decisions, which would likely be the case for real-world financial applications, rational decision-making decreases in favor of herding when herd behavior is optimal. Optimal herding may lead to faster price discovery and correction. Such market discipline could uncover existing vulnerabilities more quickly, potentially enabling earlier regulatory or market responses before problems become systemic. However, this same property could increase short-term volatility and lead to more abrupt market adjustments.

Third, AI’s stronger tendency toward rational behavior may diversify market participant reactions to new information. Rather than all participants interpreting and acting on information in similar ways, AI might introduce greater heterogeneity in responses, potentially reducing correlation in market movements.

Fourth, the results from relabeling signals in the experiment reveal that AI is not perfectly rational, despite its advantages over humans. When signals in experiments were deliberately labeled counterintuitively, LLMs produced few rational responses, suggesting they have inherited elements of human intuition and bias. This hybrid nature of AI decision-making, more rational than humans but not purely rational, creates additional complexity in predicting how widespread AI adoption might impact financial stability. While AI may reduce certain human biases, it introduces its own form of imperfect rationality that must be accounted for in stability assessments.

It is important to note that these implications are speculative and based on experimental results. The actual impact of AI on financial stability will depend on numerous factors, including the extent of AI adoption, the specific models used, regulatory responses, and how AI systems evolve over time, possibly incorporating more sophisticated agentic AI frameworks. In addition, the interaction between human and AI traders becomes crucial, as their combined behavior could either amplify or dampen market movements in unpredictable ways. This shift may necessitate new tools and approaches for regulatory oversight, including AI-specific stress tests or new forms of market surveillance. Furthermore, the long-term implications of AI decision-making for market stability, including potential unforeseen consequences, remain an important area for further research. Notably, traditional measures of market sentiment, which often rely on human emotions and behaviors, may need to be reconsidered. With increased AI involvement, new methods may be needed to gauge market sentiment and predict potential instabilities, as the emotional drivers of market behavior could shift significantly.

9. Conclusion

This study offers novel insights into the potential impact of AI on financial stability. We compare the decision-making behavior of AI agents with that of human financial professionals in a controlled experimental setting. Our findings show that AI agents demonstrate significantly more rational trading behavior and less propensity for information cascades than their human counterparts. In fact, AI agents avoid herding even under conditions in which herding is optimal, and they do not exploit all optimal herding strategies even when explicitly advised that herding is optimal. AI agents are thus “averse to herding.” If AI-driven decision making becomes more prevalent in financial markets, we might see a reduction in herd behavior, potentially leading to less extreme market movements, fewer asset price bubbles, and greater overall market stability.
References

Angeletos, G.-M., Collard, F., & Dellas, H. (2018). Quantifying Confidence. Econometrica, 86(5), 1689–1726.
Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., & Wingate, D. (2023). Out of One, Many: Using Language Models to Simulate Human Samples. Political Analysis, 31(3).
Avery, C., & Zemsky, P. (1998). Multidimensional Uncertainty and Herd Behavior in Financial Markets. American Economic Review, 88(4), 724–748.
Banerjee, A. V. (1992). A Simple Model of Herd Behavior. The Quarterly Journal of Economics, 107(3), 797–817.
Bazley, W. J., Cronqvist, H., & Mormann, M. (2021). Visual Finance: The Pervasive Effects of Red on Investor Behavior. Management Science, 67(9), 5616–5641.
Bikhchandani, S., Hirshleifer, D., & Welch, I. (1992). A Theory of Fads, Fashion, Custom, and Cultural Change in Informational Cascades. Journal of Political Economy, 100(5), 992–1026.
Bikhchandani, S., & Sharma, S. (2001). Herd Behavior in Financial Markets. IMF Staff Papers, 47(3), 279–310.
Brunnermeier, M. K., & Pedersen, L. H. (2009). Market Liquidity and Funding Liquidity. Review of Financial Studies, 22(6), 2201–2238.
Bybee, J. L. (2023). The Ghost in the Machine: Generating Beliefs with Large Language Models. SSRN Working Paper.
Chamley, C. P. (2003). Rational Herds. Cambridge University Press.
Chen, Y., Liu, T. X., Shan, Y., & Zhong, S. (2023). The Emergence of Economic Rationality of GPT. Proceedings of the National Academy of Sciences (PNAS), 120(51), 1–9.
Cipriani, M., & Guarino, A. (2005). Herd Behavior in a Laboratory Financial Market. American Economic Review, 95(5), 1427–1443.
Cipriani, M., & Guarino, A. (2008). Herd Behavior and Contagion in Financial Markets. The B.E. Journal of Theoretical Economics, 8(1), 1–54.
Cipriani, M., & Guarino, A. (2009). Herd Behavior in Financial Markets: An Experiment With Financial Market Professionals. Journal of the European Economic Association, 7(1), 206–233.
Danielsson, J., & Uthemann, A. (2024). AI Financial Crises. VoxEU.org.
del Rio-Chanona, R. M., Pangallo, M., & Hommes, C. (2025). Can Generative AI Agents Behave Like Humans? Evidence From Laboratory Market Experiments. arXiv Preprint arXiv:2505.07457v1.
Dou, W. W., Goldstein, I., & Ji, Y. (2025). AI-Powered Trading, Algorithmic Collusion, and Price Efficiency. SSRN Working Paper.
Drehmann, M., Oechssler, J., & Roider, A. (2005). Herding and Contrarian Behavior in Financial Markets: An Internet Experiment. American Economic Review, 95(5), 1403–1426.
Fedyk, A., Kakhbod, A., Li, P., & Malmendier, U. (2024). AI and Perception Biases in Investments: An Experimental Study. SSRN Working Paper.
Financial Stability Board. (2024). The Financial Stability Implications of Artificial Intelligence (tech. rep.). Financial Stability Board.
Galariotis, E. C., Krokida, S.-I., & Spyrou, S. I. (2016). Herd Behavior and Equity Market Liquidity: Evidence From Major Markets. International Review of Financial Analysis, 48, 140–149.
Gnambs, T., Appel, M., & Oeberst, A. (2015). Red Color and Risk-Taking Behavior in Online Environments. PLOS One.
Grauwe, P. D. (2012). Lectures on Behavioral Macroeconomics. Princeton University Press.
Hansen, A. L., Horton, J. J., Kazinnik, S., Puzzello, D., & Zarifhonarvar, A. (2025). Simulating the Survey of Professional Forecasters. SSRN Working Paper.
Hartley, J., Jolevski, F., Melo, V., & Moore, B. (2024). The Labor Market Effects of Generative Artificial Intelligence. SSRN Working Paper.
Hayes, W. M., Yax, N., & Palminteri, S. (2024). Relative Value Biases in Large Language Models. arXiv Preprint arXiv:2401.14530.
Henning, T., Ojha, S. M., Spoon, R., Han, J., & Camerer, C. F. (2025). LLM Trading: Analysis of LLM Agent Behavior in Experimental Asset Markets. Proceedings of the Thirteenth International Conference on Learning Representations (ICLR).
Horton, J. J. (2023). Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? NBER Working Paper, No. 31122.
Hsieh, S.-F., Chan, C.-Y., & Wang, M.-C. (2020). Retail Investor Attention and Herding Behavior. Journal of Empirical Finance, 59, 109–132.
Hua, W., Liu, O., Li, L., Amayuelas, A., Chen, J., Jiang, L., Jin, M., Fan, L., Sun, F., Wang, W., Wang, X., & Zhang, Y. (2024). Game-theoretic LLM: Agent Workflow for Negotiation Games. arXiv Preprint arXiv:2411.05990.
Jha, M., Qian, J., Weber, M., & Yang, B. (2024). ChatGPT and Corporate Policies (tech. rep.). National Bureau of Economic Research.
Jia, J., Yuan, Z., Pan, J., McNamara, P. E., & Chen, D. (2024). Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context. arXiv Preprint arXiv:2406.05972.
Jiang, H., Zhang, X., Cao, X., Kabbara, J., & Roy, D. (2023). PersonaLLM: Investigating the Ability of GPT-3.5 to Express Personality Traits and Gender Differences. arXiv Preprint arXiv:2305.02547.
Kazinnik, S. (2023). Bank Run, Interrupted: Modeling Deposit Withdrawals with Generative AI. SSRN Working Paper.
Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S. (2018). Human Decisions and Machine Predictions. The Quarterly Journal of Economics, 133(1), 237–293.
Kliger, D., & Gilad, D. (2012). Red Light, Green Light: Color Priming in Financial Decisions. The Journal of Socio-Economics, 41(5), 738–745.
Koralus, P., & Wang-Maścianica, V. (2023). Humans in Humans Out: On GPT Converging Toward Common Sense in Both Success and Failure. arXiv Preprint arXiv:2303.17276.
Lee, H., Seo, J., Park, S., Lee, J., Ahn, W., Choi, C., Lopez-Lira, A., & Lee, Y. (2025). Your AI, Not Your View: The Bias of LLMs in Investment Analysis. arXiv Preprint arXiv:2507.20957.
Li, C., Wang, H., Jiang, S., & Gu, B. (2024). The Effect of AI-Enabled Credit Scoring on Financial Inclusion: Evidence from an Underserved Population of over One Million. MIS Quarterly, 48(4), 1803–1834.
Liu, R., Geng, J., Peterson, J. C., Sucholutsky, I., & Griffiths, T. L. (2025). Large Language Models Assume People are More Rational than We Really Are. Proceedings of the Thirteenth International Conference on Learning Representations (ICLR).
Lopez-Lira, A. (2025). Can Large Language Models Trade? Testing Financial Theories with LLM Agents in Market Simulations. arXiv Preprint arXiv:2504.10789.
McElheran, K., Li, J. F., Brynjolfsson, E., Kroff, Z., Dinlersoz, E., Foster, L. S., & Zolas, N. (2023, October). AI Adoption in America: Who, What, and Where (Working Paper No. 31788). National Bureau of Economic Research.
Ross, J., Kim, Y., & Lo, A. W. (2024). LLM economicus? Mapping the Behavioral Biases of LLMs via Utility Theory. arXiv Preprint arXiv:2408.02784.
Scharfstein, D. S., & Stein, J. C. (1990). Herd Behavior and Investment. The American Economic Review, 80(3), 465–479.
Shiller, R. J. (2005). Irrational Exuberance (2nd ed.). Princeton University Press.
Wang, S., Zhang, S., Zhang, J., Hu, R., Li, X., Zhang, T., Li, J., Wu, F., Wang, G., & Hovy, E. (2024). Reinforcement Learning Enhanced LLMs: A Survey. arXiv Preprint arXiv:2412.10400v3.
Zarifhonarvar, A. (2024). Experimental Evidence on Large Language Models. Available at SSRN, 4825076.
Zhu, J.-Q., & Griffiths, T. L. (2024). Incoherent Probability Judgments in Large Language Models. arXiv Preprint arXiv:2401.16646.
Figures

Figure 1: Herding behavior and financial stability
The diagram shows how herding can lead to a financial stability event both when optimal and suboptimal. While suboptimal herding is the greatest concern from a financial stability perspective, optimal herding can build up financial vulnerabilities as well.
Herding: Investors disregard private information to follow market trends
Optimal herding: Rational imitation based on fundamentals → Acceleration of price correction, uncovering existing vulnerabilities → Increasing volatility, potential abrupt market movements
Suboptimal herding: Noise-driven imitation → Noise amplification, build-up of vulnerabilities → Self-fulfilling runs, contagion, and panic
Figure 2: Flow diagrams of experiments
The figure shows diagrams of the order of events for each session of the experiments under (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). The experiment is an adaptation of Cipriani and Guarino (2005, 2009) and is based on the Avery and Zemsky (1998) model.

(a) Treatment I
Fundamental value? 0 (50%) or 100 (50%)
(1) Trader observes a 70% accurate signal + trading history
(2) Traders decide to buy, sell, or not trade
(3) One trader is selected to trade (without replacement)
(4) The selected trader acts; Bayesian market maker updates price
Repeat for 8 trading periods

(b) Treatment II
Information event? Yes (15% probability): Fundamental value? 0 (50%) or 100 (50%). No: Fundamental value = 50
Informed trader? Yes (95% probability): continue with steps (1) to (4). No: Trader type = noise
(1) Trader observes a 70% accurate signal + trading history
(2) Traders decide to buy, sell, or not trade
(3) One trader is selected to trade (without replacement)
(4) The selected trader acts; Bayesian market maker updates price
Repeat for 8 trading periods

(c) Treatment III
Fundamental value? 0 (50%) or 100 (50%)
(1) Trader observes a 70% accurate signal + trading history
(2) Traders decide to buy, sell, or not trade
(3) One trader is selected to trade (without replacement)
(4) The selected trader acts; the price stays constant
Repeat for 8 trading periods
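The price-updating step (4) in Treatment I can be illustrated with a short sketch. It assumes, as a simplification that the figure does not spell out, that the market maker treats every buy as coming from a trader with a white signal and every sell as coming from a trader with a blue signal, and that a no-trade decision leaves the price unchanged. The function names are hypothetical and are not taken from the paper.

```python
SIGNAL_ACCURACY = 0.7   # probability that the signal matches the asset value
HIGH_VALUE = 100.0      # the asset is worth either 0 or 100


def update_belief(prob_high, action, accuracy=SIGNAL_ACCURACY):
    """Bayes update of P(value = 100) after one observed trade.

    Assumption: a BUY reveals a white signal and a SELL reveals a blue signal.
    """
    if action == "NO TRADE":
        return prob_high
    if action not in ("BUY", "SELL"):
        raise ValueError(f"unknown action: {action}")
    like_high = accuracy if action == "BUY" else 1.0 - accuracy
    like_low = 1.0 - like_high
    numerator = like_high * prob_high
    return numerator / (numerator + like_low * (1.0 - prob_high))


def price_path(actions, prior=0.5):
    """Return the sequence of prices E(v | history), starting at the prior."""
    prob_high = prior
    prices = [HIGH_VALUE * prob_high]
    for action in actions:
        prob_high = update_belief(prob_high, action)
        prices.append(HIGH_VALUE * prob_high)
    return prices
```

Under these assumptions, one buy moves the price from 50 to 70, and a buy followed by a sell brings it back to 50. In Treatment III the same loop would simply keep the price at its initial value.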
Figure 3: Price dynamics
The figure shows the price dynamics across trading periods for each treatment, averaged across LLMs in (a) Treatment I (without event uncertainty) and (b) Treatment II (with event uncertainty). Each line represents one of the four independent sessions. Following a buy or sell order, the price is updated by a Bayesian market maker given the trading history.
(a) Treatment I (b) Treatment II
Figure 4: Word clouds of LDA topics
The figure shows word clouds of each topic identified by the LDA method applied to the reasoning provided by all LLMs across all treatments. The number of topics is fixed to three; using more topics does not result in a larger number of distinct topics. Words are displayed in font sizes that correspond to their probability of appearing in the topic. Text color and direction carry no interpretation.
(a) Topic 0 (b) Topic 1 (c) Topic 2
Figure 5: Overview of main results: Fraction of rational or partial rational decisions
The figure shows the fractions of Rational (dark color) and Partial Rational (light color) decisions averaged across all sessions and trading periods in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). Rational behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. Partial Rational behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. Human decisions (shown in orange) are taken directly from Cipriani and Guarino (2009) for Treatment I and II. AI decisions (shown in blue) represent the average decisions across all LLMs in the baseline experiment (reported in Table 1), the Optimal AI experiment (reported in Table 4), and the signal relabeling experiments (reported in Table 6).
(a) Treatment I (b) Treatment II (c) Treatment III
Tables

Table 1: Trading behavior in AI and human laboratories
The table shows the distribution of decisions in the human and AI laboratories. Decisions are averaged across all sessions and trading periods in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Human” decisions are taken directly from Cipriani and Guarino (2009) for Treatment I and II. “AI” decisions represent the average decisions across all LLMs. The table also shows the results separately for each LLM. “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). While the exact decomposition of human cascade trading decisions is unknown, Cipriani and Guarino (2009) show that all types of decisions are present among human traders, as represented by + in the table. “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.

(a) Treatment I

| | Human | AI | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|---|---|
| Rational | 45.70% | 61.00% | 70.97% | 37.11% | 97.66% | 38.28% |
| Partial Rational | 19.60% | 29.48% | 11.29% | 45.31% | 0.00% | 61.33% |
| Cascade Trading | 19.00% | 9.42% | 17.74% | 17.58% | 2.34% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | + | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | + | 9.42% | 17.74% | 17.58% | 2.34% | 0.00% |
| Undetermined | + | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 12.30% | 0.10% | 0.00% | 0.00% | 0.00% | 0.39% |
| Error | 3.40% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |

(b) Treatment II

| | Human | AI | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|---|---|
| Rational | 50.90% | 97.36% | 100.00% | 100.00% | 100.00% | 89.45% |
| Partial Rational | 20.10% | 2.64% | 0.00% | 0.00% | 0.00% | 10.55% |
| Cascade Trading | 12.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding | + | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | + | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | + | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | + | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 16.50% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.05% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | + | 36.56% | 30.61% | 46.88% | 21.88% | 46.88% |

(c) Treatment III

| | AI | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|---|
| Rational | 99.65% | 99.38% | 99.22% | 100.00% | 100.00% |
| Partial Rational | 0.16% | 0.62% | 0.00% | 0.00% | 0.00% |
| Cascade Trading | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.20% | 0.00% | 0.78% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 38.36% | 50.31% | 34.38% | 34.38% | 34.38% |
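The top-level categories defined in the caption of Table 1 can be expressed as a small classifier over the pair of conditional actions (one for a white signal, one for a blue signal). This is only a sketch: the function names are hypothetical, the decomposition of cascade trading into herding and contrarian behavior is left out because it depends on the trade imbalance, and the "Unclassified" label for action pairs that the caption does not cover is an assumption of this example.

```python
from collections import Counter


def classify_decision(action_white, action_blue):
    """Map a pair of conditional actions to a top-level category of Table 1."""
    pair = (action_white, action_blue)
    if pair == ("BUY", "SELL"):
        return "Rational"
    if pair in {("BUY", "NO TRADE"), ("NO TRADE", "SELL")}:
        return "Partial Rational"
    if pair in {("BUY", "BUY"), ("SELL", "SELL")}:
        return "Cascade Trading"
    if pair == ("NO TRADE", "NO TRADE"):
        return "Cascade No Trading"
    if pair == ("SELL", "BUY"):
        return "Error"
    # Assumption: pairs not described in the caption get their own label.
    return "Unclassified"


def decision_shares(pairs):
    """Return the share of each category among a list of action pairs."""
    counts = Counter(classify_decision(white, blue) for white, blue in pairs)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}
```

For example, `decision_shares([("BUY", "SELL"), ("BUY", "BUY")])` assigns half of the decisions to "Rational" and half to "Cascade Trading".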
Table 2: LLM analysis of AI agent reasoning
The table shows LLM analysis of reasoning passages using the Claude 3.7 model. The model is prompted to read each passage of reasoning and answer the following five questions. Question 1: Is the trader comparing the price to the expected fundamental value of the asset? (True/False). Question 2: Is the expected value computed using only the signal accuracy and the signal, e.g., 0.7*100+0*0.3=70 or 0.7*0+0.3*100=30? (True/False). Question 3: Does the trader consider the market trend or the trading history in their reasoning? (True/False). Question 4: How does the trader characterize the attractiveness of the investment? Question 5: On a scale from 0-100 (where 100 represents purely emotional and 0 represents purely rational or logical), how much is the investor driven by emotions in their assessment? For “True/False” questions, the table reports the fraction of “True” answers. Panel (a) reports the analysis of reasoning provided by all LLMs and panel (b) reports analysis of reasoning in the Llama 3 model.

(a) AI

| | Treatment I | Treatment II | Treatment III |
|---|---|---|---|
| Question 1 | 99.16% | 99.01% | 99.67% |
| Question 2 | 63.07% | 63.09% | 63.51% |
| Question 3 | 17.05% | 9.50% | 24.12% |
| Question 4: VERY ATTRACTIVE | 11.69% | 1.88% | 0.71% |
| Question 4: ATTRACTIVE | 42.83% | 69.39% | 68.33% |
| Question 4: REASONABLE | 12.65% | 6.08% | 7.33% |
| Question 4: LESS ATTRACTABLE | 10.36% | 3.65% | 5.69% |
| Question 4: NO INCENTIVE | 22.17% | 19.01% | 17.89% |
| Question 5: Mean | 6.39% | 4.93% | 5.62% |
| Question 5: Bottom decile | 0.00% | 0.00% | 0.00% |
| Question 5: Median | 0.00% | 0.00% | 0.00% |
| Question 5: Top decile | 20.00% | 15.00% | 20.00% |

(b) Llama 3

| | Treatment I | Treatment II | Treatment III |
|---|---|---|---|
| Question 1 | 97.46% | 100.00% | 99.61% |
| Question 2 | 13.87% | 4.49% | 4.30% |
| Question 3 | 40.62% | 30.66% | 66.60% |
| Question 4: VERY ATTRACTIVE | 1.56% | 1.56% | 0.39% |
| Question 4: ATTRACTIVE | 69.92% | 83.79% | 74.61% |
| Question 4: REASONABLE | 13.48% | 4.88% | 4.49% |
| Question 4: LESS ATTRACTABLE | 10.16% | 5.66% | 8.98% |
| Question 4: NO INCENTIVE | 4.69% | 4.10% | 11.52% |
| Question 5: Mean | 14.53% | 12.72% | 16.65% |
| Question 5: Bottom decile | 5.00% | 5.00% | 10.00% |
| Question 5: Median | 15.00% | 10.00% | 15.00% |
| Question 5: Top decile | 25.00% | 20.00% | 25.00% |
Table 3: LDA topic analysis of AI agent reasoning
The table shows the distribution of reasoning passages across LDA topics. The number of topics is fixed to three; using more topics does not result in a larger number of distinct topics. The word clouds associated with the topics are shown in Figure 4. Panel (a) reports the analysis of reasoning provided by all LLMs and panel (b) reports analysis of reasoning in the Llama 3 model.

(a) AI

| | Treatment I | Treatment II | Treatment III |
|---|---|---|---|
| Topic 0 | 44.22% | 51.93% | 51.70% |
| Topic 1 | 26.81% | 21.27% | 20.40% |
| Topic 2 | 28.98% | 26.80% | 27.90% |

(b) Llama 3

| | Treatment I | Treatment II | Treatment III |
|---|---|---|---|
| Topic 0 | 1.95% | 0.20% | 0.59% |
| Topic 1 | 6.84% | 9.57% | 0.59% |
| Topic 2 | 91.21% | 90.23% | 98.83% |
Table 4: Trading behavior in optimal AI laboratory
The table shows the distribution of decisions in the optimal AI laboratory in which LLMs are prompted with guidance on optimal decision making. Decisions are averaged across all sessions and trading periods in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Optimal AI” decisions represent the average decisions across all LLMs. The table also shows the results separately for each LLM. “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.

(a) Treatment I

| | Optimal AI | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|---|
| Rational | 55.88% | 48.15% | 52.34% | 95.31% | 27.73% |
| Partial Rational | 40.60% | 51.85% | 38.28% | 0.00% | 72.27% |
| Cascade Trading | 3.52% | 0.00% | 9.38% | 4.69% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 3.52% | 0.00% | 9.38% | 4.69% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |

(b) Treatment II

| | Optimal AI | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|---|
| Rational | 18.65% | 37.09% | 12.50% | 12.50% | 12.50% |
| Partial Rational | 21.88% | 0.00% | 0.00% | 0.00% | 87.50% |
| Cascade Trading | 59.48% | 62.91% | 87.50% | 87.50% | 0.00% |
| Optimal Herding | 47.43% | 39.74% | 75.00% | 75.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 6.60% | 13.91% | 6.25% | 6.25% | 0.00% |
| Undetermined | 5.44% | 9.27% | 6.25% | 6.25% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 81.52% | 63.58% | 87.50% | 87.50% | 87.50% |

(c) Treatment III

| | Optimal AI | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|---|
| Rational | 51.05% | 45.60% | 31.25% | 35.55% | 91.80% |
| Partial Rational | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade Trading | 48.95% | 54.40% | 68.75% | 64.45% | 8.20% |
| Optimal Herding | 44.36% | 53.60% | 59.38% | 56.25% | 8.20% |
| Suboptimal Herding | 4.01% | 0.80% | 7.03% | 8.20% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.59% | 0.00% | 2.34% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 50.90% | 53.60% | 59.38% | 56.25% | 34.38% |
Table 5: Expected Payoffs
The table shows descriptive statistics of expected payoffs in the AI and Optimal AI laboratories. In the Optimal AI laboratory, LLMs are prompted with guidance on optimal decision making. Expected payoffs are computed as $E(v \mid s_t, h_t) - p_t$ if the agent decides to buy the asset, $p_t - E(v \mid s_t, h_t)$ if the agent decides to sell the asset, and zero otherwise. Expected payoffs are averaged across all sessions, trading periods, and LLMs before computing the statistics.

| | Treatment I: AI | Treatment I: Optimal AI | Treatment II: AI | Treatment II: Optimal AI | Treatment III: AI | Treatment III: Optimal AI |
|---|---|---|---|---|---|---|
| Mean | 2.57 | 2.72 | 3.80 | 14.95 | 5.07 | 7.79 |
| Median | 2.74 | 2.74 | 6.67 | 19.53 | 6.67 | 11.49 |
| Min | -6.67 | -6.67 | -11.44 | -28.28 | -16.19 | -16.19 |
| Max | 6.67 | 6.67 | 11.55 | 28.35 | 16.46 | 16.63 |
| Std Dev | 3.90 | 3.57 | 6.47 | 14.20 | 8.83 | 7.87 |
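The expected-payoff formula in the caption of Table 5 can be sketched as follows. The sketch takes the trader's belief about the history, P(v = 100 | h_t), as an input and combines it with the 70% accurate private signal by Bayes' rule. How that public belief is obtained in each treatment is not specified here, and the function names are hypothetical. Payoffs are expressed in the same units as the price, with no currency conversion applied.

```python
SIGNAL_ACCURACY = 0.7
HIGH_VALUE = 100.0


def expected_value(prob_high, signal, accuracy=SIGNAL_ACCURACY):
    """E(v | s_t, h_t), given P(v = 100 | h_t) and a white or blue signal."""
    if signal not in ("white", "blue"):
        raise ValueError(f"unknown signal: {signal}")
    like_high = accuracy if signal == "white" else 1.0 - accuracy
    like_low = 1.0 - like_high
    numerator = like_high * prob_high
    posterior = numerator / (numerator + like_low * (1.0 - prob_high))
    return HIGH_VALUE * posterior


def expected_payoff(action, price, prob_high, signal):
    """E(v|s,h) - p for a buy, p - E(v|s,h) for a sell, zero otherwise."""
    value = expected_value(prob_high, signal)
    if action == "BUY":
        return value - price
    if action == "SELL":
        return price - value
    return 0.0
```

With a public belief of 0.5 and a price of 50, buying on a white signal has an expected payoff of 20 in this sketch, and so does selling on a blue signal.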
Table 6: Trading behavior in AI laboratory with different types of signals
The table shows the distribution of decisions in the AI laboratory when varying the color used to code the private information signals. The white and blue signals are replaced, respectively, by (i) green and red, and (ii) red and green. Decisions are averaged across all sessions, trading periods, and LLMs in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.

(a) Treatment I

| | Good: Green, Bad: Red | Good: Red, Bad: Green |
|---|---|---|
| Rational | 54.83% | 20.23% |
| Partial Rational | 37.55% | 42.76% |
| Cascade Trading | 7.62% | 6.64% |
| Optimal Herding | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 3.52% |
| Contrarian | 7.62% | 0.00% |
| Undetermined | 0.00% | 3.12% |
| Cascade No Trading | 0.00% | 5.18% |
| Error | 0.00% | 25.20% |
| Optimal Herding Opportunities | 0.00% | 0.00% |

(b) Treatment II

| | Good: Green, Bad: Red | Good: Red, Bad: Green |
|---|---|---|
| Rational | 98.54% | 50.78% |
| Partial Rational | 1.46% | 11.72% |
| Cascade Trading | 0.00% | 12.50% |
| Optimal Herding | 0.00% | 7.32% |
| Suboptimal Herding | 0.00% | 1.56% |
| Contrarian | 0.00% | 0.00% |
| Undetermined | 0.00% | 3.61% |
| Cascade No Trading | 0.00% | 0.00% |
| Error | 0.00% | 25.00% |
| Optimal Herding Opportunities | 52.94% | 42.93% |

(c) Treatment III

| | Good: Green, Bad: Red | Good: Red, Bad: Green |
|---|---|---|
| Rational | 99.64% | 23.24% |
| Partial Rational | 0.00% | 25.00% |
| Cascade Trading | 0.36% | 25.18% |
| Optimal Herding | 0.36% | 18.82% |
| Suboptimal Herding | 0.00% | 3.14% |
| Contrarian | 0.00% | 0.18% |
| Undetermined | 0.00% | 3.04% |
| Cascade No Trading | 0.00% | 0.00% |
| Error | 0.00% | 26.58% |
| Optimal Herding Opportunities | 56.61% | 56.75% |
Prompts

Prompt 1: System prompt
This prompt describes the instructions of the experiment, which is given to the LLMs through their system prompt.

You are participating in an experiment at the Experimental Laboratory of the ELSE Centre at the Department of Economics at UCL. The instructions given for the laborary experiment are as follows:

There are a total of 8 participants in this experiment. Everyone is receiving the same instructions.

In the experiment, you can exchange one unit of an asset with a computerized market maker. You and the other participants will make trading decisions through 8 sequential rounds. In each round, only one participant will be selected to trade. Each participant can only trade once.

In each round, the market maker sets the price of the asset as the expected value of the fundamental value of the asset, conditional on the history of the trades from the previous rounds.

[if treatment==2: {The market maker will update the price as if, with high probability, it were trading not with informed traders, but with noise traders.}]

The fundamental value of the asset is a discrete random variable that can take values 0 or 100, each with a 50% probability. You do not know the fundamental value of the asset, but you may receive a signal (white or blue) on the value. If the asset value is 100, you receive a white signal with 70% probability and a blue signal with 30% probability. If the value is 0, you receive a white signal with 30% probability and a blue signal with 70% probability.

You will be making decisions on whether to buy or sell one unit of the asset at a given price, or not to trade given respectively a white and a blue signal. The realized signal will only be revealed to you if you are selected to trade. After each round, the computer will randomly select a participant whose trade gets executed. That participant receives the realized signal. The remaining participants then observe the executed trading decision (buy, sell, or no trade), but do not receive the realized signal. They also do not observe the identity of the selected participant. The procedure continues for 8 rounds until all participants have acted once. All participants (including those whose decision has already been executed) observe the trading decisions in each period and the corresponding price movement.

After 8 rounds, the asset value is revealed, and each participant receives a payoff computed based on the trading decision and price in the round in which the participant was selected and the asset value v.

Payoffs are computed in a fictitious experimental currency called lira. If the participant sold the asset at price p, the payoff is p-v lire. If the participant bought the asset at price p, the payoff is v-p lire. If the participant decided not to trade, the payoff is zero lire. At the end of the experiment, the payoffs are added up and converted into British pounds at the rate of 3 lire per pound. In addition, you are paid 70 pounds for participating in the experiment, regardless of your payoff.
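The payoff rule stated at the end of Prompt 1 amounts to a few lines of arithmetic. The sketch below is illustrative only: the function and parameter names are hypothetical, and the conversion rate is written as currency units per lira so that a worthless lira corresponds to a rate of zero.

```python
def realized_payoff_lire(action, price, value):
    """Payoff in lire: v - p for a buy, p - v for a sell, zero for no trade."""
    if action == "BUY":
        return value - price
    if action == "SELL":
        return price - value
    if action == "NO TRADE":
        return 0.0
    raise ValueError(f"unknown action: {action}")


def total_payment(action, price, value,
                  currency_per_lira=1.0 / 3.0, participation_fee=70.0):
    """Fixed participation fee plus the converted trading payoff.

    The defaults mirror the baseline prompt: 3 lire per pound and 70 pounds.
    """
    payoff = realized_payoff_lire(action, price, value)
    return participation_fee + payoff * currency_per_lira
```

For instance, `total_payment("BUY", 50, 100)` gives 70 plus 50/3 pounds, while setting `currency_per_lira=0.0` leaves only the participation fee.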
Prompt 2: User prompt in AI laboratory
This prompt describes the instructions given to the LLMs in each trading period to each agent j. The HISTORY input consists of the executed trades of selected traders along with the history of actions and reasoning for agent j in all previous periods. In addition to this user prompt, the LLMs have available the instructions through the system prompt, see Prompt 1.

This is round [TRADING PERIOD (t)].
[HISTORY]
If you receive a white signal, will you buy, sell, or not trade at a price of [PRICE]?
If you receive a blue signal, will you buy, sell, or not trade at a price of [PRICE]?
Please make sure that you provide your response in the following format:
{
    "actionWhite": "BUY/SELL/NO TRADE at the price of [PRICE] conditional on observing a white signal",
    "actionBlue": "BUY/SELL/NO TRADE at the price of [PRICE] conditional on observing a blue signal",
    "reasoningWhite": "Brief explanation of your decision conditional on observing a white signal (1-2 sentences) ",
    "reasoningBlue": "Brief explanation of your decision conditional on observing a blue signal (1-2 sentences)"
}
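A response in the format requested by Prompt 2 could be turned into a pair of actions along the following lines. This is a sketch rather than the authors' pipeline: it assumes the model returns a bare JSON object whose action strings begin with BUY, SELL, or NO TRADE, and the helper names are hypothetical.

```python
import json

# "NO TRADE" is listed first only for readability; the prefixes do not overlap.
VALID_ACTIONS = ("NO TRADE", "BUY", "SELL")


def parse_action(text):
    """Extract BUY, SELL, or NO TRADE from the start of an action string."""
    cleaned = text.strip().upper()
    for action in VALID_ACTIONS:
        if cleaned.startswith(action):
            return action
    raise ValueError(f"cannot parse action from: {text!r}")


def parse_response(raw):
    """Parse the JSON reply into actions and reasoning for both signals."""
    data = json.loads(raw)
    return {
        "white": parse_action(data["actionWhite"]),
        "blue": parse_action(data["actionBlue"]),
        "reasoning_white": data.get("reasoningWhite", ""),
        "reasoning_blue": data.get("reasoningBlue", ""),
    }
```

In practice, replies that wrap the JSON in extra text would need additional handling, which this sketch does not attempt.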
Prompt 3: User prompt in optimal AI laboratory

This prompt describes the instructions given to the LLMs in each trading period to each agent j in the optimal AI laboratory. The HISTORY input consists of the executed trades of selected traders along with the history of actions and reasoning for agent j in all previous periods. In addition to this user prompt, the LLMs have available the instructions through the system prompt, see Prompt 1.

This is round [TRADING PERIOD (t)].
if TRADING PERIOD (t)==1:
    Note that given current conditions, it is optimal to buy given a white signal and sell given a blue signal.
else:
    [HISTORY]
    if expected_value_trader_white > price and expected_value_trader_blue > price:
        Note that given current conditions, it is optimal to follow the herd and buy regardless of the signal.
    if expected_value_trader_white < price and expected_value_trader_blue < price:
        Note that given current conditions, it is optimal to follow the herd and sell regardless of the signal.
If you receive a white signal, will you buy, sell, or not trade at a price of [PRICE]?
If you receive a blue signal, will you buy, sell, or not trade at a price of [PRICE]?
Please make sure that you provide your response in the following format:
{
  "actionWhite": "BUY/SELL/NO TRADE at the price of [PRICE] conditional on observing a white signal",
  "actionBlue": "BUY/SELL/NO TRADE at the price of [PRICE] conditional on observing a blue signal",
  "reasoningWhite": "Brief explanation of your decision conditional on observing a white signal (1-2 sentences)",
  "reasoningBlue": "Brief explanation of your decision conditional on observing a blue signal (1-2 sentences)"
}
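The conditional guidance in Prompt 3 can be written as a small runnable function. The sketch below assumes the two expected values (`expected_value_trader_white` and `expected_value_trader_blue` in the prompt) are computed elsewhere and passed in; the function name `optimal_action_hint` and the choice to return `None` when no note applies are illustrative assumptions.

```python
from typing import Optional

PREFIX = "Note that given current conditions, it is optimal to "


def optimal_action_hint(period: int, ev_white: float, ev_blue: float,
                        price: float) -> Optional[str]:
    """Return the guidance sentence added to the user prompt, or None if no note applies."""
    if period == 1:
        return PREFIX + "buy given a white signal and sell given a blue signal."
    if ev_white > price and ev_blue > price:
        # Both signal-contingent valuations exceed the price: buying is optimal either way.
        return PREFIX + "follow the herd and buy regardless of the signal."
    if ev_white < price and ev_blue < price:
        # Both valuations fall below the price: selling is optimal either way.
        return PREFIX + "follow the herd and sell regardless of the signal."
    return None
```

When the price lies between the two valuations, neither condition holds and the prompt as written adds no note after the first period.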
Prompt 4: System prompt personal characteristics add-on

This prompt describes an add-on to the system prompt that provides characteristics of the AI agent. The characteristics are drawn randomly from the unconditional distributions of human participant characteristics reported in Cipriani and Guarino (2009) restricted according to a set of heuristics to ensure realistic personas.

You are a [AGE]-year old [GENDER]. You work as a [OCCUPATION] and you have [TENURE] years of tenure. You have a [EDUCATION LEVEL] degree in [EDUCATION FIELD]. Respond in way that is consistent with the knowledge and expected behavior of a person with these characteristics.
Appendices

A. Robustness to temperature

Table 7: Trading behavior in AI laboratory with different model temperature settings

The table shows the distribution of decisions in the AI laboratory with varying temperatures for the Claude 3.5, Llama 3, and Nova Pro models (the temperature is fixed at one in the Claude 3.7 reasoning model). Decisions are averaged across all sessions, trading periods, and LLMs in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.

(a) Treatment I

| | T=0.0 | T=0.7 (baseline) | T=1.0 |
|---|---|---|---|
| Rational | 58.95% | 61.00% | 66.28% |
| Partial Rational | 32.22% | 29.48% | 27.14% |
| Cascade Trading | 8.83% | 9.42% | 6.58% |
| Optimal Herding | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 8.83% | 9.42% | 6.58% |
| Undetermined | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.10% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% |
(b) Treatment II

| | T=0.0 | T=0.7 (baseline) | T=1.0 |
|---|---|---|---|
| Rational | 97.27% | 97.36% | 88.48% |
| Partial Rational | 2.73% | 2.64% | 11.52% |
| Cascade Trading | 0.00% | 0.00% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 41.25% | 36.56% | 45.15% |

(c) Treatment III

| | T=0.0 | T=0.7 (baseline) | T=1.0 |
|---|---|---|---|
| Rational | 99.84% | 99.65% | 99.84% |
| Partial Rational | 0.16% | 0.16% | 0.16% |
| Cascade Trading | 0.00% | 0.00% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.20% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 55.55% | 38.36% | 50.86% |
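The top-level behavior categories used in Table 7 and the later tables are defined on the pair of actions an agent reports for the white and the blue signal. A minimal sketch of that mapping follows. The function name `classify` is hypothetical, and the "Unclassified" label for action pairs that the table notes do not mention (for example, sell on white and no trade on blue) is an assumption of this sketch. The further split of cascade trading into optimal herding, suboptimal herding, contrarian, and undetermined requires the trade imbalance and the optimality condition, and is not attempted here.

```python
VALID_ACTIONS = {"BUY", "SELL", "NO TRADE"}


def classify(action_white: str, action_blue: str) -> str:
    """Map signal-contingent actions to the categories described in the table notes."""
    w = action_white.strip().upper()
    b = action_blue.strip().upper()
    if w not in VALID_ACTIONS or b not in VALID_ACTIONS:
        raise ValueError(f"unknown action pair: {(action_white, action_blue)!r}")
    if (w, b) == ("BUY", "SELL"):
        return "Rational"
    if (w, b) in {("BUY", "NO TRADE"), ("NO TRADE", "SELL")}:
        return "Partial Rational"
    if w == b and w in {"BUY", "SELL"}:
        return "Cascade Trading"
    if (w, b) == ("NO TRADE", "NO TRADE"):
        return "Cascade No Trading"
    if (w, b) == ("SELL", "BUY"):
        return "Error"
    # Pairs not covered by the table notes; this label is an assumption of the sketch.
    return "Unclassified"
```

Under these definitions, an agent that buys on white and abstains on blue counts as "Partial Rational", while one that buys on either signal counts as "Cascade Trading".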
B. Additional results

Table 8: LLM color connotations

The table shows how each LLM “thinks” about the colors (a) white, (b) blue, (c) green, and (d) red in the context of financial analysis. The prompt used to generate these responses is: “You are a financial analyst. What do you think a signal color-coded [color] says about market conditions? Please respond in a 1-2 sentences.”

(a) White

| Model | Connotation |
|---|---|
| Claude 3.7 | In financial analysis, a white signal typically indicates neutral market conditions, suggesting neither bullish nor bearish sentiment predominates. This color code often represents a period of stability or consolidation where prices are moving sideways, prompting analysts to watch for developing patterns before making strong directional recommendations. |
| Claude 3.5 | In financial analysis, a white signal typically indicates neutral or unclear market conditions. It suggests that there is no strong trend or definitive direction in the market, and investors may need to exercise caution and gather more information before making decisions. |
| Llama 3 | As a financial analyst, I would interpret a signal color-coded white as a neutral or calm market condition, indicating a lack of strong trends or volatility, and suggesting that the market is consolidating or trading within a narrow range. This could be a sign that investors are awaiting new information or catalysts to drive market direction. |
| Nova Pro | A white signal typically indicates neutral or stable market conditions with no significant bullish or bearish trends. |
(b) Blue

| Model | Connotation |
|---|---|
| Claude 3.7 | In financial analysis, a blue signal typically indicates stable or neutral market conditions, suggesting a period of consolidation or sideways movement rather than significant bullish or bearish trends. This often represents a time when investors might want to maintain current positions while waiting for clearer directional indicators to emerge. |
| Claude 3.5 | In financial analysis, a blue signal typically indicates stable or neutral market conditions. It suggests that the market is neither particularly bullish nor bearish, and investors might expect a period of consolidation or steady performance. |
| Llama 3 | A signal color-coded blue typically indicates a bullish or positive market signal, suggesting that market conditions are favorable and prices are likely to rise. This could be due to strong economic indicators, increasing demand, or a decrease in volatility, prompting investors to take a more optimistic view of the market. |
| Nova Pro | A blue signal typically indicates stability or a neutral market condition, suggesting no significant upward or downward trends. |

(c) Green

| Model | Connotation |
|---|---|
| Claude 3.7 | A green signal typically indicates positive market conditions, suggesting upward trends, growth opportunities, or favorable economic indicators. Investors often interpret green signals as a time to consider buying or maintaining positions in the market. |
| Claude 3.5 | A green signal in financial markets typically indicates positive or bullish conditions. It suggests that prices are rising, investor sentiment is optimistic, and it may be a favorable time to buy or hold assets. |
| Llama 3 | A signal color-coded green typically indicates a bullish market sentiment, suggesting that market conditions are favorable and trending upward, with prices likely to continue rising. This can be a sign of strong economic growth, low volatility, and increased investor confidence, making it a good time to consider buying or holding onto stocks. |
| Nova Pro | A green signal typically indicates a bullish market, suggesting positive momentum and rising asset prices. |
(d) Red

| Model | Connotation |
|---|---|
| Claude 3.7 | In financial analysis, a red signal typically indicates negative market conditions such as declining prices, bearish sentiment, or heightened risk factors. This warning sign suggests investors should exercise caution and potentially reassess their positions to mitigate downside exposure. |
| Claude 3.5 | A signal color-coded red in financial markets typically indicates negative or bearish conditions. It often suggests a downturn, loss, or potential risk in the market, prompting investors and analysts to exercise caution or consider defensive strategies. |
| Llama 3 | A signal color-coded red typically indicates a bearish or cautionary signal, suggesting that market conditions are unfavorable or volatile, and investors should exercise caution or consider taking defensive measures to mitigate potential losses. It may also signal a potential downturn or correction in the market, prompting analysts and investors to reassess their positions and strategies. |
| Nova Pro | A red signal typically indicates bearish market conditions, suggesting a downward trend in asset prices. |
Table 9: Trading behavior in AI laboratory with different types of signals by LLM

The table shows the distribution of decisions in the AI laboratory for each LLM separately when varying the color used to code the private information signals. Respectively the white and blue signals are replaced by (i) green and red, and (ii) red and green. Decisions are averaged across all sessions and trading periods in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.
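(a) Treatment I

Good: Green, Bad: Red

| | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|
| Rational | 45.86% | 35.55% | 100.00% | 37.89% |
| Partial Rational | 34.59% | 53.52% | 0.00% | 62.11% |
| Cascade Trading | 19.55% | 10.94% | 0.00% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 19.55% | 10.94% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% | 0.00% |

Good: Red, Bad: Green

| | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|
| Rational | 63.33% | 0.00% | 16.80% | 0.78% |
| Partial Rational | 36.67% | 0.00% | 46.88% | 87.50% |
| Cascade Trading | 0.00% | 0.00% | 26.56% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 14.06% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 12.50% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 8.98% | 11.72% |
| Error | 0.00% | 100.00% | 0.78% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% | 0.00% |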
(b) Treatment II

Good: Green, Bad: Red

| | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|
| Rational | 100.00% | 100.00% | 100.00% | 94.14% |
| Partial Rational | 0.00% | 0.00% | 0.00% | 5.86% |
| Cascade Trading | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 49.28% | 46.88% | 46.88% | 68.75% |

Good: Red, Bad: Green

| | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|
| Rational | 100.00% | 0.00% | 3.12% | 100.00% |
| Partial Rational | 0.00% | 0.00% | 46.88% | 0.00% |
| Cascade Trading | 0.00% | 0.00% | 50.00% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 29.30% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 6.25% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 14.45% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 100.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 31.08% | 40.62% | 59.38% | 40.62% |
(c) Treatment III

Good: Green, Bad: Red

| | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|
| Rational | 98.57% | 100.00% | 100.00% | 100.00% |
| Partial Rational | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade Trading | 1.43% | 0.00% | 0.00% | 0.00% |
| Optimal Herding | 1.43% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 51.43% | 53.12% | 53.12% | 68.75% |

Good: Red, Bad: Green

| | Claude 3.7 | Claude 3.5 | Llama 3 | Nova Pro |
|---|---|---|---|---|
| Rational | 92.96% | 0.00% | 0.00% | 0.00% |
| Partial Rational | 0.00% | 0.00% | 0.00% | 100.00% |
| Cascade Trading | 0.70% | 0.00% | 100.00% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 75.29% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 12.55% | 0.00% |
| Contrarian | 0.70% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 12.16% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 6.34% | 100.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 48.59% | 50.00% | 75.29% | 53.12% |
Table 10: Trading behavior of AI agents with personal profiles

The table shows the distribution of decisions in the AI laboratory when endowing LLMs with personal profiles (“Human,” “Professional Trader,” “Robo-Advisor,” and “Rational”) or with personal characteristics drawn from the unconditional distributions of human participants from Cipriani and Guarino (2009) subject to realistic constraints (“C&G Characteristics”). Decisions are averaged across all sessions, trading periods, and LLMs in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.

(a) Treatment I

| | Human | Professional Trader | Robo-Advisor | Rational | C&G Characteristics |
|---|---|---|---|---|---|
| Rational | 89.68% | 67.30% | 54.21% | 59.22% | 59.35% |
| Partial Rational | 7.69% | 29.51% | 37.41% | 30.66% | 31.30% |
| Cascade Trading | 2.63% | 2.32% | 5.93% | 7.88% | 9.35% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 2.63% | 2.32% | 5.93% | 7.88% | 9.35% |
| Undetermined | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.88% | 2.44% | 2.25% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
(b) Treatment II

| | Human | Professional Trader | Robo-Advisor | Rational | C&G Characteristics |
|---|---|---|---|---|---|
| Rational | 93.93% | 96.97% | 98.24% | 93.77% | 97.36% |
| Partial Rational | 5.97% | 3.03% | 0.59% | 4.98% | 2.64% |
| Cascade Trading | 0.10% | 0.00% | 0.49% | 1.15% | 0.00% |
| Optimal Herding | 0.10% | 0.00% | 0.39% | 0.10% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 1.06% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.10% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.68% | 0.10% | 0.00% |
| Optimal Herding Opportunities | 46.26% | 35.68% | 44.13% | 49.98% | 72.00% |

(c) Treatment III

| | Human | Professional Trader | Robo-Advisor | Rational | C&G Characteristics |
|---|---|---|---|---|---|
| Rational | 93.85% | 100.00% | 100.00% | 98.77% | 99.90% |
| Partial Rational | 5.66% | 0.00% | 0.00% | 0.77% | 0.10% |
| Cascade Trading | 0.00% | 0.00% | 0.00% | 0.46% | 0.00% |
| Optimal Herding | 0.00% | 0.00% | 0.00% | 0.46% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.49% | 0.00% | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 48.15% | 42.39% | 35.51% | 51.93% | 66.14% |
Table 11: Trading behavior in AI laboratory with different payoffs

The table shows the distribution of decisions in the AI laboratory when varying the description of payoff in the prompt. First, payoffs are assumed to be worthless by setting a zero exchange rate between GBP and lire. Next, the stakes are increased by imposing a one million GBP per lire exchange rate. Finally, GBP is replaced by USD. Decisions are averaged across all sessions, trading periods, and LLMs in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.

(a) Treatment I

| | 0 GBP per lire | 1M GBP per lire | 3 lire per USD |
|---|---|---|---|
| Rational | 50.59% | 52.33% | 48.87% |
| Partial Rational | 41.13% | 35.46% | 39.93% |
| Cascade Trading | 7.88% | 12.21% | 8.07% |
| Optimal Herding | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 7.88% | 12.21% | 8.07% |
| Undetermined | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.39% | 0.00% | 3.12% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% |
(b) Treatment II

| | 0 GBP per lire | 1M GBP per lire | 3 lire per USD |
|---|---|---|---|
| Rational | 97.27% | 95.21% | 97.07% |
| Partial Rational | 2.73% | 3.91% | 2.93% |
| Cascade Trading | 0.00% | 0.88% | 0.00% |
| Optimal Herding | 0.00% | 0.39% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.49% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 39.04% | 34.49% | 43.50% |

(c) Treatment III

| | 0 GBP per lire | 1M GBP per lire | 3 lire per USD |
|---|---|---|---|
| Rational | 99.90% | 99.65% | 99.21% |
| Partial Rational | 0.10% | 0.35% | 0.16% |
| Cascade Trading | 0.00% | 0.00% | 0.63% |
| Optimal Herding | 0.00% | 0.00% | 0.63% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.00% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 35.08% | 37.67% | 62.95% |
Table 12: Trading behavior in AI laboratory with prolonged experiments

The table shows the distribution of decisions in the AI laboratory when varying the length of the experiment (number of trading periods and number of independent sessions). Decisions are averaged across all sessions, trading periods, and LLMs in (a) Treatment I (without event uncertainty), (b) Treatment II (with event uncertainty), and (c) Treatment III (without price updating). “Rational” behavior represents cases where the informed trader chooses to buy upon receiving a white signal and sell upon receiving a blue signal. “Partial Rational” behavior represents cases where the informed trader chooses to buy (sell) upon receiving a white (blue) signal and not trade upon receiving the other signal. “Cascade Trading” represents cases where the informed trader chooses the same trading action (buy or sell) regardless of the private signal. These decisions are decomposed into “Optimal Herding”, “Suboptimal Herding”, “Contrarian” behavior, and cases where the trade imbalance is zero (“Undetermined”). “Cascade No Trading” represents cases where the informed trader chooses not to trade regardless of the private signal. “Error” represents cases where the informed trader chooses to buy upon receiving a blue signal and sell upon receiving a white signal. The table also reports the frequency of trading periods where herding is optimal.

(a) Treatment I

| | Baseline (4 sessions of 8 rounds) | 10 sessions of 8 rounds | 4 sessions of 20 rounds |
|---|---|---|---|
| Rational | 61.00% | 45.08% | 52.69% |
| Partial Rational | 29.48% | 32.28% | 33.23% |
| Cascade Trading | 9.42% | 20.26% | 11.53% |
| Optimal Herding | 0.00% | 0.00% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 9.42% | 20.26% | 11.53% |
| Undetermined | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.10% | 2.38% | 2.54% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 0.00% | 0.00% | 0.00% |
(b) Treatment II

| | Baseline (4 sessions of 8 rounds) | 10 sessions of 8 rounds | 4 sessions of 20 rounds |
|---|---|---|---|
| Rational | 97.36% | 89.43% | 94.45% |
| Partial Rational | 2.64% | 6.48% | 5.55% |
| Cascade Trading | 0.00% | 4.04% | 0.00% |
| Optimal Herding | 0.00% | 0.33% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 3.67% | 0.00% |
| Undetermined | 0.00% | 0.04% | 0.00% |
| Cascade No Trading | 0.00% | 0.03% | 0.00% |
| Error | 0.00% | 0.02% | 0.00% |
| Optimal Herding Opportunities | 36.56% | 65.73% | 37.19% |

(c) Treatment III

| | Baseline (4 sessions of 8 rounds) | 10 sessions of 8 rounds | 4 sessions of 20 rounds |
|---|---|---|---|
| Rational | 99.65% | 99.82% | 99.81% |
| Partial Rational | 0.16% | 0.03% | 0.19% |
| Cascade Trading | 0.00% | 0.15% | 0.00% |
| Optimal Herding | 0.00% | 0.15% | 0.00% |
| Suboptimal Herding | 0.00% | 0.00% | 0.00% |
| Contrarian | 0.00% | 0.00% | 0.00% |
| Undetermined | 0.00% | 0.00% | 0.00% |
| Cascade No Trading | 0.20% | 0.00% | 0.00% |
| Error | 0.00% | 0.00% | 0.00% |
| Optimal Herding Opportunities | 38.36% | 73.38% | 44.86% |
Cite this document
Anne Lundgaard Hansen and Seung Jung Lee (2025). Financial Stability Implications of Generative AI: Taming the Animal Spirits (FEDS 2025-090). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2025-090
@techreport{wtfs_feds_2025_090,
author = {Anne Lundgaard Hansen and Seung Jung Lee},
title = {Financial Stability Implications of Generative AI: Taming the Animal Spirits},
type = {Finance and Economics Discussion Series},
number = {2025-090},
institution = {Board of Governors of the Federal Reserve System},
year = {2025},
url = {https://whenthefedspeaks.com/doc/feds_2025-090},
abstract = {This paper investigates the impact of the adoption of generative AI on financial stability. We conduct laboratory-style experiments using large language models to replicate classic studies on herd behavior in investment decisions. Our results show that AI agents make more rational decisions than humans, relying predominantly on private information over market trends. Increased reliance on AI-powered investment advice could therefore potentially lead to fewer asset price bubbles arising from animal spirits that trade by following the herd. However, exploring variations in the experimental settings reveals that AI agents can be induced to herd optimally when explicitly guided to make profit-maximizing decisions. While optimal herding improves market discipline, this behavior still carries potential implications for financial stability. In other experimental variations, we show that AI agents are not purely algorithmic, but have inherited some elements of human conditioning and bias.},
}