UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning. Tarun Gupta¹, Anuj Mahajan¹, Bei Peng¹, Wendelin Böhmer², Shimon Whiteson¹. Abstract: factorization, the joint action value function can be decentrally m...
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. Anuj Mahajan¹, Mikayel Samvelyan², Lei Mao³, Viktor Makoviychuk³, Animesh Garg³, Jean Kossaifi³, Shimon Whiteson¹, Yuke Zhu³, Animashree Anandkumar³. Abstract: arise...
Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing. Filippos Christianos¹, Georgios Papoudakis¹, Arrasy Rahman¹, Stefano V. Albrecht¹. Abstract: (e.g. (Gupta et al., 2017)) whereby agents share some or all p...
Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot. Joel Z. Leibo¹, Edgar Duéñez-Guzmán¹, Alexander Sasha Vezhnevets¹, John P. Agapiou¹, Peter Sunehag¹, Raphael Koster¹, Jayd Matyas¹, Charles Beattie¹, Ig...
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning. Shariq Iqbal¹, Christian A. Schroeder de Witt², Bei Peng², Wendelin Böhmer³, Shimon Whiteson², Fei Sha¹,⁴. Abstract: Figure 1: Breakaway sub-scenario in so...
Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers. Luke Marris¹,², Paul Muller¹,³, Marc Lanctot¹, Karl Tuyls¹, Thore Graepel¹,². Abstract: Avis et al., 2010; Harsanyi & Selten, 1988). ²Two-player, constant-s...
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning. Matthieu Zimmer¹, Claire Glanois¹, Umer Siddique¹, Paul Weng¹,². Abstract: current main focus is on their performance with respect to the total (o...
Large-Scale Multi-Agent Deep FBSDEs. Tianrong Chen¹, Ziyi Wang², Ioannis Exarchos³, Evangelos A. Theodorou²,⁴. Abstract: At the equilibrium, each player cannot gain any benefit by modifying his/her own strategy given opponents’ strat...
Emergent Social Learning via Multi-Agent Reinforcement Learning. Kamal Ndousse¹, Douglas Eck², Sergey Levine²,³, Natasha Jaques²,³. Abstract: Humans are able to learn from one another without direct access to the experiences or memories...
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning. Iou-Jen Liu¹, Unnat Jain¹, Raymond A. Yeh¹, Alexander G. Schwing¹. Abstract: (MADDPG) (Lowe et al., 2017), and counterfactual multi-agent policy gradients (COMA) (...
Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition. Bo Liu¹, Qiang Liu¹, Peter Stone¹, Animesh Garg²,³, Yuke Zhu¹,³, Animashree Anandkumar³,⁴. Abstract: In real-world multi-ag...
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles. Tonghan Wang¹, Heng Dong¹, Victor Lesser², Chongjie Zhang¹. Abstract: The role concept provides a useful tool to design and understand complex multi-agent systems...
Cooperative Multi-Agent Bandits with Heavy Tails. Abhimanyu Dubey¹, Alex Pentland¹. Abstract: A_{v,t} ∈ A, where the space of actions A is assumed to be finite and countable (|A| = K). It then obtains an i.i.d. We study the heavy-tailed stochas...
Multi-Agent Determinantal Q-Learning. Yaodong Yang¹,², Ying Wen¹,², Liheng Chen³, Jun Wang², Kun Shao¹, David Mguni¹, Weinan Zhang³. Abstract: A full spectrum of MARL algorithms has been developed to solve cooperative tasks (Panait & Luke, 20...
Multi-Agent Routing Value Iteration Network. Quinlan Sykora, Mengye Ren, Raquel Urtasun. Abstract: Figure 1. A visualization of the route produced by a fleet of twenty vehicles using our proposed algorithm. Colors denote different In th...
Learning Efficient Multi-Agent Communication: An Information Bottleneck Approach. Rundong Wang¹, Xu He¹, Runsheng Yu¹, Wei Qiu¹, Bo An¹, Zinovi Rabinovich¹. Abstract: correlation that benefits a group’s cooperation. Therefore, many r...
Kernel Methods for Cooperative Contextual Bandits. Abhimanyu Dubey¹, Alex Pentland¹. Abstract: to select actions that minimize the expected group regret: Cooperative multi-agent decision making involves a group of agents cooperat...
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games. Tianyi Lin¹, Zhengyuan Zhou², Panayotis Mertikopoulos³, Michael I. Jordan⁴. Abstract: selecting a route in a traffic network). In this paper, we consider multi-...
TarMAC: Targeted Multi-Agent Communication. Abhishek Das¹, Théophile Gervet², Joshua Romoff²,³, Dhruv Batra¹,³, Devi Parikh¹,³, Michael Rabbat²,³, Joelle Pineau²,³. Abstract: and swift transport, to teams of robots on search-and-rescue...
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning. Natasha Jaques¹,², Angeliki Lazaridou², Edward Hughes², Caglar Gulcehre², Pedro A. Ortega², DJ Strouse³, Joel Z. Leibo², Nando de Freitas². Abstract: ac...