Online Learning with Optimism and Delay. Genevieve Flaspohler, Francesco Orabona, Judah Cohen, Soukayna Mouatadid, Miruna Oprescu, Paulo Orenstein, Lester Mackey. Abstract: ial online learning algorithms provide robust perfor...
Online Learning in Unknown Markov Games. Yi Tian, Yuanhao Wang, Tiancheng Yu, Suvrit Sra. Abstract: control both/all players and aim to minimize the number of episodes required to find a good policy; and (2) the online We study online lea...
Online Graph Dictionary Learning. Cédric Vincent-Cuaz, Titouan Vayer, Rémi Flamary, Marco Corneli, Nicolas Courty. Abstract: challenging, as their nature is by essence non-vectorial, and requires dedicated modelling of the...
Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach. Nadav Hallak, Panayotis Mertikopoulos, Volkan Cevher. Abstract: problems, and they can adapt to different measures of regret under differen...
Recomposing the Reinforcement Learning Building Blocks with Hypernetworks. Elad Sarafian, Shai Keynan, Sarit Kraus. Abstract: The Reinforcement Learning (RL) building...
Randomized Exploration for Reinforcement Learning with General Value Function Approximation. Haque Ishfaq, Qiwen Cui, Viet Nguyen, Alex Ayoub, Zhuoran Yang, Zhaoran Wang, Doina Precup, Lin F. Yang. Abstract: when general func...
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning. Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha. Abstract: Figure 1: Breakaway sub-scenario in so...
On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP. Tianhao Wu, Yunchang Yang, Simon S. Du, Liwei Wang. Abstract: is vulnerable to corrupted data stemming from malicious entities (Huang et al., 2...
Offline Meta-Reinforcement Learning with Advantage Weighting. Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn. Abstract: of reinforcement learning algorithms, when the goal is to ultimately learn many ta...
Offline Reinforcement Learning with Fisher Divergence Critic Regularization. Ilya Kostrikov, Jonathan Tompson, Rob Fergus, Ofir Nachum. Abstract: where deploying a new policy to interact with the live environment is expensive...
Offline Reinforcement Learning with Pseudometric Learning. Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist. Abstract: that generated these experiences (Pomerleau, 199...
Off-Belief Learning. Hengyuan Hu, Adam Lerer, Brandon Cui, Luis Pineda, Noam Brown, Jakob Foerster. Abstract: when paired with other agents. As a result, strong joint policies for SP often rely on efficient, yet arbitrary convention...
Not All Memories are Created Equal: Learning to Forget by Expiring. Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan. Abstract: Sukhbaatar et al., 2019a). However, a critical compone...
Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent. Kangqiao Liu, Liu Ziyin, Masahito Ueda. Abstract: and Teh, 2011). When the noise is due to minibatch sampling, the noise is called the SGD noise or minibatch...
Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces. Baorui Ma, Zhizhong Han, Yu-Shen Liu, Matthias Zwicker. Abstract: 2020; Takikawa et al., 2021; Martel et al., 2021; Oechsle et...
Neural Transformation Learning for Deep Anomaly Detection Beyond Images. Chen Qiu, Timo Pfrommer, Marius Kloft, Stephan Mandt, Maja Rudolph. Abstract: formations are useful, and it is hard to design these transformations manuall...
Neighborhood Contrastive Learning Applied to Online Patient Monitoring. Hugo Yèche, Gideon Dresdner, Francesco Locatello, Matthias Hüser, Gunnar Rätsch. Abstract: plied this methodology to medical time-series data (Cheng...
Near-Optimal Representation Learning for Linear Bandits and Linear RL. Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang. Abstract: While representation learning has achieved tremendous success in a variety of applications...
Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs. Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Başar. Abstract: through sequential interactions with an initially unknown but...
Nearly Optimal Reward-Free Reinforcement Learning. Zihan Zhang, Simon S. Du, Xiangyang Ji. Abstract: RL is exploration for which the agent needs to strategically visit new states to learn transition and reward information We study th...