Sample Efficient Reinforcement Learning in Continuous State Spaces: A Perspective Beyond Linearity. Dhruv Malik¹, Aldo Pacchiano², Vishwak Srinivasan¹, Yuanzhi Li¹. Abstract: such a benchmark (Bellemare et al., 2013). Agents trained...
Safe Reinforcement Learning with Linear Function Approximation. Sanae Amani¹, Christos Thrampoulidis², Lin F. Yang¹. Abstract: action may lead to catastrophic results. Thus, safety in RL has become a serious issue that restricts the ap...
Safe Reinforcement Learning Using Advantage-Based Intervention. Nolan Wagener¹, Byron Boots², Ching-An Cheng³. Abstract: Figure 1. Advantage-based intervention of SAILR and construction of the surrogate MDP M. In M, whenever the po...
RRL: Resnet as representation for Reinforcement Learning. Rutav Shah¹, Vikash Kumar². Abstract: Supervised Learning The ability to autonomously learn behaviors via Reinforcement direct interactions in uninstrumented environ- Lea...
Robust Unsupervised Learning via L-Statistic Minimization. Andreas Maurer¹, Daniela A. Parletta¹,², Andrea Paudice¹,³, Massimiliano Pontil¹,⁴. Abstract: restrict attention to “a sufficient portion of the data in good agreement with one...
Robust Representation Learning via Perceptual Similarity Metrics. Saeid Asgari Taghanaki¹, Kristy Choi², Amir Khasahmadi¹, Anirudh Goyal³. Abstract: of deep neural networks (Dean et al., 2012; LeCun et al., 2015) have been pivotal towa...
Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees. Kishan Panaganti¹, Dileep Kalathil¹. Abstract: This mismatch between the training and testing environment parameters can si...
Robust Learning for Data Poisoning Attacks. Yunjuan Wang¹, Poorya Mianjy¹, Raman Arora¹. Abstract: in settings where an adversary can affect any part of the training data. Therefore, in this paper, we are interested in We investigate the r...
Robust Asymmetric Learning in POMDPs. Andrew Warrington¹, J. Wilder Lavington²,³, Adam Ścibior²,³, Mark Schmidt²,⁴, Frank Wood²,³,⁵. Abstract: the world, to complete the task. A trainee, observing only images, can then learn to mimic the act...
Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach. Yingjie Fei¹, Zhuoran Yang², Zhaoran Wang¹. Abstract: risk-seeking objective and β < 0 induces a risk-averse one. It can also be seen that Vβ te...
Reward Identification in Inverse Reinforcement Learning. Kuno Kim¹, Kirankumar Shiragur¹, Shivam Garg¹, Stefano Ermon¹. Abstract: MDPs to build computational models (Niv, 2009) of real-world, rational decision makers such as investo...
Revisiting Peng’s Q(λ) for Modern Reinforcement Learning. Tadashi Kozuno¹, Yunhao Tang², Mark Rowland³, Rémi Munos⁴, Steven Kapturowski³, Will Dabney³, Michal Valko⁴, David Abel³. Abstract: 1996; Watkins, 1989; Peng & Williams, 1994;...
REPAINT: Knowledge Transfer in Deep Reinforcement Learning. Yunzhe Tao¹, Sahika Genc¹, Jonathan Chung¹, Tao Sun¹, Sunil Mallya¹. Abstract: improve performance on other tasks. Accelerating Learning processes for complex tasks Transfer...
Reinforcement Learning with Prototypical Representations. Denis Yarats¹,², Rob Fergus¹, Alessandro Lazaric², Lerrel Pinto¹. Abstract: from rewards alone is sample inefficient and leads to poor performance. Prior work (Srinivas et al....
Reinforcement Learning Under Moral Uncertainty. Adrien Ecoffet¹,², Joel Lehman¹,². Abstract: While such accomplishments are significant, progress has been less straightforward in applying RL to unstructured An ambitious goal for ma...
Reinforcement Learning of Implicit and Explicit Control Flow in Instructions. Ethan A. Brooks¹, Janarthanan Rajendran¹, Richard L. Lewis², Satinder Singh¹. Abstract: task instructions that require the agent to learn control flow eithe...
Reinforcement Learning for Cost-Aware Markov Decision Processes. Wesley A. Suttle¹, Kaiqing Zhang², Zhuoran Yang³, David N. Kraemer¹, Ji Liu⁴. Abstract: quently used in practice. Nevertheless, alternative objectives have seen increas...
Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging. Amélie Héliou¹, Matthieu Martin¹, Panayotis Mertikopoulos²,¹, Thibaud Rahier¹. Abstract: also requires that the problem’s objective remain stationary during the...
World Model as a Graph: Learning Latent Landmarks for Planning. Lunjun Zhang¹,², Ge Yang³, Bradly Stadie⁴. Abstract / 1. Introduction: Planning, the ability to analyze the structure of a An intelligent agent should be able to solve difficult...
Whitening for Self-Supervised Representation Learning. Aleksandr Ermolov¹, Aliaksandr Siarohin¹, Enver Sangineto¹, Nicu Sebe¹. Abstract: of words in a sentence is used to learn a language model (Mikolov et al., 2013a;b; Devlin et al., 2...