Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Ilya Kostrikov 1 2, Jonathan Tompson 2, Rob Fergus 1 3, Ofir Nachum 2

Abstract

where deploying a new policy to interact with the live environment is expensive...