Preprocessing Reward Functions for Interpretability
We present a method for simplifying a learned reward model before visualizing it and show that this can make the reward more interpretable.
Erik Jenner, Adam Gleave