Safe Learning Under Irreversible Dynamics via Asking for Help

Benjamin Plaut · Postdoc at CHAI (Center for Human-Compatible AI)

December 2025

Standard online-learning algorithms with formal guarantees often rely on trying all possible behaviors, which is unsafe when some errors cannot be recovered from. This work allows a learning agent to ask for help from a mentor and to transfer knowledge between similar states. The resulting algorithm learns both safely and effectively.
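
To make the two ideas in the abstract concrete, here is a toy sketch of an agent that asks a mentor when a state is unfamiliar and otherwise reuses the mentor's action from a nearby state. It is an illustration only and should not be read as the algorithm from the talk or paper: the one-dimensional states, the distance threshold `radius`, and the names `act_or_ask`, `toy_mentor`, and `run` are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# Each memory entry is (state, action the mentor chose in that state).
Memory = List[Tuple[float, str]]


def act_or_ask(
    state: float,
    memory: Memory,
    mentor: Callable[[float], str],
    radius: float,
) -> Tuple[str, bool]:
    """Return (action, asked_mentor) for one step.

    Hypothetical rule: if a previously queried state lies within `radius`,
    copy the mentor's action from it; otherwise ask the mentor and store
    the answer so it can be reused later.
    """
    if memory:
        nearest_state, nearest_action = min(
            memory, key=lambda pair: abs(pair[0] - state)
        )
        if abs(nearest_state - state) <= radius:
            # Similar enough: transfer knowledge instead of experimenting.
            return nearest_action, False
    # Unfamiliar state: ask for help rather than try an untested action.
    action = mentor(state)
    memory.append((state, action))
    return action, True


def toy_mentor(state: float) -> str:
    """Stand-in mentor policy, assumed for illustration."""
    return "left" if state < 0.5 else "right"


def run(states: List[float], radius: float = 0.1) -> Tuple[List[str], int]:
    """Process a sequence of states; return the actions and the query count."""
    memory: Memory = []
    actions: List[str] = []
    queries = 0
    for s in states:
        action, asked = act_or_ask(s, memory, toy_mentor, radius)
        actions.append(action)
        queries += int(asked)
    return actions, queries


if __name__ == "__main__":
    print(run([0.1, 0.15, 0.9, 0.85, 0.12]))
```

In this sketch the agent never takes an action the mentor has not endorsed for a nearby state, and the number of queries depends on how the visited states cluster relative to `radius`. Whether copying is actually safe depends on nearby states really calling for similar actions, which the toy example simply assumes.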

Watch recording

Readings

arXiv:2502.14043