Talk at LISA (London Initiative for Safe AI): Moral Alignment for RL and LLM Agents

Date:

Gave a 30-minute talk discussing Moral Alignment for RL and LLM Agents (i.e., our four latest papers) as part of the ARENA programme. In the talk I covered agency in AI, existing approaches to alignment (as described in our preprint), and my work on training and fine-tuning RL and LLM agents with intrinsic rewards, with particular focus on the unexpected findings from our papers at IJCAI'23, AIES'24 and ICLR'25.