<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>How difficult is AI alignment? | Anthropic Research Salon</title>
        <link>https://tube.grossholtz.net/videos/watch/5e100e5e-71e0-46fa-a852-c1be08e13a4f</link>
        <description>At an Anthropic Research Salon event in San Francisco, four of our researchers—Alex Tamkin, Jan Leike, Amanda Askell and Josh Batson—discussed alignment science, interpretability, and the future of AI research.

Further reading:
Anthropic’s research: https://anthropic.com/research
Claude’s character: https://www.anthropic.com/news/claude-character
Evaluating feature steering: https://www.anthropic.com/research/evaluating-feature-steering

0:00 Introduction
0:30 An overview of alignment
4:48 Challenges of scaling
8:08 Role of interpretability
12:02 How models can help
14:31 Signs of whether alignment is easy or hard
18:28 Q&amp;A — Multi-agent deliberation
20:38 Q&amp;A — Model alignment epiphenomenon
23:43 Q&amp;A — What solving alignment could look like</description>
        <lastBuildDate>Mon, 06 Apr 2026 03:10:13 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://tube.grossholtz.net</generator>
        <image>
            <title>How difficult is AI alignment? | Anthropic Research Salon</title>
            <url>https://tube.grossholtz.net/client/assets/images/icons/icon-512x512.png</url>
            <link>https://tube.grossholtz.net/videos/watch/5e100e5e-71e0-46fa-a852-c1be08e13a4f</link>
        </image>
        <copyright>All rights reserved, unless otherwise stated in the terms at https://tube.grossholtz.net/about or in any licenses granted by each content's rights holder.</copyright>
        <atom:link href="https://tube.grossholtz.net/feeds/video-comments.xml?videoId=5e100e5e-71e0-46fa-a852-c1be08e13a4f" rel="self" type="application/rss+xml"/>
    </channel>
</rss>