<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Controlling powerful AI</title>
        <link>https://tube.grossholtz.net/videos/watch/d32d4279-ec9c-40b3-b197-146a2e08ec4e</link>
        <description>Anthropic researchers Ethan Perez, Joe Benton, and Akbir Khan discuss AI control: an approach to managing the risks of advanced AI systems. Topics include real-world evaluations showing how humans struggle to detect deceptive AI, the three major threat models researchers are working to mitigate, and the overall idea of controlling highly capable AI systems whose goals may differ from our own.

Chapters:
0:00 Introduction
0:33 What is AI control?
2:56 Control evaluations in practice
5:39 Results from evaluations
7:27 Monitoring protocols
13:18 How control differs from alignment
16:09 The challenge of alignment faking
23:10 Ensuring evaluations work for future models
26:09 Open questions in control research
34:15 Lessons learned from control
37:14 Why work on control now?
43:26 Key threat models
48:35 Optimistic signs</description>
        <lastBuildDate>Mon, 06 Apr 2026 03:04:39 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>PeerTube - https://tube.grossholtz.net</generator>
        <image>
            <title>Controlling powerful AI</title>
            <url>https://tube.grossholtz.net/client/assets/images/icons/icon-512x512.png</url>
            <link>https://tube.grossholtz.net/videos/watch/d32d4279-ec9c-40b3-b197-146a2e08ec4e</link>
        </image>
        <copyright>All rights reserved, unless otherwise stated in the terms at https://tube.grossholtz.net/about and in any licenses granted by each content's rights holder.</copyright>
        <atom:link href="https://tube.grossholtz.net/feeds/video-comments.xml?videoId=d32d4279-ec9c-40b3-b197-146a2e08ec4e" rel="self" type="application/rss+xml"/>
    </channel>
</rss>