| Episode | Date |
|---|---|
| How human-like do safe AI motivations need to be? | Nov 12, 2025 |
| Leaving Open Philanthropy, going to Anthropic | Nov 03, 2025 |
| Controlling the options AIs can pursue | Sep 29, 2025 |
| Giving AIs safe motivations | Aug 18, 2025 |
| The stakes of AI moral status | May 21, 2025 |
| Can we safely automate alignment research? | Apr 30, 2025 |
| AI for AI safety | Mar 14, 2025 |
| Paths and waystations in AI safety | Mar 11, 2025 |
| When should we worry about AI power-seeking? | Feb 19, 2025 |
| What is it to solve the alignment problem? | Feb 13, 2025 |
| How do we solve the alignment problem? | Feb 13, 2025 |
| Fake thinking and real thinking | Jan 28, 2025 |
| Takes on "Alignment Faking in Large Language Models" | Dec 18, 2024 |
| (Part 2, AI takeover) Extended audio from my conversation with Dwarkesh Patel | Sep 30, 2024 |
| (Part 1, Otherness) Extended audio from my conversation with Dwarkesh Patel | Sep 30, 2024 |
| Introduction and summary for "Otherness and control in the age of AGI" | Jun 21, 2024 |
| Second half of full audio for "Otherness and control in the age of AGI" | Jun 18, 2024 |
| First half of full audio for "Otherness and control in the age of AGI" | Jun 17, 2024 |
| Loving a world you don't trust | Jun 17, 2024 |
| On attunement | Mar 25, 2024 |
| On green | Mar 21, 2024 |
| On the abolition of man | Jan 18, 2024 |
| Being nicer than Clippy | Jan 16, 2024 |
| An even deeper atheism | Jan 11, 2024 |
| Does AI risk "other" the AIs? | Jan 09, 2024 |
| When "yang" goes wrong | Jan 08, 2024 |
| Deep atheism and AI risk | Jan 04, 2024 |
| Gentleness and the artificial Other | Jan 02, 2024 |
| In search of benevolence (or: what should you get Clippy for Christmas?) | Dec 27, 2023 |
| Empirical work that might shed light on scheming (Section 6 of "Scheming AIs") | Nov 16, 2023 |
| Summing up "Scheming AIs" (Section 5) | Nov 16, 2023 |
| Speed arguments against scheming (Sections 4.4-4.7 of "Scheming AIs") | Nov 16, 2023 |
| Simplicity arguments for scheming (Section 4.3 of "Scheming AIs") | Nov 16, 2023 |
| The counting argument for scheming (Sections 4.1 and 4.2 of "Scheming AIs") | Nov 16, 2023 |
| Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs") | Nov 16, 2023 |
| Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs") | Nov 16, 2023 |
| Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs") | Nov 16, 2023 |
| The goal-guarding hypothesis (Section 2.3.1.1 of "Scheming AIs") | Nov 16, 2023 |
| How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of "Scheming AIs") | Nov 16, 2023 |
| Is scheming more likely if you train models to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs") | Nov 16, 2023 |
| "Clean" vs. "messy" goal-directedness (Section 2.2.3 of "Scheming AIs") | Nov 16, 2023 |
| Two sources of beyond-episode goals (Section 2.2.2 of "Scheming AIs") | Nov 16, 2023 |
| Two concepts of an "episode" (Section 2.2.1 of "Scheming AIs") | Nov 16, 2023 |
| Situational awareness (Section 2.1 of "Scheming AIs") | Nov 16, 2023 |
| On "slack" in training (Section 1.5 of "Scheming AIs") | Nov 16, 2023 |
| Why focus on schemers in particular? (Sections 1.3-1.4 of "Scheming AIs") | Nov 16, 2023 |
| A taxonomy of non-schemer models (Section 1.2 of "Scheming AIs") | Nov 16, 2023 |
| Varieties of fake alignment (Section 1.1 of "Scheming AIs") | Nov 16, 2023 |
| Full audio for "Scheming AIs: Will AIs fake alignment during training in order to get power?" | Nov 15, 2023 |
| Introduction and summary of "Scheming AIs: Will AIs fake alignment during training in order to get power?" | Nov 14, 2023 |
| In memory of Louise Glück | Oct 15, 2023 |
| On the limits of idealized values | May 12, 2023 |
| Predictable updating about AI risk | May 08, 2023 |
| Existential Risk from Power-Seeking AI (shorter version) | Mar 19, 2023 |
| Problems of evil | Mar 05, 2023 |
| Seeing more whole | Feb 17, 2023 |
| Why should ethical anti-realists do ethics? | Feb 16, 2023 |
| Is Power-Seeking AI an Existential Risk? | Jan 25, 2023 |
| On sincerity | Dec 23, 2022 |
| Against meta-ethical hedonism | Dec 01, 2022 |
| Against the normative realist's wager | Oct 09, 2022 |
| Against neutrality about creating happy lives | Oct 05, 2022 |
| Actually possible: thoughts on Utopia | Oct 05, 2022 |
| On infinite ethics | Oct 05, 2022 |
| Can you control the past? | Oct 05, 2022 |
| On future people, looking back at 21st century longtermism | Oct 05, 2022 |
| On clinging | Oct 05, 2022 |
| Killing the ants | Oct 05, 2022 |
| Thoughts on being mortal | Oct 05, 2022 |