Episode | Date |
---|---|
[HUMAN VOICE] "My Clients, The Liars" by ymeskhout | Mar 20, 2024 |
[HUMAN VOICE] "Deep atheism and AI risk" by Joe Carlsmith | Mar 20, 2024 |
[HUMAN VOICE] "CFAR Takeaways: Andrew Critch" by Raemon | Mar 10, 2024 |
[HUMAN VOICE] "Speaking to Congressional staffers about AI risk" by Akash, hath | Mar 10, 2024 |
Many arguments for AI x-risk are wrong | Mar 09, 2024 |
Tips for Empirical Alignment Research | Mar 07, 2024 |
Timaeus’s First Four Months | Feb 29, 2024 |
Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party” | Feb 23, 2024 |
[HUMAN VOICE] "And All the Shoggoths Merely Players" by Zack_M_Davis | Feb 20, 2024 |
[HUMAN VOICE] "Updatelessness doesn't solve most problems" by Martín Soto | Feb 20, 2024 |
Every “Every Bay Area House Party” Bay Area House Party | Feb 19, 2024 |
2023 Survey Results | Feb 19, 2024 |
Raising children on the eve of AI | Feb 18, 2024 |
“No-one in my org puts money in their pension” | Feb 18, 2024 |
Masterpiece | Feb 16, 2024 |
CFAR Takeaways: Andrew Critch | Feb 15, 2024 |
[HUMAN VOICE] "Believing In" by Anna Salamon | Feb 14, 2024 |
[HUMAN VOICE] "Attitudes about Applied Rationality" by Camille Berger | Feb 14, 2024 |
Scale Was All We Needed, At First | Feb 14, 2024 |
Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy | Feb 11, 2024 |
[HUMAN VOICE] "A Shutdown Problem Proposal" by johnswentworth, David Lorell | Feb 09, 2024 |
Brute Force Manufactured Consensus is Hiding the Crime of the Century | Feb 04, 2024 |
[HUMAN VOICE] "Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI" by Jeremy Gillen, peterbarnett | Feb 03, 2024 |
Leading The Parade | Feb 02, 2024 |
[HUMAN VOICE] "The case for ensuring that powerful AIs are controlled" by ryan_greenblatt, Buck | Feb 02, 2024 |
Processor clock speeds are not how fast AIs think | Feb 01, 2024 |
Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI | Jan 31, 2024 |
Making every researcher seek grants is a broken model | Jan 29, 2024 |
The case for training frontier AIs on Sumerian-only corpus | Jan 28, 2024 |
This might be the last AI Safety Camp | Jan 25, 2024 |
[HUMAN VOICE] "There is way too much serendipity" by Malmesbury | Jan 22, 2024 |
[HUMAN VOICE] "How useful is mechanistic interpretability?" by ryan_greenblatt, Neel Nanda, Buck, habryka | Jan 20, 2024 |
[HUMAN VOICE] "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training" by evhub et al | Jan 20, 2024 |
The impossible problem of due process | Jan 17, 2024 |
[HUMAN VOICE] "Gentleness and the artificial Other" by Joe Carlsmith | Jan 14, 2024 |
Introducing Alignment Stress-Testing at Anthropic | Jan 14, 2024 |
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | Jan 13, 2024 |
[HUMAN VOICE] "Meaning & Agency" by Abram Demski | Jan 07, 2024 |
What’s up with LLMs representing XORs of arbitrary features? | Jan 07, 2024 |
Gentleness and the artificial Other | Jan 05, 2024 |
MIRI 2024 Mission and Strategy Update | Jan 05, 2024 |
The Plan - 2023 Version | Jan 04, 2024 |
Apologizing is a Core Rationalist Skill | Jan 03, 2024 |
[HUMAN VOICE] "A case for AI alignment being difficult" by jessicata | Jan 02, 2024 |
The Dark Arts | Jan 01, 2024 |
Critical review of Christiano’s disagreements with Yudkowsky | Dec 28, 2023 |
Most People Don’t Realize We Have No Idea How Our AIs Work | Dec 27, 2023 |
Discussion: Challenges with Unsupervised LLM Knowledge Discovery | Dec 26, 2023 |
Succession | Dec 24, 2023 |
Nonlinear’s Evidence: Debunking False and Misleading Claims | Dec 21, 2023 |
Effective Aspersions: How the Nonlinear Investigation Went Wrong | Dec 20, 2023 |
Constellations are Younger than Continents | Dec 20, 2023 |
The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda | Dec 19, 2023 |
“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity | Dec 18, 2023 |
Is being sexy for your homies? | Dec 17, 2023 |
[HUMAN VOICE] "Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible" by Gene Smith and Kman | Dec 17, 2023 |
[HUMAN VOICE] "Moral Reality Check (a short story)" by jessicata | Dec 15, 2023 |
AI Control: Improving Safety Despite Intentional Subversion | Dec 15, 2023 |
2023 Unofficial LessWrong Census/Survey | Dec 13, 2023 |
The likely first longevity drug is based on sketchy science. This is bad for science and bad for longevity. | Dec 13, 2023 |
[HUMAN VOICE] "What are the results of more parental supervision and less outdoor play?" by Julia Wise | Dec 13, 2023 |
Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible | Dec 12, 2023 |
re: Yudkowsky on biological materials | Dec 11, 2023 |
Speaking to Congressional staffers about AI risk | Dec 05, 2023 |
[HUMAN VOICE] "Shallow review of live agendas in alignment & safety" by technicalities & Stag | Dec 04, 2023 |
Thoughts on “AI is easy to control” by Pope & Belrose | Dec 02, 2023 |
The 101 Space You Will Always Have With You | Nov 30, 2023 |
[HUMAN VOICE] "Social Dark Matter" by Duncan Sabien | Nov 28, 2023 |
Shallow review of live agendas in alignment & safety | Nov 28, 2023 |
Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense | Nov 25, 2023 |
[HUMAN VOICE] "The 6D effect: When companies take risks, one email can be very powerful." by scasper | Nov 23, 2023 |
OpenAI: The Battle of the Board | Nov 22, 2023 |
OpenAI: Facts from a Weekend | Nov 20, 2023 |
Sam Altman fired from OpenAI | Nov 18, 2023 |
Social Dark Matter | Nov 17, 2023 |
"You can just spontaneously call people you haven't met in years" by lc | Nov 17, 2023 |
[HUMAN VOICE] "Thinking By The Clock" by Screwtape | Nov 17, 2023 |
"EA orgs' legal structure inhibits risk taking and information sharing on the margin" by Elizabeth | Nov 17, 2023 |
[HUMAN VOICE] "AI Timelines" by habryka, Daniel Kokotajlo, Ajeya Cotra, Ege Erdil | Nov 17, 2023 |
"Integrity in AI Governance and Advocacy" by habryka, Olivia Jimenez | Nov 17, 2023 |
Loudly Give Up, Don’t Quietly Fade | Nov 16, 2023 |
[HUMAN VOICE] "Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds | Nov 09, 2023 |
[HUMAN VOICE] "Deception Chess: Game #1" by Zane et al. | Nov 09, 2023 |
"Does davidad's uploading moonshot work?" by jacobjabob et al.
|
Nov 09, 2023 |
"The other side of the tidal wave" by Katja Grace
|
Nov 09, 2023 |
"The 6D effect: When companies take risks, one email can be very powerful." by scasper
|
Nov 09, 2023 |
Comp Sci in 2027 (Short story by Eliezer Yudkowsky)
|
Nov 09, 2023 |
"My thoughts on the social response to AI risk" by Matthew Barnett
|
Nov 09, 2023 |
"Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk" by 1a3orn
|
Nov 09, 2023 |
"President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence" by Tristan Williams
|
Nov 03, 2023 |
"Thoughts on the AI Safety Summit company policy requests and responses" by So8res
|
Nov 03, 2023 |
[Human Voice] "Book Review: Going Infinite" by Zvi
|
Oct 31, 2023 |
"Announcing Timaeus" by Jesse Hoogland et al.
|
Oct 30, 2023 |
"Thoughts on responsible scaling policies and regulation" by Paul Christiano
|
Oct 30, 2023 |
"AI as a science, and three obstacles to alignment strategies" by Nate Soares
|
Oct 30, 2023 |
"Architects of Our Own Demise: We Should Stop Developing AI" by Roko
|
Oct 30, 2023 |
"At 87, Pearl is still able to change his mind" by rotatingpaguro
|
Oct 30, 2023 |
"We're Not Ready: thoughts on "pausing" and responsible scaling policies" by Holden Karnofsky
|
Oct 30, 2023 |
[HUMAN VOICE] "Alignment Implications of LLM Successes: a Debate in One Act" by Zack M Davis
|
Oct 23, 2023 |
"LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" by Simon Lermen & Jeffrey Ladish.
|
Oct 23, 2023 |
"Holly Elmore and Rob Miles dialogue on AI Safety Advocacy" by jacobjacob, Robert Miles & Holly_Elmore
|
Oct 23, 2023 |
"Labs should be explicit about why they are building AGI" by Peter Barnett
|
Oct 19, 2023 |
[HUMAN VOICE] "Sum-threshold attacks" by TsviBT
|
Oct 18, 2023 |
"Will no one rid me of this turbulent pest?" by Metacelsus
|
Oct 18, 2023 |
"RSPs are pauses done right" by evhub
|
Oct 15, 2023 |
[HUMAN VOICE] "Inside Views, Impostor Syndrome, and the Great LARP" by John Wentworth
|
Oct 15, 2023 |
"Cohabitive Games so Far" by mako yass
|
Oct 15, 2023 |
"Announcing MIRI’s new CEO and leadership team" by Gretta Duleba
|
Oct 15, 2023 |
"Comparing Anthropic's Dictionary Learning to Ours" by Robert_AIZI
|
Oct 15, 2023 |
"Announcing Dialogues" by Ben Pace
|
Oct 09, 2023 |
"Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds
|
Oct 09, 2023 |
"Evaluating the historical value misspecification argument" by Matthew Barnett
|
Oct 09, 2023 |
"Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn" by Zvi
|
Oct 09, 2023 |
"Thomas Kwa's MIRI research experience" by Thomas Kwa and others
|
Oct 06, 2023 |
"EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem" by Elizabeth
|
Oct 03, 2023 |
"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.
|
Oct 03, 2023 |
"The Lighthaven Campus is open for bookings" by Habryka
|
Oct 03, 2023 |
"'Diamondoid bacteria' nanobots: deadly threat or dead-end? A nanotech investigation" by titotal
|
Oct 03, 2023 |
"The King and the Golem" by Richard Ngo
|
Sep 29, 2023 |
"Sparse Autoencoders Find Highly Interpretable Directions in Language Models" by Logan Riggs et al
|
Sep 27, 2023 |
"Inside Views, Impostor Syndrome, and the Great LARP" by John Wentworth
|
Sep 26, 2023 |
"There should be more AI safety orgs" by Marius Hobbhahn
|
Sep 25, 2023 |
"The Talk: a brief explanation of sexual dimorphism" by Malmesbury
|
Sep 22, 2023 |
"A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX" by jacobjacob
|
Sep 20, 2023 |
"AI presidents discuss AI alignment agendas" by TurnTrout & Garrett Baker
|
Sep 19, 2023 |
"UDT shows that decision theory is more puzzling than ever" by Wei Dai
|
Sep 18, 2023 |
"Sum-threshold attacks" by TsviBT
|
Sep 11, 2023 |
"A list of core AI safety problems and how I hope to solve them" by Davidad
|
Sep 09, 2023 |
"Report on Frontier Model Training" by Yafah Edelman
|
Sep 09, 2023 |
"Defunding My Mistake" by ymeskhout
|
Sep 08, 2023 |
"Sharing Information About Nonlinear" by Ben Pace
|
Sep 08, 2023 |
"One Minute Every Moment" by abramdemski
|
Sep 08, 2023 |
"What I would do if I wasn’t at ARC Evals" by LawrenceC
|
Sep 08, 2023 |
"The U.S. is becoming less stable" by lc
|
Sep 04, 2023 |
"Meta Questions about Metaphilosophy" by Wei Dai
|
Sep 04, 2023 |
"OpenAI API base models are not sycophantic, at any size" by Nostalgebraist
|
Sep 04, 2023 |
"Dear Self; we need to talk about ambition" by Elizabeth
|
Aug 30, 2023 |
"Book Launch: "The Carving of Reality," Best of LessWrong vol. III" by Raemon
|
Aug 28, 2023 |
"Assume Bad Faith" by Zack_M_Davis
|
Aug 28, 2023 |
"Large Language Models will be Great for Censorship" by Ethan Edwards
|
Aug 23, 2023 |
"Ten Thousand Years of Solitude" by agp
|
Aug 22, 2023 |
"6 non-obvious mental health issues specific to AI safety" by Igor Ivanov
|
Aug 22, 2023 |
"Against Almost Every Theory of Impact of Interpretability" by Charbel-Raphaël
|
Aug 21, 2023 |
"Inflection.ai is a major AGI lab" by Nikola
|
Aug 15, 2023 |
"Feedbackloop-first Rationality" by Raemon
|
Aug 15, 2023 |
"When can we trust model evaluations?" bu evhub
|
Aug 09, 2023 |
"Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research" by evhub, Nicholas Schiefer, Carson Denison, Ethan Perez
|
Aug 09, 2023 |
"My current LK99 questions" by Eliezer Yudkowsky
|
Aug 04, 2023 |
"The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate" by Adam David Long
|
Aug 04, 2023 |
"ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks" by Beth Barnes
|
Aug 04, 2023 |
"Thoughts on sharing information about language model capabilities" by paulfchristiano
|
Aug 02, 2023 |
"Yes, It's Subjective, But Why All The Crabs?" by johnswentworth
|
Jul 31, 2023 |
"Self-driving car bets" by paulfchristiano
|
Jul 31, 2023 |
"Cultivating a state of mind where new ideas are born" by Henrik Karlsson
|
Jul 31, 2023 |
"Rationality !== Winning" by Raemon
|
Jul 28, 2023 |
"Brain Efficiency Cannell Prize Contest Award Ceremony" by Alexander Gietelink Oldenziel
|
Jul 28, 2023 |
"Grant applications and grand narratives" by Elizabeth
|
Jul 28, 2023 |
"Cryonics and Regret" by MvB
|
Jul 28, 2023 |
"Unifying Bargaining Notions (2/2)" by Diffractor
|
Jun 12, 2023 |
"The ants and the grasshopper" by Richard Ngo
|
Jun 06, 2023 |
"Steering GPT-2-XL by adding an activation vector" by TurnTrout et al.
|
May 18, 2023 |
"An artificially structured argument for expecting AGI ruin" by Rob Bensinger
|
May 16, 2023 |
"How much do you believe your results?" by Eric Neyman
|
May 10, 2023 |
"Mental Health and the Alignment Problem: A Compilation of Resources (updated April 2023)" by Chris Scammell & DivineMango
|
Apr 27, 2023 |
"On AutoGPT" by Zvi
|
Apr 19, 2023 |
"GPTs are Predictors, not Imitators" by Eliezer Yudkowsky
|
Apr 12, 2023 |
"Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky
|
Apr 05, 2023 |
"A stylized dialogue on John Wentworth's claims about markets and optimization" by Nate Soares
|
Apr 05, 2023 |
"Deep Deceptiveness" by Nate Soares
|
Apr 05, 2023 |
"The Onion Test for Personal and Institutional Honesty" by Chana Messinger & Andrew Critch
|
Mar 28, 2023 |
"There’s no such thing as a tree (phylogenetically)" by Eukaryote
|
Mar 28, 2023 |
"Losing the root for the tree" by Adam Zerner
|
Mar 28, 2023 |
"Lies, Damn Lies, and Fabricated Options" by Duncan Sabien
|
Mar 28, 2023 |
"Why I think strong general AI is coming soon" by Porby
|
Mar 28, 2023 |
"It Looks Like You’re Trying To Take Over The World" by Gwern
|
Mar 28, 2023 |
"What failure looks like" by Paul Christiano
|
Mar 28, 2023 |
"More information about the dangerous capability evaluations we did with GPT-4 and Claude." by Beth Barnes
|
Mar 21, 2023 |
""Carefully Bootstrapped Alignment" is organizationally hard" by Raemon
|
Mar 21, 2023 |
"The Parable of the King and the Random Process" by moridinamael
|
Mar 14, 2023 |
"Enemies vs Malefactors" by Nate Soares
|
Mar 14, 2023 |
"The Waluigi Effect (mega-post)" by Cleo Nardo
|
Mar 08, 2023 |
"Acausal normalcy" by Andrew Critch
|
Mar 06, 2023 |
"Please don't throw your mind away" by TsviBT
|
Mar 01, 2023 |
"Cyborgism" by Nicholas Kees & Janus
|
Feb 15, 2023 |
"Childhoods of exceptional people" by Henrik Karlsson
|
Feb 14, 2023 |
"What I mean by "alignment is in large part about making cognition aimable at all"" by Nate Soares
|
Feb 13, 2023 |
"On not getting contaminated by the wrong obesity ideas" by Natália Coelho Mendonça
|
Feb 10, 2023 |
"SolidGoldMagikarp (plus, prompt generation)"
|
Feb 08, 2023 |
"Focus on the places where you feel shocked everyone's dropping the ball" by Nate Soares
|
Feb 03, 2023 |
"Basics of Rationalist Discourse" by Duncan Sabien
|
Feb 02, 2023 |
"Sapir-Whorf for Rationalists" by Duncan Sabien
|
Jan 31, 2023 |
"My Model Of EA Burnout" by Logan Strohl
|
Jan 31, 2023 |
"The Social Recession: By the Numbers" by Anton Stjepan Cebalo
|
Jan 25, 2023 |
"Recursive Middle Manager Hell" by Raemon
|
Jan 24, 2023 |
"How 'Discovering Latent Knowledge in Language Models Without Supervision' Fits Into a Broader Alignment Scheme" by Collin
|
Jan 12, 2023 |
"Models Don't 'Get Reward'" by Sam Ringer
|
Jan 12, 2023 |
"The Feeling of Idea Scarcity" by John Wentworth
|
Jan 12, 2023 |
"The next decades might be wild" by Marius Hobbhahn
|
Dec 21, 2022 |
"Lessons learned from talking to >100 academics about AI safety" by Marius Hobbhahn
|
Nov 17, 2022 |
"How my team at Lightcone sometimes gets stuff done" by jacobjacob
|
Nov 10, 2022 |
"Decision theory does not imply that we get to have nice things" by So8res
|
Nov 08, 2022 |
"What 2026 looks like" by Daniel Kokotajlo
|
Nov 07, 2022 |
Counterarguments to the basic AI x-risk case
|
Nov 04, 2022 |
"Introduction to abstract entropy" by Alex Altair
|
Oct 29, 2022 |
"Consider your appetite for disagreements" by Adam Zerner
|
Oct 25, 2022 |
"My resentful story of becoming a medical miracle" by Elizabeth
|
Oct 21, 2022 |
"The Redaction Machine" by Ben
|
Oct 02, 2022 |
"Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover" by Ajeya Cotra
|
Sep 27, 2022 |
"The shard theory of human values" by Quintin Pope & TurnTrout
|
Sep 22, 2022 |
"Two-year update on my personal AI timelines" by Ajeya Cotra
|
Sep 22, 2022 |
"You Are Not Measuring What You Think You Are Measuring" by John Wentworth
|
Sep 21, 2022 |
"Do bamboos set themselves on fire?" by Malmesbury
|
Sep 20, 2022 |
"Toni Kurz and the Insanity of Climbing Mountains" by Gene Smith
|
Sep 18, 2022 |
"Deliberate Grieving" by Raemon
|
Sep 18, 2022 |
"Survey advice" by Katja Grace
|
Sep 18, 2022 |
"Language models seem to be much better than humans at next-token prediction" by Buck, Fabien and LawrenceC
|
Sep 15, 2022 |
"Humans are not automatically strategic" by Anna Salamon
|
Sep 15, 2022 |
"Local Validity as a Key to Sanity and Civilization" by Eliezer Yudkowsky
|
Sep 15, 2022 |
"Toolbox-thinking and Law-thinking" by Eliezer Yudkowsky
|
Sep 15, 2022 |
"Moral strategies at different capability levels" by Richard Ngo
|
Sep 14, 2022 |
"Worlds Where Iterative Design Fails" by John Wentworth
|
Sep 11, 2022 |
"(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland
|
Sep 11, 2022 |
"Unifying Bargaining Notions (1/2)" by Diffractor
|
Sep 09, 2022 |
'Simulators' by Janus
|
Sep 05, 2022 |
"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope
|
Aug 08, 2022 |
"Changing the world through slack & hobbies" by Steven Byrnes
|
Jul 30, 2022 |
"«Boundaries», Part 1: a key missing concept from utility theory" by Andrew Critch
|
Jul 28, 2022 |
"ITT-passing and civility are good; "charity" is bad; steelmanning is niche" by Rob Bensinger
|
Jul 24, 2022 |
"What should you change in response to an "emergency"? And AI risk" by Anna Salamon
|
Jul 23, 2022 |
"On how various plans miss the hard bits of the alignment challenge" by Nate Soares
|
Jul 17, 2022 |
"Humans are very reliable agents" by Alyssa Vance
|
Jul 13, 2022 |
"Looking back on my alignment PhD" by TurnTrout
|
Jul 08, 2022 |
"It’s Probably Not Lithium" by Natália Coelho Mendonça
|
Jul 05, 2022 |
"What Are You Tracking In Your Head?" by John Wentworth
|
Jul 02, 2022 |
"Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment" by elspood
|
Jun 29, 2022 |
"Where I agree and disagree with Eliezer" by Paul Christiano
|
Jun 22, 2022 |
"Six Dimensions of Operational Adequacy in AGI Projects" by Eliezer Yudkowsky
|
Jun 21, 2022 |
"Moses and the Class Struggle" by lsusr
|
Jun 21, 2022 |
"Benign Boundary Violations" by Duncan Sabien
|
Jun 20, 2022 |
"AGI Ruin: A List of Lethalities" by Eliezer Yudkowsky
|
Jun 20, 2022 |