LessWrong Curated Podcast

By LessWrong

Image by LessWrong

Category: Technology

Subscribers: 16
Reviews: 0
Episodes: 240

Description

Audio version of the posts shared in the LessWrong Curated newsletter.

Episode / Date
[HUMAN VOICE] "My Clients, The Liars" by ymeskhout
Mar 20, 2024
[HUMAN VOICE] "Deep atheism and AI risk" by Joe Carlsmith
Mar 20, 2024
[HUMAN VOICE] "CFAR Takeaways: Andrew Critch" by Raemon
Mar 10, 2024
[HUMAN VOICE] "Speaking to Congressional staffers about AI risk" by Akash, hath
Mar 10, 2024
Many arguments for AI x-risk are wrong
Mar 09, 2024
Tips for Empirical Alignment Research
Mar 07, 2024
Timaeus’s First Four Months
Feb 29, 2024
Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”
Feb 23, 2024
[HUMAN VOICE] "And All the Shoggoths Merely Players" by Zack_M_Davis
Feb 20, 2024
[HUMAN VOICE] "Updatelessness doesn't solve most problems" by Martín Soto
Feb 20, 2024
Every “Every Bay Area House Party” Bay Area House Party
Feb 19, 2024
2023 Survey Results
Feb 19, 2024
Raising children on the eve of AI
Feb 18, 2024
“No-one in my org puts money in their pension”
Feb 18, 2024
Masterpiece
Feb 16, 2024
CFAR Takeaways: Andrew Critch
Feb 15, 2024
[HUMAN VOICE] "Believing In" by Anna Salamon
Feb 14, 2024
[HUMAN VOICE] "Attitudes about Applied Rationality" by Camille Berger
Feb 14, 2024
Scale Was All We Needed, At First
Feb 14, 2024
Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy
Feb 11, 2024
[HUMAN VOICE] "A Shutdown Problem Proposal" by johnswentworth, David Lorell
Feb 09, 2024
Brute Force Manufactured Consensus is Hiding the Crime of the Century
Feb 04, 2024
[HUMAN VOICE] "Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI" by Jeremy Gillen, peterbarnett
Feb 03, 2024
Leading The Parade
Feb 02, 2024
[HUMAN VOICE] "The case for ensuring that powerful AIs are controlled" by ryan_greenblatt, Buck
Feb 02, 2024
Processor clock speeds are not how fast AIs think
Feb 01, 2024
Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI
Jan 31, 2024
Making every researcher seek grants is a broken model
Jan 29, 2024
The case for training frontier AIs on Sumerian-only corpus
Jan 28, 2024
This might be the last AI Safety Camp
Jan 25, 2024
[HUMAN VOICE] "There is way too much serendipity" by Malmesbury
Jan 22, 2024
[HUMAN VOICE] "How useful is mechanistic interpretability?" by ryan_greenblatt, Neel Nanda, Buck, habryka
Jan 20, 2024
[HUMAN VOICE] "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training" by evhub et al
Jan 20, 2024
The impossible problem of due process
Jan 17, 2024
[HUMAN VOICE] "Gentleness and the artificial Other" by Joe Carlsmith
Jan 14, 2024
Introducing Alignment Stress-Testing at Anthropic
Jan 14, 2024
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Jan 13, 2024
[HUMAN VOICE] "Meaning & Agency" by Abram Demski
Jan 07, 2024
What’s up with LLMs representing XORs of arbitrary features?
Jan 07, 2024
Gentleness and the artificial Other
Jan 05, 2024
MIRI 2024 Mission and Strategy Update
Jan 05, 2024
The Plan - 2023 Version
Jan 04, 2024
Apologizing is a Core Rationalist Skill
Jan 03, 2024
[HUMAN VOICE] "A case for AI alignment being difficult" by jessicata
Jan 02, 2024
The Dark Arts
Jan 01, 2024
Critical review of Christiano’s disagreements with Yudkowsky
Dec 28, 2023
Most People Don’t Realize We Have No Idea How Our AIs Work
Dec 27, 2023
Discussion: Challenges with Unsupervised LLM Knowledge Discovery
Dec 26, 2023
Succession
Dec 24, 2023
Nonlinear’s Evidence: Debunking False and Misleading Claims
Dec 21, 2023
Effective Aspersions: How the Nonlinear Investigation Went Wrong
Dec 20, 2023
Constellations are Younger than Continents
Dec 20, 2023
The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda
Dec 19, 2023
“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity
Dec 18, 2023
Is being sexy for your homies?
Dec 17, 2023
[HUMAN VOICE] "Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible" by Gene Smith and Kman
Dec 17, 2023
[HUMAN VOICE] "Moral Reality Check (a short story)" by jessicata
Dec 15, 2023
AI Control: Improving Safety Despite Intentional Subversion
Dec 15, 2023
2023 Unofficial LessWrong Census/Survey
Dec 13, 2023
The likely first longevity drug is based on sketchy science. This is bad for science and bad for longevity.
Dec 13, 2023
[HUMAN VOICE] "What are the results of more parental supervision and less outdoor play?" by Julia Wise
Dec 13, 2023
Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible
Dec 12, 2023
re: Yudkowsky on biological materials
Dec 11, 2023
Speaking to Congressional staffers about AI risk
Dec 05, 2023
[HUMAN VOICE] "Shallow review of live agendas in alignment & safety" by technicalities & Stag
Dec 04, 2023
Thoughts on “AI is easy to control” by Pope & Belrose
Dec 02, 2023
The 101 Space You Will Always Have With You
Nov 30, 2023
[HUMAN VOICE] "Social Dark Matter" by Duncan Sabien
Nov 28, 2023
Shallow review of live agendas in alignment & safety
Nov 28, 2023
Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense
Nov 25, 2023
[HUMAN VOICE] "The 6D effect: When companies take risks, one email can be very powerful." by scasper
Nov 23, 2023
OpenAI: The Battle of the Board
Nov 22, 2023
OpenAI: Facts from a Weekend
Nov 20, 2023
Sam Altman fired from OpenAI
Nov 18, 2023
Social Dark Matter
Nov 17, 2023
"You can just spontaneously call people you haven't met in years" by lc
Nov 17, 2023
[HUMAN VOICE] "Thinking By The Clock" by Screwtape
Nov 17, 2023
"EA orgs' legal structure inhibits risk taking and information sharing on the margin" by Elizabeth
Nov 17, 2023
[HUMAN VOICE] "AI Timelines" by habryka, Daniel Kokotajlo, Ajeya Cotra, Ege Erdil
Nov 17, 2023
"Integrity in AI Governance and Advocacy" by habryka, Olivia Jimenez
Nov 17, 2023
Loudly Give Up, Don’t Quietly Fade
Nov 16, 2023
[HUMAN VOICE] "Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds
Nov 09, 2023
[HUMAN VOICE] "Deception Chess: Game #1" by Zane et al.
Nov 09, 2023
"Does davidad's uploading moonshot work?" by jacobjabob et al.
Nov 09, 2023
"The other side of the tidal wave" by Katja Grace
Nov 09, 2023
"The 6D effect: When companies take risks, one email can be very powerful." by scasper
Nov 09, 2023
Comp Sci in 2027 (Short story by Eliezer Yudkowsky)
Nov 09, 2023
"My thoughts on the social response to AI risk" by Matthew Barnett
Nov 09, 2023
"Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk" by 1a3orn
Nov 09, 2023
"President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence" by Tristan Williams
Nov 03, 2023
"Thoughts on the AI Safety Summit company policy requests and responses" by So8res
Nov 03, 2023
[HUMAN VOICE] "Book Review: Going Infinite" by Zvi
Oct 31, 2023
"Announcing Timaeus" by Jesse Hoogland et al.
Oct 30, 2023
"Thoughts on responsible scaling policies and regulation" by Paul Christiano
Oct 30, 2023
"AI as a science, and three obstacles to alignment strategies" by Nate Soares
Oct 30, 2023
"Architects of Our Own Demise: We Should Stop Developing AI" by Roko
Oct 30, 2023
"At 87, Pearl is still able to change his mind" by rotatingpaguro
Oct 30, 2023
"We're Not Ready: thoughts on "pausing" and responsible scaling policies" by Holden Karnofsky
Oct 30, 2023
[HUMAN VOICE] "Alignment Implications of LLM Successes: a Debate in One Act" by Zack M Davis
Oct 23, 2023
"LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" by Simon Lermen & Jeffrey Ladish.
Oct 23, 2023
"Holly Elmore and Rob Miles dialogue on AI Safety Advocacy" by jacobjacob, Robert Miles & Holly_Elmore
Oct 23, 2023
"Labs should be explicit about why they are building AGI" by Peter Barnett
Oct 19, 2023
[HUMAN VOICE] "Sum-threshold attacks" by TsviBT
Oct 18, 2023
"Will no one rid me of this turbulent pest?" by Metacelsus
Oct 18, 2023
"RSPs are pauses done right" by evhub
Oct 15, 2023
[HUMAN VOICE] "Inside Views, Impostor Syndrome, and the Great LARP" by John Wentworth
Oct 15, 2023
"Cohabitive Games so Far" by mako yass
Oct 15, 2023
"Announcing MIRI’s new CEO and leadership team" by Gretta Duleba
Oct 15, 2023
"Comparing Anthropic's Dictionary Learning to Ours" by Robert_AIZI
Oct 15, 2023
"Announcing Dialogues" by Ben Pace
Oct 09, 2023
"Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds
Oct 09, 2023
"Evaluating the historical value misspecification argument" by Matthew Barnett
Oct 09, 2023
"Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn" by Zvi
Oct 09, 2023
"Thomas Kwa's MIRI research experience" by Thomas Kwa and others
Oct 06, 2023
"EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem" by Elizabeth
Oct 03, 2023
"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.
Oct 03, 2023
"The Lighthaven Campus is open for bookings" by Habryka
Oct 03, 2023
"'Diamondoid bacteria' nanobots: deadly threat or dead-end? A nanotech investigation" by titotal
Oct 03, 2023
"The King and the Golem" by Richard Ngo
Sep 29, 2023
"Sparse Autoencoders Find Highly Interpretable Directions in Language Models" by Logan Riggs et al
Sep 27, 2023
"Inside Views, Impostor Syndrome, and the Great LARP" by John Wentworth
Sep 26, 2023
"There should be more AI safety orgs" by Marius Hobbhahn
Sep 25, 2023
"The Talk: a brief explanation of sexual dimorphism" by Malmesbury
Sep 22, 2023
"A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX" by jacobjacob
Sep 20, 2023
"AI presidents discuss AI alignment agendas" by TurnTrout & Garrett Baker
Sep 19, 2023
"UDT shows that decision theory is more puzzling than ever" by Wei Dai
Sep 18, 2023
"Sum-threshold attacks" by TsviBT
Sep 11, 2023
"A list of core AI safety problems and how I hope to solve them" by Davidad
Sep 09, 2023
"Report on Frontier Model Training" by Yafah Edelman
Sep 09, 2023
"Defunding My Mistake" by ymeskhout
Sep 08, 2023
"Sharing Information About Nonlinear" by Ben Pace
Sep 08, 2023
"One Minute Every Moment" by abramdemski
Sep 08, 2023
"What I would do if I wasn’t at ARC Evals" by LawrenceC
Sep 08, 2023
"The U.S. is becoming less stable" by lc
Sep 04, 2023
"Meta Questions about Metaphilosophy" by Wei Dai
Sep 04, 2023
"OpenAI API base models are not sycophantic, at any size" by Nostalgebraist
Sep 04, 2023
"Dear Self; we need to talk about ambition" by Elizabeth
Aug 30, 2023
"Book Launch: "The Carving of Reality," Best of LessWrong vol. III" by Raemon
Aug 28, 2023
"Assume Bad Faith" by Zack_M_Davis
Aug 28, 2023
"Large Language Models will be Great for Censorship" by Ethan Edwards
Aug 23, 2023
"Ten Thousand Years of Solitude" by agp
Aug 22, 2023
"6 non-obvious mental health issues specific to AI safety" by Igor Ivanov
Aug 22, 2023
"Against Almost Every Theory of Impact of Interpretability" by Charbel-Raphaël
Aug 21, 2023
"Inflection.ai is a major AGI lab" by Nikola
Aug 15, 2023
"Feedbackloop-first Rationality" by Raemon
Aug 15, 2023
"When can we trust model evaluations?" bu evhub
Aug 09, 2023
"Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research" by evhub, Nicholas Schiefer, Carson Denison, Ethan Perez
Aug 09, 2023
"My current LK99 questions" by Eliezer Yudkowsky
Aug 04, 2023
"The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate" by Adam David Long
Aug 04, 2023
"ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks" by Beth Barnes
Aug 04, 2023
"Thoughts on sharing information about language model capabilities" by paulfchristiano
Aug 02, 2023
"Yes, It's Subjective, But Why All The Crabs?" by johnswentworth
Jul 31, 2023
"Self-driving car bets" by paulfchristiano
Jul 31, 2023
"Cultivating a state of mind where new ideas are born" by Henrik Karlsson
Jul 31, 2023
"Rationality !== Winning" by Raemon
Jul 28, 2023
"Brain Efficiency Cannell Prize Contest Award Ceremony" by Alexander Gietelink Oldenziel
Jul 28, 2023
"Grant applications and grand narratives" by Elizabeth
Jul 28, 2023
"Cryonics and Regret" by MvB
Jul 28, 2023
"Unifying Bargaining Notions (2/2)" by Diffractor
Jun 12, 2023
"The ants and the grasshopper" by Richard Ngo
Jun 06, 2023
"Steering GPT-2-XL by adding an activation vector" by TurnTrout et al.
May 18, 2023
"An artificially structured argument for expecting AGI ruin" by Rob Bensinger
May 16, 2023
"How much do you believe your results?" by Eric Neyman
May 10, 2023
"Mental Health and the Alignment Problem: A Compilation of Resources (updated April 2023)" by Chris Scammell & DivineMango
Apr 27, 2023
"On AutoGPT" by Zvi
Apr 19, 2023
"GPTs are Predictors, not Imitators" by Eliezer Yudkowsky
Apr 12, 2023
"Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky
Apr 05, 2023
"A stylized dialogue on John Wentworth's claims about markets and optimization" by Nate Soares
Apr 05, 2023
"Deep Deceptiveness" by Nate Soares
Apr 05, 2023
"The Onion Test for Personal and Institutional Honesty" by Chana Messinger & Andrew Critch
Mar 28, 2023
"There’s no such thing as a tree (phylogenetically)" by Eukaryote
Mar 28, 2023
"Losing the root for the tree" by Adam Zerner
Mar 28, 2023
"Lies, Damn Lies, and Fabricated Options" by Duncan Sabien
Mar 28, 2023
"Why I think strong general AI is coming soon" by Porby
Mar 28, 2023
"It Looks Like You’re Trying To Take Over The World" by Gwern
Mar 28, 2023
"What failure looks like" by Paul Christiano
Mar 28, 2023
"More information about the dangerous capability evaluations we did with GPT-4 and Claude." by Beth Barnes
Mar 21, 2023
""Carefully Bootstrapped Alignment" is organizationally hard" by Raemon
Mar 21, 2023
"The Parable of the King and the Random Process" by moridinamael
Mar 14, 2023
"Enemies vs Malefactors" by Nate Soares
Mar 14, 2023
"The Waluigi Effect (mega-post)" by Cleo Nardo
Mar 08, 2023
"Acausal normalcy" by Andrew Critch
Mar 06, 2023
"Please don't throw your mind away" by TsviBT
Mar 01, 2023
"Cyborgism" by Nicholas Kees & Janus
Feb 15, 2023
"Childhoods of exceptional people" by Henrik Karlsson
Feb 14, 2023
"What I mean by "alignment is in large part about making cognition aimable at all"" by Nate Soares
Feb 13, 2023
"On not getting contaminated by the wrong obesity ideas" by Natália Coelho Mendonça
Feb 10, 2023
"SolidGoldMagikarp (plus, prompt generation)"
Feb 08, 2023
"Focus on the places where you feel shocked everyone's dropping the ball" by Nate Soares
Feb 03, 2023
"Basics of Rationalist Discourse" by Duncan Sabien
Feb 02, 2023
"Sapir-Whorf for Rationalists" by Duncan Sabien
Jan 31, 2023
"My Model Of EA Burnout" by Logan Strohl
Jan 31, 2023
"The Social Recession: By the Numbers" by Anton Stjepan Cebalo
Jan 25, 2023
"Recursive Middle Manager Hell" by Raemon
Jan 24, 2023
"How 'Discovering Latent Knowledge in Language Models Without Supervision' Fits Into a Broader Alignment Scheme" by Collin
Jan 12, 2023
"Models Don't 'Get Reward'" by Sam Ringer
Jan 12, 2023
"The Feeling of Idea Scarcity" by John Wentworth
Jan 12, 2023
"The next decades might be wild" by Marius Hobbhahn
Dec 21, 2022
"Lessons learned from talking to >100 academics about AI safety" by Marius Hobbhahn
Nov 17, 2022
"How my team at Lightcone sometimes gets stuff done" by jacobjacob
Nov 10, 2022
"Decision theory does not imply that we get to have nice things" by So8res
Nov 08, 2022
"What 2026 looks like" by Daniel Kokotajlo
Nov 07, 2022
Counterarguments to the basic AI x-risk case
Nov 04, 2022
"Introduction to abstract entropy" by Alex Altair
Oct 29, 2022
"Consider your appetite for disagreements" by Adam Zerner
Oct 25, 2022
"My resentful story of becoming a medical miracle" by Elizabeth
Oct 21, 2022
"The Redaction Machine" by Ben
Oct 02, 2022
"Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover" by Ajeya Cotra
Sep 27, 2022
"The shard theory of human values" by Quintin Pope & TurnTrout
Sep 22, 2022
"Two-year update on my personal AI timelines" by Ajeya Cotra
Sep 22, 2022
"You Are Not Measuring What You Think You Are Measuring" by John Wentworth
Sep 21, 2022
"Do bamboos set themselves on fire?" by Malmesbury
Sep 20, 2022
"Toni Kurz and the Insanity of Climbing Mountains" by Gene Smith
Sep 18, 2022
"Deliberate Grieving" by Raemon
Sep 18, 2022
"Survey advice" by Katja Grace
Sep 18, 2022
"Language models seem to be much better than humans at next-token prediction" by Buck, Fabien and LawrenceC
Sep 15, 2022
"Humans are not automatically strategic" by Anna Salamon
Sep 15, 2022
"Local Validity as a Key to Sanity and Civilization" by Eliezer Yudkowsky
Sep 15, 2022
"Toolbox-thinking and Law-thinking" by Eliezer Yudkowsky
Sep 15, 2022
"Moral strategies at different capability levels" by Richard Ngo
Sep 14, 2022
"Worlds Where Iterative Design Fails" by John Wentworth
Sep 11, 2022
"(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland
Sep 11, 2022
"Unifying Bargaining Notions (1/2)" by Diffractor
Sep 09, 2022
"Simulators" by Janus
Sep 05, 2022
"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope
Aug 08, 2022
"Changing the world through slack & hobbies" by Steven Byrnes
Jul 30, 2022
"«Boundaries», Part 1: a key missing concept from utility theory" by Andrew Critch
Jul 28, 2022
"ITT-passing and civility are good; "charity" is bad; steelmanning is niche" by Rob Bensinger
Jul 24, 2022
"What should you change in response to an "emergency"? And AI risk" by Anna Salamon
Jul 23, 2022
"On how various plans miss the hard bits of the alignment challenge" by Nate Soares
Jul 17, 2022
"Humans are very reliable agents" by Alyssa Vance
Jul 13, 2022
"Looking back on my alignment PhD" by TurnTrout
Jul 08, 2022
"It’s Probably Not Lithium" by Natália Coelho Mendonça
Jul 05, 2022
"What Are You Tracking In Your Head?" by John Wentworth
Jul 02, 2022
"Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment" by elspood
Jun 29, 2022
"Where I agree and disagree with Eliezer" by Paul Christiano
Jun 22, 2022
"Six Dimensions of Operational Adequacy in AGI Projects" by Eliezer Yudkowsky
Jun 21, 2022
"Moses and the Class Struggle" by lsusr
Jun 21, 2022
"Benign Boundary Violations" by Duncan Sabien
Jun 20, 2022
"AGI Ruin: A List of Lethalities" by Eliezer Yudkowsky
Jun 20, 2022