LessWrong (30+ Karma)

By LessWrong

Category: Technology

Subscribers: 1
Reviews: 0
Episodes: 500

Description

Audio narrations of LessWrong posts.

Episode Date
“Beware unfinished bridges” by Adam Zerner
May 12, 2024
“New intro textbook on AIXI” by Alex_Altair
May 12, 2024
“Can we build a better Public Doublecrux?” by Raemon
May 12, 2024
“Podcast with Yoshua Bengio on Why AI Labs are ‘Playing Dice with Humanity’s Future’” by garrison
May 11, 2024
“MATS Winter 2023-24 Retrospective” by Rocket, Ryan Kidd, LauraVaughan, McKennaFitzgerald, Christian Smith, Juan Gil, Henry Sleight
May 11, 2024
“shortest goddamn bayes guide ever” by lukehmiles
May 10, 2024
“How to be an amateur polyglot” by arisAlexis
May 10, 2024
“My thesis (Algorithmic Bayesian Epistemology) explained in more depth” by Eric Neyman
May 10, 2024
“We might be missing some key feature of AI takeoff; it’ll probably seem like ‘we could’ve seen this coming’” by Lukas_Gloor
May 10, 2024
“AI #63: Introducing Alpha Fold 3” by Zvi
May 10, 2024
“Why Care About Natural Latents?” by johnswentworth, David Lorell
May 10, 2024
“Dyslucksia” by Shoshannah Tekofsky
May 09, 2024
“some thoughts on LessOnline” by Raemon
May 09, 2024
“Dating Roundup #3: Third Time’s the Charm” by Zvi
May 09, 2024
[Linkpost] “Designing for a single purpose” by Itay Dreyfus
May 08, 2024
“Deep Honesty” by Aletheophile
May 07, 2024
“Observations on Teaching for Four Weeks” by ClareChiaraVincent
May 07, 2024
“Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence” by Towards_Keeperhood, Davanchama
May 06, 2024
[Linkpost] “Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant” by Olli Järviniemi, evhub
May 06, 2024
“Explaining a Math Magic Trick” by Robert_AIZI
May 05, 2024
[Linkpost] “introduction to cancer vaccines” by bhauth
May 05, 2024
“AI #61: Meta Trouble” by Zvi
May 05, 2024
[Linkpost] “S-Risks: Fates Worse Than Extinction” by aggliu, Writer
May 05, 2024
“Introducing AI-Powered Audiobooks of Rational Fiction Classics” by Askwho
May 04, 2024
“Now THIS is forecasting: understanding Epoch’s Direct Approach” by Elliot_Mckernon, Zershaaneh Qureshi
May 04, 2024
[Linkpost] “My hour of memoryless lucidity” by Eric Neyman
May 04, 2024
“Apply to ESPR & PAIR, Rationality and AI Camps for Ages 16-21” by Anna Gajdova
May 04, 2024
[Linkpost] “‘AI Safety for Fleshy Humans’ an AI Safety explainer by Nicky Case” by habryka
May 03, 2024
“Key takeaways from our EA and alignment research surveys” by Cameron Berg, Judd Rosenblatt, florin_pop, AE Studio
May 03, 2024
“AI #62: Too Soon to Tell” by Zvi
May 03, 2024
“Mechanistic Interpretability Workshop Happening at ICML 2024!” by Neel Nanda
May 03, 2024
“Q&A on Proposed SB 1047” by Zvi
May 02, 2024
“Please stop publishing ideas/insights/research about AI” by Tamsin Leake
May 02, 2024
“Manifund Q1 Retro: Learnings from impact certs” by Austin Chen
May 02, 2024
“An explanation of evil in an organized world” by KatjaGrace
May 02, 2024
“Take SCIFs, it’s dangerous to go alone” by latterframe, Jeffrey Ladish, schroederdewitt
May 02, 2024
“The Intentional Stance, LLMs Edition” by Eleni Angelou
May 01, 2024
“ACX Covid Origins Post convinced readers” by ErnestScribbler
May 01, 2024
“LessWrong Community Weekend 2024, open for applications” by UnplannedCauliflower, jt
May 01, 2024
“Transcoders enable fine-grained interpretable circuit analysis for language models” by Jacob Dunefsky, Philippe Chlenski, Neel Nanda
May 01, 2024
“Questions for labs” by Zach Stein-Perlman
May 01, 2024
“Mechanistically Eliciting Latent Behaviors in Language Models” by Andrew Mack
Apr 30, 2024
“Why I’m doing PauseAI” by Joseph Miller
Apr 30, 2024
[Linkpost] “Introducing AI Lab Watch” by Zach Stein-Perlman
Apr 30, 2024
“Towards a formalization of the agent structure problem” by Alex_Altair
Apr 30, 2024
“Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers” by hugofry
Apr 30, 2024
“Ironing Out the Squiggles” by Zack_M_Davis
Apr 29, 2024
“List your AI X-Risk cruxes!” by Aryeh Englander
Apr 29, 2024
“AISC9 has ended and there will be an AISC10” by Linda Linsefors
Apr 29, 2024
“[Aspiration-based designs] 1. Informal introduction” by B Jacobs, Jobst Heitzig, Simon Fischer, Simon Dima
Apr 29, 2024
“Constructability: Plainly-coded AGIs may be feasible in the near future” by Épiphanie Gédéon, Charbel-Raphaël
Apr 28, 2024
“We are headed into an extreme compute overhang” by devrandom
Apr 28, 2024
“So What’s Up With PUFAs Chemically?” by J Bostock
Apr 27, 2024
“Refusal in LLMs is mediated by a single direction” by Andy Arditi, Oscar Balcells Obeso, Aaquib111, wesg, Neel Nanda
Apr 27, 2024
“D&D.Sci Long War: Defender of Data-mocracy” by aphyer
Apr 27, 2024
“Superposition is not ‘just’ neuron polysemanticity” by LawrenceC
Apr 27, 2024
“Duct Tape security” by Isaac King
Apr 26, 2024
“On Not Pulling The Ladder Up Behind You” by Screwtape
Apr 26, 2024
“Scaling of AI training runs will slow down after GPT-5” by Maxime Riché
Apr 26, 2024
“Spatial attention as a ‘tell’ for empathetic simulation?” by Steven Byrnes
Apr 26, 2024
“Losing Faith In Contrarianism” by omnizoid
Apr 26, 2024
[Linkpost] “LLMs seem (relatively) safe” by JustisMills
Apr 26, 2024
[Linkpost] “Improving Dictionary Learning with Gated Sparse Autoencoders” by Neel Nanda
Apr 25, 2024
[Linkpost] “WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals” by trevor
Apr 25, 2024
[Linkpost] “‘Why I Write’ by George Orwell (1946)” by Arjun Panickssery
Apr 25, 2024
“The first future and the best future” by KatjaGrace
Apr 25, 2024
[Linkpost] “The Inner Ring by C. S. Lewis” by Saul Munn
Apr 25, 2024
“This is Water by David Foster Wallace” by Nathan Young
Apr 25, 2024
“Changes in College Admissions” by Zvi
Apr 24, 2024
“Examples of Highly Counterfactual Discoveries?” by johnswentworth
Apr 24, 2024
[Linkpost] “On what research policymakers actually need” by MondSemmel
Apr 24, 2024
[Linkpost] “Let’s Design A School, Part 1” by Sable
Apr 24, 2024
[Linkpost] “Dequantifying first-order theories” by jessicata
Apr 24, 2024
[Linkpost] “Simple probes can catch sleeper agents” by Monte M, Carson Denison, Zac Hatfield-Dodds, evhub
Apr 23, 2024
“Rejecting Television” by Declan Molony
Apr 23, 2024
“Forget Everything (Statistical Mechanics Part 1)” by J Bostock
Apr 23, 2024
“Funny Anecdote of Eliezer From His Sister” by Daniel Birnbaum
Apr 22, 2024
[Linkpost] “Motivation gaps: Why so much EA criticism is hostile and lazy” by titotal
Apr 22, 2024
“Priors and Prejudice” by MathiasKB
Apr 22, 2024
[Linkpost] “AI Regulation is Unsafe” by Maxwell Tabarrok
Apr 22, 2024
“On Llama-3 and Dwarkesh Patel’s Podcast with Zuckerberg” by Zvi
Apr 22, 2024
“Transfer Learning in Humans” by niplav
Apr 22, 2024
“A couple productivity tips for overthinkers” by Steven Byrnes
Apr 20, 2024
“What’s up with all the non-Mormons? Weirdly specific universalities across LLMs” by mwatkins
Apr 20, 2024
[Linkpost] “Thoughts on seed oil” by dynomight
Apr 20, 2024
“Inducing Unprompted Misalignment in LLMs” by Sam Svenningsen, evhub, Henry Sleight
Apr 20, 2024
“Progress Update #1 from the GDM Mech Interp Team: Summary” by Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma
Apr 20, 2024
“Progress Update #1 from the GDM Mech Interp Team: Full Update” by Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma
Apr 19, 2024
[Linkpost] “Daniel Dennett has died (1942-2024)” by kave
Apr 19, 2024
“Experiment on repeating choices” by KatjaGrace
Apr 19, 2024
“[Fiction] A Confession” by Arjun Panickssery
Apr 19, 2024
“I’m open for projects (sort of)” by cousin_it
Apr 18, 2024
“AI #60: Oh the Humanity” by Zvi
Apr 18, 2024
“Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight” by Sam Marks
Apr 18, 2024
“The Mom Test: Summary and Thoughts” by Adam Zerner
Apr 18, 2024
“Childhood and Education Roundup #5” by Zvi
Apr 18, 2024
[Linkpost] “Claude 3 Opus can operate as a Turing machine” by Gunnar_Zarncke
Apr 18, 2024
“Express interest in an ‘FHI of the West’” by habryka
Apr 18, 2024
“Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer” by johnswentworth, David Lorell
Apr 18, 2024
“Effectively Handling Disagreements - Introducing a New Workshop” by Camille Berger
Apr 17, 2024
[Linkpost] “Moving on from community living” by Vika
Apr 17, 2024
[Linkpost] “FHI (Future of Humanity Institute) has shut down (2005–2024)” by gwern
Apr 17, 2024
“When is a mind me?” by Rob Bensinger
Apr 17, 2024
“Mid-conditional love” by KatjaGrace
Apr 17, 2024
“Transformers Represent Belief State Geometry in their Residual Stream” by Adam Shai
Apr 16, 2024
[Linkpost] “Paul Christiano named as US AI Safety Institute Head of AI Safety” by Joel Burget
Apr 16, 2024
[Linkpost] “Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes” by owencb, AI Impacts
Apr 16, 2024
“Monthly Roundup #17: April 2024” by Zvi
Apr 16, 2024
“Anthropic AI made the right call” by bhauth
Apr 16, 2024
“My experience using financial commitments to overcome akrasia” by William Howard
Apr 15, 2024
[Linkpost] “A High Decoupling Failure” by Maxwell Tabarrok
Apr 15, 2024
“Reconsider the anti-cavity bacteria if you are Asian” by Lao Mein
Apr 15, 2024
“Text Posts from the Kids Group: 2020” by jefftk
Apr 14, 2024
“Prompts for Big-Picture Planning” by Raemon
Apr 14, 2024
“Things Solenoid Narrates” by Solenoid_Entity
Apr 13, 2024
[Linkpost] “Carl Sagan, nuking the moon, and not nuking the moon” by eukaryote
Apr 13, 2024
[Linkpost] “MIRI’s April 2024 Newsletter” by Harlan, Rob Bensinger
Apr 13, 2024
“UDT1.01: Plannable and Unplanned Observations (3/10)” by Diffractor
Apr 12, 2024
“Generalized Stat Mech: The Boltzmann Approach” by David Lorell, johnswentworth
Apr 12, 2024
“A D&D.Sci Dodecalogue” by abstractapplic
Apr 12, 2024
“Announcing Atlas Computing” by miyazono
Apr 12, 2024
“A Gentle Introduction to Risk Frameworks Beyond Forecasting” by pendingsurvival
Apr 12, 2024
“How I select alignment research projects” by Ethan Perez, Henry Sleight, Mikita Balesni
Apr 11, 2024
“D&D.Sci: The Mad Tyrant’s Pet Turtles [Evaluation and Ruleset]” by abstractapplic
Apr 10, 2024
“RTFB: On the New Proposed CAIP AI Bill” by Zvi
Apr 10, 2024
“Medical Roundup #2” by Zvi
Apr 10, 2024
“Ophiology (or, how the Mamba architecture works)” by Danielle Ensign, SrGonao, Adrià Garriga-alonso
Apr 09, 2024
[Linkpost] “Conflict in Posthuman Literature” by Martín Soto
Apr 09, 2024
“Math-to-English Cheat Sheet” by nahoj
Apr 09, 2024
“PIBBSS is hiring in a variety of roles (alignment research and incubation program)” by Nora_Ammann, Lucas Teixeira, DusanDNesic
Apr 09, 2024
“Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition” by cmathw, Dennis Akar, Lee Sharkey
Apr 09, 2024
[Linkpost] “on the dollar-yen exchange rate” by bhauth
Apr 08, 2024
“How We Picture Bayesian Agents” by johnswentworth, David Lorell
Apr 08, 2024
“A Dozen Ways to Get More Dakka” by Davidmanheim
Apr 08, 2024
“My intellectual journey to (dis)solve the hard problem of consciousness” by Charbel-Raphaël
Apr 07, 2024
“Koan: divining alien datastructures from RAM activations” by TsviBT
Apr 07, 2024
“‘Fractal Strategy’ workshop report” by Raemon
Apr 07, 2024
[Linkpost] “The 2nd Demographic Transition” by Maxwell Tabarrok
Apr 07, 2024
“On Complexity Science” by Garrett Baker
Apr 05, 2024
“What’s with all the bans recently?” by Gerald Monroe
Apr 05, 2024
“Partial value takeover without world takeover” by KatjaGrace
Apr 05, 2024
“New report: A review of the empirical evidence for existential risk from AI via misaligned power-seeking” by Harlan, rosehadshar
Apr 05, 2024
“AI #58: Stargate AGI” by Zvi
Apr 05, 2024
“Run evals on base models too!” by orthonormal
Apr 04, 2024
“LLMs for Alignment Research: a safety priority?” by abramdemski
Apr 04, 2024
“A gentle introduction to mechanistic anomaly detection” by Erik Jenner
Apr 04, 2024
“Best in Class Life Improvement” by sapphire
Apr 04, 2024
“Sparsify: A mechanistic interpretability research agenda” by Lee Sharkey
Apr 03, 2024
“Notes on Dwarkesh Patel’s Podcast with Sholto Douglas and Trenton Bricken” by Zvi
Apr 02, 2024
“Thousands of malicious actors on the future of AI misuse” by Zershaaneh Qureshi, Corin Katzke, Convergence Analysis
Apr 02, 2024
“Gradient Descent on the Human Brain” by Jozdien, gaspode
Apr 02, 2024
“Coherence of Caches and Agents” by johnswentworth
Apr 02, 2024
“Announcing Suffering For Good” by Garrett Baker
Apr 01, 2024
“OMMC Announces RIP” by Adam Scholl, aysja
Apr 01, 2024
“The Evolution of Humans Was Net-Negative for Human Values” by Zack_M_Davis
Apr 01, 2024
“A Selection of Randomly Selected SAE Features” by CallumMcDougall, Joseph Bloom
Apr 01, 2024
“Apply to be a Safety Engineer at Lockheed Martin!” by yanni
Apr 01, 2024
[Linkpost] “Introducing Open Asteroid Impact” by Linch
Apr 01, 2024
“The Story of ‘I Have Been A Good Bing’” by habryka, kave
Apr 01, 2024
“SAE-VIS: Announcement Post” by CallumMcDougall, Joseph Bloom
Mar 31, 2024
“The Best Tacit Knowledge Videos on Every Subject” by Parker Conley
Mar 31, 2024
“My simple AGI investment & insurance strategy” by lc
Mar 31, 2024
[Linkpost] “Metascience of the Vesuvius Challenge” by Maxwell Tabarrok
Mar 30, 2024
“Back to Basics: Truth is Unitary” by lsusr
Mar 30, 2024
“Your LLM Judge may be biased” by rachelAF, Henry Papadatos
Mar 30, 2024
“D&D.Sci: The Mad Tyrant’s Pet Turtles” by abstractapplic
Mar 30, 2024
“SAE reconstruction errors are (empirically) pathological” by wesg
Mar 29, 2024
“How to safely use an optimizer” by Simon Fischer
Mar 29, 2024
[Linkpost] “Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate” by trevor
Mar 28, 2024
“Was Releasing Claude-3 Net-Negative?” by Logan Riggs
Mar 28, 2024
[Linkpost] “Come to Manifest 2024 (June 7-9 in Berkeley)” by Saul Munn
Mar 28, 2024
[Linkpost] “The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review” by jessicata
Mar 27, 2024
[Linkpost] “Daniel Kahneman has died” by DanielFilan
Mar 27, 2024
[Linkpost] “Nick Bostrom’s new book, ‘Deep Utopia’, is out today” by PeterH
Mar 27, 2024
“AE Studio @ SXSW: We need more AI consciousness research (and further resources)” by AE Studio, Cameron Berg, Judd Rosenblatt, phgubbins, Diogo de Lucena
Mar 27, 2024
“Failures in Kindness” by silentbob
Mar 26, 2024
“Modern Transformers are AGI, and Human-Level” by abramdemski
Mar 26, 2024
“My Interview With Cade Metz on His Reporting About Slate Star Codex” by Zack_M_Davis
Mar 26, 2024
“Announcing Neuronpedia: Platform for accelerating research into Sparse Autoencoders” by Johnny Lin, Joseph Bloom
Mar 26, 2024
“Should rationalists be spiritual / Spirituality as overcoming delusion” by Kaj_Sotala, romeostevensit
Mar 26, 2024
“On attunement” by Joe Carlsmith
Mar 25, 2024
“My Detailed Notes & Commentary from Secular Solstice” by Jeffrey Heninger
Mar 25, 2024
“On Lex Fridman’s Second Podcast with Altman” by Zvi
Mar 25, 2024
“Do not delete your misaligned AGI.” by mako yass
Mar 25, 2024
“All About Concave and Convex Agents” by mako yass
Mar 25, 2024
“Vipassana Meditation and Active Inference: A Framework for Understanding Suffering and its Cessation” by Benjamin Sturgeon
Mar 24, 2024
“General Thoughts on Secular Solstice” by Jeffrey Heninger
Mar 24, 2024
“Dangers of Closed-Loop AI” by Gordon Seidoh Worley
Mar 23, 2024
“A Teacher vs. Everyone Else” by ronak69
Mar 23, 2024
“AI #56: Blackwell That Ends Well” by Zvi
Mar 23, 2024
[Linkpost] “Vernor Vinge, who coined the term ‘Technological Singularity’, dies at 79” by Kaj_Sotala
Mar 21, 2024
“ChatGPT can learn indirect control” by Raymond D
Mar 21, 2024
[Linkpost] “‘Deep Learning’ Is Function Approximation” by Zack_M_Davis
Mar 21, 2024
“On green” by Joe Carlsmith
Mar 21, 2024
[Linkpost] “DeepMind: Evaluating Frontier Models for Dangerous Capabilities” by Zach Stein-Perlman
Mar 21, 2024
“On the Gladstone Report” by Zvi
Mar 21, 2024
“Stagewise Development in Neural Networks” by Jesse Hoogland, Liam Carroll, Daniel Murfet
Mar 20, 2024
“Monthly Roundup #16: March 2024” by Zvi
Mar 20, 2024
“Natural Latents: The Concepts” by johnswentworth, David Lorell
Mar 20, 2024
“New report: Safety Cases for AI” by joshc
Mar 20, 2024
[Linkpost] “Increasing IQ by 10 Points is Possible” by George3d6
Mar 19, 2024
[Linkpost] “Inferring the model dimension of API-protected LLMs” by Ege Erdil
Mar 19, 2024
“Experimentation (Part 7 of ‘The Sense Of Physical Necessity’)” by LoganStrohl
Mar 19, 2024
“Neuroscience and Alignment” by Garrett Baker
Mar 19, 2024
[Linkpost] “Toki pona FAQ” by dkl9
Mar 19, 2024
“5 Physics Problems” by DaemonicSigil, Muireall
Mar 18, 2024
“Measuring Coherence of Policies in Toy Environments” by dx26, Richard_Ngo, Martín Soto
Mar 18, 2024
“Community Notes by X” by NicholasKees
Mar 18, 2024
“On Devin” by Zvi
Mar 18, 2024
“The Worst Form Of Government (Except For Everything Else We’ve Tried)” by johnswentworth
Mar 17, 2024
[Linkpost] “Anxiety vs. Depression” by Sable
Mar 17, 2024
[Linkpost] “My PhD thesis: Algorithmic Bayesian Epistemology” by Eric Neyman
Mar 16, 2024
[Linkpost] “How people stopped dying from diarrhea so much (& other life-saving decisions)” by Writer
Mar 16, 2024
“Rational Animations offers animation production and writing services!” by Writer
Mar 16, 2024
[Linkpost] “Introducing METR’s Autonomy Evaluation Resources” by Megan Kinniment, Beth Barnes
Mar 16, 2024
[Linkpost] “More people getting into AI safety should do a PhD” by AdamGleave
Mar 15, 2024
[Linkpost] “Constructive Cauchy sequences vs. Dedekind cuts” by jessicata
Mar 15, 2024
[Linkpost] “Conditional on Getting to Trade, Your Trade Wasn’t All That Great” by Ricki Heicklen
Mar 15, 2024
“Highlights from Lex Fridman’s interview of Yann LeCun” by Joel Burget
Mar 14, 2024
“How useful is ‘AI Control’ as a framing on AI X-Risk?” by habryka, ryan_greenblatt
Mar 14, 2024
“AI #55: Keep Clauding Along” by Zvi
Mar 14, 2024
“Jobs, Relationships, and Other Cults” by Ruby, Elizabeth
Mar 14, 2024
“On the Latest TikTok Bill” by Zvi
Mar 14, 2024
“‘Empiricism!’ as Anti-Epistemology” by Eliezer Yudkowsky
Mar 14, 2024
“Open consultancy: Letting untrusted AIs choose what answer to argue for” by Fabien Roger
Mar 13, 2024
[Linkpost] “Superforecasting the Origins of the Covid-19 Pandemic” by DanielFilan
Mar 13, 2024
“The Parable Of The Fallen Pendulum, Part 2” by johnswentworth
Mar 13, 2024
“OpenAI: The Board Expands” by Zvi
Mar 12, 2024
[Linkpost] “Results from an Adversarial Collaboration on AI Risk (FRI)” by Forecasting Research Institute
Mar 12, 2024
“‘Artificial General Intelligence’: an extremely brief FAQ” by Steven Byrnes
Mar 11, 2024
“Some (problematic) aesthetics of what constitutes good work in academia” by Steven Byrnes
Mar 11, 2024
“Twelve Lawsuits against OpenAI” by Remmelt
Mar 11, 2024
[Linkpost] “‘How could I have thought that faster?’” by mesaoptimizer
Mar 11, 2024
“Simple versus Short: Higher-order degeneracy and error-correction” by Daniel Murfet
Mar 11, 2024
“One-shot strategy games?” by Raemon
Mar 11, 2024
“Understanding SAE Features with the Logit Lens” by Joseph Bloom, hijohnnylin
Mar 11, 2024
“0th Person and 1st Person Logic” by Adele Lopez
Mar 11, 2024
[Linkpost] “Notes from a Prompt Factory” by Richard_Ngo
Mar 10, 2024
“Closeness To the Issue (Part 5 of ‘The Sense Of Physical Necessity’)” by LoganStrohl
Mar 09, 2024
“Lies and disrespect from the EA Infrastructure Fund” by Igor Ivanov
Mar 08, 2024
“Scenario Forecasting Workshop: Materials and Learnings” by elifland, charlie_griffin
Mar 08, 2024
“Woods’ new preprint on object permanence” by Steven Byrnes
Mar 08, 2024
“AI #54: Clauding Along” by Zvi
Mar 08, 2024
“MATS AI Safety Strategy Curriculum” by Ryan Kidd, Ronny Fernandez
Mar 08, 2024
[Linkpost] “Simple Kelly betting in prediction markets” by jessicata
Mar 07, 2024
“Mud and Despair (Part 4 of ‘The Sense Of Physical Necessity’)” by LoganStrohl
Mar 07, 2024
“Movie posters” by KatjaGrace
Mar 07, 2024
[Linkpost] “Using axis lines for good or evil” by dynomight
Mar 06, 2024
[Linkpost] “Research Report: Sparse Autoencoders find only 9/180 board state features in OthelloGPT” by Robert_AIZI
Mar 06, 2024
“On Claude 3.0” by Zvi
Mar 06, 2024
“Vote on Anthropic Topics to Discuss” by Ben Pace
Mar 06, 2024
“We Inspected Every Head In GPT-2 Small using SAEs So You Don’t Have To” by robertzk, Connor Kissane, Arthur Conmy, Neel Nanda
Mar 06, 2024
“My Clients, The Liars” by ymeskhout
Mar 05, 2024
“Social status part 1/2: negotiations over object-level preferences” by Steven Byrnes
Mar 05, 2024
“Read the Roon” by Zvi
Mar 05, 2024
“Many arguments for AI x-risk are wrong” by TurnTrout
Mar 05, 2024
[Linkpost] “Anthropic release Claude 3, claims >GPT-4 Performance” by LawrenceC
Mar 04, 2024
“Housing Roundup #7” by Zvi
Mar 04, 2024
“Are we so good to simulate?” by KatjaGrace
Mar 04, 2024
“The Broken Screwdriver and other parables” by bhauth
Mar 04, 2024
“Grief is a fire sale” by Nathan Young
Mar 04, 2024
“AI things that are perhaps as important as human-controlled AI (Chi version)” by Chi Nguyen
Mar 03, 2024
“Some costs of superposition” by Linda Linsefors
Mar 03, 2024
[Linkpost] “Self-Resolving Prediction Markets” by PeterMcCluskey
Mar 03, 2024
[Linkpost] “Agreeing With Stalin in Ways That Exhibit Generally Rationalist Principles” by Zack_M_Davis
Mar 02, 2024
“The World in 2029” by Nathan Young
Mar 02, 2024
[Linkpost] “If you weren’t such an idiot...” by kave
Mar 02, 2024
“Wholesomeness and Effective Altruism” by owencb
Mar 01, 2024
[Linkpost] “Increasing IQ is trivial” by George3d6
Mar 01, 2024
“Notes on Dwarkesh Patel’s Podcast with Demis Hassabis” by Zvi
Mar 01, 2024
“The Defence production act and AI policy” by NathanBarnard
Mar 01, 2024
“The Parable Of The Fallen Pendulum - Part 1” by johnswentworth
Mar 01, 2024
“Approaching Human-Level Forecasting with Language Models” by Fred Zhang, dannyhalawi, jsteinhardt
Feb 29, 2024
“AI #53: One More Leap” by Zvi
Feb 29, 2024
[Linkpost] “Bengio’s Alignment Proposal: ‘Towards a Cautious Scientist AI with Convergent Safety Bounds’” by mattmacdermott
Feb 29, 2024
“Tips for Empirical Alignment Research” by Ethan Perez
Feb 29, 2024
[Linkpost] “Post series on ‘Liability Law for reducing Existential Risk from AI’” by Nora_Ammann
Feb 29, 2024
“Locating My Eyes (Part 3 of ‘The Sense of Physical Necessity’)” by LoganStrohl
Feb 29, 2024
“Evidential cooperation in large worlds: Potential objections and FAQ” by Chi Nguyen
Feb 28, 2024
“Timaeus’s First Four Months” by Jesse Hoogland, Daniel Murfet, Stan van Wingerden, Alexander Gietelink Oldenziel
Feb 28, 2024
“Announcing ‘The LeastWrong’ and review winner post pages” by kave
Feb 28, 2024
“Counting arguments provide no evidence for AI doom” by Nora Belrose, Quintin Pope
Feb 27, 2024
“The Gemini Incident Continues” by Zvi
Feb 27, 2024
“How I internalized my achievements to better deal with negative feelings” by Raymond Koopmanschap
Feb 27, 2024
“Can we get an AI to do our alignment homework for us?” by Chris_Leong
Feb 27, 2024
“Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders” by Evan Anders, Joseph Bloom
Feb 27, 2024
“Acting Wholesomely” by owencb
Feb 26, 2024
“How I build and run behavioral interviews” by benkuhn
Feb 26, 2024
“China-AI forecasts” by NathanBarnard
Feb 25, 2024
[Linkpost] “Ideological Bayesians” by Kevin Dorst
Feb 25, 2024
“‘In-Context’ ‘Learning’” by Arjun Panickssery
Feb 25, 2024
“Cooperating with aliens and (distant) AGIs: An ECL explainer” by Chi Nguyen
Feb 24, 2024
“Choosing My Quest (Part 2 of ‘The Sense Of Physical Necessity’)” by LoganStrohl
Feb 24, 2024
“Rationality Research Report: Towards 10x OODA Looping?” by Raemon
Feb 24, 2024
[Linkpost] “We Need Major, But Not Radical, FDA Reform” by Maxwell Tabarrok
Feb 24, 2024
“Balancing Games” by jefftk
Feb 24, 2024
“How well do truth probes generalise?” by mishajw
Feb 24, 2024
“The Sense Of Physical Necessity: A Naturalism Demo (Introduction)” by LoganStrohl
Feb 24, 2024
“A starting point for making sense of task structure (in machine learning)” by Kaarel, RP, jake_mendel
Feb 24, 2024
“The Shutdown Problem: Incomplete Preferences as a Solution” by EJT
Feb 23, 2024
“Deep and obvious points in the gap between your thoughts and your pictures of thought” by KatjaGrace
Feb 23, 2024
“Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.” by Chi Nguyen
Feb 23, 2024
[Linkpost] “Contra Ngo et al. ‘Every ‘Every Bay Area House Party’ Bay Area House Party’” by Ricki Heicklen
Feb 22, 2024
“AI #52: Oops” by Zvi
Feb 22, 2024
“Gemini Has a Problem” by Zvi
Feb 22, 2024
[Linkpost] “Research Post: Tasks That Language Models Don’t Learn” by Bruce W. Lee
Feb 22, 2024
“Sora What” by Zvi
Feb 22, 2024
“Do sparse autoencoders find ‘true features’?” by Demian Till
Feb 22, 2024
“Everything Wrong with Roko’s Claims about an Engineered Pandemic” by EZ97
Feb 22, 2024
“The One and a Half Gemini” by Zvi
Feb 22, 2024
“The Byronic Hero Always Loses” by Cole Wyeth
Feb 22, 2024
“Job Listing: Managing Editor / Writer” by Gretta Duleba
Feb 21, 2024
“The Pareto Best and the Curse of Doom” by Screwtape
Feb 21, 2024
“Analogies between scaling labs and misaligned superintelligent AI” by scasper
Feb 21, 2024
“Dual Wielding Kindle Scribes” by mesaoptimizer
Feb 21, 2024
“Less Wrong automated systems are inadvertently Censoring me” by Roko
Feb 21, 2024
“AI #51: Altman’s Ambition” by Zvi
Feb 20, 2024
“Why does generalization work?” by Martín Soto
Feb 20, 2024
“Retirement Accounts and Short Timelines” by jefftk
Feb 19, 2024
“Protocol evaluations: good analogies vs control” by Fabien Roger
Feb 19, 2024
[Linkpost] “I’d also take $7 trillion” by bhauth
Feb 19, 2024
“On coincidences and Bayesian reasoning, as applied to the origins of COVID-19” by viking_math
Feb 19, 2024
“Things I’ve Grieved” by Raemon
Feb 18, 2024
“2023 Survey Results” by Screwtape
Feb 18, 2024
[Linkpost] “‘What if we could redesign society from scratch? The promise of charter cities.’ [Rational Animations video]” by Jackson Wagner
Feb 18, 2024
“Self-Awareness: Taxonomy and eval suite proposal” by Daniel Kokotajlo
Feb 17, 2024
“The Pointer Resolution Problem” by Jozdien
Feb 16, 2024
“Every ‘Every Bay Area House Party’ Bay Area House Party” by Richard_Ngo
Feb 16, 2024
[Linkpost] “‘No-one in my org puts money in their pension’” by Tobes
Feb 16, 2024
“Fixing Feature Suppression in SAEs” by Benjamin Wright
Feb 16, 2024
“Offering AI safety support calls for ML professionals” by Vael Gates
Feb 15, 2024
“Raising children on the eve of AI” by juliawise
Feb 15, 2024
[Linkpost] “FTX expects to return all customer money; clawbacks may go away” by Mikhail Samin
Feb 14, 2024
“CFAR Takeaways: Andrew Critch” by Raemon
Feb 14, 2024
[Linkpost] “Masterpiece” by Richard_Ngo
Feb 13, 2024
“Lsusr’s Rationality Dojo” by lsusr
Feb 13, 2024
“Where is the Town Square?” by Gretta Duleba
Feb 13, 2024
[Linkpost] “My cover story in Jacobin on AI capitalism and the x-risk debates” by garrison
Feb 12, 2024
“Tort Law Can Play an Important Role in Mitigating AI Risk” by Gabriel Weil
Feb 12, 2024
“On the Proposed California SB 1047” by Zvi
Feb 12, 2024
“Stop Defeating Akrasia (2/4)” by Edith
Feb 12, 2024
“Thoughts on ‘The Offense-Defense Balance Rarely Changes’” by Cullen
Feb 12, 2024
“Skepticism about DeepMind’s ‘Grandmaster-Level’ Chess Without Search” by Arjun Panickssery
Feb 12, 2024
“How do you actually obtain and report a likelihood function for scientific research?” by Peter Berggren
Feb 11, 2024
“And All the Shoggoths Merely Players” by Zack_M_Davis
Feb 10, 2024
[Linkpost] “Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy” by garrison
Feb 10, 2024
“One True Love” by Zvi
Feb 09, 2024
“Skills I’d like my collaborators to have” by Raemon
Feb 09, 2024
“Transfer learning and generalization-qua-capability in Babbage and Davinci (or, why division is better than Spanish)” by RP, agg
Feb 09, 2024
“Running the Numbers on a Heat Pump” by jefftk
Feb 09, 2024
“Updatelessness doesn’t solve most problems” by Martín Soto
Feb 08, 2024
“AI #50: The Most Dangerous Thing” by Zvi
Feb 08, 2024
“Believing In” by AnnaSalamon
Feb 08, 2024
[Linkpost] “A Chess-GPT Linear Emergent World Representation” by karvonenadam
Feb 08, 2024
“Nitric oxide for covid and other viral infections” by Elizabeth
Feb 07, 2024
[Linkpost] “Debating with More Persuasive LLMs Leads to More Truthful Answers” by Akbir Khan, John Hughes, Dan Valentine, Sam Bowman, Ethan Perez
Feb 07, 2024
[Linkpost] “More Hyphenation” by Arjun Panickssery
Feb 07, 2024
“Why I think it’s net harmful to do technical safety research at AGI labs” by Remmelt
Feb 07, 2024
“story-based decision-making” by bhauth
Feb 07, 2024
“How to train your own ‘Sleeper Agents’” by evhub
Feb 07, 2024
“My guess at Conjecture’s vision: triggering a narrative bifurcation” by Alexandre Variengien
Feb 06, 2024
“what does davidad want from boundaries?” by Chipmonk, davidad
Feb 06, 2024
“Preventing exfiltration via upload limits seems promising” by ryan_greenblatt
Feb 06, 2024
“On the Debate Between Jezos and Leahy” by Zvi
Feb 06, 2024
“Toy models of AI control for concentrated catastrophe prevention” by Fabien Roger, Buck
Feb 06, 2024
[Linkpost] “Things You’re Allowed to Do: University Edition” by Saul Munn
Feb 06, 2024
“Value learning in the absence of ground truth” by Joel_Saarinen
Feb 05, 2024
“Implementing activation steering” by Annah
Feb 05, 2024
“Safe Stasis Fallacy” by Davidmanheim
Feb 05, 2024
“Noticing Panic” by Cole Wyeth
Feb 05, 2024
“Brute Force Manufactured Consensus is Hiding the Crime of the Century” by Roko
Feb 03, 2024
“Theories of Applied Rationality” by Camille Berger
Feb 03, 2024
“Why I no longer identify as transhumanist” by Kaj_Sotala
Feb 03, 2024
“Attention SAEs Scale to GPT-2 Small” by Connor Kissane, robertzk, Arthur Conmy, Neel Nanda
Feb 03, 2024
“Announcing the London Initiative for Safe AI (LISA)” by James Fox, mike_safeAI, Ryan Kidd
Feb 02, 2024
“Survey for alignment researchers: help us build better field-level models” by Cameron Berg, Judd Rosenblatt, AE Studio
Feb 02, 2024
“On Dwarkesh’s 3rd Podcast With Tyler Cowen” by Zvi
Feb 02, 2024
“Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small” by Joseph Bloom
Feb 02, 2024
[Linkpost] “Soft Prompts for Evaluation: Measuring Conditional Distance of Capabilities” by porby
Feb 02, 2024
[Linkpost] “Davidad’s Provably Safe AI Architecture - ARIA’s Programme Thesis” by simeon_c
Feb 01, 2024
“Wrong answer bias” by lukehmiles
Feb 01, 2024
“On Not Requiring Vaccination” by jefftk
Feb 01, 2024
“Managing risks while trying to do good” by Wei Dai
Feb 01, 2024
“Leading The Parade” by johnswentworth
Jan 31, 2024
“Per protocol analysis as medical malpractice” by braces
Jan 31, 2024
“Ten Modes of Culture War Discourse” by jchan
Jan 31, 2024
“Without Fundamental Advances, Rebellion and Coup d’État are the Inevitable Outcomes of Dictators & Monarchs Trying to Control Large, Capable Countries” by Roko
Jan 31, 2024
[Linkpost] “Explaining Impact Markets” by Saul Munn
Jan 31, 2024
[Linkpost] “on neodymium magnets” by bhauth
Jan 30, 2024
“Childhood and Education Roundup #4” by Zvi
Jan 30, 2024
“The case for more ambitious language model evals” by Jozdien
Jan 30, 2024
“Processor clock speeds are not how fast AIs think” by Ege Erdil
Jan 29, 2024
[Linkpost] “Chapter 1 of How to Win Friends and Influence People” by gull
Jan 29, 2024
“Simple distribution approximation: When sampled 100 times, can language models yield 80% A and 20% B?” by Teun van der Weij, Felix Hofstätter, Francis Rhys Ward
Jan 29, 2024
“Why I take short timelines seriously” by NicholasKees
Jan 28, 2024
[Linkpost] “Win Friends and Influence People Ch. 2: The Bombshell” by gull
Jan 28, 2024
[Linkpost] “Things You’re Allowed to Do: At the Dentist” by rbinnn
Jan 28, 2024
[Linkpost] “Palworld development blog post” by bhauth
Jan 28, 2024
“Don’t sleep on Coordination Takeoffs” by trevor
Jan 27, 2024
“Epistemic Hell” by rogersbacon
Jan 27, 2024
[Linkpost] “The Good Balsamic Vinegar” by jenn
Jan 26, 2024
[Linkpost] “Making every researcher seek grants is a broken model” by jasoncrawford
Jan 26, 2024
[Linkpost] “Surgery Works Well Without The FDA” by Maxwell Tabarrok
Jan 26, 2024
“Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI” by Jeremy Gillen, peterbarnett
Jan 26, 2024
“‘Does your paradigm beget new, good, paradigms?’” by Raemon
Jan 25, 2024
“AI #48: The Talk of Davos” by Zvi
Jan 25, 2024
[Linkpost] “[Repost] The Copenhagen Interpretation of Ethics” by mesaoptimizer
Jan 25, 2024
“The case for ensuring that powerful AIs are controlled” by ryan_greenblatt, Buck
Jan 24, 2024
“Monthly Roundup #14: January 2024” by Zvi
Jan 24, 2024
“This might be the last AI Safety Camp” by Remmelt, Linda Linsefors
Jan 24, 2024
“Humans aren’t fleeb.” by Charlie Steiner
Jan 24, 2024
“From Finite Factors to Bayes Nets” by J Bostock
Jan 23, 2024
“Making a Secular Solstice Songbook” by jefftk
Jan 23, 2024
[Linkpost] “Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)” by Kaj_Sotala
Jan 23, 2024
[Linkpost] “the subreddit size threshold” by bhauth
Jan 23, 2024
“There is way too much serendipity” by Malmesbury
Jan 22, 2024
“We need a science of evals” by Marius Hobbhahn, Jérémy Scheurer
Jan 22, 2024
“‘ petertodd’’s last stand: The final days of open GPT-3 research” by mwatkins
Jan 22, 2024
[Linkpost] “InterLab – a toolkit for experiments with multi-agent interactions” by Tomáš Gavenčiak, Ada Böhm, Jan_Kulveit
Jan 22, 2024
“When Does Altruism Strengthen Altruism?” by jefftk
Jan 21, 2024
“A Shutdown Problem Proposal” by johnswentworth, David Lorell
Jan 21, 2024
[Linkpost] “Book review: Cuisine and Empire” by eukaryote
Jan 21, 2024
[Linkpost] “legged robot scaling laws” by bhauth
Jan 20, 2024
“A quick investigation of AI pro-AI bias” by Fabien Roger
Jan 19, 2024
“On ‘Geeks, MOPs, and Sociopaths’” by alkjash, Gordon Seidoh Worley
Jan 19, 2024
“Logical Line-Of-Sight Makes Games Sequential or Loopy” by StrivingForLegibility
Jan 19, 2024
[Linkpost] “Manifund: 2023 in Review” by Austin Chen
Jan 18, 2024
[Linkpost] “Against Nonlinear (Thing Of Things)” by tailcalled
Jan 18, 2024
[Linkpost] “The True Story of How GPT-2 Became Maximally Lewd” by Writer
Jan 18, 2024
“On the abolition of man” by Joe Carlsmith
Jan 18, 2024
“Good job opportunities for helping with the most important century” by HoldenKarnofsky
Jan 18, 2024
“AI #48: Exponentials in Geometry” by Zvi
Jan 18, 2024
“Does literacy remove your ability to be a bard as good as Homer?” by Adrià Garriga-alonso
Jan 18, 2024
“Four visions of Transformative AI success” by Steven Byrnes
Jan 17, 2024
[Linkpost] “AlphaGeometry: An Olympiad-level AI system for geometry” by alyssavance
Jan 17, 2024
“On Anthropic’s Sleeper Agents Paper” by Zvi
Jan 17, 2024
“An Introduction To The Mandelbrot Set That Doesn’t Mention Complex Numbers” by Yitz
Jan 17, 2024
“Medical Roundup #1” by Zvi
Jan 16, 2024
“Being nicer than Clippy” by Joe Carlsmith
Jan 16, 2024
“Managing catastrophic misuse without robust AIs” by ryan_greenblatt, Buck
Jan 16, 2024
“Why wasn’t preservation with the goal of potential future revival started earlier in history?” by Andy_McKenzie
Jan 16, 2024
“The impossible problem of due process” by mingyuan
Jan 16, 2024
“Sparse Autoencoders Work on Attention Layer Outputs” by Connor Kissane, robertzk, Arthur Conmy, Neel Nanda
Jan 16, 2024
“Goals selected from learned knowledge: an alternative to RL alignment” by Seth Herd
Jan 15, 2024
[Linkpost] “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training” by evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer, Ethan Perez
Jan 15, 2024
“The case for training frontier AIs on Sumerian-only corpus” by Alexandre Variengien, Charbel-Raphaël, Jonathan Claybrough
Jan 15, 2024
“What good is G-factor if you’re dumped in the woods? A field report from a camp counselor.” by Hastings
Jan 14, 2024
[Linkpost] “Land Reclamation is in the 9th Circle of Stagnation Hell” by Maxwell Tabarrok
Jan 14, 2024
[Linkpost] “Gender Exploration” by sapphire
Jan 14, 2024
“Universal Love Integration Test: Hitler” by Raemon
Jan 14, 2024
“D&D.Sci(-fi): Colonizing the SuperHyperSphere” by abstractapplic
Jan 14, 2024
[Linkpost] “The Leeroy Jenkins principle: How faulty AI could guarantee ‘warning shots’” by titotal
Jan 14, 2024
“Notice When People Are Directionally Correct” by Chris_Leong
Jan 14, 2024
“Against most AI risk analogies” by Matthew Barnett
Jan 14, 2024
[Linkpost] “The Perceptron Controversy” by Yuxi_Liu
Jan 13, 2024
“An even deeper atheism” by Joe Carlsmith
Jan 13, 2024
“AI #47: Meet the New Year” by Zvi
Jan 13, 2024
“Introducing Alignment Stress-Testing at Anthropic” by evhub
Jan 12, 2024
“The Aspiring Rationalist Congregation” by maia
Jan 12, 2024
“Apply to the PIBBSS Summer Research Fellowship” by Nora_Ammann, DusanDNesic, Lucas Teixeira
Jan 12, 2024
“Introduce a Speed Maximum” by jefftk
Jan 11, 2024
“An Actually Intuitive Explanation of the Oberth Effect” by Isaac King
Jan 10, 2024
“Saving the world sucks” by Defective Altruism
Jan 10, 2024
“On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche” by Zack_M_Davis
Jan 09, 2024
“Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor” by RogerDearnaley
Jan 09, 2024
“Does AI risk ‘other’ the AIs?” by Joe Carlsmith
Jan 09, 2024
“Learning Math in Time for Alignment” by NicholasKross
Jan 09, 2024
“A starter guide for evals” by Marius Hobbhahn, Mikita Balesni, Jérémy Scheurer, rusheb
Jan 08, 2024
“When ‘yang’ goes wrong” by Joe Carlsmith
Jan 08, 2024
“2023 Prediction Evaluations” by Zvi
Jan 08, 2024
“Reflections on my first year of AI safety research” by Jay Bailey
Jan 08, 2024
[Linkpost] “A model of research skill” by L Rudolf L
Jan 08, 2024
“Deceptive AI ≠ Deceptively-aligned AI” by Steven Byrnes
Jan 07, 2024
[Linkpost] “Bayesians Commit the Gambler’s Fallacy” by Kevin Dorst
Jan 07, 2024
[Linkpost] “Defending against hypothetical moon life during Apollo 11” by eukaryote
Jan 07, 2024
“AI Risk and the US Presidential Candidates” by Zane
Jan 06, 2024
“Survey of 2,778 AI authors: six parts in pictures” by KatjaGrace
Jan 06, 2024
[Linkpost] “Project ideas: Epistemics” by Lukas Finnveden
Jan 05, 2024
[Linkpost] “Almost everyone I’ve met would be well-served thinking more about what to focus on” by Henrik Karlsson
Jan 05, 2024
“The Next ChatGPT Moment: AI Avatars” by kolmplex, southpaw
Jan 05, 2024
“Catching AIs red-handed” by ryan_greenblatt, Buck
Jan 05, 2024
“MIRI 2024 Mission and Strategy Update” by Malo
Jan 05, 2024
“Deep atheism and AI risk” by Joe Carlsmith
Jan 04, 2024
“Some Vacation Photos” by johnswentworth
Jan 04, 2024
“AI #45: To Be Determined” by Zvi
Jan 04, 2024
“What’s up with LLMs representing XORs of arbitrary features?” by Sam Marks
Jan 03, 2024
“Safety First: safety before full alignment. The deontic sufficiency hypothesis.” by Chipmonk
Jan 03, 2024
[Linkpost] “Practically A Book Review: Appendix to ‘Nonlinear’s Evidence: Debunking False and Misleading Claims’ (ThingOfThings)” by tailcalled
Jan 03, 2024
“Copyright Confrontation #1” by Zvi
Jan 03, 2024
“Trading off Lives” by jefftk
Jan 03, 2024
“Gentleness and the artificial Other” by Joe Carlsmith
Jan 02, 2024
“Apologizing is a Core Rationalist Skill” by johnswentworth
Jan 02, 2024
“OpenAI’s Preparedness Framework: Praise & Recommendations” by Akash
Jan 02, 2024
“Dating Roundup #2: If At First You Don’t Succeed” by Zvi
Jan 02, 2024
“AI Is Not Software” by Davidmanheim
Jan 02, 2024
[Linkpost] “Steering Llama-2 with contrastive activation additions” by Nina Rimsky, Wuschel Schulz, NickGabs, Meg, evhub, TurnTrout
Jan 02, 2024
“Boston Solstice 2023 Retrospective” by jefftk
Jan 02, 2024
“Stop talking about p(doom)” by Isaac King
Jan 01, 2024
“2023 in AI predictions” by jessicata
Jan 01, 2024
“Bayesian updating in real life is mostly about understanding your hypotheses” by Max H
Jan 01, 2024