Episode | Date |
---|---|
“Beware unfinished bridges” by Adam Zerner | May 12, 2024 |
“New intro textbook on AIXI” by Alex_Altair | May 12, 2024 |
“Can we build a better Public Doublecrux?” by Raemon | May 12, 2024 |
“Podcast with Yoshua Bengio on Why AI Labs are ‘Playing Dice with Humanity’s Future’” by garrison | May 11, 2024 |
“MATS Winter 2023-24 Retrospective” by Rocket, Ryan Kidd, LauraVaughan, McKennaFitzgerald, Christian Smith, Juan Gil, Henry Sleight | May 11, 2024 |
“shortest goddamn bayes guide ever” by lukehmiles | May 10, 2024 |
“How to be an amateur polyglot” by arisAlexis | May 10, 2024 |
“My thesis (Algorithmic Bayesian Epistemology) explained in more depth” by Eric Neyman | May 10, 2024 |
“We might be missing some key feature of AI takeoff; it’ll probably seem like ‘we could’ve seen this coming’” by Lukas_Gloor | May 10, 2024 |
“AI #63: Introducing Alpha Fold 3” by Zvi | May 10, 2024 |
“Why Care About Natural Latents?” by johnswentworth, David Lorell | May 10, 2024 |
“Dyslucksia” by Shoshannah Tekofsky | May 09, 2024 |
“some thoughts on LessOnline” by Raemon | May 09, 2024 |
“Dating Roundup #3: Third Time’s the Charm” by Zvi | May 09, 2024 |
[Linkpost] “Designing for a single purpose” by Itay Dreyfus | May 08, 2024 |
“Deep Honesty” by Aletheophile | May 07, 2024 |
“Observations on Teaching for Four Weeks” by ClareChiaraVincent | May 07, 2024 |
“Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence” by Towards_Keeperhood, Davanchama | May 06, 2024 |
[Linkpost] “Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant” by Olli Järviniemi, evhub | May 06, 2024 |
“Explaining a Math Magic Trick” by Robert_AIZI | May 05, 2024 |
[Linkpost] “introduction to cancer vaccines” by bhauth | May 05, 2024 |
“AI #61: Meta Trouble” by Zvi | May 05, 2024 |
[Linkpost] “S-Risks: Fates Worse Than Extinction” by aggliu, Writer | May 05, 2024 |
“Introducing AI-Powered Audiobooks of Rational Fiction Classics” by Askwho | May 04, 2024 |
“Now THIS is forecasting: understanding Epoch’s Direct Approach” by Elliot_Mckernon, Zershaaneh Qureshi | May 04, 2024 |
[Linkpost] “My hour of memoryless lucidity” by Eric Neyman | May 04, 2024 |
“Apply to ESPR & PAIR, Rationality and AI Camps for Ages 16-21” by Anna Gajdova | May 04, 2024 |
[Linkpost] “‘AI Safety for Fleshy Humans’ an AI Safety explainer by Nicky Case” by habryka | May 03, 2024 |
“Key takeaways from our EA and alignment research surveys” by Cameron Berg, Judd Rosenblatt, florin_pop, AE Studio | May 03, 2024 |
“AI #62: Too Soon to Tell” by Zvi | May 03, 2024 |
“Mechanistic Interpretability Workshop Happening at ICML 2024!” by Neel Nanda | May 03, 2024 |
“Q&A on Proposed SB 1047” by Zvi | May 02, 2024 |
“Please stop publishing ideas/insights/research about AI” by Tamsin Leake | May 02, 2024 |
“Manifund Q1 Retro: Learnings from impact certs” by Austin Chen | May 02, 2024 |
“An explanation of evil in an organized world” by KatjaGrace | May 02, 2024 |
“Take SCIFs, it’s dangerous to go alone” by latterframe, Jeffrey Ladish, schroederdewitt | May 02, 2024 |
“The Intentional Stance, LLMs Edition” by Eleni Angelou | May 01, 2024 |
“ACX Covid Origins Post convinced readers” by ErnestScribbler | May 01, 2024 |
“LessWrong Community Weekend 2024, open for applications” by UnplannedCauliflower, jt | May 01, 2024 |
“Transcoders enable fine-grained interpretable circuit analysis for language models” by Jacob Dunefsky, Philippe Chlenski, Neel Nanda | May 01, 2024 |
“Questions for labs” by Zach Stein-Perlman | May 01, 2024 |
“Mechanistically Eliciting Latent Behaviors in Language Models” by Andrew Mack | Apr 30, 2024 |
“Why I’m doing PauseAI” by Joseph Miller | Apr 30, 2024 |
[Linkpost] “Introducing AI Lab Watch” by Zach Stein-Perlman | Apr 30, 2024 |
“Towards a formalization of the agent structure problem” by Alex_Altair | Apr 30, 2024 |
“Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers” by hugofry | Apr 30, 2024 |
“Ironing Out the Squiggles” by Zack_M_Davis | Apr 29, 2024 |
“List your AI X-Risk cruxes!” by Aryeh Englander | Apr 29, 2024 |
“AISC9 has ended and there will be an AISC10” by Linda Linsefors | Apr 29, 2024 |
“[Aspiration-based designs] 1. Informal introduction” by B Jacobs, Jobst Heitzig, Simon Fischer, Simon Dima | Apr 29, 2024 |
“Constructability: Plainly-coded AGIs may be feasible in the near future” by Épiphanie Gédéon, Charbel-Raphaël | Apr 28, 2024 |
“We are headed into an extreme compute overhang” by devrandom | Apr 28, 2024 |
“So What’s Up With PUFAs Chemically?” by J Bostock | Apr 27, 2024 |
“Refusal in LLMs is mediated by a single direction” by Andy Arditi, Oscar Balcells Obeso, Aaquib111, wesg, Neel Nanda | Apr 27, 2024 |
“D&D.Sci Long War: Defender of Data-mocracy” by aphyer | Apr 27, 2024 |
“Superposition is not ‘just’ neuron polysemanticity” by LawrenceC | Apr 27, 2024 |
“Duct Tape security” by Isaac King | Apr 26, 2024 |
“On Not Pulling The Ladder Up Behind You” by Screwtape | Apr 26, 2024 |
“Scaling of AI training runs will slow down after GPT-5” by Maxime Riché | Apr 26, 2024 |
“Spatial attention as a ‘tell’ for empathetic simulation?” by Steven Byrnes | Apr 26, 2024 |
“Losing Faith In Contrarianism” by omnizoid | Apr 26, 2024 |
[Linkpost] “LLMs seem (relatively) safe” by JustisMills | Apr 26, 2024 |
[Linkpost] “Improving Dictionary Learning with Gated Sparse Autoencoders” by Neel Nanda | Apr 25, 2024 |
[Linkpost] “WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals” by trevor | Apr 25, 2024 |
[Linkpost] “‘Why I Write’ by George Orwell (1946)” by Arjun Panickssery | Apr 25, 2024 |
“The first future and the best future” by KatjaGrace | Apr 25, 2024 |
[Linkpost] “The Inner Ring by C. S. Lewis” by Saul Munn | Apr 25, 2024 |
“This is Water by David Foster Wallace” by Nathan Young | Apr 25, 2024 |
“Changes in College Admissions” by Zvi | Apr 24, 2024 |
“Examples of Highly Counterfactual Discoveries?” by johnswentworth | Apr 24, 2024 |
[Linkpost] “On what research policymakers actually need” by MondSemmel | Apr 24, 2024 |
[Linkpost] “Let’s Design A School, Part 1” by Sable | Apr 24, 2024 |
[Linkpost] “Dequantifying first-order theories” by jessicata | Apr 24, 2024 |
[Linkpost] “Simple probes can catch sleeper agents” by Monte M, Carson Denison, Zac Hatfield-Dodds, evhub | Apr 23, 2024 |
“Rejecting Television” by Declan Molony | Apr 23, 2024 |
“Forget Everything (Statistical Mechanics Part 1)” by J Bostock | Apr 23, 2024 |
“Funny Anecdote of Eliezer From His Sister” by Daniel Birnbaum | Apr 22, 2024 |
[Linkpost] “Motivation gaps: Why so much EA criticism is hostile and lazy” by titotal | Apr 22, 2024 |
“Priors and Prejudice” by MathiasKB | Apr 22, 2024 |
[Linkpost] “AI Regulation is Unsafe” by Maxwell Tabarrok | Apr 22, 2024 |
“On Llama-3 and Dwarkesh Patel’s Podcast with Zuckerberg” by Zvi | Apr 22, 2024 |
“Transfer Learning in Humans” by niplav | Apr 22, 2024 |
“A couple productivity tips for overthinkers” by Steven Byrnes | Apr 20, 2024 |
“What’s up with all the non-Mormons? Weirdly specific universalities across LLMs” by mwatkins | Apr 20, 2024 |
[Linkpost] “Thoughts on seed oil” by dynomight | Apr 20, 2024 |
“Inducing Unprompted Misalignment in LLMs” by Sam Svenningsen, evhub, Henry Sleight | Apr 20, 2024 |
“Progress Update #1 from the GDM Mech Interp Team: Summary” by Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma | Apr 20, 2024 |
“Progress Update #1 from the GDM Mech Interp Team: Full Update” by Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma | Apr 19, 2024 |
[Linkpost] “Daniel Dennett has died (1924-2024)” by kave | Apr 19, 2024 |
“Experiment on repeating choices” by KatjaGrace | Apr 19, 2024 |
“[Fiction] A Confession” by Arjun Panickssery | Apr 19, 2024 |
“I’m open for projects (sort of)” by cousin_it | Apr 18, 2024 |
“AI #60: Oh the Humanity” by Zvi | Apr 18, 2024 |
“Discriminating Behaviorally Identical Classifiers: a model problem for applying interpretability to scalable oversight” by Sam Marks | Apr 18, 2024 |
“The Mom Test: Summary and Thoughts” by Adam Zerner | Apr 18, 2024 |
“Childhood and Education Roundup #5” by Zvi | Apr 18, 2024 |
[Linkpost] “Claude 3 Opus can operate as a Turing machine” by Gunnar_Zarncke | Apr 18, 2024 |
“Express interest in an ‘FHI of the West’” by habryka | Apr 18, 2024 |
“Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer” by johnswentworth, David Lorell | Apr 18, 2024 |
“Effectively Handling Disagreements - Introducing a New Workshop” by Camille Berger | Apr 17, 2024 |
[Linkpost] “Moving on from community living” by Vika | Apr 17, 2024 |
[Linkpost] “FHI (Future of Humanity Institute) has shut down (2005–2024)” by gwern | Apr 17, 2024 |
“When is a mind me?” by Rob Bensinger | Apr 17, 2024 |
“Mid-conditional love” by KatjaGrace | Apr 17, 2024 |
“Transformers Represent Belief State Geometry in their Residual Stream” by Adam Shai | Apr 16, 2024 |
[Linkpost] “Paul Christiano named as US AI Safety Institute Head of AI Safety” by Joel Burget | Apr 16, 2024 |
[Linkpost] “Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes” by owencb, AI Impacts | Apr 16, 2024 |
“Monthly Roundup #17: April 2024” by Zvi | Apr 16, 2024 |
“Anthropic AI made the right call” by bhauth | Apr 16, 2024 |
“My experience using financial commitments to overcome akrasia” by William Howard | Apr 15, 2024 |
[Linkpost] “A High Decoupling Failure” by Maxwell Tabarrok | Apr 15, 2024 |
“Reconsider the anti-cavity bacteria if you are Asian” by Lao Mein | Apr 15, 2024 |
“Text Posts from the Kids Group: 2020” by jefftk | Apr 14, 2024 |
“Prompts for Big-Picture Planning” by Raemon | Apr 14, 2024 |
“Things Solenoid Narrates” by Solenoid_Entity | Apr 13, 2024 |
[Linkpost] “Carl Sagan, nuking the moon, and not nuking the moon” by eukaryote | Apr 13, 2024 |
[Linkpost] “MIRI’s April 2024 Newsletter” by Harlan, Rob Bensinger | Apr 13, 2024 |
“UDT1.01: Plannable and Unplanned Observations (3/10)” by Diffractor | Apr 12, 2024 |
“Generalized Stat Mech: The Boltzmann Approach” by David Lorell, johnswentworth | Apr 12, 2024 |
“A D&D.Sci Dodecalogue” by abstractapplic | Apr 12, 2024 |
“Announcing Atlas Computing” by miyazono | Apr 12, 2024 |
“A Gentle Introduction to Risk Frameworks Beyond Forecasting” by pendingsurvival | Apr 12, 2024 |
“How I select alignment research projects” by Ethan Perez, Henry Sleight, Mikita Balesni | Apr 11, 2024 |
“D&D.Sci: The Mad Tyrant’s Pet Turtles [Evaluation and Ruleset]” by abstractapplic | Apr 10, 2024 |
“RTFB: On the New Proposed CAIP AI Bill” by Zvi | Apr 10, 2024 |
“Medical Roundup #2” by Zvi | Apr 10, 2024 |
“Ophiology (or, how the Mamba architecture works)” by Danielle Ensign, SrGonao, Adrià Garriga-alonso | Apr 09, 2024 |
[Linkpost] “Conflict in Posthuman Literature” by Martín Soto | Apr 09, 2024 |
“Math-to-English Cheat Sheet” by nahoj | Apr 09, 2024 |
“PIBBSS is hiring in a variety of roles (alignment research and incubation program)” by Nora_Ammann, Lucas Teixeira, DusanDNesic | Apr 09, 2024 |
“Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition” by cmathw, Dennis Akar, Lee Sharkey | Apr 09, 2024 |
[Linkpost] “on the dollar-yen exchange rate” by bhauth | Apr 08, 2024 |
“How We Picture Bayesian Agents” by johnswentworth, David Lorell | Apr 08, 2024 |
“A Dozen Ways to Get More Dakka” by Davidmanheim | Apr 08, 2024 |
“My intellectual journey to (dis)solve the hard problem of consciousness” by Charbel-Raphaël | Apr 07, 2024 |
“Koan: divining alien datastructures from RAM activations” by TsviBT | Apr 07, 2024 |
“‘Fractal Strategy’ workshop report” by Raemon | Apr 07, 2024 |
[Linkpost] “The 2nd Demographic Transition” by Maxwell Tabarrok | Apr 07, 2024 |
“On Complexity Science” by Garrett Baker | Apr 05, 2024 |
“What’s with all the bans recently?” by Gerald Monroe | Apr 05, 2024 |
“Partial value takeover without world takeover” by KatjaGrace | Apr 05, 2024 |
“New report: A review of the empirical evidence for existential risk from AI via misaligned power-seeking” by Harlan, rosehadshar | Apr 05, 2024 |
“AI #58: Stargate AGI” by Zvi | Apr 05, 2024 |
“Run evals on base models too!” by orthonormal | Apr 04, 2024 |
“LLMs for Alignment Research: a safety priority?” by abramdemski | Apr 04, 2024 |
“A gentle introduction to mechanistic anomaly detection” by Erik Jenner | Apr 04, 2024 |
“Best in Class Life Improvement” by sapphire | Apr 04, 2024 |
“Sparsify: A mechanistic interpretability research agenda” by Lee Sharkey | Apr 03, 2024 |
“Notes on Dwarkesh Patel’s Podcast with Sholto Douglas and Trenton Bricken” by Zvi | Apr 02, 2024 |
“Thousands of malicious actors on the future of AI misuse” by Zershaaneh Qureshi, Corin Katzke, Convergence Analysis | Apr 02, 2024 |
“Gradient Descent on the Human Brain” by Jozdien, gaspode | Apr 02, 2024 |
“Coherence of Caches and Agents” by johnswentworth | Apr 02, 2024 |
“Announcing Suffering For Good” by Garrett Baker | Apr 01, 2024 |
“OMMC Announces RIP” by Adam Scholl, aysja | Apr 01, 2024 |
“The Evolution of Humans Was Net-Negative for Human Values” by Zack_M_Davis | Apr 01, 2024 |
“A Selection of Randomly Selected SAE Features” by CallumMcDougall, Joseph Bloom | Apr 01, 2024 |
“Apply to be a Safety Engineer at Lockheed Martin!” by yanni | Apr 01, 2024 |
[Linkpost] “Introducing Open Asteroid Impact” by Linch | Apr 01, 2024 |
“The Story of ‘I Have Been A Good Bing’” by habryka, kave | Apr 01, 2024 |
“SAE-VIS: Announcement Post” by CallumMcDougall, Joseph Bloom | Mar 31, 2024 |
“The Best Tacit Knowledge Videos on Every Subject” by Parker Conley | Mar 31, 2024 |
“My simple AGI investment & insurance strategy” by lc | Mar 31, 2024 |
[Linkpost] “Metascience of the Vesuvius Challenge” by Maxwell Tabarrok | Mar 30, 2024 |
“Back to Basics: Truth is Unitary” by lsusr | Mar 30, 2024 |
“Your LLM Judge may be biased” by rachelAF, Henry Papadatos | Mar 30, 2024 |
“D&D.Sci: The Mad Tyrant’s Pet Turtles” by abstractapplic | Mar 30, 2024 |
“SAE reconstruction errors are (empirically) pathological” by wesg | Mar 29, 2024 |
“How to safely use an optimizer” by Simon Fischer | Mar 29, 2024 |
[Linkpost] “Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate” by trevor | Mar 28, 2024 |
“Was Releasing Claude-3 Net-Negative?” by Logan Riggs | Mar 28, 2024 |
[Linkpost] “Come to Manifest 2024 (June 7-9 in Berkeley)” by Saul Munn | Mar 28, 2024 |
[Linkpost] “The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review” by jessicata | Mar 27, 2024 |
[Linkpost] “Daniel Kahneman has died” by DanielFilan | Mar 27, 2024 |
[Linkpost] “Nick Bostrom’s new book, ‘Deep Utopia’, is out today” by PeterH | Mar 27, 2024 |
“AE Studio @ SXSW: We need more AI consciousness research (and further resources)” by AE Studio, Cameron Berg, Judd Rosenblatt, phgubbins, Diogo de Lucena | Mar 27, 2024 |
“Failures in Kindness” by silentbob | Mar 26, 2024 |
“Modern Transformers are AGI, and Human-Level” by abramdemski | Mar 26, 2024 |
“My Interview With Cade Metz on His Reporting About Slate Star Codex” by Zack_M_Davis | Mar 26, 2024 |
“Announcing Neuronpedia: Platform for accelerating research into Sparse Autoencoders” by Johnny Lin, Joseph Bloom | Mar 26, 2024 |
“Should rationalists be spiritual / Spirituality as overcoming delusion” by Kaj_Sotala, romeostevensit | Mar 26, 2024 |
“On attunement” by Joe Carlsmith | Mar 25, 2024 |
“My Detailed Notes & Commentary from Secular Solstice” by Jeffrey Heninger | Mar 25, 2024 |
“On Lex Fridman’s Second Podcast with Altman” by Zvi | Mar 25, 2024 |
“Do not delete your misaligned AGI.” by mako yass | Mar 25, 2024 |
“All About Concave and Convex Agents” by mako yass | Mar 25, 2024 |
“Vipassana Meditation and Active Inference: A Framework for Understanding Suffering and its Cessation” by Benjamin Sturgeon | Mar 24, 2024 |
“General Thoughts on Secular Solstice” by Jeffrey Heninger | Mar 24, 2024 |
“Dangers of Closed-Loop AI” by Gordon Seidoh Worley | Mar 23, 2024 |
“A Teacher vs. Everyone Else” by ronak69 | Mar 23, 2024 |
“AI #56: Blackwell That Ends Well” by Zvi | Mar 23, 2024 |
[Linkpost] “Vernor Vinge, who coined the term ‘Technological Singularity’, dies at 79” by Kaj_Sotala | Mar 21, 2024 |
“ChatGPT can learn indirect control” by Raymond D | Mar 21, 2024 |
[Linkpost] “‘Deep Learning’ Is Function Approximation” by Zack_M_Davis | Mar 21, 2024 |
“On green” by Joe Carlsmith | Mar 21, 2024 |
[Linkpost] “DeepMind: Evaluating Frontier Models for Dangerous Capabilities” by Zach Stein-Perlman | Mar 21, 2024 |
“On the Gladstone Report” by Zvi | Mar 21, 2024 |
“Stagewise Development in Neural Networks” by Jesse Hoogland, Liam Carroll, Daniel Murfet | Mar 20, 2024 |
“Monthly Roundup #16: March 2024” by Zvi | Mar 20, 2024 |
“Natural Latents: The Concepts” by johnswentworth, David Lorell | Mar 20, 2024 |
“New report: Safety Cases for AI” by joshc | Mar 20, 2024 |
[Linkpost] “Increasing IQ by 10 Points is Possible” by George3d6 | Mar 19, 2024 |
[Linkpost] “Inferring the model dimension of API-protected LLMs” by Ege Erdil | Mar 19, 2024 |
“Experimentation (Part 7 of ‘The Sense Of Physical Necessity’)” by LoganStrohl | Mar 19, 2024 |
“Neuroscience and Alignment” by Garrett Baker | Mar 19, 2024 |
[Linkpost] “Toki pona FAQ” by dkl9 | Mar 19, 2024 |
“5 Physics Problems” by DaemonicSigil, Muireall | Mar 18, 2024 |
“Measuring Coherence of Policies in Toy Environments” by dx26, Richard_Ngo, Martín Soto | Mar 18, 2024 |
“Community Notes by X” by NicholasKees | Mar 18, 2024 |
“On Devin” by Zvi | Mar 18, 2024 |
“The Worst Form Of Government (Except For Everything Else We’ve Tried)” by johnswentworth | Mar 17, 2024 |
[Linkpost] “Anxiety vs. Depression” by Sable | Mar 17, 2024 |
[Linkpost] “My PhD thesis: Algorithmic Bayesian Epistemology” by Eric Neyman | Mar 16, 2024 |
[Linkpost] “How people stopped dying from diarrhea so much (& other life-saving decisions)” by Writer | Mar 16, 2024 |
“Rational Animations offers animation production and writing services!” by Writer | Mar 16, 2024 |
[Linkpost] “Introducing METR’s Autonomy Evaluation Resources” by Megan Kinniment, Beth Barnes | Mar 16, 2024 |
[Linkpost] “More people getting into AI safety should do a PhD” by AdamGleave | Mar 15, 2024 |
[Linkpost] “Constructive Cauchy sequences vs. Dedekind cuts” by jessicata | Mar 15, 2024 |
[Linkpost] “Conditional on Getting to Trade, Your Trade Wasn’t All That Great” by Ricki Heicklen | Mar 15, 2024 |
“Highlights from Lex Fridman’s interview of Yann LeCun” by Joel Burget | Mar 14, 2024 |
“How useful is ‘AI Control’ as a framing on AI X-Risk?” by habryka, ryan_greenblatt | Mar 14, 2024 |
“AI #55: Keep Clauding Along” by Zvi | Mar 14, 2024 |
“Jobs, Relationships, and Other Cults” by Ruby, Elizabeth | Mar 14, 2024 |
“On the Latest TikTok Bill” by Zvi | Mar 14, 2024 |
“‘Empiricism!’ as Anti-Epistemology” by Eliezer Yudkowsky | Mar 14, 2024 |
“Open consultancy: Letting untrusted AIs choose what answer to argue for” by Fabien Roger | Mar 13, 2024 |
[Linkpost] “Superforecasting the Origins of the Covid-19 Pandemic” by DanielFilan | Mar 13, 2024 |
“The Parable Of The Fallen Pendulum, Part 2” by johnswentworth | Mar 13, 2024 |
“OpenAI: The Board Expands” by Zvi | Mar 12, 2024 |
[Linkpost] “Results from an Adversarial Collaboration on AI Risk (FRI)” by Forecasting Research Institute | Mar 12, 2024 |
“‘Artificial General Intelligence’: an extremely brief FAQ” by Steven Byrnes | Mar 11, 2024 |
“Some (problematic) aesthetics of what constitutes good work in academia” by Steven Byrnes | Mar 11, 2024 |
“Twelve Lawsuits against OpenAI” by Remmelt | Mar 11, 2024 |
[Linkpost] “‘How could I have thought that faster?’” by mesaoptimizer | Mar 11, 2024 |
“Simple versus Short: Higher-order degeneracy and error-correction” by Daniel Murfet | Mar 11, 2024 |
“One-shot strategy games?” by Raemon | Mar 11, 2024 |
“Understanding SAE Features with the Logit Lens” by Joseph Bloom, hijohnnylin | Mar 11, 2024 |
“0th Person and 1st Person Logic” by Adele Lopez | Mar 11, 2024 |
[Linkpost] “Notes from a Prompt Factory” by Richard_Ngo | Mar 10, 2024 |
“Closeness To the Issue (Part 5 of ‘The Sense Of Physical Necessity’)” by LoganStrohl | Mar 09, 2024 |
“Lies and disrespect from the EA Infrastructure Fund” by Igor Ivanov | Mar 08, 2024 |
“Scenario Forecasting Workshop: Materials and Learnings” by elifland, charlie_griffin | Mar 08, 2024 |
“Woods’ new preprint on object permanence” by Steven Byrnes | Mar 08, 2024 |
“AI #54: Clauding Along” by Zvi | Mar 08, 2024 |
“MATS AI Safety Strategy Curriculum” by Ryan Kidd, Ronny Fernandez | Mar 08, 2024 |
[Linkpost] “Simple Kelly betting in prediction markets” by jessicata | Mar 07, 2024 |
“Mud and Despair (Part 4 of ‘The Sense Of Physical Necessity’)” by LoganStrohl | Mar 07, 2024 |
“Movie posters” by KatjaGrace | Mar 07, 2024 |
[Linkpost] “Using axis lines for good or evil” by dynomight | Mar 06, 2024 |
[Linkpost] “Research Report: Sparse Autoencoders find only 9/180 board state features in OthelloGPT” by Robert_AIZI | Mar 06, 2024 |
“On Claude 3.0” by Zvi | Mar 06, 2024 |
“Vote on Anthropic Topics to Discuss” by Ben Pace | Mar 06, 2024 |
“We Inspected Every Head In GPT-2 Small using SAEs So You Don’t Have To” by robertzk, Connor Kissane, Arthur Conmy, Neel Nanda | Mar 06, 2024 |
“My Clients, The Liars” by ymeskhout | Mar 05, 2024 |
“Social status part 1/2: negotiations over object-level preferences” by Steven Byrnes | Mar 05, 2024 |
“Read the Roon” by Zvi | Mar 05, 2024 |
“Many arguments for AI x-risk are wrong” by TurnTrout | Mar 05, 2024 |
[Linkpost] “Anthropic release Claude 3, claims >GPT-4 Performance” by LawrenceC | Mar 04, 2024 |
“Housing Roundup #7” by Zvi | Mar 04, 2024 |
“Are we so good to simulate?” by KatjaGrace | Mar 04, 2024 |
“The Broken Screwdriver and other parables” by bhauth | Mar 04, 2024 |
“Grief is a fire sale” by Nathan Young | Mar 04, 2024 |
“AI things that are perhaps as important as human-controlled AI (Chi version)” by Chi Nguyen | Mar 03, 2024 |
“Some costs of superposition” by Linda Linsefors | Mar 03, 2024 |
[Linkpost] “Self-Resolving Prediction Markets” by PeterMcCluskey | Mar 03, 2024 |
[Linkpost] “Agreeing With Stalin in Ways That Exhibit Generally Rationalist Principles” by Zack_M_Davis | Mar 02, 2024 |
“The World in 2029” by Nathan Young | Mar 02, 2024 |
[Linkpost] “If you weren’t such an idiot...” by kave | Mar 02, 2024 |
“Wholesomeness and Effective Altruism” by owencb | Mar 01, 2024 |
[Linkpost] “Increasing IQ is trivial” by George3d6 | Mar 01, 2024 |
“Notes on Dwarkesh Patel’s Podcast with Demis Hassabis” by Zvi | Mar 01, 2024 |
“The Defence production act and AI policy” by NathanBarnard | Mar 01, 2024 |
“The Parable Of The Fallen Pendulum - Part 1” by johnswentworth | Mar 01, 2024 |
“Approaching Human-Level Forecasting with Language Models” by Fred Zhang, dannyhalawi, jsteinhardt | Feb 29, 2024 |
“AI #53: One More Leap” by Zvi | Feb 29, 2024 |
[Linkpost] “Bengio’s Alignment Proposal: ‘Towards a Cautious Scientist AI with Convergent Safety Bounds’” by mattmacdermott | Feb 29, 2024 |
“Tips for Empirical Alignment Research” by Ethan Perez | Feb 29, 2024 |
[Linkpost] “Post series on ‘Liability Law for reducing Existential Risk from AI’” by Nora_Ammann | Feb 29, 2024 |
“Locating My Eyes (Part 3 of ‘The Sense of Physical Necessity’)” by LoganStrohl | Feb 29, 2024 |
“Evidential cooperation in large worlds: Potential objections and FAQ” by Chi Nguyen | Feb 28, 2024 |
“Timaeus’s First Four Months” by Jesse Hoogland, Daniel Murfet, Stan van Wingerden, Alexander Gietelink Oldenziel | Feb 28, 2024 |
“Announcing ‘The LeastWrong’ and review winner post pages” by kave | Feb 28, 2024 |
“Counting arguments provide no evidence for AI doom” by Nora Belrose, Quintin Pope | Feb 27, 2024 |
“The Gemini Incident Continues” by Zvi | Feb 27, 2024 |
“How I internalized my achievements to better deal with negative feelings” by Raymond Koopmanschap | Feb 27, 2024 |
“Can we get an AI to do our alignment homework for us?” by Chris_Leong | Feb 27, 2024 |
“Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders” by Evan Anders, Joseph Bloom | Feb 27, 2024 |
“Acting Wholesomely” by owencb | Feb 26, 2024 |
“How I build and run behavioral interviews” by benkuhn | Feb 26, 2024 |
“China-AI forecasts” by NathanBarnard | Feb 25, 2024 |
[Linkpost] “Ideological Bayesians” by Kevin Dorst | Feb 25, 2024 |
“‘In-Context’ ‘Learning’” by Arjun Panickssery | Feb 25, 2024 |
“Cooperating with aliens and (distant) AGIs: An ECL explainer” by Chi Nguyen | Feb 24, 2024 |
“Choosing My Quest (Part 2 of ‘The Sense Of Physical Necessity’)” by LoganStrohl | Feb 24, 2024 |
“Rationality Research Report: Towards 10x OODA Looping?” by Raemon | Feb 24, 2024 |
[Linkpost] “We Need Major, But Not Radical, FDA Reform” by Maxwell Tabarrok | Feb 24, 2024 |
“Balancing Games” by jefftk | Feb 24, 2024 |
“How well do truth probes generalise?” by mishajw | Feb 24, 2024 |
“The Sense Of Physical Necessity: A Naturalism Demo (Introduction)” by LoganStrohl | Feb 24, 2024 |
“A starting point for making sense of task structure (in machine learning)” by Kaarel, RP, jake_mendel | Feb 24, 2024 |
“The Shutdown Problem: Incomplete Preferences as a Solution” by EJT | Feb 23, 2024 |
“Deep and obvious points in the gap between your thoughts and your pictures of thought” by KatjaGrace | Feb 23, 2024 |
“Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.” by Chi Nguyen | Feb 23, 2024 |
[Linkpost] “Contra Ngo et al. ‘Every ‘Every Bay Area House Party’ Bay Area House Party’” by Ricki Heicklen | Feb 22, 2024 |
“AI #52: Oops” by Zvi | Feb 22, 2024 |
“Gemini Has a Problem” by Zvi | Feb 22, 2024 |
[Linkpost] “Research Post: Tasks That Language Models Don’t Learn” by Bruce W. Lee | Feb 22, 2024 |
“Sora What” by Zvi | Feb 22, 2024 |
“Do sparse autoencoders find ‘true features’?” by Demian Till | Feb 22, 2024 |
“Everything Wrong with Roko’s Claims about an Engineered Pandemic” by EZ97 | Feb 22, 2024 |
“The One and a Half Gemini” by Zvi | Feb 22, 2024 |
“The Byronic Hero Always Loses” by Cole Wyeth | Feb 22, 2024 |
“Job Listing: Managing Editor / Writer” by Gretta Duleba | Feb 21, 2024 |
“The Pareto Best and the Curse of Doom” by Screwtape | Feb 21, 2024 |
“Analogies between scaling labs and misaligned superintelligent AI” by scasper | Feb 21, 2024 |
“Dual Wielding Kindle Scribes” by mesaoptimizer | Feb 21, 2024 |
“Less Wrong automated systems are inadvertently Censoring me” by Roko | Feb 21, 2024 |
“AI #51: Altman’s Ambition” by Zvi | Feb 20, 2024 |
“Why does generalization work?” by Martín Soto | Feb 20, 2024 |
“Retirement Accounts and Short Timelines” by jefftk | Feb 19, 2024 |
“Protocol evaluations: good analogies vs control” by Fabien Roger | Feb 19, 2024 |
[Linkpost] “I’d also take $7 trillion” by bhauth | Feb 19, 2024 |
“On coincidences and Bayesian reasoning, as applied to the origins of COVID-19” by viking_math | Feb 19, 2024 |
“Things I’ve Grieved” by Raemon | Feb 18, 2024 |
“2023 Survey Results” by Screwtape | Feb 18, 2024 |
[Linkpost] “‘What if we could redesign society from scratch? The promise of charter cities.’ [Rational Animations video]” by Jackson Wagner | Feb 18, 2024 |
“Self-Awareness: Taxonomy and eval suite proposal” by Daniel Kokotajlo | Feb 17, 2024 |
“The Pointer Resolution Problem” by Jozdien | Feb 16, 2024 |
“Every ‘Every Bay Area House Party’ Bay Area House Party” by Richard_Ngo | Feb 16, 2024 |
[Linkpost] “‘No-one in my org puts money in their pension’” by Tobes | Feb 16, 2024 |
“Fixing Feature Suppression in SAEs” by Benjamin Wright | Feb 16, 2024 |
“Offering AI safety support calls for ML professionals” by Vael Gates | Feb 15, 2024 |
“Raising children on the eve of AI” by juliawise | Feb 15, 2024 |
[Linkpost] “FTX expects to return all customer money; clawbacks may go away” by Mikhail Samin | Feb 14, 2024 |
“CFAR Takeaways: Andrew Critch” by Raemon | Feb 14, 2024 |
[Linkpost] “Masterpiece” by Richard_Ngo | Feb 13, 2024 |
“Lsusr’s Rationality Dojo” by lsusr | Feb 13, 2024 |
“Where is the Town Square?” by Gretta Duleba | Feb 13, 2024 |
[Linkpost] “My cover story in Jacobin on AI capitalism and the x-risk debates” by garrison | Feb 12, 2024 |
“Tort Law Can Play an Important Role in Mitigating AI Risk” by Gabriel Weil | Feb 12, 2024 |
“On the Proposed California SB 1047” by Zvi | Feb 12, 2024 |
“Stop Defeating Akrasia (2/4)” by Edith | Feb 12, 2024 |
“Thoughts on ‘The Offense-Defense Balance Rarely Changes’” by Cullen | Feb 12, 2024 |
“Skepticism about DeepMind’s ‘Grandmaster-Level’ Chess Without Search” by Arjun Panickssery | Feb 12, 2024 |
“How do you actually obtain and report a likelihood function for scientific research?” by Peter Berggren | Feb 11, 2024 |
“And All the Shoggoths Merely Players” by Zack_M_Davis | Feb 10, 2024 |
[Linkpost] “Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy” by garrison | Feb 10, 2024 |
“One True Love” by Zvi | Feb 09, 2024 |
“Skills I’d like my collaborators to have” by Raemon | Feb 09, 2024 |
“Transfer learning and generalization-qua-capability in Babbage and Davinci (or, why division is better than Spanish)” by RP, agg | Feb 09, 2024 |
“Running the Numbers on a Heat Pump” by jefftk | Feb 09, 2024 |
“Updatelessness doesn’t solve most problems” by Martín Soto | Feb 08, 2024 |
“AI #50: The Most Dangerous Thing” by Zvi | Feb 08, 2024 |
“Believing In” by AnnaSalamon | Feb 08, 2024 |
[Linkpost] “A Chess-GPT Linear Emergent World Representation” by karvonenadam | Feb 08, 2024 |
“Nitric oxide for covid and other viral infections” by Elizabeth | Feb 07, 2024 |
[Linkpost] “Debating with More Persuasive LLMs Leads to More Truthful Answers” by Akbir Khan, John Hughes, Dan Valentine, Sam Bowman, Ethan Perez | Feb 07, 2024 |
[Linkpost] “More Hyphenation” by Arjun Panickssery | Feb 07, 2024 |
“Why I think it’s net harmful to do technical safety research at AGI labs” by Remmelt | Feb 07, 2024 |
“story-based decision-making” by bhauth | Feb 07, 2024 |
“How to train your own ‘Sleeper Agents’” by evhub | Feb 07, 2024 |
“My guess at Conjecture’s vision: triggering a narrative bifurcation” by Alexandre Variengien | Feb 06, 2024 |
“what does davidad want from boundaries?” by Chipmonk, davidad | Feb 06, 2024 |
“Preventing exfiltration via upload limits seems promising” by ryan_greenblatt | Feb 06, 2024 |
“On the Debate Between Jezos and Leahy” by Zvi | Feb 06, 2024 |
“Toy models of AI control for concentrated catastrophe prevention” by Fabien Roger, Buck | Feb 06, 2024 |
[Linkpost] “Things You’re Allowed to Do: University Edition” by Saul Munn | Feb 06, 2024 |
“Value learning in the absence of ground truth” by Joel_Saarinen | Feb 05, 2024 |
“Implementing activation steering” by Annah | Feb 05, 2024 |
“Safe Stasis Fallacy” by Davidmanheim | Feb 05, 2024 |
“Noticing Panic” by Cole Wyeth | Feb 05, 2024 |
“Brute Force Manufactured Consensus is Hiding the Crime of the Century” by Roko | Feb 03, 2024 |
“Theories of Applied Rationality” by Camille Berger | Feb 03, 2024 |
“Why I no longer identify as transhumanist” by Kaj_Sotala | Feb 03, 2024 |
“Attention SAEs Scale to GPT-2 Small” by Connor Kissane, robertzk, Arthur Conmy, Neel Nanda | Feb 03, 2024 |
“Announcing the London Initiative for Safe AI (LISA)” by James Fox, mike_safeAI, Ryan Kidd | Feb 02, 2024 |
“Survey for alignment researchers: help us build better field-level models” by Cameron Berg, Judd Rosenblatt, AE Studio | Feb 02, 2024 |
“On Dwarkesh’s 3rd Podcast With Tyler Cowen” by Zvi | Feb 02, 2024 |
“Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small” by Joseph Bloom | Feb 02, 2024 |
[Linkpost] “Soft Prompts for Evaluation: Measuring Conditional Distance of Capabilities” by porby | Feb 02, 2024 |
[Linkpost] “Davidad’s Provably Safe AI Architecture - ARIA’s Programme Thesis” by simeon_c | Feb 01, 2024 |
“Wrong answer bias” by lukehmiles | Feb 01, 2024 |
“On Not Requiring Vaccination” by jefftk | Feb 01, 2024 |
“Managing risks while trying to do good” by Wei Dai | Feb 01, 2024 |
“Leading The Parade” by johnswentworth | Jan 31, 2024 |
“Per protocol analysis as medical malpractice” by braces | Jan 31, 2024 |
“Ten Modes of Culture War Discourse” by jchan | Jan 31, 2024 |
“Without Fundamental Advances, Rebellion and Coup d’État are the Inevitable Outcomes of Dictators & Monarchs Trying to Control Large, Capable Countries” by Roko | Jan 31, 2024 |
[Linkpost] “Explaining Impact Markets” by Saul Munn | Jan 31, 2024 |
[Linkpost] “on neodymium magnets” by bhauth | Jan 30, 2024 |
“Childhood and Education Roundup #4” by Zvi | Jan 30, 2024 |
“The case for more ambitious language model evals” by Jozdien | Jan 30, 2024 |
“Processor clock speeds are not how fast AIs think” by Ege Erdil | Jan 29, 2024 |
[Linkpost] “Chapter 1 of How to Win Friends and Influence People” by gull | Jan 29, 2024 |
“Simple distribution approximation: When sampled 100 times, can language models yield 80% A and 20% B?” by Teun van der Weij, Felix Hofstätter, Francis Rhys Ward | Jan 29, 2024 |
“Why I take short timelines seriously” by NicholasKees | Jan 28, 2024 |
[Linkpost] “Win Friends and Influence People Ch. 2: The Bombshell” by gull | Jan 28, 2024 |
[Linkpost] “Things You’re Allowed to Do: At the Dentist” by rbinnn | Jan 28, 2024 |
[Linkpost] “Palworld development blog post” by bhauth | Jan 28, 2024 |
“Don’t sleep on Coordination Takeoffs” by trevor | Jan 27, 2024 |
“Epistemic Hell” by rogersbacon | Jan 27, 2024 |
[Linkpost] “The Good Balsamic Vinegar” by jenn | Jan 26, 2024 |
[Linkpost] “Making every researcher seek grants is a broken model” by jasoncrawford | Jan 26, 2024 |
[Linkpost] “Surgery Works Well Without The FDA” by Maxwell Tabarrok | Jan 26, 2024 |
“Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI” by Jeremy Gillen, peterbarnett | |
|
Jan 26, 2024 |
“‘Does your paradigm beget new, good, paradigms?’” by Raemon
|
Jan 25, 2024 |
“AI #48: The Talk of Davos” by Zvi
|
Jan 25, 2024 |
[Linkpost] “[Repost] The Copenhagen Interpretation of Ethics” by mesaoptimizer
|
Jan 25, 2024 |
“The case for ensuring that powerful AIs are controlled” by ryan_greenblatt, Buck
|
Jan 24, 2024 |
“Monthly Roundup #14: January 2024” by Zvi
|
Jan 24, 2024 |
“This might be the last AI Safety Camp” by Remmelt, Linda Linsefors
|
Jan 24, 2024 |
“Humans aren’t fleeb.” by Charlie Steiner
|
Jan 24, 2024 |
“From Finite Factors to Bayes Nets” by J Bostock
|
Jan 23, 2024 |
“Making a Secular Solstice Songbook” by jefftk
|
Jan 23, 2024 |
[Linkpost] “Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)” by Kaj_Sotala
|
Jan 23, 2024 |
[Linkpost] “the subreddit size threshold” by bhauth
|
Jan 23, 2024 |
“There is way too much serendipity” by Malmesbury
|
Jan 22, 2024 |
“We need a science of evals” by Marius Hobbhahn, Jérémy Scheurer
|
Jan 22, 2024 |
“‘ petertodd’’s last stand: The final days of open GPT-3 research” by mwatkins
|
Jan 22, 2024 |
[Linkpost] “InterLab – a toolkit for experiments with multi-agent interactions” by Tomáš Gavenčiak, Ada Böhm, Jan_Kulveit
|
Jan 22, 2024 |
“When Does Altruism Strengthen Altruism?” by jefftk
|
Jan 21, 2024 |
“A Shutdown Problem Proposal” by johnswentworth, David Lorell
|
Jan 21, 2024 |
[Linkpost] “Book review: Cuisine and Empire” by eukaryote
|
Jan 21, 2024 |
[Linkpost] “legged robot scaling laws” by bhauth
|
Jan 20, 2024 |
“A quick investigation of AI pro-AI bias” by Fabien Roger
|
Jan 19, 2024 |
“On ‘Geeks, MOPs, and Sociopaths’” by alkjash, Gordon Seidoh Worley
|
Jan 19, 2024 |
“Logical Line-Of-Sight Makes Games Sequential or Loopy” by StrivingForLegibility
|
Jan 19, 2024 |
[Linkpost] “Manifund: 2023 in Review” by Austin Chen
|
Jan 18, 2024 |
[Linkpost] “Against Nonlinear (Thing Of Things)” by tailcalled
|
Jan 18, 2024 |
[Linkpost] “The True Story of How GPT-2 Became Maximally Lewd” by Writer
|
Jan 18, 2024 |
“On the abolition of man” by Joe Carlsmith
|
Jan 18, 2024 |
“Good job opportunities for helping with the most important century” by HoldenKarnofsky
|
Jan 18, 2024 |
“AI #47: Exponentials in Geometry” by Zvi
|
Jan 18, 2024 |
“Does literacy remove your ability to be a bard as good as Homer?” by Adrià Garriga-alonso
|
Jan 18, 2024 |
“Four visions of Transformative AI success” by Steven Byrnes
|
Jan 17, 2024 |
[Linkpost] “AlphaGeometry: An Olympiad-level AI system for geometry” by alyssavance
|
Jan 17, 2024 |
“On Anthropic’s Sleeper Agents Paper” by Zvi
|
Jan 17, 2024 |
“An Introduction To The Mandelbrot Set That Doesn’t Mention Complex Numbers” by Yitz
|
Jan 17, 2024 |
“Medical Roundup #1” by Zvi
|
Jan 16, 2024 |
“Being nicer than Clippy” by Joe Carlsmith
|
Jan 16, 2024 |
“Managing catastrophic misuse without robust AIs” by ryan_greenblatt, Buck
|
Jan 16, 2024 |
“Why wasn’t preservation with the goal of potential future revival started earlier in history?” by Andy_McKenzie
|
Jan 16, 2024 |
“The impossible problem of due process” by mingyuan
|
Jan 16, 2024 |
“Sparse Autoencoders Work on Attention Layer Outputs” by Connor Kissane, robertzk, Arthur Conmy, Neel Nanda
|
Jan 16, 2024 |
“Goals selected from learned knowledge: an alternative to RL alignment” by Seth Herd
|
Jan 15, 2024 |
[Linkpost] “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training” by evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer, Ethan Perez
|
Jan 15, 2024 |
“The case for training frontier AIs on Sumerian-only corpus” by Alexandre Variengien, Charbel-Raphaël, Jonathan Claybrough
|
Jan 15, 2024 |
“What good is G-factor if you’re dumped in the woods? A field report from a camp counselor.” by Hastings
|
Jan 14, 2024 |
[Linkpost] “Land Reclamation is in the 9th Circle of Stagnation Hell” by Maxwell Tabarrok
|
Jan 14, 2024 |
[Linkpost] “Gender Exploration” by sapphire
|
Jan 14, 2024 |
“Universal Love Integration Test: Hitler” by Raemon
|
Jan 14, 2024 |
“D&D.Sci(-fi): Colonizing the SuperHyperSphere” by abstractapplic
|
Jan 14, 2024 |
[Linkpost] “The Leeroy Jenkins principle: How faulty AI could guarantee ‘warning shots’” by titotal
|
Jan 14, 2024 |
“Notice When People Are Directionally Correct” by Chris_Leong
|
Jan 14, 2024 |
“Against most AI risk analogies” by Matthew Barnett
|
Jan 14, 2024 |
[Linkpost] “The Perceptron Controversy” by Yuxi_Liu
|
Jan 13, 2024 |
“An even deeper atheism” by Joe Carlsmith
|
Jan 13, 2024 |
“AI #46: Meet the New Year” by Zvi
|
Jan 13, 2024 |
“Introducing Alignment Stress-Testing at Anthropic” by evhub
|
Jan 12, 2024 |
“The Aspiring Rationalist Congregation” by maia
|
Jan 12, 2024 |
“Apply to the PIBBSS Summer Research Fellowship” by Nora_Ammann, DusanDNesic, Lucas Teixeira
|
Jan 12, 2024 |
“Introduce a Speed Maximum” by jefftk
|
Jan 11, 2024 |
“An Actually Intuitive Explanation of the Oberth Effect” by Isaac King
|
Jan 10, 2024 |
“Saving the world sucks” by Defective Altruism
|
Jan 10, 2024 |
“On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche” by Zack_M_Davis
|
Jan 09, 2024 |
“Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor” by RogerDearnaley
|
Jan 09, 2024 |
“Does AI risk ‘other’ the AIs?” by Joe Carlsmith
|
Jan 09, 2024 |
“Learning Math in Time for Alignment” by NicholasKross
|
Jan 09, 2024 |
“A starter guide for evals” by Marius Hobbhahn, Mikita Balesni, Jérémy Scheurer, rusheb
|
Jan 08, 2024 |
“When ‘yang’ goes wrong” by Joe Carlsmith
|
Jan 08, 2024 |
“2023 Prediction Evaluations” by Zvi
|
Jan 08, 2024 |
“Reflections on my first year of AI safety research” by Jay Bailey
|
Jan 08, 2024 |
[Linkpost] “A model of research skill” by L Rudolf L
|
Jan 08, 2024 |
“Deceptive AI ≠ Deceptively-aligned AI” by Steven Byrnes
|
Jan 07, 2024 |
[Linkpost] “Bayesians Commit the Gambler’s Fallacy” by Kevin Dorst
|
Jan 07, 2024 |
[Linkpost] “Defending against hypothetical moon life during Apollo 11” by eukaryote
|
Jan 07, 2024 |
“AI Risk and the US Presidential Candidates” by Zane
|
Jan 06, 2024 |
“Survey of 2,778 AI authors: six parts in pictures” by KatjaGrace
|
Jan 06, 2024 |
[Linkpost] “Project ideas: Epistemics” by Lukas Finnveden
|
Jan 05, 2024 |
[Linkpost] “Almost everyone I’ve met would be well-served thinking more about what to focus on” by Henrik Karlsson
|
Jan 05, 2024 |
“The Next ChatGPT Moment: AI Avatars” by kolmplex, southpaw
|
Jan 05, 2024 |
“Catching AIs red-handed” by ryan_greenblatt, Buck
|
Jan 05, 2024 |
“MIRI 2024 Mission and Strategy Update” by Malo
|
Jan 05, 2024 |
“Deep atheism and AI risk” by Joe Carlsmith
|
Jan 04, 2024 |
“Some Vacation Photos” by johnswentworth
|
Jan 04, 2024 |
“AI #45: To Be Determined” by Zvi
|
Jan 04, 2024 |
“What’s up with LLMs representing XORs of arbitrary features?” by Sam Marks
|
Jan 03, 2024 |
“Safety First: safety before full alignment. The deontic sufficiency hypothesis.” by Chipmonk
|
Jan 03, 2024 |
[Linkpost] “Practically A Book Review: Appendix to ‘Nonlinear’s Evidence: Debunking False and Misleading Claims’ (ThingOfThings)” by tailcalled
|
Jan 03, 2024 |
“Copyright Confrontation #1” by Zvi
|
Jan 03, 2024 |
“Trading off Lives” by jefftk
|
Jan 03, 2024 |
“Gentleness and the artificial Other” by Joe Carlsmith
|
Jan 02, 2024 |
“Apologizing is a Core Rationalist Skill” by johnswentworth
|
Jan 02, 2024 |
“OpenAI’s Preparedness Framework: Praise & Recommendations” by Akash
|
Jan 02, 2024 |
“Dating Roundup #2: If At First You Don’t Succeed” by Zvi
|
Jan 02, 2024 |
“AI Is Not Software” by Davidmanheim
|
Jan 02, 2024 |
[Linkpost] “Steering Llama-2 with contrastive activation additions” by Nina Rimsky, Wuschel Schulz, NickGabs, Meg, evhub, TurnTrout
|
Jan 02, 2024 |
“Boston Solstice 2023 Retrospective” by jefftk
|
Jan 02, 2024 |
“Stop talking about p(doom)” by Isaac King
|
Jan 01, 2024 |
“2023 in AI predictions” by jessicata
|
Jan 01, 2024 |
“Bayesian updating in real life is mostly about understanding your hypotheses” by Max H
|
Jan 01, 2024 |