The Gradient: Perspectives on AI

By The Gradient

Category: Technology


Subscribers: 43
Reviews: 0
Episodes: 76

Description

Interviews with various people who research, build, or use AI, including academics, engineers, artists, entrepreneurs, and more.

thegradientpub.substack.com

Episodes

Riley Goodside: The Art and Craft of Prompt Engineering
Jun 01, 2023 · Duration: 59:42

In episode 75 of The Gradient Podcast, Daniel Bashir speaks to Riley Goodside.

Riley is a Staff Prompt Engineer at Scale AI. Riley began posting GPT-3 prompt examples and screenshot demonstrations in 2022. He previously worked as a data scientist at OkCupid, Grindr, and CopyAI.

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (01:37) Riley’s journey to becoming the first Staff Prompt Engineer

* (02:00) Data science background in the online dating industry

* (02:15) Sabbatical + catching up on LLM progress

* (04:00) AI Dungeon and first taste of GPT-3

* (05:10) Developing on Codex, ideas about integrating Codex with Jupyter Notebooks, start of posting on Twitter

* (08:30) “LLM ethnography”

* (09:12) The history of prompt engineering: in-context learning, Reinforcement Learning from Human Feedback (RLHF)

* (10:20) Models used to be harder to talk to

* (10:45) The three eras

* (10:45) 1 - Pre-trained LM era—simple next-word predictors

* (12:54) 2 - Instruction tuning

* (16:13) 3 - RLHF and overcoming instruction tuning’s limitations

* (19:24) Prompting as subtractive sculpting, prompting and AI safety

* (21:17) Riley on RLHF and safety

* (24:55) Riley’s most interesting experiments and observations

* (25:50) Mode collapse in RLHF models

* (29:24) Prompting models with very long instructions

* (33:13) Explorations with regular expressions, chain-of-thought prompting styles

* (36:32) Theories of in-context learning and prompting, why certain prompts work well

* (42:20) Riley’s advice for writing better prompts

* (49:02) Debates over prompt engineering as a career, relevance of prompt engineers

* (58:55) Outro

Links:

* Riley’s Twitter and LinkedIn

* Talk: LLM Prompt Engineering and RLHF: History and Techniques



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 01, 2023
Talia Ringer: Formal Verification and Deep Learning
May 25, 2023 · Duration: 1:45:35

In episode 74 of The Gradient Podcast, Daniel Bashir speaks to Professor Talia Ringer.

Professor Ringer is an Assistant Professor with the Programming Languages, Formal Methods, and Software Engineering group at the University of Illinois at Urbana-Champaign. Their research leverages proof engineering to allow programmers to more easily build formally verified software systems.

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Daniel’s long annoying intro

* (02:15) Origin Story

* (04:30) Why / when formal verification is important

* (06:40) Concerns about failures of ChatGPT/AutoGPT et al., systems for accountability

* (08:20) Difficulties in making formal verification accessible

* (11:45) Tactics and interactive theorem provers, interface issues

* (13:25) How Prof Ringer’s research first crossed paths with ML

* (16:00) Concrete problems in proof automation

* (16:15) How ML can help people verifying software systems

* (20:05) Using LLMs for understanding / reasoning about code

* (23:05) Going from tests / formal properties to code

* (31:30) Is deep learning the right paradigm for dealing with relations for theorem proving?

* (36:50) Architectural innovations, neuro-symbolic systems

* (40:00) Hazy definitions in ML

* (41:50) Baldur: Proof Generation & Repair with LLMs

* (45:55) In-context learning’s effectiveness for LLM-based theorem proving

* (47:12) LLMs without fine-tuning for proofs

* (48:45) Something ~ surprising ~ about Baldur results (maybe clickbait or maybe not)

* (49:32) Asking models to construct proofs with restrictions, translating proofs to formal proofs

* (52:07) Methods of proofs and relative difficulties

* (57:45) Verifying / providing formal guarantees on ML systems

* (1:01:15) Verifying input-output behavior and basic considerations, nature of guarantees

* (1:05:20) Certified/verified systems vs certifying/verifying systems—getting LLMs to spit out proofs along with code

* (1:07:15) Interpretability and how much model internals matter, RLHF, mechanistic interpretability

* (1:13:50) Levels of verification for deploying ML systems, HCI problems

* (1:17:30) People (Talia) actually use Bard

* (1:20:00) Dual-use and “correct behavior”

* (1:24:30) Good uses of jailbreaking

* (1:26:30) Talia’s views on evil AI / AI safety concerns

* (1:32:00) Issues with talking about “intelligence,” assumptions about what “general intelligence” means

* (1:34:20) Difficulty in having grounded conversations about capabilities, transparency

* (1:39:20) Great quotation to steal for your next thinkpiece + intelligence as socially defined

* (1:42:45) Exciting research directions

* (1:44:48) Outro

Links:

* Talia’s Twitter and homepage

* Research

* Concrete Problems in Proof Automation

* Baldur: Whole-Proof Generation and Repair with LLMs

* Research ideas



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Brigham Hyde: AI for Clinical Decision-Making
May 18, 2023 · Duration: 41:43

In episode 73 of The Gradient Podcast, Daniel Bashir speaks to Brigham Hyde.

Brigham is Co-Founder and CEO of Atropos Health. Prior to Atropos, he served as President of Data and Analytics at Eversana, a life sciences commercialization service provider. At Symphony AI, he led the investment in Concert AI in the oncology real-world data space. Brigham has also held research faculty positions at Tufts University and the MIT Media Lab.

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (01:55) Brigham’s background

* (06:00) Current challenges in healthcare

* (12:33) Interpretability and delivering positive patient outcomes

* (17:10) How Atropos surfaces relevant data for patient interventions, on personalized observational research studies

* (22:10) Quality and quantity of data for patient interventions

* (27:25) Challenges and opportunities for generative AI in healthcare

* (35:17) Database augmentation for generative models

* (36:25) Future work for Atropos

* (39:15) Future directions for AI + healthcare

* (40:56) Outro

Links:

* Atropos Health homepage

* Brigham’s Twitter and LinkedIn



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Scott Aaronson: Against AI Doomerism
May 11, 2023 · Duration: 1:09:32

In episode 72 of The Gradient Podcast, Daniel Bashir speaks to Professor Scott Aaronson.

Scott is the Schlumberger Centennial Chair of Computer Science at the University of Texas at Austin and director of its Quantum Information Center. His research interests focus on the capabilities and limits of quantum computers and computational complexity theory more broadly. He has recently been on leave to work at OpenAI, where he is researching theoretical foundations of AI safety. 

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (01:45) Scott’s background

* (02:50) Starting grad school in AI, transitioning to quantum computing and the AI / quantum computing intersection

* (05:30) Where quantum computers can give us exponential speedups, simulation overhead, Grover’s algorithm

* (10:50) Overselling of quantum computing applied to AI, Scott’s analysis on quantum machine learning

* (18:45) ML problems that involve quantum mechanics and Scott’s work

* (21:50) Scott’s recent work at OpenAI

* (22:30) Why Scott was skeptical of AI alignment work early on

* (26:30) Unexpected improvements in modern AI and Scott’s belief update

* (32:30) Preliminary Analysis of DALL-E 2 (Marcus & Davis)

* (34:15) Watermarking GPT outputs

* (41:00) Motivations for watermarking and language model detection

* (45:00) Ways around watermarking

* (46:40) Other aspects of Scott’s experience with OpenAI, theoretical problems

* (49:10) Thoughts on definitions for humanistic concepts in AI

* (58:45) Scott’s “reform AI alignment stance” and Eliezer Yudkowsky’s recent comments (+ Daniel pronounces Eliezer wrong), orthogonality thesis, cases for stopping scaling

* (1:08:45) Outro

Links:

* Scott’s blog

* AI-related work

* Quantum Machine Learning Algorithms: Read the Fine Print

* A very preliminary analysis of DALL-E 2 w/ Marcus and Davis

* New AI classifier for indicating AI-written text and Watermarking GPT Outputs

* Writing

* Should GPT exist?

* AI Safety Lecture

* Why I’m not terrified of AI



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Ted Underwood: Machine Learning and the Literary Imagination
May 04, 2023 · Duration: 1:43:59

In episode 71 of The Gradient Podcast, Daniel Bashir speaks to Ted Underwood.

Ted is a professor in the School of Information Sciences with an appointment in the Department of English at the University of Illinois at Urbana-Champaign. Trained in English literary history, he turned his research focus to applying machine learning to large digital collections. His work explores literary patterns that become visible across long timelines when we consider many works at once; often this involves correcting and enriching digital collections to make them more amenable to interesting literary research.

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (01:42) Ted’s background / origin story

* (04:35) Context in interpreting statistics, “you need a model,” the need for data about human responses to literature and how that manifested in Ted’s work

* (07:25) The recognition that we can model literary prestige/genre because of ML

* (08:30) Distant reading and the import of statistics over large digital libraries

* (12:00) Literary prestige

* (12:45) How predictable is fiction? Scales of predictability in texts

* (13:55) Degrees of autocorrelation in biography and fiction and the structure of narrative, how LMs might offer more sophisticated analysis

* (15:15) Braided suspense / suspense at different scales of a story

* (17:05) The Literary Uses of High-Dimensional Space: how “big data” came to impact the humanities, skepticism from humanists and responses, what you can do with word count

* (20:50) Why we could use more time to digest statistical ML—how acceleration in AI advances might impact pedagogy

* (22:30) The value in explicit models

* (23:30) Poetic “revolutions” and literary prestige

* (25:53) Distant vs. close reading in poetry—follow-up work for “The Longue Durée”

* (28:20) Sophistication of NLP and approaching the human experience

* (29:20) What about poetry renders it prestigious?

* (32:20) Individualism/liberalism and evolution of poetic taste

* (33:20) Why there is resistance to quantitative approaches to literature

* (34:00) Fiction in other languages

* (37:33) The Life Cycles of Genres

* (38:00) The concept of “genre”

* (41:00) Inflationary/deflationary views on natural kinds and genre

* (44:20) Genre as a social and not a linguistic phenomenon

* (46:10) Will causal models impact the humanities?

* (48:30) (Ir)reducibility of cultural influences on authors

* (50:00) Machine Learning and Human Perspective

* (50:20) Fluent and perspectival categories—Miriam Posner on “the radical, unrealized potential of digital humanities.”

* (52:52) How ML’s vices can become virtues for humanists

* (56:05) Can We Map Culture? and The Historical Significance of Textual Distances

* (56:50) Are cultures and other social phenomena related to one another in a way we can “map”?

* (59:00) Is cultural distance Euclidean?

* (59:45) The KL Divergence’s use for humanists

* (1:03:32) We don’t already understand the broad outlines of literary history

* (1:06:55) Science Fiction Hasn’t Prepared us to Imagine Machine Learning

* (1:08:45) The latent space of language and what intelligence could mean

* (1:09:30) LLMs as models of culture

* (1:10:00) What it is to be a human in “the age of AI” and Ezra Klein’s framing

* (1:12:45) Mapping the Latent Spaces of Culture

* (1:13:10) Ted on Stochastic Parrots

* (1:15:55) The risk of AI enabling hermetically sealed cultures

* (1:17:55) “Postcards from an unmapped latent space,” more on AI systems’ limitations as virtues

* (1:20:40) Obligatory GPT-4 section

* (1:21:00) Using GPT-4 to estimate passage of time in fiction

* (1:23:39) Is deep learning more interpretable than statistical NLP?

* (1:25:17) The “self-reports” of language models: should we trust them?

* (1:26:50) University dependence on tech giants, open-source models

* (1:31:55) Reclaiming Ground for the Humanities

* (1:32:25) What scientists, alone, can contribute to the humanities

* (1:34:45) On the future of the humanities

* (1:35:55) How computing can enable humanists as humanists

* (1:37:05) Human self-understanding as a collaborative project

* (1:39:30) Is anything ineffable? On what AI systems can “grasp”

* (1:43:12) Outro

Links:

* Ted’s blog and Twitter

* Research

* The literary uses of high-dimensional space

* The Longue Durée of literary prestige

* The Historical Significance of Textual Distances

* Machine Learning and Human Perspective

* The life cycles of genres

* Can We Map Culture?

* Cohort Succession Explains Most Change in Literary Culture

* Other Writing

* Reclaiming Ground for the Humanities

* We don’t already understand the broad outlines of literary history

* Science fiction hasn’t prepared us to imagine machine learning.

* How predictable is fiction?

* Mapping the latent spaces of culture

* Using GPT-4 to measure the passage of time in fiction



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Irene Solaiman: AI Policy and Social Impact
Apr 27, 2023 · Duration: 1:12:11

In episode 70 of The Gradient Podcast, Daniel Bashir speaks to Irene Solaiman.

Irene is an expert in AI safety and policy and the Policy Director at HuggingFace, where she conducts social impact research and develops public policy. In her former role at OpenAI, she initiated and led bias and social impact research in addition to leading public policy. She built AI policy at Zillow Group and advised policymakers on responsible autonomous decision-making and privacy as a fellow at Harvard’s Berkman Klein Center.

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:00) Intro to Irene and her work

* (03:45) What tech people need to learn about policy, and vice versa

* (06:35) Societal impact—words and reality, Irene’s experience

* (08:30) OpenAI work on GPT-2 and release strategies (yes, this was recorded on Pi Day)

* (11:00) Open-source proponents and release

* (14:00) What does a multidisciplinary approach to working on AI look like?

* (16:30) Thinking about end users and enabling contributors with different sets of expertise

* (18:00) “Preparing for AGI” and current approaches to release

* (21:00) Who constitutes a researcher? What constitutes safety and who gets resourced? Limitations in red-teaming potentially dangerous systems.

* (22:35) PALMS and Values-Targeted Datasets

* (25:52) PALMS and RLHF

* (27:00) Homogenization in foundation models, cultural contexts

* (29:45) Anthropic’s moral self-correction paper and Irene’s concerns about marketing “de-biasing” and oversimplification

* (31:50) Data work, human systemic problems → AI bias

* (33:55) Why do language models get more toxic as they get larger? (if you have ideas, let us know!)

* (35:45) The gradient of generative AI release, Irene’s experience with the open-source world, tradeoffs along the release gradient

* (38:40) More on Irene’s orientation towards release

* (39:40) Pragmatics of keeping models closed, dealing with open-source by force

* (42:22) Norm setting for release and use, normalization of documentation on social impacts

* (46:30) Race dynamics :(

* (49:45) Resource allocation and advances in ethics/policy, conversations on integrity and disinformation

* (53:10) Organizational goals, balancing technical research with policy work

* (58:10) Thoughts on governments’ AI policies, impact of structural assumptions

* (1:04:00) Approaches to AI-generated sexual content, need for more voices represented in conversations about AI

* (1:08:25) Irene’s suggestions for AI practitioners / technologists

* (1:11:24) Outro

Links:

* Irene’s homepage and Twitter

* Papers

* Release Strategies and the Social Impacts of Language Models

* Hugh Zhang’s open letter in The Gradient from 2019

* Process for Adapting Large Models to Society (PALMS) with Values-Targeted Datasets

* The Gradient of Generative AI Release: Methods and Considerations



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Drago Anguelov: Waymo and Autonomous Vehicles
Apr 20, 2023 · Duration: 1:05:23

In episode 69 of The Gradient Podcast, Daniel Bashir speaks to Drago Anguelov.

Drago is currently a Distinguished Scientist and Head of Research at Waymo, where he joined in 2018. Earlier, he spent eight years at Google working on 3D vision and pose estimation for StreetView, then leading a research team that developed computer vision systems for annotating Google Photos. He has been involved in developing popular neural network methods such as the Inception architecture and the SSD detector. Before joining Waymo, he also led the 3D perception team at Zoox.

Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:04) Drago’s background in AI and self-driving, work with Daphne Koller + Sebastian Thrun, computer vision / pose estimation

* (14:20) One- and two-stage object detectors

* (15:15) Early experiences and thoughts on self-driving and its prospects

* (21:00) An introduction to the “self-driving stack”: mapping & localization, perception, behavior modeling & planning, simulation

* (29:25) On Stuart Russell’s comments about early Waymo’s “old-fashioned” approach

* (37:34) Scaling 3D Detection: challenges and architectural innovations

* (43:20) Behavior modeling: making decisions and modeling interactions in multi-agent environments

* (52:42) Distributional RL (+ imitation learning) in self-driving?

* (54:10) The Waymo Open Dataset

* (1:01:48) Looking forward in self-driving

* (1:04:36) Outro

Links:

* Drago’s LinkedIn and Twitter

* Research

* SSD: Single Shot MultiBox Detector

* SCAPE: Shape completion and animation of people

* Behavior Models for Autonomous Driving

* Wayformer

* Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation

* Imitation Is Not Enough

* Scaling 3D Detection to the Long Tail



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Joanna Bryson: The Problems of Cognition
Apr 13, 2023 · Duration: 1:13:05

In episode 68 of The Gradient Podcast, Daniel Bashir speaks to Professor Joanna Bryson.

Professor Bryson is Professor of Ethics and Technology at the Hertie School, where her research focuses on the impact of technology on human cooperation and AI/ICT governance. She has advised companies, governments, transnational agencies, and NGOs, particularly on AI policy. She is one of the few people doing this sort of work who has both a PhD and work experience in AI as well as advanced degrees in the social sciences. She began her academic career in the liberal arts, and publishes regularly in the natural sciences.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (01:35) Intro to Professor Bryson’s work

* (06:37) Shifts in backgrounds expected of AI PhDs/researchers

* (09:40) Master’s degree in Edinburgh, Behavior-Based AI

* (11:00) PhD, differences between MIT’s engineering focus and Edinburgh, systems engineering + AI

* (16:15) Comments on ways you can make contributions in AI

* (18:45) When definitions of “intelligence” are important

* (24:23) Non- and proto-linguistic aspects of intelligence, arguments about text as a description of human experience

* (31:45) Cognitive leaps in interacting with language models

* (37:00) Feelings of affiliation for robots, phenomenological experience in humans and (not) in AI systems

* (42:00) Language models and technological systems as cultural artifacts, expressing agency through machines

* (44:15) Capabilities development and moral patient status in AI systems

* (51:20) Prof. Bryson’s perspectives on recent AI regulation

* (1:00:55) Responsibility and recourse, Uber self-driving crash

* (1:07:30) “Preparing for AGI,” “Living with AGI,” how to respond to recent AI developments

* (1:12:18) Outro

Links:

* Professor Bryson’s homepage and Twitter

* Papers

* Systems AI

* Behavior Oriented Design, action selection, key differences in methodology/views between systems AI researchers and e.g. connectionists

* Agent architecture as object oriented design (1998)

* Intelligence by design: Principles of modularity and coordination for engineering complex adaptive agents (2001)

* Cognition

* Age-Related Inhibition and Learning Effects: Evidence from Transitive Performance (2013)

* Primate Errors in Transitive ‘Inference’: A Two-Tier Learning Model (2007)

* Skill Acquisition Through Program-Level Imitation in a Real-Time Domain

* Agent-Based Models as Scientific Methodology: A Case Study Analysing Primate Social Behaviour (2008, 2011)

* Social learning in a non-social reptile (Geochelone carbonaria) (2010)

* Understanding and Addressing Cultural Variation in Costly Antisocial Punishment (2014)

* Polarization Under Rising Inequality and Economic Decline (2020)

* Semantics derived automatically from language corpora contain human-like biases (2017)

* Evolutionary Psychology and Artificial Intelligence: The Impact of Artificial Intelligence on Human Behaviour (2020)

* Ethics/Policy

* Robots should be slaves (2010)

* Standardizing Ethical Design for Artificial Intelligence and Autonomous Systems (2017)

* Of, For, and By the People: The Legal Lacuna of Synthetic Persons (2017)

* Patiency is not a virtue: the design of intelligent systems and systems of ethics (2018)

* Other writing

* Reflections on the EU’s AI Act

* Is There an AI Cold War?

* Living with AGI

* One Day, AI Will Seem as Human as Anyone. What Then?



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Daniel Situnayake: AI on the Edge
Apr 06, 2023 · Duration: 1:58:07

In episode 67 of The Gradient Podcast, Daniel Bashir speaks to Daniel Situnayake.

Daniel is Head of Machine Learning at Edge Impulse. He is co-author of the O’Reilly books "AI at the Edge" and "TinyML". Previously, he worked on the TensorFlow Lite team at Google AI and co-founded Tiny Farms, an insect farming company. Daniel has also lectured in AIDC technologies at Birmingham City University.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (1:40) Daniel S.’s Origin Story: computer networking, RFID/barcoding, earlier jobs, Tiny Farms, TensorFlow Lite, writing on TinyML, and Edge Impulse

* (15:30) Edge AI and questions of embodiment/intelligence in AI

* (21:00) The role of hardware, other constraints in edge AI

* (25:00) Definitions of intelligence

* (29:45) What is edge AI?

* (37:30) The spectrum of edge devices

* (43:45) Innovations in edge AI (architecture, frameworks/toolchains, quantization)

* (53:45) Model compression tradeoffs in edge

* (1:00:30) Federated learning and challenges

* (1:09:00) Intro to Edge Impulse

* (1:20:30) Feature engineering for edge systems, fairness considerations

* (1:25:50) Edge AI and axes in AI (large/small, ethereal/embodied)

* (1:37:00) Daniel and Daniel go off the rails on panpsychism

* (1:54:20) Daniel’s advice for aspiring AI practitioners

* (1:57:20) Outro

Links:

* Daniel’s Twitter and blog

* Edge Impulse



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Soumith Chintala: PyTorch
Mar 30, 2023 · Duration: 1:08:20

In episode 66 of The Gradient Podcast, Daniel Bashir speaks to Soumith Chintala.

Soumith is a Research Engineer at Meta AI Research in NYC. He is the co-creator and lead of PyTorch, and maintains a number of other open-source ML projects including Torch7 and EBLearn. Soumith has previously worked on robotics, object and human detection, generative modeling, AI for video games, and ML systems research.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (01:30) Soumith’s intro to AI and journey to PyTorch

* (05:00) State of computer vision early in Soumith’s career

* (09:15) Institutional inertia and sunk costs in academia, identifying fads

* (12:45) How Soumith started working on GANs, frustrations

* (17:45) State of ML frameworks early in the deep learning era, differentiators

* (23:50) Frameworks and leveling the playing field, exceptions

* (25:00) Contributing to Torch and evolution into PyTorch

* (29:15) Soumith’s product vision for ML frameworks

* (32:30) From product vision to concrete features in PyTorch

* (39:15) Progressive disclosure of complexity (Chollet) in PyTorch

* (41:35) Building an open source community

* (43:25) The different players in today’s ML framework ecosystem

* (49:35) ML frameworks pioneered by Yann LeCun and Léon Bottou, their influences on PyTorch

* (54:37) PyTorch 2.0 and looking to the future

* (58:00) Soumith’s adventures in household robotics

* (1:03:25) Advice for aspiring ML practitioners

* (1:07:10) Be cool like Soumith and subscribe :)

* (1:07:33) Outro

Links:

* Soumith’s Twitter and homepage

* Papers

* Convolutional Neural Networks Applied to House Numbers Digit Classification

* GANs: LAPGAN, DCGAN, Wasserstein GAN

* Automatic differentiation in PyTorch

* PyTorch: An Imperative Style, High-Performance Deep Learning Library



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sewon Min: The Science of Natural Language
Mar 23, 2023 · Duration: 1:42:44

In episode 65 of The Gradient Podcast, Daniel Bashir speaks to Sewon Min.

Sewon is a fifth-year PhD student in the NLP group at the University of Washington, advised by Hannaneh Hajishirzi and Luke Zettlemoyer. She is a part-time visiting researcher at Meta AI and a recipient of the JP Morgan PhD Fellowship. She has previously spent time at Google Research and Salesforce Research.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (03:00) Origin Story

* (04:20) Evolution of Sewon’s interests, question-answering and practical NLP

* (07:00) Methodology concerns about benchmarks

* (07:30) Multi-hop reading comprehension

* (09:30) Do multi-hop QA benchmarks actually measure multi-hop reasoning?

* (12:00) How models can “cheat” multi-hop benchmarks

* (13:15) Explicit compositionality

* (16:05) Commonsense reasoning and background information

* (17:30) On constructing good benchmarks

* (18:40) AmbigQA and ambiguity

* (22:20) Types of ambiguity

* (24:20) Practical possibilities for models that can handle ambiguity

* (25:45) FaVIQ and fact-checking benchmarks

* (28:45) External knowledge

* (29:45) Fact verification and “complete understanding of evidence”

* (31:30) Do models do what we expect/intuit in reading comprehension?

* (34:40) Applications for fact-checking systems

* (36:40) Intro to in-context learning (ICL)

* (38:55) Example of an ICL demonstration

* (40:45) Rethinking the Role of Demonstrations and what matters for successful ICL

* (43:00) Evidence for a Bayesian inference perspective on ICL

* (45:00) ICL + gradient descent and what it means to “learn”

* (47:00) MetaICL and efficient ICL

* (49:30) Distance between tasks and MetaICL task transfer

* (53:00) Compositional tasks for language models, compositional generalization

* (55:00) The number and diversity of meta-training tasks

* (58:30) MetaICL and Bayesian inference

* (1:00:30) Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations

* (1:02:00) The copying effect

* (1:03:30) Copying effect for non-identical examples

* (1:06:00) More thoughts on ICL

* (1:08:00) Understanding Chain-of-Thought Prompting

* (1:11:30) Bayes strikes again

* (1:12:30) Intro to Sewon’s text retrieval research

* (1:15:30) Dense Passage Retrieval (DPR)

* (1:18:40) Similarity in QA and retrieval

* (1:20:00) Improvements for DPR

* (1:21:50) Nonparametric Masked Language Modeling (NPM)

* (1:24:30) Difficulties in training NPM and solutions

* (1:26:45) Follow-on work

* (1:29:00) Important fundamental limitations of language models

* (1:31:30) Sewon’s experience doing a PhD

* (1:34:00) Research challenges suited for academics

* (1:35:00) Joys and difficulties of the PhD

* (1:36:30) Sewon’s advice for aspiring PhDs

* (1:38:30) Incentives in academia, production of knowledge

* (1:41:50) Outro

Links:

* Sewon’s homepage and Twitter

* Papers

* Solving and re-thinking benchmarks

* Multi-hop Reading Comprehension through Question Decomposition and Rescoring / Compositional Questions Do Not Necessitate Multi-hop Reasoning

* AmbigQA: Answering Ambiguous Open-domain Questions

* FaVIQ: FAct Verification from Information-seeking Questions

* Language Modeling

* Rethinking the Role of Demonstrations

* MetaICL: Learning to Learn In Context

* Towards Understanding CoT Prompting

* Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations

* Text representation/retrieval

* Dense Passage Retrieval

* Nonparametric Masked Language Modeling



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Richard Socher: Re-Imagining Search
Mar 16, 2023 · Duration: 1:37:49

In episode 64 of The Gradient Podcast, Daniel Bashir speaks to Richard Socher.

Richard is founder and CEO of you.com, a new search engine that lets you personalize your search workflow and eschews tracking and invasive ads. Richard was previously Chief Scientist at Salesforce where he led work on fundamental and applied research, product incubation, CRM search, customer service automation and a cross-product AI platform. He was an adjunct professor at Stanford’s CS department as well as founder and CEO/CTO of MetaMind, which was acquired by Salesforce in 2016. He received his PhD from Stanford’s CS Department in 2014.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:20) Richard Socher origin story + time at MetaMind, Salesforce (AI Economist, CTRL, ProGen)

* (22:00) Why Richard advocated for deep learning in NLP

* (27:00) Richard’s perspective on language

* (32:20) Is physical grounding and language necessary for intelligence?

* (40:10) Frankfurtian b******t and language model utterances as truth

* (47:00) Lessons from Salesforce Research

* (53:00) Balancing fundamental research with product focus

* (57:30) The AI Economist + how should policymakers account for limitations?

* (1:04:50) you.com, the chatbot wars, and taking on search giants

* (1:13:50) Re-imagining the vision for and components of a search engine

* (1:18:00) The future of generative models in search and the internet

* (1:28:30) Richard’s advice for early-career technologists

* (1:37:00) Outro

Links:

* Richard’s Twitter 

* YouChat by you.com

* Careers at you.com

* Papers mentioned

* Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions

* Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

* Grounded Compositional Semantics for Finding and Describing Images with Sentences

* The AI Economist

* ProGen

* CTRL



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Joe Edelman: Meaning-Aligned AI
Mar 09, 2023 · Duration: 1:06:23

In episode 63 of The Gradient Podcast, Daniel Bashir speaks to Joe Edelman.

Joe developed the meaning-based organizational metrics at Couchsurfing.com, then co-founded the Center for Humane Technology with Tristan Harris, and coined the term “Time Well Spent” for a family of metrics adopted by teams at Facebook, Google, and Apple. Since then, he's worked on the philosophical underpinnings for new business metrics, design methods, and political movements. The central idea is to make people's sources of meaning explicit, so that how meaningful or meaningless things are can be rigorously accounted for. His previous career was in HCI and programming language design.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro (yes Daniel is trying a new intro format)

* (01:30) Joe’s origin story

* (07:15) Revealed preferences and personal meaning, recommender systems

* (12:30) Is using revealed preferences necessary?

* (17:00) What are values and how do you detect them?

* (24:00) Figuring out what’s meaningful to us

* (28:45) The decline of spaces and togetherness

* (35:00) Individualism and economic/political theory, tensions between collectivism/individualism

* (41:00) What it looks like to build spaces, Habitat

* (47:15) Cognitive effects of social platforms

* (51:45) Atomized communication, re-imagining chat apps

* (55:50) Systems for social groups and medium independence

* (1:02:45) Spaces being built today

* (1:05:15) Joe is building research groups! Get in touch :)

* (1:05:40) Outro

Links:

* Joe's 80-minute lecture on techniques for rebuilding society on meaning (YouTube, transcript)

* The Discord for Rebuilding Meaning—join if you'd like to help build ML models or metrics using the methods discussed

* Writing/papers mentioned:

* Tech products (that don’t cause depression and war)

* Values, Preferences, Meaningful Choice

* Social Programming Considered as a Habitat for Groups

* Is Anything Worth Maximizing

* Joe’s homepage, Twitter, and YouTube page



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Ed Grefenstette: Language, Semantics, Cohere
Mar 02, 2023 · Duration: 1:14:16

In episode 62 of The Gradient Podcast, Daniel Bashir speaks to Ed Grefenstette.

Ed is Head of Machine Learning at Cohere and an Honorary Professor at University College London. He previously held research scientist positions at Facebook AI Research and DeepMind, following a stint as co-founder and CTO of Dark Blue Labs. Before his time in industry, Ed worked at Oxford’s Department of Computer Science as a lecturer and Fulford Junior Research Fellow at Somerville College. Ed also received his MSc and DPhil from Oxford’s Computer Science Department.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:18) The Ed Grefenstette Origin Story

* (08:15) Distributional semantics and Ed’s PhD research

* (14:30) Extending the distributional hypothesis, later Wittgenstein

* (18:00) Recovering parse trees in LMs, can LLMs understand communication and not just bare language?

* (23:15) LMs capture something about pragmatics, proxies for grounding and pragmatics

* (25:00) Human-in-the-loop training and RLHF—what is the essential differentiator?

* (28:15) A convolutional neural network for modeling sentences, relationship to attention

* (34:20) Difficulty of constructing supervised learning datasets, benchmark-driven development

* (40:00) Learning to Transduce with Unbounded Memory, Neural Turing Machines

* (47:40) If RNNs are like finite state machines, where are transformers?

* (51:40) Cohere and why Ed joined

* (56:30) Commercial applications of LLMs and Cohere’s product

* (59:00) Ed’s reply to stochastic parrots and thoughts on consciousness

* (1:03:30) Lessons learned about doing effective science

* (1:05:00) Where does scaling end?

* (1:07:00) Why Cohere is an exciting place to do science

* (1:08:00) Ed’s advice for aspiring ML {researchers, engineers, etc} and the role of communities in science

* (1:11:45) Cohere for AI plug!

* (1:13:30) Outro

Links:

* Ed’s homepage and Twitter

* (some of) Ed’s Papers

* Experimental support for a categorical compositional distributional model of meaning

* Multi-step regression learning

* “Not not bad” is not “bad”

* Towards a formal distributional semantics

* A CNN for modeling sentences

* Teaching machines to read and comprehend

* Reasoning about entailment with neural attention

* Learning to Transduce with Unbounded Memory

* Teaching Artificial Agents to Understand Language by Modelling Reward

* Other things mentioned

* Large language models are not zero-shot communicators (Laura Ruis + others and Ed)

* Looped Transformers as Programmable Computers and our Update 43 covering this paper

* Cohere and Cohere for AI (+ earlier episode w/ Sara Hooker on C4AI)

* David Chalmers interview on AI + consciousness



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Ken Liu: What Science Fiction Can Teach Us
Feb 23, 2023 · Duration: 2:02:40

In episode 61 of The Gradient Podcast, Daniel Bashir speaks to Ken Liu.

Ken is an author of speculative fiction. A winner of the Nebula, Hugo, and World Fantasy awards, he is the author of the silkpunk epic fantasy series The Dandelion Dynasty and the short story collections The Paper Menagerie and Other Stories and The Hidden Girl and Other Stories. Prior to writing full-time, Ken worked as a software engineer, corporate lawyer, and litigation consultant.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:00) How Ken Liu became Ken Liu: A Saga

* (03:10) Time in the tech industry, interest in symbolic machines

* (04:40) Determining what stories to write, (07:00) art as failed communication

* (07:55) Law as creating abstract machines, importance of successful communication, stories in law

* (13:45) Misconceptions about science fiction

* (18:30) How we’ve been misinformed about literature and stories in school, stories as expressing multivalent truths, Dickens on narration (29:00)

* (31:20) Stories as imposing structure on the world

* (35:25) Silkpunk as aesthetic and writing approach

* (39:30) If modernity is a translated experience, what is it translated from? Alternative sources for the American pageant

* (47:30) The value of silkpunk for technologists and building the future

* (52:40) The engineer as poet

* (59:00) Technology language as constructing societies, what it is to be a technologist

* (1:04:00) The technology of language

* (1:06:10) The Google Wordcraft Workshop and co-writing with LaMDA

* (1:14:10) Possibilities and limitations of LMs in creative writing

* (1:18:45) Ken’s short fiction

* (1:19:30) Short fiction as a medium

* (1:24:45) “The Perfect Match” (from The Paper Menagerie and Other Stories)

* (1:34:00) Possibilities for better recommender systems

* (1:39:35) “Real Artists” (from The Hidden Girl and Other Stories)

* (1:47:00) The scaling hypothesis and creativity

* (1:50:25) “The Gods have not died in vain” & Moore’s Proof epigraph (The Hidden Girl)

* (1:53:10) More of The Singularity Trilogy (The Hidden Girl)

* (1:58:00) The role of science fiction today and how technologists should engage with stories

* (2:01:53) Outro

Links:

* Ken’s homepage

* The Dandelion Dynasty Series: Speaking Bones is out in paperback

* Books/Stories/Projects Mentioned

* “Evaluative Soliloquies” in Google Wordcraft

* The Paper Menagerie and Other Stories

* The Hidden Girl and Other Stories



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Hattie Zhou: Lottery Tickets and Algorithmic Reasoning in LLMs
Feb 16, 2023 · Duration: 1:42:59

In episode 60 of The Gradient Podcast, Daniel Bashir speaks to Hattie Zhou.

Hattie is a PhD student at the Université de Montréal and Mila. Her research focuses on understanding how and why neural networks work, based on the belief that the performance of modern neural networks exceeds our understanding and that building more capable and trustworthy models requires bridging this gap. Prior to Mila, she spent time as a data scientist at Uber and did research with Uber AI Labs.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (01:55) Hattie’s Origin Story, Uber AI Labs, empirical theory and other sorts of research

* (10:00) Intro to the Lottery Ticket Hypothesis & Deconstructing Lottery Tickets

* (14:30) Lottery tickets as lucky initialization

* (17:00) Types of masking and the “masking is training” claim

* (24:00) Type-0 masks and weight evolution over long training trajectories

* (27:00) Can you identify good masks or training trajectories a priori?

* (29:00) The role of signs in neural net initialization

* (35:27) The Supermask

* (41:00) Masks to probe pretrained models and model steerability

* (47:40) Fortuitous Forgetting in Connectionist Networks

* (54:00) Relationships to other work (double descent, grokking, etc.)

* (1:01:00) The iterative training process in fortuitous forgetting, scale and value of exploring alternatives

* (1:03:35) In-Context Learning and Teaching Algorithmic Reasoning

* (1:09:00) Learning + algorithmic reasoning, prompting strategy

* (1:13:50) What’s happening with in-context learning?

* (1:14:00) Induction heads

* (1:17:00) ICL and gradient descent

* (1:22:00) Algorithmic prompting vs discovery

* (1:24:45) Future directions for algorithmic prompting

* (1:26:30) Interesting work from NeurIPS 2022

* (1:28:20) Hattie’s perspective on scientific questions people pay attention to, underrated problems

* (1:34:30) Hattie’s perspective on ML publishing culture

* (1:42:12) Outro

Links:

* Hattie’s homepage and Twitter

* Papers

* Deconstructing Lottery Tickets: Zeros, signs, and the Supermask

* Fortuitous Forgetting in Connectionist Networks

* Teaching Algorithmic Reasoning via In-context Learning



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Kyunghyun Cho: Neural Machine Translation, Language, and Doing Good Science
Feb 09, 2023 · Duration: 2:08:02

In episode 59 of The Gradient Podcast, Daniel Bashir speaks to Professor Kyunghyun Cho.

Professor Cho is an associate professor of computer science and data science at New York University and a CIFAR Fellow of Learning in Machines & Brains. He is also a senior director of frontier research on the Prescient Design team within Genentech Research & Early Development. He was a research scientist at Facebook AI Research from 2017 to 2020 and a postdoctoral fellow at the University of Montreal under the supervision of Prof. Yoshua Bengio after receiving his MSc and PhD degrees from Aalto University. He received the Samsung Ho-Am Prize in Engineering in 2021.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:15) How Professor Cho got into AI, going to Finland for a PhD

* (06:30) Accidental and non-accidental parts of Prof Cho’s journey, the role of timing in career trajectories

* (09:30) Prof Cho’s M.Sc. thesis on Restricted Boltzmann Machines

* (17:00) The state of autodiff at the time

* (20:00) Finding non-mainstream problems and examining limitations of mainstream approaches, anti-dogmatism, Yoshua Bengio appreciation

* (24:30) Detaching identity from work, scientific training

* (26:30) The rest of Prof Cho’s PhD, the first ICLR conference, working in Yoshua Bengio’s lab

* (34:00) Prof Cho’s isolation during his PhD and its impact on his work—transcending insecurity and working on unsexy problems

* (41:30) The importance of identifying important problems and developing an independent research program, ceiling on the number of important research problems

* (46:00) Working on Neural Machine Translation, Jointly Learning to Align and Translate

* (1:01:45) What RNNs and earlier NN architectures can still teach us, why transformers were successful

* (1:08:00) Science progresses gradually

* (1:09:00) Learning distributed representations of sentences, extending the distributional hypothesis

* (1:21:00) Difficulty and limitations in evaluation—directions of dynamic benchmarks, trainable evaluation metrics

* (1:29:30) Mixout and AdapterFusion: fine-tuning and intervening on pre-trained models, pre-training as initialization, destructive interference

* (1:39:00) Analyzing neural networks as reading tea leaves

* (1:44:45) Importance of healthy skepticism for scientists

* (1:45:30) Language-guided policies and grounding, vision-language navigation

* (1:55:30) Prof Cho’s reflections on 2022

* (2:00:00) Obligatory ChatGPT content

* (2:04:50) Finding balance

* (2:07:15) Outro

Links:

* Professor Cho’s homepage and Twitter

* Papers

* M.Sc. thesis and PhD thesis

* NMT and attention

* Properties of NMT

* Learning Phrase Representations

* Neural machine translation by jointly learning to align and translate

* More recent work

* Learning Distributed Representations of Sentences from Unlabelled Data

* Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models

* Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes’ Rule

* AdapterFusion: Non-Destructive Task Composition for Transfer Learning



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Steve Miller: Will AI Take Your Job? It's Not So Simple.
Feb 02, 2023 · Duration: 1:10:25

In episode 58 of The Gradient Podcast, Daniel Bashir speaks to Professor Steve Miller.

Steve is a Professor Emeritus of Information Systems at Singapore Management University. Steve served as Founding Dean of the SMU School of Information Systems, and established and developed the technology core of SIS research and project capabilities in Cybersecurity, Data Management & Analytics, Intelligent Systems & Decision Analytics, and Software & Cyber-Physical Systems, as well as the management science-oriented capability in Information Systems & Management. Steve works closely with a number of Singapore government ministries and agencies via steering committees, advisory boards, and advisory appointments.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:40) Steve’s evolution of interests in AI, time in academia and industry

* (05:15) How different is this “industrial revolution”?

* (10:00) What new technologies enable, the human role in technology’s impact on jobs

* (11:35) Automation and augmentation and the realities of integrating new technologies in the workplace

* (21:50) Difficulties of applying AI systems in real-world contexts

* (32:45) Re-calibrating human work with intelligent machines

* (39:00) Steve’s thinking on the nature of human/machine intelligence, implications for human/machine hybrid work

* (47:00) Tradeoffs in using ML systems for automation/augmentation

* (52:40) Organizational adoption of AI and speed

* (1:01:55) Technology adoption is more than just a technology problem

* (1:04:50) Progress narratives, “safe to speed”

* (1:10:27) Outro

Links:

* Steve’s SMU Faculty Profile and Google Scholar

* Working with AI by Steve Miller and Tom Davenport



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Blair Attard-Frost: Canada’s AI strategy and the ethics of AI business practices
Jan 26, 2023 · Duration: 58:22

In episode 57 of The Gradient Podcast, Andrey Kurenkov speaks to Blair Attard-Frost.

Note: this interview was recorded 8 months ago, and some aspects of Canada’s AI strategy have changed since then. It is still a good overview of AI governance and other topics, however.

Blair is a PhD Candidate at the University of Toronto’s Faculty of Information who researches the governance and management of artificial intelligence. More specifically, they are interested in the social construction of intelligence, unintelligence, and artificial intelligence, the relationship between organizational values and AI use, and the political economy, governance, and ethics of AI value chains. They integrate perspectives from service sciences, cognitive sciences, public policy, information management, and queer studies for their research.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter or Mastodon.

Outline:

* Intro

* Getting into AI research

* What is AI governance

* Canada’s AI strategy

* Other interests

Links:

* Once a promising leader, Canada’s artificial-intelligence strategy is now a fragmented laggard

* The Ethics of AI Business Practices: A Review of 47 Guidelines



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Linus Lee: At the Boundary of Machine and Mind
Jan 19, 2023 · Duration: 2:28:46

In episode 56 of The Gradient Podcast, Daniel Bashir speaks to Linus Lee.

Linus is an independent researcher interested in the future of knowledge representation and creative work aided by machine understanding of language. He builds interfaces and knowledge tools that expand the domain of thoughts we can think and qualia we can feel. Linus has been writing online since 2014 (his blog boasts half a million words) and has built well over 100 side projects. He has also spent time as a software engineer at Replit, Hack Club, and Spensa, and was most recently a Researcher in Residence at Betaworks in New York.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:00) Linus’s background and interests, vision-language models

* (07:45) Embodiment and limits for text-image

* (11:35) Ways of experiencing the world

* (16:55) Origins of the handle “thesephist”, languages

* (25:00) Math notation, reading papers

* (29:20) Operations on ideas

* (32:45) Overview of Linus’s research and current work

* (41:30) The Oak and Ink languages, programming languages

* (49:30) Personal search engines: Monocle and Reverie, what you can learn from personal data

* (55:55) Web browsers as mediums for thought

* (1:01:30) This AI Does Not Exist

* (1:03:05) Knowledge representation and notational intelligence

* Notation vs language

* (1:07:00) What notation can/should be

* (1:16:00) Inventing better notations and expanding human intelligence

* (1:23:30) Better interfaces between humans and LMs to provide precise control, inefficiency of prompt engineering

* (1:33:00) Inexpressible experiences

* (1:35:42) Linus’s current work using latent space models

* (1:40:00) Ideas as things you can hold

* (1:44:55) Neural nets and cognitive computing

* (1:49:30) Relation to Hardware Lottery and AI accelerators

* (1:53:00) Taylor Swift Appreciation Session, mastery and virtuosity

* (1:59:30) Mastery/virtuosity and interfaces / learning curves

* (2:03:30) Linus’s stories, the work of fiction

* (2:09:00) Linus’s thoughts on writing

* (2:14:20) A piece of writing should be focused

* (2:16:15) On proving yourself

* (2:28:00) Outro

Links:

* Linus’s Twitter and website



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Suresh Venkatasubramanian: An AI Bill of Rights
Jan 12, 2023 · Duration: 1:40:58

In episode 55 of The Gradient Podcast, Daniel Bashir speaks to Professor Suresh Venkatasubramanian.

Professor Venkatasubramanian is a Professor of Computer Science and Data Science at Brown University, where his research focuses on algorithmic fairness and the impact of automated decision-making systems in society. He recently served as Assistant Director for Science and Justice in the White House Office of Science and Technology Policy, where he co-authored the Blueprint for an AI Bill of Rights.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter.

Outline:

* (00:00) Intro

* (02:25) Suresh’s journey into AI and policymaking

* (08:00) The complex graph of designing and deploying “fair” AI systems

* (09:50) The Algorithmic Lens

* (14:55) “Getting people into a room” isn’t enough

* (16:30) Failures of incorporation

* (21:10) Trans-disciplinary vs interdisciplinary, the limiting nature of “my lane” / “your lane” thinking, going beyond existing scientific and philosophical ideas

* (24:50) The trolley problem is annoying, its usefulness and limitations

* (25:30) Breaking the frame of a discussion, self-driving doesn’t fit into the parameters of the trolley problem

* (28:00) Acknowledging frames and their limitations

* (29:30) Social science’s inclination to critique, flaws and benefits of solutionism

* (30:30) Computer security as a model for thinking about algorithmic protections, the risk of failure in policy

* (33:20) Suresh’s work on recourse

* (38:00) Kantian autonomy and the value of recourse, non-Western takes and issues with individual benefit/harm as the most morally salient question

* (41:00) Community as a valuable entity and its implications for algorithmic governance, surveillance systems

* (43:50) How Suresh got involved in policymaking / the OSTP

* (46:50) Gathering insights for the AI Bill of Rights Blueprint

* (51:00) One thing the Bill did miss… Struggles with balancing specificity and vagueness in the Bill

* (54:20) Should “automated system” be defined in legislation? Suresh’s approach and issues with the EU AI Act

* (57:45) The danger of definitions, overlap with chess world controversies

* (59:10) Constructive vagueness in law, partially theorized agreements

* (1:02:15) Digital privacy and privacy fundamentalism, focus on breach of individual autonomy as the only harm vector

* (1:07:40) GDPR traps, the “legacy problem” with large companies and post-hoc regulation

* (1:09:30) Considerations for legislating explainability

* (1:12:10) Criticisms of the Blueprint and Suresh’s responses

* (1:25:55) The global picture, AI legislation outside the US, legislation as experiment

* (1:32:00) Tensions in entering policy as an academic and technologist

* (1:35:00) Technologists need to learn additional skills to impact policy

* (1:38:15) Suresh’s advice for technologists interested in public policy

* (1:41:20) Outro

Links:

* Suresh is on Mastodon @geomblog@mastodon.social (and also Twitter)

* Suresh’s blog

* Blueprint for an AI Bill of Rights

* Papers

* Fairness and abstraction in sociotechnical systems

* A comparative study of fairness-enhancing interventions in machine learning

* The Philosophical Basis of Algorithmic Recourse

* Runaway Feedback Loops in Predictive Policing



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jan 12, 2023
Pete Florence: Dense Visual Representations, NeRFs, and LLMs for Robotics
4524

In episode 54 of The Gradient Podcast, Andrey Kurenkov speaks with Pete Florence.

Note: this was recorded 2 months ago. Andrey should be getting back to putting out some episodes next year.

Pete Florence is a Research Scientist on the Robotics at Google team within Google Brain. His research focuses on topics in robotics, computer vision, and natural language, including 3D learning, self-supervised learning, and policy learning in robotics. Before Google, he finished his PhD in Computer Science at MIT with Russ Tedrake.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00:00) Intro

* (00:01:16) Start in AI

* (00:04:15) PhD Work with Quadcopters

* (00:08:40) Dense Visual Representations 

* (00:22:00) NeRFs for Robotics

* (00:39:00) Language Models for Robotics

* (00:57:00) Talking to Robots in Real Time

* (01:07:00) Limitations

* (01:14:00) Outro

Papers discussed:

* Aggressive quadrotor flight through cluttered environments using mixed integer programming 

* Integrated perception and control at high speed: Evaluating collision avoidance maneuvers without maps

* High-speed autonomous obstacle avoidance with pushbroom stereo

* Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation. (Best Paper Award, CoRL 2018)

* Self-Supervised Correspondence in Visuomotor Policy Learning (Best Paper Award, RA-L 2020 )

* iNeRF: Inverting Neural Radiance Fields for Pose Estimation.

* NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance Fields.

* Reinforcement Learning with Neural Radiance Fields

* Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language.

* Inner Monologue: Embodied Reasoning through Planning with Language Models

* Code as Policies: Language Model Programs for Embodied Control



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jan 05, 2023
Melanie Mitchell: Abstraction and Analogy in AI
3287

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 53 of The Gradient Podcast, Daniel Bashir speaks to Professor Melanie Mitchell.

Professor Mitchell is the Davis Professor at the Santa Fe Institute. Her research focuses on conceptual abstraction, analogy-making, and visual recognition in AI systems. She is the author or editor of six books and her work spans the fields of AI, cognitive science, and complex systems. Her latest book is Artificial Intelligence: A Guide for Thinking Humans.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:20) Melanie’s intro to AI

* (04:35) Melanie’s intellectual influences, AI debates over time

* (10:50) We don’t have the right metrics for empirical study in AI

* (15:00) Why AI is Harder than we Think: the four fallacies

* (20:50) Difficulties in understanding what’s difficult for machines vs humans

* (23:30) Roles for humanlike and non-humanlike intelligence

* (27:25) Whether “intelligence” is a useful word

* (31:55) Melanie’s thoughts on modern deep learning advances, brittleness

* (35:35) Abstraction, Analogies, and their role in AI

* (38:40) Concepts as analogical and what that means for cognition

* (41:25) Where does analogy bottom out

* (44:50) Cognitive science approaches to concepts

* (45:20) Understanding how to form and use concepts is one of the key problems in AI

* (46:10) Approaching abstraction and analogy, Melanie’s work / the Copycat architecture

* (49:50) Probabilistic program induction as a promising approach to intelligence

* (52:25) Melanie’s advice for aspiring AI researchers

* (54:40) Outro

Links:

* Melanie’s homepage and Twitter

* Papers

* Difficulties in AI, hype cycles

* Why AI is Harder than we think

* The Debate Over Understanding in AI’s Large Language Models

* What Does It Mean for AI to Understand?

* Abstraction, analogies, and reasoning

* Abstraction and Analogy-Making in Artificial Intelligence

* Evaluating understanding on conceptual abstraction benchmarks



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Dec 15, 2022
Marc Bellemare: Distributional Reinforcement Learning
4342

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 52 of The Gradient Podcast, Daniel Bashir speaks to Professor Marc Bellemare.

Professor Bellemare leads the reinforcement learning efforts at Google Brain Montréal and is a core industry member at Mila, where he also holds the Canada CIFAR AI Chair. His PhD work, completed at the University of Alberta, proposed the use of Atari 2600 video games to benchmark progress in reinforcement learning (RL). He was a research scientist at DeepMind from 2013-2017, and his Arcade Learning Environment was very influential in DeepMind’s early RL research and remains one of the most widely-used RL benchmarks today. More recently he collaborated with Loon to deploy deep reinforcement learning to navigate stratospheric balloons. His book on distributional reinforcement learning, published by MIT Press, will be available in Spring 2023.
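
For readers new to the topic, here is a toy sketch (ours, not code from the episode or the book) of the core idea of distributional RL: model the full distribution of returns Z(s, a) rather than only its expectation Q(s, a). The atoms-and-probabilities representation below follows a C51-style categorical parameterization; the numbers are purely illustrative.

    import numpy as np

    # Represent Z(s, a) as a categorical distribution over fixed "atoms"
    # (possible return values), instead of a single scalar Q-value.
    atoms = np.linspace(-10.0, 10.0, 51)   # fixed support of possible returns
    probs = np.full(51, 1.0 / 51)          # learned probabilities (uniform here)
    q_value = float(np.dot(atoms, probs))  # taking the mean recovers Q(s, a)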

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (03:10) Marc’s intro to AI and RL

* (07:00) Cross-pollination of deep learning research and RL in McGill and UDM

* (09:50) PhD work at U Alberta, continual learning, origins of the Arcade Learning Environment (ALE)

* (14:40) Challenges in the ALE, how the ALE drove RL research

* (23:10) Marc’s thoughts on the Avalon benchmark and what makes a good RL benchmark

* (28:00) Opinions on “Reward is Enough” and whether RL gets us to AGI

* (32:10) How Marc thinks about priors in learning, “reincarnating RL”

* (36:00) Distributional Reinforcement Learning and the problem of distribution estimation

* (43:00) GFlowNets and distributional RL

* (45:05) Contraction in RL and distributional RL, theory-practice gaps

* (52:45) Representation learning for RL

* (55:50) Structure of the value function space

* (1:00:00) Connections to open-endedness / evolutionary algorithms / curiosity

* (1:03:30) RL for stratospheric balloon navigation with Loon

* (1:07:30) New ideas for applying RL in the real world

* (1:10:15) Marc’s advice for young researchers

* (1:12:37) Outro

Links:

* Professor Bellemare’s Homepage

* Distributional Reinforcement Learning book

* Papers

* The Arcade Learning Environment: An Evaluation Platform for General Agents

* A Distributional Perspective on Reinforcement Learning

* Distributional Reinforcement Learning with Quantile Regression

* Distributional Reinforcement Learning with Linear Function Approximation

* Autonomous navigation of stratospheric balloons using reinforcement learning

* A Geometric Perspective on Optimal Representations for Reinforcement Learning

* The Value Function Polytope in Reinforcement Learning



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Dec 08, 2022
François Chollet: Keras and Measures of Intelligence
5330

In episode 51 of The Gradient Podcast, Daniel Bashir speaks to François Chollet.

François is a Senior Staff Software Engineer at Google and creator of the Keras deep learning library, which has enabled many people (including me) to get their hands dirty with the world of deep learning. François is also the author of the book “Deep Learning with Python.” He is interested in understanding the nature of abstraction, developing algorithms capable of autonomous abstraction, and democratizing the development and deployment of AI technology, among other topics.
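
As a concrete reference point for the “progressive disclosure of complexity” discussed in the outline below, here is a minimal sketch of the high-level Keras workflow (standard tf.keras usage, not code from the episode; the training call assumes user-supplied data):

    from tensorflow import keras

    # The top layer of the API: build, compile, and train in a few lines.
    model = keras.Sequential([
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    # model.fit(x_train, y_train, epochs=5)  # assumes x_train, y_train exist

Lower layers of the API (custom layers, custom training loops) open up as needs grow, which is the design philosophy the episode explores.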

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro + Daniel has far too much fun pronouncing “François Chollet”

* (02:00) How François got into AI

* (08:00) Keras and user experience, library as product, progressive disclosure of complexity

* (18:20) François’ comments on the state of ML frameworks and what different frameworks are useful for

* (23:00) On the Measure of Intelligence: historical perspectives

* (28:00) Intelligence vs cognition, overlaps

* (32:30) How core is Core Knowledge?

* (39:15) Cognition priors, metalearning priors

* (43:10) Defining intelligence

* (49:30) François’ comments on modern deep learning systems

* (55:50) Program synthesis as a path to intelligence

* (1:02:30) Difficulties in program synthesis

* (1:09:25) François’ concerns about current AI

* (1:14:30) The need for regulation

* (1:16:40) Thoughts on longtermism

* (1:23:30) Where we can expect exponential progress in AI

* (1:26:35) François’ advice on becoming a good engineer

* (1:29:03) Outro

Links:

* François’ personal page

* On the Measure of Intelligence

* Keras



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Dec 01, 2022
Yoshua Bengio: The Past, Present, and Future of Deep Learning
4449

Happy episode 50! This week’s episode is being released on Monday to avoid Thanksgiving.

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 50 of The Gradient Podcast, Daniel Bashir speaks to Professor Yoshua Bengio.

Professor Bengio is a Full Professor at the Université de Montréal as well as Founder and Scientific Director of the MILA-Quebec AI Institute and the IVADO institute. Best known for his work in pioneering deep learning, Bengio was one of three awardees of the 2018 A.M. Turing Award along with Geoffrey Hinton and Yann LeCun. He is also the awardee of the prestigious Killam prize and, as of this year, the computer scientist with the highest h-index in the world.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:20) Journey into Deep Learning, PDP and Hinton

* (06:45) “Inspired by biology”

* (08:30) “Gradient Based Learning Applied to Document Recognition” and working with Yann LeCun

* (10:00) What Bengio learned from LeCun (and Larry Jackel) about being a research advisor

* (13:00) “Learning Long-Term Dependencies with Gradient Descent is Difficult,” why people don’t understand this paper well enough

* (18:15) Bengio’s work on word embeddings and the curse of dimensionality, “A Neural Probabilistic Language Model”

* (23:00) Adding more structure / inductive biases to LMs

* (24:00) The rise of deep learning and Bengio’s experience, “you have to be careful with inductive biases”

* (31:30) Bengio’s “Bayesian posture” in response to recent developments

* (40:00) Higher level cognition, Global Workspace Theory

* (45:00) Causality, actions as mediating distribution change

* (49:30) GFlowNets and RL

* (53:30) GFlowNets and actions that are not well-defined, combining with System II and modular, abstract ideas

* (56:50) GFlowNets and evolutionary methods

* (1:00:45) Bengio on Cartesian dualism

* (1:09:30) “When you are famous, it is hard to work on hard problems” (Richard Hamming) and Bengio’s response

* (1:11:10) Family background, art and its role in Bengio’s life

* (1:14:20) Outro

Links:

* Professor Bengio’s Homepage

* Papers

* Gradient-based learning applied to document recognition

* Learning Long-Term Dependencies with Gradient Descent is Difficult

* The Consciousness Prior

* Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Nov 21, 2022
Kanjun Qiu and Josh Albrecht: Generally Intelligent
2841

In episode 49 of The Gradient Podcast, Daniel Bashir speaks to Kanjun Qiu and Josh Albrecht.

Kanjun and Josh are CEO and CTO of Generally Intelligent, an AI startup aiming to develop general-purpose agents with human-like intelligence that can be safely deployed in the real world. Kanjun and Josh have played these roles together in the past as CEO and CTO of AI recruiting startup Sourceress. Kanjun is also involved with building the SF Neighborhood, and together with Josh invests in early-stage founders at Outset Capital.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:00) Kanjun’s and Josh’s intros to AI

* (06:45) How Kanjun and Josh met and started working together

* (08:40) Sourceress and AI in hiring, looking for unusual candidates

* (11:30) Generally Intelligent: origins and motivations

* (14:55) How Kanjun and Josh think about understanding the fundamentals of intelligence

* (17:20) AGI companies and long-term goals

* (19:20) How Kanjun and Josh think about intelligence + Generally Intelligent’s approach-agnosticism

* (22:30) Skill-acquisition efficiency

* (25:18) The Avalon Environment/Benchmark

* (27:40) Tasks with shared substrate

* (29:00) Blending of different approaches, baseline tuning

* (31:15) Approach to safety

* (33:33) Issues with interpretability + ML academic practices, ablations

* (36:30) Lessons about working with people, company culture

* (40:00) Human focus and diversity in companies, tech environment

* (44:10) Advice for potential (AI) founders

* (47:05) Outro

Links:

* Generally Intelligent

* Avalon: A Benchmark for RL Generalization

* Kanjun’s homepage

* Josh’s homepage



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Nov 17, 2022
Nathan Benaich: The State of AI Report
4731

* Have suggestions for future podcast guests (or other feedback)? Let us know here!

* Want to write with us? Send a pitch using this form :)

In episode 48 of The Gradient Podcast, Daniel Bashir speaks to Nathan Benaich.

Nathan is Founder and General Partner at Air Street Capital, a venture capital (VC) firm focused on investing in AI-first technology and life sciences companies. Nathan runs a number of communities focused on AI including the Research and Applied AI Summit and leads Spinout.fyi to improve the creation of university spinouts. Together with investor Ian Hogarth, Nathan co-authors the State of AI Report.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:40) Nathan’s interests in AI, life sciences, investing

* (04:10) Biotech and tech-bio companies

* (08:00) Why Nathan went into VC

* (10:15) Air Street Capital’s focus, investing in AI at an early stage

* (14:30) Why Nathan believes in specialism over generalism in AI, balancing consumer-focused ML with serious technical work

* (17:30) The European startup ecosystem

* (19:30) Spinouts and inventions born in academia

* (23:35) Spinout.fyi and issues with the European model

* (27:50) In the UK, only 4% of private AI companies are spinouts

* (30:00) Solutions

* (32:55) Origins of the State of AI Report

* (35:00) Looking back on Nathan’s 2021 predictions: Anthropic and JAX

* (39:00) AI semiconductors and the difficult reality

* (42:45) Nathan’s perspectives on AI safety/alignment

* (46:00) Long-termism and debates, safety research as an input into improving capabilities

* (49:50) Decentralization and the commercialization of open-source AI (Stability AI, Eleuther AI, etc.)

* (53:00) Second-order applications of diffusion models—chemistry, small molecule design, genome editors

* (59:00) Semiconductor restrictions and geopolitics

* (1:03:45) This year’s State of AI predictions

* (1:04:30) Trouble in semiconductor startup land

* (1:08:40) Predictions for AGI startups

* (1:14:20) How regulation of AGI startups might look

* (1:16:40) Nathan’s advice for founders, investors, and researchers

* (1:19:00) Outro

Links:

* State of AI Report

* Air Street Capital

* Spinout.fyi

* Rewriting the European spinout playbook

* Other sources mentioned

* Bridging the Gap: the case for an Incompletely Theorized Agreement on AI policy

* Choking Off China’s Access to the Future of AI

* China's New AI Governance Initiatives Shouldn't Be Ignored



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Nov 10, 2022
Matt Sheehan: China's AI Strategy and Governance
3990

* Have suggestions for future podcast guests (or other feedback)? Let us know here!

* Want to write with us? Send a pitch using this form :)

In episode 47 of The Gradient Podcast, Daniel Bashir speaks to Matt Sheehan.

Matt is a fellow at the Carnegie Endowment for International Peace, where he researches global technology with a focus on China. His writing and research explore China’s AI ecosystem, the future of China’s technology policy, and technology’s role in China’s political economy. Matt has also written for Foreign Affairs and The Huffington Post, among other venues.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:28) Matt’s path to analyzing China’s AI governance

* (06:50) Matt’s experience understanding daily life in China and developing a bottom-up perspective

* (09:40) The development of government constraints in technology/AI in the US and China

* (12:40) Matt’s take on China’s priorities and motivations

* (17:00) How recent history influences China’s technology ambitions

* (17:30) Matt gives an overview of the Century of Humiliation

* (22:07) Adversarial perceptions, Xi Jinping’s brashness and its effect on discourse about International Relations, how this intersects with AI

* (24:40) Self-reliance and semiconductors. Was the recent chip ban the right move?

* (36:15) Matt’s question: could foundation models be trained on trailing edge chips if necessary? Limitations

* (38:30) Silicon Valley and China, The Transpacific Experiment and stories

* (46:17) “Lying flat” (躺平) and how trends among youth in China interact with tech development, parallel trends in the US, work culture

* (51:05) China’s recent AI governance initiatives

* (56:25) Squaring China’s AI ethics stance with its use of AI

* (59:53) The US can learn from both Chinese and European regulators

* (1:02:03) How technologists should think about geopolitics and national tensions

* (1:05:43) Outro

Links:

* Matt’s Twitter

* China’s influences/ambitions

* Beijing’s Industrial Internet Ambitions

* Beijing’s Tech Ambitions: What Exactly Does It Want?

* US-China exchange and US responses

* Who benefits from American AI research in China?

* Two New Tech Bills Could Transform US Innovation

* Fear of Chinese Competition Won’t Preserve US Tech Leadership

* China’s tech standards, government initiatives and regulation in AI

* How US businesses view China’s growing influence in tech standards

* Three takeaways from China’s new standards strategy

* China’s new AI governance initiatives shouldn’t be ignored

* Semiconductors

* Biden’s Unprecedented Semiconductor Bet (a new piece from Matt!)

* Choking Off China’s Access to the Future of AI



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Nov 03, 2022
Luis Voloch: AI and Biology
2636

* Have suggestions for future podcast guests (or other feedback)? Let us know here!

* Want to write with us? Send a pitch using this form :)

In episode 46 of The Gradient Podcast, Daniel Bashir speaks to Luis Voloch.

Luis is a co-founder of Immunai, a leading AI-led drug discovery company based out of NYC and Tel Aviv, with over 140 employees and a valuation of over one billion dollars. Before Immunai, Luis was Head of Data Science and Machine Learning at ITC and worked on a variety of ML efforts at Palantir. He did his studies and research in Math and CS at MIT.

He has also led AI, genomics, and software efforts at a number of other companies.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:25) Luis’s math background and getting into AI

* (06:35) Luis’s PhD experience, proving theoretical guarantees for recommendation systems

* (09:45) Why Luis left his PhD

* (15:45) Why Luis is excited about intersection of ML and biology

* (18:28) Challenges of applying AI to biology

* (22:55) Immunai

* (27:03) Challenges in building a biotech (or “tech-bio”) company

* (30:30) Research at Immunai, Neural Design for Genetic Perturbation Experiments

* (34:43) Interpretability in ML + biology

* (36:00) What Luis plans to do next

* (37:55) Luis’s advice for grad students / ML people interested in biology

* (40:00) Luis’s perspective on the future of AI + biology

* (43:10) Outro

Links:

* Luis on LinkedIn, Crunchbase

* Luis’s article on The convergence of deep neural networks and immunotherapy

* Papers

* Luis’s thesis

* Neural Design for Genetic Perturbation Experiments

* SystemMatch: optimizing preclinical drug models to human clinical outcomes via generative latent-space matching



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Oct 27, 2022
Zachary Lipton: Where Machine Learning Falls Short
6058

* Have suggestions for future podcast guests (or other feedback)? Let us know here!

* Want to write with us? Send a pitch using this form :)

In episode 45 of The Gradient Podcast, Daniel Bashir speaks to Zachary Lipton.

Zachary is an Assistant Professor of Machine Learning and Operations Research at Carnegie Mellon University, where he directs the Approximately Correct Machine Intelligence Lab. He holds a joint appointment between CMU’s ML Department and Tepper School of Business, and holds courtesy appointments at the Heinz School of Public Policy and the Software and Societal Systems Department. His research spans core ML methods and theory, applications in healthcare and natural language processing, and critical concerns about algorithms and their impacts.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (2:30) From jazz music to AI

* (4:40) “fix it in post” we had some technical issues :)

* (4:50) spicy takes, music and tech

* (7:30) Zack’s plan to get into grad school

* (9:45) selection bias in who gets faculty positions

* (12:20) The slow development of Zack’s wide range of research interests, Zack’s strengths coming into ML research

* (22:00) How Zack got attention early in his PhD

* (27:00) Should PhD students meander?

* (30:30) Faults in the QA model literature

* (35:00) Troubling Trends, antecedents in other fields

* (39:40) Pretraining LMs on nonsense words, new paper!

* The new paper (9/29)

* (47:25) what “BERT learns linguistic structure” misses

* (56:00) making causal claims in ML

* (1:05:40) domain-adversarial networks don’t solve distribution shift, underspecified problems

* (1:09:10) the benefits of floating between communities

* (1:14:30) advice on finding inspiration and learning

* (1:16:00) “fairness” and ML solutionism

* (1:21:10) epistemic questions, how we make determinations of fairness

* (1:29:00) Zack’s drives and motivations

Links:

* Zachary’s Homepage

* Papers

* DL Foundations, Distribution Shift, Generalization

* Does Pretraining for Summarization Require Knowledge Transfer?

* How Much Reading Does Reading Comprehension Require?

* Learning Robust Global Representations by Penalizing Local Predictive Power

* Detecting and Correcting for Label Shift with Black Box Predictors

* RATT

* Explanation/Interpretability/Fairness

* The Mythos of Model Interpretability

* Evaluating Explanations

* Does mitigating ML’s impact disparity require treatment disparity?

* Algorithmic Fairness from a Non-ideal Perspective

* Broader perspectives/critiques

* Troubling Trends in Machine Learning Scholarship

* When Curation Becomes Creation



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Oct 13, 2022
Stuart Russell: The Foundations of Artificial Intelligence
4247

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 44 of The Gradient Podcast, Daniel Bashir speaks to Professor Stuart Russell.

Stuart Russell is a Professor of Computer Science and the Smith-Zadeh Professor in Engineering at UC Berkeley, as well as an Honorary Fellow at Wadham College, Oxford. Professor Russell is the co-author with Peter Norvig of Artificial Intelligence: A Modern Approach, probably the most popular AI textbook in history. He is the founder and head of Berkeley’s Center for Human-Compatible Artificial Intelligence and recently authored the book Human Compatible: Artificial Intelligence and the Problem of Control. He has also served as co-chair on the World Economic Forum’s Council on AI and Robotics.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:45) Stuart’s introduction to AI

* (05:50) The two most important questions

* (07:25) Historical perspectives during Stuart’s PhD, agents and learning

* (14:30) Rationality and Intelligence, Bounded Optimality

* (20:30) Stuart’s work on Metareasoning

* (29:45) How does Metareasoning fit with Bounded Optimality?

* (37:39) “Civilization advances by reducing complex operations to be trivial”

* (39:20) Reactions to the rise of Deep Learning, connectionist/symbolic debates, probabilistic modeling

* (51:00) The Deep Learning and traditional AI communities will adopt each other’s ideas

* (51:55) Why Stuart finds the self-driving car arena interesting, Waymo’s old-fashioned AI approach

* (57:30) Effective generalization without the full expressive power of first-order logic—deep learning is a “weird way to go about it”

* (1:03:00) A very short shrift of Human Compatible and its ideas

* (1:10:42) Outro

Links:

* Stuart’s webpage

* Human Compatible page with reviews and interviews

* Papers mentioned

* Rationality and Intelligence

* Principles of Metareasoning



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Oct 06, 2022
Varun Ganapathi: AKASA, AI and Healthcare
3071

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 43 of The Gradient Podcast, Daniel Bashir speaks to Varun Ganapathi.

Varun is co-founder and CTO at AKASA, a company developing AI systems for healthcare operations. Varun’s previous entrepreneurial experience includes co-founding Numovis, a company focused on motion tracking and computer vision for user interaction that was acquired by Google, and Terminal.com, a browser-based IDE acquired by Udacity. Varun received his PhD from Stanford in 2014.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (1:50) Varun’s intro to AI

* (3:25) Working with Andrew Ng

* (7:37) Varun’s road to a PhD

* (13:20) Numovis, Google acquisition

* (15:00) Vacillating between research and entrepreneurship, Terminal.com

* (17:10) Roots of Varun’s interest in AI + healthcare

* (22:30) Research at AKASA, Deep Claim

* (25:45) Causality in claim denial, expert knowledge

* (25:52) we need to trademark the word “gradient”

* (28:20) AKASA’s Unified Automation, expert-in-the-loop

* (34:15) Varun’s near-term and long-term visions for AKASA

* (39:50) Towards “deploying a new version of healthcare”

* (42:25) Varun’s perspective on the role of AI in healthcare, the need for humans in the loop

* (47:02) Varun’s advice for aspiring AI researchers and practitioners

* (51:00) Outro

Links:

* AKASA’s Homepage

* Varun’s research

* AKASA is hiring!



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 29, 2022
Joel Lehman: Open-Endedness and Evolution through Large Models
5933

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 42 of The Gradient Podcast, Daniel Bashir speaks to Joel Lehman.

Joel is a machine learning scientist interested in AI safety, reinforcement learning, and creative open-ended search algorithms. Joel has spent time at Uber AI Labs and OpenAI and is the co-author of the book Why Greatness Cannot be Planned: The Myth of the Objective.
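
A toy sketch (ours, not from the episode or the papers) of the mechanism at the heart of novelty search, discussed below: instead of optimizing an objective, score each candidate by how different its behavior is from an archive of behaviors seen so far.

    import numpy as np

    # Novelty = mean distance to the k nearest neighbors in the behavior archive.
    def novelty(behavior, archive, k=3):
        dists = np.linalg.norm(np.asarray(archive) - behavior, axis=1)
        return np.sort(dists)[:k].mean()

    archive = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(novelty(np.array([2.0, 2.0]), archive))  # far from everything seen: high score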

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (01:40) From game development to AI

* (03:20) Why evolutionary algorithms

* (10:00) Abandoning Objectives: Evolution Through the Search for Novelty Alone

* (24:10) Measuring a desired behavior post-hoc vs optimizing for that behavior

* (27:30) Neuroevolution through Augmenting Topologies (NEAT), Evolving a Diversity of Virtual Creatures

* (35:00) Humans are an inefficient solution to evolution’s objectives

* (47:30) Is embodiment required for understanding? Today’s LLMs as practical thought experiments in disembodied understanding

* (51:15) Evolution through Large Models (ELM)

* (1:01:07) ELM: Quality Diversity Algorithms, MAP-Elites, bootstrapping training data

* (1:05:25) Dimensions of Diversity in MAP-Elites, what is “interesting”?

* (1:12:30) ELM: Fine-tuning the language model

* (1:18:00) Results of invention in ELM, complexity in creatures

* (1:20:20) Future work building on ELM, key challenges in open-endedness

* (1:24:30) How Joel’s research affects his approach to life and work

* (1:28:30) Balancing novelty and exploitation in work

* (1:34:10) Intense competition in AI, Joel’s advice for people considering ML research

* (1:38:45) Daniel isn’t the worst interviewer ever

* (1:38:50) Outro

Links:

* Joel’s webpage

* Evolution through Large Models: The Tweet

* Papers:

* Abandoning Objectives: Evolution through the search for novelty alone

* Evolving a diversity of virtual creatures through novelty search and local competition

* Designing neural networks through neuroevolution

* Evolution through Large Models

* Resources for (aspiring) ML researchers!

* Cohere for AI

* ML Collective



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 22, 2022
Andrew Feldman: Cerebras and AI Hardware
3407

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 42 of The Gradient Podcast, Daniel Bashir speaks to Andrew Feldman.

Andrew is the co-founder and CEO of Cerebras Systems, an AI accelerator company that has built the largest processor in the industry. Before Cerebras, Andrew co-founded and served as CEO of SeaMicro, which was acquired by AMD in 2012. He has also served in executive positions at Force10 Networks and RiverStone Networks.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:05) Andrew’s trajectory, from business school to Cerebras

* (10:00) The large model problem and Cerebras’ approach

* (19:50) Cerebras’s GPT-J announcement

* (22:20) Andrew explains weight streaming to Daniel

* (32:30) Andrew’s thoughts on the MLPerf benchmark

* (38:20) The venture landscape for AI accelerator companies

* (42:50) The hardware lottery, hardware support for sparsity

* (45:40) The CHIPS Act, NVIDIA China ban and the accelerator industry

* (48:00) Politics and Chips, US and China

* (52:20) Andrew’s perspective on tackling difficult problems

* (56:42) Outro

Links:

* Cerebras’ Homepage

* GPT-J Announcement

* TotalEnergies

* GlaxoSmithKline (GSK)

* Sources mentioned

* “Political Chips” by Ben Thompson (because Daniel’s a fanboy)

* Daniel’s conversation with Sara Hooker

* The Hardware Lottery



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 15, 2022
Christopher Manning: Linguistics and the Development of NLP
4295

Have suggestions for future podcast guests (or other feedback)? Let us know here!

In episode 41 of The Gradient Podcast, Daniel Bashir speaks to Christopher Manning.

Chris is the Director of the Stanford AI Lab and an Associate Director of the Stanford Human-Centered Artificial Intelligence Institute. He is an ACM Fellow, an AAAI Fellow, and past President of ACL. His work currently focuses on applying deep learning to natural language processing; it has included tree recursive neural networks, GloVe, neural machine translation, and computational linguistic approaches to parsing, among other topics. 

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (02:40) Chris’s path to AI through computational linguistics

* (06:10) Human language acquisition vs. ML systems

* (09:20) Grounding language in the physical world, multimodality and DALL-E 2 vs. Imagen

* (26:15) Chris’s Linguistics PhD, splitting time between Stanford and Xerox PARC, corpus-based empirical NLP

* (34:45) Rationalist and Empiricist schools in linguistics, Chris’s work in 1990s

* (45:30) GloVe and Attention-based Neural Machine Translation, global and local context in language

* (50:30) Different Neural Architectures for Language, Chris’s work in the 2010s

* (58:00) Large-scale Pretraining, learning to predict the next word helps you learn about the world

* (1:00:00) mBERT’s Internal Representations vs. Universal Dependencies Taxonomy

* (1:01:30) The Need for Inductive Priors for Language Systems

* (1:05:55) Courage in Chris’s Research Career

* (1:10:50) Outro (yes Daniel does have a new outro with ~ music ~)

Links:

* Chris’s webpage

* Papers (1990s-2000s)

* Distributional Phrase Structure Induction

* Fast exact inference with a factored model for Natural Language Parsing

* Accurate Unlexicalized Parsing

* Corpus-based induction of syntactic structure

* Foundations of Statistical Natural Language Processing

* Papers (2010s):

* Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

* GloVe

* Effective Approaches to Attention-based Neural Machine Translation

* Stanford’s Graph-based Neural dependency parser

* Papers (2020s)

* Electra: Pre-training text encoders as discriminators rather than generators

* Finding Universal Grammatical Relations in Multilingual BERT

* Emergent linguistic structure in artificial neural networks trained by self-supervision



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 08, 2022
Jeff Clune: Genetic Algorithms, Quality-Diversity, Curiosity
4121

In episode 41 of The Gradient Podcast, Andrey Kurenkov speaks to Professor Jeff Clune.

Jeff is an Associate Professor of Computer Science at the University of British Columbia and a Faculty Member of the Vector Institute. Previously, he was a Research Team Leader at OpenAI and before that a Senior Research Manager and founding member of Uber AI Labs, and prior to that he was an Associate Professor in Computer Science at the University of Wyoming.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter


Outline:

(00:00) Intro
(01:05) Path into AI
(08:05) Studying biology with simulations
(10:30) Overview of genetic algorithms
(14:00) Evolving gaits with genetic algorithms
(20:00) Quality-Diversity Algorithms
(27:00) Evolving Soft Robots
(32:15) Genetic algorithms for studying Evolution
(39:30) Modularity for Catastrophic Forgetting
(45:15) Curiosity for Learning Diverse Skills
(51:15) Evolving Environments 
(58:3) The Surprising Creativity of Digital Evolution
(1:04:28) Hobbies Outside of Research
(1:07:25) Outro



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 01, 2022
Catherine Olsson and Nelson Elhage: Anthropic, Understanding Transformers
2824

In episode 40 of The Gradient Podcast, Andrey Kurenkov speaks to Catherine Olsson and Nelson Elhage.

Catherine and Nelson are both members of technical staff at Anthropic, which is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems. Catherine and Nelson’s focus is on interpretability, and we will discuss several of their recent works in this interview. 
Follow The Gradient on Twitter
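
A toy illustration (ours, not from the papers) of the induction-head behavior discussed below: an induction head lets a model complete the pattern [A][B] ... [A] -> [B] by attending back to the token that followed an earlier occurrence of the current token.

    # Predict the token that followed the most recent earlier occurrence
    # of the current token; return None if there is no such occurrence.
    def induction_predict(tokens):
        current = tokens[-1]
        for i in range(len(tokens) - 2, -1, -1):
            if tokens[i] == current:
                return tokens[i + 1]
        return None

    print(induction_predict(["A", "B", "C", "A"]))  # -> "B"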

Outline:

(00:00) Intro
(01:10) Catherine’s Path into AI
(03:25) Nelson’s Path into AI
(05:23) Overview of Anthropic
(08:21) Mechanistic Interpretability
(15:15) Transformer Circuits 
(21:30) Toy Transformer
(27:25) Induction Heads
(31:00) In-Context Learning
(35:10) Evidence for Induction Heads Enabling In-Context Learning
(39:30) What’s Next
(43:10) Replicating Results
(46:00) Outro

Links:

Anthropic

Zoom In: An Introduction to Circuits

Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases

A Mathematical Framework for Transformer Circuits

In-context Learning and Induction Heads 

PySvelte



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Aug 26, 2022
Been Kim: Interpretable Machine Learning
4292

In episode 38 of The Gradient Podcast, Daniel Bashir speaks to Been Kim.

Been is a staff research scientist at Google Brain focused on interpretability: helping humans communicate with complex machine learning models by not only building tools but also studying how humans interact with these systems. She has served with a number of conferences including ICLR, NeurIPS, ICML, and AISTATS. She gave keynotes at ICLR 2022, ECML 2020, and the G20 meeting in Argentina in 2018. Her work TCAV received the UNESCO Netexplo award, was featured at Google I/O 2019 and in Brian Christian’s book The Alignment Problem.
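
For context on TCAV, which comes up throughout the conversation, here is a heavily simplified sketch (ours; the synthetic arrays stand in for real network activations and gradients). A Concept Activation Vector (CAV) is the normal to a linear boundary separating a concept’s activations from random ones, and the TCAV score asks how often class gradients point along that direction.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    concept_acts = rng.normal(1.0, 1.0, size=(100, 16))  # activations for concept examples
    random_acts = rng.normal(0.0, 1.0, size=(100, 16))   # activations for random examples

    X = np.vstack([concept_acts, random_acts])
    y = np.array([1] * 100 + [0] * 100)
    cav = LogisticRegression().fit(X, y).coef_[0]        # the concept direction

    grads = rng.normal(size=(50, 16))                    # stand-in for per-input gradients
    tcav_score = np.mean(grads @ cav > 0)                # fraction aligned with the concept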

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

(00:00) Intro

(02:20) Path to AI/interpretability

(06:10) The Progression of Been’s thinking / PhD thesis

(11:30) Towards a Rigorous Science of Interpretable Machine Learning

(24:52) Interpretability and Software Testing

(27:00) Been’s ICLR Keynote and Human-Machine “Language”

(37:30) TCAV

(43:30) Mood Board Search and CAV Camera

(48:00) TCAV’s Limitations and Follow-up Work

(56:00) Acquisition of Chess Knowledge in AlphaZero

(1:07:00) Daniel spends a very long time asking “what does it mean to you to be a researcher?”

(1:09:00) The everyday drudgery, more lessons from Been

(1:11:32) Outro

Links:

* Been’s website

* CAVcamera app



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Aug 18, 2022
Laura Weidinger: Ethical Risks, Harms, and Alignment of Large Language Models
3341

In episode 37 of The Gradient Podcast, Andrey Kurenkov speaks to Laura Weidinger

Laura is a senior research scientist at DeepMind focused on AI ethics. She is also a PhD candidate at the University of Cambridge, studying philosophy of science and, specifically, approaches to measuring the ethics of AI systems. Previously, Laura worked in technology policy at the UK and EU levels as a policy executive at techUK. She then pivoted to cognitive science research, studying human learning at the Max Planck Institute for Human Development in Berlin, and was a Guest Lecturer at the Ada National College for Digital Skills. She received her Master’s degree from the School of Mind and Brain at the Humboldt University of Berlin, with a focus on neuroscience, philosophy, and cognitive science.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

(00:00) Intro

(01:20) Path to AI

(04:25) Research in Cognitive Science

(06:40) Interest in AI Ethics

(14:30) Ethics Considerations for Researchers

(17:38) Ethical and social risks of harm from language models

(25:30) Taxonomy of Risks posed by Language Models

(27:33) Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

(33:25) Main Insight for Measuring Harm

(35:40) The EU AI Act

(39:10) Alignment of language agents

(46:10) GPT-4Chan

(53:40) Interests outside of AI

(55:30) Outro

Links:

Ethical and social risks of harm from language models 

Taxonomy of Risks posed by Language Models

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

Alignment of language agents



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Aug 05, 2022
Sebastian Raschka: AI Education and Research
3812

In episode 36 of The Gradient Podcast, Daniel Bashir speaks to Sebastian Raschka.

Sebastian is an Assistant Professor of Statistics at the University of Wisconsin-Madison and Lead AI Educator at Lightning AI. He has written two bestselling books: Python Machine Learning and Machine Learning with PyTorch and Scikit-Learn.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Sections:

(00:00) Intro

(01:10) Sebastian’s intro to AI

(05:15) Sebastian’s process for learning new things

(12:15) Learning style varies with purpose

(16:10) Ordinal Regression

(31:00) Solving rank inconsistency with conditional probability

(35:00) Semi-Adversarial Networks

(44:15) Why Sebastian got into education

(52:45) Lightning AI

(1:00:00) Sebastian’s advice for educators

(1:03:30) Be cool like Sebastian and follow the Gradient

(1:03:40) Outro

Episode Links:

* Sebastian’s Homepage

* Sebastian’s Twitter

* Sebastian’s Books



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jul 29, 2022
Lt. General Jack Shanahan: AI in the DoD, Project Maven, and Bridging the Tech-DoD Gap
1860

In episode 35 of The Gradient Podcast, guest host Sharon Zhou speaks to Jack Shanahan.

John (Jack) Shanahan was a Lieutenant General in the United States Air Force who retired after a 36-year military career. He was the inaugural Director of the Joint Artificial Intelligence Center (JAIC) in the U.S. Department of Defense (DoD). He was also the Director of the Algorithmic Warfare Cross-Functional Team (Project Maven). Currently, he is a Special Government Employee supporting the National Security Commission on Artificial Intelligence; serves on the Board of Advisors for the Common Mission Project; is an advisor to The Changing Character of War Centre (Oxford University); is a member of the CACI Strategic Advisory Group; and serves as an Advisor to the Military Cyber Professionals Association.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

(00:00) Intro

(01:20) Introduction to Jack and Sharon

(07:30) Project Maven

(09:45) Relationship of Tech Sector and DoD

(16:40) Need for AI in DoD

(20:10) Bridging the tech-DoD divide

(30:00) Conclusion

Episode Links:

* John N.T. Shanahan Wikipedia

* AI To Revolutionize U.S. Intelligence Community With General Shanahan

* Email: aidodconversations@gmail.com



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jul 22, 2022
Sara Hooker: Cohere For AI, the Hardware Lottery, and DL Tradeoffs
3175

In episode 34 of The Gradient Podcast, Daniel Bashir speaks to Sara Hooker.

Sara (@sarahookr) leads Cohere for AI and is a former Research Scientist at Google. Sara founded a Bay Area non-profit called Delta Analytics, which works with non-profits and communities to build technical capacity. She is also one of the co-founders of the Trustworthy ML Initiative, an active participant in the ML Collective research group, and a host of the Underrated ML podcast.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Sections:

(00:00) Intro

(02:20) Podcasting gripe-fest

(06:00) Sara’s journey: from economics to AI

(09:15) Economics vs. AI research

(12:45) The Hardware Lottery

(19:15) Towards better hardware benchmarks

(26:00) Getting away from the hardware lottery

(32:30) The myth of compact, interpretable, robust, performant DNNs

(35:15) Top-line metrics vs. disaggregated metrics

(39:20) Solving memorization in the data pipeline, noisy examples

(45:35) Cohere For AI

Episode Links:

* Cohere for AI

* Sara’s Homepage



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jul 14, 2022
Lukas Biewald: Crowdsourcing at CrowdFlower and ML Tooling at Weights & Biases
2804

In episode 33 of The Gradient Podcast, Andrey Kurenkov speaks to Lukas Biewald.

Lukas Biewald is a co-founder of Weights and Biases, a company that creates developer tools for machine learning. Prior to that he was a co-founder and CEO of Figure Eight Inc. (formerly CrowdFlower) — an Internet company that collects training data for machine learning, which was sold for 300 million dollars.
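
For readers who haven’t used Weights & Biases, here is a minimal sketch of the experiment-tracking workflow its tools are built around (standard wandb quick-start usage; the project name and logged metric are made up for illustration):

    import wandb

    # Start a tracked run, log a metric each step, then finish the run.
    wandb.init(project="my-demo-project")
    for step in range(10):
        wandb.log({"loss": 1.0 / (step + 1)})
    wandb.finish()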

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (01:18) Start in AI

* (06:17) CrowdFlower / Crowdsourcing

* (21:06) Discovering Deep Learning 

* (25:10) Learning Deep Learning

* (32:50) Weights and Biases

* (37:30) State of Tooling for ML 

* (41:20) Exciting AI Trends

* (44:42) Interests outside of AI

* (45:40) Outro

Links:

* Lukas’s website

* Lukas’s GitHub

* Starting a Second Machine Learning Tools Company, Ten Years Later

* Confession of a so-called AI expert

* What I learned from looking at 200 machine learning tools

* CS 329S: Machine Learning Systems Design

* Designing Machine Learning Systems

Opportunity at Weights & Biases:



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jul 07, 2022
Chip Huyen: Machine Learning Tools and Systems
2907

In episode 32 of The Gradient Podcast, Andrey Kurenkov speaks to Chip Huyen.

Chip Huyen is a co-founder of Claypot AI, a platform for real-time machine learning. Previously, she was with Snorkel AI and NVIDIA. She teaches CS 329S: Machine Learning Systems Design at Stanford. She has also written four bestselling Vietnamese books, and more recently her new O’Reilly book Designing Machine Learning Systems has just come out! 

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

She also maintains a Discord server with a focus on Machine Learning Systems.

Outline:

* (00:00) Intro

* (01:30) 3-year trip through Asia, Africa, and South America

* (04:00) Getting into AI at Stanford

* (11:30) Confession of a so-called AI expert

* (16:40) Academia vs Industry

* (17:40) Focus on ML Systems

* (20:00) ML in Academia vs Industry

* (28:15) Maturity of AI in Industry

* (31:45) ML Tools

* (37:20) Real Time ML

* (43:00) ML Systems Class and Book

Links:

* Chip’s website

* MLOps Discord server

* Confession of a so-called AI expert

* What I learned from looking at 200 machine learning tools

* CS 329S: Machine Learning Systems Design

* Designing Machine Learning Systems



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 30, 2022
Preetum Nakkiran: An Empirical Theory of Deep Learning
5849

In episode 31 of The Gradient Podcast, Daniel Bashir speaks to Preetum Nakkiran.

Preetum is a Research Scientist at Apple, a Visiting Researcher at UCSD, and part of the NSF/Simons Collaboration on the Theoretical Foundations of Deep Learning. He completed his PhD at Harvard, where he co-founded the ML Foundations Group. Preetum’s research focuses on building conceptual tools for understanding learning systems.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Sections:

(00:00) Intro

(01:25) Getting into AI through Theoretical Computer Science (TCS)

(09:08) Lack of Motivation in TCS and Learning What Research Is

(12:12) Foundational vs Problem-Solving Research, Antipatterns in TCS

(16:30) Theory and Empirics in Deep Learning

(18:30) What is an Empirical Theory of Deep Learning

(28:21) Deep Double Descent

(40:00) Inductive Biases in SGD, epoch-wise double descent

(45:25) Inductive Biases Stick Around

(47:12) Deep Bootstrap

(59:40) Distributional Generalization - Paper Rejections

(1:02:30) Classical Generalization and Distributional Generalization

(1:16:46) Future Work: Studying Structure in Data

(1:20:51) The Tweets^TM

(1:37:00) Outro

Episode Links:

* Preetum’s Homepage

* Preetum’s PhD Thesis



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 24, 2022
Max Woolf: Data Science at BuzzFeed and AI Content Generation
2903

In episode 30 of The Gradient Podcast, Daniel Bashir speaks to Max Woolf.

Max Woolf (@minimaxir) is currently a Data Scientist at BuzzFeed in San Francisco. Some work he’s done for BuzzFeed includes using StyleGAN to create AI-generated fake boyfriends and AI-generated art quizzes. In his free time, Max creates open source Python and R software on his GitHub. More recently, Max has been developing tooling for AI content generation, such as aitextgen for easy AI text generation.
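
As a taste of the tooling mentioned above, here is a minimal sketch of aitextgen’s quick-start usage (based on the project’s documentation; the prompt text is our own):

    from aitextgen import aitextgen

    ai = aitextgen()                # loads a small default GPT-2 model
    ai.generate(n=1, prompt="The Gradient is", max_length=40)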

Max’s projects are funded by his Patreon. If you have found anything on his website helpful, please help contribute!

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Sections:

(00:00) Intro

(01:20) Max’s Intro to Data Science and AI

(07:00) Software Engineering in Data Science, Max’s Perspectives

(09:00) Max’s Work at BuzzFeed

(23:10) Scaling, Inference, Large Models

(27:00) AI Content Generation

(30:45) Discourse About GPT-3

(34:30) AI Inventors

(38:35) Fun Projects and One-Offs: AI-generated Pokémon

(43:35) GPT-3-generated Discussion Topics

(46:30) Advice for Data Scientists

(48:10) BuzzFeed is Hiring :)

(48:20) Outro

Episode Links:

* Max’s Homepage

* Real-World Data Science



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 16, 2022
Rosanne Liu: Paths in AI Research and ML Collective
4508

In episode 29 of The Gradient Podcast, we chat with Rosanne Liu. Rosanne is a research scientist at Google Brain, and co-founder and executive director of ML Collective, a nonprofit organization for open collaboration and accessible mentorship. Before that she was a founding member of Uber AI. Outside of research, she supports underrepresented communities and has organized symposiums, workshops, and the weekly reading group “Deep Learning: Classics and Trends” since 2018. She is currently thinking deeply about how to democratize AI research even further and improve the diversity and fairness of the field, while working on multiple fronts of machine learning research, including understanding training dynamics and rethinking model capacity and scaling.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (01:30) How did you go into AI / research

* (6:45) AI research: the unreasonably narrow path and how not to be miserable

* (16:30) ML Collective Overview

* (21:45) Deep Learning: Classics and Trends Reading Group

* (26:25) More details about ML Collective

* (39:35) ICLR 2022 Diversity, Equity & Inclusion

* (48:00) Narrowness vs Variety in research

* (57:20) Favorite Papers 

* (58:50) Measuring the Intrinsic Dimension of Objective Landscapes 

* (01:01:40) Natural Adversarial Objects 

* (01:03:00) Interests outside of AI - Writing

* (01:08:05) Interests outside of AI - Narrating Travels with Charley

* (01:13:22) Outro



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 10, 2022
Ben Green: "Tech for Social Good" Needs to Do More
3190

In episode 28 of The Gradient Podcast, Daniel Bashir speaks to Ben Green, postdoctoral scholar in the Michigan Society of Fellows and Assistant Professor at the Gerald R. Ford School of Public Policy. Ben’s work focuses on the social and political impacts of government algorithms.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Sections:

(00:00) Intro

(02:00) Getting Started

(06:15) Soul Searching

(11:55) Decentering Algorithms

(19:50) The Future of the City

(27:25) Ethical Lip Service

(32:30) Ethics Research and Industry Incentives

(36:30) Broadening our Vision of Tech Ethics

(47:35) What Types of Research are Valued?

(52:40) Outro

Episode Links:

* Ben’s Homepage

* Algorithmic Realism

* Special Issue of the Journal of Social Computing



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 02, 2022
Max Braun: Teaching Robots to Help People in their Everyday Lives
4971

In episode 27 of The Gradient Podcast, Andrey Kurenkov speaks to Max Braun, who leads the AI and robotics software engineering team at Everyday Robots, a moonshot to create robots that can learn to help people in their everyday lives. Previously, he worked on building frontier technology products as an entrepreneur and later at Google and X. Max enjoys exploring the intersection of art, technology, and philosophy as a writer and designer. 

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (01:00) Start in AI

* (5:45) Humanoid Research in Osaka

* (8:45) Joining Google X

* (12:15) Visual Search and Google Glass

* (15:58) Academia Industry Connection

* (18:45) Overview of Robotics Vision

* (26:00) Machine Learning for Robotics

* (32:00) Robot Platform

* (38:00) Development Process and History

* (43:35) QT-Opt

* (49:05) Imitation Learning

* (55:00) Simulation Platform

* (59:45) Sim2Real

* (1:07:00) SayCan

* (1:14:30) Current Objectives

* (1:17:00) Other Projects

* (1:21:40) Outro

Episode Links:

* Max Braun’s Website

* Everyday Robots

* Simulating Artificial Muscles for Controlling a Robotic Arm with Fluctuation

* Introducing the Everyday Robot Project

* Scalable Deep Reinforcement Learning for Robotic Manipulation (QT-Opt)

* Alphabet is putting its prototype robots to work cleaning up around Google’s offices

* Everyday robots are (slowly) leaving the lab

* Can Robots Follow Instructions for New Tasks?

* Efficiently Initializing Reinforcement Learning With Prior Policies

* Combining RL + IL at Scale

* Shortening the Sim to Real Gap

* Action-Image: Teaching Grasping in Sim

* SayCan

* I Made an AI Read Wittgenstein, Then Told It to Play Philosopher



Get full access to The Gradient at thegradientpub.substack.com/subscribe
May 26, 2022
Yejin Choi: Teaching Machines Common Sense and Morality
4584

In episode 26 of The Gradient Podcast, Daniel Bashir speaks to Yejin Choi, professor of Computer Science at the University of Washington, and senior research manager at the Allen Institute for Artificial Intelligence.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Sections:

(00:00) Intro

(01:42) Getting Started in the Winter

(09:17) Has NLP lost its way?

(12:57) The Mosaic Project, Commonsense Intelligence

(18:20) A Priori Intuitions and Common Sense in Machines

(21:35) Abductive Reasoning

(24:49) Benchmarking Common Sense

(33:00) DeLorean and COMET - Algorithms for Commonsense Reasoning

(43:30) Positive and Negative uses of Commonsense Models

(49:40) Moral Reasoning

(57:00) Descriptive Morality, Meta-Ethical Concerns

(1:04:30) Potential Misuse

(1:12:15) Future Work

(1:16:23) Outro

Episode Links:

* Yejin’s Homepage

* The Curious Case of Commonsense Intelligence in Daedalus

* Common Sense Comes Close to Computers in Quanta

* Can Computers Learn Common Sense? in The New Yorker



Get full access to The Gradient at thegradientpub.substack.com/subscribe
May 19, 2022
David Chalmers on AI and Consciousness
3129

In episode 25 of The Gradient Podcast, Daniel Bashir speaks to David Chalmers, Professor of Philosophy and Neural Science at New York University and co-director of NYU’s Center for Mind, Brain, and Consciousness.

Subscribe to The Gradient Podcast:  Apple Podcasts  | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Sections:

(00:00) Intro

(00:42) “Today’s neural networks may be slightly conscious”

(03:55) Openness to Machine Consciousness

(09:37) Integrated Information Theory

(18:41) Epistemic Gaps, Verbal Reports

(25:52) Vision Models and Consciousness

(33:37) Reasoning about Consciousness

(38:20) Illusionism

(41:30) Best Approaches to the Hard Problem

(44:21) Panpsychism

(46:35) Outro

Episode Links:

* Chalmers’ Homepage

* Facing Up to the Hard Problem of Consciousness (1995)

* Reality+: Virtual Worlds and the Problems of Philosophy

* Amanda Askell on AI Consciousness



Get full access to The Gradient at thegradientpub.substack.com/subscribe
May 12, 2022
Greg Yang on Communicating Research, Tensor Programs, and µTransfer
3950

In episode 24 of The Gradient Podcast, Daniel Bashir talks to Greg Yang, senior researcher at Microsoft Research. Greg Yang’s Tensor Programs framework recently received attention for its role in the µTransfer paradigm for tuning the hyperparameters of large neural networks.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Sections:

(00:00) Intro

(01:50) Start in AI / Research

(05:55) Fear of Math in ML

(08:00) Presentation of Research

(17:35) Path to MSR

(21:20) Origin of Tensor Programs

(26:05) Refining TP’s Presentation

(39:55) The Sea of Garbage (Initializations) and the Oasis

(47:44) Scaling Up Further

(55:53) On Theory and Practice in Deep Learning

(01:05:28) Outro

Episode Links:

* Greg’s Homepage

* Greg’s Twitter

* µP GitHub

* Visual Intro to Gaussian Processes (Distill)



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Apr 28, 2022
Nick Walton on AI Dungeon and the Future of AI in Games
3757

In episode 23 of The Gradient Podcast, we talk to Nick Walton, CEO and co-founder of Latitude, a company that aims to make AI a tool of freedom and creativity for everyone and is currently developing AI Dungeon and Voyage.

Subscribe to The Gradient Podcast:  

* Apple Podcasts

* Spotify

* Pocket Casts

* RSS

Outline:

(00:00) Intro

(01:38) How did you get into AI / research

(3:50) Origin of AI Dungeon

(8:15) What is a Dungeon Master

(12:15) Brief history of AI Dungeon

(17:30) AI in videogames, past and future

(23:35) Early days of AI Dungeon

(29:45) AI Dungeon as a Creative Tool

(33:50) Technical Aspects of AI Dungeon

(39:15) Voyage

(48:27) Visuals in AI Dungeon

(50:45) How to Control AI in Games

(55:38) Future of AI in Games

(57:50) Funny stories

(59:45) Interests / Hobbies

(01:01:45) Outro



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Mar 24, 2022
Connor Leahy on EleutherAI, Replicating GPT-2/GPT-3, AI Risk and Alignment

In episode 22 of The Gradient Podcast, we talk to Connor Leahy, an AI researcher focused on AI alignment and a co-founder of EleutherAI.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Connor is an AI researcher working on understanding large ML models and aligning them to human values, and a co-founder of EleutherAI, a decentralized grassroots collective of volunteer researchers, engineers, and developers focused on AI alignment, scaling, and open-source AI research. The organization's flagship project is the GPT-Neo family of models, designed to replicate OpenAI's GPT-3.

Sections:

(00:00:00) Intro

(00:01:20) Start in AI

(00:08:00) Being excited about GPT-2

(00:18:00) Discovering AI safety and alignment

(00:21:10) Replicating GPT-2

(00:27:30) Deciding whether to release GPT-2 weights

(00:36:15) Life after GPT-2

(00:40:05) GPT-3 and Start of EleutherAI

(00:44:40) Early days of EleutherAI

(00:47:30) Creating the Pile, GPT-Neo, Hacker Culture

(00:55:10) Growth of EleutherAI, Cultivating Community

(01:02:22) Why release a large language model

(01:08:50) AI Risk and Alignment

(01:21:30) Worrying (or not) about Superhuman AI

(01:25:20) AI alignment and releasing powerful models

(01:32:08) AI risk and research norms

(01:37:10) Work on GPT-3 replication, GPT-NeoX

(01:38:48) Joining EleutherAI

(01:43:28) Personal interests / hobbies

(01:47:20) Outro

Links to things discussed:

* Replicating GPT2–1.5B, GPT2, Counting Consciousness and the Curious Hacker

* The Hacker Learns to Trust

* The Pile

* GPT-Neo

* GPT-J

* Why Release a Large Language Model?

* What A Long, Strange Trip It's Been: EleutherAI One Year Retrospective

* GPT-NeoX



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Feb 03, 2022
Percy Liang on Machine Learning Robustness, Foundation Models, and Reproducibility
3054

In episode 21 of The Gradient Podcast, we talk to Percy Liang, an Associate Professor of Computer Science at Stanford University and the director of the Center for Research on Foundation Models.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Percy Liang’s research spans many topics in machine learning and natural language processing, including robustness, interpretability, semantics, and reasoning.  He is also a strong proponent of reproducibility through the creation of CodaLab Worksheets.  His awards include the Presidential Early Career Award for Scientists and Engineers (2019), IJCAI Computers and Thought Award (2016), an NSF CAREER Award (2016), a Sloan Research Fellowship (2015), a Microsoft Research Faculty Fellowship (2014), and multiple paper awards at ACL, EMNLP, ICML, and COLT.

Sections:

(00:00) Intro

(01:21) Start in AI

(06:52) Interest in Language

(10:17) Start of PhD

(12:22) Semantic Parsing

(17:49) Focus on ML robustness

(22:30) Foundation Models, model robustness

(28:55) Foundation Model bias

(34:48) Foundation Model research by academia

(37:13) Current research interests

(39:40) Surprising robustness results

(44:24) Reproducibility and CodaLab

(50:17) Outro

Papers / Topics discussed:

* On the Opportunities and Risks of Foundation Models

* Reflections on Foundation Models

* Removing spurious features can hurt accuracy and affect groups disproportionately.

* Selective classification can magnify disparities across groups

* Just train twice: improving group robustness without training group information

* LILA: language-informed latent actions

* CodaLab



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jan 27, 2022
Eric Jang on Robots Learning at Google and Generalization via Language
5599

In episode 20 of The Gradient Podcast, we talk to Eric Jang, a research scientist on the Robotics team at Google.

Eric's research focuses on answering whether big data and small algorithms can yield unprecedented capabilities in robotics, just as they did in computer vision, translation, and speech. Specifically, he focuses on robotic manipulation and self-supervised robotic learning.

Sections:

(00:00) Intro

(00:50) Start in AI / Research

(03:58) Joining Google Robotics

(10:08) End to End Learning of Semantic Grasping

(19:11) Off Policy RL for Robotic Grasping

(29:33) Grasp2Vec

(40:50) Watch, Try, Learn Meta-Learning from Demonstrations and Rewards

(50:12) BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning

(59:41) Just Ask for Generalization

(01:09:02) Data for Robotics

(01:22:10) To Understand Language is to Understand Generalization

(01:32:38) Outro

Papers discussed:

* Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

* End-to-End Learning of Semantic Grasping

* Deep reinforcement learning for vision-based robotic grasping: A simulated comparative evaluation of off-policy methods

* Watch, Try, Learn Meta-Learning from Demonstrations and Rewards

* BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning

* Just Ask for Generalization

* To Understand Language is to Understand Generalization

* Robots Must Be Ephemeralized



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jan 08, 2022
Rishi Bommasani on Foundation Models
5631

In episode 19 of The Gradient Podcast, we talk to Rishi Bommasani, a Ph.D. student at Stanford focused on Foundation Models.

Rishi is a second-year Ph.D. student in the CS Department at Stanford, where he is advised by Percy Liang and Dan Jurafsky. His research focuses on understanding AI systems and their social impact, as well as using NLP to further scientific inquiry. Over the past year, he helped build and organize the Stanford Center for Research on Foundation Models (CRFM).

Sections:

(00:00:00) Intro

(00:01:05) How did you get into AI?

(00:09:55) Towards Understanding Position Embeddings

(00:14:23) Long-Distance Dependencies don’t have to be Long

(00:18:55) Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings

(00:30:25) Masters Thesis

(00:34:05) Start of PhD and work on foundation models

(00:42:14) Why were people interested in foundation models

(00:46:45) Formation of CRFM

(00:51:25) Writing report on foundation models

(00:56:33) Challenges in writing report

(01:05:45) Response to reception

(01:15:35) Goals of CRFM

(01:25:43) Current research focus

(01:30:35) Interests outside of research

(01:33:10) Outro

Papers discussed:

* Towards Understanding Position Embeddings

* Long-Distance Dependencies don’t have to be Long: Simplifying through Provably (Approximately) Optimal Permutations

* Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings

* Generalized Optimal Linear Orders

* On the Opportunities and Risks of Foundation Models

* Reflections on Foundation Models



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Dec 09, 2021
Upol Ehsan on Human-Centered Explainable AI and Social Transparency
5716

In episode 18 of The Gradient Podcast, we talked to Upol Ehsan, an Explainable AI (XAI) researcher who combines his background in Philosophy and Human-Computer Interaction to address problems in XAI beyond just opening the "black-box" of AI. You can find his Gradient article charting this vision here.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Papers Discussed:

* Rationalization: A Neural Machine Translation Approach to Generating Natural Language Explanations

* Automated Rationale Generation: A Technique for Explainable AI and its Effects on Human Perceptions

* Human-centered Explainable AI: Towards a Reflective Sociotechnical Approach

* Expanding Explainability: Towards Social Transparency in AI systems

* The Who in Explainable AI: How AI Background Shapes Perceptions of AI Explanations

* Explainability Pitfalls: Beyond Dark Patterns in Explainable AI

Exciting update!

In addition to listening to the audio recording, you can now experience the interview over at The Gradient’s main site, with live captions and the ability to jump to certain sections.

In addition, you can experience it as follows: Interactive Transcript | Transcript PDF | Interview on YouTube

About Upol: Upol Ehsan cares about people first, technology second. He is a doctoral candidate in the School of Interactive Computing at Georgia Tech and an affiliate at the Data & Society Research Institute. Combining his expertise in AI and background in Philosophy, his work in Explainable AI (XAI) aims to foster a future where anyone, regardless of their background, can use AI-powered technology with dignity.

Actively publishing in top peer-reviewed venues like CHI, his work has received multiple awards and been covered in major media outlets. Bridging industry and academia, he serves on multiple program committees in HCI and AI conferences (e.g., DIS, IUI, NeurIPS) and actively connects these communities (e.g., the widely attended HCXAI workshop at CHI). By promoting equity and ethics in AI, he wants to ensure stakeholders who aren’t at the table do not end up on the menu. Outside research, he is an advisor for Aalor Asha, an educational institute he started for underprivileged children subjected to child labor. Follow him on Twitter: @upolehsan



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Dec 03, 2021
Miles Brundage on AI Misuse and Trustworthy AI
3243

In episode 17 of The Gradient Podcast, we talk to Miles Brundage, Head of Policy Research at OpenAI and a researcher passionate about the responsible governance of artificial intelligence.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Links:

* Will Technology Make Work Better for Everyone?

* Economic Possibilities for Our Children: Artificial Intelligence and the Future of Work, Education, and Leisure

* Taking Superintelligence Seriously

* The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation

* Release Strategies and the Social Impact of Language Models

* All the News that’s Fit to Fabricate: AI-Generated Text as a Tool of Media Misinformation

* Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims

Timeline:

(00:00) Intro

(01:05) How did you get started in AI

(07:05) Writing about AI on Slate

(09:20) Start of PhD

(13:00) AI and the End of Scarcity

(18:12) Malicious Uses of AI

(28:00) GPT-2 and Publication Norms

(33:30) AI-Generated Text for Misinformation

(37:05) State of AI Misinformation

(41:30) Trustworthy AI

(48:50) OpenAI Policy Research Team

(53:15) Outro

Miles is a researcher and research manager, passionate about the responsible governance of artificial intelligence. In 2018, he joined OpenAI, where he began as a Research Scientist and recently became Head of Policy Research. Before that, he was a Research Fellow at the University of Oxford's Future of Humanity Institute, where he is still a Research Affiliate. He also serves as a member of Axon's AI and Policing Technology Ethics Board. He completed a PhD in Human and Social Dimensions of Science and Technology from Arizona State University in 2019.

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music"

Hosted by Andrey Kurenkov (@andrey_kurenkov), a PhD student with the Stanford Vision and Learning Lab working on learning techniques for robotic manipulation and search.



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Nov 23, 2021
Jeffrey Ding on China's AI Dream, the AI 'Arms Race', and AI as a General Purpose Technology
4264

In episode 16 of The Gradient Podcast, we talk to Jeffrey Ding, a postdoctoral fellow at Stanford's Center for International Security and Cooperation.

(01:35) Getting into AI research

(04:20) Interest in studying China

(06:50) Deciphering China’s AI Dream

(23:25) Beyond the AI Arms Race

(36:45) China's Current Capabilities in AI

(46:45) AI as a General Purpose and Strategic Technology

(57:38) ChinaAI Newsletter

(01:04:20) Teaching AI to Policy People

(01:06:30) Current Focus

(01:09:10) Interests Outside of Work + Outro

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Jeffrey Ding (@jjding99) is a postdoctoral fellow at Stanford's Center for International Security and Cooperation, sponsored by Stanford's Institute for Human-Centered Artificial Intelligence, as well as a research affiliate with the Centre for the Governance of AI at the University of Oxford. His current research is centered on how technological change affects the rise and fall of great powers, with an eye toward the implications of advances in AI for a possible U.S.-China power transition. He also puts out the excellent ChinaAI newsletter, which has (sometimes) weekly translations of Chinese-language musings on AI and related topics.

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music"

Hosted by Andrey Kurenkov (@andrey_kurenkov), a PhD student with the Stanford Vision and Learning Lab working on learning techniques for robotic manipulation and search.



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Nov 18, 2021
Alex Tamkin on Self-Supervised Learning and Large Language Models
4241

In episode 15 of The Gradient Podcast, we talk to Stanford PhD Candidate Alex Tamkin.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Alex Tamkin is a fourth-year PhD student in Computer Science at Stanford, advised by Noah Goodman and part of the Stanford NLP Group. His research focuses on understanding, building, and controlling pretrained models, especially in domain-general or multimodal settings.

We discuss:

* Viewmaker Networks: Learning Views for Unsupervised Representation Learning

* DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning

* On the Opportunities and Risks of Foundation Models

* Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models

* Mentoring, teaching and fostering a healthy and inclusive research culture

* Scientific communication and breaking down walls between fields

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music"



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Nov 11, 2021
Peter Henderson on RL Benchmarking, Climate Impacts of AI, and AI for Law
5322

In episode 14 of The Gradient Podcast, we interview Stanford PhD Candidate Peter Henderson.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Peter is a joint JD-PhD student at Stanford University advised by Dan Jurafsky. He is also an Open Philanthropy AI Fellow and a Graduate Student Fellow at the Regulation, Evaluation, and Governance Lab. His research focuses on creating robust decision-making systems, with three main goals: (1) use AI to make governments more efficient and fair; (2) ensure that AI isn’t deployed in ways that can harm people; (3) create new ML methods for applications that are beneficial to society.

Links:

* Reproducibility and Reusability in Deep Reinforcement Learning

* Benchmark Environments for Multitask Learning in Continuous Domains

* Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control

* Deep Reinforcement Learning that Matters

* Reproducibility and Replicability in Deep Reinforcement Learning (and Other Deep Learning Methods)

* Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning

* How blockers can turn into a paper: A retrospective on 'Towards The Systematic Reporting of the Energy and Carbon Footprints of Machine Learning'

* When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset

* How US law will evaluate artificial intelligence for Covid-19

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music"



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Oct 28, 2021
Chelsea Finn on Meta Learning & Model Based Reinforcement Learning
2980

In episode 13 of The Gradient Podcast, we interview Stanford Professor Chelsea Finn.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Chelsea is an Assistant Professor at Stanford University. Her lab, IRIS, studies intelligence through robotic interaction at scale and is affiliated with SAIL and the Statistical ML Group. She also spends time at Google as part of the Google Brain team. Her research deals with the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction.

Links:

* Learning to Learn with Gradients

* Visual Model-Based Reinforcement Learning as a Path towards Generalist Robots

* RoboNet: A Dataset for Large-Scale Multi-Robot Learning

* Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction

* Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks   

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Oct 14, 2021
Devi Parikh on Generative Art & AI for Creativity
3314

In episode 12 of The Gradient Podcast, we interview Devi Parikh, a professor at Georgia Tech whose research focuses on computer vision, natural language processing, embodied AI, human-AI collaboration, and AI for creativity.

Devi Parikh is an Associate Professor in the School of Interactive Computing at Georgia Tech, and a Research Scientist at Facebook AI Research (FAIR). Her research interests are in computer vision, natural language processing, embodied AI, human-AI collaboration, and AI for creativity. In the past, she has also been an Assistant Professor at Virginia Tech and a Research Assistant Professor at Toyota Technological Institute at Chicago (TTIC). She received her M.S. and Ph.D. degrees from the Electrical and Computer Engineering department at Carnegie Mellon University in 2007 and 2009, respectively.

Links:

* Humans of AI Podcast

* Feel The Music: Automatically Generating A Dance For An Input Song

* Exploring Crowd Co-creation Scenarios for Sketches

* Neuro-Symbolic Generative Art: A Preliminary Study

* Creative Sketch Generation

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Oct 01, 2021
Sergey Levine on Robot Learning & Offline RL
3231

In episode 11 of The Gradient Podcast, we interview Sergey Levine, a professor at Berkeley whose research focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms for robotics.

Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009, and a Ph.D. in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms, and includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 16, 2021
Jeremy Howard on Kaggle, Enlitic, and fast.ai
3483

In episode 10 of The Gradient Podcast, we interview data scientist, researcher, developer, educator, and entrepreneur Jeremy Howard.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Jeremy Howard is a data scientist, researcher, developer, educator, and entrepreneur. Jeremy is a founding researcher at fast.ai, a research institute dedicated to making deep learning more accessible. He is also a Distinguished Research Scientist at the University of San Francisco, the chair of WAMRI, and Chief Scientist at platform.ai. Previously, Jeremy was the founding CEO of Enlitic, the first company to apply deep learning to medicine; President and Chief Scientist of the data science platform Kaggle; and founding CEO of two successful Australian startups.

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 09, 2021
Evan Hubinger on Effective Altruism and AI Safety
4520

In episode 9 of The Gradient Podcast, we interview Evan Hubinger, an AI safety researcher at the Machine Intelligence Research Institute.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Evan is an AI safety veteran who’s done research at leading AI labs like OpenAI, and whose experience also includes stints at Google, Ripple, and Yelp. He currently works at the Machine Intelligence Research Institute (MIRI) as a Research Fellow, and joined us to talk about his views on AI safety, the alignment problem, and whether humanity is likely to survive the advent of superintelligent AI.

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Sep 03, 2021
Yannic Kilcher on Being an AI Researcher and Educator
2432

In episode 8 of The Gradient Podcast, we interview Yannic Kilcher, an AI researcher and educator.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Yannic graduated with his PhD from ETH Zurich’s data analytics lab and is now the Chief Technology Officer of DeepJudge, a company building a next-generation, AI-powered, context-sensitive legal document processing platform. He is famous for his very popular YouTube channel, where his videos cover machine learning research papers, programming, issues of the AI community, and the broader impact of AI in society.

Check out his YouTube channel here and follow him on Twitter here.

Podcast Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Aug 27, 2021
Alexander Veysov on Self-Teaching AI and Creating Open Speech-To-Text
2811

In episode 7 of The Gradient Podcast, we interview Alexander Veysov, founder and owner of Silero. You can find a transcript of our conversation here, and the repositories for Open Speech To Text and Silero Models here.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Alexander Veysov is the founder and owner of Silero, a small company building Speech / NLP enabled products, and the author of Open STT. Silero has recently shipped its own Russian STT engine. Previously he worked at a then Moscow-based VC firm and at Ponominalu.ru, a ticketing startup acquired by MTS (a major Russian telco). He received his BA and MA in Economics from the Moscow State Institute of International Relations (MGIMO). You can follow his channel on Telegram (@snakers41).

Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Aug 19, 2021
Yann LeCun on his Start in Research and Self-Supervised Learning
3348

In episode 6 of The Gradient Podcast, we interview Deep Learning pioneer Yann LeCun.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Yann LeCun is the VP & Chief AI Scientist at Facebook and Silver Professor at NYU and he was also the founding Director of Facebook AI Research and of the NYU Center for Data Science. He famously pioneered the use of Convolutional Neural Nets for image processing in the 80s and 90s, and is generally regarded as one of the people whose work was pivotal to the Deep Learning revolution in AI. In fact he is the recipient of the 2018 ACM Turing Award (with Geoffrey Hinton and Yoshua Bengio) for "conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing".

Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Aug 05, 2021
Anna Rogers on the Flaws of Peer Review in AI
3790

In episode 5 of The Gradient Podcast, we interview NLP researcher Anna Rogers.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Anna Rogers is a post-doctoral associate at the University of Copenhagen, working with research groups in the Center for Social Data Science and the Machine Learning Section. Her main research area is Natural Language Processing, with a focus on interpretability and evaluation of deep learning models. She is also known for her work on improving peer review in NLP and as an organizer of the Workshop on Insights from Negative Results in NLP. Check out her article What Can We Do To Improve Peer Review in NLP and her tutorial Reviewing NLP Research.



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jul 30, 2021
Joel Simon on AI art and Artbreeder
3498

In episode 4 of The Gradient Podcast, we interview artist, engineer, and entrepreneur Joel Simon.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Joel Simon is a multidisciplinary artist, toolmaker, and researcher. He studied computer science and art at Carnegie Mellon University, worked on bioinformatics at Rockefeller University, and most recently is the founder and director of Morphogen, a generative design company developing Artbreeder, a massively collaborative creative tool and network. His interests lie at the intersection of computer science, biology, and design, as well as furniture design, collaborative creativity, sculpture, and game design.

Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jul 20, 2021
Abubakar Abid on AI for Genomics, Gradio, and the Fatima Fellowship
2680

Subscribe to The Gradient Podcast: iTunes | RSS | Spotify

In episode 3 of The Gradient Podcast, we interview researcher and entrepreneur Abubakar Abid. Follow him on Twitter and check out the websites of his company Gradio and his side project the Fatima Fellowship.

Abubakar is an entrepreneur and researcher focused on AI and its applications to medicine. He is currently running the company Gradio, which is developing a product to generate an easy-to-use UI for any ML model, function, or API.  He is also running the Fatima Al-Fihri Predoctoral Fellowship, which is a 9-month program for computer science students from around the world who are planning on applying to PhD programs in the United States. 

Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jul 06, 2021
Helena Sarin on being an AI Artist
2449

In episode 2 of The Gradient Podcast, we interview AI artist Helena Sarin. Check out her work and follow her over at her Twitter @NeuralBricolage.

Helena Sarin is a visual artist and software engineer, and is among the most prominent artists using AI in their work. She discovered GANs (Generative Adversarial Networks) several years ago and has since made generative models her primary medium. She is a frequent speaker at ML/AI conferences, over the past year delivering invited talks at MIT, the Library of Congress, and Capital One, and her artwork has been exhibited at AI art exhibitions in Zurich, Dubai, Oxford, Shanghai, and Miami. Lastly, Helena was among the earliest authors to contribute a piece to The Gradient with 2018’s “Playing a game of GANstruction”, in which she described the process she follows to make her art.

Image credit: Happy Nation - The Waterpark By Helena Sarin

Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 19, 2021
Hello World from The Gradient Podcast!
1334

Hello world! After more than 3 years of publishing overviews and perspectives from the AI community on thegradient.pub, The Gradient now has a podcast. In this first episode our lead editors take a look back on how it all started, as well as a look ahead at where things are heading. Keep an eye out for our next episode, coming soon!

Theme: “MusicVAE: Trio 16-bar Sample #2” from "MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music".



Get full access to The Gradient at thegradientpub.substack.com/subscribe
Jun 01, 2021