The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

By Sam Charrington

Category: Technology

Subscribers: 1778
Reviews: 3
Episodes: 649

 Dec 25, 2018

Elvis Alive
 Jul 20, 2018
Great podcast covering both business and technical aspects of ML and AI

A Podcast Republic user
 Jul 9, 2018


Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers and tech-savvy business and IT leaders. The show is hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science and more.

Episode Date
Towards Improved Transfer Learning with Hugo Larochelle - #631
Today we’re joined by Hugo Larochelle, a research scientist at Google DeepMind. In our conversation with Hugo, we discuss his work on transfer learning, understanding the capabilities of deep learning models, and creating the Transactions on Machine Learning Research journal. We explore the use of large language models in NLP, prompting, and zero-shot learning. Hugo also shares insights from his research on neural knowledge mobilization for code completion and discusses the adaptive prompts used in their system. The complete show notes for this episode can be found at
May 29, 2023
Language Modeling With State Space Models with Dan Fu - #630
Today we’re joined by Dan Fu, a PhD student at Stanford University. In our conversation with Dan, we discuss the limitations of state space models in language modeling and the search for alternative building blocks that can help increase context length without being computationally infeasible. Dan walks us through the H3 architecture and Flash Attention technique, which can reduce the memory footprint of a model and make it feasible to fine-tune. We also explore his work on improving language models using synthetic languages, the issue of long sequence length affecting both training and inference in models, and the hope for finding something sub-quadratic that can perform language processing more effectively than the brute force approach of attention. The complete show notes for this episode can be found at
May 22, 2023
Building Maps and Spatial Awareness in Blind AI Agents with Dhruv Batra - #629
Today we continue our coverage of ICLR 2023 joined by Dhruv Batra, an associate professor at Georgia Tech and research director of the Fundamental AI Research (FAIR) team at Meta. In our conversation, we discuss Dhruv’s work on the paper Emergence of Maps in the Memories of Blind Navigation Agents, which won an Outstanding Paper Award at the event. We explore navigation with multilayer LSTM and the question of whether embodiment is necessary for intelligence. We delve into the Embodiment Hypothesis, the progress being made in language models, and the need for caution and responsible use of these models. We also discuss the history of AI and the importance of using the right data sets in training. The conversation explores the different meanings of "maps" across AI and cognitive science fields, Dhruv’s experience in navigating mapless systems, and the early discovery stages of memory representation and neural mechanisms. The complete show notes for this episode can be found at
May 15, 2023
AI Agents and Data Integration with GPT and LLaMa with Jerry Liu - #628
Today we’re joined by Jerry Liu, co-founder and CEO of Llama Index. In our conversation with Jerry, we explore the creation of Llama Index, a centralized interface to connect your external data with the latest large language models. We discuss the challenges of adding private data to language models and how Llama Index connects the two for better decision-making. We discuss the role of agents in automation, the evolution of the agent abstraction space, and the difficulties of optimizing queries over large amounts of complex data. We also discuss a range of topics from combining summarization and semantic search, to automating reasoning, to improving language model results by exploiting relationships between nodes in data.  The complete show notes for this episode can be found at
May 08, 2023
Hyperparameter Optimization through Neural Network Partitioning with Christos Louizos - #627
Today we kick off our coverage of the 2023 ICLR conference joined by Christos Louizos, an ML researcher at Qualcomm Technologies. In our conversation with Christos, we explore his paper Hyperparameter Optimization through Neural Network Partitioning and a few of his colleagues' works from the conference. We discuss methods for speeding up attention mechanisms in transformers, scheduling operations for computation graphs, estimating channels in indoor environments, and adapting to distribution shifts at test time with neural network modules. We also talk through the benefits and limitations of federated learning, exploring sparse models, optimizing communication between servers and devices, and much more. The complete show notes for this episode can be found at
May 01, 2023
Are LLMs Overhyped or Underappreciated? with Marti Hearst - #626
Today we’re joined by Marti Hearst, Professor at UC Berkeley. In our conversation with Marti, we explore the intricacies of AI language models and their usefulness in improving efficiency but also their potential for spreading misinformation. Marti expresses skepticism about whether these models truly have cognition compared to the nuance of the human brain. We discuss the intersection of language and visualization and the need for specialized research to ensure safety and appropriateness for specific uses. We also delve into the latest tools and algorithms such as Copilot and ChatGPT, which enhance programming and help in identifying comparisons, respectively. Finally, we discuss Marti’s long research history in search and her breakthrough in developing a standard interaction that allows for finding items on websites and library catalogs. The complete show notes for this episode can be found at
Apr 24, 2023
Are Large Language Models a Path to AGI? with Ben Goertzel - #625
Today we’re joined by Ben Goertzel, CEO of SingularityNET. In our conversation with Ben, we explore all things AGI, including the potential scenarios that could arise with the advent of AGI and his preference for a decentralized rollout comparable to the internet or Linux. Ben shares his research in bridging neural nets, symbolic logic engines, and evolutionary programming engines to develop a common mathematical framework for AI paradigms. We also discuss the limitations of Large Language Models and the potential of hybridizing LLMs with other AGI approaches. Additionally, we chat about their work using LLMs for music generation and the limitations of formalizing creativity. Finally, Ben discusses his team's work with the OpenCog Hyperon framework and Simuli to achieve AGI, and the potential implications of their research in the future. The complete show notes for this episode can be found at
Apr 17, 2023
Open Source Generative AI at Hugging Face with Jeff Boudier - #624
Today we’re joined by Jeff Boudier, head of product at Hugging Face 🤗. In our conversation with Jeff, we explore the current landscape of open-source machine learning tools and models, the recent shift towards consumer-focused releases, and the importance of making ML tools accessible. We also discuss the growth of the Hugging Face Hub, which currently hosts over 150k models, and how formalizing their collaboration with AWS will help drive the adoption of open-source models in the enterprise.   The complete show notes for this episode can be found at
Apr 11, 2023
Generative AI at the Edge with Vinesh Sukumar - #623
Today we’re joined by Vinesh Sukumar, a senior director and head of AI/ML product management at Qualcomm Technologies. In our conversation with Vinesh, we explore how mobile and automotive devices have different requirements for AI models and how their AI stack helps developers create complex models on both platforms. We also discuss the growing interest in text-based input and the shift towards transformers, generative content, and recommendation engines. Additionally, we explore the challenges and opportunities for ML Ops investments on the edge, including the use of synthetic data and evolving models based on user data. Finally, we delve into the latest advancements in large language models, including Prometheus-style models and GPT-4. The complete show notes for this episode can be found at
Apr 03, 2023
Runway Gen-2: Generative AI for Video Creation with Anastasis Germanidis - #622
Today we’re joined by Anastasis Germanidis, Co-Founder and CTO of RunwayML. Amongst all the product and model releases over the past few months, Runway threw its hat into the ring with Gen-1, a model that can take still images or video and transform them into completely stylized videos. They followed that up just a few weeks later with the release of Gen-2, a multimodal model that can produce a video from text prompts. We had the pleasure of chatting with Anastasis about both models, exploring the challenges of generating video, the importance of alignment in model deployment, the potential use of RLHF, the deployment of models as APIs, and much more! The complete show notes for this episode can be found at
Mar 27, 2023
Watermarking Large Language Models to Fight Plagiarism with Tom Goldstein - #621
Today we’re joined by Tom Goldstein, an associate professor at the University of Maryland. Tom’s research sits at the intersection of ML and optimization and has previously been featured in the New Yorker for his work on invisibility cloaks, clothing that can evade object detection. In our conversation, we focus on his more recent research on watermarking LLM output. We explore the motivations behind adding these watermarks, how they work, and different ways a watermark could be deployed, as well as political and economic incentive structures around the adoption of watermarking and future directions for that line of work. We also discuss Tom’s research into data leakage, particularly in stable diffusion models, work that is analogous to recent guest Nicholas Carlini’s research into LLM data extraction. 
Mar 20, 2023
Does ChatGPT “Think”? A Cognitive Neuroscience Perspective with Anna Ivanova - #620
Today we’re joined by Anna Ivanova, a postdoctoral researcher at MIT Quest for Intelligence. In our conversation with Anna, we discuss her recent paper Dissociating language and thought in large language models: a cognitive perspective. In the paper, Anna reviews the capabilities of LLMs by considering their performance on two different aspects of language use: 'formal linguistic competence', which includes knowledge of rules and patterns of a given language, and 'functional linguistic competence', a host of cognitive abilities required for language understanding and use in the real world. We explore parallels between linguistic competence and AGI, the need to identify new benchmarks for these models, whether an end-to-end trained LLM can address various aspects of functional competence, and much more!  The complete show notes for this episode can be found at
Mar 13, 2023
Robotic Dexterity and Collaboration with Monroe Kennedy III - #619
Today we’re joined by Monroe Kennedy III, an assistant professor at Stanford, director of the Assistive Robotics and Manipulation Lab, and a national director of Black in Robotics. In our conversation with Monroe, we spend some time exploring the robotics landscape, getting Monroe’s thoughts on the current challenges in the field, as well as his opinion on choreographed demonstrations like the dancing Boston Dynamics machines. We also dig into his work around two distinct threads: Robotic Dexterity (what does it take to make robots capable of doing useful manipulation tasks with and for humans?) and Collaborative Robotics (how do we go beyond advanced autonomy in robots towards making effective robotic teammates capable of working with human counterparts?). Finally, we discuss DenseTact, an optical-tactile sensor capable of visualizing the deformed surface of a soft fingertip and using that image in a neural network to perform calibrated shape reconstruction and 6-axis wrench estimation. The complete show notes for this episode can be found at
Mar 06, 2023
Privacy and Security for Stable Diffusion and LLMs with Nicholas Carlini - #618
Today we’re joined by Nicholas Carlini, a research scientist at Google Brain. Nicholas works at the intersection of machine learning and computer security, and his recent paper “Extracting Training Data from LLMs” has generated quite a buzz within the ML community. In our conversation, we discuss the current state of adversarial machine learning research, the dynamic of dealing with privacy issues in black box vs accessible models, what privacy attacks in vision models like diffusion models look like, and the scale of “memorization” within these models. We also explore Nicholas’ work on data poisoning, which looks to understand what happens if a bad actor can take control of a small fraction of the data that an ML model is trained on. The complete show notes for this episode can be found at
Feb 27, 2023
Understanding AI’s Impact on Social Disparities with Vinodkumar Prabhakaran - #617
Today we’re joined by Vinodkumar Prabhakaran, a Senior Research Scientist at Google Research. In our conversation with Vinod, we discuss his two main areas of research: using ML, specifically NLP, to explore social disparities, and how these same social disparities are captured and propagated within machine learning tools. We explore a few specific projects, the first using NLP to analyze interactions between police officers and community members, determining factors like level of respect or politeness and how they play out across a spectrum of community members. We also discuss his work on understanding how bias creeps into the pipeline of building ML models, whether it be from the data or the person building the model. Finally, for those working with human annotators, Vinod shares his thoughts on how to incorporate principles of fairness to help build more robust models. The complete show notes for this episode can be found at
Feb 20, 2023
AI Trends 2023: Causality and the Impact on Large Language Models with Robert Osazuwa Ness - #616
Today we’re joined by Robert Osazuwa Ness, a senior researcher at Microsoft Research, to break down the latest trends in the world of causal modeling. In our conversation with Robert, we explore advances in areas like causal discovery, causal representation learning, and causal judgements. We also discuss the impact causality could have on large language models, especially in some of the recent use cases we’ve seen like Bing Search and ChatGPT. Finally, we discuss the benchmarks for causal modeling, the top causality use cases, and the most exciting opportunities in the field.   The complete show notes for this episode can be found at
Feb 14, 2023
Data-Centric Zero-Shot Learning for Precision Agriculture with Dimitris Zermas - #615
Today we’re joined by Dimitris Zermas, a principal scientist at agriscience company Sentera. Dimitris’ work at Sentera is focused on developing tools for precision agriculture using machine learning, including hardware like cameras and sensors, as well as ML models for analyzing the vast amount of data they acquire. We explore some specific use cases for machine learning, including plant counting, the challenges of working with classical computer vision techniques, database management, and data annotation. We also discuss their use of approaches like zero-shot learning and how they’ve taken advantage of a data-centric mindset when building a better, more cost-efficient product.
Feb 06, 2023
How LLMs and Generative AI are Revolutionizing AI for Science with Anima Anandkumar - #614
Today we’re joined by Anima Anandkumar, Bren Professor of Computing And Mathematical Sciences at Caltech and Sr Director of AI Research at NVIDIA. In our conversation, we take a broad look at the emerging field of AI for Science, focusing on both practical applications and longer-term research areas. We discuss the latest developments in the area of protein folding, and how much it has evolved since we first discussed it on the podcast in 2018, the impact of generative models and stable diffusion on the space, and the application of neural operators. We also explore the ways in which prediction models like weather models could be improved, how foundation models are helping to drive innovation, and finally, we dig into MineDojo, a new framework built on the popular Minecraft game for embodied agent research, which won a 2022 Outstanding Paper Award at NeurIPS.  The complete show notes for this episode can be found at
Jan 30, 2023
AI Trends 2023: Natural Language Proc - ChatGPT, GPT-4 and Cutting Edge Research with Sameer Singh - #613
Today we continue our AI Trends 2023 series joined by Sameer Singh, an associate professor in the department of computer science at UC Irvine and fellow at the Allen Institute for Artificial Intelligence (AI2). In our conversation with Sameer, we focus on the latest and greatest advancements and developments in the field of NLP, starting out with one that took the internet by storm just a few short weeks ago, ChatGPT. We also explore top themes like decomposed reasoning, causal modeling in NLP, and the need for “clean” data. We also discuss projects like HuggingFace’s BLOOM, the debacle that was the Galactica demo, the impending intersection of LLMs and search, use cases like Copilot, and of course, we get Sameer’s predictions for what will happen this year in the field. The complete show notes for this episode can be found at
Jan 23, 2023
AI Trends 2023: Reinforcement Learning - RLHF, Robotic Pre-Training, and Offline RL with Sergey Levine - #612
Today we’re taking a deep dive into the latest and greatest in the world of Reinforcement Learning with our friend Sergey Levine, an associate professor at UC Berkeley. In our conversation with Sergey, we explore some game-changing developments in the field including the release of ChatGPT and the onset of RLHF. We also explore more broadly the intersection of RL and language models, as well as advancements in offline RL and pre-training for robotics models, inverse RL, Q learning, and a host of papers along the way. Finally, you don’t want to miss Sergey’s predictions for the top developments of the year 2023! The complete show notes for this episode can be found at
Jan 16, 2023
Supporting Food Security in Africa Using ML with Catherine Nakalembe - #611
Today we conclude our coverage of the 2022 NeurIPS series joined by Catherine Nakalembe, an associate research professor at the University of Maryland, and Africa Program Director under NASA Harvest. In our conversation with Catherine, we take a deep dive into her talk from the ML in the Physical Sciences workshop, Supporting Food Security in Africa using Machine Learning and Earth Observations. We discuss the broad challenges associated with food insecurity, as well as Catherine’s role and the priorities of Harvest Africa, a program focused on advancing innovative satellite-driven methods to produce automated within-season crop type and crop-specific condition products that support agricultural assessments. We explore some of the technical challenges of her work, including the limited, but growing, access to remote sensing and earth observation datasets and how the availability of that data has changed in recent years, the lack of benchmarks for the tasks she’s working on, examples of how they’ve applied techniques like multi-task learning and task-informed meta-learning, and much more.  The complete show notes for this episode can be found at
Jan 09, 2023
Service Cards and ML Governance with Michael Kearns - #610
Today we conclude our AWS re:Invent 2022 series joined by Michael Kearns, a professor in the department of computer and information science at UPenn, as well as an Amazon Scholar. In our conversation, we briefly explore Michael’s broader research interests in responsible AI and ML governance and his role at Amazon. We then discuss the announcement of service cards, and their take on “model cards” at a holistic, system level as opposed to an individual model level. We walk through the information represented on the cards, as well as explore the decision-making process around specific information being omitted from the cards. We also get Michael’s take on the years-old debate of algorithmic bias vs dataset bias, what some of the current issues are around this topic, and what research he has seen (and hopes to see) addressing issues of “fairness” in large language models.   The complete show notes for this episode can be found at
Jan 02, 2023
Reinforcement Learning for Personalization at Spotify with Tony Jebara - #609
Today we continue our NeurIPS 2022 series joined by Tony Jebara, VP of engineering and head of machine learning at Spotify. In our conversation with Tony, we discuss his role at Spotify, how the company’s use of machine learning has evolved over the last few years, and the business value that machine learning, specifically recommendations, holds at the company. We dig into his talk on the intersection of reinforcement learning and lifetime value (LTV) at Spotify, which explores the application of Offline RL for user experience personalization. We discuss the various papers presented in the talk, and how they all map toward determining and increasing a user’s LTV. The complete show notes for this episode can be found at
Dec 29, 2022
Will ChatGPT take my job? - #608
More than any system before it, ChatGPT has tapped into our enduring fascination with artificial intelligence, raising in a more concrete and present way important questions and fears about what AI is capable of and how it will impact us as humans. One of the concerns most frequently voiced, whether sincerely or cloaked in jest, is how ChatGPT, or systems like it, will impact our livelihoods. In other words, “will ChatGPT put me out of a job???” In this episode of the podcast, I seek to answer this very question by conducting an interview in which ChatGPT is asking all the questions. (The questions are answered by a second ChatGPT, as in my own recent interview with it, Exploring Large Language Models with ChatGPT.) In addition to the straight dialogue, I include my own commentary along the way and conclude with a discussion of the results of the experiment, that is, whether I think ChatGPT will be taking my job as your host anytime soon. Ultimately, though, I hope you’ll be the judge of that and share your thoughts on how ChatGPT did at my job via a comment below or on social media.
Dec 26, 2022
Geospatial Machine Learning at AWS with Kumar Chellapilla - #607
Today we continue our re:Invent 2022 series joined by Kumar Chellapilla, a general manager of ML and AI Services at AWS. We had the opportunity to speak with Kumar after announcing their recent addition of geospatial data to the SageMaker Platform. In our conversation, we explore Kumar’s role as the GM for a diverse array of SageMaker services, what has changed in the geospatial data landscape over the last 10 years, and why Amazon decided now was the right time to invest in geospatial data. We discuss the challenges of accessing and working with this data and the pain points they’re trying to solve. Finally, Kumar walks us through a few customer use cases, describes how this addition will make users more effective than they currently are, and shares his thoughts on the future of this space over the next 2-5 years, including the potential intersection of geospatial data and stable diffusion/generative models. The complete show notes for this episode can be found at
Dec 22, 2022
Real-Time ML Workflows at Capital One with Disha Singla - #606
Today we’re joined by Disha Singla, a senior director of machine learning engineering at Capital One. In our conversation with Disha, we explore her role as the leader of the Data Insights team at Capital One, where they’ve been tasked with creating reusable libraries, components, and workflows to make ML usable broadly across the company, as well as a platform to make it all accessible and to drive meaningful insights. We discuss the construction of her team, as well as the types of interactions and requests they receive from their customers (data scientists), productionized use cases from the platform, and their efforts to transition from batch to real-time deployment. Disha also shares her thoughts on the ROI of machine learning and getting buy-in from executives, how she sees machine learning evolving at the company over the next 10 years, and much more! The complete show notes for this episode can be found at
Dec 19, 2022
Weakly Supervised Causal Representation Learning with Johann Brehmer - #605
Today we’re excited to kick off our coverage of the 2022 NeurIPS conference with Johann Brehmer, a research scientist at Qualcomm AI Research in Amsterdam. We begin our conversation discussing some of the broader problems that causality will help us solve, before turning our focus to Johann’s paper Weakly supervised causal representation learning, which seeks to prove that high-level causal representations are identifiable in weakly supervised settings. We also discuss a few other papers that the team at Qualcomm presented, including neural topological ordering for computation graphs, as well as some of the demos they showcased, which we’ll link to on the show notes page.  The complete show notes for this episode can be found at
Dec 15, 2022
Stable Diffusion & Generative AI with Emad Mostaque - #604
Today we’re excited to kick off our 2022 AWS re:Invent series with a conversation with Emad Mostaque, Founder and CEO of Stability. Stability is a very popular name in the generative AI space at the moment, having taken the internet by storm with the release of its stable diffusion model just a few months ago. In our conversation with Emad, we discuss the story behind Stability's inception, the model's speed and scale, and the connection between stable diffusion and programming. We explore some of the spaces that Emad anticipates being disrupted by this technology, his thoughts on the open-source vs API debate, how they’re dealing with issues of user safety and artist attribution, and of course, what infrastructure they’re using to stand the model up. The complete show notes for this episode can be found at
Dec 12, 2022
Exploring Large Language Models with ChatGPT - #603
Today we're joined by ChatGPT, the latest and coolest large language model developed by OpenAI. In our conversation with ChatGPT, we discuss the background and capabilities of large language models, the potential applications of these models, and some of the technical challenges and open questions in the field. We also explore the role of supervised learning in creating ChatGPT, and the use of PPO in training the model. Finally, we discuss the risks of misuse of large language models, and the best resources for learning more about these models and their applications. Join us for a fascinating conversation with ChatGPT, and learn more about the exciting world of large language models. The complete show notes for this episode can be found at
Dec 08, 2022
Accelerating Intelligence with AI-Generating Algorithms with Jeff Clune - #602
Are AI-generating algorithms the path to artificial general intelligence (AGI)? Today we’re joined by Jeff Clune, an associate professor of computer science at the University of British Columbia, and faculty member at the Vector Institute. In our conversation with Jeff, we discuss the broad, ambitious goal of the AI field, artificial general intelligence, where we are on the path to achieving it, and his opinion on what we should be doing to get there, specifically, focusing on AI-generating algorithms. With the goal of creating open-ended algorithms that can learn forever, Jeff shares his three pillars of an AI-GA: meta-learning architectures, meta-learning algorithms, and auto-generating learning environments. Finally, we discuss the inherent safety issues with these learning algorithms, Jeff’s thoughts on how to combat them, and what the not-so-distant future holds for this area of research. The complete show notes for this episode can be found at
Dec 05, 2022
Programmatic Labeling and Data Scaling for Autonomous Commercial Aviation with Cedric Cocaud - #601
Today we’re joined by Cedric Cocaud, the chief engineer of the Wayfinder Group at Acubed, the innovation center for aircraft manufacturer Airbus. In our conversation with Cedric, we explore some of the technical challenges of innovation in the aircraft space, including autonomy. Cedric’s work on Project Vahana, Acubed’s foray into air taxis, attempted to leverage work in the self-driving car industry to develop fully autonomous planes. We discuss some of the algorithms being developed for this work, the data collection process, and Cedric’s thoughts on using synthetic data for these tasks. We also discuss the challenges of labeling the data, including programmatic and automated labeling, and much more.
Nov 28, 2022
Engineering Production NLP Systems at T-Mobile with Heather Nolis - #600
Today we’re joined by Heather Nolis, a principal machine learning engineer at T-Mobile. In our conversation with Heather, we explore her machine learning journey at T-Mobile, including their initial proof of concept project, which held the goal of putting their first real-time deep learning model into production. We discuss the use case, which aimed to build a customer intent model that would pull relevant information about a customer during conversations with customer support. This process has now become widely known as blank assist. We also discuss the decision to use supervised learning to solve this problem and the challenges they faced when developing a taxonomy. Finally, we explore the idea of using small models vs uber-large models, the hardware being used to stand up their infrastructure, and how Heather thinks about the age-old question of build vs buy.
Nov 21, 2022
Sim2Real and Optimus, the Humanoid Robot with Ken Goldberg - #599
Today we’re joined by return guest Ken Goldberg, a professor at UC Berkeley and the chief scientist at Ambi Robotics. It’s been a few years since our initial conversation with Ken, so we spent a bit of time talking through the progress that has been made in robotics in the time that has passed. We discuss Ken’s recent work, including the paper Autonomously Untangling Long Cables, which won Best Systems Paper at the RSS conference earlier this year, covering the complexity of the problem and why it is classified as a systems challenge, as well as the advancements in hardware that made solving this problem possible. We also explore Ken’s thoughts on the push towards simulation by research entities and large tech companies, and the potential for causal modeling to find its way into robotics. Finally, we discuss the recent showcase of Optimus, Tesla and Elon Musk’s “humanoid” robot, and how far we are from it being a viable piece of technology. The complete show notes for this episode can be found at
Nov 14, 2022
The Evolution of the NLP Landscape with Oren Etzioni - #598
Today friend of the show and esteemed guest host John Bohannon is back with another great interview, this time around joined by Oren Etzioni, former CEO of the Allen Institute for AI, where he is currently an advisor. In our conversation with Oren, we discuss his philosophy as a researcher and how that has manifested in his pivot to institution builder. We also explore his thoughts on the current landscape of NLP, including the emergence of LLMs and the hype being built up around AI systems from folks like Elon Musk. Finally, we explore some of the research coming out of AI2, including Semantic Scholar, an AI-powered research tool analogous to arXiv, and the somewhat controversial Delphi project, a research prototype designed to model people’s moral judgments on a variety of everyday situations.
Nov 07, 2022
Live from TWIMLcon! The Great MLOps Debate: End-to-End ML Platforms vs Specialized Tools - #597
Over the last few years, it’s been established that your ML team needs at least some basic tooling in order to be effective, providing support for various aspects of the machine learning workflow, from data acquisition and management, to model development and optimization, to model deployment and monitoring. But how do you get there? Many tools available off the shelf, both commercial and open source, can help. At the extremes, these tools fall into one of two buckets: end-to-end platforms that try to provide support for many aspects of the ML lifecycle, and specialized tools that offer deep functionality in a particular domain or area. At TWIMLcon: AI Platforms 2022, our panelists debated the merits of these approaches in The Great MLOps Debate: End-to-End ML Platforms vs Specialized Tools.
Oct 31, 2022
Live from TWIMLcon! You're not Facebook. Architecting MLOps for B2B Use Cases with Jacopo Tagliabue - #596
Much of the way we talk and think about MLOps comes from the perspective of large consumer internet companies like Facebook or Google. If you work at a FAANG company, these approaches might work well for you. But what about if you work at one of the many small, B2B companies that stand to benefit through the use of machine learning? How should you be thinking about MLOps and the ML lifecycle in that case? In this live podcast interview from TWIMLcon: AI Platforms 2022, Sam Charrington explores these questions with Jacopo Tagliabue, whose perspectives and contributions on scaling down MLOps have served to make the field more accessible and relevant to a wider array of practitioners.
Oct 24, 2022
Building Foundational ML Platforms with Kubernetes and Kubeflow with Ali Rodell - #595
Today we’re joined by Ali Rodell, a senior director of machine learning engineering at Capital One. In our conversation with Ali, we explore his role as the head of model development platforms at Capital One, including how his 25+ years in software development have shaped his view on building platforms and the evolution of the platforms space over the last 10 years. We discuss the importance of a healthy open source tooling ecosystem, Capital One’s use of various open source capabilities like Kubeflow and Kubernetes to build out platforms, and some of the challenges that come along with modifying/customizing these tools to work for him and his teams. Finally, we explore the range of user personas that need to be accounted for when making decisions about tooling, supporting things like Jupyter notebooks and other low level tools, and how that can be potentially challenging in a highly regulated environment like the financial industry. The complete show notes for this episode can be found at
Oct 17, 2022
AI-Powered Peer Programming with Vasi Philomin - #594
Today Vasi Philomin, vice president of AI services at AWS, joins us for our first in-person interview since 2019! In our conversation with Vasi, we discussed the recently released Amazon Code Whisperer, a developer-focused coding companion. We begin by exploring Vasi’s role and the various products under the banner of cognitive and non-cognitive services, how those came together, where Code Whisperer fits into the equation, and some of the differences between Code Whisperer and some of the other recently released coding companions like GitHub Copilot. We also discuss the training corpus for the model, how they’ve dealt with the potential issues of bias that arise when training LLMs with crawled web data, and Vasi’s thoughts on what the path of innovation looks like for Code Whisperer. At the end of our conversation, Vasi was gracious enough to share a quick live demo of Code Whisperer, so you can catch that here.
Oct 10, 2022
The Top 10 Reasons to Register for TWIMLcon: AI Platforms 2022!
TWIMLcon: AI Platforms 2022 is just a day away! If you're interested in all things MLOps and Platforms/Infrastructure technology, this is the event for you! Register now at for FREE!
Oct 03, 2022
Applied AI/ML Research at PayPal with Vidyut Naware - #593
Today we’re joined by Vidyut Naware, the director of machine learning and artificial intelligence at PayPal. As the leader of the ML/AI organization at PayPal, Vidyut is responsible for all things applied, from R&D to MLOps infrastructure. In our conversation, we explore the work being done in four major categories: hardware/compute; data; applied responsible AI; and tools, frameworks, and platforms. We also discuss their use of federated learning and delayed supervision models for use cases like anomaly detection and fraud prevention, research into quantum computing and causal inference, as well as applied use cases like graph machine learning and collusion detection. The complete show notes for this episode can be found at
Sep 26, 2022
Assessing Data Quality at Shopify with Wendy Foster - #592
Today we’re back with another installment of our Data-Centric AI series, joined by Wendy Foster, a director of engineering & data science at Shopify. In our conversation with Wendy, we explore the differences between data-centric and model-centric approaches and how they manifest at Shopify, including on her team, which is responsible for utilizing merchant and product data to assist individual vendors on the platform. We discuss how they address, maintain, and improve data quality, emphasizing the importance of coverage and “freshness” data when solving constantly evolving use cases. Finally, we discuss how data is taxonomized at the company and the challenges that present themselves when producing large-scale ML models, future use cases that Wendy expects her team to tackle, and we briefly explore Merlin, Shopify’s new ML platform (that you can hear more about at TWIMLcon!), and how it fits into the broader scope of ML at the company. The complete show notes for this episode can be found at
Sep 19, 2022
Transformers for Tabular Data at Capital One with Bayan Bruss - #591
Today we’re joined by Bayan Bruss, a Sr. director of applied ML research at Capital One. In our conversation with Bayan, we dig into his work in applying various deep learning techniques to tabular data, including taking advancements made in other areas like graph CNNs and other traditional graph mining algorithms and applying them to financial services applications. We discuss why, despite a “flood” of innovation in the field and its broad use across businesses, work on tabular data doesn’t elicit as much fanfare, Bayan’s experience with the difficulty of making deep learning work on tabular data, and what opportunities have been presented for the field with the emergence of multi-modality and transformer models. We also explore a pair of papers from Bayan’s team, focused on both transformers and transfer learning for tabular data. The complete show notes for this episode can be found at
Sep 12, 2022
Understanding Collective Insect Communication with ML, w/ Orit Peleg - #590
Today we’re joined by Orit Peleg, an assistant professor at the University of Colorado, Boulder. Orit’s work focuses on understanding the behavior of disordered living systems, by merging tools from physics, biology, engineering, and computer science. In our conversation, we discuss how Orit found herself exploring problems of swarming behaviors and their relationship to distributed computing system architecture and spiking neurons. We look at two specific areas of research, the first focused on the patterns observed in firefly species, how the data is collected, and the types of algorithms used for optimization. Finally, we look at how Orit’s research with fireflies translates to a completely different insect, the honeybee, and what the next steps are for investigating these and other insect families. The complete show notes for this episode can be found at
Sep 05, 2022
Multimodal, Multi-Lingual NLP at Hugging Face with John Bohannon and Douwe Kiela - #589
In this extra special episode of the TWIML AI Podcast, friend of the show John Bohannon leads a jam-packed conversation with Hugging Face’s recently appointed head of research Douwe Kiela. In our conversation with Douwe, we explore his role at the company, how his perception of Hugging Face has changed since joining, and what research entails at the company. We discuss the emergence of the transformer model and the emergence of BERT-ology, the recent shift to solving more multimodal problems, the importance of this subfield as one of the “Grand Directions” of Hugging Face’s research agenda, and the importance of BLOOM, the open-access Multilingual Language Model that was the output of the BigScience project. Finally, we get into how Douwe’s background in philosophy shapes his view of current projects, as well as his projections for the future of NLP and multimodal ML. The complete show notes for this episode can be found at
Aug 29, 2022
Synthetic Data Generation for Robotics with Bill Vass - #588
Today we’re joined by Bill Vass, a VP of engineering at Amazon Web Services. Bill spoke at the most recent AWS re:MARS conference, where he delivered an engineering keynote focused on some recent updates to Amazon SageMaker, including its support for synthetic data generation. In our conversation, we discussed all things synthetic data, including the importance of data quality when creating synthetic data, and some of the use cases that this data is being created for, including warehouses and, in the case of one of their more recent acquisitions, iRobot, synthetic house generation. We also explore Astro, the household robot for home monitoring, including the types of models it is running, what type of on-device sensor suite it has, the relationship between the robot and the cloud, and the role of simulation. The complete show notes for this episode can be found at
Aug 22, 2022
Multi-Device, Multi-Use-Case Optimization with Jeff Gehlhaar - #587
Today we’re joined by Jeff Gehlhaar, vice president of technology at Qualcomm Technologies. In our annual conversation with Jeff, we dig into the relationship between Jeff’s team on the product side and the research team, many of whom we’ve had on the podcast over the last few years. We discuss the challenges of real-world neural network deployment and doing quantization on-device, as well as a look at the tools that power their AI Stack. We also explore a few interesting automotive use cases, including automated driver assistance, and what advancements Jeff is looking forward to seeing in the next year. The complete show notes for this episode can be found at
Aug 15, 2022
Causal Conceptions of Fairness and their Consequences with Sharad Goel - #586
Today we close out our ICML 2022 coverage joined by Sharad Goel, a professor of public policy at Harvard University. In our conversation with Sharad, we discuss his Outstanding Paper award winner Causal Conceptions of Fairness and their Consequences, which seeks to understand what it means to apply causality to the idea of fairness in ML. We explore the two broad classes of intent that have been conceptualized under the subfield of causal fairness and how they differ, the distinct ways causality is treated in economic and statistical contexts vs a computer science and algorithmic context, and why policies created in the context of causal definitions are broadly suboptimal. The complete show notes for this episode can be found at
Aug 08, 2022
Brain-Inspired Hardware and Algorithm Co-Design with Melika Payvand - #585
Today we continue our ICML coverage joined by Melika Payvand, a research scientist at the Institute of Neuroinformatics at the University of Zurich and ETH Zurich. Melika spoke at the Hardware Aware Efficient Training (HAET) Workshop, delivering a keynote on Brain-inspired hardware and algorithm co-design for low power online training on the edge. In our conversation with Melika, we explore her work at the intersection of ML and neuroinformatics, what makes the proposed architecture “brain-inspired”, and how techniques like online learning fit into the picture. We also discuss the characteristics of the devices that are running the algorithms she’s creating, and the challenges of adapting online learning-style algorithms to this hardware. The complete show notes for this episode can be found at
Aug 01, 2022
Equivariant Priors for Compressed Sensing with Arash Behboodi - #584
Today we’re joined by Arash Behboodi, a machine learning researcher at Qualcomm Technologies. In our conversation with Arash, we explore his paper Equivariant Priors for Compressed Sensing with Unknown Orientation, which proposes using equivariant generative models as a prior and shows that signals with unknown orientations can be recovered with iterative gradient descent on the latent space of these models, with additional theoretical recovery guarantees. We discuss the differences between compression and compressed sensing, how he was able to evolve a traditional VAE architecture to understand equivariance, and some of the research areas to which he’s applying this work, including cryo-electron microscopy. We also discuss a few of the other papers that his colleagues have submitted to the conference, including Overcoming Oscillations in Quantization-Aware Training, Variational On-the-Fly Personalization, and CITRIS: Causal Identifiability from Temporal Intervened Sequences. The complete show notes for this episode can be found at
Jul 25, 2022
Managing Data Labeling Ops for Success with Audrey Smith - #583
Today we continue our Data-Centric AI Series joined by Audrey Smith, the COO at MLtwist, and a recent participant in our panel on DCAI. In our conversation, we do a deep dive into data labeling for ML, exploring the typical journey for an organization to get started with labeling, her experience when making decisions around in-house vs outsourced labeling, and what commitments need to be made to achieve high-quality labels. We discuss how organizations that have made significant investments in labelops typically function, how someone working on an in-house labeling team approaches new projects, the ethical considerations that need to be taken for remote labeling workforces, and much more! The complete show notes for this episode can be found at
Jul 18, 2022
Engineering an ML-Powered Developer-First Search Engine with Richard Socher - #582
Today we’re joined by Richard Socher, the CEO of In our conversation with Richard, we explore the inspiration and motivation behind the search engine, and how it differs from the traditional Google search engine experience. We discuss some of the various ways that machine learning is used across the platform, including how they surface relevant search results and some of the recent additions like code completion and a text generator that can write complete essays and blog posts. Finally, we talk through some of the projects we covered in our last conversation with Richard, namely his work on Salesforce’s AI Economist project. The complete show notes for this episode can be found at
Jul 11, 2022
On The Path Towards Robot Vision with Aljosa Osep - #581
Today we wrap up our coverage of the 2022 CVPR conference joined by Aljosa Osep, a postdoc at the Technical University of Munich & Carnegie Mellon University. In our conversation with Aljosa, we explore his broader research interests in achieving robot vision, and his vision for what it will look like when that goal is achieved. The first paper we dig into is Text2Pos: Text-to-Point-Cloud Cross-Modal Localization, which proposes a cross-modal localization module that learns to align textual descriptions with localization cues in a coarse-to-fine manner. Next up, we explore the paper Forecasting from LiDAR via Future Object Detection, which proposes an end-to-end approach for detection and motion forecasting based on raw sensor measurement as opposed to ground truth tracks. Finally, we discuss Aljosa’s third and final paper Opening up Open-World Tracking, which proposes a new benchmark to analyze existing efforts in multi-object tracking and constructs a baseline for these tasks. The complete show notes for this episode can be found at
Jul 04, 2022
More Language, Less Labeling with Kate Saenko - #580
Today we continue our CVPR series joined by Kate Saenko, an associate professor at Boston University and a consulting professor for the MIT-IBM Watson AI Lab. In our conversation with Kate, we explore her research in multimodal learning, which she spoke about at the Multimodal Learning and Applications Workshop, one of a whopping 6 workshops she spoke at. We discuss the emergence of multimodal learning, the current research frontier, and Kate’s thoughts on the inherent bias in LLMs and how to deal with it. We also talk through some of the challenges that come up when building out applications, including the cost of labeling, and some of the methods she’s had success with. Finally, we discuss Kate’s perspective on the monopolizing of computing resources for “foundational” models, and her paper Unsupervised Domain Generalization by learning a Bridge Across Domains. The complete show notes for this episode can be found at
Jun 27, 2022
Optical Flow Estimation, Panoptic Segmentation, and Vision Transformers with Fatih Porikli - #579
Today we kick off our annual coverage of the CVPR conference joined by Fatih Porikli, Senior Director of Engineering at Qualcomm AI Research. In our conversation with Fatih, we explore a trio of CVPR-accepted papers, as well as a pair of upcoming workshops at the event. The first paper, Panoptic, Instance and Semantic Relations: A Relational Context Encoder to Enhance Panoptic Segmentation, presents a novel framework to integrate semantic and instance contexts for panoptic segmentation. Next up, we discuss Imposing Consistency for Optical Flow Estimation, a paper that introduces novel and effective consistency strategies for optical flow estimation. The final paper we discuss is IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes, which proposes a transformer architecture to simultaneously estimate depths, normals, spatially-varying albedo, roughness, and lighting from a single image of an indoor scene. For each paper, we explore the motivations and challenges and get concrete examples to demonstrate each problem and solution presented. The complete show notes for this episode can be found at
Jun 20, 2022
Data Governance for Data Science with Adam Wood - #578
Today we’re joined by Adam Wood, Director of Data Governance and Data Quality at Mastercard. In our conversation with Adam, we explore the challenges that come along with data governance at a global scale, including dealing with regional regulations like GDPR and federating records at scale. We discuss the role of feature stores in keeping track of data lineage and how Adam and his team have dealt with the challenges of metadata management, how large organizations like Mastercard are dealing with enabling feature reuse, and the steps they take to alleviate bias, especially in scenarios like acquisitions. Finally, we explore data quality for data science and why Adam sees it as an encouraging area of growth within the company, as well as the investments they’ve made in tooling around data management, catalog, feature management, and more. The complete show notes for this episode can be found at
Jun 13, 2022
Feature Platforms for Data-Centric AI with Mike Del Balso - #577
In the latest installment of our Data-Centric AI series, we’re joined by a friend of the show Mike Del Balso, Co-founder and CEO of Tecton. If you’ve heard any of our other conversations with Mike, you know we spend a lot of time discussing feature stores, or as he now refers to them, feature platforms. We explore the current complexity of data infrastructure broadly and how that has changed over the last five years, as well as the maturation of streaming data platforms. We discuss the wide vs deep paradox that exists around ML tooling, and the idea around the “ML Flywheel”, a strategy that leverages data to accelerate machine learning. Finally, we spend time discussing internal ML team construction, some of the challenges that organizations face when building their ML platforms teams, and how they can avoid the pitfalls as they arise. The complete show notes for this episode can be found at
Jun 06, 2022
The Fallacy of "Ground Truth" with Shayan Mohanty - #576
Today we continue our Data-centric AI series joined by Shayan Mohanty, CEO at Watchful. In our conversation with Shayan, we focus on the data labeling aspect of the machine learning process, and ways that a data-centric approach could add value and reduce cost by multiple orders of magnitude. Shayan helps us define “data-centric”, while discussing the main challenges that organizations face when dealing with labeling, how these problems are currently being solved, and how techniques like active learning and weak supervision could be used to more effectively label. We also explore the idea of machine teaching, which focuses on using techniques that make the model training process more efficient, and what organizations need to be successful when trying to make the aforementioned mindset shift to DCAI.  The complete show notes for this episode can be found at
May 30, 2022
Principle-centric AI with Adrien Gaidon - #575
This week, we continue our conversations around the topic of Data-Centric AI joined by a friend of the show Adrien Gaidon, the head of ML research at the Toyota Research Institute (TRI). In our chat, Adrien expresses a fourth, somewhat contrarian, viewpoint to the three prominent schools of thought that organizations tend to fall into, as well as a great story about how the breakthrough came via an unlikely source. We explore his principle-centric approach to machine learning as well as the role of self-supervised machine learning and synthetic data in this and other research threads. Make sure you’re following along with the entire DCAI series at The complete show notes for this episode can be found at
May 23, 2022
Data Debt in Machine Learning with D. Sculley - #574
Today we kick things off with a conversation with D. Sculley, a director on the Google Brain team. Many listeners of today’s show will know D. from his work on the paper, The Hidden Technical Debt in Machine Learning Systems, and of course, the infamous diagram. D. has recently translated the idea of technical debt into data debt, something we spend a bit of time on in the interview. We discuss his view of the concept of DCAI, where debt fits into the conversation of data quality, and what a shift towards data-centrism looks like in a world of increasingly large models, e.g. GPT-3 and the recent PaLM models. We also explore common sources of data debt, what the community can do and has done to mitigate these issues, the usefulness of causal inference graphs in this work, and much more! If you enjoyed this interview or want to hear more on this topic, check back on the DCAI series page weekly at The complete show notes for this episode can be found at
May 19, 2022
AI for Enterprise Decisioning at Scale with Rob Walker - #573
Today we’re joined by Rob Walker, VP of decisioning & analytics and GM of one-to-one customer engagement at Pegasystems. Rob, who you might know from his previous appearances on the podcast, joins us to discuss his work on AI and ML in the context of customer engagement and decisioning, and the various problems that need to be solved, including the “next best” problem. We explore the distinction between the idea of the next best action and determining it from a recommender system, how machine learning and heuristics currently co-exist in engagements, scaling model evaluation, and some of the challenges they’re facing when dealing with problems of responsible AI and how they’re managed. Finally, we spend a few minutes digging into the upcoming PegaWorld conference, and what attendees should anticipate at the event. The complete show notes for this episode can be found at
May 16, 2022
Data Rights, Quantification and Governance for Ethical AI with Margaret Mitchell - #572
Today we close out our coverage of the ICLR series joined by Meg Mitchell, chief ethics scientist and researcher at Hugging Face. In our conversation with Meg, we discuss her participation in the WikiM3L Workshop, as well as her transition into her new role at Hugging Face, which has afforded her the ability to prioritize coding in her work around AI ethics. We explore her thoughts on the work happening in the fields of data curation and data governance, her interest in the inclusive sharing of datasets and creation of models that don't disproportionately underperform or exploit subpopulations, and how data collection practices have changed over the years.  We also touch on changes to data protection laws happening in some pretty uncertain places, the evolution of her work on Model Cards, and how she’s using this and recent Data Cards work to lower the barrier to entry to responsibly informed development of data and sharing of data. The complete show notes for this episode can be found at
May 12, 2022
Studying Machine Intelligence with Been Kim - #571
Today we continue our ICLR coverage joined by Been Kim, a staff research scientist at Google Brain, and an ICLR 2022 Invited Speaker. Been, whose research has historically been focused on interpretability in machine learning, delivered the keynote Beyond interpretability: developing a language to shape our relationships with AI, which explores the need to study AI machines as scientific objects, in isolation and with humans, which will provide principles for tools, but also is necessary to take our working relationship with AI to the next level.  Before we dig into Been’s talk, she characterizes where we are as an industry and community with interpretability, and what the current state of the art is for interpretability techniques. We explore how the Gestalt principles appear in neural networks, Been’s choice to characterize communication with machines as a language as opposed to a set of principles or foundational understanding, and much much more. The complete show notes for this episode can be found at
May 09, 2022
Advances in Neural Compression with Auke Wiggers - #570
Today we’re joined by Auke Wiggers, an AI research scientist at Qualcomm. In our conversation with Auke, we discuss his team’s recent research on data compression using generative models. We discuss the relationship between historical compression research and the current trend of neural compression, and the benefit of neural codecs, which learn to compress data from examples. We also explore the performance evaluation process and the recent developments that show that these models can operate in real-time on a mobile device. Finally, we discuss another ICLR paper, “Transformer-based transform coding”, that proposes a vision transformer-based architecture for image and video coding, and some of his team’s other accepted works at the conference.  The complete show notes for this episode can be found at
May 02, 2022
Mixture-of-Experts and Trends in Large-Scale Language Modeling with Irwan Bello - #569
Today we’re joined by Irwan Bello, formerly a research scientist at Google Brain, and now on the founding team at a stealth AI startup. We begin our conversation with an exploration of Irwan’s recent paper, Designing Effective Sparse Expert Models, which acts as a design guide for building sparse large language model architectures. We discuss mixture of experts as a technique, the scalability of this method, its applicability beyond NLP tasks, and the data sets this experiment was benchmarked against. We also explore Irwan’s interest in the research areas of alignment and retrieval, talking through interesting lines of work for each area including instruction tuning and direct alignment. The complete show notes for this episode can be found at
Apr 25, 2022
Daring to DAIR: Distributed AI Research with Timnit Gebru - #568
Today we’re joined by friend of the show Timnit Gebru, the founder and executive director of DAIR, the Distributed Artificial Intelligence Research Institute. In our conversation with Timnit, we discuss her journey to create DAIR, their goals and some of the challenges she’s faced along the way. We start in the obvious place, Timnit being “resignated” from Google after writing and publishing a paper detailing the dangers of large language models, the fallout from that paper and her firing, and the eventual founding of DAIR. We discuss the importance of the “distributed” nature of the institute, how they’re going about figuring out what is in scope and out of scope for the institute’s research charter, and what building an institution means to her. We also explore the importance of independent alternatives to traditional research structures, if we should be pessimistic about the impact of internal ethics and responsible AI teams in industry due to the overwhelming power they wield, examples she looks to of what not to do when building out the institute, and much much more! The complete show notes for this episode can be found at
Apr 18, 2022
Hierarchical and Continual RL with Doina Precup - #567
Today we’re joined by Doina Precup, a research team lead at DeepMind Montreal, and a professor at McGill University. In our conversation with Doina, we discuss her recent research interests, including her work in hierarchical reinforcement learning, with the goal being agents learning abstract representations, especially over time. We also explore her work on reward specification for RL agents, where she hypothesizes that a reward signal in a complex environment could lead an agent to develop attributes of intuitive intelligence. We also dig into quite a few of her papers, including On the Expressivity of Markov Reward, which won a NeurIPS 2021 outstanding paper award. Finally, we discuss the analogy between hierarchical RL and CNNs, her work in continual RL, and her thoughts on the evolution of RL in the recent past and present, and the biggest challenges facing the field going forward. The complete show notes for this episode can be found at
Apr 11, 2022
Open-Source Drug Discovery with DeepChem with Bharath Ramsundar - #566
Today we’re joined by Bharath Ramsundar, founder and CEO of Deep Forest Sciences. In our conversation with Bharath, we explore his work on DeepChem, an open-source library for drug discovery, materials science, quantum chemistry, and biology tools. We discuss the challenges that biotech and pharmaceutical companies are facing as they attempt to incorporate AI into the drug discovery process, where the innovation frontier is, and what the promise is for AI in this field in the near term. We also dig into the origins of DeepChem and the problems it's solving for practitioners, the capabilities that are enabled when using this library as opposed to others, and MoleculeNet, a dataset and benchmark focused on molecular design that lives within the DeepChem suite. The complete show notes for this episode can be found at
Apr 04, 2022
Advancing Hands-On Machine Learning Education with Sebastian Raschka - #565
Today we’re joined by Sebastian Raschka, an assistant professor at the University of Wisconsin-Madison and lead AI educator at In our conversation with Sebastian, we explore his work around AI education, including the “hands-on” philosophy that he takes when building these courses, his recent book Machine Learning with PyTorch and Scikit-Learn, his advice to beginners in the field when they’re trying to choose tools and frameworks, and more. We also discuss his work on PyTorch Lightning, a platform that allows users to organize their code and integrate it into other technologies, before switching gears and discussing his recent research efforts around ordinal regression, including a ton of great references that we’ll link on the show notes page below! The complete show notes for this episode can be found at
Mar 28, 2022
Big Science and Embodied Learning at Hugging Face 🤗 with Thomas Wolf - #564
Today we’re joined by Thomas Wolf, co-founder and chief science officer at Hugging Face 🤗. We cover a ton of ground in our conversation, starting with Thomas’ interesting backstory as a quantum physicist and patent lawyer, and how that led him to a career in machine learning. We explore how Hugging Face began, what the current direction is for the company, and how much of their focus is NLP and language models versus other disciplines. We also discuss the BigScience project, a year-long research workshop where 1000+ researchers of all backgrounds and disciplines have come together to create an 800GB multilingual dataset and model. We talk through their approach to curating the dataset, model evaluation at this scale, and how they differentiate their work from projects like EleutherAI. Finally, we dig into Thomas’ work on multimodality, his thoughts on the metaverse, his new book NLP with Transformers, and much more! The complete show notes for this episode can be found at
Mar 21, 2022
Full-Stack AI Systems Development with Murali Akula - #563
Today we’re joined by Murali Akula, a Sr. director of Software Engineering at Qualcomm. In our conversation with Murali, we explore his role at Qualcomm, where he leads the corporate research team focused on the development and deployment of AI onto Snapdragon chips, their unique definition of “full stack”, and how that philosophy permeates into every step of the software development process. We explore the complexities that are unique to doing machine learning on resource constrained devices, some of the techniques that are being applied to get complex models working on mobile devices, and the process for taking these models from research into real-world applications. We also discuss a few more tools and recent developments, including DONNA for neural architecture search, X-Distill, a method of improving the self-supervised training of monocular depth, and the AI Model Efficiency Toolkit, a library that provides advanced quantization and compression techniques for trained neural network models. The complete show notes for this episode can be found at
Mar 14, 2022
100x Improvements in Deep Learning Performance with Sparsity, w/ Subutai Ahmad - #562
Today we’re joined by Subutai Ahmad, VP of research at Numenta. While we’ve had numerous conversations about the biological inspirations of deep learning models with folks working at the intersection of deep learning and neuroscience, we dig into uncharted territory with Subutai. We set the stage by digging into some of the fundamental ideas behind Numenta’s research and the present landscape of neuroscience, before exploring our first big topic of the podcast: the cortical column. Cortical columns are a group of neurons in the cortex of the brain which have nearly identical receptive fields; we discuss the behavior of these columns, why they’re a structure worth mimicking computationally, how far along we are in understanding the cortical column, and how these columns relate to neurons. We also discuss what it means for a model to have inherent 3D understanding and for computational models to be inherently sensory motor, and where we are with these lines of research. Finally, we dig into our other big idea, sparsity. We explore the fundamental ideas of sparsity and the differences between sparse and dense networks, and applying sparsity and optimization to drive greater efficiency in current deep learning networks, including transformers and other large language models. The complete show notes for this episode can be found at
Mar 07, 2022
Scaling BERT and GPT for Financial Services with Jennifer Glore - #561
Today we’re joined by Jennifer Glore, VP of customer engineering at SambaNova Systems. In our conversation with Jennifer, we discuss how, and why, SambaNova, which is primarily focused on building hardware to support machine learning applications, has built a GPT language model for the financial services industry. Jennifer shares her thoughts on the progress of industries like banking and finance, as well as other traditional organizations, in their attempts at using transformers and other models, and where they’ve begun to see success, as well as some of the hidden challenges that orgs run into that impede their progress. Finally, we explore their experience replicating the GPT-3 paper from an R&D perspective, how they’re addressing issues of predictability, controllability, governance, etc., and much more. The complete show notes for this episode can be found at
Feb 28, 2022
Trends in Deep Reinforcement Learning with Kamyar Azizzadenesheli - #560
Today we’re joined by Kamyar Azizzadenesheli, an assistant professor at Purdue University, to close out our AI Rewind 2021 series! In this conversation, we focused on all things deep reinforcement learning, starting with a general overview of the direction of the field, and though it might seem to be slowing, that’s just a product of the light being shined constantly on the CV and NLP spaces. We dig into themes like the convergence of RL methodology with both robotics and control theory, as well as a few trends that Kamyar sees over the horizon, such as self-supervised learning approaches in RL. We also talk through Kamyar’s predictions for RL in 2022 and beyond. This was a fun conversation, and I encourage you to look through all the great resources that Kamyar shared on the show notes page at!
Feb 21, 2022
Deep Reinforcement Learning at the Edge of the Statistical Precipice with Rishabh Agarwal - #559
Today we’re joined by Rishabh Agarwal, a research scientist at Google Brain in Montreal. In our conversation with Rishabh, we discuss his recent paper Deep Reinforcement Learning at the Edge of the Statistical Precipice, which won an outstanding paper award at the most recent NeurIPS conference. In this paper, Rishabh and his coauthors call for a change in how deep RL performance is reported on benchmarks when using only a few runs, acknowledging that typically, DeepRL algorithms are evaluated by the performance on a large suite of tasks. Using the Atari 100k benchmark, they found substantial disparities in the conclusions from point estimates alone versus statistical analysis. We explore the reception of this paper from the research community, some of the more surprising results, what incentives researchers have to implement these types of changes in self-reporting when publishing, and much more. The complete show notes for this episode can be found at
Feb 14, 2022
Designing New Energy Materials with Machine Learning with Rafael Gomez-Bombarelli - #558
Today we’re joined by Rafael Gomez-Bombarelli, an assistant professor in the department of material science and engineering at MIT. In our conversation with Rafa, we explore his goal of fusing machine learning and atomistic simulations for designing materials, a topic he spoke about at the recent SigOpt AI & HPC Summit. We discuss the two ways in which he thinks of material design, virtual screening and inverse design, as well as the unique challenges each technique presents. We also talk through the use of generative models for simulation, the type of training data necessary for these tasks, and if he’s building hand-coded simulations vs existing packages or tools. Finally, we explore the dynamic relationship between simulation and modeling and how the results of one drive the other’s efforts, and how hyperparameter optimization gets incorporated into the various projects. The complete show notes for this episode can be found at
Feb 07, 2022
Differentiable Programming for Oceanography with Patrick Heimbach - #557
Today we’re joined by Patrick Heimbach, a professor at the University of Texas working at the intersection of ML and oceanography. In our conversation with Patrick, we explore some of the challenges of computational oceanography, the potential use cases for machine learning in this field, as well as how it can be used to support scientists in solving simulation problems, and the role of differentiable programming and how it is expressed in his work. The complete show notes for this episode can be found at
Jan 31, 2022
Trends in Machine Learning & Deep Learning with Zachary Lipton - #556
Today we continue our AI Rewind 2021 series joined by a friend of the show, assistant professor at Carnegie Mellon University, and AI Rewind veteran, Zack Lipton! In our conversation with Zack, we touch on recurring themes like “NLP Eating AI” and the recent slowdown in innovation in the field, the redistribution of resources across research problems, and where the opportunities for real breakthroughs lie. We also discuss problems facing the current peer-review system, notable research from last year like the introduction of the WILDS library, and the evolution of problems (and potential solutions) in fairness, bias, and equity. Of course, we explore some of the use cases and application areas that made notable progress in 2021, what Zack is looking forward to in 2022 and beyond, and much more! The complete show notes for this episode can be found at
Jan 27, 2022
Solving the Cocktail Party Problem with Machine Learning, w/ Jonathan Le Roux - #555
Today we’re joined by Jonathan Le Roux, a senior principal research scientist at Mitsubishi Electric Research Laboratories (MERL). At MERL, Jonathan and his team are focused on using machine learning to solve the “cocktail party problem”, focusing on not only the separation of speech from noise, but also the separation of speech from speech. In our conversation with Jonathan, we focus on his paper The Cocktail Fork Problem: Three-Stem Audio Separation For Real-World Soundtracks, which looks to separate and enhance a complex acoustic scene into three distinct categories: speech, music, and sound effects. We explore the challenges of working with such noisy data, the model architecture used to solve this problem, how ML/DL fits into solving the larger cocktail party problem, future directions for this line of research, and much more! The complete show notes for this episode can be found at
Jan 24, 2022
Machine Learning for Earthquake Seismology with Karianne Bergen - #554
Today we’re joined by Karianne Bergen, an assistant professor at Brown University. In our conversation with Karianne, we explore her work at the intersection of earthquake seismology and machine learning, where she’s working on interpretable data classification for seismology. We discuss some of the challenges that present themselves when trying to solve this problem, and the state of applying machine learning to seismological events and earth sciences. Karianne also shares her thoughts on the different relationships that computer scientists and natural scientists have with machine learning, and how to bridge that gap to create tools that work broadly for all scientists. The complete show notes for this episode can be found at
Jan 20, 2022
The New DBfication of ML/AI with Arun Kumar - #553
Today we’re joined by Arun Kumar, an associate professor at UC San Diego. We had the pleasure of catching up with Arun prior to the Workshop on Databases and AI at NeurIPS 2021, where he delivered the talk “The New DBfication of ML/AI.” In our conversation, we explore this “database-ification” of machine learning, a concept analogous to the transformation of relational SQL computation. We discuss the relationship between the ML and database fields and how the merging of the two could have positive outcomes for the end-to-end ML workflow, and a few tools that his team has developed, Cerebro, a tool for reproducible model selection, and SortingHat, a tool for automating data prep, and how tools like these and others affect Arun’s outlook on the future of machine learning platforms and MLOps. The complete show notes for this episode can be found at
Jan 17, 2022
Building Public Interest Technology with Meredith Broussard - #552
Today we’re joined by Meredith Broussard, an associate professor at NYU & research director at the NYU Alliance for Public Interest Technology. Meredith was a keynote speaker at the recent NeurIPS conference, and we had the pleasure of speaking with her to discuss her talk from the event, and her upcoming book, tentatively titled More Than A Glitch: What Everyone Needs To Know About Making Technology Anti-Racist, Accessible, And Otherwise Useful To All. In our conversation, we explore Meredith’s work in the field of public interest technology, and her view of the relationship between technology and artificial intelligence. Meredith and Sam talk through real-world scenarios where an emphasis on monitoring bias and responsibility would positively impact outcomes, and how this type of monitoring parallels the infrastructure that many organizations are already building out. Finally, we talk through the main takeaways from Meredith’s NeurIPS talk, and how practitioners can get involved in the work of building and deploying public interest technology. The complete show notes for this episode can be found at
Jan 13, 2022
A Universal Law of Robustness via Isoperimetry with Sebastien Bubeck - #551
Today we’re joined by Sebastien Bubeck, a Sr. principal research manager at Microsoft, and author of the paper A Universal Law of Robustness via Isoperimetry, a NeurIPS 2021 Outstanding Paper Award recipient. We begin our conversation with Sebastien with a bit of a primer on convex optimization, a topic that hasn’t come up much in previous interviews. We explore the problem that convex optimization is trying to solve, the application of convex optimization to multi-armed bandit problems, metrical task systems and solving the k-server problem. We then dig into Sebastien’s paper, which looks to prove that for a broad class of data distributions and model classes, overparameterization is necessary if one wants to interpolate the data. Finally, we discuss the relationship between the paper and the work being done in the adversarial robustness community. The complete show notes for this episode can be found at
Jan 10, 2022
Trends in NLP with John Bohannon - #550
Today we’re joined by friend of the show John Bohannon, the director of science at Primer AI, to help us showcase all of the great achievements and accomplishments in NLP in 2021! In our conversation, John shares his two major takeaways from last year, 1) NLP as we know it has changed, and we’re back into the incremental phase of the science, and 2) NLP is “eating” the rest of machine learning. We explore the implications of these two major themes across the discipline, as well as best papers, up and coming startups, great things that did happen, and even a few bad things that didn’t. Finally, we explore what 2022 and beyond will look like for NLP, from multilingual NLP to use cases for the influx of large auto-regressive language models like GPT-3 and others, as well as ethical implications that are reverberating across domains and the changes that have been ushered in in that vein. The complete show notes for this episode can be found at
Jan 06, 2022
Trends in Computer Vision with Georgia Gkioxari - #549
Happy New Year! We’re excited to kick off 2022 joined by Georgia Gkioxari, a research scientist at Meta AI, to showcase the best advances in the field of computer vision over the past 12 months, and what the future holds for this domain. Welcome back to AI Rewind! In our conversation Georgia highlights the emergence of the transformer model in CV research, what kind of performance results we’re seeing vs CNNs, and the immediate impact of NeRF, amongst a host of other great research. We also explore what ImageNet’s place is in the current landscape, and if it's time to make big changes to push the boundaries of what is possible with image, video and even 3D data, with challenges like the Metaverse, amongst others, on the horizon. Finally, we touch on the startups to keep an eye on, the collaborative efforts of software and hardware researchers, and the vibe of the “ImageNet moment” being upon us once again. The complete show notes for this episode can be found at
Jan 03, 2022
Kids Run the Darndest Experiments: Causal Learning in Children with Alison Gopnik - #548
Today we close out the 2021 NeurIPS series joined by Alison Gopnik, a professor at UC Berkeley and an invited speaker at the Causal Inference & Machine Learning: Why now? Workshop. In our conversation with Alison, we explore the question, “how is it that we can know so much about the world around us from so little information?,” and how her background in psychology, philosophy, and epistemology has guided her along the path to finding this answer through the actions of children. We discuss the role of causality as a means to extract representations of the world and how the “theory theory” came about, and how it was demonstrated to have merit. We also explore the complexity of causal relationships that children are able to deal with and what that can tell us about our current ML models, how the training and inference stages of the ML lifecycle are akin to childhood and adulthood, and much more! The complete show notes for this episode can be found at
Dec 27, 2021
Hypergraphs, Simplicial Complexes and Graph Representations of Complex Systems with Tina Eliassi-Rad - #547
Today we continue our NeurIPS coverage joined by Tina Eliassi-Rad, a professor at Northeastern University, and an invited speaker at the I Still Can't Believe It's Not Better! Workshop. In our conversation with Tina, we explore her research at the intersection of network science, complex networks, and machine learning, how graphs are used in her work and how it differs from typical graph machine learning use cases. We also discuss her talk from the workshop, “The Why, How, and When of Representations for Complex Systems”, in which Tina argues that one of the reasons practitioners have struggled to model complex systems is because of the lack of connection to the data sourcing and generation process. This is definitely a NERD ALERT approved interview! The complete show notes for this episode can be found at
Dec 23, 2021
Deep Learning, Transformers, and the Consequences of Scale with Oriol Vinyals - #546
Today we’re excited to kick off our annual NeurIPS coverage, joined by Oriol Vinyals, the lead of the deep learning team at DeepMind. We cover a lot of ground in our conversation with Oriol, beginning with a look at his research agenda and why the scope has remained wide even through the maturity of the field, his thoughts on transformer models and if they will get us beyond the current state of DL, or if some other model architecture would be more advantageous. We also touch on his thoughts on the large language models craze, before jumping into his recent paper StarCraft II Unplugged: Large Scale Offline Reinforcement Learning, a follow up to their popular AlphaStar work from a few years ago. Finally, we discuss the degree to which the work that DeepMind and others are doing around games actually translates into real-world, non-game scenarios, recent work on multimodal few-shot learning, and we close with a discussion of the consequences of the level of scale that we’ve achieved thus far. The complete show notes for this episode can be found at
Dec 20, 2021
Optimization, Machine Learning and Intelligent Experimentation with Michael McCourt - #545
Today we’re joined by Michael McCourt, the head of engineering at SigOpt. In our conversation with Michael, we explore the vast space around the topic of optimization, including the technical differences between ML and optimization and where they’re applied, what the path to increasing complexity looks like for a practitioner and the relationship between optimization and active learning. We also discuss the research frontier for optimization and how folks think about the interesting challenges and open questions for this field, how optimization approaches appeared at the latest NeurIPS conference, and Mike’s excitement for the emergence of interdisciplinary work between the machine learning community and other fields like the natural sciences. The complete show notes for this episode can be found at
Dec 16, 2021
Jupyter and the Evolution of ML Tooling with Brian Granger - #544
Today we conclude our AWS re:Invent coverage joined by Brian Granger, a senior principal technologist at Amazon Web Services, and a co-creator of Project Jupyter. In our conversation with Brian, we discuss the inception and early vision of Project Jupyter, including how the explosion of machine learning and deep learning shifted the landscape for the notebook, and how they balanced the needs of these new user bases vs their existing community of scientific computing users. We also explore AWS’s role with Jupyter and why they’ve decided to invest resources in the project, Brian's thoughts on the broader ML tooling space, and how they’ve applied (and the impact of) HCI principles to the building of these tools. Finally, we dig into the recent SageMaker Canvas and Studio Lab releases and Brian’s perspective on the future of notebooks and the Jupyter community at large. The complete show notes for this episode can be found at
Dec 13, 2021
Creating a Data-Driven Culture at ADP with Jack Berkowitz - #543
Today we continue our 2021 re:Invent series joined by Jack Berkowitz, chief data officer at ADP. In our conversation with Jack, we explore the ever evolving role and growth of machine learning at the company, from the evolution of their ML platform, to the unique team structure. We discuss Jack’s perspective on data governance, the broad use cases for ML, how they approached the decision to move to the cloud, and the impact of scale in the way they deal with data. Finally, we touch on where innovation comes from at ADP, and the challenge of getting the talent it needs to innovate as a large “legacy” company. The complete show notes for this episode can be found at
Dec 09, 2021
re:Invent Roundup 2021 with Bratin Saha - #542
Today we’re joined by Bratin Saha, vice president and general manager at Amazon. In our conversation with Bratin, we discuss quite a few of the recent ML-focused announcements coming out of last week’s re:Invent conference, including new products like Canvas and Studio Lab, as well as upgrades to existing services like Ground Truth Plus. We explore what no-code environments like the aforementioned Canvas mean for the democratization of ML tooling, and some of the key challenges to delivering it as a consumable product. We also discuss industrialization as a subset of MLOps, and how customer patterns inform the creation of these tools, and much more! The complete show notes for this episode can be found at
Dec 06, 2021
Multi-modal Deep Learning for Complex Document Understanding with Doug Burdick - #541
Today we’re joined by Doug Burdick, a principal research staff member at IBM Research. In a recent interview, Doug’s colleague Yunyao Li joined us to talk through some of the broader enterprise NLP problems she’s working on. One of those problems is making documents machine consumable, especially with the traditionally archival file type, the PDF. That’s where Doug and his team come in. In our conversation, we discuss the multimodal approach they’ve taken to identify, interpret, contextualize and extract things like tables from a document, the challenges they’ve faced when dealing with the tables and how they evaluate the performance of models on tables. We also explore how he’s handled generalizing across different formats, how fine-tuning has to be in order to be effective, the problems that appear on the NLP side of things, and how deep learning models are being leveraged within the group. The complete show notes for this episode can be found at
Dec 02, 2021
Predictive Maintenance Using Deep Learning and Reliability Engineering with Shayan Mortazavi - #540
Today we’re joined by Shayan Mortazavi, a data science manager at Accenture. In our conversation with Shayan, we discuss his talk from the recent SigOpt HPC & AI Summit, titled A Novel Framework Predictive Maintenance Using DL and Reliability Engineering. In the talk, Shayan proposes a novel deep learning-based approach for prognosis prediction of oil and gas plant equipment in an effort to prevent critical damage or failure. We explore the evolution of reliability engineering, the decision to use a residual-based approach rather than traditional anomaly detection to determine when an anomaly was happening, the challenges of using LSTMs when building these models, the amount of human labeling required to build the models, and much more! The complete show notes for this episode can be found at
Nov 29, 2021
Building a Deep Tech Startup in NLP with Nasrin Mostafazadeh - #539
Today we’re joined by friend-of-the-show Nasrin Mostafazadeh, co-founder of Verneek.  Though Verneek is still in stealth, Nasrin was gracious enough to share a bit about the company, including their goal of enabling anyone to make data-informed decisions without the need for a technical background, through the use of innovative human-machine interfaces. In our conversation, we explore the state of AI research in the domains relevant to the problem they’re trying to solve and how they use those insights to inform and prioritize their research agenda. We also discuss what advice Nasrin would give to someone thinking about starting a deep tech startup or going from research to product development.  The complete show notes for today’s show can be found at
Nov 24, 2021
Models for Human-Robot Collaboration with Julie Shah - #538
Today we’re joined by Julie Shah, a professor at the Massachusetts Institute of Technology (MIT). Julie’s work lies at the intersection of aeronautics, astronautics, and robotics, with a specific focus on collaborative and interactive robotics. In our conversation, we explore how robots would achieve the ability to predict what their human collaborators are thinking, what the process of building knowledge into these systems looks like, and her big picture idea of developing a field robot that doesn’t “require a human to be a robot” to work with it. We also discuss work Julie has done on cross-training between humans and robots with the focus on getting them to co-learn how to work together, as well as future projects that she’s excited about. The complete show notes for this episode can be found at
Nov 22, 2021
Four Key Tools for Robust Enterprise NLP with Yunyao Li - #537
Today we’re joined by Yunyao Li, a senior research manager at IBM Research.  Yunyao is in a somewhat unique position at IBM, addressing the challenges of enterprise NLP in a traditional research environment, while also having customer engagement responsibilities. In our conversation with Yunyao, we explore the challenges associated with productizing NLP in the enterprise, and if she focuses on solving these problems independent of one another, or through a more unified approach.  We then ground the conversation with real-world examples of these enterprise challenges, including enabling level document discovery at scale using combinations of techniques like deep neural networks and supervised and/or unsupervised learning, and entity extraction and semantic parsing to identify text. Finally, we talk through data augmentation in the context of NLP, and how we enable the humans in-the-loop to generate high-quality data. The complete show notes for this episode can be found at
Nov 18, 2021
Machine Learning at GSK with Kim Branson - #536
Today we’re joined by Kim Branson, the SVP and global head of artificial intelligence and machine learning at GSK.  We cover a lot of ground in our conversation, starting with a breakdown of GSK’s core pharmaceutical business, and how ML/AI fits into that equation, use cases that appear using genetics data as a data source, including sequential learning for drug discovery. We also explore the 500 billion node knowledge graph Kim’s team built to mine scientific literature, and their “AI Hub”, the ML/AI infrastructure team that handles all tooling and engineering problems within their organization. Finally, we explore their recent cancer research collaboration with King’s College, which is tasked with understanding the individualized needs of high- and low-risk cancer patients using ML/AI amongst other technologies.  The complete show notes for this episode can be found at
Nov 15, 2021
The Benefit of Bottlenecks in Evolving Artificial Intelligence with David Ha - #535
Today we’re joined by David Ha, a research scientist at Google.  In nature, there are many examples of “bottlenecks”, or constraints, that have shaped our development as a species. Building upon this idea, David posits that these same evolutionary bottlenecks could work when training neural network models as well. In our conversation with David, we cover a TON of ground, including the aforementioned biological inspiration for his work, then digging deeper into the different types of constraints he’s applied to ML systems. We explore abstract generative models and how advanced training agents inside of generative models has become, and quite a few papers including Neuroevolution of self-interpretable agents, World Models and Attention for Reinforcement Learning, and The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning. This interview is Nerd Alert certified, so get your notes ready!  PS. David is one of our favorite follows on Twitter (@hardmaru), so check him out and share your thoughts on this interview and his work! The complete show notes for this episode can be found at
Nov 11, 2021
Facebook Abandons Facial Recognition. Should Everyone Else Follow Suit? With Luke Stark - #534
Today we’re joined by Luke Stark, an assistant professor at Western University in London, Ontario. In our conversation with Luke, we explore the existence and use of facial recognition technology, something Luke has been critical of in his work over the past few years, comparing it to plutonium. We discuss Luke’s recent paper, “Physiognomic Artificial Intelligence”, in which he critiques studies that will attempt to use faces and facial expressions and features to make determinations about people, a practice fundamental to facial recognition, also one that Luke believes is inherently racist at its core. Finally, we briefly discuss the recent wave of hires at the FTC, and the news that broke (mid-recording) announcing that Facebook will be shutting down their facial recognition system and why it's not necessarily the game-changing announcement it seemed on its… face. The complete show notes for this episode can be found at
Nov 08, 2021
Building Blocks of Machine Learning at LEGO with Francesc Joan Riera - #533
Today we’re joined by Francesc Joan Riera, an applied machine learning engineer at The LEGO Group. In our conversation, we explore the ML infrastructure at LEGO, specifically around two use cases, content moderation and user engagement. Content moderation is not a new or novel task, but because their apps and products are marketed towards children, their need for heightened levels of moderation makes it very interesting. We discuss if the moderation system is built specifically to weed out bad actors or passive behaviors, if their system has a human-in-the-loop component, why they built a feature store as opposed to a traditional database, and challenges they faced along that journey. We also talk through the range of skill sets on their team, the use of MLflow for experimentation, the adoption of AWS for serverless, and so much more! The complete show notes for this episode can be found at
Nov 04, 2021
Exploring the FastAI Tooling Ecosystem with Hamel Husain - #532
Today we’re joined by Hamel Husain, Staff Machine Learning Engineer at GitHub. Over the last few years, Hamel has had the opportunity to work on some of the most popular open source projects in the ML world, including, nbdev, fastpages, and fastcore, just to name a few. In our conversation with Hamel, we discuss his journey into Silicon Valley, and how he discovered that the ML tooling and infrastructure weren’t quite as advanced as he’d assumed, and how that led him to help build some of the foundational pieces of Airbnb’s Bighead Platform. We also spend time exploring Hamel’s time working with Jeremy Howard and the team creating, how nbdev came about, and how it plans to change the way practitioners interact with traditional Jupyter notebooks. Finally, we talk through a few more tools in the ecosystem, fastpages, fastcore, how these tools interact with GitHub Actions, and the up and coming ML tools that Hamel is excited about. The complete show notes for this episode can be found at
Nov 01, 2021
Multi-task Learning for Melanoma Detection with Julianna Ianni - #531
In today’s episode, we are joined by Julianna Ianni, vice president of AI research & development at Proscia. In our conversation, Julianna shares her and her team’s research focused on developing applications that would help make the life of pathologists easier by enabling cases to be diagnosed quickly and accurately using deep learning and AI. We also explore their paper “A Pathology Deep Learning System Capable of Triage of Melanoma Specimens Utilizing Dermatopathologist Consensus as Ground Truth”, while talking through how ML aids pathologists in diagnosing melanoma by building a multitask classifier to distinguish between low-risk and high-risk cases. Finally, we discuss the challenges involved in designing a model that would help in identifying and classifying melanoma, the results they’ve achieved, and what the future of this work could look like. The complete show notes for this episode can be found at
Oct 28, 2021
House Hunters: Machine Learning at Redfin with Akshat Kaul - #530
Today we’re joined by Akshat Kaul, the head of data science and machine learning at Redfin. We’re all familiar with Redfin, but did you know that it is the largest real estate brokerage site in the US? In our conversation with Akshat, we discuss the history of ML at Redfin and a few of the key use cases that ML is currently being applied to, including recommendations, price estimates, and their “hot homes” feature. We explore their recent foray into building their own internal platform, which they’ve coined “Redeye”, how they’ve built Redeye to support modeling across the business, and how Akshat thinks about the role of the cloud when building and delivering their platform. Finally, we discuss the impact the pandemic has had on ML at the company, and Akshat’s vision for the future of their platform and machine learning at the company more broadly.  The complete show notes for this episode can be found at
Oct 26, 2021
Attacking Malware with Adversarial Machine Learning, w/ Edward Raff - #529
Today we’re joined by Edward Raff, chief scientist and head of the machine learning research group at Booz Allen Hamilton. Edward’s work sits at the intersection of machine learning and cybersecurity, with a particular interest in malware analysis and detection. In our conversation, we look at the evolution of adversarial ML over the last few years before digging into Edward’s recently released paper, Adversarial Transfer Attacks With Unknown Data and Class Overlap. In this paper, Edward and his team explore the use of adversarial transfer attacks and how they’re able to lower their success rate by simulating class disparity. Finally, we talk through quite a few future directions for adversarial attacks, including his interest in graph neural networks. The complete show notes for this episode can be found at
Oct 21, 2021
Learning to Ponder: Memory in Deep Neural Networks with Andrea Banino - #528
Today we’re joined by Andrea Banino, a research scientist at DeepMind. In our conversation with Andrea, we explore his interest in artificial general intelligence by way of episodic memory, the relationship between memory and intelligence, the challenges of applying memory in the context of neural networks, and how to overcome problems of generalization.  We also discuss his work on PonderNet, a neural network that “budgets” its computational investment in solving a problem according to the inherent complexity of the problem, the impetus and goals of this research, and how PonderNet connects to his memory research.  The complete show notes for this episode can be found at
Oct 18, 2021
Advancing Deep Reinforcement Learning with NetHack, w/ Tim Rocktäschel - #527
Today we’re joined by Tim Rocktäschel, a research scientist at Facebook AI Research and an associate professor at University College London (UCL).  Tim’s work focuses on training RL agents in simulated environments, with the goal of these agents being able to generalize to novel situations. Typically, this is done in environments like OpenAI Gym, MuJoCo, or even using Atari games, but these all come with constraints. In Tim’s approach, he utilizes a game called NetHack, which is much richer and more complex than the aforementioned environments.   In our conversation with Tim, we explore the ins and outs of using NetHack as a training environment, including how much control a user has when generating each individual game and the challenges he's faced when deploying the agents. We also discuss his work on MiniHack, an environment creation framework and suite of tasks that are based on NetHack, and future directions for this research. The complete show notes for this episode can be found at
Oct 14, 2021
Building Technical Communities at Stack Overflow with Prashanth Chandrasekar - #526
In this special episode of the show, we’re excited to bring you our conversation with Prashanth Chandrasekar, CEO of Stack Overflow. This interview was recorded as a part of the annual Prosus AI Marketplace event.  In our discussion with Prashanth, we explore the impact the pandemic has had on Stack Overflow, how they think about community and enable collaboration among over 100 million monthly users from around the world, and some of the challenges they’ve dealt with when managing a community of this scale. We also examine where Stack Overflow is in their AI journey, use cases illustrating how they’re currently utilizing ML, what their role is in the future of AI-based code generation, what other trends they’ve picked up on over the last few years, and how they’re using those insights to forge the path forward. The complete show notes for this episode can be found at
Oct 11, 2021
Deep Learning is Eating 5G. Here’s How, w/ Joseph Soriaga - #525
Today we’re joined by Joseph Soriaga, a senior director of technology at Qualcomm.  In our conversation with Joseph, we focus on a pair of papers that he and his team will be presenting at Globecom later this year. The first, Neural Augmentation of Kalman Filter with Hypernetwork for Channel Tracking, details the use of deep learning to augment an algorithm to address mismatches in models, allowing for more efficient training and making models more interpretable and predictable. The second paper, WiCluster: Passive Indoor 2D/3D Positioning using WiFi without Precise Labels, explores the use of RF signals to infer what the environment looks like, allowing for estimation of a person’s movement.  We also discuss the ability of machine learning and AI to help enable 5G and make it more efficient for these applications, as well as the scenarios in which ML would allow for more effective delivery of connected services, and look towards what might be possible in the near future.  The complete show notes for this episode can be found at
Oct 07, 2021
Modeling Human Cognition with RNNs and Curriculum Learning, w/ Kanaka Rajan - #524
Today we’re joined by Kanaka Rajan, an assistant professor at the Icahn School of Medicine at Mount Sinai. Kanaka, who is a recent recipient of the NSF CAREER Award, bridges the gap between the worlds of biology and artificial intelligence with her work in computer science. In our conversation, we explore how she builds “lego models” of the brain that mimic biological brain functions, then reverse engineers those models to answer the question “do these follow the same operating principles that the biological brain uses?” We also discuss the relationship between memory and dynamically evolving system states, how close we are to understanding how memory actually works, how she uses RNNs for modeling these processes, and what training and data collection look like. Finally, we touch on her use of curriculum learning (where the task you want a system to learn increases in complexity slowly), and of course, we look ahead at future directions for Kanaka’s research.  The complete show notes for this episode can be found at
Oct 04, 2021
Do You Dare Run Your ML Experiments in Production? with Ville Tuulos - #523
Today we’re joined by a friend of the show and return guest Ville Tuulos, CEO and co-founder of Outerbounds. In our previous conversations with Ville, we explored his experience building and deploying the open-source framework, Metaflow, while working at Netflix. Since our last chat, Ville has embarked on a few new journeys, including writing the upcoming book Effective Data Science Infrastructure, and commercializing Metaflow, both of which we dig into quite a bit in this conversation.  We reintroduce the problem that Metaflow was built to solve and discuss some of the unique use cases that Ville has seen since its release, the relationship between Metaflow and Kubernetes, and the maturity of services like batch and lambdas allowing a complete production ML system to be delivered. Finally, we discuss the degree to which Ville is catering Outerbounds’ efforts to building tools for the MLOps community, and what the future looks like for him and Metaflow.  The complete show notes for this episode can be found at
Sep 30, 2021
Delivering Neural Speech Services at Scale with Li Jiang - #522
Today we’re joined by Li Jiang, a distinguished engineer at Microsoft working on Azure Speech.  In our conversation with Li, we discuss his journey across 27 years at Microsoft, where he’s worked on, among other things, audio and speech recognition technologies. We explore his thoughts on the advancements in speech recognition over the past few years, the challenges, and advantages, of using either end-to-end or hybrid models.  We also discuss the trade-offs between delivering accuracy or quality and the kind of runtime characteristics that you require as a service provider, in the context of engineering and delivering a service at the scale of Azure Speech. Finally, we walk through the data collection process for customizing a voice for TTS, what languages are currently supported, managing the responsibilities of threats like deep fakes, the future for services like these, and much more! The complete show notes for this episode can be found at
Sep 27, 2021
AI’s Legal and Ethical Implications with Sandra Wachter - #521
Today we’re joined by Sandra Wachter, an associate professor and senior research fellow at the University of Oxford.  Sandra’s work lies at the intersection of law and AI, focused on what she likes to call “algorithmic accountability”. In our conversation, we explore algorithmic accountability in three segments: explainability/transparency; data protection; and bias, fairness, and discrimination. We discuss how the thinking around black boxes changes when discussing applying regulation and law, as well as a breakdown of counterfactual explanations and how they’re created. We also explore why factors like the lack of oversight lead to poor self-regulation, and the conditional demographic disparity test that she helped develop to test bias in models, which was recently adopted by Amazon. The complete show notes for this episode can be found at
Sep 23, 2021
Compositional ML and the Future of Software Development with Dillon Erb - #520
Today we’re joined by Dillon Erb, CEO of Paperspace.  If you’re not familiar with Dillon, he joined us about a year ago to discuss Machine Learning as a Software Engineering Discipline; we strongly encourage you to check out that interview as well. In our conversation, we explore the idea of compositional AI, and if it is the next frontier in a string of recent game-changing machine learning developments. We also discuss a source of constant back and forth in the community around the role of notebooks, and why Paperspace made the choice to pivot towards a more traditional engineering code artifact model after building a popular notebook service. Finally, we talk through their newest release Workflows, an automation and build system for ML applications, which Dillon calls their “most ambitious and comprehensive project yet.” The complete show notes for this episode can be found at
Sep 20, 2021
Generating SQL Database Queries from Natural Language with Yanshuai Cao - #519
Today we’re joined by Yanshuai Cao, a senior research team lead at Borealis AI. In our conversation with Yanshuai, we explore his work on Turing, their natural language to SQL engine that allows users to get insights from relational databases without having to write code. We do a bit of compare and contrast with the recently released Codex Model from OpenAI, the role that reasoning plays in solving this problem, and how it is implemented in the model. We also talk through various challenges like data augmentation, the complexity of the queries that Turing can produce, and a paper that explores the explainability of this model. The complete show notes for this episode can be found at
Sep 16, 2021
Social Commonsense Reasoning with Yejin Choi - #518
Today we’re joined by Yejin Choi, a professor at the University of Washington. We had the pleasure of catching up with Yejin after her keynote interview at the recent Stanford HAI “Foundational Models” workshop. In our conversation, we explore her work at the intersection of natural language generation and common sense reasoning, including how she defines common sense, and what the current state of the world is for that research. We discuss how this could be used for creative storytelling, how transformers could be applied to these tasks, and we dig into the subfields of physical and social common sense reasoning. Finally, we talk through the future of Yejin’s research and the areas that she sees as most promising going forward.  If you enjoyed this episode, check out our conversation on AI Storytelling Systems with Mark Riedl. The complete show notes for today’s episode can be found at
Sep 13, 2021
Deep Reinforcement Learning for Game Testing at EA with Konrad Tollmar - #517
Today we’re joined by Konrad Tollmar, research director at Electronic Arts and an associate professor at KTH.  In our conversation, we explore his role as the lead of EA’s applied research team SEED and the ways that they’re applying ML/AI across popular franchises like Apex Legends, Madden, and FIFA. We break down a few papers focused on the application of ML to game testing, discussing why deep reinforcement learning is at the top of their research agenda, the differences between training on Atari games and modern 3D games, using CNNs to detect glitches in games, and of course, Konrad gives us his outlook on the future of ML for games training. The complete show notes for this episode can be found at
Sep 09, 2021
Exploring AI 2041 with Kai-Fu Lee - #516
Today we’re joined by Kai-Fu Lee, chairman and CEO of Sinovation Ventures and author of AI 2041: Ten Visions for Our Future.  In AI 2041, Kai-Fu and co-author Chen Qiufan tell the story of how AI could shape our future through a series of 10 “scientific fiction” short stories. In our conversation with Kai-Fu, we explore why he chose 20 years as the time horizon for these stories, and dig into a few of the stories in more detail. We explore the potential for level 5 autonomous driving and what effect that will have on both established and developing nations, the potential outcomes when dealing with job displacement, and his perspective on how the book will be received. We also discuss the potential consequences of autonomous weapons, if we should actually worry about singularity or superintelligence, and the evolution of regulations around AI in 20 years. We’d love to hear from you! What are your thoughts on any of the stories we discuss in the interview? Will you be checking this book out? Let us know in the comments on the show notes page at
Sep 06, 2021
Advancing Robotic Brains and Bodies with Daniela Rus - #515
Today we’re joined by Daniela Rus, director of CSAIL & Deputy Dean of Research at MIT.  In our conversation with Daniela, we explore the history of CSAIL, her role as director of one of the most prestigious computer science labs in the world, how she defines robots, and her take on the current AI for robotics landscape. We also discuss some of her recent research interests including soft robotics, adaptive control in autonomous vehicles, and a mini surgeon robot made with sausage casing(?!).  The complete show notes for this episode can be found at
Sep 02, 2021
Neural Synthesis of Binaural Speech From Mono Audio with Alexander Richard - #514
Today we’re joined by Alexander Richard, a research scientist at Facebook Reality Labs, and recipient of the ICLR Best Paper Award for his paper “Neural Synthesis of Binaural Speech From Mono Audio.”  We begin our conversation with a look into the charter of Facebook Reality Labs, and Alex’s specific Codec Avatar project, where they’re developing AR/VR for social telepresence (applications like this come to mind). Of course, we dig into the aforementioned paper, discussing the difficulty in improving the quality of audio and the role of dynamic time warping, as well as the challenges of creating this model. Finally, Alex shares his thoughts on 3D rendering for audio, and other future research directions.  The complete show notes for this episode can be found at
Aug 30, 2021
Using Brain Imaging to Improve Neural Networks with Alona Fyshe - #513
Today we’re joined by Alona Fyshe, an assistant professor at the University of Alberta.  We caught up with Alona on the heels of an interesting panel discussion that she participated in, centered around improving AI systems using research about brain activity. In our conversation, we explore the multiple types of brain images that are used in this research, what representations look like in these images, and how we can improve language models without knowing explicitly how the brain understands the language. We also discuss similar experiments that have incorporated vision, the relationship between computer vision models and the representations that language models create, and future projects like applying a reinforcement learning framework to improve language generation. The complete show notes for this episode can be found at
Aug 26, 2021
Adaptivity in Machine Learning with Samory Kpotufe - #512
Today we’re joined by Samory Kpotufe, an associate professor at Columbia University and program chair of the 2021 Conference on Learning Theory (COLT).  In our conversation with Samory, we explore his research at the intersection of machine learning, statistics, and learning theory, and his goal of reaching self-tuning, adaptive algorithms. We discuss Samory’s research in transfer learning and other potential procedures that could positively affect transfer, as well as his work understanding unsupervised learning, including how clustering could be applied to real-world applications like cybersecurity and IoT (smart homes, smart city sensors, etc.) using methods like dimension reduction, random projection, and others. If you enjoyed this interview, you should definitely check out our conversation with Jelani Nelson on the “Theory of Computation.”  The complete show notes for this episode can be found at
Aug 23, 2021
A Social Scientist’s Perspective on AI with Eric Rice - #511
Today we’re joined by Eric Rice, associate professor at USC, and the co-director of the USC Center for Artificial Intelligence in Society.  Eric is a sociologist by trade, and in our conversation, we explore how he has made extensive inroads within the machine learning community through collaborations with ML academics and researchers. We discuss some of the most important lessons Eric has learned while doing interdisciplinary projects, and how the social scientist’s approach to assessment and measurement would be different from a computer scientist's approach to assessing the algorithmic performance of a model.  We specifically explore a few projects he’s worked on including HIV prevention amongst the homeless youth population in LA, a project he spearheaded with former guest Milind Tambe, as well as a project focused on using ML techniques to assist in the identification of people in need of housing resources, and ensuring that they get the best interventions possible.  If you enjoyed this conversation, I encourage you to check out our conversation with Milind Tambe from last year’s TWIMLfest on Why AI Innovation and Social Impact Go Hand in Hand. The complete show notes for this episode can be found at
Aug 19, 2021
Applications of Variational Autoencoders and Bayesian Optimization with José Miguel Hernández Lobato - #510
Today we’re joined by José Miguel Hernández-Lobato, a university lecturer in machine learning at the University of Cambridge. In our conversation with Miguel, we explore his work at the intersection of Bayesian learning and deep learning. We discuss how he’s been applying this to the field of molecular design and discovery via two different methods, with one paper searching for possible chemical reactions, and the other doing the same, but in 3D space. We also discuss the challenges of sample efficiency, creating objective functions, and how those manifest themselves in these experiments, and how he integrated the Bayesian approach to RL problems. We also talk through a handful of other papers that Miguel has presented at recent conferences, which are all linked at
Aug 16, 2021
Codex, OpenAI’s Automated Code Generation API with Greg Brockman - #509
Today we’re joined by return guest Greg Brockman, co-founder and CTO of OpenAI. We had the pleasure of reconnecting with Greg on the heels of the announcement of Codex, OpenAI’s most recent release. Codex is a direct descendant of GPT-3 that allows users to do autocomplete tasks based on all of the publicly available text and code on the internet. In our conversation with Greg, we explore the distinct results Codex sees in comparison to GPT-3, relative to the prompts it's being given, how it could evolve given different types of training data, and how users and practitioners should think about interacting with the API to get the most out of it. We also discuss Copilot, their recent collaboration with GitHub that is built on Codex, as well as the implications of Codex on coding education, explainability, and broader societal issues like fairness and bias, copyright, and jobs.  The complete show notes for this episode can be found at
Aug 12, 2021
Spatiotemporal Data Analysis with Rose Yu - #508
Today we’re joined by Rose Yu, an assistant professor at the Jacobs School of Engineering at UC San Diego.  Rose’s research focuses on advancing machine learning algorithms and methods for analyzing large-scale time-series and spatial-temporal data, then applying those developments to climate, transportation, and other physical sciences. We discuss how Rose incorporates physical knowledge and partial differential equations in these use cases and how symmetries are being exploited. We also explore their novel neural network design that is focused on non-traditional convolution operators and allows for general symmetry, how we get from these representations to the network architectures that she has developed, and another recent paper on deep spatio-temporal models.  The complete show notes for this episode can be found at
Aug 09, 2021
Parallelism and Acceleration for Large Language Models with Bryan Catanzaro - #507
Today we’re joined by Bryan Catanzaro, vice president of applied deep learning research at NVIDIA. Most folks know Bryan as one of the founders/creators of cuDNN, the accelerated library for deep neural networks. In our conversation, we explore his interest in high-performance computing and its recent overlap with AI, his current work on Megatron, a framework for training giant language models, and the basic approach for distributing a large language model on DGX infrastructure.  We also discuss the three different kinds of parallelism (tensor parallelism, pipeline parallelism, and data parallelism) that Megatron provides when training models, as well as his work on the Deep Learning Super Sampling project and the role it's playing in the present and future of game development via ray tracing.  The complete show notes for this episode can be found at
Aug 05, 2021
Applying the Causal Roadmap to Optimal Dynamic Treatment Rules with Lina Montoya - #506
Today we close out our 2021 ICML series joined by Lina Montoya, a postdoctoral researcher at UNC Chapel Hill.  In our conversation with Lina, who was an invited speaker at the Neglected Assumptions in Causal Inference Workshop, we explored her work applying Optimal Dynamic Treatment (ODT) to understand which kinds of individuals respond best to specific interventions in the US criminal justice system. We discuss the concept of neglected assumptions and how it connects to ODT rule estimation, as well as a breakdown of the causal roadmap, coined by researchers at UC Berkeley.  Finally, Lina talks us through the roadmap while applying the ODT rule problem, how she’s applied a “superlearner” algorithm to this problem, how it was trained, and what the future of this research looks like. The complete show notes for this episode can be found at
Aug 02, 2021
Constraint Active Search for Human-in-the-Loop Optimization with Gustavo Malkomes - #505
Today we continue our ICML series joined by Gustavo Malkomes, a research engineer at Intel via their recent acquisition of SigOpt.  In our conversation with Gustavo, we explore his paper Beyond the Pareto Efficient Frontier: Constraint Active Search for Multiobjective Experimental Design, which focuses on a novel algorithmic solution for the iterative model search process. This new algorithm empowers teams to run experiments where they are not optimizing particular metrics but instead identifying parameter configurations that satisfy constraints in the metric space. This allows users to explore multiple metrics at once in an efficient, informed, and intelligent way that lends itself to real-world, human-in-the-loop scenarios. The complete show notes for this episode can be found at
Jul 29, 2021
Fairness and Robustness in Federated Learning with Virginia Smith - #504
Today we kick off our ICML coverage joined by Virginia Smith, an assistant professor in the Machine Learning Department at Carnegie Mellon University.  In our conversation with Virginia, we explore her work on cross-device federated learning applications, including where the distributed learning aspects of FL are relative to the privacy techniques. We dig into her paper from ICML, Ditto: Fair and Robust Federated Learning Through Personalization, what fairness means in contrast to AI ethics, the particulars of the failure modes, the relationship between models and the things being optimized across devices, and the tradeoffs between fairness and robustness. We also discuss a second paper, Heterogeneity for the Win: One-Shot Federated Clustering, how the proposed method makes heterogeneity beneficial in data, how the heterogeneity of data is classified, and some applications of FL in an unsupervised setting. The complete show notes for this episode can be found at
Jul 26, 2021
Scaling AI at H&M Group with Errol Koolmeister - #503
Today we’re joined by Errol Koolmeister, the head of AI foundation at H&M Group. In our conversation with Errol, we explore H&M’s AI journey, including its wide adoption across the company in 2016, and the various use cases in which it's deployed like fashion forecasting and pricing algorithms. We discuss Errol’s first steps in taking on the challenge of scaling AI broadly at the company, the value-added learning from proofs of concept, and how to align in a sustainable, long-term way. Of course, we dig into the infrastructure and models being used, the biggest challenges faced, and the importance of managing the project portfolio, while Errol shares their approach to building infra for a specific product with many products in mind.
Jul 22, 2021
Evolving AI Systems Gracefully with Stefano Soatto - #502
Today we’re joined by Stefano Soatto, VP of AI applications science at AWS and a professor of computer science at UCLA.  Our conversation with Stefano centers on recent research of his called Graceful AI, which focuses on how to make trained systems evolve gracefully. We discuss the broader motivation for this research and the potential dangers or negative effects of constantly retraining ML models in production. We also talk about research into error rate clustering, the importance of model architecture when dealing with problems of model compression, how they’ve solved problems of regression and reprocessing by utilizing existing models, and much more. The complete show notes for this episode can be found at
Jul 19, 2021
ML Innovation in Healthcare with Suchi Saria - #501
Today we’re joined by Suchi Saria, the founder and CEO of Bayesian Health, the John C. Malone associate professor of computer science, statistics, and health policy, and the director of the machine learning and healthcare lab at Johns Hopkins University.  Suchi shares a bit about her journey to working at the intersection of machine learning and healthcare, and how her research has spanned across both medical policy and discovery. We discuss why it has taken so long for machine learning to become accepted and adopted by the healthcare infrastructure, where exactly we stand in the adoption process, and where there have been “pockets” of tangible success.  Finally, we explore the state of healthcare data, and of course, we talk about Suchi’s recently announced startup Bayesian Health and their goals in the healthcare space, and an accompanying study that looks at real-time ML inference in an EMR setting. The complete show notes for this episode can be found at
Jul 15, 2021
Cross-Device AI Acceleration, Compilation & Execution with Jeff Gehlhaar - #500
Today we’re joined by a friend of the show Jeff Gehlhaar, VP of technology and the head of AI software platforms at Qualcomm.  In our conversation with Jeff, we cover a ton of ground, starting with a bit of exploration around ML compilers, what they are, and their role in solving issues of parallelism. We also dig into the latest additions to the Snapdragon platform, AI Engine Direct, and how it works as a bridge to bring more capabilities across their platform, how benchmarking works in the context of the platform, how the work of other researchers we’ve spoken to on compression and quantization finds its way from research to product, and much more!  After you check out this interview, you can look below for some of the other conversations with researchers mentioned.  The complete show notes for this episode can be found at
Jul 12, 2021
The Future of Human-Machine Interaction with Dan Bohus and Siddhartha Sen - #499
Today we continue our AI in Innovation series joined by Dan Bohus, senior principal researcher at Microsoft Research, and Siddhartha Sen, a principal researcher at Microsoft Research.  In this conversation, we use a pair of research projects, Maia Chess and Situated Interaction, to springboard us into a conversation about the evolution of human-AI interaction. We discuss both of these projects individually, as well as the commonalities they have, how themes like understanding the human experience appear in their work, the types of models being used, the various types of data, and the complexity of each of their setups.  We explore some of the challenges associated with getting computers to better understand human behavior and interact in ways that are more fluid. Finally, we touch on what excites both Dan and Sid about their respective projects, and what they’re excited about for the future.   The complete show notes for this episode can be found at
Jul 08, 2021
Vector Quantization for NN Compression with Julieta Martinez - #498
Today we’re joined by Julieta Martinez, a senior research scientist at the recently announced startup Waabi.  Julieta was a keynote speaker at the recent LatinX in AI workshop at CVPR, and our conversation focuses on her talk “What do Large-Scale Visual Search and Neural Network Compression have in Common,” which shows that multiple ideas from large-scale visual search can be used to achieve state-of-the-art neural network compression. We explore the commonality between large databases and dealing with high dimensional, many-parameter neural networks, the advantages of using product quantization, and how that plays out when using it to compress a neural network.  We also dig into another paper Julieta presented at the conference, Deep Multi-Task Learning for Joint Localization, Perception, and Prediction, which details an architecture that is able to reuse computation between the three tasks, and is thus able to correct localization errors efficiently. The complete show notes for this episode can be found at
Jul 05, 2021
Deep Unsupervised Learning for Climate Informatics with Claire Monteleoni - #497
Today we continue our CVPR 2021 coverage joined by Claire Monteleoni, an associate professor at the University of Colorado Boulder.  We cover quite a bit of ground in our conversation with Claire, including her journey down the path from environmental activist to one of the leading climate informatics researchers in the world. We explore her current research interests, and the available opportunities in applying machine learning to climate informatics, including the interesting position of doing ML from a data-rich environment.  Finally, we dig into the evolution of climate science-focused events and conferences, as well as the Keynote Claire gave at the EarthVision workshop at CVPR “Deep Unsupervised Learning for Climate Informatics,” which focused on semi- and unsupervised deep learning approaches to studying rare and extreme climate events. The complete show notes for this episode can be found at
Jul 01, 2021
Skip-Convolutions for Efficient Video Processing with Amir Habibian - #496
Today we kick off our CVPR coverage joined by Amir Habibian, a senior staff engineer manager at Qualcomm Technologies. In our conversation with Amir, whose research primarily focuses on video perception, we discuss a few papers they presented at the event. We explore the paper Skip-Convolutions for Efficient Video Processing, which looks at training discrete variables end to end in visual neural networks. We also discuss his FrameExit paper, which proposes a conditional early exiting framework for efficient video recognition. The complete show notes for this episode can be found at
Jun 28, 2021
Advancing NLP with Project Debater w/ Noam Slonim - #495
Today we’re joined by Noam Slonim, the principal investigator of Project Debater at IBM Research. In our conversation with Noam, we explore the history of Project Debater, the first AI system that can “debate” humans on complex topics. We also dig into the evolution of the project, which is the culmination of 7 years and over 50 research papers and eventually led to a Nature cover paper, “An Autonomous Debating System,” which details the system in its entirety. Finally, Noam details many of the underlying capabilities of Debater, including the relationship between systems preparation and training, evidence detection, detecting the quality of arguments, narrative generation, the use of conventional NLP methods like entity linking, and much more. The complete show notes for this episode can be found at
Jun 24, 2021
Bringing AI Up to Speed with Autonomous Racing w/ Madhur Behl - #494
Today we’re joined by Madhur Behl, an Assistant Professor in the department of computer science at the University of Virginia. In our conversation with Madhur, we explore the super interesting work he’s doing at the intersection of autonomous driving, ML/AI, and Motorsports, where he’s teaching self-driving cars how to drive in an agile manner. We talk through the differences between traditional self-driving problems and those encountered in a racing environment, and the challenges in solving planning, perception, and control. We also discuss their upcoming race at the Indianapolis Motor Speedway, where Madhur and his students will compete for 1 million dollars in the world’s first head-to-head fully autonomous race, and how they’re preparing for it.
Jun 21, 2021
AI and Society: Past, Present and Future with Eric Horvitz - #493
Today we continue our AI Innovation series joined by Microsoft’s Chief Scientific Officer, Eric Horvitz.  In our conversation with Eric, we explore his tenure as AAAI president and his focus on the future of AI and its ethical implications, the scope of the study on the topic, and how drastically the AI and machine learning landscape has changed since 2009. We also discuss Eric’s role at Microsoft and the Aether committee that has advised the company on issues of responsible AI since 2017. Finally, we talk through his recent work as a member of the National Security Commission on AI, where he helped commission a 750+ page report on topics including the Future of AI R&D, Building Trustworthy AI systems, civil liberties and privacy, and the challenging area of AI and autonomous weapons.   The complete show notes for this episode can be found at
Jun 17, 2021
Agile Applied AI Research with Parvez Ahammad - #492
Today we’re joined by Parvez Ahammad, head of data science applied research at LinkedIn. In our conversation, Parvez shares his interesting take on organizing principles for his organization, starting with how data science teams are broadly organized at LinkedIn. We explore how they ensure time investments on long-term projects are managed, how to identify products that can help in a cross-cutting way across multiple lines of business, quantitative methodologies to identify unintended consequences in experimentation, and navigating the tension between research and applied ML teams in an organization. Finally, we discuss differential privacy, and their recently released GreyKite library, an open-source Python library developed to support forecasting. The complete show notes for this episode can be found at
Jun 14, 2021
Haptic Intelligence with Katherine J. Kuchenbecker - #491
Today we’re joined by Katherine J. Kuchenbecker, director at the Max Planck Institute for Intelligent Systems and of the haptic intelligence department. In our conversation, we explore Katherine’s research interests, which lie at the intersection of haptics (physical interaction with the world) and machine learning, introducing us to the concept of “haptic intelligence.” We discuss how ML, mainly computer vision, has been integrated to work together with robots, and some of the devices that Katherine’s lab is developing to take advantage of this research. We also talk about hugging robots, augmented reality in robotic surgery, and the degree to which she studies human-robot interaction. Finally, Katherine shares with us her passion for mentoring and the importance of diversity and inclusion in robotics and machine learning. The complete show notes for this episode can be found at
Jun 10, 2021
Data Science on AWS with Chris Fregly and Antje Barth - #490
Today we continue our coverage of the AWS ML Summit joined by Chris Fregly, a principal developer advocate at AWS, and Antje Barth, a senior developer advocate at AWS.  In our conversation with Chris and Antje, we explore their roles as community builders prior to, and since, joining AWS, as well as their recently released book Data Science on AWS. In the book, Chris and Antje demonstrate how to reduce cost and improve performance while successfully building and deploying data science projects.  We also discuss the release of their new Practical Data Science Specialization on Coursera, managing the complexity that comes with building real-world projects, and some of their favorite sessions from the recent ML Summit.
Jun 07, 2021
Accelerating Distributed AI Applications at Qualcomm with Ziad Asghar - #489
Today we’re joined by Ziad Asghar, vice president of product management for Snapdragon technologies & roadmap at Qualcomm Technologies. We begin our conversation with Ziad exploring the symbiosis between 5G and AI and what is enabling developers to take full advantage of AI on mobile devices. We also discuss the balance of product evolution and incorporating research concepts, the evolution of their hardware infrastructure Cloud AI 100, and their role in the deployment of Ingenuity, the robotic helicopter that operated on Mars just last year. Finally, we talk about specialization in building IoT applications like autonomous vehicles and smart cities, the degree to which federated learning is being deployed across the industry, and the importance of privacy and security of personal data. The complete show notes can be found at
Jun 03, 2021
Buy AND Build for Production Machine Learning with Nir Bar-Lev - #488
Today we’re joined by Nir Bar-Lev, co-founder and CEO of ClearML. In our conversation with Nir, we explore how his view of the wide vs deep machine learning platforms paradox has changed and evolved over time, how companies should think about building vs buying and integration, and his thoughts on why experiment management has become an automatic buy, be it open source or otherwise. We also discuss the disadvantages of using a cloud vendor as opposed to a software-based approach, the balance between MLOps and data science when addressing issues of overfitting, and how ClearML is applying techniques like federated machine learning and transfer learning to their solutions. The complete show notes for this episode can be found at
May 31, 2021
Applied AI Research at AWS with Alex Smola - #487
Today we’re joined by Alex Smola, Vice President and Distinguished Scientist at AWS AI. We had the pleasure to catch up with Alex prior to the upcoming AWS Machine Learning Summit, and we covered a TON of ground in the conversation. We start by focusing on his research in the domain of deep learning on graphs, including a few examples showcasing its function, and an interesting discussion around the relationship between large language models and graphs. Next up, we discuss their focus on AutoML research and how it's the key to lowering the barrier of entry for machine learning research. Alex also shares a bit about his work on causality and causal modeling, introducing us to the concept of Granger causality. Finally, we talk about the aforementioned ML Summit, its exponential growth since its inception a few years ago, and what speakers he's most excited about hearing from. The complete show notes for this episode can be found at
May 27, 2021
Causal Models in Practice at Lyft with Sean Taylor - #486
Today we’re joined by Sean Taylor, Staff Data Scientist at Lyft Rideshare Labs. We cover a lot of ground with Sean, starting with his recent decision to step away from his previous role as the lab director to take a more hands-on role, and what inspired that change. We also discuss his research at Rideshare Labs, where they take a more “moonshot” approach to solving the typical problems like forecasting and planning, marketplace experimentation, and decision making, and how his statistical approach manifests itself in his work. Finally, we spend quite a bit of time exploring the role of causality in the work at Rideshare Labs, including how systems like the aforementioned forecasting system are designed around causal models, whether driving model development is more effective using business metrics, challenges associated with hierarchical modeling, and much much more. The complete show notes for this episode can be found at
May 24, 2021
Using AI to Map the Human Immune System w/ Jabran Zahid - #485
Today we’re joined by Jabran Zahid, a Senior Researcher at Microsoft Research. In our conversation with Jabran, we explore their recent endeavor into the complete mapping of which T-cells bind to which antigens through the Antigen Map Project. We discuss how Jabran’s background in astrophysics and cosmology has translated to his current work in immunology and biology, the origins of the antigen map, and how the focus was changed by the emergence of the coronavirus pandemic. We talk through the biological advancements, the challenges of using machine learning in this setting, some of the more advanced ML techniques that they’ve tried that have not panned out (as of yet), the path forward for the antigen map to make a broader impact, and much more. The complete show notes for this episode can be found at
May 20, 2021
Learning Long-Time Dependencies with RNNs w/ Konstantin Rusch - #484
Today we conclude our 2021 ICLR coverage joined by Konstantin Rusch, a PhD Student at ETH Zurich. In our conversation with Konstantin, we explore his recent papers, titled coRNN and uniCORNN respectively, which focus on a novel architecture of recurrent neural networks for learning long-time dependencies. We explore the inspiration he drew from neuroscience when tackling this problem, how the performance results compared to networks like LSTMs and others that have been proven to work on this problem and Konstantin’s future research goals. The complete show notes for this episode can be found at
May 17, 2021
What the Human Brain Can Tell Us About NLP Models with Allyson Ettinger - #483
Today we continue our ICLR ‘21 series joined by Allyson Ettinger, an Assistant Professor at the University of Chicago. One of our favorite recurring conversations on the podcast is the two-way street that lies between machine learning and neuroscience, which Allyson explores through the modeling of cognitive processes that pertain to language. In our conversation, we discuss how she approaches assessing the competencies of AI, the value of control of confounding variables in AI research, and how the pattern matching traits of ML/DL models are not necessarily exclusive to these systems. Allyson also participated in a recent panel discussion at the ICLR workshop How Can Findings About The Brain Improve AI Systems?, centered around the utility of brain inspiration for developing AI models. We discuss ways in which we can try to more closely simulate the functioning of a brain, where her work fits into the analysis and interpretability area of NLP, and much more! The complete show notes for this episode can be found at
May 13, 2021
Probabilistic Numeric CNNs with Roberto Bondesan - #482
Today we kick off our ICLR 2021 coverage joined by Roberto Bondesan, an AI Researcher at Qualcomm. In our conversation with Roberto, we explore his paper Probabilistic Numeric Convolutional Neural Networks, which represents features as Gaussian processes, providing a probabilistic description of discretization error. We discuss some of the other work the team at Qualcomm presented at the conference, including a paper called Adaptive Neural Compression, as well as work on Gauge Equivariant Mesh CNNs. Finally, we briefly discuss quantum deep learning, and what excites Roberto and his team about the future of their research in combinatorial optimization. The complete show notes for this episode can be found at
May 10, 2021
Building a Unified NLP Framework at LinkedIn with Huiji Gao - #481
Today we’re joined by Huiji Gao, a Senior Engineering Manager of Machine Learning and AI at LinkedIn. In our conversation with Huiji, we dig into his interest in building NLP tools and systems, including a recent open-source project called DeText, a framework for generating models for ranking, classification, and language generation. We explore the motivation behind DeText, the landscape at LinkedIn before and after it was put into use broadly, and the various contexts it’s being used in at the company. We also discuss the relationship between BERT and DeText via LiBERT, a version of BERT that is trained and calibrated on LinkedIn data, the practical use of these tools from an engineering perspective, the approach they’ve taken to optimization, and much more! The complete show notes for this episode can be found at
May 06, 2021
Dask + Data Science Careers with Jacqueline Nolis - #480
Today we’re joined by Jacqueline Nolis, Head of Data Science at Saturn Cloud, and co-host of the Build a Career in Data Science Podcast. You might remember Jacqueline from our Advancing Your Data Science Career During the Pandemic panel, where she shared her experience trying to navigate the suddenly hectic data science job market. Now, a year removed from that panel, we explore her book on data science careers, top insights for folks just getting into the field, ways that job seekers should be signaling that they have the required background, and how to approach and navigate failure as a data scientist. We also spend quite a bit of time discussing Dask, an open-source library for parallel computing in Python, as well as use cases for the tool, the relationship between Dask, Kubernetes, and Docker containers, where data scientists are in regards to the software development toolchain, and much more! The complete show notes for this episode can be found at
May 03, 2021
Machine Learning for Equitable Healthcare Outcomes with Irene Chen - #479
Today we’re joined by Irene Chen, a Ph.D. student at MIT.  Irene’s research is focused on developing new machine learning methods specifically for healthcare, through the lens of questions of equity and inclusion. In our conversation, we explore some of the various projects that Irene has worked on, including an early detection program for intimate partner violence.  We also discuss how she thinks about the long term implications of predictions in the healthcare domain, how she’s learned to communicate across the interface between the ML researcher and clinician, probabilistic approaches to machine learning for healthcare, and finally, key takeaways for those of you interested in this area of research. The complete show notes for this episode can be found at
Apr 29, 2021
AI Storytelling Systems with Mark Riedl - #478
Today we’re joined by Mark Riedl, a Professor in the School of Interactive Computing at Georgia Tech. In our conversation with Mark, we explore his work building AI storytelling systems, mainly those that try to predict what listeners think will happen next in a story, and how he brings together many different threads of ML/AI to solve these problems. We discuss how the theory of mind is layered into his research, the use of large language models like GPT-3, and his push towards being able to generate suspenseful stories with these systems. We also discuss the concept of intentional creativity and the lack of good theory on the subject, the adjacent areas in ML that he’s most excited about for their potential contribution to his research, his recent focus on model explainability, how he approaches problems of common sense, and much more! The complete show notes for this episode can be found at
Apr 26, 2021
Creating Robust Language Representations with Jamie Macbeth - #477
Today we’re joined by Jamie Macbeth, an assistant professor in the department of computer science at Smith College. In our conversation with Jamie, we explore his work at the intersection of cognitive systems and natural language understanding, and how to use AI as a vehicle for better understanding human intelligence. We discuss the tie that binds these domains together, whether the tasks are the same as traditional NLU tasks, and the specific things he’s trying to gain deeper insights into. One of the unique aspects of Jamie’s research is that he takes an “old-school AI” approach, and to that end, we discuss the models he handcrafts to generate language. Finally, we examine how he evaluates the performance of his representations if he’s not playing the SOTA “game,” what he benchmarks against, identifying deficiencies in deep learning systems, and the exciting directions for his upcoming research. The complete show notes for this episode can be found at
Apr 21, 2021
Reinforcement Learning for Industrial AI with Pieter Abbeel - #476
Today we’re joined by Pieter Abbeel, a Professor at UC Berkeley, co-Director of the Berkeley AI Research Lab (BAIR), as well as Co-founder and Chief Scientist at Covariant. In our conversation with Pieter, we cover a ton of ground, starting with the specific goals and tasks of his work at Covariant, the shift in needs for industrial AI application and robots, if his experience solving real-world problems has changed his opinion on end to end deep learning, and the scope for the three problem domains of the models he’s building. We also explore his recent work at the intersection of unsupervised and reinforcement learning, goal-directed RL, his recent paper “Pretrained Transformers as Universal Computation Engines” and where that research thread is headed, and of course, his new podcast Robot Brains, which you can find on all streaming platforms today! The complete show notes for this episode can be found at
Apr 19, 2021
AutoML for Natural Language Processing with Abhishek Thakur - #475
Today we’re joined by Abhishek Thakur, a machine learning engineer at Hugging Face, and the world’s first Quadruple Kaggle Grandmaster! In our conversation with Abhishek, we explore his Kaggle journey, including how his approach to competitions has evolved over time, what resources he used to prepare for his transition to a full-time practitioner, and the most important lessons he’s learned along the way. We also spend a great deal of time discussing his new role at Hugging Face, where he's building AutoNLP. We talk through the goals of the project, the primary problem domain, and how the results of AutoNLP compare with those from hand-crafted models. Finally, we discuss Abhishek’s book, Approaching (Almost) Any Machine Learning Problem. The complete show notes for this episode can be found at
Apr 15, 2021
Inclusive Design for Seeing AI with Saqib Shaikh - #474
Today we’re joined by Saqib Shaikh, a Software Engineer at Microsoft, and the lead for the Seeing AI Project. In our conversation with Saqib, we explore the Seeing AI app, an app “that narrates the world around you.” We discuss the various technologies and use cases for the app, how it has evolved since the inception of the project, how the technology landscape supports projects like this one, and the technical challenges he faces when building out the app. We also discuss the relationship and trust between humans and robots, and how that translates to this app, what Saqib sees on the research horizon that will support his vision for the future of Seeing AI, and how the integration of tech like Apple’s upcoming “smart” glasses could change the way their app is used. The complete show notes for this episode can be found at
Apr 12, 2021
Theory of Computation with Jelani Nelson - #473
Today we’re joined by Jelani Nelson, a professor in the Theory Group at UC Berkeley. In our conversation with Jelani, we explore his research in computational theory, where he focuses on building streaming and sketching algorithms, random projections, and dimensionality reduction. We discuss how Jelani thinks about the balance between the innovation of new algorithms and the performance of existing ones, and some use cases where we’d see his work in action. Finally, we talk through how his work ties into machine learning, what tools from the theorist’s toolbox he’d suggest all ML practitioners know, and his nonprofit AddisCoder, a 4 week summer program that introduces high-school students to programming and algorithms. The complete show notes for this episode can be found at
Apr 08, 2021
Human-Centered ML for High-Risk Behaviors with Stevie Chancellor - #472
Today we’re joined by Stevie Chancellor, an Assistant Professor in the Department of Computer Science and Engineering at the University of Minnesota. In our conversation with Stevie, we explore her work at the intersection of human-centered computing, machine learning, and high-risk mental illness behaviors. We discuss how her background in HCC helps shape her perspective, how machine learning helps with understanding severity levels of mental illness, and some recent work where convolutional graph neural networks are applied to identify and discover new kinds of behaviors for people who struggle with opioid use disorder. We also explore the role of computational linguistics and NLP in her research, issues in using social media data as a data source, and finally, how people who are interested in an introduction to human-centered computing can get started. The complete show notes for this episode can be found at
Apr 05, 2021
Operationalizing AI at Dataiku with Conor Jensen - #471
In this episode, we’re joined by Dataiku’s Director of Data Science, Conor Jensen. In our conversation, we explore the panel he led at TWIMLcon, “AI Operationalization: Where the AI Rubber Hits the Road for the Enterprise,” discussing the ML journey of each panelist’s company, and where Dataiku fits in the equation. The complete show notes for this episode can be found at
Apr 01, 2021
ML Lifecycle Management at Algorithmia with Diego Oppenheimer - #470
In this episode, we’re joined by Diego Oppenheimer, Founder and CEO of Algorithmia. In our conversation, we discuss Algorithmia’s involvement with TWIMLcon, as well as an exploration of the results of their recently conducted survey on the state of the AI market. The complete show notes for this episode can be found at
Apr 01, 2021
End to End ML at Cloudera with Santiago Giraldo - #469 [TWIMLcon Sponsor Series]
In this episode, we’re joined by Santiago Giraldo, Director Of Product Marketing for Data Engineering & Machine Learning at Cloudera. In our conversation, we discuss Cloudera’s talks at TWIMLcon, as well as their various research efforts from their Fast Forward Labs arm. The complete show notes for this episode can be found at
Mar 29, 2021
ML Platforms for Global Scale at Prosus with Paul van der Boor - #468 [TWIMLcon Sponsor Series]
In this episode, we’re joined by Paul van der Boor, Senior Director of Data Science at Prosus, to discuss his TWIMLcon experience and how they’re using ML platforms to manage machine learning at a global scale. The complete show notes for this episode can be found at
Mar 29, 2021
Can Language Models Be Too Big? 🦜 with Emily Bender and Margaret Mitchell - #467
Today we’re joined by Emily M. Bender, Professor at the University of Washington, and AI Researcher, Margaret Mitchell.  Emily and Meg, as well as Timnit Gebru and Angelina McMillan-Major, are co-authors on the paper On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. As most of you undoubtedly know by now, there has been much controversy surrounding, and fallout from, this paper. In this conversation, our main priority was to focus on the message of the paper itself. We spend some time discussing the historical context for the paper, then turn to the goals of the paper, discussing the many reasons why the ever-growing datasets and models are not necessarily the direction we should be going.  We explore the cost of these training datasets, both literal and environmental, as well as the bias implications of these models, and of course the perpetual debate about responsibility when building and deploying ML systems. Finally, we discuss the thin line between AI hype and useful AI systems, and the importance of doing pre-mortems to truly flesh out any issues you could potentially come across prior to building models, and much much more.  The complete show notes for this episode can be found at
Mar 24, 2021
Applying RL to Real-World Robotics with Abhishek Gupta - #466
Today we’re joined by Abhishek Gupta, a PhD Student at UC Berkeley.  Abhishek, a member of the BAIR Lab, joined us to talk about his recent robotics and reinforcement learning research and interests, which focus on applying RL to real-world robotics applications. We explore the concept of reward supervision, and how to get robots to learn these reward functions from videos, and the rationale behind supervised experts in these experiments.  We also discuss the use of simulation for experiments, data collection, and the path to scalable robotic learning. Finally, we discuss gradient surgery vs gradient sledgehammering, and his ecological RL paper, which focuses on the “phenomena that exist in the real world” and how humans and robotics systems interface in those situations.  The complete show notes for this episode can be found at
Mar 22, 2021
Accelerating Innovation with AI at Scale with David Carmona - #465
Today we’re joined by David Carmona, General Manager of Artificial Intelligence & Innovation at Microsoft.  In our conversation with David, we focus on his work on AI at Scale, an initiative focused on the change in the ways people are developing AI, driven in large part by the emergence of massive models. We explore David’s thoughts about the progression towards larger models, the focus on parameters and how it ties to the architecture of these models, and how we should assess how attention works in these models. We also discuss the different families of models (generation & representation), the transition from CV to NLP tasks, and an interesting point of models “becoming a platform” via transfer learning. The complete show notes for this episode can be found at
Mar 18, 2021
Complexity and Intelligence with Melanie Mitchell - #464
Today we’re joined by Melanie Mitchell, Davis Professor at the Santa Fe Institute and author of Artificial Intelligence: A Guide for Thinking Humans. While Melanie has had a long career with a myriad of research interests, we focus on a few: complex systems and the understanding of intelligence, complexity, and her recent work on getting AI systems to make analogies. We explore examples of social learning, how it applies to AI contextually, and defining intelligence. We discuss potential frameworks that would help machines understand analogies, established benchmarks for analogy, and whether there is a social learning solution to help machines figure out analogy. Finally, we talk through the overall state of AI systems, the progress we’ve made amid the limited concept of social learning, whether we’re able to achieve intelligence with current approaches to AI, and much more! The complete show notes for this episode can be found at
Mar 15, 2021
Robust Visual Reasoning with Adriana Kovashka - #463
Today we’re joined by Adriana Kovashka, an Assistant Professor at the University of Pittsburgh. In our conversation with Adriana, we explore her visual commonsense research, and how it intersects with her background in media studies. We discuss the idea of shortcuts, or faults in visual question answering data sets that appear in many SOTA results, as well as the concept of masking, a technique developed to assist in context prediction. Adriana then describes how these techniques fit into her broader goal of trying to understand the rhetoric of visual advertisements.  Finally, Adriana shares a bit about her work on robust visual reasoning, the parallels between this research and other work happening around explainability, and the vision for her work going forward.  The complete show notes for this episode can be found at
Mar 11, 2021
Architectural and Organizational Patterns in Machine Learning with Nishan Subedi - #462
Today we’re joined by Nishan Subedi, VP of Algorithms at Overstock. In our conversation with Nishan, we discuss his interesting path to MLOps and how ML/AI is used at Overstock, primarily for search/recommendations and marketing/advertisement use cases. We spend a great deal of time exploring machine learning architecture and architectural patterns, how he perceives the differences between architectural patterns and algorithms, and emergent architectural patterns that standards have not yet been set for. Finally, we discuss how the idea of anti-patterns was innovative in early design pattern thinking and whether those concepts are transferable to ML, and whether architectural patterns will bleed over into organizational patterns and culture, and Nishan introduces us to the concept of Squads within an organizational structure. The complete show notes for this episode can be found at
Mar 08, 2021
Common Sense Reasoning in NLP with Vered Shwartz - #461
Today we’re joined by Vered Shwartz, a Postdoctoral Researcher at both the Allen Institute for AI and the Paul G. Allen School of Computer Science & Engineering at the University of Washington. In our conversation with Vered, we explore her NLP research, where she focuses on teaching machines common sense reasoning in natural language. We discuss training using GPT models and the potential use of multimodal reasoning and incorporating images to augment the reasoning capabilities. Finally, we talk through some other noteworthy research in this field, how she deals with biases in the models, and Vered's future plans for incorporating some of the newer techniques into her future research. The complete show notes for this episode can be found at 
Mar 04, 2021
How to Be Human in the Age of AI with Ayanna Howard - #460
Today we’re joined by returning guest and newly appointed Dean of the College of Engineering at The Ohio State University, Ayanna Howard. Our conversation with Dr. Howard focuses on her recently released book, Sex, Race, and Robots: How to Be Human in the Age of AI, which is an extension of her research on the relationships between humans and robots. We continue to explore this relationship through the themes of socialization introduced in the book, like associating genders to AI and robotic systems and the “self-fulfilling prophecy” that search engines have become. We also discuss a recurring conversation in the community around AI being biased because of data versus models and data, and the choices and responsibilities that come with the ethical aspects of building AI systems. Finally, we discuss Dr. Howard’s new role at OSU, how it will affect her research, and what the future holds for the applied AI field. The complete show notes for this episode can be found at
Mar 01, 2021
Evolution and Intelligence with Penousal Machado - #459
Today we’re joined by Penousal Machado, Associate Professor and Head of the Computational Design and Visualization Lab in the Center for Informatics at the University of Coimbra. In our conversation with Penousal, we explore his research in Evolutionary Computation, and how that work coincides with his passion for images and graphics. We also discuss the link between creativity and humanity, and have an interesting sidebar about the philosophy of Sci-Fi in popular culture. Finally, we dig into Penousal’s evolutionary machine learning research, primarily in the context of the evolution of various animal species’ mating habits and practices. The complete show notes for this episode can be found at
Feb 25, 2021
Innovating Neural Machine Translation with Arul Menezes - #458
Today we’re joined by Arul Menezes, a Distinguished Engineer at Microsoft. Arul, a 30-year veteran of Microsoft, manages the machine translation research and products in the Azure Cognitive Services group. In our conversation, we explore the historical evolution of machine translation, including breakthroughs in seq2seq and the emergence of transformer models. We also discuss how they’re using multilingual transfer learning and combining what they’ve learned in translation with pre-trained language models like BERT. Finally, we explore what they’re doing to achieve domain-specific improvements in their models, and what excites Arul about the translation architecture going forward. The complete show notes for this series can be found at
Feb 22, 2021
Building the Product Knowledge Graph at Amazon with Luna Dong - #457
Today we’re joined by Luna Dong, Sr. Principal Scientist at Amazon. In our conversation with Luna, we explore Amazon’s expansive product knowledge graph, and the various roles that machine learning plays throughout it. We also talk through the differences and synergies between the media and retail product knowledge graph use cases and how ML comes into play in search and recommendation use cases. Finally, we explore the similarities to relational databases and efforts to standardize the product knowledge graphs across the company and broadly in the research community. The complete show notes for this episode can be found at
Feb 18, 2021
Towards a Systems-Level Approach to Fair ML with Sarah M. Brown - #456
Today we’re joined by Sarah Brown, an Assistant Professor of Computer Science at the University of Rhode Island. In our conversation with Sarah, whose research focuses on Fairness in AI, we discuss why a “systems-level” approach is necessary when thinking about ethical and fairness issues in models and algorithms. We also explore Wiggum: a fairness forensics tool, which explores bias and allows for regular auditing of data, as well as her ongoing collaboration with a social psychologist to explore how people perceive ethics and fairness. Finally, we talk through the role of tools in assessing fairness and bias, and the importance of understanding the decisions the tools are making. The complete show notes can be found at
Feb 15, 2021
AI for Digital Health Innovation with Andrew Trister - #455
Today we’re joined by Andrew Trister, Deputy Director for Digital Health Innovation at the Bill & Melinda Gates Foundation. In our conversation with Andrew, we explore some of the AI use cases at the foundation, with the goal of bringing “community-based” healthcare to underserved populations in the global south. We focus on COVID-19 response and improving the accuracy of malaria testing with a Bayesian framework, among other projects, and on challenges like scaling these systems and building out infrastructure so that communities can begin to support themselves. We also touch on Andrew's previous work at Apple, where he helped develop what is now known as ResearchKit, their ML for health tools that are now seen in Apple devices like phones and watches. The complete show notes for this episode can be found at
Feb 11, 2021
System Design for Autonomous Vehicles with Drago Anguelov - #454
Today we’re joined by Drago Anguelov, Distinguished Scientist and Head of Research at Waymo. In our conversation, we explore the state of the autonomous vehicles space broadly and at Waymo, including how AV has improved in the last few years, their focus on level 4 driving, and Drago’s thoughts on the direction of the industry going forward. Drago breaks down their core ML use cases, Perception, Prediction, Planning, and Simulation, and how their work has led to a fully autonomous vehicle being deployed in Phoenix. We also discuss the socioeconomic and environmental impact of self-driving cars, a few research papers submitted to NeurIPS 2020, and whether the sophistication of AV systems will lend itself to the development of tomorrow’s enterprise machine learning systems. The complete show notes for this episode can be found at
Feb 08, 2021
Building, Adopting, and Maturing LinkedIn's Machine Learning Platform with Ya Xu - #453
Today we’re joined by Ya Xu, Head of Data Science at LinkedIn, and TWIMLcon: AI Platforms 2021 Keynote Speaker. We cover a ton of ground with Ya, starting with her experiences prior to becoming Head of DS, as one of the architects of the LinkedIn Platform. We discuss her “three phases” (building, adoption, and maturation) to keep in mind when building out a platform, and how to avoid “hero syndrome” early in the process. Finally, we dig into the various tools and platforms that give LinkedIn teams leverage, their organizational structure, as well as the emergence of differential privacy for security use cases and whether it's ready for prime time. The complete show notes for this episode can be found at
Feb 04, 2021
Expressive Deep Learning with Magenta DDSP w/ Jesse Engel - #452
Today we’re joined by Jesse Engel, Staff Research Scientist at Google, working on the Magenta Project.  In our conversation with Jesse, we explore the current landscape of creativity AI, and the role Magenta plays in helping express creativity through ML and deep learning. We dig deep into their Differentiable Digital Signal Processing (DDSP) library, which “lets you combine the interpretable structure of classical DSP elements (such as filters, oscillators, reverberation, etc.) with the expressivity of deep learning.” Finally, Jesse walks us through some of the other projects that the Magenta team undertakes, including NLP and language modeling, and what he wants to see come out of the work that he and others are doing in creative AI research. The complete show notes for this episode can be found at 
Feb 01, 2021
Semantic Folding for Natural Language Understanding with Francisco Webber - #451
Today we’re joined by return guest Francisco Webber, CEO & Co-founder of Cortical. Francisco was originally a guest over 4 years and 400 episodes ago, when we discussed his company and their unique approach to natural language processing. In this conversation, Francisco gives us an update on Cortical, including their applications and toolkit, which cover semantic extraction, classifier, and search use cases. We also discuss GPT-3, and how it compares to semantic folding, the unreasonable amount of data needed to train these models, and the difference between the GPT approach and semantic modeling for language understanding. The complete show notes for this episode can be found at
Jan 29, 2021
The Future of Autonomous Systems with Gurdeep Pall - #450
Today we’re joined by Gurdeep Pall, Corporate Vice President at Microsoft. Gurdeep, who we had the pleasure of speaking with on his 31st anniversary at the company, has had a hand in creating quite a few influential projects, including Skype for Business (and Teams) and being a part of the first team that shipped Wi-Fi as a part of a general-purpose operating system. In our conversation with Gurdeep, we discuss Microsoft’s acquisition of Bonsai and how they fit in the toolchain for creating brains for autonomous systems with “machine teaching,” and other practical applications of machine teaching in autonomous systems. We also explore the challenges of simulation, and how they’ve evolved to make the problems that the physical world brings more tenable. Finally, Gurdeep shares concrete use cases for autonomous systems, and how to get the best ROI on those investments, and of course, what’s next in the very broad space of autonomous systems. The complete show notes for this episode can be found at
Jan 25, 2021
AI for Ecology and Ecosystem Preservation with Bryan Carstens - #449
Today we’re joined by Bryan Carstens, a professor in the Department of Evolution, Ecology, and Organismal Biology & Head of the Tetrapod Division in the Museum of Biological Diversity at The Ohio State University. In our conversation with Bryan, who comes from a traditional biology background, we cover a ton of ground, including a foundational layer of understanding for the vast known unknowns in species and biodiversity, and how he came to apply machine learning to his lab’s research. We explore a few of his lab’s projects, including applying ML to genetic data to understand the geographic and environmental structure of DNA, what factors keep machine learning from being used more frequently in biology, and what’s next for his group. The complete show notes for this episode can be found at
Jan 21, 2021
Off-Line, Off-Policy RL for Real-World Decision Making at Facebook - #448
Today we’re joined by Jason Gauci, a Software Engineering Manager at Facebook AI. In our conversation with Jason, we explore their Reinforcement Learning platform, Re-Agent (Horizon). We discuss the role of decision making and game theory in the platform and the types of decisions they’re using Re-Agent to make, from ranking and recommendations to their eCommerce marketplace. Jason also walks us through the differences between online/offline and on/off policy model training, and where Re-Agent sits in this spectrum. Finally, we discuss the concept of counterfactual causality, and how they ensure safety in the results of their models. The complete show notes for this episode can be found at
Jan 18, 2021
A Future of Work for the Invisible Workers in A.I. with Saiph Savage - #447
Today we’re joined by Saiph Savage, a Visiting professor at the Human-Computer Interaction Institute at CMU, director of the HCI Lab at WVU, and co-director of the Civic Innovation Lab at UNAM. We caught up with Saiph during NeurIPS where she delivered an insightful invited talk “A Future of Work for the Invisible Workers in A.I.”. In our conversation with Saiph, we gain a better understanding of the “Invisible workers,” or the people doing the work of labeling for machine learning and AI systems, and some of the issues around lack of economic empowerment, emotional trauma, and other issues that arise with these jobs. We discuss ways that we can empower these workers, and push the companies that are employing these workers to do the same. Finally, we discuss Saiph’s participatory design work with rural workers in the global south. The complete show notes for this episode can be found at
Jan 14, 2021
Trends in Graph Machine Learning with Michael Bronstein - #446
Today we’re back with the final episode of AI Rewind, joined by Michael Bronstein, a professor at Imperial College London and the Head of Graph Machine Learning at Twitter. In our conversation with Michael, we touch on his thoughts about the year in Machine Learning overall, including GPT-3 and Implicit Neural Representations, but spend a major chunk of time on the sub-field of Graph Machine Learning. We talk through the application of Graph ML across domains like physics and bioinformatics, and the tools to look out for. Finally, we discuss what Michael thinks is in store for 2021, including Graph ML applied to molecule discovery and non-human communication translation.
Jan 11, 2021
Trends in Natural Language Processing with Sameer Singh - #445
Today we continue the 2020 AI Rewind series, joined by friend of the show Sameer Singh, an Assistant Professor in the Department of Computer Science at UC Irvine. We last spoke with Sameer at our Natural Language Processing office hours back at TWIMLfest, and he was the perfect person to help us break down 2020 in NLP. Sameer tackles the review in 4 main categories: Massive Language Modeling, Fundamental Problems with Language Models, Practical Vulnerabilities with Language Models, and Evaluation. We also explore the impact of GPT-3 and Transformer models, the intersection of vision and language models, the injection of causal thinking and modeling into language models, and much more. The complete show notes for this episode can be found at
Jan 07, 2021
Trends in Computer Vision with Pavan Turaga - #444
AI Rewind continues today as we’re joined by Pavan Turaga, Associate Professor in both the Departments of Arts, Media, and Engineering & Electrical Engineering, and the Interim Director of the School of Arts, Media, and Engineering at Arizona State University. Pavan, who joined us back in June to talk through his work from CVPR ‘20, Invariance, Geometry and Deep Neural Networks, is back to walk us through the trends he’s seen in Computer Vision last year. We explore the revival of physics-based thinking about scenes, differentiable rendering, the best papers, and where the field is going in the near future. We want to hear from you! Send your thoughts on the year that was 2020 below in the comments, or via Twitter at @samcharrington or @twimlai. The complete show notes for this episode can be found at
Jan 04, 2021
Trends in Reinforcement Learning with Pablo Samuel Castro - #443
Today we kick off our annual AI Rewind series joined by friend of the show Pablo Samuel Castro, a Staff Research Software Developer at Google Brain. Pablo joined us earlier this year for a discussion about Music & AI, and his Geometric Perspective on Reinforcement Learning, as well as our RL office hours during the inaugural TWIMLfest. In today’s conversation, we explore some of the latest and greatest RL advancements coming out of the major conferences this year, broken down into a few major themes: Metrics/Representations, Understanding and Evaluating Deep Reinforcement Learning, and RL in the Real World. This was a very fun conversation, and we encourage you to check out all the great papers and other resources available on the show notes page.
Dec 30, 2020
MOReL: Model-Based Offline Reinforcement Learning with Aravind Rajeswaran - #442
Today we close out our NeurIPS series joined by Aravind Rajeswaran, a PhD Student in machine learning and robotics at the University of Washington. At NeurIPS, Aravind presented his paper MOReL: Model-Based Offline Reinforcement Learning. In our conversation, we explore model-based reinforcement learning, and if models are a “prerequisite” to achieve something analogous to transfer learning. We also dig into MOReL and the recent progress in offline reinforcement learning, the differences in developing MOReL models and traditional RL models, and the theoretical results they’re seeing from this research. The complete show notes for this episode can be found at
Dec 28, 2020
Machine Learning as a Software Engineering Enterprise with Charles Isbell - #441
As we continue our NeurIPS 2020 series, we’re joined by friend-of-the-show Charles Isbell, Dean, John P. Imlay, Jr. Chair, and professor at the Georgia Tech College of Computing. Charles gave an Invited Talk at this year’s conference, You Can’t Escape Hyperparameters and Latent Variables: Machine Learning as a Software Engineering Enterprise. In our conversation, we explore the success of the Georgia Tech Online Masters program in CS, which now has over 11k students enrolled, and the importance of making the education accessible to as many people as possible. We spend quite a bit of time speaking about the impact machine learning is beginning to have on the world, and how we should move from thinking of ourselves as compiler hackers, and begin to see the possibilities and opportunities that have been ignored. We also touch on the fallout from Timnit Gebru being “resignated” and the importance of having diverse voices and different perspectives “in the room,” and what the future holds for machine learning as a discipline. The complete show notes for this episode can be found at
Dec 23, 2020
Natural Graph Networks with Taco Cohen - #440
Today we kick off our NeurIPS 2020 series joined by Taco Cohen, a Machine Learning Researcher at Qualcomm Technologies. In our conversation with Taco, we discuss his current research in equivariant networks and video compression using generative models, as well as his paper “Natural Graph Networks,” which explores the concept of “naturality, a generalization of equivariance” which suggests that weaker constraints will allow for a “wider class of architectures.” We also discuss some of Taco’s recent research on neural compression and a very interesting visual demo for equivariant CNNs that Taco and the Qualcomm team released during the conference. The complete show notes for this episode can be found at
Dec 21, 2020
Productionizing Time-Series Workloads at Siemens Energy with Edgar Bahilo Rodriguez - #439
Today we close out our re:Invent series joined by Edgar Bahilo Rodriguez, Lead Data Scientist in the industrial applications division of Siemens Energy. Edgar spoke at this year's re:Invent conference about Productionizing R Workloads, and the resurrection of R for machine learning and productionalization. In our conversation with Edgar, we explore the fundamentals of building a strong machine learning infrastructure, and how they’re breaking down applications and using mixed technologies to build models. We also discuss their industrial applications, including wind, power production management, managing systems intent on decreasing the environmental impact of pre-existing installations, and their extensive use of time-series forecasting across these use cases. The complete show notes can be found at
Dec 18, 2020
ML Feature Store at Intuit with Srivathsan Canchi - #438
Today we continue our re:Invent series with Srivathsan Canchi, Head of Engineering for the Machine Learning Platform team at Intuit. As we teased earlier this week, one of the major announcements coming from AWS at re:Invent was the release of the SageMaker Feature Store. To our pleasant surprise, we came to learn that our friends at Intuit are the original architects of this offering and partnered with AWS to productize it at a much broader scale. In our conversation with Srivathsan, we explore the focus areas that are supported by the Intuit machine learning platform across various teams, including QuickBooks and Mint, TurboTax, and Credit Karma, and his thoughts on why companies should be investing in feature stores. We also discuss why the concept of “feature store” has seemingly exploded in the last year, and how you know when your organization is ready to deploy one. Finally, we dig into the specifics of the feature store, including the popularity of GraphQL and why they chose to include it in their pipelines, the similarities (and differences) between the two versions of the store, and much more! The complete show notes for this episode can be found at
Dec 16, 2020
re:Invent Roundup 2020 with Swami Sivasubramanian - #437
Today we’re kicking off our annual re:Invent series joined by Swami Sivasubramanian, VP of Artificial Intelligence at AWS. During re:Invent last week, Amazon made a ton of announcements on the machine learning front, including quite a few advancements to SageMaker. In this roundup conversation, we discuss the motivation for hosting the first-ever machine learning keynote at the conference, a bunch of details surrounding tools like Pipelines for workflow management, Clarify for bias detection, and JumpStart for easy to use algorithms and notebooks, and many more. We also discuss the emphasis placed on DevOps and MLOps tools in these announcements, and how the tools are all interconnected. Finally, we briefly touch on the announcement of the AWS feature store, but be sure to check back later this week for a more in-depth discussion on that particular release! The complete show notes for this episode can be found at
Dec 14, 2020
Predictive Disease Risk Modeling at 23andMe with Subarna Sinha - #436
Today we’re joined by Subarna Sinha, Machine Learning Engineering Leader at 23andMe. 23andMe handles a massive amount of genomic data every year from its core ancestry business but also uses that data for disease prediction, which is the core use case we discuss in our conversation. Subarna talks us through an initial use case of creating an evaluation of polygenic scores, and how that led them to build an ML pipeline and platform. We talk through the tools and tech stack used for the operationalization of their platform, the use of synthetic data, the internal pushback that came along with the changes that were being made, and what’s next for her team and the platform. The complete show notes for this episode can be found at
Dec 11, 2020
Scaling Video AI at RTL with Daan Odijk - #435
Today we’re joined by Daan Odijk, Data Science Manager at RTL. In our conversation with Daan, we explore the RTL MLOps journey, and their need to put platform infrastructure in place for ad optimization and forecasting, personalization, and content understanding use cases. Daan walks us through some of the challenges on both the modeling and engineering sides of building the platform, as well as the inherent challenges of video applications. Finally, we discuss the current state of their platform, the benefits they’ve seen from having this infrastructure in place, and why building a custom platform was worth the investment. The complete show notes for this episode can be found at
Dec 09, 2020
Benchmarking ML with MLCommons w/ Peter Mattson - #434
Today we’re joined by Peter Mattson, General Chair at MLPerf, a Staff Engineer at Google, and President of MLCommons. In our conversation with Peter, we discuss MLCommons and MLPerf, the former an open engineering group with the goal of accelerating machine learning innovation, and the latter a set of standardized Machine Learning speed benchmarks used to measure things like model training speed and inference throughput. We explore the target user for the MLPerf benchmarks, the need for benchmarks in the ethics, bias, fairness space, and how they’re approaching this through the "People’s Speech" datasets. We also walk through the MLCommons best practices of getting a model into production, why it's so difficult, and how MLCube can make the process easier for researchers and developers. The complete show notes page for this episode can be found at
Dec 07, 2020
Deep Learning for NLP: From the Trenches with Charlene Chambliss - #433
Today we’re joined by Charlene Chambliss, Machine Learning Engineer at Primer AI. Charlene, who we also had the pleasure of hosting at NLP Office Hours during TWIMLfest, is back to share some of the work she’s been doing with NLP. In our conversation, we explore her experiences working with newer NLP models and tools like BERT and HuggingFace, as well as what she’s learned along the way with word embeddings, labeling tasks, debugging, and more. We also focus on a few of her projects, like her popular multi-lingual BERT project, and a COVID-19 classifier. Finally, Charlene shares her experience getting into data science and machine learning coming from a non-technical background, what the transition was like, and tips for people looking to make a similar shift.
Dec 03, 2020
Feature Stores for Accelerating AI Development - #432
In this special episode of the podcast, we're joined by Kevin Stumpf, Co-Founder and CTO of Tecton, Willem Pienaar, an engineering lead at Gojek and founder of the Feast Project, and Maxime Beauchemin, Founder & CEO of Preset, for a discussion on Feature Stores for Accelerating AI Development. In this panel discussion, Sam and our guests explored how organizations can increase value and decrease time-to-market for machine learning using feature stores, MLOps, and open source. We also discuss the main data challenges of AI/ML, and the role of the feature store in solving those challenges. The complete show notes for this episode can be found at
Nov 30, 2020
An Exploration of Coded Bias with Shalini Kantayya, Deb Raji and Meredith Broussard - #431
In this special edition of the podcast, we're joined by Shalini Kantayya, the director of Coded Bias, and Deb Raji and Meredith Broussard, who both contributed to the film. In this panel discussion, Sam and our guests explored the societal implications of the biases embedded within AI algorithms. The conversation discussed examples of AI systems with disparate impact across industries and communities, what can be done to mitigate this disparity, and opportunities to get involved. Our panelists Shalini, Meredith, and Deb each share insight into their experience working on and researching bias in AI systems and the oppressive and dehumanizing impact they can have on people in the real world. The complete show notes for this film can be found at
Nov 27, 2020
Common Sense as an Algorithmic Framework with Dileep George - #430
Today we’re joined by Dileep George, Founder and the CTO of Vicarious. Dileep, who was also a co-founder of Numenta, works at the intersection of AI research and neuroscience, and famously pioneered the hierarchical temporal memory. In our conversation, we explore the importance of mimicking the brain when looking to achieve artificial general intelligence, the nuance of “language understanding” and how all the tasks that fall underneath it are all interconnected, with or without language. We also discuss his work with Recursive Cortical Networks, Schema Networks, and what’s next on the path towards AGI!
Nov 23, 2020
Scaling Enterprise ML in 2020: Still Hard! with Sushil Thomas - #429
Today we’re joined by Sushil Thomas, VP of Engineering for Machine Learning at Cloudera. Over the summer, I had the pleasure of hosting Sushil and a handful of business leaders across industries at the Cloudera Virtual Roundtable. In this conversation with Sushil, we recap the roundtable, exploring some of the topics discussed and insights gained from those conversations. Sushil gives us a look at how COVID-19 has impacted business throughout the year, and how the pandemic is shaping enterprise decision making moving forward. We also discuss some of the key trends he’s seeing as organizations try to scale their machine learning and AI efforts, including understanding best practices, and learning how to hybridize the engineering side of ML with the scientific exploration of the tasks. Finally, we explore whether organizational models like hub vs centralized are still organization-specific or if that’s changed in recent years, as well as how to get and retain good ML talent with giant companies like Google and Microsoft looming large. The complete show notes for this episode can be found at
Nov 19, 2020
Enabling Clinical Automation: From Research to Deployment with Devin Singh - #428
Today we’re joined by Devin Singh, a Physician Lead for Clinical Artificial Intelligence & Machine Learning in Pediatric Emergency Medicine at the Hospital for Sick Children (SickKids) in Toronto, and Founder and CEO of HeroAI. In our conversation with Devin, we discuss some of the interesting ways that Devin is deploying machine learning within the SickKids hospital, the current structure of academic research, including how much research and publications are currently being incentivized, how few of those research projects actually make it to deployment, and how Devin is working to flip that system on its head. We also talk about his work at HeroAI, where he is commercializing and deploying his academic research to build out infrastructure and deploy AI solutions within hospitals, creating an automated pipeline with patients, caregivers, and EHS companies. Finally, we discuss Devin's thoughts on how he’d approach bias mitigation in these systems, and the importance of having proper stakeholder engagement and using design methodology when building ML systems. The complete show notes for this episode can be found at
Nov 16, 2020
Pixels to Concepts with Backpropagation w/ Roland Memisevic - #427
Today we’re joined by Roland Memisevic, return podcast guest and Co-Founder & CEO of Twenty Billion Neurons.  We last spoke to Roland in 2018, and just earlier this year TwentyBN made a sharp pivot to a surprising use case, a companion app called Fitness Ally, an interactive, personalized fitness coach on your phone.  In our conversation with Roland, we explore the progress TwentyBN has made on their goal of training deep neural networks to understand physical movement and exercise. We also discuss how they’ve taken their research on understanding video context and awareness and applied it in their app, including how recent advancements have allowed them to deploy their neural net locally while preserving privacy, and Roland’s thoughts on the enormous opportunity that lies in the merging of language and video processing. The complete show notes for this episode can be found at
Nov 12, 2020
Fighting Global Health Disparities with AI w/ Jon Wang - #426
Today we’re joined by Jon Wang, a medical student at UCSF, and former Gates Scholar and AI researcher at the Bill and Melinda Gates Foundation. In our conversation with Jon, we explore a few of the different ways he’s attacking various public health issues, including improving the electronic health records system through automating clinical order sets, and exploring how the lack of literature and AI talent in the non-profit and healthcare spaces, and bad data have further marginalized undersupported communities. We also discuss his work at the Gates Foundation, which included understanding how AI can be helpful in lower-resource and lower-income countries, and building digital infrastructure, and much more. The complete show notes for this episode can be found at  
Nov 09, 2020
Accessibility and Computer Vision - #425
Digital imagery is pervasive today. More than a billion images per day are produced and uploaded to social media sites, with many more embedded within websites, apps, digital documents, and eBooks. Engaging with digital imagery has become fundamental to participating in contemporary society, including education, the professions, e-commerce, civics, entertainment, and social interactions. However, most digital images remain inaccessible to the 39 million people worldwide who are blind. AI and computer vision technologies hold the potential to increase image accessibility for people who are blind, through technologies like automated image descriptions. The speakers share their perspectives as people who are both technology experts and are blind, providing insight into future directions for the field of computer vision for describing images and videos for people who are blind. To check out the video of this panel, visit here! The complete show notes for this episode can be found at
Nov 05, 2020
NLP for Equity Investing with Frank Zhao - #424
Today we’re joined by Frank Zhao, Senior Director of Quantamental Research at S&P Global Market Intelligence. In our conversation with Frank, we explore how he came to work at the intersection of ML and finance, and how he navigates the relationship between data science and domain expertise. We also discuss the rise of data science in the investment management space, examining the largely under-explored technique of using unstructured data to gain insights into equity investing, and the edge it can provide for investors. Finally, Frank gives us a look at how he uses natural language processing with textual data of earnings call transcripts and walks us through the entire pipeline. The complete show notes for this episode can be found at
Nov 02, 2020
The Future of Education and AI with Salman Khan - #423
In the final #TWIMLfest Keynote Interview, we’re joined by Salman Khan, Founder of Khan Academy. In our conversation with Sal, we explore the amazing origin story of the academy, and how coronavirus is shaping the future of education and remote and distance learning, for better and for worse. We also explore Sal’s perspective on machine learning and AI being used broadly in education, the potential of injecting a platform like Khan Academy with ML and AI for course recommendations, and if they’re planning on implementing these features in the future. Finally, Sal shares some great stories about the impact of community and opportunity, and what advice he has for learners within the TWIML community and beyond! The complete show notes for this episode can be found at
Oct 28, 2020
Why AI Innovation and Social Impact Go Hand in Hand with Milind Tambe - #422
In this special #TWIMLfest Keynote episode, we’re joined by Milind Tambe, Director of AI for Social Good at Google Research India, and Director of the Center for Research in Computation and Society (CRCS) at Harvard University. In our conversation, we explore Milind’s various research interests, most of which fall under the umbrella of AI for Social Impact, including his work in public health, both stateside and abroad, his conservation work in South Asia and Africa, and his thoughts on the ways that those interested in social impact can get involved.  The complete show notes for this episode can be found at
Oct 23, 2020
What's Next for fast.ai w/ Jeremy Howard - #421
In this special #TWIMLfest episode of the podcast, we’re joined by Jeremy Howard, Founder of fast.ai. In our conversation with Jeremy, we discuss his career path, including his journey through the consulting world and how those experiences led him down the path to ML education, his thoughts on the current state of the machine learning adoption cycle, and if we’re at maximum capacity for deep learning use and capability. Of course, we dig into the newest version of the framework and course, the reception of Jeremy’s book ‘Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD,’ and what’s missing from the machine learning education landscape. If you’ve missed our previous conversations with Jeremy, I encourage you to check them out here and here. The complete show notes for this episode can be found at
Oct 21, 2020
Feature Stores for MLOps with Mike del Balso - #420
Today we’re joined by Mike del Balso, co-Founder and CEO of Tecton.  Mike, who you might remember from our last conversation on the podcast, was a foundational member of the Uber team that created their ML platform, Michelangelo. Since his departure from the company in 2018, he has been busy building up Tecton, and their enterprise feature store.  In our conversation, Mike walks us through why he chose to focus on the feature store aspects of the machine learning platform, the journey, personal and otherwise, to operationalizing machine learning, and the capabilities that more mature platforms teams tend to look for or need to build. We also explore the differences between standalone components and feature stores, if organizations are taking their existing databases and building feature stores with them, and what a dynamic, always available feature store looks like in deployment.  Finally, we explore what sets Tecton apart from other vendors in this space, including enterprise cloud providers who are throwing their hat in the ring. The complete show notes for this episode can be found at Thanks to our friends at Tecton for sponsoring this episode of the podcast! Find out more about what they're up to at
Oct 19, 2020
Exploring Causality and Community with Suzana Ilić - #419
In this special #TWIMLfest episode, we’re joined by Suzana Ilić, a computational linguist at Causaly and founder of Machine Learning Tokyo (MLT). Suzana joined us as a keynote speaker to discuss the origins of the MLT community, but we cover a lot of ground in this conversation. We briefly discuss Suzana’s work at Causaly, touching on her experiences transitioning from linguist and domain expert to working with causal modeling, balancing her role as both product manager and leader of the development team for their causality extraction module, and the unique ways that she thinks about UI in relation to their product. We also spend quite a bit of time exploring MLT, including how they’ve achieved exponential growth within the community over the past few years and when Suzana knew MLT was moving beyond just a personal endeavor, her experiences publishing papers at major ML conferences as an independent organization, and what inspires her within the broader ML/AI community. And of course, we answer quite a few great questions from our live audience!
Oct 16, 2020
Decolonizing AI with Shakir Mohamed - #418
In this special #TWIMLfest edition of the podcast, we’re joined by Shakir Mohamed, a Senior Research Scientist at DeepMind. Shakir is also a leader of Deep Learning Indaba, a non-profit organization whose mission is to Strengthen African Machine Learning and Artificial Intelligence. In our conversation with Shakir, we discuss his recent paper ‘Decolonial AI,’ the distinction between decolonizing AI and ethical AI, while also exploring the origin of the Indaba, the phases of community, and much more. The complete show notes for this episode can be found at
Oct 14, 2020
Spatial Analysis for Real-Time Video Processing with Adina Trufinescu
Today we’re joined by Adina Trufinescu, Principal Program Manager at Microsoft, to discuss some of the computer vision updates announced at Ignite 2020.  We focus on the technical innovations that went into their recently announced spatial analysis software, and the software’s use cases including the movement of people within spaces, distance measurements (social distancing), and more.  We also discuss the ‘responsible AI guidelines’ put in place to curb bad actors potentially using this software for surveillance, what techniques are being used to do object detection and image classification, and the challenges to productizing this research.  The complete show notes for this episode can be found at
Oct 08, 2020
How Deep Learning has Revolutionized OCR with Cha Zhang - #416
Today we’re joined by Cha Zhang, a Partner Engineering Manager at Microsoft Cloud & AI. Cha’s work at MSFT is focused on exploring ways that new technologies can be applied to optical character recognition, or OCR, pushing the boundaries of what has been seen as an otherwise ‘solved’ problem. In our conversation with Cha, we explore some of the traditional challenges of doing OCR in the wild, and the ways in which deep learning algorithms are being applied to transform these solutions. We also discuss the difficulties of using an end-to-end pipeline for OCR work, if there is a semi-supervised framing that could be used for OCR, the role of techniques like neural architecture search, how advances in NLP could influence the advancement of OCR problems, and much more. The complete show notes for this episode can be found at
Oct 05, 2020
Machine Learning for Food Delivery at Global Scale - #415
In this special edition of the show, we discuss the various ways in which machine learning plays a role in helping businesses overcome their challenges in the food delivery space.  A few weeks ago Sam had the opportunity to moderate a panel at the Prosus AI Marketplace virtual event with Sandor Caetano of iFood, Dale Vaz of Swiggy, Nicolas Guenon of Delivery Hero, and Euro Beinat of Prosus.  In this conversation, panelists describe the application of machine learning to a variety of business use cases, including how they deliver recommendations, the unique ways they handle the logistics of deliveries, and fraud and abuse prevention.  The complete show notes for this episode can be found at
Oct 02, 2020
Open Source at Qualcomm AI Research with Jeff Gehlhaar and Zahra Koochak - #414
Today we're joined by Jeff Gehlhaar, VP of Technology at Qualcomm, and Zahra Koochak, Staff Machine Learning Engineer at Qualcomm AI Research. If you haven’t had a chance to listen to our first interview with Jeff, I encourage you to check it out here! In this conversation, we catch up with Jeff and Zahra to get an update on what the company has been up to since our last conversation, including the Snapdragon 865 chipset and Hexagon Neural Network Direct. We also discuss open-source projects like the AI efficiency toolkit and Tensor Virtual Machine compiler, and how these projects fit in the broader Qualcomm ecosystem. Finally, we talk through their vision for on-device federated learning. The complete show notes for this page can be found at
Sep 30, 2020
Visualizing Climate Impact with GANs w/ Sasha Luccioni - #413
Today we’re joined by Sasha Luccioni, a Postdoctoral Researcher at the MILA Institute, and moderator of our upcoming TWIMLfest Panel, ‘Machine Learning in the Fight Against Climate Change.’  We were first introduced to Sasha’s work through her paper on ‘Visualizing The Consequences Of Climate Change Using Cycle-consistent Adversarial Networks’, and we’re excited to pick her brain about the ways ML is currently being leveraged to help the environment. In our conversation, we explore the use of GANs to visualize the consequences of climate change, the evolution of different approaches she used, and the challenges of training GANs using an end-to-end pipeline. Finally, we talk through Sasha’s goals for the aforementioned panel, which is scheduled for Friday, October 23rd at 1 pm PT. Register for all of the great TWIMLfest sessions at! The complete show notes for this episode can be found at
Sep 28, 2020
ML-Powered Language Learning at Duolingo with Burr Settles - #412
Today we’re joined by Burr Settles, Research Director at Duolingo. Most would acknowledge that one of the most effective ways to learn is one on one with a tutor, and Duolingo’s main goal is to replicate that at scale. In our conversation with Burr, we dig into how the business model has changed over time, the properties that make a good tutor, and how those features translate to the AI tutor they’ve built. We also discuss the Duolingo English Test, and the challenges they’ve faced with maintaining the platform while adding languages and courses. Check out the complete show notes for this episode at
Sep 24, 2020
Bridging The Gap Between Machine Learning and the Life Sciences with Artur Yakimovich - #411
Today we’re joined by Artur Yakimovich, Co-Founder at Artificial Intelligence for Life Sciences and a visiting scientist in the Lab for Molecular Cell Biology at University College London. In our conversation with Artur, we explore the gulf that exists between life science researchers and the tools and applications used by computer scientists. While Artur’s background is in viral chemistry, he has since transitioned to a career in computational biology to “see where chemistry stopped, and biology started.” We discuss his work in that middle ground, looking at quite a few of his recent works applying deep learning and advanced neural networks like capsule networks to his research problems. Finally, we discuss his efforts building the Artificial Intelligence for Life Sciences community, a non-profit organization he founded to bring scientists from different fields together to share ideas and solve interdisciplinary problems. Check out the complete show notes at
Sep 21, 2020
Understanding Cultural Style Trends with Computer Vision w/ Kavita Bala - #410
Today we’re joined by Kavita Bala, the Dean of Computing and Information Science at Cornell University.  Kavita, whose research explores the overlap of computer vision and computer graphics, joined us to discuss a few of her projects, including GrokStyle, a startup that was recently acquired by Facebook and is currently being deployed across their Marketplace features. We also talk about StreetStyle/GeoStyle, projects focused on using social media data to find style clusters across the globe.  Kavita shares her thoughts on the privacy and security implications, progress with integrating privacy-preserving techniques into vision projects like the ones she works on, and what’s next for Kavita’s research. The complete show notes for this episode can be found at
Sep 17, 2020
That's a VIBE: ML for Human Pose and Shape Estimation with Nikos Athanasiou, Muhammed Kocabas, Michael Black - #409
Today we’re joined by Nikos Athanasiou, Muhammed Kocabas, Ph.D. students, and Michael Black, Director of the Max Planck Institute for Intelligent Systems.  We caught up with the group to explore their paper VIBE: Video Inference for Human Body Pose and Shape Estimation, which they submitted to CVPR 2020. In our conversation, we explore the problem that they’re trying to solve through an adversarial learning framework, the datasets (AMASS) that they’re building upon, the core elements that separate this work from its predecessors in this area of research, and the results they’ve seen through their experiments and testing.  The complete show notes for this episode can be found at Register for TWIMLfest today!
Sep 14, 2020
3D Deep Learning with PyTorch 3D w/ Georgia Gkioxari - #408
Today we’re joined by Georgia Gkioxari, a research scientist at Facebook AI Research.  Georgia was hand-picked by the TWIML community to discuss her work on the recently released open-source library PyTorch3D. In our conversation, Georgia describes her experiences as a computer vision researcher prior to the 2012 deep learning explosion, and how the entire landscape has changed since then.  Georgia walks us through the user experience of PyTorch3D, while also detailing who the target audience is, why the library is useful, and how it fits in the broad goal of giving computers better means of perception. Finally, Georgia gives us a look at what it’s like to be a co-chair for CVPR 2021 and the challenges with updating the peer review process for the larger academic conferences.  The complete show notes for this episode can be found at
Sep 10, 2020
What are the Implications of Algorithmic Thinking? with Michael I. Jordan - #407
Today we’re joined by the legendary Michael I. Jordan, Distinguished Professor in the Departments of EECS and Statistics at UC Berkeley.  Michael was gracious enough to connect us all the way from Italy after being named IEEE’s 2020 John von Neumann Medal recipient. In our conversation with Michael, we explore his career path, and how his influence from other fields like philosophy shaped his path.  We spend quite a bit of time discussing his current exploration into the intersection of economics and AI, and how machine learning systems could be used to create value and empowerment across many industries through “markets.” We also touch on the potential of “interacting learning systems” at scale, the valuation of data, the commoditization of human knowledge into computational systems, and much, much more. The complete show notes for this episode can be found at.
Sep 07, 2020
Beyond Accuracy: Behavioral Testing of NLP Models with Sameer Singh - #406
Today we’re joined by Sameer Singh, an assistant professor in the department of computer science at UC Irvine.  Sameer’s work centers on large-scale and interpretable machine learning applied to information extraction and natural language processing. We caught up with Sameer right after he was awarded the best paper award at ACL 2020 for his work on Beyond Accuracy: Behavioral Testing of NLP Models with CheckList. In our conversation, we explore CheckLists, the task-agnostic methodology for testing NLP models introduced in the paper. We also discuss how well we understand the cause of pitfalls or failure modes in deep learning models, Sameer’s thoughts on embodied AI, and his work on the now famous LIME paper, which he co-authored alongside Carlos Guestrin.  The complete show notes for this episode can be found at
Sep 03, 2020
How Machine Learning Powers On-Demand Logistics at Doordash with Gary Ren - #405
Today we’re joined by Gary Ren, a machine learning engineer for the logistics team at DoorDash.  In our conversation, we explore how machine learning powers the entire logistics ecosystem. We discuss the stages of their “marketplace,” and how using ML for optimized route planning and matching affects consumers, dashers, and merchants. We also talk through how they use traditional mathematics, classical machine learning, potential use cases for reinforcement learning frameworks, and challenges to implementing these explorations.   The complete show notes for this episode can be found at! Check out our upcoming event at
Aug 31, 2020
Machine Learning as a Software Engineering Discipline with Dillon Erb - #404
Today we’re joined by Dillon Erb, Co-founder & CEO of Paperspace. We’ve followed Paperspace since their origins offering GPU-enabled compute resources to data scientists and machine learning developers, to the release of their Jupyter-based Gradient service. Our conversation with Dillon centered on the challenges that organizations face building and scaling repeatable machine learning workflows, and how they’ve done this in their own platform by applying time-tested software engineering practices.  We also discuss the importance of reproducibility in production machine learning pipelines, how the processes and tools of software engineering map to the machine learning workflow, and technical issues that ML teams run into when trying to scale the ML workflow. The complete show notes for this episode can be found at
Aug 27, 2020
AI and the Responsible Data Economy with Dawn Song - #403
Today we’re joined by Professor of Computer Science at UC Berkeley, Dawn Song. Dawn’s research is centered at the intersection of AI, deep learning, security, and privacy. She’s currently focused on bringing these disciplines together with her startup, Oasis Labs.  In our conversation, we explore their goals of building a ‘platform for a responsible data economy,’ which would combine techniques like differential privacy, blockchain, and homomorphic encryption. The platform would give consumers more control of their data, and enable businesses to better utilize data in a privacy-preserving and responsible way.  We also discuss how to privatize and anonymize data in language models like GPT-3, real-world examples of adversarial attacks and how to train against them, her work on program synthesis to get towards AGI, and her work on privatizing coronavirus contact tracing data. The complete show notes for this episode can be found
Aug 24, 2020
Relational, Object-Centric Agents for Completing Simulated Household Tasks with Wilka Carvalho - #402
Today we’re joined by Wilka Carvalho, a PhD student at the University of Michigan, Ann Arbor. In our conversation, we focus on his paper ‘ROMA: A Relational, Object-Model Learning Agent for Sample-Efficient Reinforcement Learning.’ In the paper, Wilka explores the challenge of object interaction tasks, focusing on everyday, in-home functions. We discuss how he’s addressing the challenge of ‘object-interaction’ tasks, and the biggest obstacles he’s run into along the way.
Aug 20, 2020
Model Explainability Forum - #401
Today we bring you the latest Discussion Series: The Model Explainability Forum. Our group of experts and researchers explore the current state of explainability and discuss the key emerging ideas shaping the field. Each guest shares their unique perspective and contributions to thinking about model explainability in a practical way. We explore concepts like stakeholder-driven explainability, adversarial attacks on explainability methods, counterfactual explanations, legal and policy implications, and more.
Aug 17, 2020
What NLP Tells Us About COVID-19 and Mental Health with Johannes Eichstaedt - #400
Today we’re joined by Johannes Eichstaedt, an Assistant Professor of Psychology at Stanford University. In our conversation, we explore how Johannes applies his physics background to a career as a computational social scientist, some of the major patterns in the data that emerged over the first few months of lockdown, including mental health, social norms, and political patterns. We also explore how Johannes built the process, and the techniques he’s using to collect, sift through, and understand the da
Aug 13, 2020
Human-AI Collaboration for Creativity with Devi Parikh - #399
Today we’re joined by Devi Parikh, Associate Professor at the School of Interactive Computing at Georgia Tech, and research scientist at Facebook AI Research (FAIR). In our conversation, we touch on Devi’s definition of creativity, explore multiple ways that AI could impact the creative process for artists, and help humans become more creative. We investigate tools like casual creator for preference prediction, neuro-symbolic generative art, and visual journaling.
Aug 10, 2020
Neural Augmentation for Wireless Communication with Max Welling - #398
Today we’re joined by Max Welling, Vice President of Technologies at Qualcomm Netherlands, and Professor at the University of Amsterdam. In our conversation, we explore Max’s work in neural augmentation, and how it’s being deployed. We also discuss his work with federated learning and incorporating the technology on devices to give users more control over the privacy of their personal data. Max also shares his thoughts on quantum mechanics and the future of quantum neural networks for chip design.
Aug 06, 2020
Quantum Machine Learning: The Next Frontier? with Iordanis Kerenidis - #397
Today we're joined by Iordanis Kerenidis, Research Director CNRS Paris and Head of Quantum Algorithms at QC Ware. Iordanis was an ICML main conference Keynote speaker on the topic of Quantum ML, and we focus our conversation on his presentation, exploring the prospects and challenges of quantum machine learning, as well as the field’s history, evolution, and future. We’ll also discuss the foundations of quantum computing, and some of the challenges to consider for breaking into the field.
Aug 04, 2020
ML and Epidemiology with Elaine Nsoesie - #396
Today we continue our ICML series with Elaine Nsoesie, assistant professor at Boston University. In our conversation, we discuss the different ways that machine learning applications can be used to address global health issues, including infectious disease surveillance, and tracking search data for changes in health behavior in African countries. We also discuss COVID-19 epidemiology and the importance of recognizing how the disease is affecting people of different races and economic backgrounds.
Jul 30, 2020
Language (Technology) Is Power: Exploring the Inherent Complexity of NLP Systems with Hal Daumé III - #395
Today we’re joined by Hal Daumé III, professor at the University of Maryland and Co-Chair of the 2020 ICML Conference. We had the pleasure of catching up with Hal ahead of this year's ICML to discuss his research at the intersection of bias, fairness, NLP, and the effects language has on machine learning models, exploring language in two categories as they appear in machine learning models and systems: (1) How we use language to interact with the world, and (2) how we “do” language.
Jul 27, 2020
Graph ML Research at Twitter with Michael Bronstein - #394
Today we’re excited to be joined by return guest Michael Bronstein, Head of Graph Machine Learning at Twitter. In our conversation, we discuss the evolution of the graph machine learning space, his new role at Twitter, and some of the research challenges he’s faced, including scalability and working with dynamic graphs. Michael also dives into his work on differential graph modules for graph CNNs, and the various applications of this work.
Jul 23, 2020
Panel: The Great ML Language (Un)Debate! - #393
Today we’re excited to bring ‘The Great ML Language (Un)Debate’ to the podcast! In the latest edition of our series of live discussions, we brought together experts and enthusiasts to discuss both popular and emerging programming languages for machine learning, along with the strengths, weaknesses, and approaches offered by Clojure, JavaScript, Julia, Probabilistic Programming, Python, R, Scala, and Swift. We round out the session with an audience Q&A (58:28).
Jul 20, 2020
What the Data Tells Us About COVID-19 with Eric Topol - #392
Today we’re joined by Eric Topol, Director & Founder of the Scripps Research Translational Institute, and author of the book Deep Medicine. We caught up with Eric to talk through what we’ve learned about the coronavirus since its emergence, and the role of tech in understanding and preventing the spread of the disease. We also explore the broader opportunity for medical applications of AI, the promise of personalized medicine, and how techniques like federated learning can offer more privacy in healthc
Jul 16, 2020
The Case for Hardware-ML Model Co-design with Diana Marculescu - #391
Today we’re joined by Diana Marculescu, Professor of Electrical and Computer Engineering at UT Austin. We caught up with Diana to discuss her work on hardware-aware machine learning. In particular, we explore her keynote, “Putting the “Machine” Back in Machine Learning: The Case for Hardware-ML Model Co-design” from CVPR 2020. We explore how her research group is focusing on making models more efficient so that they run better on current hardware systems, and how they plan on achieving true co
Jul 13, 2020
Computer Vision for Remote AR with Flora Tasse - #390
Today we conclude our CVPR coverage joined by Flora Tasse, Head of Computer Vision & AI Research at Streem. Flora, a keynote speaker at the AR/VR workshop, walks us through some of the interesting use cases at the intersection of AI, CV, and AR technologies, her current work and the origin of her company Selerio, which was eventually acquired by Streem, the difficulties associated with building 3D mesh environments, extracting metadata from those environments, the challenges of pose estimation and more.
Jul 09, 2020
Deep Learning for Automatic Basketball Video Production with Julian Quiroga - #389
Today we're joined by Julian Quiroga, a Computer Vision Team Lead at Genius Sports, to discuss his recent paper “As Seen on TV: Automatic Basketball Video Production using Gaussian-based Actionness and Game States Recognition.” We explore camera setups and angles, detection and localization of figures on the court (players, refs, and of course, the ball), and the role that deep learning plays in the process. We also break down how this work applies to different sports, and the ways that he is looking to improve i
Jul 06, 2020
How External Auditing is Changing the Facial Recognition Landscape with Deb Raji - #388
Today we’re taking a break from our CVPR coverage to bring you this interview with Deb Raji, a Technology Fellow at the AI Now Institute. Recently there have been quite a few major news stories in the AI community, including the self-imposed moratorium on facial recognition tech from Amazon, IBM and Microsoft. In our conversation with Deb, we dig into these stories, discussing the origins of Deb’s work on the Gender Shades project, the harms of facial recognition, and much more.
Jul 02, 2020
AI for High-Stakes Decision Making with Hima Lakkaraju - #387
Today we’re joined by Hima Lakkaraju, an Assistant Professor at Harvard University. At CVPR, Hima was a keynote speaker at the Fair, Data-Efficient and Trusted Computer Vision Workshop, where she spoke on Understanding the Perils of Black Box Explanations. Hima talks us through her presentation, which focuses on the unreliability of explainability techniques that center perturbations, such as LIME or SHAP, as well as how attacks on these models can be carried out, and what they look like.
Jun 29, 2020
Invariance, Geometry and Deep Neural Networks with Pavan Turaga - #386
We continue our CVPR coverage with today’s guest, Pavan Turaga, Associate Professor at Arizona State University. Pavan gave a keynote presentation at the Differential Geometry in CV and ML Workshop, speaking on Revisiting Invariants with Geometry and Deep Learning. We go in-depth on Pavan’s research on integrating physics-based principles into computer vision. We also discuss the context of the term “invariant,” and Pavan contextualizes this work in relation to Hinton’s similar Capsule Network res
Jun 25, 2020
Channel Gating for Cheaper and More Accurate Neural Nets with Babak Ehteshami Bejnordi - #385
Today we’re joined by Babak Ehteshami Bejnordi, a Research Scientist at Qualcomm. Babak is currently focused on conditional computation, which is the main driver for today’s conversation. We dig into a few papers in great detail including one from this year’s CVPR conference, Conditional Channel Gated Networks for Task-Aware Continual Learning, covering how gates are used to drive efficiency and accuracy, while decreasing model size, how this research manifests into actual products, and more!
Jun 22, 2020
Machine Learning Commerce at Square with Marsal Gavalda - #384
Today we’re joined by Marsal Gavalda, head of machine learning for the Commerce platform at Square, where he manages the development of machine learning for various tools and platforms, including marketing, appointments, and above all, risk management. We explore how they manage their vast portfolio of projects, and how having an ML and technology focus at the outset of the company has contributed to their success, tips and best practices for internal democratization of ML, and much more.
Jun 18, 2020
Cell Exploration with ML at the Allen Institute w/ Jianxu Chen - #383
Today we’re joined by Jianxu Chen, a scientist at the Allen Institute for Cell Science. At the latest GTC conference, Jianxu presented his work on the Allen Cell Explorer Toolkit, an open-source project that allows users to do 3D segmentation of intracellular structures in fluorescence microscope images at high resolutions, making the images more accessible for data analysis. We discuss three of the major components of the toolkit: the cell image analyzer, the image generator, and the image visualizer
Jun 15, 2020
Neural Arithmetic Units & Experiences as an Independent ML Researcher with Andreas Madsen - #382
Today we’re joined by Andreas Madsen, an independent researcher based in Denmark. While we caught up with Andreas to discuss his ICLR spotlight paper, “Neural Arithmetic Units,” we also spend time exploring his experience as an independent researcher, discussing the difficulties of working with limited resources, the importance of finding peers to collaborate with, and tempering expectations of getting papers accepted to conferences -- something that might take a few tries to get right.
Jun 11, 2020
2020: A Critical Inflection Point for Responsible AI with Rumman Chowdhury - #381
Today we’re joined by Rumman Chowdhury, Managing Director and Global Lead of Responsible AI at Accenture. In our conversation with Rumman, we explored questions like:  • Why is now such a critical inflection point in the application of responsible AI? • How should engineers and practitioners think about AI ethics and responsible AI? • Why is AI ethics inherently personal and how can you define your own personal approach? • Is the implementation of AI governance necessarily authoritarian?
Jun 08, 2020
Panel: Advancing Your Data Science Career During the Pandemic - #380
Today we’re joined by Ana Maria Echeverri, Caroline Chavier, Hilary Mason, and Jacqueline Nolis, our guests for the recent Advancing Your Data Science Career During the Pandemic panel. In this conversation, we explore ways that Data Scientists and ML/AI practitioners can continue to advance their careers despite current challenges. Our panelists provide concrete tips, advice, and direction for those just starting out, those affected by layoffs, and those just wanting to move forward in their careers.
Jun 04, 2020
On George Floyd, Empathy, and the Road Ahead
Visit for resources to support organizations pushing for social equity like Black Lives Matter, and groups offering relief for those jailed for exercising their rights to peaceful protest. 
Jun 02, 2020
Engineering a Less Artificial Intelligence with Andreas Tolias - #379
Today we’re joined by Andreas Tolias, Professor of Neuroscience at Baylor College of Medicine. We caught up with Andreas to discuss his recent perspective piece, “Engineering a Less Artificial Intelligence,” which explores the shortcomings of state-of-the-art learning algorithms in comparison to the brain. The paper also offers several ideas about how neuroscience can lead the quest for better inductive biases by providing useful constraints on representations and network architecture.
May 28, 2020
Rethinking Model Size: Train Large, Then Compress with Joseph Gonzalez - #378
Today we’re joined by Joseph Gonzalez, Assistant Professor in the EECS department at UC Berkeley. In our conversation, we explore Joseph’s paper “Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers,” which looks at compute-efficient training strategies for models. We discuss the two main problems being solved: 1) How can we rapidly iterate on variations in architecture? And 2) If we make models bigger, is it really improving any efficiency?
May 25, 2020
The Physics of Data with Alpha Lee - #377
Today we’re joined by Alpha Lee, Winton Advanced Fellow in the Department of Physics at the University of Cambridge. Our conversation centers around Alpha’s research which can be broken down into three main categories: data-driven drug discovery, material discovery, and physical analysis of machine learning. We discuss the similarities and differences between drug discovery and material science, his startup, PostEra which offers medicinal chemistry as a service powered by machine learning, and much more
May 21, 2020
Is Linguistics Missing from NLP Research? w/ Emily M. Bender - #376 🦜
Today we’re joined by Emily M. Bender, Professor of Linguistics at the University of Washington. Our discussion covers a lot of ground, but centers on the question, "Is Linguistics Missing from NLP Research?" We explore if we would be making more progress, on more solid foundations, if more linguists were involved in NLP research, or is the progress we're making (e.g. with deep learning models like Transformers) just fine?
May 18, 2020
Disrupting DeepFakes: Adversarial Attacks Against Conditional Image Translation Networks with Nataniel Ruiz - #375
Today we’re joined by Nataniel Ruiz, a PhD Student at Boston University. We caught up with Nataniel to discuss his paper “Disrupting DeepFakes: Adversarial Attacks Against Conditional Image Translation Networks and Facial Manipulation Systems.” In our conversation, we discuss the concept of this work, as well as some of the challenging parts of implementing this work, potential scenarios in which this could be deployed, and the broader contributions that went into this work.
May 14, 2020
Understanding the COVID-19 Data Quality Problem with Sherri Rose - #374
Today we’re joined by Sherri Rose, Associate Professor at Harvard Medical School. We cover a lot of ground in our conversation, including the intersection of her research with the current COVID-19 pandemic, the importance of quality in datasets and rigor when publishing papers, and the pitfalls of using causal inference. We also touch on Sherri’s work in algorithmic fairness, the shift she’s seen in fairness conferences covering these issues in relation to healthcare research, and a few recent pape
May 11, 2020
The Whys and Hows of Managing Machine Learning Artifacts with Lukas Biewald - #373
Today we’re joined by Lukas Biewald, founder and CEO of Weights & Biases, to discuss their new tool Artifacts, an end-to-end pipeline tracker. In our conversation, we explore Artifacts’ place in the broader machine learning tooling ecosystem through the lens of our eBook “The definitive guide to ML Platforms” and how it fits with the W&B model management platform. We also discuss what exactly “Artifacts” are, what the tool is tracking, and take a look at the onboarding process for users.
May 07, 2020
Language Modeling and Protein Generation at Salesforce with Richard Socher - #372
Today we’re joined by Richard Socher, Chief Scientist and Executive VP at Salesforce. Richard and his team have published quite a few great projects lately, including CTRL: A Conditional Transformer Language Model for Controllable Generation, and ProGen, an AI Protein Generator, both of which we cover in-depth in this conversation. We also explore the balancing act between investments, product requirement research and otherwise at a large product-focused company like Salesforce.
May 04, 2020
AI Research at JPMorgan Chase with Manuela Veloso - #371
Today we’re joined by Manuela Veloso, Head of AI Research at J.P. Morgan Chase. Since moving from CMU to JP Morgan Chase, Manuela and her team established a set of seven lofty research goals. In this conversation we focus on the first three: building AI systems to eradicate financial crime, safely liberate data, and perfect client experience. We also explore Manuela’s background, including her time at CMU in the ‘80s, or as she describes it, the “mecca of AI,” and her founding role with RoboCup.
Apr 30, 2020
Panel: Responsible Data Science in the Fight Against COVID-19 - #370
In this discussion, we explore how data scientists and ML/AI practitioners can responsibly contribute to the fight against coronavirus and COVID-19. Four experts: Rex Douglass, Rob Munro, Lea Shanley, and Gigi Yuen-Reed shared a ton of valuable insight on the best ways to get involved. We've gathered all the resources that our panelists discussed during the conversation, you can find those at
Apr 29, 2020
Adversarial Examples Are Not Bugs, They Are Features with Aleksander Madry - #369
Today we’re joined by Aleksander Madry, Faculty in the MIT EECS Department, to discuss his paper “Adversarial Examples Are Not Bugs, They Are Features.” In our conversation, we talk through what we expect these systems to do, vs what they’re actually doing, if we’re able to characterize these patterns, and what makes them compelling, and if the insights from the paper will help inform opinions on either side of the deep learning debate.
Apr 27, 2020
AI for Social Good: Why "Good" isn't Enough with Ben Green - #368
Today we’re joined by Ben Green, PhD Candidate at Harvard and Research Fellow at the AI Now Institute at NYU. Ben’s research is focused on the social and policy impacts of data science, with a focus on algorithmic fairness and the criminal justice system. We discuss his paper “‘Good’ Isn’t Good Enough,” which explores the two things he feels are missing from data science and machine learning research: a grounded definition of what “good” actually means, and a “theory of change.”
Apr 23, 2020
The Evolution of Evolutionary AI with Risto Miikkulainen - #367
Today we’re joined by Risto Miikkulainen, Associate VP of Evolutionary AI at Cognizant AI. Risto joined us back on episode #47 to discuss evolutionary algorithms, and today we get an update on the latest on the topic. In our conversation, we discuss use cases for evolutionary AI and the latest approaches to deploying evolutionary models. We also explore his paper “Better Future through AI: Avoiding Pitfalls and Guiding AI Towards its Full Potential,” which digs into the historical evolution of AI.
Apr 20, 2020
Neural Architecture Search and Google’s New AutoML Zero with Quoc Le - #366
Today we’re super excited to share our recent conversation with Quoc Le, a research scientist at Google. Quoc joins us to discuss his work on Google’s AutoML Zero, semi-supervised learning, and the development of Meena, the multi-turn conversational chatbot. This was a really fun conversation, so much so that we decided to release the video! April 16th at 12 pm PT, Quoc and Sam will premiere the video version of this interview on Youtube, and answer your questions in the chat. We’ll see you there!
Apr 16, 2020
Automating Electronic Circuit Design with Deep RL w/ Karim Beguir - #365
Today we’re joined by return guest Karim Beguir, Co-Founder and CEO of InstaDeep. In our conversation, we chat with Karim about InstaDeep’s new offering, DeepPCB, an end-to-end platform for automated circuit board design. We discuss challenges and problems with some of the original iterations of auto-routers, how Karim defines circuit board “complexity,” the differences between reinforcement learning being used for games and in this use case, and their spotlight paper from NeurIPS.
Apr 13, 2020
Neural Ordinary Differential Equations with David Duvenaud - #364
Today we’re joined by David Duvenaud, Assistant Professor at the University of Toronto, to discuss his research on Neural Ordinary Differential Equations, a type of continuous-depth neural network. In our conversation, we talk through a few of David’s papers on the subject. We discuss the problem that David is trying to solve with this research, the potential that ODEs have to replace “the backbone” of the neural networks that are trained today, and David’s approach to engineering.
Apr 09, 2020
The Measure and Mismeasure of Fairness with Sharad Goel - #363
Today we’re joined by Sharad Goel, Assistant Professor at Stanford. Sharad, who also has appointments in the computer science, sociology, and law departments, has spent recent years focused on applying ML to understanding and improving public policy. In our conversation, we discuss Sharad’s extensive work on discriminatory policing, and The Stanford Open Policing Project. We also dig into Sharad’s paper “The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning.”
Apr 06, 2020
Simulating the Future of Traffic with RL w/ Cathy Wu - #362
Today we’re joined by Cathy Wu, Assistant Professor at MIT. We had the pleasure of catching up with Cathy to discuss her work applying RL to mixed autonomy traffic, specifically, understanding the potential impact autonomous vehicles would have on various mixed-autonomy scenarios. To better understand this, Cathy built multiple RL simulations, including track, intersection, and merge scenarios. We talk through how each scenario is set up, how human drivers are modeled, the results, and much more.
Apr 02, 2020
Consciousness and COVID-19 with Yoshua Bengio - #361
Today we’re joined by one of, if not the, most cited computer scientists in the world, Yoshua Bengio, Professor at the University of Montreal and the Founder and Scientific Director of MILA. We caught up with Yoshua to explore his work on consciousness, including how Yoshua defines consciousness, his paper “The Consciousness Prior,” as well as his current endeavor in building a COVID-19 tracing application, and the use of ML to propose experimental candidate drugs.
Mar 30, 2020
Geometry-Aware Neural Rendering with Josh Tobin - #360
Today we’re joined by Josh Tobin, Co-Organizer of the machine learning training program Full Stack Deep Learning. We had the pleasure of sitting down with Josh prior to his presentation of his paper Geometry-Aware Neural Rendering at NeurIPS. Josh's goal is to develop implicit scene understanding, building upon DeepMind's Neural scene representation and rendering work. We discuss challenges, the various datasets used to train his model, and the similarities between VAE training and his process, and mor
Mar 26, 2020
The Third Wave of Robotic Learning with Ken Goldberg - #359
Today we’re joined by Ken Goldberg, professor of engineering at UC Berkeley, focused on robotic learning. In our conversation with Ken, we chat about some of the challenges that arise when working on robotic grasping, including uncertainty in perception, control, and physics. We also discuss his view on the role of physics in robotic learning, and his thoughts on potential robot use cases, including the use of robots to assist in telemedicine, agriculture, and even COVID-19 testing.
Mar 23, 2020
Learning Visiolinguistic Representations with ViLBERT w/ Stefan Lee - #358
Today we’re joined by Stefan Lee, an assistant professor at Oregon State University. In our conversation, we focus on his paper ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. We discuss the development and training process for this model, the adaptation of the training process to incorporate additional visual information into BERT models, and where this research leads from the perspective of integration between visual and language tasks.
Mar 18, 2020
Upside-Down Reinforcement Learning with Jürgen Schmidhuber - #357
Today we’re joined by Jürgen Schmidhuber, Co-Founder and Chief Scientist of NNAISENSE, the Scientific Director at IDSIA, as well as a Professor of AI at USI and SUPSI in Switzerland. Jürgen’s lab is well known for creating the Long Short-Term Memory (LSTM) network, and in this conversation, we discuss some of the recent research coming out of his lab, namely Upside-Down Reinforcement Learning.
Mar 16, 2020
SLIDE: Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning with Beidi Chen - #356
Beidi Chen is part of the team that developed a cheaper, algorithmic, CPU alternative to state-of-the-art GPU machines. They presented their findings at NeurIPS 2019 and have since gained a lot of attention for their paper, SLIDE: In Defense of Smart Algorithms Over Hardware Acceleration for Large-Scale Deep Learning Systems. Beidi shares how the team took a new look at deep learning with the case of extreme classification by turning it into a search problem and using locality-sensitive hashing.
Mar 12, 2020
Advancements in Machine Learning with Sergey Levine - #355
Today we're joined by Sergey Levine, an Assistant Professor at UC Berkeley. We last heard from Sergey back in 2017, when we explored Deep Robotic Learning. Sergey and his lab’s recent efforts have been focused on contributing to a future where machines can be “out there in the real world, learning continuously through their own experience.” We caught up with Sergey at NeurIPS 2019, where Sergey and his team presented 12 different papers -- which means a lot of ground to cover!
Mar 09, 2020
Secrets of a Kaggle Grandmaster with David Odaibo - #354
Imagine spending years learning ML from the ground up, from its theoretical foundations, but still feeling like you didn’t really know how to apply it. That’s where David Odaibo found himself in 2015, after the second year of his PhD. David’s solution was Kaggle, a popular platform for data science competitions. Fast forward four years, and David is now a Kaggle Grandmaster, the highest designation, with particular accomplishment in computer vision competitions, and co-founder and CTO of Analytical
Mar 05, 2020
NLP for Mapping Physics Research with Matteo Chinazzi - #353
Predicting the future of science, particularly physics, is the task that Matteo Chinazzi, an associate research scientist at Northeastern University, focused on in his paper Mapping the Physics Research Space: a Machine Learning Approach. In addition to predicting the trajectory of physics research, Matteo is also active in the computational epidemiology field. His work in that area involves building simulators that can model the spread of diseases like Zika or the seasonal flu at a global scale.
Mar 02, 2020
Metric Elicitation and Robust Distributed Learning with Sanmi Koyejo - #352
The unfortunate reality is that many of the most commonly used machine learning metrics don't account for the complex trade-offs that come with real-world decision making. This is one of the challenges that Sanmi Koyejo, assistant professor at the University of Illinois, has dedicated his research to address. Sanmi applies his background in cognitive science, probabilistic modeling, and Bayesian inference to pursue his research which focuses broadly on “adaptive and robust machine learning.”
Feb 27, 2020
High-Dimensional Robust Statistics with Ilias Diakonikolas - #351
Today we’re joined by Ilias Diakonikolas, faculty in the CS department at the University of Wisconsin-Madison, and author of the paper Distribution-Independent PAC Learning of Halfspaces with Massart Noise, recipient of the NeurIPS 2019 Outstanding Paper award. The paper is regarded as the first progress made around distribution-independent learning with noise since the 80s. In our conversation, we explore robustness in ML, problems with corrupt data in high-dimensional settings, and of course, the paper.
Feb 24, 2020
How AI Predicted the Coronavirus Outbreak with Kamran Khan - #350
Today we’re joined by Kamran Khan, founder & CEO of BlueDot, and professor of medicine and public health at the University of Toronto. BlueDot has been the recipient of a lot of attention for being the first to publicly warn about the coronavirus that started in Wuhan. How did the company’s system of algorithms and data processing techniques help flag the potential dangers of the disease? In our conversation, Kamran talks us through how the technology works, its limits, and the motivation behind the wor
Feb 19, 2020
Turning Ideas into ML Powered Products with Emmanuel Ameisen - #349
Today we’re joined by Emmanuel Ameisen, machine learning engineer at Stripe, and author of the recently published book “Building Machine Learning Powered Applications: Going from Idea to Product.” In our conversation, we discuss structuring end-to-end machine learning projects, debugging and explainability in the context of models, the various types of models covered in the book, and the importance of post-deployment monitoring.
Feb 17, 2020
Algorithmic Injustices and Relational Ethics with Abeba Birhane - #348
Today we’re joined by Abeba Birhane, PhD Student at University College Dublin and author of the recent paper Algorithmic Injustices: Towards a Relational Ethics, which was the recipient of the Best Paper award at the 2019 Black in AI Workshop at NeurIPS. In our conversation, we break down the paper and the thought process around AI ethics, the “harm of categorization,” how ML generally doesn’t account for the ethics of various scenarios and how relational ethics could solve the issue, and much more.
Feb 13, 2020
AI for Agriculture and Global Food Security with Nemo Semret - #347
Today we’re excited to kick off our annual Black in AI Series joined by Nemo Semret, CTO of Gro Intelligence. Gro provides an agricultural data platform dedicated to improving global food security, focused on applying AI at macro scale. In our conversation with Nemo, we discuss Gro’s approach to data acquisition, how they apply machine learning to various problems, and their approach to modeling.
Feb 10, 2020
Practical Differential Privacy at LinkedIn with Ryan Rogers - #346
Today we’re joined by Ryan Rogers, Senior Software Engineer at LinkedIn, to discuss his paper “Practical Differentially Private Top-k Selection with Pay-what-you-get Composition.” In our conversation, we discuss how LinkedIn allows its data scientists to access aggregate user data for exploratory analytics while maintaining its users’ privacy through differential privacy, and the connection between a common algorithm for implementing differential privacy, the exponential mechanism, and Gumbel noise.
Feb 07, 2020
Networking Optimizations for Multi-Node Deep Learning on Kubernetes with Erez Cohen - #345
Today we conclude the KubeCon ‘19 series joined by Erez Cohen, VP of CloudX & AI at Mellanox, who we caught up with before his talk “Networking Optimizations for Multi-Node Deep Learning on Kubernetes.” In our conversation, we discuss NVIDIA’s recent acquisition of Mellanox, the evolution of technologies like RDMA and GPU Direct, how Mellanox is enabling Kubernetes and other platforms to take advantage of the recent advancements in networking tech, and why we should care about networking in Deep Lea
Feb 05, 2020
Managing Research Needs at the University of Michigan using Kubernetes w/ Bob Killen - #344
Today we’re joined by Bob Killen, Research Cloud Administrator at the University of Michigan. In our conversation, we explore how Bob and his group at UM are deploying Kubernetes, the user experience, and how those users are taking advantage of distributed computing. We also discuss if ML/AI focused Kubernetes users should fear that the larger non-ML/AI user base will negatively impact their feature needs, where gaps currently exist in trying to support these ML/AI users’ workloads, and more!
Feb 03, 2020
Scalable and Maintainable Workflows at Lyft with Flyte w/ Haytham AbuelFutuh and Ketan Umare - #343
Today we kick off our KubeCon ‘19 series joined by Haytham AbuelFutuh and Ketan Umare, a pair of software engineers at Lyft. We caught up with Haytham and Ketan at KubeCon, where they were presenting their newly open-sourced, cloud-native ML and data processing platform, Flyte. We discuss what prompted Ketan to undertake this project and his experience building Flyte, the core value proposition, what type systems mean for the user experience, how it relates to Kubeflow and how Flyte is used across Lyft.
Jan 30, 2020
Causality 101 with Robert Osazuwa Ness - #342
Today Robert Osazuwa Ness, ML Research Engineer at Gamalon and Instructor at Northeastern University, joins us to discuss causality, what it means, and how that meaning changes across domains and users. We also cover our upcoming study group based around his new course sequence, “Causal Modeling in Machine Learning,” for which you can find details at
Jan 27, 2020
PaccMann^RL: Designing Anticancer Drugs with Reinforcement Learning w/ Jannis Born - #341
Today we’re joined by Jannis Born, Ph.D. student at ETH & IBM Research Zurich, to discuss his “PaccMann^RL” research. Jannis details how his background in computational neuroscience applies to this research, how RL fits into the goal of anticancer drug discovery, the effect DL has had on his research, and of course, a step-by-step walkthrough of how the framework works to predict the sensitivity of cancer drugs on a cell and then discover new anticancer drugs.
Jan 23, 2020
Social Intelligence with Blaise Aguera y Arcas - #340
Today we’re joined by Blaise Aguera y Arcas, a distinguished scientist at Google. We had the pleasure of catching up with Blaise at NeurIPS last month, where he was invited to speak on “Social Intelligence.” In our conversation, we discuss his role at Google, his team’s approach to machine learning, and of course his presentation, in which he touches on today’s ML landscape, the gap between AI and ML/DS, the difference between intelligent systems and true intelligence, and much more.
Jan 20, 2020
Music & AI Plus a Geometric Perspective on Reinforcement Learning with Pablo Samuel Castro - #339
Today we’re joined by Pablo Samuel Castro, Staff Research Software Developer at Google. We cover a lot of ground in our conversation, including his love for music, and how that has guided his work on the Lyric AI project, and a few of his papers including “A Geometric Perspective on Optimal Representations for Reinforcement Learning” and “Estimating Policy Functions in Payments Systems using Deep Reinforcement Learning.”
Jan 16, 2020
Trends in Computer Vision with Amir Zamir - #338
Today we close out AI Rewind 2019 joined by Amir Zamir, who recently began his tenure as an Assistant Professor of Computer Science at the Swiss Federal Institute of Technology. Amir joined us back in 2018 to discuss his CVPR Best Paper winner, and in today’s conversation, we continue with the thread of Computer Vision. In our conversation, we discuss quite a few topics, including Vision-for-Robotics, the expansion of the field of 3D Vision, Self-Supervised Learning for CV Tasks, and much more!
Jan 13, 2020
Trends in Natural Language Processing with Nasrin Mostafazadeh - #337
Today we continue the AI Rewind 2019 joined by friend-of-the-show Nasrin Mostafazadeh, Senior AI Research Scientist at Elemental Cognition. We caught up with Nasrin to discuss the latest and greatest developments and trends in Natural Language Processing, including Interpretability, Ethics, and Bias in NLP, how large pre-trained models have transformed NLP research, and top tools and frameworks in the space.
Jan 09, 2020
Trends in Fairness and AI Ethics with Timnit Gebru - #336
Today we keep the 2019 AI Rewind series rolling with friend-of-the-show Timnit Gebru, a research scientist on the Ethical AI team at Google. A few weeks ago at NeurIPS, Timnit joined us to discuss the ethics and fairness landscape in 2019. In our conversation, we discuss diversification of NeurIPS, with groups like Black in AI, WiML and others taking huge steps forward, trends in the fairness community, quite a few papers, and much more.
Jan 06, 2020
Trends in Reinforcement Learning with Chelsea Finn - #335
Today we continue to review the year that was 2019 via our AI Rewind series, and do so with friend of the show Chelsea Finn, Assistant Professor in the CS Department at Stanford University. Chelsea’s research focuses on Reinforcement Learning, so we couldn’t think of a better person to join us to discuss the topic. In this conversation, we cover topics like Model-based RL, solving hard exploration problems, along with RL libraries and environments that Chelsea thought moved the needle last year.
Jan 02, 2020
Trends in Machine Learning & Deep Learning with Zack Lipton - #334
Today we kick off our 2019 AI Rewind Series joined by Zack Lipton, Professor at CMU. You might remember Zack from our conversation earlier this year, “Fairwashing” and the Folly of ML Solutionism. In today's conversation, Zack recaps advancements across the vast fields of Machine Learning and Deep Learning, including trends, tools, research papers and more. We want to hear from you! Send your thoughts on the year that was 2019 below in the comments, or via Twitter @samcharrington or @twimlai.
Dec 30, 2019
FaciesNet & Machine Learning Applications in Energy with Mohamed Sidahmed - #333
Today we close out our 2019 NeurIPS series with Mohamed Sidahmed, Machine Learning and Artificial Intelligence R&D Manager at Shell. In our conversation, we discuss two papers Mohamed and his team submitted to the conference this year, Accelerating Least Squares Imaging Using Deep Learning Techniques, and FaciesNet: Machine Learning Applications for Facies Classification in Well Logs. The show notes for this episode can be found at, where you’ll find links to both of these papers!
Dec 27, 2019
Machine Learning: A New Approach to Drug Discovery with Daphne Koller - #332
Today we’re joined by Daphne Koller, co-Founder and former co-CEO of Coursera and Founder and CEO of Insitro. In our conversation, we discuss the current landscape of pharmaceutical drugs and drug discovery, including the current pricing of drugs, and an overview of Insitro’s goal of using ML as a “compass” in drug discovery. We also explore how Insitro functions as a company, their focus on the biology of drug discovery and the landscape of ML techniques being used, Daphne’s thoughts on AutoML, and
Dec 26, 2019
Sensory Prediction Error Signals in the Neocortex with Blake Richards - #331
Today we continue our 2019 NeurIPS coverage, this time around joined by Blake Richards, Assistant Professor at McGill University and a Core Faculty Member at Mila. Blake was an invited speaker at the Neuro-AI Workshop, and presented his research on “Sensory Prediction Error Signals in the Neocortex.” In our conversation, we discuss a series of recent studies on two-photon calcium imaging. We talk predictive coding, hierarchical inference, and Blake’s recent work on memory systems for reinforcement lea
Dec 24, 2019
How to Know with Celeste Kidd - #330
Today we’re joined by Celeste Kidd, Assistant Professor at UC Berkeley, to discuss her invited talk “How to Know” which details her lab’s research about the core cognitive systems people use to guide their learning about the world. We explore why people are curious about some things but not others, and how past experiences and existing knowledge shape future interests, why people believe what they believe, and how these beliefs are influenced, and how machine learning figures into the equation.
Dec 23, 2019
Using Deep Learning to Predict Wildfires with Feng Yan - #329
Today we’re joined by Feng Yan, Assistant Professor at the University of Nevada, Reno to discuss ALERTWildfire, a camera-based network infrastructure that captures satellite imagery of wildfires. In our conversation, Feng details the development of the machine learning models and surrounding infrastructure. We also talk through problem formulation, challenges with using camera and satellite data in this use case, and how he has combined the use of IaaS and FaaS tools for cost-effectiveness and scalability
Dec 20, 2019
Advancing Machine Learning at Capital One with Dave Castillo - #328
Today we’re joined by Dave Castillo, Managing VP for ML at Capital One and head of their Center for Machine Learning. In our conversation, we explore Capital One’s transition from “lab-based” ML to enterprise-wide adoption and support of ML, surprising ML use cases, their current platform ecosystem, their design vision in building this into a larger, all-encompassing platform, pain points in building this platform, and much more.
Dec 19, 2019
Helping Fish Farmers Feed the World with Deep Learning w/ Bryton Shang - #327
Today we’re joined by Bryton Shang, Founder & CEO at Aquabyte, a company focused on the application of computer vision to various fish farming use cases. In our conversation, we discuss how Bryton identified the various problems associated with mass fish farming, challenges developing computer algorithms that can measure the height and weight of fish, assess issues like sea lice, and how they’re developing interesting new features such as facial recognition for fish!
Dec 17, 2019
Metaflow, a Human-Centric Framework for Data Science with Ville Tuulos - #326
Today we kick off our re:Invent 2019 series with Ville Tuulos, Machine Learning Infrastructure Manager at Netflix. At re:Invent, Netflix announced the open-sourcing of Metaflow, their “human-centric framework for data science.” In our conversation, we discuss all things Metaflow, including features, user experience, tooling, supported libraries, and much more. If you’re interested in checking out a Metaflow democast with Ville, reach out at!
Dec 13, 2019
Single Headed Attention RNN: Stop Thinking With Your Head with Stephen Merity - #325
Today we’re joined by Stephen Merity, an independent researcher focused on NLP and Deep Learning. In our conversation, we discuss Stephen’s latest paper, Single Headed Attention RNN: Stop Thinking With Your Head, detailing his primary motivations behind the paper, the decision to use SHA-RNNs for this research, how he built and trained the model, his approach to benchmarking, and finally his goals for the research in the broader research community.
Dec 12, 2019
Automated Model Tuning with SigOpt - #324
In this TWIML Democast, we're joined by SigOpt Co-Founder and CEO Scott Clark. Scott details the SigOpt platform, and gives us a live demo! This episode is best consumed by watching the corresponding video demo, which you can find at 
Dec 09, 2019
Automated Machine Learning with Erez Barak - #323
Today we’re joined by Erez Barak, Partner Group Manager of Azure ML at Microsoft. In our conversation, Erez gives us a full breakdown of his AutoML philosophy, and his take on the AutoML space, its role, and its importance. We also discuss the application of AutoML as a contributor to the end-to-end data science process, which Erez breaks down into 3 key areas: Featurization, Learner/Model Selection, and Tuning/Optimizing Hyperparameters. We also discuss post-deployment AutoML use cases, and much more!
Dec 06, 2019
Responsible AI in Practice with Sarah Bird - #322
Today we continue our Azure ML at Microsoft Ignite series joined by Sarah Bird, Principal Program Manager at Microsoft. At Ignite, Microsoft released new tools focused on responsible machine learning, which fall under the umbrella of the Azure ML ‘Machine Learning Interpretability Toolkit.’ In our conversation, Sarah walks us through this toolkit, detailing use cases and the user experience. We also discuss her work in differential privacy, and in the broader ML community, in particular, the MLSys conference.
Dec 04, 2019
Enterprise Readiness, MLOps and Lifecycle Management with Jordan Edwards - #321
Today we’re joined by Jordan Edwards, Principal Program Manager for MLOps on Azure ML at Microsoft. In our conversation, Jordan details how Azure ML accelerates model lifecycle management with MLOps, which enables data scientists to collaborate with IT teams to increase the pace of model development and deployment. We discuss various problems associated with generalizing ML at scale at Microsoft, what exactly MLOps is, the “four phases” along the journey of customer implementation of MLOps, and much m
Dec 02, 2019
DevOps for ML with Dotscience - #320
Today we’re joined by Luke Marsden, Founder and CEO of Dotscience. Luke walks us through the Dotscience platform and their manifesto on DevOps for ML. Thanks to Luke and Dotscience for their sponsorship of this Democast and their continued support of TWIML.   Head to to watch the full democast!
Nov 26, 2019
Building an Autonomous Knowledge Graph with Mike Tung - #319
Today we’re joined by Mike Tung, Founder and CEO of Diffbot. In our conversation, we discuss Diffbot’s Knowledge Graph, including how it differs from more mainstream use cases like Google Search and MSFT Bing. We also discuss the developer experience with the knowledge graph and other tools, like Extraction API and Crawlbot, challenges like knowledge fusion, balancing being a research company that is also commercially viable, and how they approach their role in the research community.
Nov 21, 2019
The Next Generation of Self-Driving Engineers with Aaron Ma - Talk #318
Today we’re joined by our youngest guest ever (by far), Aaron Ma, an 11-year-old middle school student and machine learning engineer in training. Aaron has completed over 80(!) Coursera courses and is the recipient of 3 Udacity nano-degrees. In our conversation, we discuss Aaron’s research interests in reinforcement learning and self-driving cars, his journey from programmer to ML engineer, his experiences participating in Kaggle competitions, and how he balances his passion for ML with day-to-day life.
Nov 18, 2019
Spiking Neural Networks: A Primer with Terrence Sejnowski - #317
On today’s episode, we’re joined by Terrence Sejnowski to discuss the ins and outs of spiking neural networks, including brain architecture, the relationship between neuroscience and machine learning, and ways to make NNs more efficient through spiking. Terry also gives us some insight into hardware used in this field, characterizes the major research problems currently being undertaken, and the future of spiking networks.
Nov 14, 2019
Bridging the Patient-Physician Gap with ML and Expert Systems w/ Xavier Amatriain - #316
Today we’re joined by return guest Xavier Amatriain, Co-founder and CTO of Curai, whose goal is to make healthcare accessible and scalable while bringing down costs. In our conversation, we touch on the shortcomings of traditional primary care, and how Curai fills that role, and some of the unique challenges his team faces in applying ML in the healthcare space. We also discuss the use of expert systems, how they train them, and how NLP projects like BERT and GPT-2 fit into what they’re building.
Nov 11, 2019
What Does it Mean for a Machine to "Understand"? with Thomas Dietterich - #315
Today we have the pleasure of being joined by Tom Dietterich, Distinguished Professor Emeritus at Oregon State University. Tom recently wrote a blog post titled “What does it mean for a machine to ‘understand’?”, and in our conversation, he goes into great detail on his thoughts. We cover a lot of ground, including Tom’s position in the debate, his thoughts on the role of systems like deep learning in potentially getting us to AGI, the “hype engine” around AI advancements, and so much more.
Nov 07, 2019
Scaling TensorFlow at LinkedIn with Jonathan Hung - #314
Today we’re joined by Jonathan Hung, Sr. Software Engineer at LinkedIn. Jonathan gave a talk at TensorFlow World last week titled Scaling TensorFlow at LinkedIn. In our conversation, we discuss their motivation for using TensorFlow on their pre-existing Hadoop cluster infrastructure, TonY, or TensorFlow on YARN, LinkedIn’s framework that natively runs deep learning jobs on Hadoop, and its relationship with Pro-ML, LinkedIn’s internal AI Platform, and their foray into using Kubernetes for research.
Nov 04, 2019
Machine Learning at GitHub with Omoju Miller - #313
Today we’re joined by Omoju Miller, a Sr. machine learning engineer at GitHub. In our conversation, we discuss: • Her dissertation, Hiphopathy, A Socio-Curricular Study of Introductory Computer Science, • Her work as an inaugural member of the GitHub machine learning team • Her two presentations at TensorFlow World, “Why is machine learning seeing exponential growth in its communities” and “Automating your developer workflow on GitHub with TensorFlow.”
Oct 31, 2019
Using AI to Diagnose and Treat Neurological Disorders with Archana Venkataraman - #312
Today we’re joined by Archana Venkataraman, John C. Malone Assistant Professor of Electrical and Computer Engineering at Johns Hopkins University. Archana’s research at the Neural Systems Analysis Laboratory focuses on developing tools, frameworks, and algorithms to better understand, and treat neurological and psychiatric disorders, including autism, epilepsy, and others. We explore her work applying machine learning to these problems, including biomarker discovery, disorder severity prediction and mor
Oct 28, 2019
Deep Learning for Earthquake Aftershock Patterns with Phoebe DeVries & Brendan Meade - #311
Today we are joined by Phoebe DeVries, Postdoctoral Fellow in the Department of Earth and Planetary Sciences at Harvard, and Brendan Meade, Professor of Earth and Planetary Sciences at Harvard. Phoebe and Brendan’s work is focused on discovering as much as possible about earthquakes before they happen, and by measuring how the earth’s surface moves, predicting future movement location, as seen in their paper: ‘Deep learning of aftershock patterns following large earthquakes’.
Oct 25, 2019
Live from TWIMLcon! Operationalizing Responsible AI - #310
An often-forgotten topic garnered high praise at TWIMLcon this month: operationalizing responsible and ethical AI. This important topic was combined with an impressive panel of speakers, including: Rachel Thomas, Director, Center for Applied Data Ethics at the USF Data Institute, Guillaume Saint-Jacques, Head of Computational Science at LinkedIn, and Parinaz Sobhani, Director of Machine Learning at Georgian Partners, moderated by Khari Johnson, Senior AI Staff Writer at VentureBeat.
Oct 22, 2019
Live from TWIMLcon! Scaling ML in the Traditional Enterprise - #309
Machine learning and AI are finding a place in the traditional enterprise - although the path to get there is different. In this episode, our panel analyzes the state and future of larger, more established brands. Hear from Amr Awadallah, Founder and Global CTO of Cloudera, Pallav Agrawal, Director of Data Science at Levi Strauss & Co., and Jürgen Weichenberger, Data Science Senior Principal & Global AI Lead at Accenture, moderated by Josh Bloom, Professor at UC Berkeley.
Oct 18, 2019
Live from TWIMLcon! Culture & Organization for Effective ML at Scale (Panel) - #308
TWIMLcon brought together so many in the ML/AI community to discuss the unique challenges of building and scaling machine learning platforms. In this episode, hear about changing the way companies think about machine learning from a diverse set of panelists including Pardis Noorzad, Data Science Manager at Twitter, Eric Colson, Chief Algorithms Officer Emeritus at Stitch Fix, and Jennifer Prendki, Founder & CEO at Alectio, moderated by Maribel Lopez, Founder & Principal Analyst at Lopez Research.
Oct 15, 2019
Live from TWIMLcon! Use-Case Driven ML Platforms with Franziska Bell - #307
Today we're joined by Franziska Bell, Ph.D., the Director of Data Science Platforms at Uber, who joined Sam on stage at TWIMLcon last week. Fran provided a look into the cutting-edge data science available company-wide at the push of a button. Since joining Uber, Fran has developed a portfolio of platforms, ranging from forecasting to conversational AI. Hear how use cases can strategically guide platform development, the evolving relationship between her team and Michelangelo (Uber’s ML Platform) and much more!
Oct 10, 2019
Live from TWIMLcon! Operationalizing ML at Scale with Hussein Mehanna - #306
The live interviews from TWIMLcon continue with Hussein Mehanna, Head of ML and AI at Cruise. From his start at Facebook to his current work at Cruise, Hussein has seen firsthand what it takes to scale and sustain machine learning programs. Hear him discuss the challenges (and joys) of working in the industry, his insight into analyzing scale when innovation is happening in parallel with development, his experiences at Facebook, Google, and Cruise, and his predictions for the future of ML platforms!
Oct 08, 2019
Live from TWIMLcon! Encoding Company Culture in Applied AI Systems - #305
In this episode, Sam is joined by Deepak Agarwal, VP of Engineering at LinkedIn, who graced the stage at TWIMLcon: AI Platforms for a keynote interview. Deepak shares the impact that standardizing processes and tools has on a company’s culture and productivity levels, and best practices for increasing ML ROI. He also details the Pro-ML initiative for delivering machine learning systems at scale, specifically looking at aligning improvement of tooling and infrastructure with the pace of innovation and more
Oct 04, 2019
Live from TWIMLcon! Overcoming the Barriers to Deep Learning in Production with Andrew Ng - #304
Earlier today, Andrew Ng joined us onstage at TWIMLcon - as the Founder and CEO of Landing AI and founding lead of Google Brain, Andrew is no stranger to what it takes for AI and machine learning to be successful. Hear about the work that Landing AI is doing to help organizations adopt modern AI, his experience in overcoming challenges for large companies, how enterprises can get the most value for their ML investment, as well as addressing the ‘essential complexity’ of software engineering.
Oct 01, 2019
The Future of Mixed-Autonomy Traffic with Alexandre Bayen - #303