Embodied AI 101

By Shaoqing Tan

Category: Technology

Episodes: 67

Description

Stay in the loop on research in AI and physical intelligence.

Episode Date
AlexNet: The Deep Convolutional Network That Transformed Vision
May 23, 2026
A Few Useful Things to Know About Machine Learning
May 23, 2026
SimToolReal: A Universal Dexterous Tool-Use Policy
May 23, 2026
Mimic-Video: Learning Physics Priors from Web-Scale Video for Robot Dexterity
May 23, 2026
Deep Residual Learning for Image Recognition (ResNet)
May 23, 2026
Attention Is All You Need – The Transformer Revolution
May 23, 2026
NVIDIA Cosmos: World Foundation Models for Physical AI
May 20, 2026
LATENT: Teaching a Humanoid to Play Tennis from Imperfect Data
May 19, 2026
CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models
May 19, 2026
World Action Models: The Next Frontier in Embodied AI
May 19, 2026
Training a Whole-Body Control Foundation Model
May 18, 2026
DexJoCo: A Unified Benchmark for Task-Oriented Dexterous Manipulation
May 18, 2026
MMSkills: Building Multimodal Skill Libraries for Visual Agents
May 18, 2026
PhysBrain 1.0 VLA (TwinBrainVLA): Dual-Brain Vision-Language-Action with Physics-Grounded Learning
May 18, 2026
MolmoAct2-LIBERO: An Open Vision-Language-Action Model for Robotics
May 17, 2026
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Diffusion Transformers
May 17, 2026
WildClawBench: A Real-World, Long-Horizon Benchmark for AI Agents
May 17, 2026
MCP-Cosmos: Bring Your Own World Model
May 17, 2026
OpenAI o1: Teaching LLMs to Think Slow and Deep
May 17, 2026
The Llama 3 Herd of Models
May 17, 2026
LATENT: Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data
May 17, 2026
AnyFlow: Any-Step Video Diffusion for Predictive World Modeling
May 14, 2026
Robotics: The Endgame
May 14, 2026
Claw-Eval: Toward Trustworthy and Transparent Evaluation of Autonomous Agents
Apr 08, 2026
LIBERO-Para: Paraphrase Robustness in Robotic Manipulation
Apr 08, 2026
YOR: Your Own Mobile Manipulator for Generalizable Robotics
Apr 07, 2026
EgoSim: Egocentric World Simulator for Embodied Interaction Generation
Apr 07, 2026
Accelerating Video World Models: From Generative Videos to Real-Time Simulators
Apr 07, 2026
From Tokens to Thoughts: Continuous Latent Reasoning in Large Models and Robot Control
Apr 07, 2026
CaP-X: Coding Agents for Physical eXecution
Apr 06, 2026
DoRA: Weight-Decomposed Low-Rank Adaptation
Apr 06, 2026
AI Model Collapse: What Happens When AI Trains on Its Own Outputs
Apr 06, 2026
PhAIL: Benchmarking Vision-Language-Action Models on Real-World Bin-Picking
Apr 05, 2026
Co-training Large Behavior Models: Data Modalities and Training Strategies for Robot Manipulation
Apr 05, 2026
HyDRA: Hybrid Memory for Dynamic Video World Models
Apr 05, 2026
WildWorld: Dynamic World Modeling with Actions and Explicit State
Apr 04, 2026
Omni-WorldBench: Evaluating Interactive 4D World Models
Apr 04, 2026
SIMART: From Static Meshes to Sim-Ready Articulated Models
Apr 04, 2026
EgoSim: An Egocentric World Simulator for Embodied Interaction
Apr 04, 2026
Digit's New Motor Cortex: Sim-to-Real RL for Whole-Body Control
Apr 03, 2026
EgoNav: Diffusion-Based Humanoid Navigation from Human Egocentric Video
Apr 03, 2026
CaP-X: A Code-as-Policy Framework for Robot Manipulation
Apr 03, 2026
Embodied Intelligence Breakthrough: Generalist AI’s GEN-1 Robots
Apr 02, 2026
CaP-X: LMs' First Physical Exam
Apr 02, 2026
AI Model Collapse: The Danger of Training on AI-Generated Data
Mar 31, 2026
High-Level Automated Reasoning with Qwen2.5-7B
Mar 31, 2026
Co-Training Large Behavior Models: Multimodal Data for Robot Manipulation
Mar 31, 2026
HyDRA: Hybrid Memory for Dynamic Video World Models
Mar 30, 2026
DexWM: Leveraging Human Videos for Dexterous Robot World Models
Mar 30, 2026
World Models in Robotics
Mar 29, 2026
SIMART: Decomposing Monolithic Meshes into Sim-Ready Articulated Assets
Mar 28, 2026
LeWorldModel: A Stable JEPA World Model from Pixels
Mar 28, 2026
World Models for Robots: The Next Big Leap?
Mar 27, 2026
Harnessing Long-Running AI in Embodied Systems
Mar 27, 2026
HoMMI: Learning Whole-Body Mobile Manipulation from Human Demonstrations
Mar 26, 2026
TurboQuant: Redefining AI Efficiency with Extreme Compression
Mar 26, 2026
DexWM: Learning Dexterous Object Manipulation from Human Videos
Mar 25, 2026
FlashAttention-3: Fast & Accurate Attention with Asynchrony & Low-Precision
Mar 25, 2026
When AI Trains on Its Own Output: The Model Collapse Problem
Mar 25, 2026
MolmoBot: A Vision-Language Model for Zero-Shot Robot Manipulation
Mar 24, 2026
LeWorldModel: Stable End-to-End JEPA from Pixels
Mar 24, 2026
EgoVerse: An Egocentric Data Ecosystem for Scaling Robot Learning
Mar 24, 2026
HSImul3R: Physics-Driven Reconstruction of Human–Scene Interactions
Mar 24, 2026
MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation
Mar 23, 2026
DreamZero: World Action Models Are Zero-Shot Policies
Mar 23, 2026
Kinema4D: A 4D Generative Simulator for Embodied AI
Mar 23, 2026
VEGA-3D: Teaching Multimodal LLMs Spatial Reasoning Through Video Generation
Mar 23, 2026