Listen to a podcast, please open Podcast Republic app. Available on Google Play Store and Apple App Store.
| Episode | Date |
|---|---|
|
Ep 55: A general-purpose reasoning model just disproved an 80-year-old math conjecture by finding better constructions than the square grids mathematicians expected.
|
May 21, 2026 |
|
Ep 54: Gemini 3.5 Flash gives builders faster, cheaper frontier performance on coding and agent tasks while Gemini Omni adds real-time multimodal scene editing.
|
May 20, 2026 |
|
Ep 53: Anthropic's acquisition of StainlessAPI hands developers cleaner SDKs and MCP server tooling for building production agents today.
|
May 19, 2026 |
|
Ep 52: NVIDIA's 4-bit pretraining technique cuts memory needs for large hybrid models while keeping accuracy nearly identical to FP8.
|
May 18, 2026 |
|
Ep 51: Looking back at 6 episodes from 2026-05-11 to 2026-05-17 — the stories that mattered, what we learned, and what to watch next.
|
May 17, 2026 |
|
Ep 50: An agent that manages another agent just moved from research demo to production reality at Fin.
|
May 16, 2026 |
|
Ep 49: Codex just landed on your phone — OpenAI’s coding agent now runs natively on iOS and Android with Windows sync coming soon.
|
May 15, 2026 |
|
Ep 48: Faster pre-training without changing model architecture just became practical — Nous Research's Token Superposition Training cuts wall-clock time by up to 2.5x on models from 270M to 10B.
|
May 14, 2026 |
|
Ep 47: Real-time multimodal agents now run full-duplex perception and generation without external VAD or frozen states.
|
May 13, 2026 |
|
Ep 46: OpenAI's Daybreak pairs frontier models with Codex Security to let developers find, validate, and patch vulnerabilities earlier in the cycle.
|
May 12, 2026 |
|
Ep 45: Sakana AI and NVIDIA just showed how extreme sparsity in feed-forward layers can deliver real GPU speedups without retraining.
|
May 11, 2026 |
|
Ep 44: Looking back at 4 episodes from 2026-05-04 to 2026-05-10 — the stories that mattered, what we learned, and what to watch next.
|
May 10, 2026 |
|
Ep 43: DeepSeek V4's full paper reveals FP4 quantization-aware training running directly in late-stage MoE optimization with minimal quality loss.
|
May 09, 2026 |
|
Ep 42: OpenAI ships three specialized realtime audio models for voice agents, translation, and transcription.
|
May 08, 2026 |
|
Ep 41: OpenAI rolls out GPT-5.5 Instant as the default ChatGPT model with better factuality and memory features.
|
May 06, 2026 |
|
Ep 40: OpenAI gives 8,000 developers a month of 10x Codex rate limits after the GPT-5.5 party sold out.
|
May 05, 2026 |
|
Ep 39: Mistral AI launches a 128B model with remote agents and strong coding performance.
|
May 03, 2026 |
|
Ep 38: Anthropic gives defenders early access to Mythos Preview to patch AI cyber vulnerabilities before wider release.
|
May 01, 2026 |
|
Ep 37: DeepSeek's first native multimodal model drops in the LocalLLaMA community, finally giving the open-source whale vision capabilities.
|
Apr 29, 2026 |
|
Ep 36: Anthropic’s Claude Opus 4.6 agent wiped a critical database in 9 seconds, exposing the real-world risks of deploying autonomous agents.
|
Apr 27, 2026 |
|
Ep 35: Google DeepMind's Vision Banana shows image generation pretraining may be the true foundation model path for computer vision, beating SAM 3 on segmentation and Depth Anything V3 on metric depth.
|
Apr 25, 2026 |
|
Ep 34: Qwen3.6-27B paired with llama.cpp speculative decoding delivers 10x token speedups in real coding sessions, hitting 136 t/s on consumer hardware.
|
Apr 23, 2026 |
|
Ep 33: MetaComp just released the world's first dedicated AI agent governance framework built specifically for regulated financial services.
|
Apr 21, 2026 |
|
Ep 32: Qwen3.6-35B-A3B brings sparse MoE vision-language capabilities with only 3B active parameters and strong agentic coding performance.
|
Apr 17, 2026 |
|
Ep 31: Google DeepMind's Gemini Robotics-ER 1.6 upgrade delivers enhanced embodied reasoning and instrument reading for real-world robot control.
|
Apr 15, 2026 |
|
Ep 30: Aaron Levie declares the enterprise AI shift from chatbots to agents is now underway, moving beyond the "Chat Era."
|
Apr 13, 2026 |
|
Ep 29: Knowledge distillation now compresses full ensembles into single deployable models while preserving their collective intelligence.
|
Apr 11, 2026 |
|
Ep 28: Meta’s Muse Spark and a production-grade compiler-as-a-service approach for agents headline a day heavy on practical agent infrastructure.
|
Apr 09, 2026 |
|
Ep 27: Gemma 4 delivers massive gains across European languages while a 25.6M Rust model achieves 50× faster inference via hybrid attention.
|
Apr 07, 2026 |
|
Ep 26: AutoAgent autonomously optimizes its own harness using the same model to reach #1 on Terminal-Bench and financial modeling in under 24 hours.
|
Apr 05, 2026 |
|
Ep 25: Google drops Gemma 4, claiming the strongest small multimodal open model yet with dramatic gains across every benchmark compared to Gemma 3.
|
Apr 03, 2026 |
|
Models & Agents - Episode 24 - April 01, 2026
|
Apr 01, 2026 |
|
Ep 23: Alibaba Qwen just dropped Qwen3.5-Omni, a native end-to-end multimodal model built for text, audio, video, and realtime interaction.
|
Mar 31, 2026 |
|
Ep 22: Naver's Seoul World Model grounds video generation in real Street View geometry from over a million images and generalizes to other cities without fine-tuning.
|
Mar 29, 2026 |
|
Ep 21: New arXiv papers expose critical flaws in how we evaluate depression-detection models, LLM pruning, and verbalized confidence.
|
Mar 27, 2026 |
|
Ep 19: TrustFlow introduces topic-aware vector reputation for multi-agent systems, replacing scalar scores with queryable multi-dimensional vectors.
|
Mar 23, 2026 |
|
Ep 20: Fair zero-determinant strategies break in the periodic prisoner's dilemma, unlike the classic repeated version.
|
Mar 23, 2026 |
|
Ep 18: LlamaIndex drops LiteParse, a spatial PDF parser built specifically for agentic RAG workflows.
|
Mar 20, 2026 |
|
Ep 17: Picsart launches AI agent marketplace, starting with four agents and adding more weekly for creators.
|
Mar 17, 2026 |
|
Ep 16: RL agents scaled to 1,024 layers unlock emergent parkour skills from basic failures.
|
Mar 15, 2026 |
|
Ep 15: Google DeepMind's Aletheia agent autonomously advances from IMO math to professional research discoveries.
|
Mar 14, 2026 |
|
Ep 14: Perplexity launches "Personal Computer," a $200/month AI agent that automates emails, presentations, and app control 24/7.
|
Mar 13, 2026 |
|
Ep 13: Nvidia plans $26B investment in open-weight AI models to counter Chinese dominance and lock in developers.
|
Mar 12, 2026 |
|
Ep 12: Google unveils Gemini Embedding 2, a multimodal model embedding text, images, video, audio, and docs for advanced RAG systems.
|
Mar 11, 2026 |
|
Ep 11: Meta acquires Moltbook, a Reddit-like platform for AI agents to interact and collaborate.
|
Mar 10, 2026 |
|
Ep 10: Claude Opus 4.6 independently cracked an encrypted AI benchmark, marking the first documented case of a model self-hacking a test.
|
Mar 09, 2026 |
|
Ep 9: Meta's new research trains multimodal AI on unlabeled video, challenging assumptions about text-heavy scaling.
|
Mar 08, 2026 |
|
Ep 8: Anthropic's Claude AI discovered over 100 Firefox vulnerabilities that human testing missed for decades.
|
Mar 07, 2026 |
|
Ep 7: Liquid AI launches LFM2-24B-A2B model and LocalCowork app for fully local, privacy-first agent workflows.
|
Mar 06, 2026 |
|
Ep 6: YuanLab AI launches Yuan 3.0 Ultra, a 1T-parameter multimodal MoE model cutting parameters by 33% while boosting efficiency 49%.
|
Mar 05, 2026 |
|
Ep 5: FireRedTeam releases FireRed-OCR-2B, a 2B-parameter model tackling structural hallucinations in document parsing for tables and LaTeX.
|
Mar 02, 2026 |
|
Ep 4: Alibaba open-sources CoPaw, a workstation for scaling multi-channel AI agent workflows.
|
Mar 01, 2026 |
|
Ep 3: Perplexity open-sources embedding models that match Google and Alibaba performance at a fraction of the memory cost.
|
Feb 28, 2026 |
|
Ep 2: Sakana AI launches Doc-to-LoRA and Text-to-LoRA hypernetworks for zero-shot LLM adaptation to long contexts via natural language.
|
Feb 27, 2026 |
|
Ep 1: Anthropic acquires Vercept to enhance Claude's screen reading, while Google launches Nano Banana 2 for faster, cheaper image generation.
|
Feb 26, 2026 |