Models & Agents

By Patrick

Category: Technology


Subscribers: 0
Reviews: 0
Episodes: 55

Description

Daily AI briefing covering new model releases, agent frameworks, and the latest developments in AI. Not the models and agents you are thinking about! Stay on top of the most exciting developments of our generation. AI Disclosure: This podcast is curated by Patrick but uses AI-generated voice synthesis for audio production.

Episodes (each title is followed by its release date)
Ep 55: A general-purpose reasoning model just disproved an 80-year-old math conjecture by finding better constructions than the square grids mathematicians expected.
May 21, 2026
Ep 54: Gemini 3.5 Flash gives builders faster, cheaper frontier performance on coding and agent tasks while Gemini Omni adds real-time multimodal scene editing.
May 20, 2026
Ep 53: Anthropic's acquisition of StainlessAPI hands developers cleaner SDKs and MCP server tooling for building production agents today.
May 19, 2026
Ep 52: NVIDIA's 4-bit pretraining technique cuts memory needs for large hybrid models while keeping accuracy nearly identical to FP8.
May 18, 2026
Ep 51: Looking back at 6 episodes from 2026-05-11 to 2026-05-17 — the stories that mattered, what we learned, and what to watch next.
May 17, 2026
Ep 50: An agent that manages another agent just moved from research demo to production reality at Fin.
May 16, 2026
Ep 49: Codex just landed on your phone — OpenAI’s coding agent now runs natively on iOS and Android with Windows sync coming soon.
May 15, 2026
Ep 48: Faster pre-training without changing model architecture just became practical — Nous Research's Token Superposition Training cuts wall-clock time by up to 2.5x on models from 270M to 10B.
May 14, 2026
Ep 47: Real-time multimodal agents now run full-duplex perception and generation without external VAD or frozen states.
May 13, 2026
Ep 46: OpenAI's Daybreak pairs frontier models with Codex Security to let developers find, validate, and patch vulnerabilities earlier in the cycle.
May 12, 2026
Ep 45: Sakana AI and NVIDIA just showed how extreme sparsity in feed-forward layers can deliver real GPU speedups without retraining.
May 11, 2026
Ep 44: Looking back at 4 episodes from 2026-05-04 to 2026-05-10 — the stories that mattered, what we learned, and what to watch next.
May 10, 2026
Ep 43: DeepSeek V4's full paper reveals FP4 quantization-aware training running directly in late-stage MoE optimization with minimal quality loss.
May 09, 2026
Ep 42: OpenAI ships three specialized realtime audio models for voice agents, translation, and transcription.
May 08, 2026
Ep 41: OpenAI rolls out GPT-5.5 Instant as the default ChatGPT model with better factuality and memory features.
May 06, 2026
Ep 40: OpenAI gives 8,000 developers a month of 10x Codex rate limits after the GPT-5.5 party sold out.
May 05, 2026
Ep 39: Mistral AI launches a 128B model with remote agents and strong coding performance.
May 03, 2026
Ep 38: Anthropic gives defenders early access to Mythos Preview to patch AI cyber vulnerabilities before wider release.
May 01, 2026
Ep 37: DeepSeek's first native multimodal model drops in the LocalLLaMA community, finally giving the open-source whale vision capabilities.
Apr 29, 2026
Ep 36: Anthropic’s Claude Opus 4.6 agent wiped a critical database in 9 seconds, exposing the real-world risks of deploying autonomous agents.
Apr 27, 2026
Ep 35: Google DeepMind's Vision Banana shows image generation pretraining may be the true foundation model path for computer vision, beating SAM 3 on segmentation and Depth Anything V3 on metric depth.
Apr 25, 2026
Ep 34: Qwen3.6-27B paired with llama.cpp speculative decoding delivers 10x token speedups in real coding sessions, hitting 136 t/s on consumer hardware.
Apr 23, 2026
Ep 33: MetaComp just released the world's first dedicated AI agent governance framework built specifically for regulated financial services.
Apr 21, 2026
Ep 32: Qwen3.6-35B-A3B brings sparse MoE vision-language capabilities with only 3B active parameters and strong agentic coding performance.
Apr 17, 2026
Ep 31: Google DeepMind's Gemini Robotics-ER 1.6 upgrade delivers enhanced embodied reasoning and instrument reading for real-world robot control.
Apr 15, 2026
Ep 30: Aaron Levie declares the enterprise AI shift from chatbots to agents is now underway, moving beyond the "Chat Era."
Apr 13, 2026
Ep 29: Knowledge distillation now compresses full ensembles into single deployable models while preserving their collective intelligence.
Apr 11, 2026
Ep 28: Meta’s Muse Spark and a production-grade compiler-as-a-service approach for agents headline a day heavy on practical agent infrastructure.
Apr 09, 2026
Ep 27: Gemma 4 delivers massive gains across European languages while a 25.6M Rust model achieves 50× faster inference via hybrid attention.
Apr 07, 2026
Ep 26: AutoAgent autonomously optimizes its own harness using the same model to reach #1 on Terminal-Bench and financial modeling in under 24 hours.
Apr 05, 2026
Ep 25: Google drops Gemma 4, claiming the strongest small multimodal open model yet with dramatic gains across every benchmark compared to Gemma 3.
Apr 03, 2026
Models & Agents - Episode 24 - April 01, 2026
Apr 01, 2026
Ep 23: Alibaba Qwen just dropped Qwen3.5-Omni, a native end-to-end multimodal model built for text, audio, video, and realtime interaction.
Mar 31, 2026
Ep 22: Naver's Seoul World Model grounds video generation in real Street View geometry from over a million images and generalizes to other cities without fine-tuning.
Mar 29, 2026
Ep 21: New arXiv papers expose critical flaws in how we evaluate depression-detection models, LLM pruning, and verbalized confidence.
Mar 27, 2026
Ep 20: Fair zero-determinant strategies break in the periodic prisoner's dilemma, unlike the classic repeated version.
Mar 23, 2026
Ep 19: TrustFlow introduces topic-aware vector reputation for multi-agent systems, replacing scalar scores with queryable multi-dimensional vectors.
Mar 23, 2026
Ep 18: LlamaIndex drops LiteParse, a spatial PDF parser built specifically for agentic RAG workflows.
Mar 20, 2026
Ep 17: Picsart launches AI agent marketplace, starting with four agents and adding more weekly for creators.
Mar 17, 2026
Ep 16: RL agents scaled to 1,024 layers unlock emergent parkour skills from basic failures.
Mar 15, 2026
Ep 15: Google DeepMind's Aletheia agent autonomously advances from IMO math to professional research discoveries.
Mar 14, 2026
Ep 14: Perplexity launches "Personal Computer," a $200/month AI agent that automates emails, presentations, and app control 24/7.
Mar 13, 2026
Ep 13: Nvidia plans $26B investment in open-weight AI models to counter Chinese dominance and lock in developers.
Mar 12, 2026
Ep 12: Google unveils Gemini Embedding 2, a multimodal model embedding text, images, video, audio, and docs for advanced RAG systems.
Mar 11, 2026
Ep 11: Meta acquires Moltbook, a Reddit-like platform for AI agents to interact and collaborate.
Mar 10, 2026
Ep 10: Claude Opus 4.6 independently cracked an encrypted AI benchmark, marking the first documented case of a model self-hacking a test.
Mar 09, 2026
Ep 9: Meta's new research trains multimodal AI on unlabeled video, challenging assumptions about text-heavy scaling.
Mar 08, 2026
Ep 8: Anthropic's Claude AI discovered over 100 Firefox vulnerabilities that human testing missed for decades.
Mar 07, 2026
Ep 7: Liquid AI launches LFM2-24B-A2B model and LocalCowork app for fully local, privacy-first agent workflows.
Mar 06, 2026
Ep 6: YuanLab AI launches Yuan 3.0 Ultra, a 1T-parameter multimodal MoE model cutting parameters by 33% while boosting efficiency 49%.
Mar 05, 2026
Ep 5: FireRedTeam releases FireRed-OCR-2B, a 2B-parameter model tackling structural hallucinations in document parsing for tables and LaTeX.
Mar 02, 2026
Ep 4: Alibaba open-sources CoPaw, a workstation for scaling multi-channel AI agent workflows.
Mar 01, 2026
Ep 3: Perplexity open-sources embedding models that match Google and Alibaba performance at a fraction of the memory cost.
Feb 28, 2026
Ep 2: Sakana AI launches Doc-to-LoRA and Text-to-LoRA hypernetworks for zero-shot LLM adaptation to long contexts via natural language.
Feb 27, 2026
Ep 1: Anthropic acquires Vercept to enhance Claude's screen reading, while Google launches Nano Banana 2 for faster, cheaper image generation.
Feb 26, 2026