AI Breakdown Podcast Republic

AI Breakdown

By agibreakdown

To listen to this podcast, please open the Podcast Republic app, available on the Google Play Store and Apple App Store.

Category: Education

Subscribers: 8
Reviews: 0
Episodes: 400

Description

The podcast where we use AI to break down recent AI papers and provide simplified explanations of intricate AI topics for educational purposes. The content presented here is generated automatically using LLM and text-to-speech technologies. While every effort is made to ensure accuracy, any misrepresentations or inaccuracies are unintentional and stem from the evolving technology. We value your feedback, which helps us improve the podcast and provide you with the best possible learning experience.

Episode	Date
Beyond Language Modeling: An Exploration of Multimodal Pretraining	Mar 06, 2026
Mode Seeking meets Mean Seeking for Fast Long Video Generation	Mar 04, 2026
Recursive Language Models	Mar 04, 2026
PaperBanana: Automating Academic Illustration for AI Scientists	Feb 10, 2026
World-Gymnast: Training Robots with Reinforcement Learning in a World Model	Feb 10, 2026
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory	Jan 29, 2026
Self-Rewarding Language Models	Jan 08, 2026
On the generalization of language models from in-context learning and finetuning: a controlled study	Jan 05, 2026
OpenThoughts: Data Recipes for Reasoning Models	Dec 16, 2025
Nested Learning: The Illusion of Deep Learning Architecture	Dec 13, 2025
ARC Is a Vision Problem!	Dec 09, 2025
Solving a Million-Step LLM Task with Zero Errors	Dec 09, 2025
DataRater: Meta-Learned Dataset Curation	Dec 05, 2025
Mathematical exploration and discovery at scale	Nov 15, 2025
Kosmos: An AI Scientist for Autonomous Discovery	Nov 12, 2025
World Simulation with Video Foundation Models for Physical AI	Nov 08, 2025
Towards Robust Mathematical Reasoning	Nov 06, 2025
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models	Nov 04, 2025
Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models	Oct 28, 2025
ImpossibleBench: Measuring LLMs’ Propensity of Exploiting Test Cases	Oct 27, 2025
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset	Oct 27, 2025
Reasoning with Sampling: Your Base Model is Smarter Than You Think	Oct 23, 2025
DeepSeek-OCR: Contexts Optical Compression	Oct 21, 2025
The Markovian Thinker	Oct 16, 2025
DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL	Oct 08, 2025
Towards a Physics Foundation Model	Oct 03, 2025
Scalable Option Learning in High-Throughput Environments	Sep 30, 2025
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning	Sep 24, 2025
Reverse-Engineered Reasoning for Open-Ended Generation	Sep 19, 2025
Scaling Performance of Large Language Model Pretraining	Sep 16, 2025
General Social Agents	Sep 15, 2025
We need a new ethics for a world of AI agents	Sep 12, 2025
Hierarchical Reasoning Model	Sep 11, 2025
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts	Sep 10, 2025
Small Language Models are the Future of Agentic AI	Sep 09, 2025
Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents	Sep 08, 2025
Why Language Models Hallucinate	Sep 07, 2025
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens	Aug 19, 2025
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models	Aug 15, 2025
Persona Vectors: Monitoring and Controlling Character Traits in Language Models	Aug 13, 2025
Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning	Aug 01, 2025
Position: The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards	Jul 31, 2025
Working with AI: Measuring the Occupational Implications of Generative AI	Jul 31, 2025
Towards physician-centered oversight of conversational diagnostic AI	Jul 30, 2025
Learning without training: The implicit dynamics of in-context learning	Jul 28, 2025
Aime: Towards Fully-Autonomous Multi-Agent Framework	Jul 25, 2025
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation	Jul 23, 2025
4KAgent: Agentic Any Image to 4K Super-Resolution	Jul 18, 2025
Critiques of World Models	Jul 16, 2025
Arxiv paper - Expert-level validation of AI-generated medical text with scalable language models	Jul 15, 2025
Arxiv paper - ImplicitQA: Going beyond frames towards Implicit Video Reasoning	Jul 11, 2025
Arxiv paper - BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing	Jul 08, 2025
Arxiv paper - Strategic Intelligence in Large Language Models: Evidence from evolutionary Game Theory	Jul 08, 2025
Blogpost paper - Project Vend: Can Claude run a small shop? (And why does that matter?)	Jul 02, 2025
Arxiv paper - Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens	Jul 02, 2025
Arxiv paper - SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing	Jun 30, 2025
Arxiv paper - OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization	Jun 27, 2025
Arxiv paper - Long-Context State-Space Video World Models	Jun 25, 2025
Arxiv paper - From Bytes to Ideas: Language Modeling with Autoregressive U-Nets	Jun 24, 2025
Arxiv paper - Reinforcement Pre-Training	Jun 20, 2025
Arxiv paper - Token-Efficient Long Video Understanding for Multimodal LLMs	Jun 18, 2025
Arxiv paper - The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity	Jun 11, 2025
Arxiv paper - Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models	Jun 09, 2025
Arxiv paper - How much do language models memorize?	Jun 06, 2025
Arxiv paper - MMaDA: Multimodal Large Diffusion Language Models	Jun 03, 2025
Arxiv paper - Superhuman performance of a large language model on the reasoning tasks of a physician	Jun 03, 2025
Arxiv paper - The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models	May 29, 2025
Arxiv paper - DanceGRPO: Unleashing GRPO on Visual Generation	May 28, 2025
Arxiv paper - Visual Planning: Let’s Think Only with Images	May 21, 2025
Arxiv paper - A Preliminary Study for GPT-4o on Image Restoration	May 14, 2025
Arxiv paper - DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion	May 12, 2025
Arxiv paper - RayZer: A Self-supervised Large View Synthesis Model	May 09, 2025
Arxiv paper - Reinforcement Learning for Reasoning in Large Language Models with One Training Example	May 08, 2025
Arxiv paper - MINERVA: Evaluating Complex Video Reasoning	May 06, 2025
Arxiv paper - The Leaderboard Illusion	May 06, 2025
Arxiv paper - Towards Understanding Camera Motions in Any Video	May 05, 2025
Arxiv paper - Describe Anything: Detailed Localized Image and Video Captioning	Apr 29, 2025
Arxiv paper - MCNC: MANIFOLD-CONSTRAINED REPARAMETERIZATION FOR NEURAL COMPRESSION	Apr 28, 2025
Arxiv paper - Self-Improving Robust Preference Optimization	Apr 23, 2025
Arxiv paper - LLM Post-Training: A Deep Dive into Reasoning Large Language Models	Apr 22, 2025
Arxiv paper - Welcome to the Era of Experience	Apr 21, 2025
Arxiv paper - MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation	Apr 19, 2025
Arxiv paper - InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models	Apr 17, 2025
Arxiv paper - EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise	Apr 16, 2025
Arxiv paper - TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning	Apr 16, 2025
Arxiv paper - Reasoning Models Don’t Always Say What They Think	Apr 09, 2025
Arxiv paper - Slow-Fast Architecture for Video Multi-Modal Large Language Models	Apr 07, 2025
Arxiv paper - TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes	Apr 04, 2025
Arxiv paper - VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning	Apr 01, 2025
Arxiv paper - SynCity: Training-Free Generation of 3D Worlds	Mar 28, 2025
Arxiv paper - HD-EPIC: A Highly-Detailed Egocentric Video Dataset	Mar 26, 2025
Arxiv paper - Video-T1: Test-Time Scaling for Video Generation	Mar 25, 2025
Arxiv paper - Calibrated Multi-Preference Optimization for Aligning Diffusion Models	Mar 24, 2025
Arxiv paper - Personalize Anything for Free with Diffusion Transformer	Mar 21, 2025
Arxiv paper - Story-Adapter: A Training-free Iterative Framework for Long Story Visualization	Mar 20, 2025
Arxiv paper - ReCamMaster: Camera-Controlled Generative Rendering from A Single Video	Mar 18, 2025
Arxiv paper - Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models	Mar 17, 2025
Arxiv paper - MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks	Mar 13, 2025
Arxiv paper - TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models	Mar 12, 2025
Arxiv paper - PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving	Mar 11, 2025
Arxiv paper - VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing	Mar 08, 2025
Arxiv paper - ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models	Mar 04, 2025
Arxiv paper - Teaching Language Models to Critique via Reinforcement Learning	Mar 03, 2025
Arxiv paper - PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling	Feb 27, 2025
Arxiv paper - VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation	Feb 24, 2025
Arxiv paper - Heuristically Adaptive Diffusion-Model Evolutionary Strategy	Feb 22, 2025
Arxiv paper - Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach	Feb 20, 2025
Arxiv paper - EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents	Feb 19, 2025
Arxiv paper - VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection	Feb 14, 2025
Arxiv paper - VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models	Feb 13, 2025
Arxiv paper - HunyuanVideo: A Systematic Framework For Large Video Generative Models	Feb 12, 2025
Arxiv paper - s1: Simple test-time scaling	Feb 10, 2025
Arxiv paper - Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation	Feb 07, 2025
Arxiv paper - MatAnyone: Stable Video Matting with Consistent Memory Propagation	Feb 07, 2025
Arxiv paper - Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate	Feb 03, 2025
Arxiv paper - Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs	Jan 31, 2025
Arxiv paper - MetaMorph: Multimodal Understanding and Generation via Instruction Tuning	Jan 30, 2025
Arxiv paper - Improving Video Generation with Human Feedback	Jan 29, 2025
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling	Jan 28, 2025
Arxiv paper - DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning	Jan 27, 2025
Arxiv paper - Can We Generate Images with CoT? Let’s Verify and Reinforce Image Generation Step by Step	Jan 24, 2025
Arxiv paper - Improving Factuality with Explicit Working Memory	Jan 23, 2025
Arxiv paper - Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control	Jan 17, 2025
Arxiv paper - FaceLift: Single Image to 3D Head with View Generation and GS-LRM	Jan 13, 2025
Arxiv paper - GenHMR: Generative Human Mesh Recovery	Jan 08, 2025
Arxiv paper - Video Creation by Demonstration	Jan 06, 2025
Arxiv paper - Byte Latent Transformer: Patches Scale Better Than Tokens	Jan 02, 2025
Arxiv paper - Align3R: Aligned Monocular Depth Estimation for Dynamic Videos	Dec 17, 2024
Arxiv paper - FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion	Dec 17, 2024
Arxiv paper - ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis	Dec 11, 2024
Arxiv paper - o1-Coder: an o1 Replication for Coding	Dec 10, 2024
Arxiv paper - DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning	Dec 06, 2024
ICLR 2025 submission - CYCLE-CONSISTENT LEARNING FOR JOINT LAYOUT-TO-IMAGE GENERATION AND OBJECT DETECTION	Dec 03, 2024
Arxiv Paper - WonderWorld: Interactive 3D Scene Generation from a Single Image	Nov 26, 2024
Arxiv Paper - Hymba: A Hybrid-head Architecture for Small Language Models	Nov 22, 2024
Arxiv Paper - Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation	Nov 21, 2024
Arxiv Paper - Video Instruction Tuning With Synthetic Data	Nov 20, 2024
Arxiv Paper - Generative Agent Simulations of 1,000 People	Nov 19, 2024
NeurIPS 2024 - Moving Off-the-Grid: Scene-Grounded Video Representations	Nov 15, 2024
Arxiv Paper - Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution	Nov 14, 2024
Arxiv Paper - FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality	Nov 13, 2024
Arxiv Paper - Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA	Nov 11, 2024
Arxiv Paper - Long Context RAG Performance of Large Language Models	Nov 08, 2024
Arxiv Paper - NVLM: Open Frontier-Class Multimodal LLMs	Nov 05, 2024
Arxiv Paper - ColPali: Efficient Document Retrieval with Vision Language Models	Nov 01, 2024
Arxiv Paper - Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models	Oct 31, 2024
Arxiv Paper - Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization	Oct 31, 2024
Arxiv Paper - Unbounded: A Generative Infinite Game of Character Life Simulation	Oct 29, 2024
Arxiv Paper - Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can’t Answer?	Oct 28, 2024
Arxiv Paper - LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding	Oct 25, 2024
Arxiv Paper - When Does Perceptual Alignment Benefit Vision Representations?	Oct 23, 2024
Arxiv paper - SceneCraft: Layout-Guided 3D Scene Generation	Oct 22, 2024
arxiv preprint - A Tale of Tails: Model Collapse as a Change of Scaling Laws	Oct 18, 2024
arxiv preprint - Thinking LLMs: General Instruction Following with Thought Generation	Oct 17, 2024
arxiv preprint - Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think	Oct 16, 2024
arxiv preprint - F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching	Oct 14, 2024
arxiv preprint - One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation	Oct 11, 2024
arxiv preprint - Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models	Oct 10, 2024
arxiv preprint - NEPTUNE: THE LONG ORBIT TO BENCHMARKING LONG VIDEO UNDERSTANDING	Oct 07, 2024
arxiv preprint - SHIC: Shape-Image Correspondences with no Keypoint Supervision	Oct 04, 2024
arxiv preprint - E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding	Oct 03, 2024
arxiv preprint - LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness	Oct 01, 2024
arxiv preprint - DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos	Sep 28, 2024
arxiv preprint - Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale	Sep 27, 2024
arxiv preprint - Phantom of Latent for Large Language and Vision Models	Sep 24, 2024
arxiv preprint - Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think	Sep 20, 2024
arxiv preprint - On the Diagram of Thought	Sep 19, 2024
arxiv preprint - Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources	Sep 17, 2024
arxiv preprint - SongCreator: Lyrics-based Universal Song Generation	Sep 12, 2024
arxiv preprint - Achieving Human Level Competitive Robot Table Tennis	Sep 11, 2024
arxiv preprint - Sapiens: Foundation for Human Vision Models	Sep 09, 2024
arxiv preprint - Re-Reading Improves Reasoning in Large Language Models	Sep 06, 2024
arxiv preprint - SPIRE: Semantic Prompt-Driven Image Restoration	Sep 03, 2024
arxiv preprint - Automated Design of Agentic Systems	Aug 31, 2024
arxiv preprint - Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model	Aug 28, 2024
arxiv preprint - To Code, or Not To Code? Exploring Impact of Code in Pre-training	Aug 26, 2024
arxiv preprint - Segment Anything with Multiple Modalities	Aug 23, 2024
arxiv preprint - JPEG-LM: LLMs as Image Generators with Canonical Codec Representations	Aug 20, 2024
arxiv preprint - Mission: Impossible Language Models	Aug 19, 2024
arxiv preprint - Learning Task Decomposition to Assist Humans in Competitive Programming	Aug 16, 2024
arxiv preprint - IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts	Aug 13, 2024
arxiv preprint - Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters	Aug 10, 2024
arxiv preprint - Language Model Can Listen While Speaking	Aug 09, 2024
arxiv preprint - Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning	Aug 07, 2024
arxiv preprint - Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle	Aug 06, 2024
arxiv preprint - Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent	Aug 06, 2024
arxiv preprint - Graph-enhanced Large Language Models in Asynchronous Plan Reasoning	Jul 31, 2024
arxiv preprint - LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference	Jul 30, 2024
arxiv preprint - OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person	Jul 29, 2024
arxiv preprint - DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM	Jul 27, 2024
arxiv preprint - Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning	Jul 23, 2024
arxiv preprint - Chameleon: Mixed-Modal Early-Fusion Foundation Models	Jul 22, 2024
arxiv preprint - Goldfish: Vision-Language Understanding of Arbitrarily Long Videos	Jul 18, 2024
arxiv preprint - Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity	Jul 17, 2024
arxiv preprint - Human-like Episodic Memory for Infinite Context LLMs	Jul 15, 2024
arxiv preprint - Learning to (Learn at Test Time): RNNs with Expressive Hidden States	Jul 12, 2024
arxiv preprint - Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions	Jul 11, 2024
arxiv preprint - Evaluating Human Alignment and Model Faithfulness of LLM Rationale	Jul 09, 2024
arxiv preprint - Detection and Measurement of Syntactic Templates in Generated Text	Jul 08, 2024
arxiv preprint - From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data	Jul 01, 2024
arxiv preprint - MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning	Jun 27, 2024
arxiv preprint - 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities	Jun 26, 2024
arxiv preprint - VideoLLM-online: Online Video Large Language Model for Streaming Video	Jun 25, 2024
arxiv preprint - EvTexture: Event-driven Texture Enhancement for Video Super-Resolution	Jun 24, 2024
arxiv preprint - MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model	Jun 22, 2024
arxiv preprint - An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels	Jun 20, 2024
arxiv preprint - Graphic Design with Large Multimodal Model	Jun 20, 2024
arxiv preprint - LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning	Jun 18, 2024
arxiv preprint - Transformers need glasses! Information over-squashing in language tasks	Jun 17, 2024
arxiv preprint - Show, Don’t Tell: Aligning Language Models with Demonstrated Feedback	Jun 14, 2024
arxiv preprint - TextGrad: Automatic “Differentiation” via Text	Jun 13, 2024
arxiv preprint - SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales	Jun 12, 2024
arxiv preprint - Open-Endedness is Essential for Artificial Superhuman Intelligence	Jun 11, 2024
arxiv preprint - To Believe or Not to Believe Your LLM	Jun 08, 2024
arxiv preprint - Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts	Jun 06, 2024
arxiv preprint - Contextual Position Encoding: Learning to Count What’s Important	Jun 04, 2024
arxiv preprint - Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis	Jun 03, 2024
arxiv preprint - VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos	May 31, 2024
arxiv preprint - CinePile: A Long Video Question Answering Dataset and Benchmark	May 30, 2024
arxiv preprint - Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum	May 29, 2024
arxiv preprint - SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering	May 28, 2024
arxiv preprint - Octo: An Open-Source Generalist Robot Policy	May 24, 2024
arxiv preprint - Layer-Condensed KV Cache for Efficient Inference of Large Language Models	May 23, 2024
arxiv preprint - Observational Scaling Laws and the Predictability of Language Model Performance	May 22, 2024
arxiv preprint - Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization	May 21, 2024
arxiv preprint - The Platonic Representation Hypothesis	May 20, 2024
arxiv preprint - Many-Shot In-Context Learning in Multimodal Foundation Models	May 18, 2024
arxiv preprint - Naturalistic Music Decoding from EEG Data via Latent Diffusion Models	May 16, 2024
arxiv preprint - The Chosen One: Consistent Characters in Text-to-Image Diffusion Models	May 15, 2024
arxiv preprint - Memory Mosaics	May 14, 2024
arxiv preprint - Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?	May 13, 2024
arxiv preprint - LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models	May 10, 2024
arxiv preprint - WildChat: 1M ChatGPT Interaction Logs in the Wild	May 09, 2024
arxiv preprint - Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models	May 08, 2024
arxiv preprint - NOLA: Compressing LoRA using Linear Combination of Random Basis	May 07, 2024
arxiv preprint - StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation	May 06, 2024
arxiv preprint - Iterative Reasoning Preference Optimization	May 03, 2024
arxiv preprint - Better & Faster Large Language Models via Multi-token Prediction	May 02, 2024
arxiv preprint - Make Your LLM Fully Utilize the Context	May 01, 2024
arxiv preprint - Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation	Apr 30, 2024
arxiv preprint - PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning	Apr 29, 2024
arxiv preprint - Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare	Apr 26, 2024
arxiv preprint - SnapKV: LLM Knows What You are Looking for Before Generation	Apr 25, 2024
arxiv preprint - CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models	Apr 24, 2024
arxiv preprint - SpaceByte: Towards Deleting Tokenization from Large Language Modeling	Apr 23, 2024
arxiv preprint - TextSquare: Scaling up Text-Centric Visual Instruction Tuning	Apr 22, 2024
arxiv preprint - EdgeFusion: On-Device Text-to-Image Generation	Apr 19, 2024
arxiv preprint - VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time	Apr 18, 2024
arxiv preprint - Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models	Apr 17, 2024
arxiv preprint - High-Dimension Human Value Representation in Large Language Models	Apr 16, 2024
arxiv preprint - Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck	Apr 15, 2024
arxiv preprint - Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention	Apr 12, 2024
arxiv preprint - Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs	Apr 11, 2024
arxiv preprint - Evaluating Text-to-Visual Generation with Image-to-Text Generation	Apr 10, 2024
arxiv preprint - Future Lens: Anticipating Subsequent Tokens from a Single Hidden State	Apr 09, 2024
arxiv preprint - Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity	Apr 08, 2024
arxiv preprint - Mixture-of-Depths: Dynamically allocating compute in transformer-based language models	Apr 05, 2024
arxiv preprint - WavLLM: Towards Robust and Adaptive Speech Large Language Model	Apr 04, 2024
arxiv preprint - Gecko: Versatile Text Embeddings Distilled from Large Language Models	Apr 03, 2024
arxiv preprint - ReALM: Reference Resolution As Language Modeling	Apr 02, 2024
arxiv preprint - sDPO: Don’t Use Your Data All at Once	Apr 01, 2024
arxiv preprint - LITA: Language Instructed Temporal-Localization Assistant Read the full episode description	Mar 29, 2024
arxiv preprint - AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks Read the full episode description	Mar 28, 2024
arxiv preprint - InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Read the full episode description	Mar 27, 2024
arxiv preprint - Giraffe: Adventures in Expanding Context Lengths in LLMs Read the full episode description	Mar 26, 2024
arxiv preprint - Explorative Inbetweening of Time and Space Read the full episode description	Mar 25, 2024
arxiv preprint - Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Read the full episode description	Mar 22, 2024
arxiv preprint - Evaluating Large Language Models at Evaluating Instruction Following Read the full episode description	Mar 21, 2024
arxiv preprint - Evaluating Large Language Models as Generative User Simulators for Conversational Recommendation Read the full episode description	Mar 20, 2024
arxiv preprint - Branch-Solve-Merge Improves Large Language Model Evaluation and Generation Read the full episode description	Mar 19, 2024
arxiv preprint - MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Read the full episode description	Mar 18, 2024
arxiv preprint - Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Read the full episode description	Mar 15, 2024
arxiv preprint - WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? Read the full episode description	Mar 14, 2024
arxiv preprint - Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings Read the full episode description	Mar 13, 2024
arxiv preprint - Is Cosine-Similarity of Embeddings Really About Similarity? Read the full episode description	Mar 12, 2024
arxiv preprint - A Generative Approach for Wikipedia-Scale Visual Entity Recognition Read the full episode description	Mar 11, 2024
arxiv preprint - Self-correcting LLM-controlled Diffusion Models Read the full episode description	Mar 08, 2024
arxiv preprint - tinyBenchmarks: evaluating LLMs with fewer examples Read the full episode description	Mar 08, 2024
arxiv preprint - Asymmetry in Low-Rank Adapters of Foundation Models Read the full episode description	Mar 06, 2024
arxiv preprint - When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Read the full episode description	Mar 05, 2024
arxiv preprint - EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Read the full episode description	Mar 04, 2024
arxiv preprint - The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Read the full episode description	Mar 01, 2024
arxiv preprint - Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models Read the full episode description	Feb 29, 2024
arxiv preprint - LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning Read the full episode description	Feb 28, 2024
arxiv preprint - Branch-Solve-Merge Improves Large Language Model Evaluation and Generation Read the full episode description	Feb 27, 2024
arxiv preprint - SciMON: Scientific Inspiration Machines Optimized for Novelty Read the full episode description	Feb 26, 2024
arxiv preprint - Speculative Streaming: Fast LLM Inference without Auxiliary Models Read the full episode description	Feb 23, 2024
arxiv preprint - LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models Read the full episode description	Feb 22, 2024
arxiv preprint - UPAR: A Kantian-Inspired Prompting Framework for Enhancing Large Language Model Capabilities Read the full episode description	Feb 21, 2024
arxiv preprint - Guiding Instruction-based Image Editing via Multimodal Large Language Models Read the full episode description	Feb 20, 2024
arxiv preprint - Spectral State Space Models Read the full episode description	Feb 16, 2024
arxiv preprint - More Agents Is All You Need Read the full episode description	Feb 15, 2024
arxiv preprint - World Model on Million-Length Video And Language With RingAttention Read the full episode description	Feb 14, 2024
arxiv preprint - Learning Video Representations from Large Language Models Read the full episode description	Feb 13, 2024
arxiv preprint - Can Large Language Models Understand Context? Read the full episode description	Feb 12, 2024
arxiv preprint - Long Story Short: a Summarize-then-Search Method for Long Video Question Answering Read the full episode description	Feb 09, 2024
arxiv preprint - System 2 Attention (is something you might need too) Read the full episode description	Feb 08, 2024
arxiv preprint - DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Read the full episode description	Feb 07, 2024
arxiv preprint - KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization Read the full episode description	Feb 06, 2024
arxiv preprint - Language Model Inversion Read the full episode description	Feb 05, 2024
arxiv preprint - Tree Prompting: Efficient Task Adaptation without Fine-Tuning Read the full episode description	Feb 02, 2024
arxiv preprint - Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Read the full episode description	Feb 01, 2024
arxiv preprint - Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning Read the full episode description	Jan 31, 2024
arxiv preprint - RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture Read the full episode description	Jan 30, 2024
arxiv preprint - SliceGPT: Compress Large Language Models by Deleting Rows and Columns Read the full episode description	Jan 29, 2024
arxiv preprint - Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video Read the full episode description	Jan 26, 2024
arxiv preprint - MambaByte: Token-free Selective State Space Model Read the full episode description	Jan 25, 2024
arxiv preprint - Lumiere: A Space-Time Diffusion Model for Video Generation Read the full episode description	Jan 24, 2024
arxiv preprint - Self-Rewarding Language Models Read the full episode description	Jan 23, 2024
arxiv preprint - Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data Read the full episode description	Jan 22, 2024
arxiv preprint - MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding Read the full episode description	Jan 19, 2024
arxiv preprint - Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model Read the full episode description	Jan 18, 2024
arxiv preprint - Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models Read the full episode description	Jan 17, 2024
arxiv preprint - Time Travel in LLMs: Tracing Data Contamination in Large Language Models Read the full episode description	Jan 16, 2024
arxiv preprint - InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes Read the full episode description	Jan 12, 2024
arxiv preprint - A Simple LLM Framework for Long-Range Video Question-Answering Read the full episode description	Jan 11, 2024
arxiv preprint - Mixtral of Experts Read the full episode description	Jan 09, 2024
arxiv preprint - Weight subcloning: direct initialization of transformers using larger pretrained ones Read the full episode description	Jan 08, 2024
arxiv preprint - Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task Read the full episode description	Jan 05, 2024
arxiv preprint - LLM in a flash: Efficient Large Language Model Inference with Limited Memory Read the full episode description	Jan 05, 2024
arxiv preprint - The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction Read the full episode description	Jan 02, 2024
arxiv preprint - DreaMoving: A Human Video Generation Framework based on Diffusion Models Read the full episode description	Dec 29, 2023
arxiv preprint - Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution Read the full episode description	Dec 28, 2023
arxiv preprint - UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces Read the full episode description	Dec 28, 2023
arxiv preprint - LongNet: Scaling Transformers to 1,000,000,000 Tokens Read the full episode description	Dec 27, 2023
arxiv preprint - MotionCtrl: A Unified and Flexible Motion Controller for Video Generation Read the full episode description	Dec 27, 2023
arxiv preprint - Model-tuning Via Prompts Makes NLP Models Adversarially Robust Read the full episode description	Dec 26, 2023
arxiv preprint - Training Chain-of-Thought via Latent-Variable Inference Read the full episode description	Dec 22, 2023
arxiv preprint - Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation Read the full episode description	Dec 21, 2023
arxiv preprint - Instruction-tuning Aligns LLMs to the Human Brain Read the full episode description	Dec 20, 2023
arxiv preprint - WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia Read the full episode description	Dec 19, 2023
arxiv preprint - DemoFusion: Democratising High-Resolution Image Generation With No $$$ Read the full episode description	Dec 18, 2023
arxiv preprint - Recommender Systems with Generative Retrieval Read the full episode description	Dec 15, 2023
arxiv preprint - Mamba: Linear-Time Sequence Modeling with Selective State Spaces Read the full episode description	Dec 14, 2023
arxiv preprint - Block-State Transformers Read the full episode description	Dec 13, 2023
arxiv preprint - Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns Read the full episode description	Dec 12, 2023
arxiv preprint - LooseControl: Lifting ControlNet for Generalized Depth Conditioning Read the full episode description	Dec 11, 2023
Announcement: AI Breakdown Youtube Channel Read the full episode description	Dec 08, 2023
arxiv preprint - OneLLM: One Framework to Align All Modalities with Language Read the full episode description	Dec 08, 2023
arxiv preprint - The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning Read the full episode description	Dec 08, 2023
arxiv - MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI Read the full episode description	Dec 07, 2023
arxiv preprint - MLP-Mixer: An all-MLP Architecture for Vision Read the full episode description	Dec 07, 2023
arxiv preprint - Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine Read the full episode description	Dec 06, 2023
arxiv preprint - Nash Learning from Human Feedback Read the full episode description	Dec 05, 2023
arxiv preprint - Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation Read the full episode description	Dec 04, 2023
arxiv preprint - Knowledge is a Region in Weight Space for Fine-tuned Language Models Read the full episode description	Dec 03, 2023
arxiv preprint - MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training Read the full episode description	Dec 02, 2023
arxiv preprint - Simplifying Transformer Blocks Read the full episode description	Dec 01, 2023
arxiv - Visual In-Context Prompting Read the full episode description	Nov 30, 2023
Arxiv Preprint - GAIA: a benchmark for General AI Assistants Read the full episode description	Nov 29, 2023
Arxiv Preprint - DisCo: Disentangled Control for Realistic Human Dance Generation Read the full episode description	Nov 28, 2023
Arxiv Preprint - Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation Read the full episode description	Nov 27, 2023
Arxiv Preprint - A General Theoretical Paradigm to Understand Learning from Human Preferences Read the full episode description	Nov 25, 2023
Arxiv Preprint - ShareGPT4V: Improving Large Multi-Modal Models with Better Captions Read the full episode description	Nov 22, 2023
ArXiv Preprint - S-LoRA: Serving Thousands of Concurrent LoRA Adapters Read the full episode description	Nov 21, 2023
ArXiv Preprint - Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities Read the full episode description	Nov 20, 2023
Arxiv Preprint - LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Read the full episode description	Nov 17, 2023
ArXiv Preprint - Fine-tuning Language Models for Factuality Read the full episode description	Nov 16, 2023
arxiv preprint - Language Models can be Logical Solvers Read the full episode description	Nov 15, 2023
ArXiv Preprint - Prompt Engineering a Prompt Engineer Read the full episode description	Nov 14, 2023
arxiv preprint - CogVLM: Visual Expert for Pretrained Language Models Read the full episode description	Nov 13, 2023
ArXiv Preprint - De-Diffusion Makes Text a Strong Cross-Modal Interface Read the full episode description	Nov 10, 2023
ArXiv Preprint - E3 TTS: Easy End-to-End Diffusion-based Text to Speech Read the full episode description	Nov 09, 2023
ArXiv Preprint - Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges Read the full episode description	Nov 08, 2023
ArXiv Preprint - Learning From Mistakes Makes LLM Better Reasoner Read the full episode description	Nov 07, 2023
ArXiv Preprint - The Generative AI Paradox: “What It Can Create, It May Not Understand” Read the full episode description	Nov 06, 2023
ArXiv Preprint - TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise Read the full episode description	Nov 03, 2023
ArXiv Preprint - MM-VID: Advancing Video Understanding with GPT-4V(ision) Read the full episode description	Nov 02, 2023
ArXiv Preprint - Zephyr: Direct Distillation of LM Alignment Read the full episode description	Nov 01, 2023
ArXiv Preprint - ControlLLM: Augment Language Models with Tools by Searching on Graphs Read the full episode description	Oct 31, 2023
ArXiv Preprint - Talk like a Graph: Encoding Graphs for Large Language Models Read the full episode description	Oct 30, 2023
arxiv Preprint - AgentTuning: Enabling Generalized Agent Abilities for LLMs Read the full episode description	Oct 29, 2023
ArXiv Preprint - Jailbreaking Black Box Large Language Models in Twenty Queries Read the full episode description	Oct 28, 2023
ArXiv Preprint - Matryoshka Diffusion Models Read the full episode description	Oct 27, 2023
arxiv Preprint - An Image is Worth Multiple Words: Learning Object Level Concepts using Multi-Concept Prompt Learning Read the full episode description	Oct 26, 2023
arxiv Preprint - Retrieval meets Long Context Large Language Models Read the full episode description	Oct 25, 2023
arxiv Preprint - Contrastive Preference Learning: Learning from Human Feedback without RL Read the full episode description	Oct 24, 2023
arxiv Preprint - BitNet: Scaling 1-bit Transformers for Large Language Models Read the full episode description	Oct 23, 2023
arxiv Preprint - Automatic Prompt Optimization with “Gradient Descent” and Beam Search Read the full episode description	Oct 22, 2023
arxiv Preprint - Understanding Retrieval Augmentation for Long-Form Question Answering Read the full episode description	Oct 21, 2023
arxiv Preprint - On the Connection between Pre-training Data Diversity and Fine-tuning Robustness Read the full episode description	Oct 20, 2023
arxiv Preprint - Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness Read the full episode description	Oct 19, 2023
arxiv Preprint - In-Context Pretraining: Language Modeling Beyond Document Boundaries Read the full episode description	Oct 18, 2023
ICCV 2023 - Sigmoid Loss for Language Image Pre-Training Read the full episode description	Oct 17, 2023
arxiv Preprint - Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading Read the full episode description	Oct 16, 2023
arxiv Preprint - HyperAttention: Long-context Attention in Near-Linear Time Read the full episode description	Oct 15, 2023
arxiv Preprint - InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists Read the full episode description	Oct 13, 2023
arxiv Preprint - Large Language Models Cannot Self-Correct Reasoning Yet Read the full episode description	Oct 12, 2023
arxiv Preprint - Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution Read the full episode description	Oct 11, 2023
arxiv Preprint - Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation Read the full episode description	Oct 10, 2023
arxiv Preprint - Improved Baselines with Visual Instruction Tuning Read the full episode description	Oct 09, 2023
arxiv Preprint - Tree of Thoughts: Deliberate Problem Solving with Large Language Models Read the full episode description	Oct 08, 2023
Neurips 2023 - Evaluating Cognitive Maps and Planning in Large Language Models with CogEval Read the full episode description	Oct 07, 2023
ICCV 2023 - Diffusion Models as Masked Autoencoders Read the full episode description	Oct 06, 2023
arxiv Preprint - Conditional Diffusion Distillation Read the full episode description	Oct 05, 2023
arxiv Preprint - Enable Language Models to Implicitly Learn Self-Improvement From Data Read the full episode description	Oct 04, 2023
arxiv Preprint - Efficient Streaming Language Models with Attention Sinks Read the full episode description	Oct 03, 2023
Neurips 2023 - PuzzleFusion: Unleashing the Power of Diffusion Models for Spatial Puzzle Solving Read the full episode description	Oct 02, 2023
arxiv Preprint - Vision Transformers Need Registers Read the full episode description	Oct 01, 2023
arxiv Preprint - VPA: Fully Test-Time Visual Prompt Adaptation Read the full episode description	Sep 30, 2023
