Papers Read on AI Podcast Republic

Papers Read on AI

By Rob

Category: Tech News

Subscribers: 7
Reviews: 0
Episodes: 200

Description

Keeping you up to date with the latest trends and best-performing architectures in this fast-evolving field of computer science. Selecting papers by comparative results, citations, and influence, we educate you on the latest research. Consider supporting us on Patreon.com/PapersRead for feedback and ideas.

Episode	Date
ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases	Nov 01, 2024
Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities	Oct 31, 2024
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation	Oct 30, 2024
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching	Oct 18, 2024
LightRAG: Simple and Fast Retrieval-Augmented Generation	Oct 17, 2024
Aria: An Open Multimodal Native Mixture-of-Experts Model	Oct 16, 2024
AgentKit: Structured LLM Reasoning with Dynamic Graphs	Oct 15, 2024
PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling	Oct 14, 2024
Diffusion Models are Evolutionary Algorithms	Oct 10, 2024
Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering	Oct 09, 2024
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations	Oct 08, 2024
Internal Consistency and Self-Feedback in Large Language Models: A Survey	Oct 07, 2024
On the Diagram of Thought	Oct 02, 2024
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion	Oct 01, 2024
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation	Sep 30, 2024
On the limits of agency in agent-based models	Sep 24, 2024
Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization	Sep 23, 2024
PuLID: Pure and Lightning ID Customization via Contrastive Alignment	Sep 23, 2024
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery	Sep 21, 2024
PuLID: Pure and Lightning ID Customization via Contrastive Alignment	Sep 20, 2024
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming	Sep 19, 2024
LLaMA-Omni: Seamless Speech Interaction with Large Language Models	Sep 18, 2024
GeoCalib: Learning Single-image Calibration with Geometric Optimization	Sep 17, 2024
Artificial Immune System of Secure Face Recognition Against Adversarial Attacks	Sep 13, 2024
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model	Sep 12, 2024
rerankers: A Lightweight Python Library to Unify Ranking Methods	Sep 11, 2024
Automated Design of Agentic Systems	Sep 10, 2024
Text2SQL is Not Enough: Unifying AI and Databases with TAG	Sep 09, 2024
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders	Sep 05, 2024
Sapiens: Foundation for Human Vision Models	Sep 04, 2024
OctFusion: Octree-based Diffusion Models for 3D Shape Generation	Sep 03, 2024
Writing in the Margins: Better Inference Pattern for Long Context Retrieval	Sep 02, 2024
Fact Finder -- Enhancing Domain Expertise of Large Language Models by Incorporating Knowledge Graphs	Aug 30, 2024
RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation	Aug 29, 2024
RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation	Aug 28, 2024
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search	Aug 23, 2024
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs	Aug 21, 2024
ControlNeXt: Powerful and Efficient Control for Image and Video Generation	Aug 20, 2024
OpenResearcher: Unleashing AI for Accelerated Scientific Research	Aug 19, 2024
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation	Aug 14, 2024
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls	Aug 13, 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads	Aug 09, 2024
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders	Aug 08, 2024
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers	Aug 07, 2024
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher	Aug 05, 2024
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models	Jul 31, 2024
FinanceBench: A New Benchmark for Financial Question Answering	Jul 30, 2024
Stable-Hair: Real-World Hair Transfer via Diffusion Model	Jul 29, 2024
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?	Jul 26, 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs	Jul 25, 2024
Patch-Level Training for Large Language Models	Jul 24, 2024
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models	Jul 23, 2024
IMAGDressing-v1: Customizable Virtual Dressing	Jul 22, 2024
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights	Jul 19, 2024
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence	Jul 18, 2024
SEED-Story: Multimodal Long Story Generation with Large Language Model	Jul 16, 2024
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models	Jul 15, 2024
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control	Jul 12, 2024
Agentless: Demystifying LLM-based Software Engineering Agents	Jul 11, 2024
Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?	Jul 09, 2024
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code	Jul 08, 2024
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image	Jul 05, 2024
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence	Jul 04, 2024
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time	Jun 28, 2024
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture	Jun 27, 2024
Seven Failure Points When Engineering a Retrieval Augmented Generation System	Jun 26, 2024
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning	Jun 25, 2024
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM	Jun 24, 2024
Multi-Head RAG: Solving Multi-Aspect Problems with LLMs	Jun 21, 2024
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning	Jun 20, 2024
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time	Jun 19, 2024
“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models	Jun 18, 2024
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models	Jun 17, 2024
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models	Jun 13, 2024
GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning	Jun 12, 2024
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct	Jun 11, 2024
From Sora What We Can See: A Survey of Text-to-Video Generation	Jun 04, 2024
The Future of Large Language Model Pre-training is Federated	Jun 03, 2024
Long-form factuality in large language models	Jun 01, 2024
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head	May 31, 2024
Retrieval-Augmented Generation for AI-Generated Content: A Survey	May 29, 2024
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning	May 28, 2024
LightAutoML: AutoML Solution for a Large Financial Services Ecosystem	May 27, 2024
Efficient Multimodal Large Language Models: A Survey	May 24, 2024
The Platonic Representation Hypothesis	May 23, 2024
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment	May 22, 2024
LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks	May 21, 2024
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval	May 16, 2024
A decoder-only foundation model for time-series forecasting	May 14, 2024
Autonomous LLM-driven research from data to human-verifiable research papers	May 13, 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model	May 12, 2024
Granite Code Models: A Family of Open Foundation Models for Code Intelligence	May 11, 2024
Improving Diffusion Models for Virtual Try-on	May 10, 2024
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation	May 08, 2024
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing	May 07, 2024
KAN: Kolmogorov-Arnold Networks	May 06, 2024
Make Your LLM Fully Utilize the Context	May 03, 2024
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites	May 02, 2024
Dynamic Generation of Personalities with Large Language Models	Apr 30, 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length	Apr 25, 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone	Apr 24, 2024
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations	Apr 23, 2024
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models	Apr 22, 2024
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models	Apr 19, 2024
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models	Apr 18, 2024
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples	Apr 16, 2024
AutoCodeRover: Autonomous Program Improvement	Apr 15, 2024
TrustLLM: Trustworthiness in Large Language Models	Apr 15, 2024
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation	Apr 12, 2024
Fast Timing-Conditioned Latent Audio Diffusion	Apr 11, 2024
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians	Apr 10, 2024
ReFT: Representation Finetuning for Language Models	Apr 09, 2024
Long-form factuality in large language models	Apr 08, 2024
Jamba: A Hybrid Transformer-Mamba Language Model	Apr 06, 2024
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models	Apr 05, 2024
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts	Apr 04, 2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild	Apr 03, 2024
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression	Apr 02, 2024
Evolutionary Optimization of Model Merging Recipes	Mar 27, 2024
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models	Mar 26, 2024
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation	Mar 25, 2024
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts	Mar 22, 2024
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models	Mar 21, 2024
Chronos: Learning the Language of Time Series	Mar 19, 2024
Linear Transformers with Learnable Kernel Functions are Better In-Context Models	Mar 18, 2024
SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting	Mar 15, 2024
Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents	Mar 14, 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection	Mar 13, 2024
TripoSR: Fast 3D Object Reconstruction from a Single Image	Mar 12, 2024
Diffusion Model-Based Image Editing: A Survey	Mar 08, 2024
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits	Mar 07, 2024
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation	Mar 06, 2024
Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases	Mar 05, 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models	Mar 04, 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit	Feb 27, 2024
Ring Attention with Blockwise Transformers for Near-Infinite Context	Feb 26, 2024
Premise Order Matters in Reasoning with Large Language Models	Feb 23, 2024
Generative Representational Instruction Tuning	Feb 20, 2024
DoRA: Weight-Decomposed Low-Rank Adaptation	Feb 19, 2024
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time	Feb 18, 2024
World Model on Million-Length Video And Language With RingAttention	Feb 17, 2024
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models	Feb 16, 2024
Fractal Patterns May Unravel the Intelligence in Next-Token Prediction	Feb 15, 2024
Precise Zero-Shot Dense Retrieval without Relevance Labels	Feb 13, 2024
Relevance-guided Supervision for OpenQA with ColBERT	Feb 11, 2024
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction	Feb 11, 2024
PLAID: An Efficient Engine for Late Interaction Retrieval	Feb 10, 2024
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval	Feb 09, 2024
Corrective Retrieval Augmented Generation	Feb 08, 2024
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence	Feb 07, 2024
A Comprehensive Survey on 3D Content Generation	Feb 07, 2024
OLMo: Accelerating the Science of Language Models	Feb 06, 2024
Who’s Harry Potter? Approximate Unlearning in LLMs	Feb 04, 2024
Parameter-Efficient Transfer Learning for NLP	Feb 03, 2024
A Survey on Transformers in Reinforcement Learning	Feb 02, 2024
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Large Language Models	Feb 01, 2024
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines	Jan 31, 2024
Matryoshka Representation Learning	Jan 30, 2024
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers	Jan 27, 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs	Jan 26, 2024
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security	Jan 25, 2024
Self-Rewarding Language Models	Jan 24, 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model	Jan 23, 2024
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering	Jan 22, 2024
Quantifying Language Models’ Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting	Jan 19, 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning	Jan 18, 2024
Large Language Models for Generative Information Extraction: A Survey	Jan 17, 2024
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models	Jan 16, 2024
Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon	Jan 15, 2024
Parameter-Efficient Transfer Learning for NLP	Jan 13, 2024
Mixtral of Experts	Jan 12, 2024
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts	Jan 11, 2024
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia	Jan 11, 2024
Video Understanding with Large Language Models: A Survey	Jan 10, 2024
GPT-4V(ision) is a Generalist Web Agent, if Grounded	Jan 09, 2024
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones	Jan 08, 2024
AnyText: Multilingual Visual Text Generation And Editing	Jan 05, 2024
KwaiAgents: Generalized Information-seeking Agent System with Large Language Models	Jan 04, 2024
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4	Jan 03, 2024
Fast Inference of Mixture-of-Experts Language Models with Offloading	Jan 02, 2024
Retrieval-Augmented Generation for Large Language Models: A Survey	Dec 29, 2023
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU	Dec 28, 2023
Pearl: A Production-ready Reinforcement Learning Agent	Dec 19, 2023
Are Emergent Abilities in Large Language Models just In-Context Learning?	Dec 17, 2023
Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models	Dec 16, 2023
Instruction Tuning for Large Language Models: A Survey	Dec 15, 2023
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts	Dec 14, 2023
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer	Dec 13, 2023
Sequential Modeling Enables Scalable Learning for Large Vision Models	Dec 12, 2023
Magicoder: Source Code Is All You Need	Dec 11, 2023
Mamba: Linear-Time Sequence Modeling with Selective State Spaces	Dec 10, 2023
Adversarial Diffusion Distillation	Dec 09, 2023
Instruction Tuning with Human Curriculum	Dec 08, 2023
Initializing Models with Larger Ones	Dec 07, 2023
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance	Dec 06, 2023
GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?	Dec 05, 2023
TaskWeaver: A Code-First Agent Framework	Dec 04, 2023
Efficient LLM Inference on CPUs	Dec 02, 2023
Igniting Language Intelligence: The Hitchhiker’s Guide From Chain-of-Thought Reasoning to Language Agents	Dec 01, 2023
STaR: Bootstrapping Reasoning With Reasoning	Nov 30, 2023