Papers Read on AI

By Rob

Category: Tech News

Subscribers: 4
Reviews: 0
Episodes: 200

Description

Keeping you up to date with the latest trends and best-performing architectures in this fast-evolving field of computer science. We select papers by comparative results, citations, and influence to educate you on the latest research. Consider supporting us at Patreon.com/PapersRead, where you can also share feedback and ideas.

Episode Date
Make Your LLM Fully Utilize the Context
May 03, 2024
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
May 02, 2024
Dynamic Generation of Personalities with Large Language Models
Apr 30, 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Apr 25, 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Apr 24, 2024
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
Apr 23, 2024
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
Apr 22, 2024
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Apr 19, 2024
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
Apr 18, 2024
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples
Apr 16, 2024
AutoCodeRover: Autonomous Program Improvement
Apr 15, 2024
TrustLLM: Trustworthiness in Large Language Models
Apr 15, 2024
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Apr 12, 2024
Fast Timing-Conditioned Latent Audio Diffusion
Apr 11, 2024
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
Apr 10, 2024
ReFT: Representation Finetuning for Language Models
Apr 09, 2024
Long-form factuality in large language models
Apr 08, 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Apr 06, 2024
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Apr 05, 2024
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
Apr 04, 2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
Apr 03, 2024
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Apr 02, 2024
Evolutionary Optimization of Model Merging Recipes
Mar 27, 2024
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
Mar 26, 2024
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
Mar 25, 2024
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
Mar 22, 2024
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
Mar 21, 2024
Chronos: Learning the Language of Time Series
Mar 19, 2024
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
Mar 18, 2024
SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting
Mar 15, 2024
Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents
Mar 14, 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Mar 13, 2024
TripoSR: Fast 3D Object Reconstruction from a Single Image
Mar 12, 2024
Diffusion Model-Based Image Editing: A Survey
Mar 08, 2024
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Mar 07, 2024
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation
Mar 06, 2024
Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases
Mar 05, 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Mar 04, 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Feb 27, 2024
Ring Attention with Blockwise Transformers for Near-Infinite Context
Feb 26, 2024
Premise Order Matters in Reasoning with Large Language Models
Feb 23, 2024
Generative Representational Instruction Tuning
Feb 20, 2024
DoRA: Weight-Decomposed Low-Rank Adaptation
Feb 19, 2024
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Feb 18, 2024
World Model on Million-Length Video And Language With RingAttention
Feb 17, 2024
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Feb 16, 2024
Fractal Patterns May Unravel the Intelligence in Next-Token Prediction
Feb 15, 2024
Precise Zero-Shot Dense Retrieval without Relevance Labels
Feb 13, 2024
Relevance-guided Supervision for OpenQA with ColBERT
Feb 11, 2024
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
Feb 11, 2024
PLAID: An Efficient Engine for Late Interaction Retrieval
Feb 10, 2024
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Feb 09, 2024
Corrective Retrieval Augmented Generation
Feb 08, 2024
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Feb 07, 2024
A Comprehensive Survey on 3D Content Generation
Feb 07, 2024
OLMo: Accelerating the Science of Language Models
Feb 06, 2024
Who’s Harry Potter? Approximate Unlearning in LLMs
Feb 04, 2024
Parameter-Efficient Transfer Learning for NLP
Feb 03, 2024
A Survey on Transformers in Reinforcement Learning
Feb 02, 2024
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Large Language Models
Feb 01, 2024
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Jan 31, 2024
Matryoshka Representation Learning
Jan 30, 2024
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Jan 27, 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
Jan 26, 2024
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
Jan 25, 2024
Self-Rewarding Language Models
Jan 24, 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Jan 23, 2024
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering
Jan 22, 2024
Quantifying Language Models’ Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting
Jan 19, 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Jan 18, 2024
Large Language Models for Generative Information Extraction: A Survey
Jan 17, 2024
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Jan 16, 2024
Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon
Jan 15, 2024
Parameter-Efficient Transfer Learning for NLP
Jan 13, 2024
Mixtral of Experts
Jan 12, 2024
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
Jan 11, 2024
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Jan 11, 2024
Video Understanding with Large Language Models: A Survey
Jan 10, 2024
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Jan 09, 2024
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Jan 08, 2024
AnyText: Multilingual Visual Text Generation And Editing
Jan 05, 2024
KwaiAgents: Generalized Information-seeking Agent System with Large Language Models
Jan 04, 2024
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
Jan 03, 2024
Fast Inference of Mixture-of-Experts Language Models with Offloading
Jan 02, 2024
Retrieval-Augmented Generation for Large Language Models: A Survey
Dec 29, 2023
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Dec 28, 2023
Pearl: A Production-ready Reinforcement Learning Agent
Dec 19, 2023
Are Emergent Abilities in Large Language Models just In-Context Learning?
Dec 17, 2023
Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
Dec 16, 2023
Instruction Tuning for Large Language Models: A Survey
Dec 15, 2023
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
Dec 14, 2023
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Dec 13, 2023
Sequential Modeling Enables Scalable Learning for Large Vision Models
Dec 12, 2023
Magicoder: Source Code Is All You Need
Dec 11, 2023
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Dec 10, 2023
Adversarial Diffusion Distillation
Dec 09, 2023
Instruction Tuning with Human Curriculum
Dec 08, 2023
Initializing Models with Larger Ones
Dec 07, 2023
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance
Dec 06, 2023
GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?
Dec 05, 2023
TaskWeaver: A Code-First Agent Framework
Dec 04, 2023
Efficient LLM Inference on CPUs
Dec 02, 2023
Igniting Language Intelligence: The Hitchhiker’s Guide From Chain-of-Thought Reasoning to Language Agents
Dec 01, 2023
STaR: Bootstrapping Reasoning With Reasoning
Nov 30, 2023
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Nov 29, 2023
Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
Nov 28, 2023
Exponentially Faster Language Modelling
Nov 27, 2023
Orca 2: Teaching Small Language Models How to Reason
Nov 24, 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Nov 23, 2023
A Survey on Language Models for Code
Nov 22, 2023
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Nov 21, 2023
Learning to Filter Context for Retrieval-Augmented Generation
Nov 20, 2023
GraphCast: Learning skillful medium-range global weather forecasting
Nov 19, 2023
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
Nov 17, 2023
CogVLM: Visual Expert for Pretrained Language Models
Nov 16, 2023
How Can Recommender Systems Benefit from Large Language Models: A Survey
Nov 15, 2023
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
Nov 14, 2023
Generative Pretraining in Multimodality
Nov 10, 2023
Evaluating Large Language Models: A Comprehensive Survey
Nov 08, 2023
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
Nov 03, 2023
iTransformer: Inverted Transformers Are Effective for Time Series Forecasting
Nov 02, 2023
Zephyr: Direct Distillation of LM Alignment
Nov 01, 2023
Large Language Models for Software Engineering: Survey and Open Problems
Oct 31, 2023
Eureka: Human-Level Reward Design via Coding Large Language Models
Oct 30, 2023
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model
Oct 29, 2023
Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference
Oct 27, 2023
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Oct 26, 2023
MemGPT: Towards LLMs as Operating Systems
Oct 25, 2023
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Oct 24, 2023
Character-LLM: A Trainable Agent for Role-Playing
Oct 23, 2023
OpenAgents: An Open Platform for Language Agents in the Wild
Oct 20, 2023
Text Embeddings Reveal (Almost) As Much As Text
Oct 18, 2023
Ferret: Refer and Ground Anything Anywhere at Any Granularity
Oct 17, 2023
Mistral 7B
Oct 16, 2023
A Definition of Continual Reinforcement Learning
Oct 15, 2023
A Survey of Techniques for Optimizing Transformer Inference
Oct 14, 2023
Improved Baselines with Visual Instruction Tuning
Oct 13, 2023
Can large language models provide useful feedback on research papers? A large-scale empirical analysis
Oct 12, 2023
The Rise and Potential of Large Language Model Based Agents: A Survey
Oct 11, 2023
Demystifying CLIP Data
Oct 10, 2023
MentalLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models
Oct 09, 2023
Efficient Streaming Language Models with Attention Sinks
Oct 06, 2023
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Oct 05, 2023
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
Oct 04, 2023
What Makes Good In-Context Examples for GPT-3?
Oct 03, 2023
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
Oct 02, 2023
Qwen Technical Report
Oct 02, 2023
Extending Context Window of Large Language Models via Positional Interpolation
Sep 28, 2023
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Sep 27, 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Sep 26, 2023
NExT-GPT: Any-to-Any Multimodal LLM
Sep 25, 2023
GPT Can Solve Mathematical Problems Without a Calculator
Sep 22, 2023
Tracking Anything with Decoupled Video Segmentation
Sep 21, 2023
ModuleFormer: Modularity Emerges from Mixture-of-Experts
Sep 20, 2023
Agents: An Open-source Framework for Autonomous Language Agents
Sep 18, 2023
Cognitive Architectures for Language Agents
Sep 15, 2023
PyGraft: Configurable Generation of Schemas and Knowledge Graphs at Your Fingertips
Sep 14, 2023
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
Sep 13, 2023
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
Sep 11, 2023
LLaSM: Large Language and Speech Model
Sep 07, 2023
Nougat: Neural Optical Understanding for Academic Documents
Sep 06, 2023
Communicative Agents for Software Development
Sep 04, 2023
Prompt2Model: Generating Deployable Models from Natural Language Instructions
Sep 02, 2023
Code Llama: Open Foundation Models for Code
Sep 01, 2023
A Survey on Large Language Model based Autonomous Agents
Aug 31, 2023
SoTaNa: The Open-Source Software Development Assistant
Aug 31, 2023
Efficient Guided Generation for Large Language Models
Aug 27, 2023
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Aug 25, 2023
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Aug 25, 2023
Platypus: Quick, Cheap, and Powerful Refinement of LLMs
Aug 24, 2023
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
Aug 23, 2023
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Aug 20, 2023
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies
Aug 19, 2023
Simple synthetic data reduces sycophancy in large language models
Aug 18, 2023
Shepherd: A Critic for Language Model Generation
Aug 17, 2023
AgentBench: Evaluating LLMs as Agents
Aug 16, 2023
MetaGPT: Meta Programming for Multi-Agent Collaborative Framework
Aug 14, 2023
Revisiting the Minimalist Approach to Offline Reinforcement Learning
Aug 13, 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Aug 12, 2023
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
Aug 08, 2023
LISA: Reasoning Segmentation via Large Language Model
Aug 08, 2023
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
Aug 07, 2023
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Aug 03, 2023
Universal and Transferable Adversarial Attacks on Aligned Language Models
Jul 31, 2023
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
Jul 30, 2023
Aligning Large Language Models with Human: A Survey
Jul 29, 2023
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Jul 28, 2023
FABRIC: Personalizing Diffusion Models with Iterative Feedback
Jul 27, 2023
How is ChatGPT’s behavior changing over time?
Jul 26, 2023
Meta-Transformer: A Unified Framework for Multimodal Learning
Jul 25, 2023
LIMA: Less Is More for Alignment
Jul 25, 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Jul 23, 2023
Self-regulating Prompts: Foundational Model Adaptation without Forgetting
Jul 23, 2023
Deep Unrestricted Document Image Rectification
Jul 22, 2023
MMBench: Is Your Multi-modal Model an All-around Player?
Jul 21, 2023
Editing Large Language Models: Problems, Methods, and Opportunities
Jul 20, 2023
Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
Jul 19, 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Jul 18, 2023
Secrets of RLHF in Large Language Models Part I: PPO
Jul 17, 2023
Liquid Time-constant Networks
Jul 16, 2023