Best AI papers explained

A podcast by Enoch H. Kang

506 Episodes

The Art of Scaling Reinforcement Learning Compute for LLMs
Published: 16/10/2025
A small number of samples can poison LLMs of any size
Published: 16/10/2025
Dual Goal Representations
Published: 14/10/2025
Welcome to the Era of Experience
Published: 14/10/2025
Value Flows: Flow-Based Distributional Reinforcement Learning
Published: 14/10/2025
Self-Adapting Language Models
Published: 12/10/2025
The Markovian Thinker
Published: 12/10/2025
Moloch’s Bargain: emergent misalignment when LLMs compete for audiences
Published: 12/10/2025
Transformer Predictor Dynamics and Task Diversity
Published: 11/10/2025
Base models know how to reason, thinking models learn when
Published: 11/10/2025
Spectrum tuning: Post-training for distributional coverage and in-context steerability
Published: 11/10/2025
Understanding Prompt Tuning and In-Context Learning via Meta-Learning
Published: 11/10/2025
MLPs Learn In-Context on Regression and Classification tasks
Published: 11/10/2025
Is Pre-Training Truly Better than Meta-Learning?
Published: 11/10/2025
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Published: 11/10/2025
Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs
Published: 09/10/2025
Learning dynamics of LLM finetuning
Published: 09/10/2025
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Published: 09/10/2025
OpenAI Agent Builder and n8n: Orchestrating Reasoning Versus Automating Process
Published: 08/10/2025
Training Agents Inside of Scalable World Models
Published: 08/10/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.