
John Schulman (OpenAI Cofounder) – Reasoning, RLHF, & Plan for 2027 AGI



John Schulman on how post-training tames the shoggoth, and the nature of the progress to come…

Timestamps:

00:00:00 Pre-training, post-training, and future capabilities
00:17:21 Plan for AGI 2025
00:29:43 Teaching models to reason
00:41:14 The Road to ChatGPT
00:52:37 What makes for a good RL researcher?
01:01:22 Keeping humans in the loop
01:15:39 State of research, plateaus, and moats

Links:

Apple Podcasts: https://podcasts.apple.com/us/podcast/john-schulman-openai-cofounder-reasoning-rlhf-plan/id1516093381?i=1000655679622
Spotify: https://open.spotify.com/episode/1ivzHH9RWciXe4O1rKtldf?si=53503781e05f4d8f
Transcript: https://www.dwarkeshpatel.com/p/john-schulman/

Me on Twitter: https://twitter.com/dwarkesh_sp/

Sponsors:

If you’re interested in advertising on the podcast, fill out this form: https://airtable.com/appxGOvFLDLP5dlzv/pagFVrbHRohW6F2bZ/form

– Your DNA shapes everything about you. Want to know how? Take 10% off our Premium DNA kit with code DWARKESH at https://mynucleus.com/

– CommandBar is an AI user assistant that any software product can embed to non-annoyingly assist, support, and unleash their users. Used by forward-thinking CX, product, growth, and marketing teams. Learn more at https://www.commandbar.com/
