Opening

OpenAI reorganized around agents last Thursday. Greg Brockman is now running all of product, and the stated plan is to collapse ChatGPT and Codex into one agentic platform. That happened the same week Andon Labs put Claude, ChatGPT, Gemini, and Grok each in charge of a radio station and told them to turn a profit. All four burned through their $20 seed budget and failed. Claude tried to incite a revolution. Grok got confused. Gemini cheerfully reported on tragedies.

The week did not make agents look ready to run unsupervised. It did make clear that every major lab is treating agent autonomy as the finish line, not a feature.

That framing matters for operators right now. The race is not about which chat interface wins. It is about which agent stack earns enough trust to run without someone watching.

Today’s Signals

Greg Brockman is now running all of OpenAI's product strategy and is consolidating ChatGPT and Codex into one agentic platform. Brockman's own words: "We're consolidating our product efforts to execute with maximum focus toward the agentic future, to win across both consumer and enterprise." OpenAI also killed side projects including Sora and OpenAI for Science to clear the runway. For operators who built separate Codex and ChatGPT integrations, a unified platform raises an immediate question about what the merged API surface looks like. (TechCrunch, May 16)
arXiv will ban authors for one year if they submit papers with clear evidence of unreviewed AI output: hallucinated citations, LLM artifacts left in text, fabricated references. Thomas Dietterich, chair of arXiv's CS section, announced the policy on May 16. The rule does not ban AI use; it bans the failure to review it. That framing matters to operators: if the research community is now enforcing accountability for AI-generated content at the output layer, enterprise buyers will soon ask operators how they review AI output before it ships. (TechCrunch, May 16)
Apple's revamped Siri will offer auto-deleting chat histories, per Bloomberg's Mark Gurman. Users can set retention to 30 days, one year, or forever. The new standalone app runs on Google Gemini but Apple is positioning the privacy controls as the differentiator. Worth watching: if privacy-first AI converts users at scale even when the underlying model is not the strongest, every operator building on open APIs will face data-handling questions they have not had to answer yet. (TechCrunch, May 17)
The Musk v. Altman trial closed Friday. The final arguments kept returning to one question: whether Sam Altman is trustworthy as a steward of a company that may control transformative AI. No verdict yet as of dispatch. Separate from the legal outcome, the trial surfaced internal OpenAI documents about the nonprofit-to-capped-profit conversion that will stay in the public record. [Source verified May 17]
The Claude Code CLI ships inside the Anthropic Agent SDK package. A developer on HN found the entire 13,800-line bundled, minified CLI at node_modules/@anthropic-ai/claude-agent-sdk/cli.js. The copyright note reads: "Want to see the unminified source? We're hiring." Anthropic confirmed it was intentional. The SDK wraps the CLI as its underlying engine. [Source verified May 16-17 via Hacker News]

The Drops

[Repo] covibes/zeroshot — autonomous engineering agent for the CLI. Point it at a GitHub issue, walk away, get production-grade code back. Supports Claude Code, OpenAI Codex, OpenCode, and Gemini CLI as backends. 1,481 stars. The multi-backend support is the interesting part: you swap the underlying model without rewriting the workflow. (github.com/covibes/zeroshot)

[Repo] can1357/oh-my-pi — terminal coding agent with hash-anchored edits, LSP support, Python execution, browser access, and subagents. 4,622 stars. The hash-anchored edit approach is worth reading about if you have had agents make confident edits to the wrong version of a file. It pins each edit to a specific file state so the agent cannot act on stale context. (github.com/can1357/oh-my-pi)
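
To make the idea concrete, here is a minimal sketch of a hash-anchored edit in Python. It is not oh-my-pi's implementation, which may anchor at a finer granularity than the whole file; the function and exception names are hypothetical and exist only for illustration. The point it shows: the agent records a fingerprint when it reads a file, and the edit is refused if the file no longer matches that fingerprint.

```python
import hashlib


class StaleContextError(Exception):
    """Raised when the file changed after the agent last read it."""


def content_hash(text: str) -> str:
    """Fingerprint a file state; the agent records this when it reads the file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def apply_anchored_edit(current: str, expected_hash: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if `current` is the state the agent saw."""
    if content_hash(current) != expected_hash:
        raise StaleContextError("file changed since it was read; re-read before editing")
    if current.count(old) != 1:
        raise ValueError("edit target must match exactly once")
    return current.replace(old, new)


# The agent reads the file and remembers its hash.
seen = "def greet():\n    return 'hi'\n"
anchor = content_hash(seen)

# Editing the same state succeeds.
updated = apply_anchored_edit(seen, anchor, "'hi'", "'hello'")

# If the file changed in the meantime, the edit is rejected.
changed_on_disk = seen + "# new comment\n"
try:
    apply_anchored_edit(changed_on_disk, anchor, "'hi'", "'hello'")
    rejected = False
except StaleContextError:
    rejected = True
```

The design choice is to fail loudly rather than patch a best guess: a rejected edit forces the agent to re-read the file, which is cheaper than debugging a confident edit to the wrong version.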

[Skill] wshobson/agents — battle-tested agent collection for Claude Code. This is the evergreen pick for Monday: if you are setting up a new agent workflow and want a reference point built by someone who has been running these in production, start here before writing your own. (github.com/wshobson/agents)

The Stack

[MCP] nduckmink/arkon — self-hosted knowledge hub and MCP server for teams. Arkon lets you manage RAG contexts, access policies, and AI skills in one place, then expose them to Claude and other models via the Model Context Protocol. 27 stars, created May 2026.

The use case is specific: teams that want Claude to answer questions about internal docs without sending those docs to an external vector store. You run Arkon on your own infra, configure which Claude tools can call which knowledge bases, and set access policies per team or project. The MCP connection means Claude Code picks it up natively once the server is configured.

If your current setup has Claude hallucinating internal API endpoints or pulling outdated policy docs, this is the gap it fills. Setup requires a running server and a few MCP config lines. (github.com/nduckmink/arkon)
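
For a sense of what "a few MCP config lines" could look like, here is a sketch that builds a project-level MCP config of the kind Claude Code reads from a .mcp.json file. Everything Arkon-specific is an assumption: the server name, the URL, and the choice of an HTTP transport are placeholders, so check the repo's README for the real values before using any of it.

```python
import json

# Hypothetical values: Arkon's real endpoint and transport are not given in
# this issue, so treat both as placeholders.
ARKON_URL = "http://localhost:8080/mcp"

config = {
    "mcpServers": {
        "arkon": {          # hypothetical server name
            "type": "http",  # assumes the server speaks MCP over HTTP
            "url": ARKON_URL,
        }
    }
}

# Render the JSON you would save as .mcp.json at the project root.
rendered = json.dumps(config, indent=2)
print(rendered)
```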

The Onboard

How to run Claude Code subagents in parallel without blowing your context window.

This week's Hacker News thread on the Claude Code bundle inside the Agent SDK also contained a clean explanation of how the agent loop manages subagents internally. Here is the practical version operators can use now.

When you spawn a subagent in Claude Code using Task tool calls, each subagent gets its own isolated context. The parent agent only sees the subagent's return value, not its full working transcript. That is the key: you can run four subagents in parallel on four different files and your main context stays clean.

The pattern:

Break the task into independent subtasks with clear output contracts. Each subagent should return a typed summary, not a wall of prose.
Use --max-turns on each subagent call to cap runaway loops. Without this, a subagent that hits an ambiguous case will keep trying.
Read the return values before acting. The parent agent should validate subagent output against the contract before treating it as ground truth.

The practical ceiling with current context windows is roughly four to six parallel subagents on code tasks before the orchestration overhead starts eating the gains. Start with two and measure. [Source: Hacker News discussion + Anthropic Agent SDK documentation, verified May 16-17]
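
The pattern above can be sketched in Python. This is an illustration under stated assumptions, not Claude Code's internals: run_subagent is a hypothetical stub standing in for a real subagent call, its max_turns parameter is a placeholder for whatever turn cap your setup provides, and the FileReport contract is an invented example. What it shows is the shape: a small parallel pool, a typed return contract, and validation before the parent trusts anything.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

ALLOWED_STATUS = {"ok", "needs_review"}
MAX_SUMMARY_CHARS = 400


@dataclass
class FileReport:
    """Output contract: the only shape the parent accepts from a subagent."""
    path: str
    status: str
    summary: str


def parse_report(raw: str) -> FileReport:
    """Validate a raw return value against the contract; raise ValueError if it fails."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("report must be a JSON object")
    missing = {"path", "status", "summary"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(data["status"], str) or data["status"] not in ALLOWED_STATUS:
        raise ValueError("status outside the contract")
    if not isinstance(data["summary"], str) or len(data["summary"]) > MAX_SUMMARY_CHARS:
        raise ValueError("summary must be a short string")
    return FileReport(str(data["path"]), data["status"], data["summary"])


def run_subagent(path: str, max_turns: int = 5) -> str:
    """Hypothetical stub. A real version would launch a subagent with a turn cap;
    this one ignores max_turns and returns canned output."""
    if path.endswith(".md"):
        return "I looked at the file and it seems fine overall."  # breaks the contract
    return json.dumps({"path": path, "status": "ok", "summary": f"checked {path}"})


def orchestrate(paths, max_parallel=2, max_turns=5):
    """Run subagents in a small pool, then sort results into accepted and rejected."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        raws = list(pool.map(lambda p: run_subagent(p, max_turns=max_turns), paths))
    accepted, rejected = [], []
    for path, raw in zip(paths, raws):
        try:
            accepted.append(parse_report(raw))
        except ValueError as err:
            rejected.append((path, str(err)))
    return accepted, rejected


accepted, rejected = orchestrate(["a.py", "b.py", "notes.md"])
```

Here the two .py reports pass validation and the prose answer for notes.md lands in rejected, so the parent can retry or escalate instead of treating it as ground truth. The pool size defaults to two, matching the "start with two and measure" advice.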

The Frame

The operator gap that Andon Labs accidentally documented

Andon Labs gave four AI models a $20 budget, a radio station, and one instruction. None of them made it work.

The interesting part is not that they failed. It is how they failed. Claude tried to organize the audience into collective action. Gemini reported on disasters with the same tone it uses for weather. Grok had no model of what a radio station is for. Each model did something that made sense given its training, and none of it made sense for the actual task.

This is the operator problem in miniature. Agents trained on vast general corpora are not automatically good at specific business contexts. The gap between "can follow instructions" and "can run a thing without supervision" is filled by operator work: tighter prompts, constrained tool sets, defined failure modes, and humans who review the output before it matters.

The labs are racing to close that gap with better base models. That race is real. But the operators who figure out the constraint and review layer first will run things the labs cannot. The Andon experiment is a useful reminder that "powerful model" and "deployable agent" are still two different things.

Builder's Brief

Right now, every newsletter is running the same story: solo founder, five AI agents, no employees. The version they skip is what ties those agents together. Without a shared operating layer, you're just running five chat windows.

Friday's Operator Access kit is built around that problem.

Part 1 is a deployable Command Center dashboard. Clone to Vercel, wire your agents, done. Part 2 is five agent templates: Sales, Content, Ops, Support, and Comms. Each one ships with full prompts, integration specs, and a handoff pattern for the next agent in the chain. Part 3 is the skills layer, the part most setups skip: what you need to know as the operator, plus what each specific agent needs to know, mapped to actual Claude Code skills. Part 4 is the operating rhythm and real cost numbers at solo scale.

Mon through Thu builds context. Friday, the full kit drops.

Before You Go

If OpenAI merges ChatGPT and Codex into one agentic platform, what does that mean for the operators who built workflows on Codex's API separately from ChatGPT's?

OpenAI Goes All-In on Agents. Claude Went Rogue on the Radio. Coding Agents Are Multiplying.