Interactions, permissions, and goals, oh my!

Heeki Park — Fri, 22 May 2026 18:01:37 GMT

I have long had this personal mantra: always be learning.

The industry is fast moving, and for anyone trying to keep up with where AI is going, you know it’s like drinking from a firehose. I always take notes on the things I read and decided to share those learnings via this weekly newsletter. The goal is to keep it simple with a brief summary and the key signal of why it matters. I hope you find this to be a helpful resource as we journey together through each iterative wave of innovation.

MY TAKE

Each lab is taking near-term bets on where it can win and gain mindshare. Those positions fall into a few thematic buckets: improving interactive experiences for end users, enhancing builder efficacy, and deploying models in stratified use cases, i.e., the most capable reasoning models at a higher cost per token versus faster models with more cost-effective profiles.

FRESH READS

① Interaction Models: A Scalable Approach to Human-AI Collaboration
Summary: Thinking Machines Lab, led by Mira Murati, announced their research preview of interaction models, which target interactive, real-time use cases. The system splits into two models: 1/ an interaction model that runs synchronously and is in constant exchange with the user and 2/ a background model that runs asynchronously, handles delegated tasks, and returns the results to the interaction model. The interaction model uses micro-turns that allow for what feels like interruptions in processing and thought.
Signal: While harness engineering is on the rise for improving task efficacy, this approach brings a more human touch to interacting with models and ultimately agents. Bi-directional streaming is one approach, enabled by the agent. Interaction models are another, built into the model itself.
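
To make the two-model split concrete, here is a minimal sketch of the pattern using Python's asyncio. It is only an illustration of a synchronous front loop handing work to an asynchronous background worker. The function names, the "delegate:" prefix, and the timings are my own assumptions, not anything from the Thinking Machines Lab preview.

```python
import asyncio


async def background_model(task: str) -> str:
    """Hypothetical stand-in for the asynchronous background model."""
    await asyncio.sleep(0.05)  # simulate slower, delegated work
    return f"result for {task!r}"


async def interaction_loop(user_turns: list[str]) -> list[str]:
    """Hypothetical stand-in for the synchronous interaction model."""
    transcript = []
    pending = []  # background jobs that have not reported back yet
    for turn in user_turns:
        if turn.startswith("delegate:"):
            # Hand the task off and keep talking instead of blocking on it.
            task = turn.removeprefix("delegate:").strip()
            pending.append(asyncio.create_task(background_model(task)))
            transcript.append("ack: working on it in the background")
        else:
            transcript.append(f"reply: {turn}")
        await asyncio.sleep(0.01)  # one micro-turn; lets background work progress
        # Surface any background results that finished during this micro-turn.
        still_running = []
        for job in pending:
            if job.done():
                transcript.append(f"background: {job.result()}")
            else:
                still_running.append(job)
        pending = still_running
    # Conversation is over: wait for whatever is still in flight.
    for result in await asyncio.gather(*pending):
        transcript.append(f"background: {result}")
    return transcript


if __name__ == "__main__":
    turns = ["hi", "delegate: summarize notes", "what else?"]
    for line in asyncio.run(interaction_loop(turns)):
        print(line)
```

The point of the sketch is that the user-facing loop never waits on the delegated task. It acknowledges, keeps responding, and folds the result in whenever it arrives.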

② Anthropic’s X post on changes to paid Claude plans
Summary: Anthropic announced a change to their subscription plans, specifically targeting Claude Code and the Claude Agent SDK. They had previously banned third-party (3p) agent use outright in early April 2026. The change now creates two buckets of usage: 1/ the standard quota of base usage and 2/ a monthly credit for programmatic usage, which is now allowed for 3p agents. Ultra users were particularly upset because the credit amounts felt paltry and would lead to massive bills given their current usage.
Signal: The industry is facing massive compute constraints, and the current subscription model subsidizes usage from ultra users, with some estimates saying that those users consumed at least 8-10x their actual subscription costs in compute. With token demand far outstripping compute supply, it looks like this is the new normal and likely the end of flat-rate subsidized subscription plans.

③ Claude Code auto mode: a safer way to skip permissions
Summary: Anthropic released an alternate approach to managing permissions in Claude Code, given their data showing that users approve 93% of permission prompts. Most developers know the experience: you kick off a long-running task, step away to grab some water, and return to find that it stopped five seconds after you left your desk to request permission for an action that was not yet on the allow list. Auto mode is an intermediate step in the right direction without going full YOLO mode with --dangerously-skip-permissions. Note that this mode is available on Max, Team, Enterprise, or direct Anthropic API plans (not on Pro plans or via Bedrock, Vertex, or Foundry).
Signal: As agentic coding tools gain broader adoption, security is paramount. Meanwhile, developers could grow frustrated with consent fatigue if allow lists are not well curated or modes like auto mode are not enabled. Developer productivity teams should explore and find the right balance.
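
For a feel of what a curated allow list buys you, here is a minimal sketch of command triage. The pattern lists and the decide function are hypothetical. This is not Claude Code's settings format, and it says nothing about how auto mode itself makes decisions. It only shows the allow/deny/ask split that drives consent fatigue.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration only; curate these for your own repo.
ALLOW = ["git status", "git diff*", "npm test*"]
DENY = ["rm -rf*", "git push*"]


def decide(command: str) -> str:
    """Return 'deny', 'allow', or 'ask' for a proposed shell command."""
    # Deny rules win, so a broad allow pattern cannot unlock a risky command.
    if any(fnmatchcase(command, pattern) for pattern in DENY):
        return "deny"
    if any(fnmatchcase(command, pattern) for pattern in ALLOW):
        return "allow"
    # Everything else interrupts the developer with a permission prompt.
    return "ask"


if __name__ == "__main__":
    for cmd in ["git diff HEAD~1", "rm -rf /tmp/x", "curl https://example.com"]:
        print(f"{cmd!r} -> {decide(cmd)}")
```

Every command that falls through to "ask" is a potential stalled task, which is why a thin allow list translates directly into prompts while you are away from your desk.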

④ Andrej Karpathy’s X post on joining Anthropic
Summary: Andrej Karpathy is a legend and a massive voice in the AI community: a founding member of OpenAI, former director of AI for Autopilot at Tesla, and founder of Eureka Labs. At Anthropic, he will get back to his R&D roots and build a team focused on pre-training research.
Signal: Anthropic has great momentum in the market, has made some high-profile hires of late, and is doubling down on its research initiatives. Karpathy + Claude to accelerate Claude’s own pre-training = big bet of possibility and consequence.

⑤ Using Goals in Codex
Summary: Goals are persistent objectives that keep an agent working towards a defined outcome across many turns. If you’ve spent any amount of time vibe coding, you’ve had that experience of seeing an output that missed the mark (sometimes by a wide margin) or a process that completed only half the job. Goals define the outcome, i.e., the end goal, which helps guide the agent towards what you really want.
Signal: Harness engineering focuses on building the right feedback loops like tool use, guardrails, and validation checks. Goals work backwards from the desired outcome and provide that persistent target for the agent to continuously iterate towards. It shifts the interaction from “answer this one request” to “complete this active goal”.
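
The shift from "answer this one request" to "complete this active goal" can be sketched as a loop that keeps iterating until an outcome check passes. This is a toy illustration under my own assumptions. The Goal class, run_until_goal, and the failing-test counter are hypothetical and are not the Codex API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Goal:
    """Hypothetical persistent objective: a description plus an outcome check."""
    description: str
    is_met: Callable[[dict], bool]
    max_turns: int = 10


def run_until_goal(goal: Goal, step: Callable[[dict], dict], state: dict):
    """Run agent turns until the goal is met or the turn budget runs out."""
    for turn in range(1, goal.max_turns + 1):
        state = step(state)  # one agent turn produces a new state
        if goal.is_met(state):  # validate against the outcome, not the request
            return state, turn, True
    return state, goal.max_turns, False


def fix_one_test(state: dict) -> dict:
    """Toy agent turn that fixes one failing test per call."""
    return {"failing_tests": max(0, state["failing_tests"] - 1)}


if __name__ == "__main__":
    goal = Goal("all tests pass", lambda s: s["failing_tests"] == 0)
    print(run_until_goal(goal, fix_one_test, {"failing_tests": 3}))
```

A single-request interaction would stop after the first call to fix_one_test, which is the "completed only half the job" experience. The goal check is what keeps the loop going.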

⑥ Everything Google Cloud customers need to know coming out of Google I/O
Summary: Google is doubling down on the Agentic Enterprise with a sprawling suite of services and features announced this week, e.g., advances in Gemini models, developing and managing autonomous agents in Antigravity, a personal agent in Gemini Spark, simplified agent deployment via the Managed Agents API, and a managed security agent in CodeMender for securing agent-generated code.
Signal: Offerings across all the providers seem to be coalescing around a few key ideas along two axes: platform (managed agent platforms, managed specialized agents) and user/builder/developer (agentic desktop applications, agentic coding tools).

⑦ AI Gateway production index
Summary: Vercel has its own AI Gateway that provides unified API access to hundreds of models. Upon analyzing usage across 200k unique teams, they shared five key observations from the data.
Signal: Organizations are willing to spend more per token based on how expensive the wrong answer is. The data showed Anthropic leading in spend and Google leading in token volume, which suggests a bifurcation where frontier models can command premium pricing for high-stakes decisions, while high-volume commodity workloads gravitate towards cost-efficient models. While model leaderboards make for easy comparisons, production workloads use models that make sense for the job, aligning effectiveness and cost efficiency with the targeted use case.
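
The "pay more when the wrong answer is expensive" logic can be written as a simple expected-cost comparison. This is a sketch under stated assumptions. The tier names, prices, and error rates are invented for illustration and do not come from Vercel's data.

```python
# Invented numbers for illustration: price per request and chance of a wrong answer.
TIERS = {
    "frontier": {"price": 0.50, "error_rate": 0.02},
    "fast": {"price": 0.05, "error_rate": 0.10},
}


def expected_cost(tier: str, cost_of_wrong_answer: float) -> float:
    """Price of the call plus the expected cost of getting the answer wrong."""
    t = TIERS[tier]
    return t["price"] + t["error_rate"] * cost_of_wrong_answer


def pick_tier(cost_of_wrong_answer: float) -> str:
    """Choose the tier with the lowest expected total cost."""
    return min(TIERS, key=lambda tier: expected_cost(tier, cost_of_wrong_answer))


if __name__ == "__main__":
    for stakes in (1.0, 100.0):
        print(f"cost of a wrong answer = {stakes}: use {pick_tier(stakes)}")
```

With these made-up numbers, the cheaper tier wins until a wrong answer costs more than about 5.6 units, after which the premium tier is the economical choice. That is the bifurcation in miniature.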

FROM THE ARCHIVE

① Building effective agents (also on YouTube)
Summary: Similar to how the Gang of Four defined the design patterns in software engineering, Erik Schluntz and Barry Zhang provide a simple and pragmatic view of when (and when not) to use agents and enumerate common building blocks, workflows, and agent patterns that they were seeing in their work across many customers.

Always be learning.

heeki reads #1
Written by Heeki Park, Principal SA @ AWS. Opinions are my own.
