Agents and State Machines

A useful way to model a Pi SDK-based agent is as a replicated, append-only conversation/process object whose behavior is driven by messages, guarded by invariants, and exposed through events. Pi’s SDK already provides the primitives for this: createAgentSession(), AgentSession.prompt(), steer(), followUp(), event subscription, model control, message history, streaming state, and tree navigation. The docs describe the session as managing lifecycle, message history, model state, compaction, and event streaming. (pi.dev)

1. Core state machine

Think Lamport first: the agent is a deterministic-ish state machine over an ordered log of inputs and outputs.

AgentState =
  | Uninitialized
  | Initializing
  | Idle
  | ReceivingInput
  | Planning
  | CallingTool
  | AwaitingToolResult
  | StreamingOutput
  | WaitingForSteer
  | Compacting
  | Branching
  | Aborting
  | Failed
  | Completed

The most important transition loop is:

Idle
  -> ReceivingInput
  -> Planning
  -> [CallingTool -> AwaitingToolResult -> Planning]*
  -> StreamingOutput
  -> Completed
  -> Idle

But a real Pi-style agent needs extra branches:

StreamingOutput + steer(message) -> WaitingForSteer -> Planning
Idle + followUp(message)         -> ReceivingInput
Planning + contextTooLarge       -> Compacting -> Planning
AnyActive + abort                -> Aborting -> Idle | Failed
AnyState + navigateTree(target)  -> Branching -> Idle
AnyState + modelFailure          -> Failed | Planning (after failover to another model)
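
The arrows above can be sketched as a lookup table. This is only an illustration under my own assumptions: the trigger names (inputAccepted, runStarted, and so on) are hypothetical labels for the arrows, not Pi SDK events, and the table covers a subset of the states.

```typescript
// Control states from the list above (a subset is enough for the sketch).
type AgentState =
  | "Idle" | "ReceivingInput" | "Planning" | "CallingTool"
  | "AwaitingToolResult" | "StreamingOutput" | "WaitingForSteer"
  | "Compacting" | "Aborting" | "Completed";

// Hypothetical trigger names; they label the arrows, not SDK events.
type Trigger =
  | "inputAccepted" | "runStarted" | "toolRequested" | "toolDispatched"
  | "toolResult" | "answerReady" | "streamEnded" | "runCommitted"
  | "steer" | "steerAccepted" | "contextTooLarge" | "compacted"
  | "abort" | "abortSettled";

// Allowed transitions: state -> trigger -> next state.
const transitions: Partial<Record<AgentState, Partial<Record<Trigger, AgentState>>>> = {
  Idle: { inputAccepted: "ReceivingInput" },
  ReceivingInput: { runStarted: "Planning" },
  Planning: {
    toolRequested: "CallingTool",
    answerReady: "StreamingOutput",
    contextTooLarge: "Compacting",
    abort: "Aborting",
  },
  CallingTool: { toolDispatched: "AwaitingToolResult", abort: "Aborting" },
  AwaitingToolResult: { toolResult: "Planning", abort: "Aborting" },
  StreamingOutput: { streamEnded: "Completed", steer: "WaitingForSteer", abort: "Aborting" },
  WaitingForSteer: { steerAccepted: "Planning" },
  Compacting: { compacted: "Planning" },
  Aborting: { abortSettled: "Idle" },
  Completed: { runCommitted: "Idle" },
};

// Returns the next state, or throws when the trigger is not legal in `from`.
function step(from: AgentState, trigger: Trigger): AgentState {
  const next = transitions[from]?.[trigger];
  if (next === undefined) {
    throw new Error(`illegal transition: ${from} + ${trigger}`);
  }
  return next;
}

// Folds a sequence of triggers over a starting state.
function runTriggers(start: AgentState, triggers: Trigger[]): AgentState {
  return triggers.reduce((state, trigger) => step(state, trigger), start);
}
```

Keeping the table as data rather than scattered if-statements means an illegal move, such as a tool result arriving while Idle, fails loudly instead of being absorbed.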

Pi’s embedded use in OpenClaw makes this more explicit: the integration emphasizes control over session lifecycle, event handling, custom tool injection, prompt customization, persistence with branching/compaction, auth failover, and provider-agnostic model switching. (OpenClaw)

2. States

I would separate state into control state, conversation state, capability state, and environment state.

A. Control state

This is the Lamport-style finite state machine.

State               Meaning
Uninitialized       No session exists yet.
Initializing        Loading auth, model registry, resource loader, tools, skills, extensions.
Idle                Session exists and can accept a prompt.
ReceivingInput      A user/system/channel message has been accepted into the queue.
Planning            Model is deciding next action.
CallingTool         Agent has emitted a tool call.
AwaitingToolResult  Runtime is executing the tool and waiting for result.
StreamingOutput     Assistant text/tool-result explanation is streaming.
WaitingForSteer     User has interrupted or steered during streaming.
Compacting          History is being summarized/pruned to maintain context budget.
Branching           Session tree is navigating/forking from an earlier message.
Aborting            Active run is being cancelled.
Failed              A recoverable or terminal error occurred.
Completed           One run has completed and committed output.

B. Conversation state

This is the persistent log/object memory.

ConversationState = {
  sessionId: string
  rootMessageId: string
  currentHeadId: string
  messages: AgentMessage[]
  branches: Map<MessageId, BranchMetadata>
  compactedSummaries: Summary[]
  pendingQueue: InputMessage[]
}

Pi exposes messages, sessionId, sessionFile, and tree navigation through the session interface. (pi.dev)

C. Capability state

This is what the agent is allowed to do.

CapabilityState = {
  model: Model
  thinkingLevel: ThinkingLevel
  tools: ToolRegistry
  resourceLoader: ResourceLoader
  authProfile: AuthProfile
  systemPrompt: string
  contextFiles: ContextFile[]
  extensions: Extension[]
}

The Pi SDK docs describe model control, resource loading for extensions/skills/prompt templates/themes/context files, and built-in/custom tools. (pi.dev) OpenClaw’s integration also distinguishes package responsibilities: pi-ai for model/message abstractions, pi-agent-core for the loop and tool execution, and pi-coding-agent for high-level session creation and built-ins. (GitHub)

D. Environment state

This is Alan Kay territory: the agent is not “a function”; it is an object in a society of objects.

EnvironmentState = {
  workspace: FileSystemView
  shell: ShellState
  sandbox: SandboxState
  channels: MessagingChannel[]
  externalServices: ServiceRegistry
  observers: EventSubscriber[]
}

OpenClaw uses Pi by embedding an AgentSession inside a messaging gateway, injecting messaging, sandbox, and channel-specific actions as custom tools. (OpenClaw)

3. Messages

In Alan Kay terms, the architecture is mostly about messages between objects, not function calls. The agent, model, tools, UI, filesystem, sandbox, and channel gateway are all objects that communicate by message.

External input messages

type InputMessage =
  | { type: "prompt"; text: string; sender: UserId; channel?: ChannelId }
  | { type: "steer"; text: string; runId: RunId }
  | { type: "followUp"; text: string; parentMessageId?: MessageId }
  | { type: "slashCommand"; command: string; args: string[] }
  | { type: "abort"; runId: RunId }
  | { type: "navigateTree"; targetId: MessageId; summarize?: boolean }

Pi’s AgentSession directly supports prompt, steer, followUp, subscribe, model control, state access, and tree navigation. (pi.dev)
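
A minimal sketch of how such input messages might be routed onto a session. SessionLike is a hypothetical stand-in I define here: the method names follow the ones mentioned above, but the parameter lists, return types, and the isStreaming flag are assumptions rather than the SDK's published signatures. The routing policy (a prompt that arrives mid-stream is queued as a follow-up, a steer with nothing to steer is rejected) is likewise just one reasonable choice.

```typescript
// Subset of the InputMessage union above, trimmed to what the sketch needs.
type RoutedInput =
  | { type: "prompt"; text: string }
  | { type: "steer"; text: string }
  | { type: "followUp"; text: string };

// Hypothetical stand-in for the session object (assumed shape, not the SDK's).
interface SessionLike {
  isStreaming: boolean;
  prompt(text: string): Promise<void>;
  steer(text: string): Promise<void>;
  followUp(text: string): Promise<void>;
}

type Route = "prompt" | "steer" | "followUp" | "reject";

// Pure decision: which session method should receive this input?
function chooseRoute(isStreaming: boolean, input: RoutedInput): Route {
  switch (input.type) {
    case "prompt":
      // Assumed policy: never start a second run while one is streaming.
      return isStreaming ? "followUp" : "prompt";
    case "steer":
      // Steering only makes sense while a run is active.
      return isStreaming ? "steer" : "reject";
    case "followUp":
      return "followUp";
  }
}

// Applies the decision to the session and reports which route was taken.
async function routeInput(session: SessionLike, input: RoutedInput): Promise<Route> {
  const route = chooseRoute(session.isStreaming, input);
  if (route === "prompt") await session.prompt(input.text);
  else if (route === "steer") await session.steer(input.text);
  else if (route === "followUp") await session.followUp(input.text);
  return route;
}
```

Splitting the pure decision from the side-effecting call keeps the policy testable without a live model behind the session.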

Internal agent messages

type AgentInternalMessage =
  | { type: "planStarted"; runId: RunId }
  | { type: "modelRequest"; messages: AgentMessage[]; model: Model }
  | { type: "modelDelta"; delta: string }
  | { type: "toolCallRequested"; tool: string; args: unknown }
  | { type: "toolResultReceived"; tool: string; result: unknown }
  | { type: "contextCompactionRequested"; reason: "budget" | "manual" }
  | { type: "branchCreated"; from: MessageId; to: MessageId }
  | { type: "runCompleted"; runId: RunId }
  | { type: "runFailed"; runId: RunId; error: AgentError }

Event messages to subscribers

type AgentSessionEvent =
  | { type: "message_update"; assistantMessageEvent: AssistantMessageEvent }
  | { type: "tool_call_started"; callId: string; tool: string }
  | { type: "tool_call_completed"; callId: string; resultRef: string }
  | { type: "state_changed"; from: AgentState; to: AgentState }
  | { type: "usage_update"; tokens: TokenUsage; cost?: number }
  | { type: "error"; error: AgentError }

The SDK example shows subscribing to events and handling message_update with text_delta while the session streams output. (pi.dev)
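
To make the observer role concrete, here is a small sketch that rebuilds assistant text from streamed deltas. EventHub is a hypothetical in-memory stand-in for a session's subscribe(), and the inner shape of assistantMessageEvent (a type of "text_delta" plus a delta string) is my assumption based on the description above, not a copy of the SDK types.

```typescript
// Hypothetical event shapes modelled on the union above.
type StreamEvent =
  | {
      type: "message_update";
      assistantMessageEvent: { type: "text_delta"; delta: string } | { type: "other" };
    }
  | { type: "state_changed"; from: string; to: string };

type SessionListener = (event: StreamEvent) => void;

// Tiny in-memory event source standing in for a session's subscribe().
class EventHub {
  private listeners = new Set<SessionListener>();

  // Registers a listener and returns a function that removes it again.
  subscribe(listener: SessionListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(event: StreamEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}

// Observer that accumulates text deltas; it reads events and owns no state
// that the session depends on.
function collectText(hub: EventHub): { text: () => string; stop: () => void } {
  let buffer = "";
  const stop = hub.subscribe((event) => {
    if (event.type !== "message_update") return;
    const inner = event.assistantMessageEvent;
    if (inner.type === "text_delta") buffer += inner.delta;
  });
  return { text: () => buffer, stop };
}
```

Because the collector only listens, dropping it or attaching a second one leaves the session itself unchanged.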

4. Invariants

These are the rules that must always hold. In Lamport language, they are the safety properties of the state machine.

A. Log invariants

I1. Every committed state transition is caused by exactly one input message or internal event.

I2. The message log is append-only except through explicit branch creation or compaction.

I3. Every assistant message has a causal parent: user prompt, follow-up, steer, tool result, or compacted summary.

I4. A branch never mutates its ancestor; it creates a new head.

I5. A compacted summary must preserve causal commitments:
    facts, user requests, tool results, pending tasks, safety constraints, and unresolved questions.
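
I2 through I4 can be sketched with a parent-pointer tree. The LogEntry shape and the ConversationTree class are hypothetical and much smaller than whatever a real Pi session file stores; the point is only that branching moves the head and never rewrites an ancestor.

```typescript
// Hypothetical minimal log entry; real session entries carry more fields.
type LogEntry = {
  id: string;
  parentId: string | null;
  role: "user" | "assistant" | "toolResult" | "summary";
  text: string;
};

class ConversationTree {
  private entries = new Map<string, LogEntry>();
  private nextId = 1;
  head: string | null = null;

  // Appends a child of the current head and moves the head to it (I2, I3).
  append(role: LogEntry["role"], text: string): string {
    if (role === "assistant" && this.head === null) {
      throw new Error("assistant message needs a causal parent");
    }
    const id = `m${this.nextId++}`;
    // Entries are frozen so an ancestor cannot be mutated later (I4).
    this.entries.set(id, Object.freeze({ id, parentId: this.head, role, text }));
    this.head = id;
    return id;
  }

  // Branching only moves the head; nothing already in the log changes (I4).
  branchFrom(id: string): void {
    if (!this.entries.has(id)) throw new Error(`unknown entry: ${id}`);
    this.head = id;
  }

  // Walks parent pointers from the head back to the root, oldest first.
  pathToHead(): LogEntry[] {
    const path: LogEntry[] = [];
    let cursor = this.head;
    while (cursor !== null) {
      const entry = this.entries.get(cursor);
      if (entry === undefined) break; // not expected: parents are always stored
      path.unshift(entry);
      cursor = entry.parentId;
    }
    return path;
  }

  size(): number {
    return this.entries.size;
  }
}
```

After a branch, the abandoned entries are still in the log; they are simply no longer on the path to the current head.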

B. Run invariants

I6. At most one active model stream exists per session lane.

I7. A run is in exactly one of these states at any time: active, completed, aborted, or failed.

I8. A tool result may only be accepted for a known pending tool call.

I9. A tool call must be idempotently recorded before execution.

I10. An abort moves the run to Aborting and prevents further tool side effects unless the tool has already committed.

OpenClaw’s architecture has explicit active-run tracking, abort handling, queueing, history limiting, compaction, lanes, and event subscription/dispatch, which are exactly the implementation pressure points for these invariants. (GitHub)
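
As a sketch of I8 through I10, a per-run ledger can track which tool calls are known and pending. ToolCallLedger is a hypothetical helper of mine, not something taken from Pi or OpenClaw, and it keeps everything in memory where a real runtime would persist the record.

```typescript
type ToolCallRecord = { callId: string; tool: string; status: "pending" | "done" };

class ToolCallLedger {
  private calls = new Map<string, ToolCallRecord>();
  private aborted = false;

  // I9: record the call before the runtime executes it. Recording the same
  // callId twice is a no-op, which makes the write idempotent.
  // I10: once the run is aborting, no new call may be recorded.
  record(callId: string, tool: string): boolean {
    if (this.aborted) return false;
    if (!this.calls.has(callId)) {
      this.calls.set(callId, { callId, tool, status: "pending" });
    }
    return true;
  }

  // I8: accept a result only for a known call that is still pending.
  // Results for calls recorded before an abort are still accepted, since
  // those tools may already have committed their side effects.
  acceptResult(callId: string): boolean {
    const call = this.calls.get(callId);
    if (call === undefined || call.status !== "pending") return false;
    call.status = "done";
    return true;
  }

  abort(): void {
    this.aborted = true;
  }

  pendingCount(): number {
    let count = 0;
    this.calls.forEach((call) => {
      if (call.status === "pending") count += 1;
    });
    return count;
  }
}
```

The runtime would call record() and only execute the tool when it returns true, so an abort closes the door on new side effects without losing results that are already in flight.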

C. Tool invariants

I11. The model proposes tool calls; the runtime authorizes and executes them.

I12. Tools execute only if allowed by the current capability state.

I13. Tool inputs and outputs are serialized into the conversation history or a referenced artifact store.

I14. Destructive tools require an authorization policy distinct from ordinary read tools.

I15. A tool cannot silently expand the agent’s authority.
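
I11, I12, and I14 amount to a gate between the model's proposal and the runtime's execution. A minimal sketch, assuming a hypothetical two-level tool classification and a single allowDestructive flag; a real policy would likely be richer (per-path, per-channel, per-user).

```typescript
type ToolKind = "read" | "destructive";
type ToolSpec = { toolName: string; kind: ToolKind };

// Hypothetical slice of the capability state relevant to authorization.
type Capabilities = { enabledTools: string[]; allowDestructive: boolean };

type Decision = { allowed: true } | { allowed: false; reason: string };

// The model proposes `spec`; the runtime decides against `caps` (I11, I12).
function authorize(spec: ToolSpec, caps: Capabilities): Decision {
  if (!caps.enabledTools.includes(spec.toolName)) {
    return { allowed: false, reason: "tool not enabled" };
  }
  // I14: destructive tools pass a separate check from ordinary reads.
  if (spec.kind === "destructive" && !caps.allowDestructive) {
    return { allowed: false, reason: "destructive tools need separate approval" };
  }
  return { allowed: true };
}
```

Because authorize() takes the capability state as an argument and returns a plain decision, a tool has no path to widen its own authority (I15); only whoever owns the capability state can do that.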

D. Context invariants

I16. The model context is a projection of persistent session state, not the source of truth.

I17. Context pruning may remove tokens, but not obligations.

I18. System prompt, developer policy, user intent, and tool results have explicit precedence.

I19. Provider-specific formatting changes must not change semantic conversation order.

I20. If the context budget is exceeded, the next valid transition is to Compacting, to Failed, or to asking the user to reduce the input; silent truncation is never valid.
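
I16 and I20 together suggest that building the model context is a pure function of the log that either fits the budget or says so. A sketch under simplifying assumptions: token counts are precomputed per item, and the only alternative to Planning shown here is Compacting.

```typescript
type ContextItem = { text: string; tokens: number };

type ContextPlan =
  | { next: "Planning"; context: ContextItem[] }
  | { next: "Compacting"; overBy: number };

// Projects the persistent log into a model context (I16). When the projection
// does not fit, it returns an explicit Compacting transition instead of
// dropping items quietly (I20).
function planContext(logItems: ContextItem[], tokenBudget: number): ContextPlan {
  const total = logItems.reduce((sum, item) => sum + item.tokens, 0);
  if (total > tokenBudget) {
    return { next: "Compacting", overBy: total - tokenBudget };
  }
  // Copy the array so callers cannot reorder or trim the log through the
  // returned context.
  return { next: "Planning", context: logItems.slice() };
}
```

The caller has to handle the Compacting case because the return type forces it, which is the type-level version of "not silent truncation".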

E. Object/message invariants

These are closer to Alan Kay:

I21. Objects communicate by messages, not shared mutable internals.

I22. The agent object does not own the world; it sends requests to workspace, shell, sandbox, channel, and model objects.

I23. Each object is responsible for maintaining its own local invariants.

I24. Cross-object effects are represented as messages/events, so they can be replayed, inspected, or compensated.

I25. The UI is an observer of session events, not the canonical state.

5. Minimal formal sketch

A compact TLA-ish model:

VARIABLES
  phase,
  log,
  queue,
  activeRun,
  pendingToolCalls,
  model,
  tools,
  context,
  branches

Init ==
  phase = "Uninitialized"
  /\ log = << >>
  /\ queue = << >>
  /\ activeRun = NULL
  /\ pendingToolCalls = {}

Next ==
  CreateSession
  \/ EnqueuePrompt
  \/ StartRun
  \/ ModelStep
  \/ RequestTool
  \/ CompleteTool
  \/ StreamDelta
  \/ CompleteRun
  \/ Steer
  \/ FollowUp
  \/ Compact
  \/ Branch
  \/ Abort
  \/ Fail

Safety ==
  OneActiveRun
  /\ KnownToolResultsOnly
  /\ AppendOnlyLog
  /\ BranchPreservesAncestors
  /\ AuthorizedToolsOnly
  /\ ContextIsProjection

6. A practical Pi-agent state model

For an actual implementation, I would define the durable state like this:

type DurableAgentState = {
  sessionId: string
  phase: AgentPhase
  head: MessageId
  log: AgentLogEntry[]
  branches: Record<MessageId, Branch>
  queue: QueuedInput[]
  activeRun?: {
    runId: string
    startedAt: string
    parentMessageId: MessageId
    status: "planning" | "tooling" | "streaming" | "aborting"
    pendingToolCalls: Record<string, PendingToolCall>
  }
  capabilities: {
    modelId: string
    thinkingLevel?: string
    enabledTools: string[]
    authProfileId?: string
    sandboxId?: string
  }
  context: {
    summaries: Summary[]
    loadedFiles: string[]
    tokenBudget: number
  }
}

And the runtime-only state like this:

type RuntimeAgentState = {
  subscribers: Set<(event: AgentSessionEvent) => void>
  abortController?: AbortController
  streamHandle?: AsyncIterable<ModelEvent>
  resourceLoader: ResourceLoader
  toolRegistry: ToolRegistry
}

The key distinction: durable state is replayable; runtime state is disposable.
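
"Replayable" can be made concrete as a fold over the durable log. The ReplayEntry variants below are a hypothetical simplification of AgentLogEntry, chosen only to show that head position and pending tool calls can be recomputed from the log alone, with no runtime state involved.

```typescript
// Hypothetical simplified log entries.
type ReplayEntry =
  | { kind: "message"; id: string; parentId: string | null }
  | { kind: "toolCall"; callId: string }
  | { kind: "toolResult"; callId: string }
  | { kind: "headMoved"; to: string };

type ReplayedState = {
  head: string | null;
  messageCount: number;
  pendingToolCalls: string[];
};

// Rebuilds derived state from the log. It is a pure function of its input,
// so replaying the same log always yields the same state.
function replay(entries: ReplayEntry[]): ReplayedState {
  const state: ReplayedState = { head: null, messageCount: 0, pendingToolCalls: [] };
  for (const entry of entries) {
    switch (entry.kind) {
      case "message":
        state.head = entry.id;
        state.messageCount += 1;
        break;
      case "toolCall":
        state.pendingToolCalls.push(entry.callId);
        break;
      case "toolResult":
        state.pendingToolCalls = state.pendingToolCalls.filter((id) => id !== entry.callId);
        break;
      case "headMoved":
        state.head = entry.to;
        break;
    }
  }
  return state;
}
```

After a crash, subscribers, abort controllers, and stream handles are recreated from scratch, while the result of replay() tells the runtime which tool calls were recorded but never answered.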

7. The shortest useful answer

If you want the clean conceptual version:

States

Uninitialized, Initializing, Idle, ReceivingInput, Planning, CallingTool,
AwaitingToolResult, StreamingOutput, WaitingForSteer, Compacting, Branching,
Aborting, Failed, Completed

Messages

prompt, steer, followUp, slashCommand, abort, navigateTree,
modelRequest, modelDelta, toolCall, toolResult, compact,
branch, stateChanged, runCompleted, runFailed

Invariants

One active run per session lane.
All outputs descend from a causal input.
Tool results match prior tool calls.
Tools require explicit capability authorization.
The persistent log is append-only except explicit branch/compact operations.
Branches never mutate ancestors.
Context is a projection, not the source of truth.
Compaction preserves commitments.
UI observes events; it does not own state.
Every external side effect is represented in the log.

In one sentence: a Pi agent should be modeled as a message-driven object whose durable conversation tree is the source of truth, whose session loop is a Lamport-style state machine, and whose tools/world interfaces are Kay-style objects exchanging explicit messages.
