A good way to model an agent built on the Pi SDK: it is a replicated, append-only conversation/process object whose behavior is driven by messages, guarded by invariants, and exposed through events. Pi’s SDK already provides the right primitives: createAgentSession(), AgentSession.prompt(), steer(), followUp(), event subscription, model control, message history, streaming state, and tree navigation. The docs describe the session as managing lifecycle, message history, model state, compaction, and event streaming. (pi.dev)
1. Core state machine
Think Lamport first: the agent is a deterministic-ish state machine over an ordered log of inputs and outputs.
type AgentState =
  | Uninitialized
  | Initializing
  | Idle
  | ReceivingInput
  | Planning
  | CallingTool
  | AwaitingToolResult
  | StreamingOutput
  | WaitingForSteer
  | Compacting
  | Branching
  | Aborting
  | Failed
  | Completed
The most important transition loop is:
Idle
  -> ReceivingInput
  -> Planning
  -> [CallingTool -> AwaitingToolResult -> Planning]*
  -> StreamingOutput
  -> Completed
  -> Idle
But a real Pi-style agent needs extra branches:
StreamingOutput + steer(message) -> WaitingForSteer -> Planning
Idle + followUp(message) -> ReceivingInput
Planning + contextTooLarge -> Compacting -> Planning
AnyActive + abort -> Aborting -> Idle | Failed
AnyState + navigateTree(target) -> Branching -> Idle
AnyState + modelFailure -> Failed | ModelFailover
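The loop and its extra branches can be written down as an explicit transition table. The sketch below is a minimal model, not Pi's internal implementation: the `Phase` and `Trigger` names are hypothetical, it starts from Idle (session creation is omitted), and the modelFailure branch is left out for brevity.

```typescript
// Hypothetical names throughout; this models the transitions listed above,
// not the Pi SDK's internal loop.
type Phase =
  | "Idle" | "ReceivingInput" | "Planning" | "CallingTool"
  | "AwaitingToolResult" | "StreamingOutput" | "WaitingForSteer"
  | "Compacting" | "Branching" | "Aborting" | "Failed" | "Completed";

type Trigger =
  | "prompt" | "followUp" | "accepted" | "toolCall" | "toolDispatched"
  | "toolResult" | "startStreaming" | "steer" | "steerAccepted"
  | "contextTooLarge" | "compacted" | "streamDone" | "committed"
  | "abort" | "aborted" | "navigateTree" | "branched";

// Phases in which a run is in flight and can therefore be aborted.
const ACTIVE: ReadonlySet<Phase> = new Set<Phase>([
  "ReceivingInput", "Planning", "CallingTool", "AwaitingToolResult",
  "StreamingOutput", "WaitingForSteer", "Compacting",
]);

// Explicit table: anything not listed is an illegal transition.
const TABLE: Partial<Record<Phase, Partial<Record<Trigger, Phase>>>> = {
  Idle: { prompt: "ReceivingInput", followUp: "ReceivingInput" },
  ReceivingInput: { accepted: "Planning" },
  Planning: {
    toolCall: "CallingTool",
    startStreaming: "StreamingOutput",
    contextTooLarge: "Compacting",
  },
  CallingTool: { toolDispatched: "AwaitingToolResult" },
  AwaitingToolResult: { toolResult: "Planning" },
  StreamingOutput: { steer: "WaitingForSteer", streamDone: "Completed" },
  WaitingForSteer: { steerAccepted: "Planning" },
  Compacting: { compacted: "Planning" },
  Completed: { committed: "Idle" },
  Aborting: { aborted: "Idle" },
  Branching: { branched: "Idle" },
};

function transition(phase: Phase, trigger: Trigger): Phase {
  // Wildcard rules first: abort applies to any active phase,
  // navigateTree to any phase.
  if (trigger === "abort" && ACTIVE.has(phase)) return "Aborting";
  if (trigger === "navigateTree") return "Branching";
  const next = TABLE[phase]?.[trigger];
  if (next === undefined) {
    throw new Error(`illegal transition: ${phase} + ${trigger}`);
  }
  return next;
}

// One pass through the main loop with a single tool call.
const happyPath: Trigger[] = [
  "prompt", "accepted", "toolCall", "toolDispatched",
  "toolResult", "startStreaming", "streamDone", "committed",
];
const finalPhase = happyPath.reduce<Phase>(transition, "Idle"); // back at "Idle"
```

Keeping the table as data means illegal transitions fail loudly instead of being silently absorbed, which is what the invariants later in this post rely on.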
Pi’s embedded use in OpenClaw makes this more explicit: the integration emphasizes control over session lifecycle, event handling, custom tool injection, prompt customization, persistence with branching/compaction, auth failover, and provider-agnostic model switching. (OpenClaw)
2. States
I would separate state into control state, conversation state, capability state, and environment state.
A. Control state
This is the Lamport-style finite state machine.
| State | Meaning |
| --- | --- |
| Uninitialized | No session exists yet. |
| Initializing | Loading auth, model registry, resource loader, tools, skills, extensions. |
| Idle | Session exists and can accept a prompt. |
| ReceivingInput | A user/system/channel message has been accepted into the queue. |
| Planning | Model is deciding next action. |
| CallingTool | Agent has emitted a tool call. |
| AwaitingToolResult | Runtime is executing the tool and waiting for result. |
| StreamingOutput | Assistant text/tool-result explanation is streaming. |
| WaitingForSteer | User has interrupted or steered during streaming. |
| Compacting | History is being summarized/pruned to maintain context budget. |
| Branching | Session tree is navigating/forking from an earlier message. |
| Aborting | Active run is being cancelled. |
| Failed | A recoverable or terminal error occurred. |
| Completed | One run has completed and committed output. |
B. Conversation state
This is the persistent log/object memory.
type ConversationState = {
  sessionId: string
  rootMessageId: string
  currentHeadId: string
  messages: AgentMessage[]
  branches: Map<MessageId, BranchMetadata>
  compactedSummaries: Summary[]
  pendingQueue: InputMessage[]
}
Pi exposes messages, sessionId, sessionFile, and tree navigation through the session interface. (pi.dev)
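To make the head/branch relationship concrete, here is a minimal sketch of a conversation tree in which branching only moves the head. The `TreeNode` and `ConversationTree` names are hypothetical stand-ins; the SDK's real message and tree types may look different.

```typescript
// Hypothetical shapes: the SDK's real message and tree types may differ.
type MessageId = string;

interface TreeNode {
  readonly id: MessageId;
  readonly parentId: MessageId | null; // null only for the root
  readonly role: "user" | "assistant" | "toolResult" | "summary";
  readonly text: string;
}

class ConversationTree {
  private readonly nodes = new Map<MessageId, TreeNode>();
  private head: MessageId | null = null;

  // Append a child of the current head and advance the head.
  append(id: MessageId, role: TreeNode["role"], text: string): TreeNode {
    if (this.nodes.has(id)) throw new Error(`duplicate id: ${id}`);
    const node: TreeNode = Object.freeze({ id, parentId: this.head, role, text });
    this.nodes.set(id, node);
    this.head = id;
    return node;
  }

  // Branching moves the head only; existing nodes are never rewritten.
  navigateTo(id: MessageId): void {
    if (!this.nodes.has(id)) throw new Error(`unknown id: ${id}`);
    this.head = id;
  }

  // The active branch: walk parent links from head to root, then reverse.
  activePath(): TreeNode[] {
    const path: TreeNode[] = [];
    let cursor = this.head;
    while (cursor !== null) {
      const node = this.nodes.get(cursor);
      if (!node) throw new Error(`dangling parent: ${cursor}`);
      path.push(node);
      cursor = node.parentId;
    }
    return path.reverse();
  }
}

const tree = new ConversationTree();
tree.append("m1", "user", "list the files");
tree.append("m2", "assistant", "here they are");
tree.navigateTo("m1"); // go back to the first message
tree.append("m3", "assistant", "alternative answer");
const activeIds = tree.activePath().map((n) => n.id); // ["m1", "m3"]
```

The old branch (m1, m2) is still in the map and can be reached again with `navigateTo("m2")`; nothing was deleted to create the new one.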
C. Capability state
This is what the agent is allowed to do.
type CapabilityState = {
  model: Model
  thinkingLevel: ThinkingLevel
  tools: ToolRegistry
  resourceLoader: ResourceLoader
  authProfile: AuthProfile
  systemPrompt: string
  contextFiles: ContextFile[]
  extensions: Extension[]
}
The Pi SDK docs describe model control, resource loading for extensions/skills/prompt templates/themes/context files, and built-in/custom tools. (pi.dev) OpenClaw’s integration also distinguishes package responsibilities: pi-ai for model/message abstractions, pi-agent-core for the loop and tool execution, and pi-coding-agent for high-level session creation and built-ins. (GitHub)
D. Environment state
This is Alan Kay territory: the agent is not “a function”; it is an object in a society of objects.
type EnvironmentState = {
  workspace: FileSystemView
  shell: ShellState
  sandbox: SandboxState
  channels: MessagingChannel[]
  externalServices: ServiceRegistry
  observers: EventSubscriber[]
}
OpenClaw uses Pi by embedding an AgentSession inside a messaging gateway, injecting messaging, sandbox, and channel-specific actions as custom tools. (OpenClaw)
3. Messages
In Alan Kay terms, the architecture is mostly about messages between objects, not function calls. The agent, model, tools, UI, filesystem, sandbox, and channel gateway are all objects that communicate by message.
External input messages
type InputMessage =
  | { type: "prompt"; text: string; sender: UserId; channel?: ChannelId }
  | { type: "steer"; text: string; runId: RunId }
  | { type: "followUp"; text: string; parentMessageId?: MessageId }
  | { type: "slashCommand"; command: string; args: string[] }
  | { type: "abort"; runId: RunId }
  | { type: "navigateTree"; targetId: MessageId; summarize?: boolean }
Pi’s AgentSession directly supports prompt, steer, followUp, subscribe, model control, state access, and tree navigation. (pi.dev)
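One way to consume such a union is an exhaustive dispatcher that routes each message to a session-like object. In this sketch the `SessionPort` interface is hypothetical: the method names prompt, steer, and followUp follow the text, but the signatures (and the `abort`, `navigateTree`, and `runCommand` members) are assumptions made for illustration, not the SDK's actual API.

```typescript
// Simplified copy of the InputMessage union above (sender/channel omitted).
type InputMessage =
  | { type: "prompt"; text: string }
  | { type: "steer"; text: string; runId: string }
  | { type: "followUp"; text: string }
  | { type: "slashCommand"; command: string; args: string[] }
  | { type: "abort"; runId: string }
  | { type: "navigateTree"; targetId: string };

// Hypothetical port: signatures are assumed, not taken from the SDK.
interface SessionPort {
  prompt(text: string): void;
  steer(text: string): void;
  followUp(text: string): void;
  abort(): void;
  navigateTree(targetId: string): void;
  runCommand(command: string, args: string[]): void;
}

function assertNever(value: never): never {
  throw new Error(`unhandled message: ${JSON.stringify(value)}`);
}

function dispatch(session: SessionPort, msg: InputMessage): void {
  switch (msg.type) {
    case "prompt": session.prompt(msg.text); break;
    case "steer": session.steer(msg.text); break;
    case "followUp": session.followUp(msg.text); break;
    case "slashCommand": session.runCommand(msg.command, msg.args); break;
    case "abort": session.abort(); break;
    case "navigateTree": session.navigateTree(msg.targetId); break;
    // Compile error here if a variant is added to the union but not handled.
    default: assertNever(msg);
  }
}

// A recording fake stands in for a real session.
const calls: string[] = [];
const recorder: SessionPort = {
  prompt: (text) => { calls.push(`prompt:${text}`); },
  steer: (text) => { calls.push(`steer:${text}`); },
  followUp: (text) => { calls.push(`followUp:${text}`); },
  abort: () => { calls.push("abort"); },
  navigateTree: (id) => { calls.push(`navigateTree:${id}`); },
  runCommand: (command, args) => { calls.push(`/${command} ${args.join(" ")}`); },
};

dispatch(recorder, { type: "prompt", text: "fix the test" });
dispatch(recorder, { type: "steer", text: "only touch src/", runId: "r1" });
dispatch(recorder, { type: "abort", runId: "r1" });
```

Routing through a narrow port like this also keeps the gateway testable without a live model behind it.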
Internal agent messages
type AgentInternalMessage =
  | { type: "planStarted"; runId: RunId }
  | { type: "modelRequest"; messages: AgentMessage[]; model: Model }
  | { type: "modelDelta"; delta: string }
  | { type: "toolCallRequested"; tool: string; args: unknown }
  | { type: "toolResultReceived"; tool: string; result: unknown }
  | { type: "contextCompactionRequested"; reason: "budget" | "manual" }
  | { type: "branchCreated"; from: MessageId; to: MessageId }
  | { type: "runCompleted"; runId: RunId }
  | { type: "runFailed"; runId: RunId; error: AgentError }
Event messages to subscribers
type AgentSessionEvent =
  | { type: "message_update"; assistantMessageEvent: AssistantMessageEvent }
  | { type: "tool_call_started"; callId: string; tool: string }
  | { type: "tool_call_completed"; callId: string; resultRef: string }
  | { type: "state_changed"; from: AgentState; to: AgentState }
  | { type: "usage_update"; tokens: TokenUsage; cost?: number }
  | { type: "error"; error: AgentError }
The SDK example shows subscribing to events and handling message_update with text_delta while the session streams output. (pi.dev)
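The observer side of this can be sketched without the SDK at all. The `EventHub` class below is a hypothetical stand-in for the session's event stream; only the message_update/text_delta pairing comes from the text, while the `delta` field name and the other event shapes are assumptions.

```typescript
// Local stand-ins for the event types; only the message_update/text_delta
// pairing comes from the text, the remaining fields are assumptions.
type AssistantMessageEvent =
  | { type: "text_delta"; delta: string }
  | { type: "text_end" };

type SessionEvent =
  | { type: "message_update"; assistantMessageEvent: AssistantMessageEvent }
  | { type: "error"; message: string };

type Listener = (event: SessionEvent) => void;

class EventHub {
  private readonly listeners = new Set<Listener>();

  // Returns an unsubscribe function so observers can detach cleanly.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => { this.listeners.delete(listener); };
  }

  emit(event: SessionEvent): void {
    // Copy first so a listener that unsubscribes mid-dispatch is safe.
    for (const listener of Array.from(this.listeners)) listener(event);
  }
}

const hub = new EventHub();
let streamedText = "";

// The observer accumulates streamed text; it never touches session state.
const unsubscribe = hub.subscribe((event) => {
  if (
    event.type === "message_update" &&
    event.assistantMessageEvent.type === "text_delta"
  ) {
    streamedText += event.assistantMessageEvent.delta;
  }
});

const delta = (text: string): SessionEvent => ({
  type: "message_update",
  assistantMessageEvent: { type: "text_delta", delta: text },
});

hub.emit(delta("Hel"));
hub.emit(delta("lo"));
unsubscribe();
hub.emit(delta("!!")); // ignored: the observer has detached
```

After the three emits, `streamedText` holds "Hello": the last delta arrives after the observer has unsubscribed.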
4. Invariants
These are the rules that must always hold. In Lamport language, they are the safety properties of the state machine.
A. Log invariants
I1. Every committed state transition is caused by exactly one input message or internal event.
I2. The message log is append-only except through explicit branch creation or compaction.
I3. Every assistant message has a causal parent: user prompt, follow-up, steer, tool result, or compacted summary.
I4. A branch never mutates its ancestor; it creates a new head.
I5. A compacted summary must preserve causal commitments:
facts, user requests, tool results, pending tasks, safety constraints, and unresolved questions.
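I3 is easy to check mechanically. The sketch below is one possible checker under an assumed, hypothetical `Entry` shape: it treats "has a causal parent" as "the parent id appears earlier in the log", which is a simplification of the rule as stated.

```typescript
// Hypothetical log entry shape for illustration only.
type Role = "user" | "assistant" | "toolResult" | "summary";

interface Entry {
  id: string;
  parentId: string | null;
  role: Role;
}

// Returns the ids of assistant entries that violate I3.
function findOrphanAssistants(log: readonly Entry[]): string[] {
  const seen = new Set<string>();
  const orphans: string[] = [];
  for (const entry of log) {
    // A parent must appear earlier in the log, which also rules out cycles.
    const hasParent = entry.parentId !== null && seen.has(entry.parentId);
    if (entry.role === "assistant" && !hasParent) orphans.push(entry.id);
    seen.add(entry.id);
  }
  return orphans;
}

const sampleLog: Entry[] = [
  { id: "u1", parentId: null, role: "user" },
  { id: "a1", parentId: "u1", role: "assistant" },
  { id: "a2", parentId: "missing", role: "assistant" }, // violates I3
];
const orphans = findOrphanAssistants(sampleLog); // ["a2"]
```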
B. Run invariants
I6. At most one active model stream exists per session lane.
I7. A run is in exactly one of these statuses at any time: active, completed, aborted, or failed.
I8. A tool result may only be accepted for a known pending tool call.
I9. A tool call must be idempotently recorded before execution.
I10. An abort moves the run to Aborting and prevents further tool side effects unless the tool has already committed.
OpenClaw’s architecture has explicit active-run tracking, abort handling, queueing, history limiting, compaction, lanes, and event subscription/dispatch, which are exactly the implementation pressure points for these invariants. (GitHub)
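As a rough illustration of how I6 through I10 can be enforced at runtime, here is a small tracker for one session lane. `RunTracker` is a hypothetical class written for this post, not something taken from Pi or OpenClaw.

```typescript
// Hypothetical run tracker enforcing I6 to I10; not an SDK class.
type RunStatus = "active" | "completed" | "aborted" | "failed";

class RunTracker {
  private status: RunStatus | null = null; // null: no run started yet
  private readonly calls = new Map<string, "pending" | "done">();
  readonly journal: string[] = [];

  startRun(): void {
    // I6: refuse a second concurrent run on the same lane.
    if (this.status === "active") throw new Error("run already active");
    this.status = "active";
    this.calls.clear();
  }

  // I9: record the call before the caller is allowed to execute it.
  recordToolCall(callId: string): void {
    // I10: once the run is no longer active, no new tool calls are recorded.
    if (this.status !== "active") throw new Error("no active run");
    if (this.calls.has(callId)) return; // idempotent re-record
    this.calls.set(callId, "pending");
    this.journal.push(`call:${callId}`);
  }

  // I8: only results for known pending calls are accepted.
  // A result may still arrive after an abort if the tool had already committed.
  acceptToolResult(callId: string): void {
    if (this.calls.get(callId) !== "pending") {
      throw new Error(`no pending call: ${callId}`);
    }
    this.calls.set(callId, "done");
    this.journal.push(`result:${callId}`);
  }

  // I7: a run leaves "active" exactly once, into a single terminal status.
  finish(status: Exclude<RunStatus, "active">): void {
    if (this.status !== "active") throw new Error("no active run");
    this.status = status;
  }
}

const tracker = new RunTracker();
tracker.startRun();
tracker.recordToolCall("c1");
tracker.recordToolCall("c1"); // no-op: already recorded
tracker.acceptToolResult("c1");
tracker.finish("completed");
// tracker.journal is ["call:c1", "result:c1"]
```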
C. Tool invariants
I11. The model proposes tool calls; the runtime authorizes and executes them.
I12. Tools execute only if allowed by the current capability state.
I13. Tool inputs and outputs are serialized into the conversation history or a referenced artifact store.
I14. Destructive tools require an authorization policy distinct from ordinary read tools.
I15. A tool cannot silently expand the agent’s authority.
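I11, I12, and I14 reduce to a small authorization gate that sits between the model's proposal and the tool's execution. The types and the `allowDestructive` flag below are hypothetical; they only illustrate the separation of the two grants.

```typescript
// Hypothetical policy types; the SDK's tool registry is not modelled here.
interface ToolSpec {
  name: string;
  destructive: boolean; // e.g. writes files or runs shell commands
}

interface Capabilities {
  enabledTools: ReadonlySet<string>;
  allowDestructive: boolean; // separate grant, per I14
}

type Decision = { allowed: true } | { allowed: false; reason: string };

function authorize(tool: ToolSpec, caps: Capabilities): Decision {
  // I12: the tool must be enabled in the current capability state.
  if (!caps.enabledTools.has(tool.name)) {
    return { allowed: false, reason: `tool not enabled: ${tool.name}` };
  }
  // I14: destructive tools need their own grant on top of being enabled.
  if (tool.destructive && !caps.allowDestructive) {
    return {
      allowed: false,
      reason: `destructive tool needs explicit grant: ${tool.name}`,
    };
  }
  return { allowed: true };
}

// I11: the runtime, not the model, decides whether the proposed call runs.
function runTool<T>(tool: ToolSpec, caps: Capabilities, execute: () => T): T {
  const decision = authorize(tool, caps);
  if (!decision.allowed) throw new Error(decision.reason);
  return execute();
}

const caps: Capabilities = {
  enabledTools: new Set(["read", "bash"]),
  allowDestructive: false,
};
const readDecision = authorize({ name: "read", destructive: false }, caps); // allowed
const bashDecision = authorize({ name: "bash", destructive: true }, caps); // denied
```

Because `authorize` is a pure function of the tool and the capability state, the same decision can be logged alongside the tool call for later inspection.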
D. Context invariants
I16. The model context is a projection of persistent session state, not the source of truth.
I17. Context pruning may remove tokens, but not obligations.
I18. System prompt, developer policy, user intent, and tool results have explicit precedence.
I19. Provider-specific formatting changes must not change semantic conversation order.
I20. If the context budget is exceeded, the only valid next steps are to compact (Compacting), to fail the run (Failed), or to ask the user to reduce the input; silent truncation is never valid.
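I16 and I20 together suggest a projection function that reads the durable log and reports budget overruns instead of hiding them. This is a sketch under loud assumptions: the `LogEntry` shape is hypothetical, and the token estimate (about four characters per token) is a crude placeholder for a real tokenizer.

```typescript
// Hypothetical entry shape and a crude token estimate (about 4 chars per
// token); a real implementation would use the provider's tokenizer.
interface LogEntry {
  id: string;
  text: string;
}

type Projection =
  | { kind: "ok"; entries: LogEntry[]; tokens: number }
  | { kind: "needsCompaction"; tokens: number; budget: number };

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Build the model context from the durable log without mutating it.
// Going over budget is reported to the caller, never handled by dropping entries.
function projectContext(log: readonly LogEntry[], budget: number): Projection {
  const tokens = log.reduce((sum, e) => sum + estimateTokens(e.text), 0);
  if (tokens > budget) return { kind: "needsCompaction", tokens, budget };
  return { kind: "ok", entries: log.slice(), tokens };
}

const durableLog: LogEntry[] = [
  { id: "m1", text: "abcdefgh" }, // 8 chars, estimated 2 tokens
  { id: "m2", text: "abcdefghijkl" }, // 12 chars, estimated 3 tokens
];

const fits = projectContext(durableLog, 10); // kind "ok", 5 tokens
const tooBig = projectContext(durableLog, 4); // kind "needsCompaction"
```

The caller is then forced to pick one of the valid transitions explicitly, because the "ok" variant is the only one that carries entries to send to the model.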
E. Object/message invariants
These are closer to Alan Kay:
I21. Objects communicate by messages, not shared mutable internals.
I22. The agent object does not own the world; it sends requests to workspace, shell, sandbox, channel, and model objects.
I23. Each object is responsible for maintaining its own local invariants.
I24. Cross-object effects are represented as messages/events, so they can be replayed, inspected, or compensated.
I25. The UI is an observer of session events, not the canonical state.
5. Minimal formal sketch
A compact TLA-ish model:
VARIABLES
  phase,
  log,
  queue,
  activeRun,
  pendingToolCalls,
  model,
  tools,
  context,
  branches

Init ==
  /\ phase = "Uninitialized"
  /\ log = << >>
  /\ queue = << >>
  /\ activeRun = NULL
  /\ pendingToolCalls = {}

Next ==
  \/ CreateSession
  \/ EnqueuePrompt
  \/ StartRun
  \/ ModelStep
  \/ RequestTool
  \/ CompleteTool
  \/ StreamDelta
  \/ CompleteRun
  \/ Steer
  \/ FollowUp
  \/ Compact
  \/ Branch
  \/ Abort
  \/ Fail

Safety ==
  /\ OneActiveRun
  /\ KnownToolResultsOnly
  /\ AppendOnlyLog
  /\ BranchPreservesAncestors
  /\ AuthorizedToolsOnly
  /\ ContextIsProjection
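Short of running a model checker, some of these safety properties can be checked over recorded traces. The sketch below covers only three of them, with a hypothetical `Snapshot` shape, and it treats AppendOnlyLog as a plain prefix check between consecutive snapshots; the explicit branch and compaction exceptions are not modelled.

```typescript
// Hypothetical snapshot mirroring a subset of the VARIABLES above.
type LogEntry =
  | { kind: "message"; id: string }
  | { kind: "toolCall"; callId: string }
  | { kind: "toolResult"; callId: string };

interface Snapshot {
  log: readonly LogEntry[];
  activeRuns: readonly string[];
}

// OneActiveRun: a state predicate.
const oneActiveRun = (s: Snapshot): boolean => s.activeRuns.length <= 1;

// KnownToolResultsOnly: every result is preceded by its call in the log.
function knownToolResultsOnly(s: Snapshot): boolean {
  const requested = new Set<string>();
  for (const e of s.log) {
    if (e.kind === "toolCall") requested.add(e.callId);
    if (e.kind === "toolResult" && !requested.has(e.callId)) return false;
  }
  return true;
}

// AppendOnlyLog: an action property, so it compares two consecutive states.
function appendOnlyStep(prev: Snapshot, next: Snapshot): boolean {
  if (next.log.length < prev.log.length) return false;
  return prev.log.every(
    (e, i) => JSON.stringify(e) === JSON.stringify(next.log[i]),
  );
}

function checkTrace(trace: readonly Snapshot[]): boolean {
  return trace.every(
    (s, i) =>
      oneActiveRun(s) &&
      knownToolResultsOnly(s) &&
      (i === 0 || appendOnlyStep(trace[i - 1], s)),
  );
}

const goodTrace: Snapshot[] = [
  { log: [], activeRuns: [] },
  { log: [{ kind: "message", id: "m1" }], activeRuns: ["r1"] },
  {
    log: [{ kind: "message", id: "m1" }, { kind: "toolCall", callId: "c1" }],
    activeRuns: ["r1"],
  },
];
const traceIsSafe = checkTrace(goodTrace); // true
```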
6. A practical Pi-agent state model
For an actual implementation, I would define the durable state like this:
type DurableAgentState = {
  sessionId: string
  phase: AgentPhase
  head: MessageId
  log: AgentLogEntry[]
  branches: Record<MessageId, Branch>
  queue: QueuedInput[]
  activeRun?: {
    runId: string
    startedAt: string
    parentMessageId: MessageId
    status: "planning" | "tooling" | "streaming" | "aborting"
    pendingToolCalls: Record<string, PendingToolCall>
  }
  capabilities: {
    modelId: string
    thinkingLevel?: string
    enabledTools: string[]
    authProfileId?: string
    sandboxId?: string
  }
  context: {
    summaries: Summary[]
    loadedFiles: string[]
    tokenBudget: number
  }
}
And the runtime-only state like this:
type RuntimeAgentState = {
  subscribers: Set<(event: AgentSessionEvent) => void>
  abortController?: AbortController
  streamHandle?: AsyncIterable<ModelEvent>
  resourceLoader: ResourceLoader
  toolRegistry: ToolRegistry
}
The key distinction: durable state is replayable; runtime state is disposable.
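"Replayable" can be made literal: durable state is a fold over the persisted log. The sketch below uses heavily reduced, hypothetical versions of the types above (two entry kinds, three derived fields) purely to show the shape of the idea.

```typescript
// Hypothetical, reduced versions of the durable types above.
type LogEntry =
  | { kind: "message"; id: string }
  | { kind: "modelChanged"; modelId: string };

interface Durable {
  sessionId: string;
  head: string | null;
  modelId: string;
  log: LogEntry[];
}

// Pure step function: the same entries always yield the same state.
function applyEntry(state: Durable, entry: LogEntry): Durable {
  const log = [...state.log, entry];
  switch (entry.kind) {
    case "message":
      return { ...state, head: entry.id, log };
    case "modelChanged":
      return { ...state, modelId: entry.modelId, log };
  }
}

// Durable state is rebuilt by folding the log; nothing else is needed.
function replay(
  sessionId: string,
  initialModelId: string,
  entries: readonly LogEntry[],
): Durable {
  const empty: Durable = { sessionId, head: null, modelId: initialModelId, log: [] };
  return entries.reduce<Durable>(applyEntry, empty);
}

const entries: LogEntry[] = [
  { kind: "message", id: "m1" },
  { kind: "modelChanged", modelId: "model-b" },
  { kind: "message", id: "m2" },
];

const live = replay("s1", "model-a", entries);

// Simulate persistence: only the log crosses the process boundary.
const persisted = JSON.stringify(entries);
const restored = replay("s1", "model-a", JSON.parse(persisted) as LogEntry[]);
// live and restored are structurally identical: head "m2", modelId "model-b"
```

Subscribers, abort controllers, and stream handles never appear in `persisted`; after a restart they are simply created fresh around the restored state.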
7. The shortest useful answer
If you want the clean conceptual version:
States
Uninitialized, Initializing, Idle, ReceivingInput, Planning, CallingTool,
AwaitingToolResult, StreamingOutput, WaitingForSteer, Compacting,
Branching, Aborting, Failed, Completed
Messages
prompt, steer, followUp, slashCommand, abort, navigateTree,
modelRequest, modelDelta, toolCall, toolResult, compact,
branch, stateChanged, runCompleted, runFailed
Invariants
One active run per session lane.
All outputs descend from a causal input.
Tool results match prior tool calls.
Tools require explicit capability authorization.
The persistent log is append-only except explicit branch/compact operations.
Branches never mutate ancestors.
Context is a projection, not the source of truth.
Compaction preserves commitments.
UI observes events; it does not own state.
Every external side effect is represented in the log.
In one sentence: a Pi agent should be modeled as a message-driven object whose durable conversation tree is the source of truth, whose session loop is a Lamport-style state machine, and whose tools/world interfaces are Kay-style objects exchanging explicit messages.