What if your AI agent could watch itself work, then get better at it? That is the idea behind osop synthesize. You give it a stack of .osoplog execution records — the detailed trace of what an AI agent actually did across multiple sessions — and it produces an optimized .osop workflow that captures the best patterns while eliminating the waste.
We ran it on 5 real Claude Code session logs: 85 total nodes, 1,666 minutes of recorded execution time. The synthesizer distilled them into a single 13-node workflow that captures the patterns the agent relied on across those sessions.
The Command
The CLI is straightforward. Point it at your logs, optionally provide a base workflow to improve, and set a goal:
$ osop synthesize \
session-01.osoplog.yaml \
session-02.osoplog.yaml \
session-03.osoplog.yaml \
session-04.osoplog.yaml \
session-05.osoplog.yaml \
--base original-workflow.osop \
--goal "optimize for speed" \
-o optimized-workflow.osop

The synthesizer reads every node record, every edge, every duration, every failure and retry. It builds a statistical model of what the agent actually does versus what the workflow says it should do, then writes a new workflow that matches reality.
What We Fed It
We collected 5 session logs from real development work — feature implementations, bug fixes, and refactoring tasks. The raw numbers:
- 85 nodes across all sessions — the full execution graph of every tool call, every decision, every retry.
- 1,666 minutes of total recorded execution time spanning multiple days of development.
- 23 unique node types from file reads and edits to test runs, git operations, and human review gates.
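To make those numbers concrete, here is a minimal sketch of the kind of roll-up the synthesizer would do over parsed log records. The record schema used here (sessions holding `nodes` with `type` and `duration_min` fields) is an assumption for illustration, not the real .osoplog format:

```python
# A sketch of the aggregation step, under an assumed record schema.
from collections import Counter

def summarize(sessions):
    """Roll up node count, total minutes, and node-type frequencies."""
    total_nodes = sum(len(s["nodes"]) for s in sessions)
    total_minutes = sum(n["duration_min"] for s in sessions for n in s["nodes"])
    node_types = Counter(n["type"] for s in sessions for n in s["nodes"])
    return total_nodes, total_minutes, node_types

# Two tiny fabricated sessions standing in for real .osoplog files.
sessions = [
    {"nodes": [{"type": "read_file", "duration_min": 2},
               {"type": "run_tests", "duration_min": 11}]},
    {"nodes": [{"type": "read_file", "duration_min": 3},
               {"type": "edit_file", "duration_min": 7}]},
]

nodes, minutes, types = summarize(sessions)
print(nodes, minutes, sorted(types))  # 4 23 ['edit_file', 'read_file', 'run_tests']
```

The same three numbers reported above (85 nodes, 1,666 minutes, 23 types) fall out of exactly this kind of pass over the real logs.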
What Came Out
The synthesizer produced a 13-node workflow. That is not arbitrary truncation but distillation: it identified the recurring patterns across all 5 sessions and encoded them as a reusable workflow with proper edge types (sequential, parallel, fallback).
1. Read requirements, check existing patterns, draft approach.
2. Search for related files and dependencies in parallel.
3. Find test patterns and fixtures to reuse.
4. Write code following discovered patterns.
5. Run tsc --noEmit immediately after edits.
6. Add tests matching existing conventions.
7. Execute full test suite.
8. Check for security issues, breaking changes, edge cases.
9. Auto-fix lint and formatting issues.
10. Commit with descriptive message.
11. Ensure all checks pass before requesting review.
12. Open PR with summary and test plan.
13. Developer reviews the PR.
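Sketched as a plain data structure, those 13 nodes and their typed edges might look like this. The node names and the edge-list encoding are my shorthand, not the actual .osop file format:

```python
# Hypothetical in-memory shape of the synthesized workflow.
# Edge type is the third element: sequential, parallel, or fallback.
workflow = {
    "nodes": ["plan", "explore_code", "explore_tests", "implement",
              "typecheck", "write_tests", "run_tests", "risk_review",
              "autofix", "commit", "verify_checks", "open_pr", "human_review"],
    "edges": [
        ("plan", "explore_code", "parallel"),       # exploration fans out
        ("plan", "explore_tests", "parallel"),
        ("explore_code", "implement", "sequential"),
        ("implement", "typecheck", "sequential"),
        ("typecheck", "implement", "fallback"),     # retry loop on type errors
        ("typecheck", "write_tests", "sequential"),
        ("write_tests", "run_tests", "sequential"),
        ("run_tests", "risk_review", "sequential"),
        ("risk_review", "autofix", "sequential"),
        ("autofix", "commit", "sequential"),
        ("commit", "verify_checks", "sequential"),
        ("verify_checks", "open_pr", "sequential"),
        ("open_pr", "human_review", "sequential"),
    ],
}
assert len(workflow["nodes"]) == 13
```

Note the fallback edge from typecheck back to implement: that single edge is how a "fail, fix, retry" loop observed in the logs gets encoded declaratively.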
Patterns the AI Discovered
The most interesting part is what the synthesizer learned from watching itself. Four patterns emerged consistently:
- Plan first. Every successful session started with a planning step that read requirements and checked existing patterns before writing any code. Sessions that skipped planning had 3x more fallback loops.
- Explore in parallel. The best sessions ran code exploration and test exploration simultaneously. The synthesized workflow encodes this as parallel edges from the planning node.
- Verify immediately. Type-checking right after implementation — not at the end — caught errors when context was fresh. The synthesized workflow adds a fallback edge from typecheck back to implement.
- Risk before ship. A dedicated risk review step before committing caught security issues and breaking changes that tests alone missed. The synthesizer placed it after tests pass but before any git operations.
The Self-Optimizing Loop
This is where it gets powerful. The synthesized .osop workflow is not a one-time artifact. It becomes the base workflow for future sessions. Those sessions produce new .osoplog records. Feed them back into synthesize, and the workflow improves again.
Run, log, synthesize, repeat. Each cycle tightens the workflow based on real execution data. The agent literally gets better at its job by watching itself do it.
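The feedback cycle above can be rendered as a toy simulation. Everything here (the simulated sessions, the 60% frequency threshold) is made up to show the shape of the loop, not osop's actual behavior:

```python
# Toy run -> log -> synthesize loop: rare one-off detours get pruned
# each cycle, tightening the workflow. Threshold is an assumption.
def synthesize(logs, threshold=0.6):
    """Keep only steps appearing in at least `threshold` of sessions."""
    steps = {s for log in logs for s in log}
    freq = lambda s: sum(s in log for log in logs) / len(logs)
    return sorted(s for s in steps if freq(s) >= threshold)

def run(workflow, noise):
    """Simulate a session: follow the workflow, plus one-off detours."""
    return list(workflow) + noise

workflow = ["plan", "implement", "typecheck", "test", "review"]
for cycle in range(3):
    # One of three simulated sessions wanders into a debug detour.
    logs = [run(workflow, noise) for noise in (["debug_detour"], [], [])]
    workflow = synthesize(logs)  # the detour never survives synthesis

print(workflow)  # ['implement', 'plan', 'review', 'test', 'typecheck']
```

In the real loop the "run" step is an actual agent session producing an .osoplog file, and synthesis also re-derives edge types and durations, but the convergence dynamic is the same.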
How It Works Under the Hood
The synthesizer performs three passes over the input logs:
- Frequency analysis — Which nodes appear in every session? Which are one-off anomalies? High-frequency nodes become required steps; rare ones are pruned.
- Dependency graph extraction — What ordering constraints are real? If step B always follows step A across all sessions, that is a sequential edge. If B sometimes runs alongside C, that is a parallel edge.
- Duration and failure analysis — Which steps are bottlenecks? Which have fallback patterns (fail then retry)? The synthesizer encodes retry loops as fallback edges and flags slow steps for potential parallelization.
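The three passes can be sketched in a few dozen lines. The log structure assumed here (each session as an ordered list of `(node_type, minutes, failed)` tuples) and the specific heuristics are illustrative guesses, not osop's actual implementation:

```python
# Condensed sketch of the three synthesis passes over assumed log tuples.
from collections import defaultdict
from itertools import permutations

sessions = [
    [("plan", 5, False), ("implement", 30, True),
     ("implement", 20, False), ("test", 10, False)],
    [("plan", 4, False), ("implement", 25, False), ("test", 12, False)],
    [("plan", 6, False), ("implement", 40, False),
     ("test", 9, False), ("profile", 8, False)],
]

# Pass 1: frequency analysis. Nodes present in every session become
# required steps; one-off nodes are pruned.
names = [{n for n, _, _ in s} for s in sessions]
required = set.intersection(*names)
pruned = set.union(*names) - required

# Pass 2: dependency extraction. A -> B is a sequential edge only if
# A's first occurrence precedes B's in every session.
def precedes(a, b, session):
    order = [n for n, _, _ in session]
    return order.index(a) < order.index(b)

sequential = {(a, b) for a, b in permutations(required, 2)
              if all(precedes(a, b, s) for s in sessions)}

# Pass 3: duration and failure analysis. Failed nodes that reappear
# suggest a retry loop (fallback edge); the slowest average node is
# flagged as a bottleneck.
durations = defaultdict(list)
fallback = set()
for s in sessions:
    for name, minutes, failed in s:
        durations[name].append(minutes)
        if failed:
            fallback.add(name)
bottleneck = max(durations, key=lambda n: sum(durations[n]) / len(durations[n]))

print(sorted(required), sorted(pruned), sorted(fallback), bottleneck)
```

On this fabricated data the passes keep plan, implement, and test as required steps, prune the one-off profile node, mark implement as both a fallback candidate and the bottleneck, and recover the plan-before-implement-before-test ordering as sequential edges.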
Try It Yourself
If you have been using OSOP logging (via Claude Code instructions, the MCP server, or the Python CLI), you already have the .osoplog files you need. Just point synthesize at them.
The feature ships in osop CLI v0.9.0. Install with pip install osop and run osop synthesize --help to get started.
Source code: The synthesizer implementation lives in the osop repo under osop/commands/synthesize.py. Contributions welcome.