Two days ago, OSOP could only validate workflows. You could write a .osop file, check it against the JSON Schema, render a diagram — but you could not run it. The executor was a placeholder that returned mock data. On April 1st we decided to change that. 48 hours later, osop run executes real workflows with conditional edges, fallback paths, security gates, cost controls, and agent nodes that call actual LLMs.
This is the story of that sprint — what we built, why each piece exists, and what we learned about executing AI agent workflows.
What We Built
The executor needed to handle everything the OSOP spec promises. That meant six major features in 48 hours:
- WorkflowContext — Inter-node data flow. Each node reads inputs from previous nodes and writes outputs for downstream nodes. A shared context object carries state through the entire execution.
- Conditional edges — Edges can have when: expressions that reference node outputs. The executor evaluates these at runtime to decide which path to take.
- Fallback edges — If a node fails, the executor follows fallback edges instead of crashing. This enables retry patterns and graceful degradation.
- Security gates (--allow-exec) — CLI nodes that run shell commands require explicit opt-in. Without --allow-exec, the executor refuses to run them. No accidental rm -rf in production.
- Cost controls (--max-cost) — Agent nodes that call LLMs accumulate cost. The executor tracks spending and aborts if the total exceeds the --max-cost limit.
- Agent nodes — Nodes with type: agent actually call LLMs via a pluggable client. The executor passes the node's prompt, collects the response, and records token usage in the execution log.
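To make the first of these concrete, here is a minimal sketch of the WorkflowContext idea — the class and method names are illustrative assumptions, not OSOP's actual API:

```python
from dataclasses import dataclass, field
from typing import Any

# Illustrative sketch only: not OSOP's real WorkflowContext, just the
# shape described in the prose (shared outputs plus cost accounting).
@dataclass
class WorkflowContext:
    outputs: dict[str, dict[str, Any]] = field(default_factory=dict)
    total_cost: float = 0.0

    def write(self, node_id: str, output: dict[str, Any]) -> None:
        """Record a node's output so downstream nodes can read it."""
        self.outputs[node_id] = output

    def read(self, ref: str) -> Any:
        """Resolve a dotted reference like 'security_scan.output.verdict'."""
        node_id, *path = ref.split(".")
        value: Any = {"output": self.outputs[node_id]}
        for key in path:
            value = value[key]
        return value

    def add_cost(self, amount: float, max_cost: float) -> None:
        """Accumulate LLM spend; abort the run if the limit is exceeded."""
        self.total_cost += amount
        if self.total_cost > max_cost:
            raise RuntimeError(
                f"cost ${self.total_cost:.2f} exceeds limit ${max_cost:.2f}"
            )
```

Every node writes into the context and every downstream node reads from it, so the context doubles as the single place to enforce the `--max-cost` budget.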
The Workflow: Conditional Edges and Security Gates
Consider a workflow that uses conditional edges and security gates: the security scan node produces a verdict, and the deploy node runs only if the verdict is not danger.
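A sketch of what such a workflow file might look like (the node and field names below are assumptions based on the prose; the actual OSOP schema may differ):

```yaml
# Illustrative only -- not guaranteed to match the real .osop schema.
nodes:
  - id: build
    type: cli
    command: npm run build
  - id: test
    type: cli
    command: npm test
  - id: security_scan
    type: agent
    model: gpt-4o
    prompt: "Review the changes for security issues and output a verdict."
  - id: deploy
    type: cli
    command: kubectl apply -f k8s/
edges:
  - from: build
    to: test
  - from: test
    to: security_scan
  - from: security_scan
    to: deploy
    when: "security_scan.output.verdict != 'danger'"
```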
Running It
The osop run command walks the workflow graph, executes each node, evaluates edge conditions, and produces a .osoplog.yaml file with full execution details:
$ osop run deploy-with-gates.osop.yaml \
--allow-exec \
--max-cost 2.00 \
--env KUBE_CONTEXT=staging
[executor] Walking graph: build -> test -> security_scan -> deploy
[build] npm run build OK 4.2s
[test] 36 passed, 0 failed OK 12.1s
[security] AI review (gpt-4o, $0.03) OK 3.8s
[security] Verdict: safe (score: 12/100)
[deploy] Approval gate: --allow-exec granted
[deploy] kubectl apply -f k8s/ OK 6.3s
Workflow COMPLETED in 26.4s
Cost: $0.03 (limit: $2.00)
Log written: deploy-with-gates.osoplog.yaml
The Condition Evaluator
The hardest design decision was the condition evaluator. Edge conditions like security_scan.output.verdict != 'danger' need to be evaluated at runtime. The obvious approach — eval() — is a security disaster. We built a simple expression parser instead.
The evaluator supports dot-notation property access, string and number literals, comparison operators (==, !=, >, <, >=, <=), and boolean operators (and, or, not). It resolves references against the WorkflowContext, which holds all node outputs. No eval(), no arbitrary code execution, no injection attacks. The entire evaluator is under 120 lines of Python.
The Architecture
Three new modules power the executor:
- execute.py — The graph walker and node dispatcher. Resolves execution order, evaluates edge conditions, runs each node, handles fallbacks.
- llm_client.py — Pluggable LLM client for agent nodes. Supports OpenAI-compatible APIs with model selection, token tracking, and cost calculation.
- osoplog.py — Execution log writer. Captures every node's inputs, outputs, duration, status, and AI metadata into a standards-compliant .osoplog.yaml.
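The walking-and-fallback logic can be sketched roughly like this (names like `Edge` and `run_workflow` are illustrative, and conditions are plain callables here to keep the sketch self-contained, not the real execute.py):

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Edge:
    src: str
    dst: str
    when: Optional[Callable[[dict], bool]] = None  # conditional edge
    fallback: bool = False                         # followed only on failure

def run_workflow(nodes: dict, edges: list, start: str) -> dict:
    """nodes maps id -> callable(outputs) returning an output dict
    (raising an exception marks the node as failed)."""
    outputs, current = {}, start
    while current:
        try:
            outputs[current] = nodes[current](outputs)
            failed = False
        except Exception as exc:
            outputs[current] = {"error": str(exc)}
            failed = True
        nxt = None
        for e in edges:
            # Normal edges after success, fallback edges after failure.
            if e.src != current or e.fallback != failed:
                continue
            if e.when is None or e.when(outputs):
                nxt = e.dst
                break
        if failed and nxt is None:
            raise RuntimeError(f"node {current} failed with no fallback")
        current = nxt
    return outputs
```

The key design point is that a failed node does not abort the walk outright: the walker first looks for a fallback edge, and only crashes when none exists.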
Testing
We wrote 36 executor-specific tests covering:
- Basic sequential execution and inter-node data flow
- Conditional edge evaluation with various operators and data types
- Fallback edge triggering on node failure
- Security gate enforcement and cost limit enforcement
What's Next
The executor is live in osop v0.3.0. There are now 9 CLI commands — validate, run, render, test, report, optimize, import, export, and risk-assess — with 196 tests passing across the entire CLI. Next up: parallel node execution, streaming output, and webhook triggers. The spec supports all of these; the executor just needs to catch up.
Try it: pip install osop && osop run your-workflow.osop.yaml --allow-exec