Two days ago, OSOP could only validate workflows. You could write a .osop file, check it against the JSON Schema, render a diagram — but you could not run it. The executor was a placeholder that returned mock data. On April 1st we decided to change that. 48 hours later, osop run executes real workflows with conditional edges, fallback paths, security gates, cost controls, and agent nodes that call actual LLMs.
This is the story of that sprint — what we built, why each piece exists, and what we learned about executing AI agent workflows.
What We Built
The executor needed to handle everything the OSOP spec promises. That meant six major features in 48 hours:
- WorkflowContext — Inter-node data flow. Each node reads inputs from previous nodes and writes outputs for downstream nodes. A shared context object carries state through the entire execution.
- Conditional edges — Edges can have when: expressions that reference node outputs. The executor evaluates these at runtime to decide which path to take.
- Fallback edges — If a node fails, the executor follows fallback edges instead of crashing. This enables retry patterns and graceful degradation.
- Security gates (--allow-exec) — CLI nodes that run shell commands require explicit opt-in. Without --allow-exec, the executor refuses to run them. No accidental rm -rf in production.
- Cost controls (--max-cost) — Agent nodes that call LLMs accumulate cost. The executor tracks spending and aborts if the total exceeds the --max-cost limit.
- Agent nodes — Nodes with type: agent actually call LLMs via a pluggable client. The executor passes the node's prompt, collects the response, and records token usage in the execution log.
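To make the first of these concrete, here is a minimal sketch of the WorkflowContext idea — the class and method names are illustrative assumptions, not OSOP's actual API:

```python
from dataclasses import dataclass, field
from typing import Any

# Illustrative sketch only: not OSOP's real WorkflowContext, just the
# shape described in the prose (shared outputs plus cost accounting).
@dataclass
class WorkflowContext:
    outputs: dict[str, dict[str, Any]] = field(default_factory=dict)
    total_cost: float = 0.0

    def write(self, node_id: str, output: dict[str, Any]) -> None:
        """Record a node's output so downstream nodes can read it."""
        self.outputs[node_id] = output

    def read(self, ref: str) -> Any:
        """Resolve a dotted reference like 'security_scan.output.verdict'."""
        node_id, *path = ref.split(".")
        value: Any = {"output": self.outputs[node_id]}
        for key in path:
            value = value[key]
        return value

    def add_cost(self, amount: float, max_cost: float) -> None:
        """Accumulate LLM spend; abort the run if the limit is exceeded."""
        self.total_cost += amount
        if self.total_cost > max_cost:
            raise RuntimeError(
                f"cost ${self.total_cost:.2f} exceeds limit ${max_cost:.2f}"
            )
```

Every node writes into the context and every downstream node reads from it, so the context doubles as the single place to enforce the `--max-cost` budget.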
The Workflow: Conditional Edges and Security Gates
Consider a workflow that uses conditional edges and security gates: the security scan node produces a verdict, and the deploy node runs only if the verdict is not danger.
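A sketch of what such a workflow file might look like (the node and field names below are assumptions based on the prose; the actual OSOP schema may differ):

```yaml
# Illustrative only -- not guaranteed to match the real .osop schema.
nodes:
  - id: build
    type: cli
    command: npm run build
  - id: test
    type: cli
    command: npm test
  - id: security_scan
    type: agent
    model: gpt-4o
    prompt: "Review the changes for security issues and output a verdict."
  - id: deploy
    type: cli
    command: kubectl apply -f k8s/
edges:
  - from: build
    to: test
  - from: test
    to: security_scan
  - from: security_scan
    to: deploy
    when: "security_scan.output.verdict != 'danger'"
```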
Running It
The osop run command walks the workflow graph, executes each node, evaluates edge conditions, and produces a .osoplog.yaml file with full execution details:
$ osop run deploy-with-gates.osop.yaml \
--allow-exec \
--max-cost 2.00 \
--env KUBE_CONTEXT=staging
[executor] Walking graph: build -> test -> security_scan -> deploy
[build] npm run build OK 4.2s
[test] 36 passed, 0 failed OK 12.1s
[security] AI review (gpt-4o, $0.03) OK 3.8s
[security] Verdict: safe (score: 12/100)
[deploy] Approval gate: --allow-exec granted
[deploy] kubectl apply -f k8s/ OK 6.3s
Workflow COMPLETED in 26.4s
Cost: $0.03 (limit: $2.00)
Log written: deploy-with-gates.osoplog.yaml
The Condition Evaluator
The hardest design decision was the condition evaluator. Edge conditions like security_scan.output.verdict != 'danger' need to be evaluated at runtime. The obvious approach — eval() — is a security disaster. We built a simple expression parser instead.
The evaluator supports dot-notation property access, string and number literals, comparison operators (==, !=, >, <, >=, <=), and boolean operators (and, or, not). It resolves references against the WorkflowContext, which holds all node outputs. No eval(), no arbitrary code execution, no injection attacks. The entire evaluator is under 120 lines of Python.
The Architecture
Three new modules power the executor:
- execute.py — The graph walker and node dispatcher. Resolves execution order, evaluates edge conditions, runs each node, handles fallbacks.
- llm_client.py — Pluggable LLM client for agent nodes. Supports OpenAI-compatible APIs with model selection, token tracking, and cost calculation.
- osoplog.py — Execution log writer. Captures every node's inputs, outputs, duration, status, and AI metadata into a standards-compliant .osoplog.yaml.
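The walking-and-fallback logic can be sketched roughly like this (names like `Edge` and `run_workflow` are illustrative, and conditions are plain callables here to keep the sketch self-contained, not the real execute.py):

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Edge:
    src: str
    dst: str
    when: Optional[Callable[[dict], bool]] = None  # conditional edge
    fallback: bool = False                         # followed only on failure

def run_workflow(nodes: dict, edges: list, start: str) -> dict:
    """nodes maps id -> callable(outputs) returning an output dict
    (raising an exception marks the node as failed)."""
    outputs, current = {}, start
    while current:
        try:
            outputs[current] = nodes[current](outputs)
            failed = False
        except Exception as exc:
            outputs[current] = {"error": str(exc)}
            failed = True
        nxt = None
        for e in edges:
            # Normal edges after success, fallback edges after failure.
            if e.src != current or e.fallback != failed:
                continue
            if e.when is None or e.when(outputs):
                nxt = e.dst
                break
        if failed and nxt is None:
            raise RuntimeError(f"node {current} failed with no fallback")
        current = nxt
    return outputs
```

The key design point is that a failed node does not abort the walk outright: the walker first looks for a fallback edge, and only crashes when none exists.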
Testing
We wrote 36 executor-specific tests covering:
- Basic sequential execution and inter-node data flow
- Conditional edge evaluation with various operators and data types
- Fallback edge triggering on node failure
- Security gate enforcement and cost limit enforcement
What's Next
The executor is live in osop v0.3.0. There are now 9 CLI commands — validate, run, render, test, report, optimize, import, export, and risk-assess — with 196 tests passing across the entire CLI. Next up: parallel node execution, streaming output, and webhook triggers. The spec supports all of these; the executor just needs to catch up.
Try it: pip install osop && osop run your-workflow.osop.yaml --allow-exec