Technical article

ILX + Cortex · ~15 min read · Architecture

How ILX and Cortex solve the multi-agent coordination problem

Two layers. The first: a structured protocol and shared lexicon that make precise communication between agents possible. The second: routing intelligence built on top that makes the system fast. This article covers both — and why the second layer only became obvious once we built the first.

The problem

The moment you move from a single AI agent to a team of them, you inherit a problem that human organizations have struggled with for centuries: coordination. A lone agent can hold its entire context in one pass — its goals, its vocabulary, its assumptions. But when a dispatcher hands a subtask to a research worker, and that worker's output feeds into a writing specialist, and the writer's draft gets reviewed by a code generator, every handoff becomes a potential point of failure.

Not because the agents lack capability. Because they lack a shared language.

Agent A uses "summary" to mean a three-sentence abstract. Agent B interprets "summary" as a full-page executive briefing. Neither is wrong. But when Agent A asks Agent B for a summary and gets back something four times longer than expected, the downstream pipeline breaks — or worse, silently produces low-quality output. Multiply this by dozens of symbols across a complex workflow, and you get a system that appears to work but bleeds coherence at every seam.

This is the problem that ILX (Inter-Language Exchange) and the Cortex shared lexicon were designed to solve. Building a working system against these foundations then revealed a second, less obvious problem: agents that think too hard about coordination do not coordinate — they stall.

The protocol

ILX — Inter-Language Exchange

ILX is a communication protocol for multi-agent systems. Every message that passes between agents follows a structured envelope format, removing ambiguity about who is speaking, who is listening, and what kind of communicative act is being performed.

// ILX message envelope
message msg_042 {
  from    dispatcher
  to      worker_b
  purpose route_message
  content {
    act         = ask
    task        = "Write a technical overview of the authentication module"
    constraints = { length = 1500, audience = "senior engineers" }
  }
}

The id (msg_042 above) provides traceability: every message can be referenced, logged, and audited. The from/to fields establish clear sender-receiver relationships. The purpose field declares what the message is for (here, routing a task), and content.act declares the speech act. The content carries the payload with constraints fully specified, not implied.
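As a rough illustration, the envelope could be held in memory as a small Python dataclass. This is a sketch only: the class name `IlxMessage`, the `sender`/`recipient` attribute names, and the `validate` method are hypothetical and are not part of ILX as described here.

```python
from dataclasses import dataclass, field
from typing import Any

# The six speech acts named in this article.
SPEECH_ACTS = {"ask", "propose", "report", "criticize", "revise", "commit"}


@dataclass(frozen=True)
class IlxMessage:
    """Hypothetical in-memory form of an ILX envelope."""
    id: str
    sender: str      # the envelope's `from` field (`from` is a Python keyword)
    recipient: str   # the envelope's `to` field
    purpose: str
    content: dict[str, Any] = field(default_factory=dict)

    def validate(self) -> None:
        """Reject envelopes with missing addressing or an unknown act."""
        if not (self.id and self.sender and self.recipient and self.purpose):
            raise ValueError("id, from, to and purpose are all required")
        act = self.content.get("act")
        if act not in SPEECH_ACTS:
            raise ValueError(f"unknown speech act: {act!r}")


# The example message from above, expressed with the sketch class.
msg = IlxMessage(
    id="msg_042",
    sender="dispatcher",
    recipient="worker_b",
    purpose="route_message",
    content={
        "act": "ask",
        "task": "Write a technical overview of the authentication module",
        "constraints": {"length": 1500, "audience": "senior engineers"},
    },
)
msg.validate()
```

Validating at the boundary means a malformed envelope fails loudly at the handoff rather than silently degrading output downstream.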

The speech act taxonomy is deliberately small. Six acts cover the vast majority of multi-agent interaction:

ask: A request for information or action. Expects a report or propose in return.

propose: Offering a candidate output or plan for consideration. Invites criticize or commit.

report: Delivering a completed result or status update.

criticize: Identifying problems or gaps in a proposal. Demands revise.

revise: Submitting an updated version that addresses prior criticism.

commit: Formally accepting and locking in a result or definition. Closes the loop.

Each speech act carries implicit expectations about what happens next. These conversational contracts make multi-agent workflows predictable without making them rigid.
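One way to make these conversational contracts checkable is a lookup table of the replies each act invites. The sketch below takes the expectations for ask, propose, criticize and commit from the descriptions above. The entries for revise and report are assumptions, and the names `Act`, `EXPECTED_REPLIES` and `is_expected_reply` are hypothetical.

```python
from enum import Enum


class Act(Enum):
    ASK = "ask"
    PROPOSE = "propose"
    REPORT = "report"
    CRITICIZE = "criticize"
    REVISE = "revise"
    COMMIT = "commit"


# Replies each act invites, taken from the descriptions above where stated.
EXPECTED_REPLIES: dict[Act, frozenset[Act]] = {
    Act.ASK: frozenset({Act.REPORT, Act.PROPOSE}),
    Act.PROPOSE: frozenset({Act.CRITICIZE, Act.COMMIT}),
    Act.CRITICIZE: frozenset({Act.REVISE}),
    # Assumption: a revision is treated like a fresh proposal.
    Act.REVISE: frozenset({Act.CRITICIZE, Act.COMMIT}),
    # Assumption: a report ends an exchange; commit closes the loop.
    Act.REPORT: frozenset(),
    Act.COMMIT: frozenset(),
}


def is_expected_reply(previous: Act, reply: Act) -> bool:
    """True if `reply` is one of the acts that `previous` invites."""
    return reply in EXPECTED_REPLIES[previous]
```

A dispatcher could log an unexpected reply instead of rejecting it, which keeps the workflow predictable without making it rigid.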

The lexicon

Cortex — the shared vocabulary

If ILX is the grammar of multi-agent communication, the Cortex is the dictionary. It serves as a central vocabulary store — a shared lexicon where every symbol that agents use to coordinate has a formal entry containing its definition, canonical examples, a version number, and a status indicating whether it is draft, active, or deprecated.

At boot, a snapshot of all active symbols is pulled and injected directly into the dispatcher's system prompt. Workers receive their task context through routed messages. No agent calls snapshot_lexicon() during live work — that query happens once, at startup, and the result is frozen into the session. The Cortex is consulted in the hot path only when a new symbol needs to be proposed or an existing one is disputed.

This distinction matters. A shared lexicon is necessary infrastructure. It is not a coordination mechanism that should be exercised on every task.
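The boot-time snapshot can be pictured as a read-only mapping built once and then rendered into the prompt. This is a minimal sketch under stated assumptions: `LexiconEntry`, `freeze_active` and `render_for_prompt` are hypothetical names, the entry fields mirror the ones listed above, and the two sample definitions are illustrative only.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class LexiconEntry:
    symbol: str
    definition: str
    examples: tuple[str, ...]
    version: int
    status: str  # "draft", "active", or "deprecated"


def freeze_active(entries: list[LexiconEntry]) -> Mapping[str, LexiconEntry]:
    """Return a read-only view of the active symbols, keyed by name."""
    active = {e.symbol: e for e in entries if e.status == "active"}
    return MappingProxyType(active)


def render_for_prompt(snapshot: Mapping[str, LexiconEntry]) -> str:
    """Format the snapshot as one line per symbol for a system prompt."""
    return "\n".join(
        f"{e.symbol} (v{e.version}): {e.definition}"
        for e in sorted(snapshot.values(), key=lambda e: e.symbol)
    )


entries = [
    LexiconEntry("summary", "A three-sentence abstract.", (), 2, "active"),
    LexiconEntry("briefing", "A full-page executive overview.", (), 1, "draft"),
]
snapshot = freeze_active(entries)          # taken once, at boot
prompt_block = render_for_prompt(snapshot)  # injected into the system prompt
```

Because the mapping is read-only, nothing in the session can quietly change a definition mid-task. Changes go back through the lexicon itself.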

Symbols earn their place through the same deliberative cycle as everything else:

propose: New entry submitted. Status: draft.

criticize: Definition challenged. Edge cases surfaced.

revise: Definition tightened. Examples added.

commit: Symbol active. Part of the shared foundation.

The vocabulary of the workspace is not a static configuration file. It is a living artifact — growing when agents encounter new domains, tightening when ambiguity is detected, deprecating terms that no longer carry their weight.
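The symbol lifecycle can be sketched as a small transition table. The propose-to-draft and commit-to-active steps come from the cycle described above. Keeping a symbol in draft through criticize and revise, and bumping the version on each revise, are assumptions. Deprecation is left out because its trigger is not specified here. All names are hypothetical.

```python
# Status reached after each act. "none" means the symbol does not exist yet.
TRANSITIONS = {
    ("none", "propose"): "draft",
    ("draft", "criticize"): "draft",   # assumption: stays in draft
    ("draft", "revise"): "draft",      # assumption: stays in draft
    ("draft", "commit"): "active",
}


def apply_act(status: str, version: int, act: str) -> tuple[str, int]:
    """Advance a symbol's (status, version) by one act, or raise."""
    try:
        new_status = TRANSITIONS[(status, act)]
    except KeyError:
        raise ValueError(f"{act!r} is not allowed from status {status!r}") from None
    if act == "propose":
        new_version = 1
    elif act == "revise":
        new_version = version + 1  # assumption: each revision bumps the version
    else:
        new_version = version
    return new_status, new_version


# Walk one symbol through the full cycle.
state = ("none", 0)
for act in ["propose", "criticize", "revise", "commit"]:
    state = apply_act(*state, act)
```

After the loop, `state` holds an active symbol at version 2: one version from the proposal and one from the revision.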

The pivot

The bottleneck

The original design gave agents full access to the lexicon lifecycle during task execution. Workers looked up symbols before acting, proposed new entries when they encountered gaps, reflected on their own output against lexicon definitions. The dispatcher decomposed tasks by reasoning about what each symbol implied, then coordinated synthesis using the same vocabulary.

This was coherent in principle. In practice, a request to generate a banner image took eighteen minutes.

The breakdown was not a single failure. It was several small costs stacked on top of one another:

Each lexicon lookup added a round trip.

Each reflection step added an LLM call.

The dispatcher analysed what "banner" implied.

The designer worker consulted what "visual identity" meant.

The file agent considered output format.

None of this work was wrong. All of it was unnecessary when the answer was already known from previous sessions.

The agents were not too slow. They were thinking about the wrong things.

Core mechanism Proprietary

The ant colony insight

An ant finding food does not deliberate. It follows pheromone gradients — chemical signals left by other ants on paths that led to food. The stronger the trail, the more likely the next ant is to follow it. The trail fades if it stops being reinforced. Intelligence is not in the ant. It is in the environment.

This is the architecture that replaced deliberation-first coordination in this workspace.

The Cortex stores a task_outcomes table. Every time the dispatcher completes a task and the user rates the result — one to five stars — that outcome is logged: which worker handled it, what the task description was, what rating it received, and when. This is the pheromone deposit.
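A minimal sketch of such a table, assuming SQLite purely for illustration (the article does not say what store the Cortex uses). The column names and the `log_outcome` helper are hypothetical. Only the recorded facts (worker, task description, rating, timestamp) come from the description above.

```python
import sqlite3
from datetime import datetime, timezone

# Assumption: an in-memory SQLite database stands in for the Cortex store.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task_outcomes (
        id         INTEGER PRIMARY KEY,
        worker     TEXT    NOT NULL,
        task_text  TEXT    NOT NULL,
        rating     INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
        created_at TEXT    NOT NULL
    )
    """
)


def log_outcome(worker: str, task_text: str, rating: int) -> None:
    """Record one rated task: the pheromone deposit."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO task_outcomes (worker, task_text, rating, created_at) "
            "VALUES (?, ?, ?, ?)",
            (worker, task_text, rating, datetime.now(timezone.utc).isoformat()),
        )


log_outcome("worker_b", "Write a technical overview of the authentication module", 5)
```

The CHECK constraint keeps out-of-range ratings from ever entering the history that routing later depends on.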

When a new task arrives, the dispatcher calls get_routing_hint() before doing anything else. The function scans recent high-rated outcomes, computes keyword overlap, applies a time decay factor, and weights by rating:

# pheromone routing score
score = keyword_overlap * (0.9 ** days_elapsed) * (rating - 3) / 2.0
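To make the arithmetic concrete, here is the score as a runnable Python function with three worked values. The formula itself is the one above. How `keyword_overlap` is computed is not stated, so the Jaccard overlap of word sets used here is an assumption, and both function names are hypothetical.

```python
def keyword_overlap(task: str, past_task: str) -> float:
    """Jaccard overlap of lowercase word sets (an assumed definition)."""
    a, b = set(task.lower().split()), set(past_task.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def pheromone_score(overlap: float, days_elapsed: float, rating: int) -> float:
    """Overlap, times daily decay, times a rating weight centred on three stars."""
    return overlap * (0.9 ** days_elapsed) * (rating - 3) / 2.0


# A five-star outcome on an identical task, logged today, scores 1.0.
fresh = pheromone_score(1.0, 0, 5)
# The same outcome a week later has decayed to roughly 0.48.
week_old = pheromone_score(1.0, 7, 5)
# A three-star outcome contributes nothing, whatever the overlap.
neutral = pheromone_score(1.0, 0, 3)
```

The `(rating - 3) / 2.0` term maps four and five stars to 0.5 and 1.0, so only outcomes rated above the midpoint add to a trail.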

Fast path — confidence > 0.6

Route directly. No analysis.

One worker call. Wait for reply. Return result. The banner that took eighteen minutes now takes the time of one worker call.

Deliberate path — no clear winner

Decompose, route, synthesise.

The original design, now reserved only for novel tasks or where past performance was genuinely mixed. The full protocol when the problem actually requires it.
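The choice between the two paths reduces to a threshold check. The sketch below assumes the dispatcher already holds one confidence value per worker. How individual outcome scores are combined into that value is not specified here, and `choose_path` is a hypothetical name.

```python
from typing import Optional

FAST_PATH_THRESHOLD = 0.6  # from the fast-path rule above


def choose_path(scores: dict[str, float]) -> tuple[str, Optional[str]]:
    """Return ("fast", worker) or ("deliberate", None).

    `scores` maps worker name to routing confidence for the incoming task.
    """
    if not scores:
        return "deliberate", None  # no history at all: a novel task
    worker, confidence = max(scores.items(), key=lambda kv: kv[1])
    if confidence > FAST_PATH_THRESHOLD:
        return "fast", worker
    return "deliberate", None
```

For example, `choose_path({"worker_b": 0.82, "worker_a": 0.41})` selects the fast path to `worker_b`, while an empty history or a top score of exactly 0.6 falls back to deliberation.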

The pheromone trail is self-reinforcing. A worker that handles similar tasks well accumulates high-rated outcomes. Its confidence score on those task types rises. It gets routed faster. If it starts underperforming, its new low ratings stop reinforcing the trail and time decay erodes the older high scores until deliberation kicks back in and re-evaluates the routing.

Task resolution paths by rating strength:

dispatcher → researcher → writer: 0.82 (active, reinforcing)

dispatcher → researcher → coder: 0.41 (in use, decaying slowly)

dispatcher → writer → reviewer: 0.19 (low, review pending)

dispatcher → multi-step specialist: 0.08 (emerging, unproven)

Decay matters as much as rating. Without it, the system calcifies around early solutions and never adapts. With it, paths that have stopped producing good outcomes fade gradually — making room for better routes that have emerged from more recent experience. The system corrects itself without a single line of routing logic being touched.
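The pace of that fading follows directly from the 0.9-per-day factor in the routing score. A short calculation, with variable names chosen for illustration:

```python
import math

DECAY_PER_DAY = 0.9  # base of the decay term in the routing score

# Days until an unreinforced trail loses half its strength.
half_life_days = math.log(0.5) / math.log(DECAY_PER_DAY)

# Fraction of strength remaining after 1, 7, and 30 days without reinforcement.
remaining = {days: DECAY_PER_DAY ** days for days in (1, 7, 30)}
```

The half-life works out to about 6.6 days, and after 30 days roughly 4% of the original strength remains. A route that was perfect a month ago and has not been used since is, for practical purposes, forgotten.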

Design principle

Lean workers

The second change was stripping the workers.

In the original design, workers were participants in the coordination protocol. They looked up lexicon symbols, proposed new entries, reflected on whether their output met the stated criteria. This made them expensive — each worker turn involved multiple LLM calls that had nothing to do with the task itself.

The insight is that a worker's job is execution, not coordination. Coordination is the dispatcher's responsibility, and the pheromone layer handles most of it automatically. A worker that receives a routed task needs to do exactly three things: read the task, use its tools, report the result.

Every worker now runs with _lean_prompt = True. This strips the Language Evolution Protocol and the Self Protocol from the system prompt entirely. No lexicon lookups. No symbol proposals. No output reflection. The worker prompt is:

// lean worker instruction
Read content.task from the incoming route_message.
Execute using your available MCP tools.
Report back via send_ilx exactly once.

The reduction in token count and round trips is substantial. But the more important effect is conceptual: workers are tools, not participants in a debate about terminology. They are good at specific things — browser automation, writing, code — and the system routes to them on that basis. Keeping them out of the coordination layer means coordination does not get tangled with execution.
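In code, the three-step worker contract might look like the sketch below. It is illustrative only: `run_lean_worker` is a hypothetical name, `execute` stands in for the worker's tool use, the `send_ilx` call signature is assumed, and the reply's `purpose` value and `in_reply_to` field are placeholders rather than documented ILX fields.

```python
from typing import Any, Callable


def run_lean_worker(
    message: dict[str, Any],
    execute: Callable[[str], str],
    send_ilx: Callable[[dict[str, Any]], None],
) -> None:
    """Read the task, execute it, and report exactly once."""
    task = message["content"]["task"]
    try:
        result = {"status": "ok", "output": execute(task)}
    except Exception as exc:  # a failure is still reported, and only once
        result = {"status": "error", "output": str(exc)}
    send_ilx({
        "from": message["to"],
        "to": message["from"],
        "purpose": "worker_reply",  # placeholder value
        "content": {"act": "report", "in_reply_to": message["id"], **result},
    })
```

The single `send_ilx` call sits outside the try block on purpose: success or failure, the dispatcher receives exactly one report and never waits on a worker that died silently.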

Context layer

Personal context as infrastructure

A related class of overhead came from agents looking up personal context during tasks. Before drafting content, a writer worker would scan the Obsidian vault to sample writing style. Before any coding task, the coder would traverse project directories for conventions. These were correct things to know — but wrong times to learn them.

Personal context is now infrastructure, not a live query.

Style fingerprint

Built once at first boot

A compact structured JSON object built from writing samples via a single Claude call: tone descriptor, sentence patterns, vocabulary preferences, style rules, anti-patterns, calibration sample. Read instantly from disk on every subsequent boot. No vault scan at task time.
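The build-once, read-thereafter pattern can be sketched as a small cache helper. The function name, file name and JSON key names below are assumptions, and `fake_build` stands in for the single model call that produces the real fingerprint.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_build_fingerprint(path: Path, build: Callable[[], dict]) -> dict:
    """Return the cached fingerprint, building and saving it on first use."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    fingerprint = build()  # the expensive step: runs only when no cache exists
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(fingerprint, indent=2), encoding="utf-8")
    return fingerprint


build_calls = []


def fake_build() -> dict:
    """Stand-in for the model call; key names are illustrative."""
    build_calls.append(1)
    return {
        "tone": "direct",
        "sentence_patterns": [],
        "vocabulary_preferences": [],
        "style_rules": [],
        "anti_patterns": [],
        "calibration_sample": "",
    }


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "style_fingerprint.json"
    first = load_or_build_fingerprint(target, fake_build)   # builds and saves
    second = load_or_build_fingerprint(target, fake_build)  # reads from disk
```

The second call never touches the builder, which is the whole point: the cost is paid at first boot, not at task time.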

Knowledge index

LightRAG graph + vector store

The Obsidian vault is indexed into a LightRAG graph-plus-vector store using local embeddings. The index is built once, stored per workspace, and queried via semantic search. A question like "what have I written about distributed systems?" returns synthesised excerpts rather than triggering a full directory traversal.

Boot injection

Available from message one

Identity context — bio, tone, active projects — read from config at startup and injected directly into the dispatcher's system prompt before the session begins. The dispatcher never calls get_identity(). Agent registry, lexicon snapshot, and pheromone routing function are all available from the first user message.

Workspace isolation

Per-client boundaries

All workspace-scoped artifacts (fingerprint, knowledge index, metadata) are stored under workspaces/<name>/ and passed to every MCP server subprocess via WORKSPACE_DATA_DIR at boot. Switching workspaces means switching which fingerprint and knowledge index the agents draw from. Client work stays isolated from personal work.
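A sketch of how that environment might be assembled for each subprocess. `WORKSPACE_DATA_DIR` and the `workspaces/<name>/` layout come from the description above. The `workspace_env` helper, its name check and the sample workspace names are assumptions.

```python
import os
from pathlib import Path


def workspace_env(name: str, root: Path = Path("workspaces")) -> dict[str, str]:
    """Copy the current environment and point it at one workspace."""
    # Reject empty names and anything that could escape the root directory.
    if not name or name == ".." or Path(name).name != name:
        raise ValueError(f"invalid workspace name: {name!r}")
    env = dict(os.environ)
    env["WORKSPACE_DATA_DIR"] = str(root / name)
    return env


# Each MCP server subprocess would be started with one of these, for example
# via subprocess.Popen(command, env=client_env).
client_env = workspace_env("client_a")
personal_env = workspace_env("personal")
```

Passing the directory through the environment rather than through each tool call means a server process cannot be asked, mid-session, to read from another workspace.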

End to end

Session boot

python chat.py starts the session. Cortex comes up if not already running. Config loads. If the style fingerprint does not exist, it is built silently during the boot sequence. The knowledge index is checked. Workers initialize and warm up in parallel.

Task arrives — routing hint

The dispatcher calls get_routing_hint(). High confidence on worker_b for a writing task — routed directly via send_ilx. The dispatcher anchors the event stream position and calls wait_for_worker_reply. One round trip.

Worker executes

worker_b receives the route message, queries the knowledge index for relevant prior writing, loads the style fingerprint via get_style_fingerprint(), drafts the content, and reports back exactly once.

Result + rating

The dispatcher synthesises and returns the result. The user rates it one to five stars. The outcome is logged to task_outcomes in the Cortex with task text, elapsed time, worker, and rating.

Trail reinforces

The outcome logged in the result-and-rating step feeds the routing hint the next time a similar task arrives. If the knowledge index was last rebuilt more than seven days ago, the user is prompted to resync at /quit. The system improves routing without configuration changes or code modifications.
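The seven-day resync check is a simple age comparison. A minimal sketch, with `needs_resync` and `RESYNC_AFTER` as hypothetical names:

```python
from datetime import datetime, timedelta, timezone

RESYNC_AFTER = timedelta(days=7)


def needs_resync(last_rebuilt: datetime, now: datetime) -> bool:
    """True if the knowledge index was rebuilt more than seven days ago."""
    return now - last_rebuilt > RESYNC_AFTER


# Fixed timestamps keep the example deterministic.
now = datetime(2025, 1, 15, tzinfo=timezone.utc)
stale = needs_resync(datetime(2025, 1, 1, tzinfo=timezone.utc), now)   # 14 days old
recent = needs_resync(datetime(2025, 1, 10, tzinfo=timezone.utc), now)  # 5 days old
```

Taking `now` as a parameter rather than reading the clock inside the function keeps the check easy to test.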

Conclusion

The core insight behind ILX and Cortex remains: multi-agent systems need language, and language needs maintenance. A protocol for structured messaging gives agents the grammar to communicate clearly. A shared lexicon gives them the vocabulary to communicate precisely.

What building a real workspace revealed is that language alignment is necessary but not sufficient. A system where agents constantly re-derive coordination state from first principles does not scale — it compounds. Every agent deliberating about every task produces a system that is technically correct and practically unusable.

The answer is not smarter agents. It is a smarter environment. Pheromone routing stores coordination intelligence in the outcome history rather than recomputing it per task. Lean workers execute without the overhead of coordination participation. Pre-loaded context eliminates redundant lookups at task time. Workspace isolation keeps personal context precise and client-specific.

The result is a system that gets faster as it accumulates history, not slower as it accumulates complexity. The design question is not just "how do agents communicate?" It is "where does the intelligence live?" If the answer is "in the agents," every task pays the deliberation tax. If the answer is "in the environment," the agents can focus on what they are actually good at.
