Two layers. The first: a structured protocol and shared lexicon that make precise communication between agents possible. The second: routing intelligence built on top that makes the system fast. This article covers both — and why the second layer only became obvious once we built the first.
The moment you move from a single AI agent to a team of them, you inherit a problem that human organizations have struggled with for centuries: coordination. A lone agent can hold its entire context in one pass — its goals, its vocabulary, its assumptions. But when a dispatcher hands a subtask to a research worker, and that worker's output feeds into a writing specialist, and the writer's draft gets reviewed by a code generator, every handoff becomes a potential point of failure.
Not because the agents lack capability. Because they lack a shared language.
Agent A uses "summary" to mean a three-sentence abstract. Agent B interprets "summary" as a full-page executive briefing. Neither is wrong. But when Agent A asks Agent B for a summary and gets back something four times longer than expected, the downstream pipeline breaks — or worse, silently produces low-quality output. Multiply this by dozens of symbols across a complex workflow, and you get a system that appears to work but bleeds coherence at every seam.
This is the problem that ILX (Inter-Language Exchange) and the Cortex shared lexicon were designed to solve. Building a working system against these foundations then revealed a second, less obvious problem: agents that think too hard about coordination do not coordinate — they stall.
ILX is a communication protocol for multi-agent systems. Every message that passes between agents follows a structured envelope format, removing ambiguity about who is speaking, who is listening, and what kind of communicative act is being performed.
    // ILX message envelope
    message msg_042 {
      from dispatcher
      to worker_b
      purpose route_message
      content {
        act = ask
        task = "Write a technical overview of the authentication module"
        constraints = { length = 1500, audience = "senior engineers" }
      }
    }
The id (msg_042 in the example) provides traceability: every message can be referenced, logged, and audited. The from/to fields establish clear sender-receiver relationships. The purpose field declares what kind of message this is, and the act inside content names the speech act being performed. The content carries the payload with constraints fully specified, not implied.
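One way to picture the envelope in code is as a small immutable record with a validation step. This is a minimal sketch, not the workspace's actual implementation: the class name `IlxMessage`, the `sender`/`recipient` attribute names (standing in for `from`/`to`, since `from` is reserved in Python), and the `validate` method are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


# Hypothetical in-memory model of the envelope; fields mirror the example message.
@dataclass(frozen=True)
class IlxMessage:
    id: str
    sender: str      # the envelope's `from` (a reserved word in Python)
    recipient: str   # the envelope's `to`
    purpose: str
    content: dict[str, Any] = field(default_factory=dict)

    def validate(self) -> None:
        """Reject envelopes that leave an addressing field or the act empty."""
        for name in ("id", "sender", "recipient", "purpose"):
            if not getattr(self, name):
                raise ValueError(f"ILX envelope is missing '{name}'")
        if "act" not in self.content:
            raise ValueError("ILX content must declare an act")


msg = IlxMessage(
    id="msg_042",
    sender="dispatcher",
    recipient="worker_b",
    purpose="route_message",
    content={
        "act": "ask",
        "task": "Write a technical overview of the authentication module",
        "constraints": {"length": 1500, "audience": "senior engineers"},
    },
)
msg.validate()  # raises ValueError if the envelope is incomplete
```

Validating at the boundary means a malformed handoff fails loudly at send time instead of degrading output further down the pipeline.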
The speech act taxonomy is deliberately small. Six acts cover the vast majority of multi-agent interaction:
Each speech act carries implicit expectations about what happens next. These conversational contracts make multi-agent workflows predictable without making them rigid.
If ILX is the grammar of multi-agent communication, the Cortex is the dictionary. It serves as a central vocabulary store — a shared lexicon where every symbol that agents use to coordinate has a formal entry containing its definition, canonical examples, a version number, and a status indicating whether it is draft, active, or deprecated.
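A lexicon entry with those four parts could be modeled roughly as follows. The type names `LexiconEntry` and `SymbolStatus` are hypothetical, and the sample definition simply reuses the "three-sentence abstract" reading of "summary" from earlier as an illustration; the real Cortex schema is not shown in this article.

```python
from dataclasses import dataclass
from enum import Enum


class SymbolStatus(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DEPRECATED = "deprecated"


# Hypothetical shape of one Cortex entry: definition, examples, version, status.
@dataclass(frozen=True)
class LexiconEntry:
    symbol: str
    definition: str
    examples: tuple[str, ...]
    version: int
    status: SymbolStatus


summary = LexiconEntry(
    symbol="summary",
    definition="A three-sentence abstract of the source material.",
    examples=("Summarize the design doc in three sentences.",),
    version=1,
    status=SymbolStatus.ACTIVE,
)
```

Pinning the definition and a version to the symbol is what lets two agents detect that they disagree about "summary" before the disagreement reaches the output.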
At boot, a snapshot of all active symbols is pulled and injected directly into the dispatcher's system prompt. Workers receive their task context through routed messages. No agent calls snapshot_lexicon() during live work — that query happens once, at startup, and the result is frozen into the session. The Cortex is consulted in the hot path only when a new symbol needs to be proposed or an existing one is disputed.
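The boot-time pattern can be sketched as "query once, freeze, inject". Everything here other than the name `snapshot_lexicon` is an assumption: the in-memory `CORTEX_SYMBOLS` list stands in for the real store, the function takes the symbol list as an argument for the sake of a self-contained example, and `build_dispatcher_prompt` is a hypothetical helper.

```python
from types import MappingProxyType

# Stand-in for the Cortex store; the real storage layer is not described here.
CORTEX_SYMBOLS = [
    {"symbol": "summary", "definition": "A three-sentence abstract.", "status": "active"},
    {"symbol": "brief", "definition": "A one-page overview.", "status": "draft"},
    {"symbol": "digest", "definition": "Superseded by summary.", "status": "deprecated"},
]


def snapshot_lexicon(symbols):
    """Return a read-only mapping of active symbols only."""
    active = {s["symbol"]: s["definition"] for s in symbols if s["status"] == "active"}
    return MappingProxyType(active)


def build_dispatcher_prompt(base_prompt, snapshot):
    """Inject the frozen snapshot into the system prompt text."""
    lines = [f"- {name}: {definition}" for name, definition in sorted(snapshot.items())]
    return base_prompt + "\n\nShared lexicon:\n" + "\n".join(lines)


# Called once at startup; the session reuses SNAPSHOT and never re-queries.
SNAPSHOT = snapshot_lexicon(CORTEX_SYMBOLS)
DISPATCHER_PROMPT = build_dispatcher_prompt("You are the dispatcher.", SNAPSHOT)
```

The read-only mapping makes the "frozen into the session" property explicit: code on the hot path can read the vocabulary but cannot mutate it.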
This distinction matters. A shared lexicon is necessary infrastructure. It is not a coordination mechanism that should be exercised on every task.
Symbols earn their place through the same deliberative cycle as everything else:
The vocabulary of the workspace is not a static configuration file. It is a living artifact — growing when agents encounter new domains, tightening when ambiguity is detected, deprecating terms that no longer carry their weight.
The original design gave agents full access to the lexicon lifecycle during task execution. Workers looked up symbols before acting, proposed new entries when they encountered gaps, reflected on their own output against lexicon definitions. The dispatcher decomposed tasks by reasoning about what each symbol implied, then coordinated synthesis using the same vocabulary.
This was coherent in principle. In practice, a request to generate a banner image took eighteen minutes.
The breakdown was not a single failure — it was six stacked ones:
The agents were not too slow. They were thinking about the wrong things.
An ant finding food does not deliberate. It follows pheromone gradients — chemical signals left by other ants on paths that led to food. The stronger the trail, the more likely the next ant is to follow it. The trail fades if it stops being reinforced. Intelligence is not in the ant. It is in the environment.
This is the architecture that replaced deliberation-first coordination in this workspace.
The Cortex stores a task_outcomes table. Every time the dispatcher completes a task and the user rates the result — one to five stars — that outcome is logged: which worker handled it, what the task description was, what rating it received, and when. This is the pheromone deposit.
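As a rough illustration, the deposit step could look like this with SQLite. Only the table name `task_outcomes` and the list of recorded fields come from the text; the column names, the use of SQLite, and the `log_outcome` helper are assumptions.

```python
import sqlite3
import time

# Column names are assumptions based on the fields listed in the text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS task_outcomes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    worker TEXT NOT NULL,
    task_text TEXT NOT NULL,
    rating INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
    created_at REAL NOT NULL
)
"""


def log_outcome(conn, worker, task_text, rating, created_at=None):
    """Record one rated task: the pheromone deposit."""
    ts = time.time() if created_at is None else created_at
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO task_outcomes (worker, task_text, rating, created_at) "
            "VALUES (?, ?, ?, ?)",
            (worker, task_text, rating, ts),
        )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
log_outcome(conn, "worker_b", "Write a technical overview of the auth module", 5)
```

The `CHECK` constraint keeps out-of-range ratings from entering the history, which matters because every later routing decision is computed from these rows.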
When a new task arrives, the dispatcher calls get_routing_hint() before doing anything else. The function scans recent high-rated outcomes, computes keyword overlap, applies a time decay factor, and weights by rating:
    # pheromone routing score
    score = keyword_overlap * (0.9 ** days_elapsed) * (rating - 3) / 2.0
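To make the scoring concrete, here is a minimal sketch of how such a hint function might combine the pieces. The score formula is the one given in the text; the Jaccard word-set overlap, the `min_rating` cutoff of 4, summing scores per worker, and the shape of the `outcomes` records are all assumptions chosen for illustration.

```python
import re


def keyword_overlap(task, past_task):
    """Jaccard overlap of lowercase word sets (an assumed overlap measure)."""
    a = set(re.findall(r"[a-z0-9]+", task.lower()))
    b = set(re.findall(r"[a-z0-9]+", past_task.lower()))
    return len(a & b) / len(a | b) if a and b else 0.0


def routing_score(overlap, days_elapsed, rating):
    """The formula from the text: overlap, time decay, and rating weight."""
    return overlap * (0.9 ** days_elapsed) * (rating - 3) / 2.0


def get_routing_hint(task, outcomes, min_rating=4):
    """Return (worker, score) for the best-scoring worker, or None."""
    totals = {}
    for o in outcomes:
        if o["rating"] < min_rating:
            continue  # only high-rated outcomes leave a trail
        s = routing_score(keyword_overlap(task, o["task"]), o["days_elapsed"], o["rating"])
        totals[o["worker"]] = totals.get(o["worker"], 0.0) + s
    if not totals:
        return None
    best = max(totals, key=totals.get)
    return best, totals[best]


history = [
    {"worker": "worker_b", "task": "write release notes", "rating": 5, "days_elapsed": 0},
    {"worker": "worker_c", "task": "fix login bug", "rating": 5, "days_elapsed": 0},
]
hint = get_routing_hint("write release notes", history)  # ("worker_b", 1.0)
```

Note how the rating term behaves: a five-star outcome contributes a weight of 1.0, a four-star outcome 0.5, and a three-star outcome nothing at all.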
The pheromone trail is self-reinforcing. A worker that handles similar tasks well accumulates high-rated outcomes. Its confidence score on those task types rises. It gets routed faster. If it starts underperforming, time decay erodes its score until deliberation kicks back in and re-evaluates the routing.
Decay matters as much as rating. Without it, the system calcifies around early solutions and never adapts. With it, paths that have stopped producing good outcomes fade gradually — making room for better routes that have emerged from more recent experience. The system corrects itself without a single line of routing logic being touched.
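The 0.9 daily factor from the routing formula implies a fairly short memory. A quick calculation, using nothing beyond that factor, shows how fast a trail fades:

```python
import math

DECAY_PER_DAY = 0.9  # the base used in the routing score


def decay_weight(days_elapsed):
    """Fraction of an outcome's original weight left after N days."""
    return DECAY_PER_DAY ** days_elapsed


# Days until an outcome counts for half of its original weight.
half_life_days = math.log(0.5) / math.log(DECAY_PER_DAY)

for days in (0, 7, 30):
    print(f"{days:>2} days: weight {decay_weight(days):.3f}")
print(f"half-life: {half_life_days:.1f} days")
```

An outcome keeps about 48% of its weight after a week and about 4% after a month, so a worker has to keep earning good ratings to hold its route.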
The second change was stripping the workers.
In the original design, workers were participants in the coordination protocol. They looked up lexicon symbols, proposed new entries, reflected on whether their output met the stated criteria. This made them expensive — each worker turn involved multiple LLM calls that had nothing to do with the task itself.
The insight is that a worker's job is execution, not coordination. Coordination is the dispatcher's responsibility, and the pheromone layer handles most of it automatically. A worker that receives a routed task needs to do exactly three things: read the task, use its tools, report the result.
Every worker now runs with _lean_prompt = True. This strips the Language Evolution Protocol and the Self Protocol from the system prompt entirely. No lexicon lookups. No symbol proposals. No output reflection. The worker prompt is:
    // lean worker instruction
    Read content.task from the incoming route_message.
    Execute using your available MCP tools.
    Report back via send_ilx exactly once.
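A flag like `_lean_prompt` can be read as a switch over which prompt sections get assembled. The sketch below is an assumption about how that might be wired: the `Worker` class, the `PROMPT_SECTIONS` mapping, and the placeholder protocol texts are hypothetical; only the flag name and the lean instruction come from the text.

```python
# Placeholder section texts; the real protocol wording is not given in this article.
PROMPT_SECTIONS = {
    "role": "You are a worker agent.",
    "language_evolution_protocol": "<Language Evolution Protocol text>",
    "self_protocol": "<Self Protocol text>",
    "lean_instruction": (
        "Read content.task from the incoming route_message. "
        "Execute using your available MCP tools. "
        "Report back via send_ilx exactly once."
    ),
}

COORDINATION_SECTIONS = ("language_evolution_protocol", "self_protocol")


class Worker:
    def __init__(self, name, lean_prompt=True):
        self.name = name
        self._lean_prompt = lean_prompt

    def system_prompt(self):
        """Assemble the prompt, dropping coordination sections when lean."""
        parts = []
        for key, text in PROMPT_SECTIONS.items():
            if self._lean_prompt and key in COORDINATION_SECTIONS:
                continue
            parts.append(text)
        return "\n\n".join(parts)


lean = Worker("worker_b").system_prompt()
full = Worker("worker_b", lean_prompt=False).system_prompt()
```

Removing the sections from the prompt, instead of telling the worker to ignore them, saves the tokens and also takes away any reason for the model to start a lexicon lookup.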
The reduction in token count and round trips is substantial. But the more important effect is conceptual: workers are tools, not participants in a debate about terminology. They are good at specific things — browser automation, writing, code — and the system routes to them on that basis. Keeping them out of the coordination layer means coordination does not get tangled with execution.
A related class of overhead came from agents looking up personal context during tasks. Before drafting content, a writer worker would scan the Obsidian vault to sample writing style. Before any coding task, the coder would traverse project directories for conventions. These were correct things to know — but wrong times to learn them.
Personal context is now infrastructure, not a live query.
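The shift from live query to infrastructure amounts to a build-if-missing cache at boot and a cheap read at task time. In this sketch only the name `get_style_fingerprint` appears in the article; its `path` argument, the `ensure_style_fingerprint` helper, the JSON file, and the toy "average sentence length" fingerprint are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def build_style_fingerprint(samples):
    """Toy fingerprint: average sentence length in words (illustrative only)."""
    sentences = [s for text in samples for s in text.split(".") if s.strip()]
    words = sum(len(s.split()) for s in sentences)
    return {"avg_sentence_words": words / len(sentences) if sentences else 0.0}


def ensure_style_fingerprint(path, samples):
    """Build the fingerprint at boot only if the cached file is missing."""
    if not path.exists():
        path.write_text(json.dumps(build_style_fingerprint(samples)))
    return json.loads(path.read_text())


def get_style_fingerprint(path):
    """Task-time read: a file load, never a vault scan."""
    return json.loads(path.read_text())


cache = Path(tempfile.mkdtemp()) / "style_fingerprint.json"
boot = ensure_style_fingerprint(cache, ["Short sentences. They read fast."])
at_task_time = get_style_fingerprint(cache)
```

The expensive scan happens at most once, during boot, and every later task pays only for a small file read.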
1. python chat.py starts the session. Cortex comes up if not already running. Config loads. If the style fingerprint does not exist, it is built silently during the boot sequence. The knowledge index is checked. Workers initialize and warm up in parallel.
2. The dispatcher calls get_routing_hint(). High confidence on worker_b for a writing task, so the task is routed directly via send_ilx. The dispatcher anchors the event stream position and calls wait_for_worker_reply. One round trip.
3. worker_b receives the route message, queries the knowledge index for relevant prior writing, loads the style fingerprint via get_style_fingerprint(), drafts the content, and reports back exactly once.
4. The outcome is logged to task_outcomes in the Cortex with task text, elapsed time, worker, and rating.
5. The session ends with /quit. The system improves routing without configuration changes or code modifications.
The core insight behind ILX and Cortex remains: multi-agent systems need language, and language needs maintenance. A protocol for structured messaging gives agents the grammar to communicate clearly. A shared lexicon gives them the vocabulary to communicate precisely.
What building a real workspace revealed is that language alignment is necessary but not sufficient. A system where agents constantly re-derive coordination state from first principles does not scale — it compounds. Every agent deliberating about every task produces a system that is technically correct and practically unusable.
The answer is not smarter agents. It is a smarter environment. Pheromone routing stores coordination intelligence in the outcome history rather than recomputing it per task. Lean workers execute without the overhead of coordination participation. Pre-loaded context eliminates redundant lookups at task time. Workspace isolation keeps personal context precise and client-specific.
The result is a system that gets faster as it accumulates history, not slower as it accumulates complexity. The design question is not just "how do agents communicate?" It is "where does the intelligence live?" If the answer is "in the agents," every task pays the deliberation tax. If the answer is "in the environment," the agents can focus on what they are actually good at.