From Agent to Domain Intelligence: A Self-Evolving Knowledge Engine
Abstract
General-purpose AI agents can reason, plan, and execute — but they cannot perform well in a specific operational domain without local knowledge. We argue that the gap between general intelligence and domain effectiveness is filled entirely by knowledge, and that the scaffolding commonly built for domain agents — skills, routes, SOPs, policies — is all knowledge encoded in different forms.
We present the Self-Evolving Knowledge Engine (SEKE), an architecture that enables an agent to autonomously learn, organize, and refine domain knowledge through real-world work. SEKE consists of three elements: a filesystem-based semantic tree as the knowledge storage model; two evolutionary loops — the Knowledge Evolution Loop (KEL) for continuous knowledge capture and refinement, and the Meta Evolution Loop (MEL) for improving the learning process itself; and a governance layer through which humans provide constitutional structural decisions that the system cannot override.
Our core thesis: General Intelligence + Capabilities + Domain Knowledge = Domain Intelligence. Given a reasoning engine, access to domain systems, and a Self-Evolving Knowledge Engine, an agent can bootstrap itself into Domain Intelligence without custom training or hand-crafted knowledge bases. Once bootstrapped, the accumulated knowledge — particularly the organizational structure it has evolved and the knowledge-learning capability it has developed — creates a compounding competitive barrier that is extremely difficult for competitors to cross.
We ground SEKE in several foundational ideas: knowledge defined as anything that changes agent behavior (not mere information storage); domains defined as the set of tasks that general-purpose models like Claude or Codex cannot perform well out of the box (a dynamic, not fixed, boundary); and a negative feedback argument for why the system self-corrects rather than amplifying errors.
These ideas emerged from building and operating a production domain intelligence system. SEKE is not a theoretical proposal — it is a working architecture refined through real-world deployment.
1. The Missing Piece in Agent Architecture
The current wave of AI agent development is focused on scaffolding. Teams build skills, define tool interfaces, write standard operating procedures, design routing logic, and codify policies. Each new domain requires a new set of these constructs, hand-crafted by engineers who understand both the domain and the agent framework.
This approach works, but it has a structural limitation: the scaffolding doesn’t learn. A skill defined today works the same way a year from now. An SOP written for one scenario doesn’t adapt when the scenario changes. A routing rule that was correct last quarter may silently degrade as data sources evolve.
We propose that the problem is not insufficient scaffolding, but a misidentification of what these constructs actually are. Routes, skills, SOPs, policies, guardrails — these are all knowledge encoded in different forms:
A route is the knowledge of “when this type of question arises, take this path.”
A skill is the knowledge of “how to use this tool to accomplish this task.”
An SOP is the knowledge of “the reliable sequence of steps for this workflow.”
A policy is the knowledge of “what should and should not be done in this context.”
If this is true — if all agent scaffolding is ultimately knowledge — then the right approach is not to build better scaffolding, but to build an engine that can generate, organize, and evolve knowledge autonomously from real-world work.
This paper presents such an engine.
2. Knowledge as the Bridge to Domain Intelligence
A general-purpose language model is remarkably capable out of the box. It can reason, plan, write code, and synthesize information across domains. Yet when placed in a specific operational context — a company’s data infrastructure, a team’s analytics workflows, a particular set of business metrics — it struggles. Not because it lacks reasoning ability, but because it lacks local knowledge.
Consider a data analyst joining a new company. They know SQL. They understand statistics. They can build dashboards. But on day one, they cannot answer “what was our retention rate last week?” — not because the question is hard, but because they don’t know which table to query, which metric definition to use, which edge cases to watch for, or which upstream pipeline might have stale data.
The gap between general competence and domain effectiveness is filled by knowledge. Knowledge, in this context, is what guides the agent’s decisions about what to do, how to do it, and what to avoid. An agent with the right domain knowledge will choose the right data source, apply the right metric definition, and flag the right caveats — not because it was programmed to, but because it knows to.
This leads to a formulation:
General Intelligence + Capabilities + Domain Knowledge = Domain Intelligence
The agent provides reasoning. Capabilities (tools, APIs, data access) provide reach. Knowledge provides direction. With all three, an agent can function as Domain Intelligence. Remove any one, and the system falls short: intelligence without capabilities is a thinker with no hands; capabilities without knowledge is a powerful tool with no aim; knowledge without intelligence is a reference book that cannot act.
It is worth noting that capabilities are themselves largely a materialization of knowledge — “how to access this database,” “what protocol this API uses,” “how to authenticate with this service” are all knowledge. An agent with sufficient interface knowledge can typically construct capabilities on its own through general intelligence. What capabilities represent as a distinct term in this formulation is the physical premise: placing the agent in an environment where it can reach the target systems. That is the one thing knowledge alone cannot substitute.
3. The Self-Evolving Knowledge Engine
If knowledge is the key ingredient, the question becomes: where does it come from?
The traditional answer is human curation. Experts write documentation, build knowledge bases, maintain wikis. This works at small scale but degrades over time — documentation drifts from reality, edge cases go undocumented, and tribal knowledge remains in people’s heads.
We propose an alternative: let the system learn its own domain knowledge through real work, and continuously refine what it learns. We call this the Self-Evolving Knowledge Engine (SEKE).
SEKE has three fundamental elements:
A filesystem-based semantic tree as the knowledge storage model — where every path and node name carries meaning, and the tree’s structure itself evolves to reflect the domain’s knowledge landscape.
Two evolutionary loops that generate knowledge and organize it into the tree — the Knowledge Evolution Loop (KEL) for continuous learning, and the Meta Evolution Loop (MEL) for improving the learning process itself.
A governance layer through which humans provide structural decisions that the system cannot override — the constitutional framework within which knowledge evolves.
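To make the first element concrete, the sketch below materializes a small semantic tree on disk. This is a hypothetical illustration, not the production layout: the branch names (`metrics`, `data-sources`, `workflows`) and file names are invented for the example. The point is that every path segment carries domain meaning, so a path reads as a statement of scope.

```python
from pathlib import Path
import tempfile

# Illustrative node paths: each segment is a domain concept, so the
# path itself tells the agent what the node is about.
EXAMPLE_NODES = [
    "metrics/retention/definition.md",
    "metrics/retention/known-edge-cases.md",
    "data-sources/events-table/reliability-notes.md",
    "workflows/weekly-report/sop.md",
]

def build_tree(root: Path, nodes: list[str]) -> list[Path]:
    """Materialize a semantic tree under `root`; return its leaf files."""
    leaves = []
    for node in nodes:
        path = root / node
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(f"# knowledge node: {node}\n")
        leaves.append(path)
    return leaves

root = Path(tempfile.mkdtemp())
for leaf in build_tree(root, EXAMPLE_NODES):
    print(leaf.relative_to(root))
```

Because the storage model is an ordinary filesystem, restructuring the tree is just moving and renaming files — no schema migration is needed when the Refiner reorganizes.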
3.1 Knowledge Evolution Loop (KEL)
The inner loop runs continuously as the agent works. Every interaction — every question answered, every analysis completed, every workflow executed — is an opportunity to learn.
KEL has three phases:
Phase 1: Knowledge Capture.
After each interaction, the system examines the full work trace: the user’s question, the agent’s reasoning process (including failed attempts and dead ends), the final result, and the user’s subsequent behavior.
Capture operates under two instructions:
The first: extract information that could change how the agent handles future tasks — making it faster, more accurate, or more reliable. Such information is what we call knowledge. This is the core, originating instruction of the entire system. If a piece of information would not alter the agent's future behavior, it is not captured.
The second: adversarially review each piece of extracted knowledge. Challenge it — under what conditions might this be wrong? What edge cases exist? What would need to be true for this knowledge to hold? The findings from this review are themselves evaluated against the first instruction. If an adversarial finding could change agent behavior, it is captured alongside the original knowledge.
Capture also incorporates feedback on previously retrieved knowledge. When the agent uses existing knowledge and it helps, that is a positive signal. When existing knowledge proves inaccurate or incomplete, that signal feeds back into the next refinement cycle.
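The two capture instructions can be sketched as a filter pipeline. This is a minimal sketch under loud assumptions: `changes_future_behavior` and `adversarial_review` stand in for model judgments (here reduced to trivial heuristics), and none of these names come from a real API.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeEntity:
    claim: str
    caveats: list[str] = field(default_factory=list)

def changes_future_behavior(info: str) -> bool:
    """Instruction 1 as a stub: would this alter how the agent works?
    A real system would ask the model; here, a keyword heuristic."""
    return any(w in info for w in ("use", "avoid", "prefer"))

def adversarial_review(claim: str) -> list[str]:
    """Instruction 2 as a stub: challenge the claim, return findings."""
    return [f"avoid applying this when upstream data is stale: {claim}"]

def capture(trace_findings: list[str]) -> list[KnowledgeEntity]:
    entities = []
    for info in trace_findings:
        if not changes_future_behavior(info):  # filter out non-knowledge
            continue
        # Adversarial findings are kept only if they, too, pass the
        # behavior-change test from instruction 1.
        caveats = [c for c in adversarial_review(info)
                   if changes_future_behavior(c)]
        entities.append(KnowledgeEntity(claim=info, caveats=caveats))
    return entities
```

Note that the same predicate governs both instructions: a caveat that would never change behavior is discarded just like any other non-knowledge.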
Phase 2: Knowledge Refine.
Captured entities enter a processing queue. The Refiner considers each entity in the context of the current Knowledge Tree — a semantic tree structure where every path and node name carries meaning — and restructures the tree to incorporate the new knowledge.
This is not a simple insertion. The Refiner may create new branches, merge nodes that represent the same concept, split nodes that have grown too broad, rewrite content based on accumulated evidence, or move subtrees to more appropriate semantic locations. The tree’s structure is not designed upfront; it emerges from the knowledge it contains and evolves as that knowledge grows.
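Two of the restructuring operations can be sketched mechanically on a tree represented as a `{path: content}` map. In SEKE these decisions would be made by the Refiner with the full tree in context; the functions and example paths below are illustrative only.

```python
def move_subtree(tree: dict, src_prefix: str, dst_prefix: str) -> dict:
    """Relocate every node under src_prefix to a better semantic home."""
    return {
        (dst_prefix + path[len(src_prefix):] if path.startswith(src_prefix)
         else path): content
        for path, content in tree.items()
    }

def merge_nodes(tree: dict, a: str, b: str, merged: str) -> dict:
    """Fuse two nodes that turned out to describe the same concept."""
    tree = dict(tree)
    combined = tree.pop(a) + "\n" + tree.pop(b)
    tree[merged] = combined
    return tree

tree = {
    "metrics/dau/definition.md": "daily active users = ...",
    "metrics/daily-actives/definition.md": "active users per day = ...",
    "misc/events-table-notes.md": "events_v2 lags by ~6h",
}
# Merge two differently-named nodes that encode the same metric,
# then move a stranded note to a more meaningful branch.
tree = merge_nodes(tree, "metrics/dau/definition.md",
                   "metrics/daily-actives/definition.md",
                   "metrics/dau/definition.md")
tree = move_subtree(tree, "misc/", "data-sources/")
```

Splitting an overgrown node and pruning obsolete content follow the same pattern: pure transformations of the path map, with the semantics supplied by the model.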
Phase 3: Agent Retrieval.
When the agent begins a new task, it is provided with a meta document that describes the Knowledge Tree’s current structure: its branches, their meanings, and the most important recent changes. The agent then plans its own retrieval — deciding which branches to read, how deeply to explore, and what to skip — based on the task at hand.
This is a deliberate choice. The system does not inject knowledge into the agent’s context; the agent navigates the tree autonomously. This ensures that retrieval is task-specific and that the agent develops judgment about what knowledge it needs.
Because retrieval is an active choice, it generates signal. What the agent chose to read, what it ignored, what it read and then contradicted — all feed back into the next Capture phase, closing the loop.
3.2 Meta Evolution Loop (MEL)
The outer loop operates on a longer timescale — periodically rather than per-interaction. MEL reviews the accumulated record of captures, tree changes, and interaction outcomes to identify persistent learning strategies that KEL should adopt.
MEL does not make temporary adjustments. It identifies structural gaps in what the system is learning — directions that individual KEL cycles cannot detect because each cycle only sees one interaction. MEL sees patterns across many interactions.
For example, MEL might observe that the agent repeatedly struggles with data source selection, while KEL has been primarily capturing SQL optimization knowledge. MEL would then direct KEL to pay attention to data source knowledge: which tables serve which analytical purposes, which sources have known reliability issues, which upstream pipelines affect which downstream analyses.
These directives are durable. “Pay attention to data source knowledge” is not a one-time correction; it is a persistent learning direction that improves every subsequent KEL cycle.
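A MEL pass of this kind can be sketched as a cross-interaction scan: find failure categories that recur while going uncaptured, and emit a durable directive. The category labels and threshold are invented for the example; a real pass would classify interactions with the model rather than receive labels.

```python
from collections import Counter

def mel_pass(failure_categories: list[str],
             capture_categories: list[str],
             threshold: int = 3) -> list[str]:
    """Emit durable directives for categories that fail repeatedly
    but never show up in what KEL has been capturing."""
    failures = Counter(failure_categories)
    captures = Counter(capture_categories)
    return [f"pay attention to {category} knowledge"
            for category, n_failed in failures.items()
            if n_failed >= threshold and captures[category] == 0]
```

Run over the example from the text — repeated data-source failures while KEL captures only SQL knowledge — the pass yields exactly one persistent directive:

```python
mel_pass(["data-source"] * 4 + ["sql"], ["sql"] * 10)
```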
3.3 The Governance Layer
The Knowledge Tree has two layers of structural authority.
The constitutional layer consists of structural decisions made by authorized humans. When a human creates a top-level directory, defines a structural constraint, or establishes an organizational principle, that decision is permanent. The Refiner operates within the constitution but cannot alter it.
The knowledge layer is everything else. The Refiner has full autonomy to reorganize, rename, merge, split, and restructure knowledge within the constitutional framework.
This distinction exists because certain structural decisions require a kind of judgment that the system cannot reliably develop on its own. A domain expert who understands that “metrics should be organized globally, not per business unit” is encoding a deep insight about the domain’s architecture — one that the system might take months to discover through trial and error, if it discovers it at all.
Ordinary user feedback enters through a different channel: conversation with the agent. When a user suggests a better way to organize knowledge, that suggestion flows through the normal KEL cycle. It may influence sub-structure within constitutional boundaries. It does not alter the top-level architecture.
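The two-layer authority split lends itself to a simple guard. This sketch assumes one plausible reading of the constitution — top-level directories are human-set and immutable, everything beneath them is the knowledge layer — and the directory names are illustrative.

```python
# Human-set constitutional layer: top-level structure the Refiner
# cannot alter (assumed example names).
CONSTITUTION = {"metrics", "data-sources", "workflows"}

def refiner_may_modify(path: str) -> bool:
    """True only for nodes strictly below a constitutional directory."""
    parts = path.strip("/").split("/")
    return len(parts) > 1 and parts[0] in CONSTITUTION

def apply_restructure(path: str, op: str) -> str:
    if not refiner_may_modify(path):
        raise PermissionError(f"constitutional node: cannot {op} {path}")
    return f"{op} applied to {path}"
```

Everything inside the fence is fair game for the Refiner; attempts to rename, move, or delete the fence itself fail loudly rather than silently eroding the constitution.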
4. Key Design Decisions
Several design choices are fundamental to how SEKE operates.
4.1 No Explicit User Feedback
The system does not ask users to rate responses. Feedback is inferred from interaction patterns: a follow-up question implies the initial answer was insufficient; acceptance and continuation imply adequacy; rephrasing the same question implies a missed point. The model also evaluates its own work process for inefficiencies.
This avoids the well-documented problems with explicit ratings — low response rates, inconsistency, and positivity bias — while leveraging a richer signal: what actually happened next.
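The inference step can be sketched as a classifier over the user's next turn. The signal labels and the word-overlap similarity are illustrative stand-ins for model judgments over the real conversation; the 0.6 threshold is an arbitrary assumption.

```python
from typing import Optional

def infer_feedback(question: str, next_turn: Optional[str]) -> str:
    """Classify implicit feedback from what the user did next."""
    if next_turn is None:
        return "adequate"          # user accepted and moved on
    q = set(question.lower().split())
    n = set(next_turn.lower().split())
    overlap = len(q & n) / max(len(q | n), 1)
    if overlap > 0.6:
        return "missed_point"      # rephrasing the same question
    return "insufficient"          # follow-up implies gaps in the answer
```

Each label then maps to a Capture-phase action: `missed_point` and `insufficient` trigger negative-feedback capture against the knowledge that was retrieved for the original answer.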
4.2 Capture from the Work Process, Not Just Results
Knowledge Capture examines the entire work trace, not just the question-answer pair. This matters because the most transferable knowledge often lies in the journey: the agent tried three approaches before finding one that handled edge cases correctly; it explored a data source that looked promising but turned out to have stale data; it discovered that two differently-named metrics are actually the same computation.
By capturing from the process, the system learns how to work in this domain, not just what the answers are.
4.3 Adversarial Review at Capture Time
Before a Knowledge Entity enters the Refine queue, it undergoes adversarial challenge. This serves as an early-stage quality filter, attaching boundary conditions and failure modes to each piece of knowledge before it is incorporated into the tree. The agent then receives not just the knowledge but its limitations — enabling more reliable application.
4.4 Tree Restructuring, Not Merging
The Refiner does not treat the Knowledge Tree as an append-only store. It maintains the authority to reorganize the entire tree structure as knowledge accumulates. Categories that made sense with ten nodes may not make sense with a hundred. Abstractions emerge from concrete examples. Obsolete knowledge gets pruned. The tree’s shape at any moment reflects the system’s current understanding of the domain.
4.5 Agent-Driven Retrieval
The agent decides what knowledge it needs for each task, navigating the tree structure based on the meta document’s guidance. This ensures retrieval is contextually appropriate and prevents knowledge overload. The meta document’s size is controlled by design: it contains navigational information (structure and summaries), not knowledge content, so its growth scales with the number of branches rather than the total volume of knowledge.
4.6 Versioned Refinement and Cold Start
Each Refine operation produces a versioned snapshot of the Knowledge Tree, enabling rollback if performance degrades. The system can be cold-started by running KEL on a small set of representative questions, evaluating the results with MEL, and publishing the initial tree after human review.
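The versioning mechanism above can be sketched as an append-only history of snapshots, where rollback is itself a new commit rather than a destructive reset. The class and its interface are illustrative, not the production design.

```python
import copy

class VersionedTree:
    """Append-only snapshot history for the Knowledge Tree."""

    def __init__(self, tree: dict):
        self.versions = [copy.deepcopy(tree)]

    @property
    def current(self) -> dict:
        return self.versions[-1]

    def commit(self, new_tree: dict) -> int:
        """Record a Refine result; return its version id."""
        self.versions.append(copy.deepcopy(new_tree))
        return len(self.versions) - 1

    def rollback(self, version_id: int) -> None:
        """Restore an earlier snapshot by committing a copy of it,
        so the regression itself stays in the history."""
        self.versions.append(copy.deepcopy(self.versions[version_id]))
```

Keeping the regressed version in the history matters for the loop: the diff between the bad snapshot and the restored one is itself capture-worthy evidence about what kind of refinement went wrong.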
5. Philosophical Foundations
5.1 The Definition of Knowledge
We define knowledge as anything that could change an agent’s behavior toward better, faster completion of its goals. If a piece of information does not alter how the agent works, it is not knowledge — it is data, or memory, or noise.
This is a deliberately operational definition, in the tradition of defining concepts by their effects rather than their attributes — much as Coase defined transaction costs not by enumerating what they contain, but by what they cause (the existence of firms). Our definition does not enumerate what knowledge looks like. It provides a judgment function: does this change behavior? If yes, it is knowledge. If no, it is not.
This definition sharply distinguishes knowledge from memory. Memory is the accumulation of past events. Knowledge is the distillation of those events into principles that guide future action. A system with memory remembers that it used table_x last time. A system with knowledge understands why table_x is more reliable than table_y for this class of queries, and under what conditions even table_x cannot be trusted. Memory grows linearly with experience. Knowledge grows through abstraction — extracting one principle from a hundred experiences that can guide the hundred-and-first.
5.2 The Definition of Domain
We define a domain as the set of tasks that a general-purpose model — such as Claude or Codex — cannot perform well out of the box. Domain knowledge is the knowledge that would enable it to perform those tasks.
This definition is intentionally dynamic. What constitutes a domain shifts as general intelligence advances. Tasks that require specialized knowledge today may become trivially solvable by next-generation models. When that happens, the domain boundary contracts — not because the tasks disappeared, but because they no longer require local knowledge.
At a deeper level, domains exist because of human cognitive bandwidth limitations. Humans organize work into localized regions — industries, companies, teams — each with its own language, processes, and institutional knowledge. The most fundamental characteristic of a domain is its specialized language: terms that compress complex local meaning into short phrases. “Retention” means something precise and different at every company. Domain Intelligence is the system’s adaptation to this localization of human cognition.
5.3 Negative Feedback and Self-Correction
A legitimate concern with any self-reinforcing loop is error amplification. We argue that SEKE is a negative feedback system, not a positive one.
Positive feedback — where errors compound — requires a specific condition: incorrect knowledge produces a result that is mistakenly judged as good. In SEKE, the judge is not the knowledge system itself but the real world. Users challenge incorrect answers. SQL queries return numbers that do not match reality. Reports get questioned by stakeholders. These are external signals that the knowledge system cannot manipulate.
As long as the system remains coupled to real-world outcomes, errors are exposed. Exposed errors are captured as negative feedback. Negative feedback enters the Refine phase. Knowledge gets corrected.
The only scenario where errors self-reinforce is when incorrect knowledge produces incorrect results that the real world accepts without question. But in that case, the “error” has effectively been endorsed by reality — and from the system’s perspective, it is not an error at all.
This means individual components do not need to be perfect. If each stage operates at 80% accuracy, the result is not compound degradation but iterative improvement: the incorrect 20% gets corrected in subsequent cycles when reality pushes back.
5.4 The Bootstrap Thesis
For the majority of domains, we believe three components are sufficient to create Domain Intelligence:
An Agent (a general-purpose reasoning engine)
Capabilities (access to the domain’s systems and data)
A Self-Evolving Knowledge Engine
No custom training, no fine-tuning, no hand-crafted knowledge bases. The system bootstraps itself into Domain Intelligence through use.
Once bootstrapped — after sustained real-world operation — the system’s accumulated knowledge creates a significant competitive barrier. This barrier is not just the knowledge itself, but the organizational structure it has evolved and the knowledge-learning capability it has developed (MEL-tuned KEL). These are extremely difficult for competitors to cross, because they emerged from the specific patterns of real-world interactions and compound with every cycle.
The barrier is compounding: MEL improves KEL, which improves the Knowledge Tree, which improves agent performance, which generates richer material for the next KEL cycle. A later entrant with the same components faces the same cold start but cannot compress the evolutionary cycles.
5.5 The Durability of Domain Intelligence
Once Domain Intelligence has been bootstrapped through a Self-Evolving Knowledge Engine, its barrier is extremely strong — and it cannot be overcome simply by deploying more powerful general intelligence.
A domain is, at its core, a localized organization of human experience and cognition. Its essence is localized language — terms that carry precise, context-specific meaning within that domain. To operate effectively in a domain, one must learn this language. No amount of general reasoning ability substitutes for this learning process. A vastly more intelligent agent, without knowledge evolution, would still face the same cold start.
This means domain intelligence, once established, is durable. A competitor with a superior general-purpose model but no accumulated domain knowledge cannot simply outthink the gap.
5.6 How Domains Die
If domains cannot be conquered from within, how do they end?
Domains do not die because general intelligence “absorbs” them — learning their knowledge and making them redundant. Domains die through paradigm revolution: general intelligence changes how humans organize their work, and the domain ceases to need to exist.
Consider the domain of internet product analytics. Today, every internet company develops specialized knowledge around conversion funnels, retention metrics, and user behavior analysis. Each company’s domain has its own terminology, its own edge cases, its own institutional knowledge. A Domain Intelligence system that has spent months learning this domain has a near-insurmountable advantage within it.
General intelligence will not defeat this domain intelligence by becoming better at retention analysis. It will defeat it by changing the paradigm entirely: users will interact with services through a unified AI interface rather than through individual apps. When users no longer navigate products directly, the entire framework of product analytics — funnels, sessions, retention — loses its object of study. The domain doesn’t get outperformed. It becomes unnecessary.
This is the pattern: domains are not absorbed by stronger intelligence; they are dissolved by paradigm shifts that reorganize how humans work. The Self-Evolving Knowledge Engine builds durable competitive barriers within existing domains. Those barriers hold until the domain itself is restructured out of existence — not by a smarter system doing the same work better, but by a different world that no longer needs that work done.
The ideas in this paper emerged from building and operating a production domain intelligence system for data analytics. The Self-Evolving Knowledge Engine is not a theoretical proposal — it is a working architecture whose design was refined through real-world deployment.
