04-RAG-System
The RAG system is the agent's memory of the project. It is mandatory before any plan or edit unless the user explicitly turns RAG OFF for a run.
1. What gets indexed
- routes (URI, method, name, controller, middleware, params)
- controllers and controller methods
- models and their relationships
- migrations and resulting tables/columns/indexes/FKs
- services, jobs, events, listeners, policies, middleware
- form requests, API resources
- config files
- composer.json / package.json / Podfile / Package.swift
- README files, all `docs/*.md`, `.cursor/rules/*.mdc`, `.claude/*.md`
- known API endpoints (from OpenAPI if present)
- previous run summaries (final_summary events)
- previous file changes (path, action, summary)
- known bugs (from `docs/KNOWN_BUGS.md` if present)
- architectural decisions (ADRs in `docs/adr/*.md`)
2. Storage shape
See 03 — Database Migrations → rag_chunks.
Key columns:
- `source_type`: route | controller | controller_method | model | model_relation | migration | table | service | job | event | listener | policy | middleware | request | resource | config | package | readme | doc | cursor_rule | claude_rule | api_endpoint | run_summary | file_change | known_bug | adr
- `metadata_json` includes: framework, file path, line range, language, related symbols, related routes, related tables.
- `content_hash`: sha256 of the normalized `chunk_text`. If the hash matches the stored chunk for a given (`project_id`, `source_path`, `symbol_name`), skip re-embedding.
3. Chunking strategy
Chunking is language-aware: each source type is split on its natural structural unit.
| Source | Chunk unit | Notes |
|---|---|---|
| PHP class | per public method + one class header chunk | Include namespace, class doc, signature. |
| PHP migration | per up() table | Capture column types, indexes, FKs. |
| Routes | per route group, then per route | One chunk per HTTP route is the most useful. |
| Markdown docs | per H2/H3 section, with overlap | 200-token overlap. |
| Config | per top-level key | One chunk per [key => array] entry. |
| Swift | per struct/class/view | Use SwiftSyntax-style splits (regex acceptable v1). |
| JS/TS | per exported symbol | TS-morph or regex-based. |
Global limits: chunk text 200–1200 tokens, min overlap 100 tokens for prose, no overlap for code.
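The "per H2/H3 section" rule from the table can be sketched as follows. This is a minimal illustration, not the project's chunker: the function name `chunkMarkdownBySection` is hypothetical, it assumes ATX-style `##` / `###` headings, and it ignores the overlap and token limits described above.

```php
<?php
declare(strict_types=1);

/**
 * Split Markdown into one chunk per H2/H3 section.
 * Hypothetical helper: overlap and token limits are not handled here,
 * and heading-like lines inside fenced code would be split on as well.
 *
 * @return list<array{heading: string, text: string}>
 */
function chunkMarkdownBySection(string $markdown): array
{
    $chunks = [];
    $heading = '';   // text before the first H2/H3 gets an empty heading
    $buffer = [];

    // Emit the buffered lines as a chunk unless they are blank.
    $flush = function () use (&$chunks, &$heading, &$buffer): void {
        $text = trim(implode("\n", $buffer));
        if ($text !== '') {
            $chunks[] = ['heading' => $heading, 'text' => $text];
        }
    };

    foreach (preg_split('/\R/', $markdown) as $line) {
        // Exactly two or three '#' followed by whitespace: H2 or H3 only.
        if (preg_match('/^#{2,3}\s+(.*)$/', $line, $m) === 1) {
            $flush();
            $heading = trim($m[1]);
            $buffer = [$line];
            continue;
        }
        $buffer[] = $line;
    }
    $flush();

    return $chunks;
}
```

H1 and H4+ headings stay inside the surrounding chunk, so a deep subsection travels with its parent H3.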
4. Embedding pipeline
```mermaid
flowchart LR
    A["changed file"] --> B["Detector picks source_type"]
    B --> C["Chunker emits chunks"]
    C --> D["Normalize + content_hash"]
    D --> E{"hash exists?"}
    E -- yes --> X["skip"]
    E -- no --> F["OpenAI embeddings"]
    F --> G["Upsert rag_chunks row"]
    G --> H["emit rag_chunk_indexed event"]
```

- Embedding model: `text-embedding-3-large` (configurable via `agent_settings.openai_embedding_model`).
- Dimensions: 3072. Stored as `vector(3072)`.
- Re-embed only when `content_hash` changes.
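The normalize, hash, and skip steps of the pipeline could look roughly like this. It is a sketch under stated assumptions: the normalization rules (unified line endings, trailing whitespace stripped) are a guess, `$store` is an in-memory stand-in for `rag_chunks`, and `$embed` is a placeholder callable rather than a real embeddings client.

```php
<?php
declare(strict_types=1);

/** Normalize before hashing. The exact rules here are an assumption. */
function normalizeChunk(string $text): string
{
    $text = str_replace(["\r\n", "\r"], "\n", $text);        // unify line endings
    $text = preg_replace('/[ \t]+$/m', '', $text) ?? $text;   // strip trailing whitespace per line
    return trim($text);
}

/**
 * Index one chunk and report whether an embedding call was made.
 * $store stands in for rag_chunks, keyed by (project_id, source_path, symbol_name).
 */
function indexChunk(
    array &$store,
    callable $embed,
    int $projectId,
    string $sourcePath,
    string $symbolName,
    string $chunkText
): bool {
    $normalized = normalizeChunk($chunkText);
    $hash = hash('sha256', $normalized);
    $key = "{$projectId}|{$sourcePath}|{$symbolName}";

    if (isset($store[$key]) && $store[$key]['content_hash'] === $hash) {
        return false; // unchanged content: skip re-embedding
    }

    // New or changed content: embed and upsert.
    $store[$key] = [
        'content_hash' => $hash,
        'chunk_text' => $normalized,
        'embedding' => $embed($normalized),
    ];
    return true;
}
```

Because hashing happens after normalization, whitespace-only edits do not trigger a paid embedding call.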
5. Indexing triggers
- Initial: on project `ready` state after clone.
- Incremental: on `file_created`, `file_updated`, `file_deleted` events from a run.
- Manual: `POST /api/projects/{project}/index-rag` re-scans and re-indexes everything.
- Scheduled: nightly Horizon job `ProjectIndexer::scan()` for drift detection.
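One way to route the incremental events is a small lookup from event type to indexing action. The mapping below is an assumption (the spec does not say which service method handles which event), and `indexingActionFor` is a hypothetical helper name; the returned strings mirror the `indexFile` and `invalidate` methods on the service skeleton.

```php
<?php
declare(strict_types=1);

/**
 * Map a run's file event to the indexing action it should trigger.
 * Returns null for events that do not affect the index.
 * Assumed mapping: created/updated files are (re)indexed, deleted files are invalidated.
 */
function indexingActionFor(string $eventType): ?string
{
    return match ($eventType) {
        'file_created', 'file_updated' => 'indexFile',
        'file_deleted' => 'invalidate',
        default => null,
    };
}
```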
6. Retrieval contract
`RagContextService::retrieve(Project $p, string $query, array $options = [])`

Options:
- `top_k` (default 12)
- `source_types` (filter list)
- `must_include_paths` (boost)
- `route_name`, `controller_class`, `table_name` (relationship-aware filters)
- `include_run_summaries` (default true)
- `min_score` (default 0.25)

Returns `RagResult { chunks: RagChunk[], debug: { strategy, latency_ms, total_candidates, scores } }`.
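To make the defaults concrete, here is a sketch of how already-scored candidates might be filtered and cut to `top_k`. It assumes candidates arrive as plain arrays with `id`, `source_type`, and `score`, that an empty `source_types` list means "no filter", and that `include_run_summaries = false` simply drops `run_summary` chunks. `selectChunks` and `RAG_DEFAULT_OPTIONS` are illustrative names, and the boost and relationship-aware options are left out.

```php
<?php
declare(strict_types=1);

/** Defaults taken from the option list above. */
const RAG_DEFAULT_OPTIONS = [
    'top_k' => 12,
    'source_types' => [],
    'include_run_summaries' => true,
    'min_score' => 0.25,
];

/**
 * Apply min_score, source-type filters, and top_k to scored candidates.
 *
 * @param list<array{id: int, source_type: string, score: float}> $candidates
 * @return list<array{id: int, source_type: string, score: float}>
 */
function selectChunks(array $candidates, array $options = []): array
{
    $opts = array_merge(RAG_DEFAULT_OPTIONS, $options);

    $kept = array_filter($candidates, function (array $c) use ($opts): bool {
        if ($c['score'] < $opts['min_score']) {
            return false;
        }
        if (!$opts['include_run_summaries'] && $c['source_type'] === 'run_summary') {
            return false;
        }
        if ($opts['source_types'] !== [] && !in_array($c['source_type'], $opts['source_types'], true)) {
            return false;
        }
        return true;
    });

    // Highest score first, then keep the top_k best.
    usort($kept, fn (array $a, array $b): int => $b['score'] <=> $a['score']);
    return array_slice($kept, 0, $opts['top_k']);
}
```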
7. Relationship-aware retrieval
The service must support multi-hop queries the agent commonly needs:
- "What controller serves route `invoices.store` and which model and table does it touch?" → resolved by graph joins in `metadata_json` plus a `route → controller_method → model → table` walk before vector search.
- "Which migrations created or altered table `invoices`?" → SQL filter on `source_type='migration'` + metadata `affected_tables`.
- "What did previous runs change about `InvoiceController`?" → `source_type='file_change'` filter on path.
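The `route → controller_method → model → table` walk can be illustrated with a plain edge map. This is a simplification: `$edges` stands in for links that would be read from `metadata_json`, each node has a single successor (a real controller method may touch several models), and `walkFromRoute` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/**
 * Walk route -> controller_method -> model -> table over an edge map.
 * Keys and values use a "type:name" convention assumed for this sketch.
 *
 * @param array<string, string> $edges
 * @return list<string> visited nodes, starting with the route
 */
function walkFromRoute(string $routeName, array $edges): array
{
    $node = "route:{$routeName}";
    $path = [$node];

    // At most three hops, which also guards against cycles in bad metadata.
    for ($hop = 0; $hop < 3 && isset($edges[$node]); $hop++) {
        $node = $edges[$node];
        $path[] = $node;
    }
    return $path;
}

$edges = [
    'route:invoices.store' => 'controller_method:InvoiceController@store',
    'controller_method:InvoiceController@store' => 'model:Invoice',
    'model:Invoice' => 'table:invoices',
];
$path = walkFromRoute('invoices.store', $edges);
```

The resolved nodes can then narrow the vector search to chunks whose metadata references those symbols.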
8. Debug & observability
- Every retrieval emits a `rag_search_started` and `rag_search_completed` event.
- `payload_json` includes the query, top-k IDs, scores, and which chunks were chosen for the prompt.
- Console UI shows expandable chunks (path + line range + snippet).
9. Per-run RAG toggle
- Each run has `metadata_json.rag_enabled` (default true).
- When false, the orchestrator skips retrieval and emits an explicit `ai_context_loaded` event with `{ rag: false }`.
- The agent's system prompt acknowledges "RAG is OFF for this run" so it does not hallucinate references.
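A rough sketch of the toggle in the orchestrator, assuming run metadata is available as an array and retrieval is passed in as a callable. The return shape (`event`, `chunks`, `system_note`) and the name `loadContext` are invented for illustration; only the `ai_context_loaded` payload and the "RAG is OFF" wording come from the rules above.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether to run retrieval for a run.
 * Hypothetical return shape; events for the enabled path are out of scope here.
 *
 * @param array<string, mixed> $runMetadata decoded metadata_json of the run
 * @param callable(): array $retrieve       performs the actual RAG lookup
 */
function loadContext(array $runMetadata, callable $retrieve): array
{
    $ragEnabled = $runMetadata['rag_enabled'] ?? true; // default is ON

    if ($ragEnabled === false) {
        return [
            'event' => ['type' => 'ai_context_loaded', 'payload' => ['rag' => false]],
            'chunks' => [],                            // retrieval is never called
            'system_note' => 'RAG is OFF for this run',
        ];
    }

    return [
        'event' => null,
        'chunks' => $retrieve(),
        'system_note' => null,
    ];
}
```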
10. Quality safeguards
- Stale detection: nightly job re-hashes every source file and invalidates chunks whose source is gone.
- De-duplication: identical `content_hash` across files is collapsed; metadata lists all paths.
- Tokenization budget: top-k chunks trimmed to fit ~40% of the model's context window. Remaining budget reserved for the user prompt and tool output.
- No blind guessing: if `chunks.length === 0` and confidence is below threshold, the agent must ask the user a clarifying question via `paused_for_input` rather than fabricate.
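The token budget rule can be sketched as a greedy cut over ranked chunks. Assumptions: each chunk carries a precomputed `tokens` count, chunks are already sorted by descending score, and trimming stops at the first chunk that does not fit rather than skipping it to try smaller ones. `trimToBudget` is an illustrative name.

```php
<?php
declare(strict_types=1);

/**
 * Keep the highest-ranked chunks that fit within a share of the context window.
 *
 * @param list<array{id: int, tokens: int}> $chunks assumed sorted by descending score
 * @return list<array{id: int, tokens: int}>
 */
function trimToBudget(array $chunks, int $contextWindowTokens, float $share = 0.40): array
{
    $budget = (int) floor($contextWindowTokens * $share);
    $selected = [];
    $used = 0;

    foreach ($chunks as $chunk) {
        if ($used + $chunk['tokens'] > $budget) {
            break; // stop at the first chunk that would overflow the budget
        }
        $selected[] = $chunk;
        $used += $chunk['tokens'];
    }
    return $selected;
}
```

Stopping at the first overflow keeps the result a strict prefix of the ranking, which makes the "which chunks were chosen" debug output easier to reason about.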
11. Service skeleton
```php
final class RagContextService
{
    public function __construct(
        private EmbeddingClient $embeddings,
        private ChunkRepository $chunks,
        private ConsoleEventService $events,
    ) {}

    /** Retrieve ranked chunks for a query (see the retrieval contract). */
    public function retrieve(Project $project, string $query, array $opts = []): RagResult { /* … */ }

    /** Chunk, hash, and embed a single file. */
    public function indexFile(Project $project, string $relativePath): void { /* … */ }

    /** Re-scan and re-index the whole project. */
    public function reindexProject(Project $project): void { /* … */ }

    /** Invalidate the chunks stored for a source path. */
    public function invalidate(Project $project, string $sourcePath): void { /* … */ }
}
```