04-RAG-System

The RAG system is the agent's memory of the project. A retrieval pass is mandatory before any plan or edit, unless the user explicitly turns RAG OFF for a run (see section 9).

1. What gets indexed

  • routes (URI, method, name, controller, middleware, params)
  • controllers and controller methods
  • models and their relationships
  • migrations and resulting tables/columns/indexes/FKs
  • services, jobs, events, listeners, policies, middleware
  • form requests, API resources
  • config files
  • composer.json / package.json / Podfile / Package.swift
  • README files, all docs/*.md
  • .cursor/rules/*.mdc
  • .claude/*.md
  • known API endpoints (from OpenAPI if present)
  • previous run summaries (final_summary events)
  • previous file changes (path, action, summary)
  • known bugs (from docs/KNOWN_BUGS.md if present)
  • architectural decisions (ADRs in docs/adr/*.md)

2. Storage shape

See 03 — Database Migrations → rag_chunks.

Key columns:

  • source_type: route | controller | controller_method | model | model_relation | migration | table | service | job | event | listener | policy | middleware | request | resource | config | package | readme | doc | cursor_rule | claude_rule | api_endpoint | run_summary | file_change | known_bug | adr
  • metadata_json includes: framework, file path, line range, language, related symbols, related routes, related tables.
  • content_hash: sha256 of the normalized chunk_text. If the hash matches the stored chunk for a given (project_id, source_path, symbol_name), skip re-embedding.
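
The content_hash check can be sketched as follows. This is a minimal sketch, not the project's implementation: the spec does not define "normalized", so normalizeChunkText() (unify line endings, strip trailing whitespace, trim) is an assumption, and contentHash() and shouldReembed() are hypothetical helper names.

```php
<?php
declare(strict_types=1);

// Assumed normalization rules; the spec only says "normalized chunk_text".
function normalizeChunkText(string $text): string
{
    // Unify line endings so CRLF and LF checkouts hash identically.
    $text = str_replace(["\r\n", "\r"], "\n", $text);
    // Drop trailing spaces and tabs on every line.
    $text = preg_replace('/[ \t]+$/m', '', $text) ?? $text;
    return trim($text);
}

// sha256 of the normalized text, as a 64-character hex string.
function contentHash(string $chunkText): string
{
    return hash('sha256', normalizeChunkText($chunkText));
}

// $storedHash is the hash already stored for the same
// (project_id, source_path, symbol_name), or null when no row exists.
function shouldReembed(?string $storedHash, string $chunkText): bool
{
    return $storedHash === null
        || !hash_equals($storedHash, contentHash($chunkText));
}
```

With these rules, a file that only changed its line endings or trailing whitespace produces the same hash and is not re-embedded.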

3. Chunking strategy

Chunking is language-aware: each source type is split along its own structural units, as listed below.

| Source | Chunk unit | Notes |
| --- | --- | --- |
| PHP class | per public method + one class header chunk | Include namespace, class doc, signature. |
| PHP migration | per up() table | Capture column types, indexes, FKs. |
| Routes | per route group, then per route | One chunk per HTTP route is the most useful. |
| Markdown docs | per H2/H3 section, with overlap | 200-token overlap. |
| Config | per top-level key | One chunk per [key => array] entry. |
| Swift | per struct/class/view | Use SwiftSyntax-style splits (regex acceptable v1). |
| JS/TS | per exported symbol | ts-morph or regex-based. |

Global limits: chunk text 200–1200 tokens, min overlap 100 tokens for prose, no overlap for code.
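
The Markdown rule (one chunk per H2/H3 section, with overlap) might look like the sketch below. It makes two assumptions: whitespace-separated words stand in for tokens, since the spec does not name a tokenizer, and the overlap is kept in a separate field rather than prepended to the text. chunkMarkdown() is a hypothetical name, and the 200–1200 token limits are not enforced here.

```php
<?php
declare(strict_types=1);

/**
 * Splits Markdown into one chunk per H2/H3 section and records the last
 * $overlap words of the previous section alongside each chunk.
 *
 * @return list<array{heading: string, overlap: string, text: string}>
 */
function chunkMarkdown(string $markdown, int $overlap = 200): array
{
    // Split before every line that starts with "## " or "### " (H4+ does not split).
    $sections = preg_split('/^(?=#{2,3} )/m', $markdown, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $chunks = [];
    $previousWords = [];

    foreach ($sections as $section) {
        $section = trim($section);
        if ($section === '') {
            continue;
        }
        // array_slice(..., -0) would return everything, so guard the zero case.
        $tail = $overlap > 0 ? array_slice($previousWords, -$overlap) : [];
        $chunks[] = [
            'heading' => explode("\n", $section, 2)[0],
            'overlap' => implode(' ', $tail),
            'text'    => $section,
        ];
        $previousWords = preg_split('/\s+/', $section, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    }

    return $chunks;
}
```

Any text before the first H2/H3 heading becomes its own chunk, with its first line used as the heading.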

4. Embedding pipeline

flowchart LR
	A["changed file"] --> B["Detector picks source_type"]
	B --> C["Chunker emits chunks"]
	C --> D["Normalize + content_hash"]
	D --> E{"hash exists?"}
	E -- yes --> X["skip"]
	E -- no --> F["OpenAI embeddings"]
	F --> G["Upsert rag_chunks row"]
	G --> H["emit rag_chunk_indexed event"]

  • Embedding model: text-embedding-3-large (configurable via agent_settings.openai_embedding_model).
  • Dimensions: 3072. Stored as vector(3072).
  • Re-embed only when content_hash changes.
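
The skip-or-embed branch of the pipeline can be illustrated with in-memory stand-ins. FakeEmbeddingClient, InMemoryChunkStore, and indexChunk() are hypothetical names used only for this sketch; the real pipeline would call the OpenAI embeddings API and upsert into rag_chunks, and trim() stands in for whatever normalization the project uses.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the embeddings client; counts calls instead of hitting the API.
final class FakeEmbeddingClient
{
    public int $calls = 0;

    /** @return list<float> */
    public function embed(string $text): array
    {
        $this->calls++;
        return [(float) strlen($text)]; // placeholder vector, not a real embedding
    }
}

// In-memory stand-in for rag_chunks, keyed by (project_id, source_path, symbol_name).
final class InMemoryChunkStore
{
    /** @var array<string, array{hash: string, vector: array}> */
    private array $rows = [];

    public function hashFor(string $key): ?string
    {
        return $this->rows[$key]['hash'] ?? null;
    }

    public function upsert(string $key, string $hash, array $vector): void
    {
        $this->rows[$key] = ['hash' => $hash, 'vector' => $vector];
    }
}

/** Returns 'skipped' or 'indexed', mirroring the two branches of the flowchart. */
function indexChunk(
    InMemoryChunkStore $store,
    FakeEmbeddingClient $client,
    int $projectId,
    string $sourcePath,
    string $symbolName,
    string $chunkText,
): string {
    $key = "{$projectId}|{$sourcePath}|{$symbolName}";
    $hash = hash('sha256', trim($chunkText)); // trim() is a placeholder for real normalization

    if ($store->hashFor($key) === $hash) {
        return 'skipped'; // unchanged content: no embedding call
    }

    $store->upsert($key, $hash, $client->embed($chunkText));
    return 'indexed'; // a real pipeline would emit rag_chunk_indexed here
}
```

Indexing the same chunk twice results in one embedding call; only a changed content_hash triggers a second one.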

5. Indexing triggers

  • Initial: on project ready state after clone.
  • Incremental: on file_created, file_updated, file_deleted events from a run.
  • Manual: POST /api/projects/{project}/index-rag re-scans and re-indexes everything.
  • Scheduled: nightly Horizon job ProjectIndexer::scan() for drift detection.
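
For the incremental trigger, one plausible mapping from run events to RagContextService methods is sketched below. The mapping itself is an assumption (the spec lists the events and the methods but not which event calls which), and actionForEvent() is a hypothetical helper.

```php
<?php
declare(strict_types=1);

// Maps a run event type to the name of the RagContextService method to call.
// Returns null for events that do not affect the index.
function actionForEvent(string $eventType): ?string
{
    return match ($eventType) {
        'file_created', 'file_updated' => 'indexFile',  // (re)chunk and embed the file
        'file_deleted'                 => 'invalidate', // drop chunks for the removed path
        default                        => null,
    };
}
```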

6. Retrieval contract

RagContextService::retrieve(Project $project, string $query, array $opts = []): RagResult

Options

  • top_k (default 12)
  • source_types (filter list)
  • must_include_paths (chunks from these paths are boosted in ranking)
  • route_name, controller_class, table_name (relationship-aware filters)
  • include_run_summaries (default true)
  • min_score (default 0.25)

Returns RagResult { chunks: RagChunk[], debug: { strategy, latency_ms, total_candidates, scores } }.
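
How the defaults, source_types, include_run_summaries, min_score, and top_k could interact is sketched below over in-memory candidates. Cosine similarity is an assumed scoring function (the spec does not name one), and DEFAULT_OPTIONS, cosineSimilarity(), and rankChunks() are hypothetical names; a production version would push the filtering and ordering into the vector query instead of PHP.

```php
<?php
declare(strict_types=1);

// Defaults taken from the options list above.
const DEFAULT_OPTIONS = [
    'top_k'                 => 12,
    'source_types'          => [],
    'include_run_summaries' => true,
    'min_score'             => 0.25,
];

function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }
    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0; // avoid division by zero for empty or zero vectors
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

/** @return list<array{id: int, score: float}> best match first */
function rankChunks(array $queryVector, array $candidates, array $options = []): array
{
    $options = array_merge(DEFAULT_OPTIONS, $options);
    $scored = [];

    foreach ($candidates as $chunk) {
        $type = $chunk['source_type'];
        if ($options['source_types'] !== [] && !in_array($type, $options['source_types'], true)) {
            continue;
        }
        if (!$options['include_run_summaries'] && $type === 'run_summary') {
            continue;
        }
        $score = cosineSimilarity($queryVector, $chunk['vector']);
        if ($score >= $options['min_score']) {
            $scored[] = ['id' => $chunk['id'], 'score' => $score];
        }
    }

    usort($scored, fn (array $x, array $y): int => $y['score'] <=> $x['score']);
    return array_slice($scored, 0, $options['top_k']);
}
```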

7. Relationship-aware retrieval

The service must support multi-hop queries the agent commonly needs:

  • "What controller serves route invoices.store and which model and table does it touch?" → resolved by walking route → controller_method → model → table through the related-symbol links in metadata_json, before any vector search runs.
  • "Which migrations created or altered table invoices?" → SQL filter on source_type='migration' + metadata affected_tables.
  • "What did previous runs change about InvoiceController?" → source_type='file_change' filter on path.
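
The route → controller_method → model → table walk from the first query can be sketched over an in-memory index. The 'related' key and the "type:name" node ids are a hypothetical shape for the related-symbol data in metadata_json; walkRoute() is an illustrative name, not part of the service contract.

```php
<?php
declare(strict_types=1);

/**
 * Follows route -> controller_method -> model -> table links and returns
 * the node found at each hop, stopping at the first missing link.
 *
 * @param array<string, array{related: array<string, string>}> $index
 * @return array<string, string> hop type => node id
 */
function walkRoute(array $index, string $routeName): array
{
    $order = ['route', 'controller_method', 'model', 'table'];
    $current = "route:{$routeName}";
    $path = [];

    foreach ($order as $position => $type) {
        if (!isset($index[$current])) {
            break; // node is not indexed: stop with what was resolved so far
        }
        $path[$type] = $current;

        $nextType = $order[$position + 1] ?? null;
        $next = $nextType === null ? null : ($index[$current]['related'][$nextType] ?? null);
        if ($next === null) {
            break; // end of the chain or a missing link
        }
        $current = $next;
    }

    return $path;
}
```

The resolved node ids can then be used as filters or boosts for the vector search that follows.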

8. Debug & observability

  • Every retrieval emits a rag_search_started and rag_search_completed event.
  • payload_json includes the query, top-k IDs, scores, and which chunks were chosen for the prompt.
  • Console UI shows expandable chunks (path + line range + snippet).
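
A rag_search_completed payload could be assembled as in the sketch below. The key names (top_k_ids, scores, chosen_ids) are assumptions; the spec only lists what payload_json must contain, not its exact shape.

```php
<?php
declare(strict_types=1);

/**
 * Builds the JSON payload for a rag_search_completed event.
 *
 * @param list<array{id: int, score: float}> $scored    ranked candidates
 * @param list<int>                          $chosenIds chunks that went into the prompt
 */
function buildSearchCompletedPayload(string $query, array $scored, array $chosenIds): string
{
    return json_encode([
        'query'      => $query,
        'top_k_ids'  => array_column($scored, 'id'),
        'scores'     => array_column($scored, 'score', 'id'), // chunk id => score
        'chosen_ids' => $chosenIds,
    ], JSON_THROW_ON_ERROR);
}
```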

9. Per-run RAG toggle

  • Each run has metadata_json.rag_enabled (default true).
  • When false, the orchestrator skips retrieval and emits an explicit ai_context_loaded event with { rag: false }.
  • The agent's system prompt acknowledges "RAG is OFF for this run" so it does not hallucinate references.
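
The toggle can be sketched as a single branch in the orchestrator. loadContext() and the returned array shape are hypothetical; the { rag: true } payload for the enabled case is also an assumption, since the spec only describes the event emitted when RAG is off.

```php
<?php
declare(strict_types=1);

/**
 * Decides whether to run retrieval for a run.
 *
 * @param array    $runMetadata decoded metadata_json of the run
 * @param callable $retrieve    performs the actual retrieval and returns chunks
 */
function loadContext(array $runMetadata, callable $retrieve): array
{
    $ragEnabled = $runMetadata['rag_enabled'] ?? true; // default true per the spec

    if (!$ragEnabled) {
        return [
            'event'       => ['type' => 'ai_context_loaded', 'payload' => ['rag' => false]],
            'chunks'      => [],
            'system_note' => 'RAG is OFF for this run',
        ];
    }

    return [
        'event'       => ['type' => 'ai_context_loaded', 'payload' => ['rag' => true]],
        'chunks'      => $retrieve(),
        'system_note' => null,
    ];
}
```

Passing retrieval as a callable keeps the disabled path free of any embedding or database calls.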

10. Quality safeguards

  • Stale detection: nightly job re-hashes every source file and invalidates chunks whose source is gone.
  • De-duplication: identical content_hash across files is collapsed; metadata lists all paths.
  • Token budget: the top-k chunks are trimmed to fit roughly 40% of the model's context window. The remaining budget is reserved for the user prompt and tool output.
  • No blind guessing: if chunks.length === 0 and confidence is below threshold, the agent must ask the user a clarifying question via paused_for_input rather than fabricate.
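
De-duplication and the token budget can be sketched as two small passes over ranked chunks. This assumes each chunk already carries a token count and that trimming stops at the first chunk that would exceed the budget; collapseDuplicates() and trimToBudget() are hypothetical names.

```php
<?php
declare(strict_types=1);

/** Collapses chunks with the same content_hash and lists every path that held them. */
function collapseDuplicates(array $chunks): array
{
    $byHash = [];
    foreach ($chunks as $chunk) {
        $hash = $chunk['content_hash'];
        if (!isset($byHash[$hash])) {
            $byHash[$hash] = ['content_hash' => $hash, 'tokens' => $chunk['tokens'], 'paths' => []];
        }
        $byHash[$hash]['paths'][] = $chunk['path'];
    }
    return array_values($byHash);
}

/** Keeps ranked chunks, best first, until the share of the context window is used up. */
function trimToBudget(array $rankedChunks, int $contextWindow, float $share = 0.4): array
{
    $budget = (int) floor($contextWindow * $share);
    $kept = [];
    $used = 0;

    foreach ($rankedChunks as $chunk) {
        if ($used + $chunk['tokens'] > $budget) {
            break; // the next chunk would overflow the RAG share of the window
        }
        $kept[] = $chunk;
        $used += $chunk['tokens'];
    }

    return $kept;
}
```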

11. Service skeleton

final class RagContextService {
	public function __construct(
		private EmbeddingClient $embeddings,
		private ChunkRepository $chunks,
		private ConsoleEventService $events,
	) {}

	// Retrieval contract from section 6: returns the selected chunks plus debug data.
	public function retrieve(Project $project, string $query, array $opts = []): RagResult { /* … */ }
	// Indexes a single file, path relative to the project root.
	public function indexFile(Project $project, string $relativePath): void { /* … */ }
	// Re-scans and re-indexes the whole project.
	public function reindexProject(Project $project): void { /* … */ }
	// Invalidates the chunks stored for one source path.
	public function invalidate(Project $project, string $sourcePath): void { /* … */ }
}