17-Roadmap

A pragmatic, phased plan to take this spec from zero to a working iPhone-driven AI coding workspace. Each phase has a clear exit criterion. Build vertically before horizontally: get one thin slice working end to end before widening any single layer.

Phase 0 — Bootstrapping (1–2 days)

Goal: an empty Laravel 11 app boots with Postgres, Redis, Horizon, Sanctum, and Pest wired in.

Create Laravel 11 project, configure Pint + strict types.

Configure PostgreSQL 16 + pgvector extension.

Configure Redis 7 + Horizon supervisor with agents-default, agents-long, rag-index, docs queues.

Configure Sanctum with ability scopes.

Configure Pest + first smoke test.

Stand up docs/ skeleton (PROJECT_CONTEXT, SECURITY, GIT_WORKFLOW empty stubs with auto-blocks).

Exit: php artisan horizon runs; GET /api/auth/me works with a seeded user.

Phase 1 — Projects & workspace (3–4 days)

Goal: a user can connect a Git repo and the system clones it into an isolated workspace.

Migrations: projects, project_connections.

ProjectService (create, validate repo URL, queue clone).

WorkspaceService (sandbox + path resolution + read/list/applyChange APIs).

GitService::clone() + pull() + status().

Framework / language detection (heuristic from key files).

API: POST /api/projects, GET /api/projects, POST /api/projects/{project}/clone, and /api/projects/{project}/pull.

Exit: cloning a public GitHub repo produces an isolated, lockable workspace and a project row.
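The sandbox and path-resolution duty of WorkspaceService can be illustrated with a minimal sketch. It is written in Python for brevity (the real service would be PHP), and the names resolve_in_workspace and PathEscapesWorkspace are hypothetical, not part of the spec. The idea is to resolve every requested path against the workspace root and refuse anything that lands outside it.

```python
from pathlib import Path


class PathEscapesWorkspace(ValueError):
    """Raised when a requested path would resolve outside the workspace root."""


def resolve_in_workspace(root: str, relative: str) -> Path:
    """Resolve `relative` under `root`, refusing anything that escapes it."""
    root_path = Path(root).resolve()
    # resolve() collapses ".." segments and follows existing symlinks,
    # so the containment check below sees the real target.
    candidate = (root_path / relative).resolve()
    if candidate != root_path and root_path not in candidate.parents:
        raise PathEscapesWorkspace(f"{relative!r} resolves outside the workspace")
    return candidate
```

An absolute input such as /etc/passwd is rejected as well, because joining an absolute path onto the root discards the root. Running read, list, and applyChange through one such resolver keeps the containment rule in a single place.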

Phase 2 — Events backbone (2 days)

Goal: the audit ledger is alive and streams.

Migrations: agent_events.

ConsoleEventService::emit() with Redis fan-out.

API: GET /api/runs/{run}/events and GET /api/runs/{run}/events/stream (SSE).

Web-side smoke client (curl + eventsource) to verify SSE.

Exit: an event emitted from a tinker command shows up in real time on the SSE stream.
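For reference, this is roughly what one frame on the SSE stream could look like. The sketch is in Python rather than PHP, and format_sse and the payload shape are assumptions; only the id/event/data field names and the blank-line terminator come from the Server-Sent Events wire format.

```python
import json


def format_sse(event_id: int, event_type: str, payload: dict) -> str:
    """Serialize one event as a Server-Sent Events frame."""
    # Compact JSON never contains a raw newline, so one data: line is enough.
    data = json.dumps(payload, separators=(",", ":"))
    # The blank line at the end terminates the frame.
    return f"id: {event_id}\nevent: {event_type}\ndata: {data}\n\n"


frame = format_sse(42, "file_updated", {"path": "routes/api.php"})
```

Sending the agent_events row id as the SSE id would let a reconnecting client resume through the Last-Event-ID header, which matters on a phone that drops connections often.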

Phase 3 — RAG indexer (4–6 days)

Goal: a cloned project is indexed; retrieval works.

Migration: rag_chunks + pgvector column + ivfflat index.

OpenAi\EmbeddingClient + mock for tests.

Chunkers for: routes, controllers (methods), models, migrations, services, config, markdown docs, composer/package files.

ProjectIndexerJob (full scan) + incremental indexing on file events.

RagContextService::retrieve() with relationship filters.

API: POST /api/projects/{project}/index-rag.

Exit: a freshly cloned Laravel app has > 90% of routes/controllers/models indexed; retrieval returns coherent chunks for sample queries.
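The retrieval step can be sketched in memory to show the intended behavior: filter by chunk kind, then rank by cosine distance. In production this ordering would presumably be done by pgvector (its <=> operator is cosine distance) rather than in application code. The Chunk fields and the retrieve signature below are illustrative assumptions, written in Python for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str
    kind: str  # e.g. "route", "controller", "model"
    embedding: list


def cosine_distance(a, b):
    """Return 1 - cosine similarity; zero-length vectors count as maximally far."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def retrieve(query, chunks, kinds=None, k=3):
    """Return the k chunks closest to `query`, optionally restricted by kind."""
    candidates = [c for c in chunks if kinds is None or c.kind in kinds]
    return sorted(candidates, key=lambda c: cosine_distance(query, c.embedding))[:k]
```

A mock EmbeddingClient that returns fixed vectors makes this kind of ranking deterministic in tests.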

Phase 4 — Snapshots & file changes (3–4 days)

Goal: every file change is recorded and reversible.

Migrations: workspace_files, workspace_snapshots.

SnapshotService::create() (pre_run, pre_command, manual, final) + tarball to S3.

WorkspaceService::applyChange() writes workspace_files rows with hashes + diff.

SnapshotService::restore() via GitService::restoreCommit().

API: snapshot list, get, restore, compare.

Exit: a manually triggered file edit through the API creates a workspace_files row + a snapshot; restore returns the workspace to the previous state.
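A rough picture of what applyChange() might record per edit, sketched in Python. The field names (hash_before, hash_after, diff) are assumptions about the workspace_files columns, and SHA-256 is one reasonable choice of hash, not something the spec fixes.

```python
import difflib
import hashlib


def describe_change(path: str, before: str, after: str) -> dict:
    """Build the audit record for one file edit: content hashes plus a unified diff."""
    diff = "".join(
        difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )
    return {
        "path": path,
        "hash_before": hashlib.sha256(before.encode("utf-8")).hexdigest(),
        "hash_after": hashlib.sha256(after.encode("utf-8")).hexdigest(),
        "diff": diff,
    }
```

Storing both hashes lets a later restore verify that the file on disk still matches the recorded state before rewriting it.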

Phase 5 — Agent provider scaffolding (3–5 days)

Goal: provider abstraction with at least OpenAI implemented.

Migrations: agent_providers, agent_settings.

AgentProviderInterface + ProviderEvent + AgentProviderResolver.

OpenAIAgentService against Responses API with streaming + tool calls.

ToolSchemaCompiler (PHP attributes → OpenAI tool spec).

Tool implementations: read_file, list_files, apply_patch, rag_search, ask_user, finish.

Exit: a hand-written prompt run against a test project completes end to end, producing a file_updated event and a final_summary.
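The job of ToolSchemaCompiler, turning declarative tool metadata into a function-tool spec, could look roughly like this. The sketch uses Python dataclasses in place of PHP attributes; ToolParam, ToolDefinition, and compile_tool are hypothetical names, and the output envelope follows the flat function-tool shape as I understand it for the Responses API, which should be checked against the current OpenAI reference.

```python
from dataclasses import dataclass, field


@dataclass
class ToolParam:
    name: str
    type: str  # JSON Schema type, e.g. "string" or "integer"
    description: str
    required: bool = True


@dataclass
class ToolDefinition:
    name: str
    description: str
    params: list = field(default_factory=list)


def compile_tool(tool: ToolDefinition) -> dict:
    """Compile declarative metadata into a JSON-Schema-based function tool spec."""
    return {
        "type": "function",
        "name": tool.name,
        "description": tool.description,
        "parameters": {
            "type": "object",
            "properties": {
                p.name: {"type": p.type, "description": p.description}
                for p in tool.params
            },
            "required": [p.name for p in tool.params if p.required],
            "additionalProperties": False,
        },
    }
```

Keeping the metadata provider-neutral and compiling per provider is what would let Phase 8 reuse the same tool implementations for Claude.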

Phase 6 — Orchestrator + run lifecycle (3–4 days)

Goal: full state machine, branch-per-run, snapshots, pause/resume.

Migration: agent_runs.

AgentRunOrchestrator + RunAgentJob.

Pre-run sequence (lock + snapshot + branch).

Pause / resume / cancel / retry semantics with metadata_json.provider_state.

API: full runs surface (/api/runs, start, pause, resume, cancel, retry).

Exit: a user can start a run from the API, pause it mid-stream, resume with input, and complete it. All transitions audited.
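One way to make "all transitions audited" enforceable is a single transition function over an explicit table. The sketch below is in Python; the state names other than waiting_for_user are assumptions, as is the exact set of allowed edges, so treat the table as a placeholder for whatever the run-lifecycle spec defines.

```python
# Hypothetical transition table: state -> states it may move to.
ALLOWED_TRANSITIONS = {
    "queued": {"running", "cancelled"},
    "running": {"paused", "waiting_for_user", "completed", "failed", "cancelled"},
    "paused": {"running", "cancelled"},
    "waiting_for_user": {"running", "cancelled"},
    "failed": {"queued"},  # retry puts the run back on the queue
    "completed": set(),
    "cancelled": set(),
}


class InvalidTransition(Exception):
    """Raised when a run is asked to move along an edge the table does not allow."""


def transition(current: str, target: str, audit_log: list) -> str:
    """Validate the move, append an audit entry, and return the new state."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    audit_log.append({"from": current, "to": target})
    return target
```

Because every state change passes through one function, the audit entry cannot be forgotten by an individual endpoint.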

Phase 7 — Command execution + Git operations (3 days)

Goal: commands run safely; Git is integrated.

Migrations: command_executions, git_operations.

CommandExecutionService with allowlist/blocklist enforcement, stdout/stderr to S3.

Destructive-marker auto-snapshot.

GitService::createBranch/checkout/diff/commit/restoreCommit.

Push gating (token ability + safety setting + approved review).

API: POST /api/runs/{run}/commit, push, approve, reject.

Exit: an agent can run vendor/bin/pest, see results in events, commit to its branch, and have push refused by default.
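Allowlist and blocklist enforcement can be sketched as a pure check that runs before anything is executed. This is Python for illustration only; the specific entries in both lists and the check_command name are assumptions, chosen so that vendor/bin/pest passes and git push is refused, matching the exit criterion above.

```python
import shlex

# Hypothetical policy values; the real lists would come from safety settings.
ALLOWED_EXECUTABLES = {"vendor/bin/pest", "php", "composer", "git"}
BLOCKED_PREFIXES = [("git", "push"), ("php", "artisan", "migrate:fresh")]
SHELL_METACHARACTERS = set(";|&`$><")


def check_command(command: str):
    """Return (allowed, reason) for a command line, without executing it."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False, "shell metacharacters are not permitted"
    try:
        argv = shlex.split(command)
    except ValueError:
        return False, "command could not be parsed"
    if not argv or argv[0] not in ALLOWED_EXECUTABLES:
        return False, "executable is not on the allowlist"
    for prefix in BLOCKED_PREFIXES:
        if tuple(argv[: len(prefix)]) == prefix:
            return False, "command matches the blocklist"
    return True, "ok"
```

Checking a parsed argument vector and then executing that same vector without a shell avoids the gap between what was approved and what actually runs.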

Phase 8 — Claude API + Claude Code providers (3–4 days)

Goal: provider parity.

ClaudeAgentService using Anthropic Messages API.

ClaudeCodeAgentService wrapping the CLI in JSON output mode (--output-format json or stream-json) with sandboxed env.

Provider-event normalization tests for all three providers.

.claude/ doc templates committed and auto-synced.

Exit: switching provider in selected_model produces equivalent behavior on the same prompt.
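Provider-event normalization amounts to mapping each provider's streaming payloads onto one ProviderEvent shape. The Python sketch below covers only text deltas. The raw event types shown (response.output_text.delta for OpenAI, content_block_delta with a text_delta for Anthropic) reflect my understanding of the two streaming formats and should be verified against current docs; the ProviderEvent fields and provider labels are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderEvent:
    provider: str
    kind: str
    text: str


def normalize(provider: str, raw: dict) -> Optional[ProviderEvent]:
    """Map a raw streaming payload to a ProviderEvent, or None if it is not a text delta."""
    if provider == "openai" and raw.get("type") == "response.output_text.delta":
        return ProviderEvent(provider, "text_delta", raw["delta"])
    if provider == "claude" and raw.get("type") == "content_block_delta":
        delta = raw.get("delta", {})
        if delta.get("type") == "text_delta":
            return ProviderEvent(provider, "text_delta", delta["text"])
    return None
```

A normalization test can then feed recorded payloads from each provider through normalize() and assert that the resulting kind and text sequences match.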

Phase 9 — Documentation auto-update (2–3 days)

Goal: every completed run updates docs and Cursor/Claude mirrors.

Migration: documentation_updates.

DocumentationAutoUpdateService with affected-doc detection, marker-aware writer, secret redaction.

ProjectMapService + SchemaAnalyzerService (parse routes, controllers, models, migrations).

CHANGELOG_AI.md append-only writer.

Exit: a run that adds a route + controller + migration produces correct rows in PROJECT_MAP.md and a CHANGELOG_AI.md entry, with .claude/ mirrors updated.
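A marker-aware writer replaces only the text between a pair of auto-block markers and leaves hand-written content alone. The spec does not show the marker syntax, so the HTML-comment markers below are purely hypothetical, as is the replace_auto_block name; the sketch is in Python for brevity.

```python
import re


def replace_auto_block(document: str, name: str, new_body: str) -> str:
    """Replace the body of exactly one named auto-block, keeping everything else."""
    start = f"<!-- AUTO:START {name} -->"
    end = f"<!-- AUTO:END {name} -->"
    pattern = re.compile(
        "(" + re.escape(start) + r"\n).*?(" + re.escape(end) + ")",
        re.DOTALL,
    )
    # A function replacement avoids backslash handling in new_body.
    updated, count = pattern.subn(
        lambda m: m.group(1) + new_body.rstrip("\n") + "\n" + m.group(2),
        document,
    )
    if count != 1:
        raise ValueError(f"expected exactly one {name!r} block, found {count}")
    return updated
```

Failing loudly when the block is missing or duplicated is deliberate: silently appending would let generated text drift into hand-written sections.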

Phase 10 — Server params UI + Settings API (1–2 days)

Goal: settings are editable, secrets masked.

ServerParams repository + custom config repository binding.

Encryption cast for secret rows.

API: /api/settings/server-params (GET masked, PUT partial), /api/settings/providers.

Step-up auth flow for secret writes.

Exit: rotating the OpenAI key via the API takes effect without a restart, and the API never echoes the value.
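The masked GET could be implemented as a presentation step that never copies a secret value into the response at all. The row fields (key, value, is_secret) and the is_set flag in this Python sketch are assumptions about the server-params shape.

```python
def present_params(rows: list) -> list:
    """Shape server params for the API; secret values are dropped, not obscured."""
    presented = []
    for row in rows:
        if row["is_secret"]:
            presented.append({
                "key": row["key"],
                "is_secret": True,
                "is_set": bool(row["value"]),  # tells the client a value exists
                "value": None,
            })
        else:
            presented.append({
                "key": row["key"],
                "is_secret": False,
                "is_set": row["value"] is not None,
                "value": row["value"],
            })
    return presented
```

Returning is_set instead of a partially masked string means no fragment of the key reaches the client or its logs.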

Phase 11 — iPhone client v1 (1–2 weeks)

Goal: a polished iPhone app driving the API.

Project bootstrap (Xcode, SPM, Tuist optional).

APIClient + SSEClient + Keychain token storage.

Screens: Login, Projects, Project Detail, New Run, Run Detail (Console + Files + Snapshots + Summary), Diff Viewer, Snapshots, Docs Browser, Settings.

Mermaid renderer via WKWebView for diagrams.

Snapshot + UI tests for critical flows.

Exit: end-to-end demo on a real iPhone: connect repo → start run → watch console → answer a waiting_for_user → review diff → commit on branch.

Phase 12 — Hardening & v1 release (1 week)

Load test SSE with 50 concurrent runs.

Disk quota + cgroup limits in production.

Snapshot GC + log retention job.

Rate limiting and idempotency keys.

Final security review against 15 — Security Rules.

Cut v1.0.0.

Post-v1 backlog

APNs notifications (13 §8).

Web console as a thin wrapper of the same API.

Multi-tenant / team model (v2).

Per-tenant KMS for agent_events.payload_json.

Cost dashboards + usage_daily materialized view.

Additional specialized agents (e.g. Migration, Refactor, Performance).

Optional inline IDE integrations (VS Code, Cursor) using the same API.

Estimated total

The phase estimates above sum to 38–54 working days, so roughly 8–11 weeks for v1 with a single full-stack engineer; faster with two engineers (one backend, one iPhone).

Done-definition for every phase

Migrations are reversible.

Each service has Pest unit tests + at least one feature test through the HTTP layer.

Logs are clean of secrets.

A docs/CHANGELOG_AI.md entry exists (even if hand-written until Phase 9).

Cursor rules are updated when assumptions change.