# 07 – OpenAI Integration
OpenAI is used for two things: (1) chat agents and (2) embeddings for RAG. Both go through a thin, testable abstraction.
## 1. Clients

- `App\Services\Ai\OpenAi\OpenAiClient` – HTTP wrapper around the Responses API (chat + tools + streaming).
- `App\Services\Ai\OpenAi\EmbeddingClient` – wrapper around the Embeddings API.

Both read configuration from `config/agent_workspace.php`, backed by `agent_settings`.
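Both clients sit behind small contracts so tests can swap in the mock provider (§9). A minimal sketch of that abstraction; the interface and method names here are illustrative assumptions, not the real codebase API:

```php
// Hypothetical sketch of the thin client abstraction; interface and
// method names are assumptions, not the actual codebase contracts.
interface ChatClient
{
    /** @return iterable<array> stream of normalized provider events */
    public function respond(array $payload): iterable;
}

interface EmbeddingClientContract
{
    /** @param string[] $texts @return float[][] one vector per input text */
    public function embed(array $texts): array;
}
```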
## 2. Configuration keys (server params)

| Key | Default | Notes |
|---|---|---|
| `openai.api_key` | – | Secret. Encrypted at rest. |
| `openai.base_url` | `https://api.openai.com/v1` | Override for Azure OpenAI / proxies. |
| `openai.organization` | `null` | Optional org id. |
| `openai.chat_model` | `gpt-4.1` | Default chat model for the OpenAI provider. |
| `openai.embedding_model` | `text-embedding-3-large` | 3072-dimensional vectors. |
| `openai.tools_streaming` | `true` | Use SSE streaming for tool calls. |
| `openai.max_output_tokens` | `4096` | Per agent message. |
| `openai.temperature` | `0.2` | Conservative default for code. |
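The keys above could map to a `config/agent_workspace.php` fragment like the following. This is an illustrative sketch: the env var names are assumptions, and runtime values actually come from `agent_settings` rather than only `.env`.

```php
// Illustrative config fragment; env var names are assumptions.
// Fallbacks mirror the documented defaults in the table above.
return [
    'openai' => [
        'api_key'           => env('OPENAI_API_KEY'),
        'base_url'          => env('OPENAI_BASE_URL', 'https://api.openai.com/v1'),
        'organization'      => env('OPENAI_ORGANIZATION'),
        'chat_model'        => env('OPENAI_CHAT_MODEL', 'gpt-4.1'),
        'embedding_model'   => env('OPENAI_EMBEDDING_MODEL', 'text-embedding-3-large'),
        'tools_streaming'   => (bool) env('OPENAI_TOOLS_STREAMING', true),
        'max_output_tokens' => (int) env('OPENAI_MAX_OUTPUT_TOKENS', 4096),
        'temperature'       => (float) env('OPENAI_TEMPERATURE', 0.2),
    ],
];
```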
## 3. Chat call shape (provider integration)

`OpenAIAgentService` builds a single Responses API call:

```php
$response = $openai->responses()->create([
    'model' => $model,
    'input' => [
        ['role' => 'system', 'content' => $systemPrompt],
        ['role' => 'user', 'content' => $userPrompt],
        ...$priorTurns,
    ],
    'tools' => $toolSchema, // OpenAI tool spec built from the tool surface
    'stream' => true,
    'temperature' => $settings->openai_temperature,
    'max_output_tokens' => $settings->openai_max_output_tokens,
]);
```

The stream is consumed and normalized into `ProviderEvent`s. Tool calls are returned to the orchestrator, executed, and the result is fed back via `tool_result` continuations.
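The normalization step can be sketched as a small SSE-to-event translator. The Responses API event names used here (`response.output_text.delta`, `response.function_call_arguments.done`) are real streaming event types, but the normalized event shapes are illustrative assumptions, not the orchestrator's exact `ProviderEvent` contract:

```php
// Sketch: turn raw SSE lines into normalized provider events.
// Output array shapes are assumptions for illustration only.
function normalizeSseLines(iterable $lines): iterable
{
    foreach ($lines as $line) {
        if (!str_starts_with($line, 'data: ')) {
            continue; // skip SSE comments and keep-alives
        }
        $json = substr($line, 6);
        if ($json === '[DONE]') {
            yield ['type' => 'done'];
            continue;
        }
        $chunk = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
        yield match ($chunk['type'] ?? '') {
            'response.output_text.delta'
                => ['type' => 'text_delta', 'text' => $chunk['delta']],
            'response.function_call_arguments.done'
                => ['type' => 'tool_call', 'call' => $chunk],
            default
                => ['type' => 'raw', 'chunk' => $chunk],
        };
    }
}
```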
## 4. Tool schema generation

The tool surface (05 – Agent Providers, §4) is exposed as OpenAI tool definitions. Example:

```json
{
  "type": "function",
  "name": "rag_search",
  "description": "Search project context using RAG.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": { "type": "string" },
      "top_k": { "type": "integer", "minimum": 1, "maximum": 30 },
      "source_types": { "type": "array", "items": { "type": "string" } }
    },
    "required": ["query"]
  }
}
```

Schemas are generated from PHP-attribute-decorated tool classes (`#[ToolName('rag_search')]`, `#[ToolParam(...)]`) by `ToolSchemaCompiler`.
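A sketch of how `ToolSchemaCompiler` might read those attributes via reflection. The attribute class shape and the helper function are simplified assumptions; the real compiler would also fold `#[ToolParam(...)]` metadata into the `parameters` schema:

```php
// Simplified attribute; the real #[ToolName] likely carries more metadata.
#[Attribute(Attribute::TARGET_CLASS)]
class ToolName
{
    public function __construct(public string $name) {}
}

#[ToolName('rag_search')]
class RagSearchTool
{
    // #[ToolParam(...)] attributes on properties would drive "parameters".
}

// Resolve the tool name for one decorated class (illustrative helper).
function compileToolName(string $class): string
{
    $ref = new ReflectionClass($class);

    return $ref->getAttributes(ToolName::class)[0]->newInstance()->name;
}
```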
## 5. Embeddings pipeline

```php
$batches = array_chunk($chunks, 96);
foreach ($batches as $batch) {
    $vectors = $embeddingClient->embed(array_map(fn ($c) => $c->text, $batch));
    $repo->upsert($project, $batch, $vectors);
}
```

Batch size is capped to keep request bodies under 1 MB. Failures are retried with exponential backoff and, if partial, recorded as `rag_search_completed` events with severity `warning`.
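The 96-item cap and the 1 MB body limit can be enforced together in one batching helper. This is an illustrative sketch (function name and exact byte budget are assumptions):

```php
// Sketch: batch texts for embedding, capping both item count and
// approximate request-body size. Hypothetical helper, not codebase API.
function batchForEmbedding(array $texts, int $maxItems = 96, int $maxBytes = 1_000_000): array
{
    $batches = [[]];
    $bytes = 0;

    foreach ($texts as $text) {
        $len  = strlen($text);
        $last = count($batches) - 1;

        // Start a new batch when either limit would be exceeded.
        if (count($batches[$last]) >= $maxItems || ($bytes + $len) > $maxBytes) {
            $batches[] = [];
            $bytes = 0;
            $last++;
        }

        $batches[$last][] = $text;
        $bytes += $len;
    }

    return $batches;
}
```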
## 6. Error handling
| Error | Action |
|---|---|
| 401 / 403 | Mark agent_settings.openai.api_key as invalid; surface to UI; fail the run with a clear message. |
| 429 | Exponential backoff (3 attempts) then emit error event and pause if mid-run. |
| 5xx | Retry with backoff. After 3 attempts, fail the run. |
| Timeout | Configurable per call (default 60 s chat, 30 s embeddings). |
| Tool argument schema mismatch | Re-prompt with a corrective system message; if it fails twice, fail run. |
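The retry rows above (429 and 5xx: three attempts with exponential backoff, then fail) can be sketched as a small wrapper. The function name and the exception-code convention are assumptions; the injectable sleeper exists so tests avoid real delays:

```php
// Sketch of the retry policy: up to $attempts tries with exponential
// backoff for retryable status codes; everything else rethrows at once.
function withBackoff(callable $call, int $attempts = 3, ?callable $sleep = null): mixed
{
    $sleep ??= fn (int $seconds) => sleep($seconds);

    for ($i = 0; $i < $attempts; $i++) {
        try {
            return $call();
        } catch (RuntimeException $e) {
            $retryable = in_array((int) $e->getCode(), [429, 500, 502, 503], true);
            if (!$retryable || $i === $attempts - 1) {
                throw $e; // non-retryable, or retries exhausted
            }
            $sleep(2 ** $i); // 1s, 2s, ...
        }
    }
}
```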
## 7. Cost & usage tracking

Every OpenAI request writes a usage line into `payload_json` of the wrapping event:

```json
{ "prompt_tokens": 1234, "completion_tokens": 567, "total_tokens": 1801, "model": "gpt-4.1" }
```

A nightly job aggregates per-project, per-day totals into a `usage_daily` materialized view (optional, v1.1).
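A sketch of building that usage line from a Responses API result. The mapping assumes the Responses API usage shape (`input_tokens` / `output_tokens`), renamed to the stored `prompt_tokens` / `completion_tokens` fields; the helper name is illustrative:

```php
// Sketch: extract the usage block from a decoded API response into the
// payload_json line shown above. Hypothetical helper, not codebase API.
function usageLine(array $response, string $model): array
{
    $u = $response['usage'] ?? [];

    return [
        'prompt_tokens'     => $u['input_tokens'] ?? 0,
        'completion_tokens' => $u['output_tokens'] ?? 0,
        'total_tokens'      => $u['total_tokens'] ?? 0,
        'model'             => $model,
    ];
}
```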
## 8. Security

- `openai.api_key` is encrypted using Laravel's encrypter at the model level (custom cast).
- It is never sent to the model as text, never logged, and never returned by APIs (always masked: `sk-…XXXX`).
- Outbound HTTP uses Laravel's `Http::withHeaders(['Authorization' => 'Bearer …'])`; the header is excluded from log channels.
- All OpenAI calls go through a single Guzzle client with a request-id middleware so failures can be correlated to events.
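The `sk-…XXXX` masking rule can be sketched as a one-line helper (name and exact format are assumptions; only the last four characters survive):

```php
// Sketch: mask an API key for UI/API responses, keeping the last 4 chars.
function maskApiKey(string $key): string
{
    return 'sk-…' . substr($key, -4);
}
```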
## 9. Mock provider (for tests)

`MockOpenAiClient` replays canned responses from `tests/Fixtures/openai/*.json`. The default test environment binds `OpenAiClient` to the mock. The embeddings mock returns deterministic vectors (seeded by a hash of the input text) so RAG tests are stable.
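The deterministic-vector trick can be sketched by seeding a PRNG from a hash of the input, so identical texts always embed to identical vectors (function name and dimension are illustrative):

```php
// Sketch: deterministic mock embedding, seeded by a hash of the text.
// Same input always yields the same vector, so test fixtures are stable.
function mockEmbedding(string $text, int $dims = 8): array
{
    mt_srand(crc32($text)); // hash-derived seed

    $vec = [];
    for ($i = 0; $i < $dims; $i++) {
        $vec[] = mt_rand() / mt_getrandmax(); // floats in [0, 1]
    }

    return $vec;
}
```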