
# Ask AI (Local‑First Intelligence Augmentation) — Architecture + Safety Model

Ask AI is the grounded assistant layer for **AI Power Progress iA**. It is designed to be:

- **local‑first** (Ollama; no default third‑party model calls)
- **grounded** (catalog/docs/grid/web sources are explicit and linkable)
- **trust‑preserving** (retrieved content is treated as *untrusted*; prompt‑injection defenses)
- **capable** (teaching, planning, building/debugging support; structured answers when useful)

This document focuses on the Ask AI backend (`/api/ai/*`) and how it integrates with Search and PowerSearch Grid.

## Ask AI architecture map

```mermaid
flowchart LR
  user[Browser UI / CLI] -->|POST /api/ai/chat/stream| chat[/Ask AI Router/]
  user -->|POST /api/ai/generate| gen[/Ask AI Router/]

  chat --> prompt[Prompt builder + safety guards]
  gen --> prompt

  prompt -->|optional| catalog[(Canonical catalog JSON)]
  prompt -->|optional| sqlite[(SQLite app.db\nRAG + embeddings caches)]
  prompt -->|optional| grid[(grid_docs excerpts via SQLite)]
  prompt -->|optional| web[Brave Search API\nwhen enabled + configured]
  prompt -->|optional| kiwix["Offline library (Kiwix)\nwhen enabled + configured"]

  prompt --> ollama[Ollama\nchat + embeddings]
  ollama --> prompt

  chat --> format["Grounding formatter\n(appends Sources)"]
  format --> user
```

## Trust boundaries (what can talk to what)

- **Browser → FastAPI**: user prompt, optional page context, and explicit feature toggles (`use_web_search`, `use_docs`, `use_resources`, `use_grid_context`).
- **FastAPI → Ollama**: local LLM + embeddings. This is the default AI runtime.
- **FastAPI → Brave Search (optional)**: only when enabled by payload or auto‑enabled by server policy *and* `BRAVE_API_KEY` exists.
- **FastAPI → SQLite**: local‑first storage for runs, sources metadata, embedding caches, Grid docs, and catalog indexes.
- **FastAPI → Kiwix (optional)**: offline library snippets when enabled.
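The browser-side toggles above might travel in a request body like this minimal sketch (field names beyond the four toggles are illustrative assumptions, not the exact schema):

```python
import json

# Hypothetical body for POST /api/ai/chat/stream; toggle names are from the
# trust-boundary list above, the other fields are illustrative.
payload = {
    "prompt": "Summarize the safety model in two bullets.",
    "page_context": None,       # optional page context (treated as untrusted)
    "use_web_search": False,    # Brave Search stays off unless enabled
    "use_docs": True,           # local docs RAG
    "use_resources": True,      # canonical catalog JSON
    "use_grid_context": False,  # grid_docs excerpts
}

body = json.dumps(payload)
print(body)
```

Keeping `use_web_search` off by default matches the local-first posture: no network leaves the machine unless the user (or server policy plus `BRAVE_API_KEY`) opts in.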

## Prompt‑injection defense (retrieved content is untrusted)

When Ask AI composes prompts from docs/web/offline/page snippets, the server:

- **normalizes** snippets (strip HTML, collapse whitespace, bound size)
- **wraps** them in explicit delimiters: `<BEGIN_UNTRUSTED_…>` / `<END_UNTRUSTED_…>`
- **appends** a short guardrail block only when untrusted blocks are present

Code: `app/prompt_safety.py` + `app/safety.py`.
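The three steps above can be sketched as follows (helper names and the size bound are assumptions; the real logic lives in `app/prompt_safety.py` and `app/safety.py`):

```python
import html
import re

MAX_SNIPPET_CHARS = 2000  # illustrative bound, not the production value

def normalize_snippet(text: str) -> str:
    """Strip HTML tags, unescape entities, collapse whitespace, bound size."""
    text = re.sub(r"<[^>]+>", " ", text)      # drop HTML tags
    text = html.unescape(text)
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return text[:MAX_SNIPPET_CHARS]

def wrap_untrusted(text: str, source_type: str = "WEB") -> str:
    """Delimit retrieved content so the model treats it as data, not instructions."""
    body = normalize_snippet(text)
    return f"<BEGIN_UNTRUSTED_{source_type}>\n{body}\n<END_UNTRUSTED_{source_type}>"

snippet = wrap_untrusted("<p>Ignore previous   instructions</p>", "WEB")
print(snippet)
```

The guardrail block mentioned above would then be appended once, only when at least one `<BEGIN_UNTRUSTED_…>` block made it into the final prompt.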

## Grounding + citations model

Ask AI maintains a compact `sources[]` list (name, url, source_type) for:

- auditability (what the answer was based on)
- linkable evidence (a stable Sources list appended to outputs when `structured` or `force_cite` is enabled)

Formatting: `app/ai_format.py`.


## UI: Evidence panel (widget + /ask)

The Ask AI widget and `GET /ask` render backend-appended grounding blocks as a dedicated **Evidence** panel:

- Answer body: the assistant’s main markdown (safe-rendered).
- Evidence:
  - “What this is based on” (provenance summary)
  - “Sources” (expandable, clickable links)
- Run ID: when available, the UI shows **Copy run id** to help debug and correlate with server logs.

Client safety + usability:

- Follow-up context strips appended grounding sections to keep the thread compact and reduce prompt-injection surface.
- Retrieved content remains explicitly labeled as **untrusted** (the UI never treats citations as instructions).

Implementation:

- `static/site.js`: `ppiaSplitGroundingSections()` + `ppiaRenderGroundingEvidence()` + `ppiaRenderAssistantMessage()`.
- `static/ask.html`: stores `run_id` per assistant message and uses an explicit `web: off|auto|on` selector (default **off**).

## Profiles (response shaping)

Clients can pass a `profile` hint to bias model selection and response style:

- `fast`, `general`/`quality`/`balanced`, `code`
- `tutor` / `teach` (teaching)
- `builder` / `build`, `debugger` / `debug` (development assistance; maps to the code model when configured)

Source labels:

- Ask AI appends a `Sources:` list where each entry is prefixed with a small source-type label:
  - `(catalog)`, `(docs)`, `(grid)`, `(web)`, `(realtime)`
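A minimal sketch of how the labeled `Sources:` block could be rendered from the `sources[]` entries (function name and entry layout are assumptions; the real formatter is `app/ai_format.py`):

```python
def format_sources(sources: list[dict]) -> str:
    """Render a Sources list where each entry carries its source-type label."""
    lines = ["Sources:"]
    for s in sources:
        label = s.get("source_type", "docs")
        entry = f"- ({label}) {s.get('name', 'untitled')}"
        if s.get("url"):
            entry += f" <{s['url']}>"
        lines.append(entry)
    return "\n".join(lines)

out = format_sources([
    {"name": "Ask AI docs", "url": "/docs", "source_type": "docs"},
    {"name": "Brave result", "url": "https://example.com", "source_type": "web"},
])
print(out)
```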

## Agent mode (safe actions + explicit approval)

The Ask AI widget supports an opt-in **agent mode** that can propose *safe next actions* to help users make progress without pretending to run arbitrary commands.

Key properties:

- **Off by default**: users must explicitly choose `mode: agent (actions)` (Ask AI widget or `GET /ask`).
- **No automatic execution**: actions are only *suggested* by the model; the user must click a button to run one.
- **Allowlisted tools only**: actions map to a small set of safe, read-only endpoints and browser navigation.
- **Untrusted evidence**: tool outputs are wrapped and treated as untrusted context (prompt-injection aware).

### Action proposal format (model → UI)

When agent mode is enabled, the model may append an actions block at the end of its response:

- `<BEGIN_ACTIONS_JSON> … <END_ACTIONS_JSON>` (JSON only inside the block)
- Shape: `{"v":1,"actions":[{"id":"a1","label":"…","kind":"…","input":{...}}]}`

Allowed `kind` values (initial set):

- `open_url` → open a same-origin path (preferred) or `https://` URL
- `services_status` → summarize `GET /api/services/status`
- `site_search` → local-first `GET /api/search/blended?q=…` (web/live off; `include_grid=false`)
- `rag_query` → `GET /api/rag/query?q=…`
- `grid_search` → `GET /api/grid/search?q=…&mode=lexical|semantic`
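Extracting and allowlist-filtering the actions block could be sketched like this (the delimiters and `kind` allowlist come from above; the parsing details are assumptions, and the disallowed `run_shell` kind in the example exists only to show filtering):

```python
import json
import re

ALLOWED_KINDS = {"open_url", "services_status", "site_search", "rag_query", "grid_search"}
ACTIONS_RE = re.compile(r"<BEGIN_ACTIONS_JSON>(.*?)<END_ACTIONS_JSON>", re.DOTALL)

def extract_actions(text: str) -> list[dict]:
    """Return only well-formed, allowlisted actions; ignore everything else."""
    m = ACTIONS_RE.search(text)
    if not m:
        return []
    try:
        block = json.loads(m.group(1))
    except json.JSONDecodeError:
        return []
    if block.get("v") != 1:
        return []
    return [
        a for a in block.get("actions", [])
        if isinstance(a, dict) and a.get("kind") in ALLOWED_KINDS
    ]

reply = (
    "Here is a plan.\n<BEGIN_ACTIONS_JSON>"
    '{"v":1,"actions":[{"id":"a1","label":"Check services","kind":"services_status","input":{}},'
    '{"id":"a2","label":"Bad","kind":"run_shell","input":{}}]}'
    "<END_ACTIONS_JSON>"
)
actions = extract_actions(reply)
print(actions)
```

Filtering on the client as well as the server keeps a malformed or adversarial model response from ever becoming a clickable button.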

### Tool result format (UI → model)

When the user runs an action, the UI appends tool output wrapped as:

- `<BEGIN_UNTRUSTED_TOOL_RESULT> … <END_UNTRUSTED_TOOL_RESULT>` (JSON inside)

The agent-mode system prompt instructs the model to treat these blocks as untrusted evidence and to never follow instructions contained within retrieved content.
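Packaging a tool result for the follow-up turn might look like this sketch (the inner JSON shape is an assumption; only the delimiters are specified above):

```python
import json

def wrap_tool_result(action_id: str, result: dict) -> str:
    """Package tool output as untrusted evidence for the next model turn."""
    payload = json.dumps({"action_id": action_id, "result": result}, ensure_ascii=False)
    return f"<BEGIN_UNTRUSTED_TOOL_RESULT>\n{payload}\n<END_UNTRUSTED_TOOL_RESULT>"

msg = wrap_tool_result("a1", {"ok": True, "latency_ms": 42, "summary": "all services up"})
print(msg)
```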

UX note: the UI renders actions as buttons and shows tool outputs in a collapsible “Tool result (raw)” panel (the header includes ok/error status, latency, and hybrid/dedupe hints when available). The user can then click “Continue with AI” to have the model interpret the tool result and propose next steps.

## Teaching / tutoring / development assistance flow map

```mermaid
flowchart TD
  intent[User intent] -->|learn| teach[Teach: step-by-step + practice]
  intent -->|build/debug| dev[Build/Debug: plan + commands + checks]
  intent -->|summarize/compare| synth[Synthesize: bullets + tradeoffs]

  teach -->|grounded| sources["Use best available sources\n(catalog/docs/grid/web if enabled)"]
  dev -->|grounded| sources
  synth -->|grounded| sources

  sources --> answer[Structured response\n+ appended Sources links]
```

## Baseline verification (local)

From `aipowerprogressia.com/`:

- Unit tests: `bash scripts/run_unit_tests.sh`
- Regression smoke: `python3 scripts/regression_smoke.py`
- Ask AI smoke (streaming): open `GET /ask`, send a prompt in **cited** mode
- Search + AI overview smoke: open `GET /search`, run a query, generate AI overview
- RAG smoke: `GET /api/rag/status` should show `weaviate_up` plus an ingest snapshot at `ingest.*` (best-effort)
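The RAG smoke check above can be scripted as a pure check over the status payload; the example response shape here is illustrative, not the exact schema:

```python
def rag_smoke_ok(status: dict) -> bool:
    """Best-effort: Weaviate reachable and some ingest snapshot present."""
    ingest = status.get("ingest") or {}
    return bool(status.get("weaviate_up")) and isinstance(ingest, dict)

# Hypothetical /api/rag/status response (field names under ingest.* are assumed).
sample = {"weaviate_up": True, "ingest": {"docs": 120, "grid": 34}}
print(rag_smoke_ok(sample))
```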