AI Power Progress iA
Back to Grid Download Markdown Docs are source-of-truth for trust + operations.

Docs (Markdown)

Tip: this is plain markdown in a <pre> block for maximum inspectability.

# PowerSearch Grid (MVP) — Architecture + Trust Model

PowerSearch Grid is the opt-in, privacy-first distributed search subsystem of **AI Power Progress iA**.

It provides:

- a control-plane (this FastAPI app) that **signs typed jobs** and stores state in SQLite
- edge agents that **poll** for work, **verify signatures**, and run only **whitelisted handlers**
- a local-first searchable index (`grid_docs` + FTS + optional embeddings) and a `/grid` product surface
- Ask AI grounding over Grid docs via bounded, permission-safe excerpts (no arbitrary URL fetch)

## Architecture map

```mermaid
flowchart LR
  browser[Browser /grid] -->|GET /grid| web[FastAPI static page]
  browser -->|Grid Search| api_search[/api/grid/search]
  browser -->|Ask AI over results| api_ai[/api/ai/chat/stream]
  browser -->|Trust metadata| api_assets[/api/grid/assets]
  browser -->|Cluster status| api_status[/api/grid/status]
  browser -->|Operator console| api_admin[/api/grid/admin/overview]

  api_search --> db[(SQLite app.db)]
  api_status --> db
  api_assets --> fs[(static/grid assets)]
  api_admin --> db
  api_ai --> db
  api_ai --> ollama[Ollama (local AI)\noptional]

  agent[Edge agent (Python)] -->|register| api_reg[/api/grid/nodes/register]
  agent -->|heartbeat| api_hb[/api/grid/nodes/{node_id}/heartbeat]
  agent -->|poll| api_poll[/api/grid/jobs/poll]
  agent -->|result| api_res[/api/grid/jobs/{job_id}/result]

  api_reg --> db
  api_hb --> db
  api_poll --> db
  api_res --> db

  api_submit[/api/grid/jobs/submit\n(admin)] --> db
  api_submit -->|sign manifest| sig[Ed25519 signing key\n(local file)]
  api_poll -->|serves signed manifest| sig
```

## Install / distribution flow (trust-first)

Primary conversion surface: `GET /grid` (`static/grid.html`).

The UX intentionally separates:

1. **Download Edge Agent** (inspectable source)
2. **Quick Install** (platform-specific commands)
3. **Verify / Checksums** (SHA-256 + optional signed manifest)

Key endpoints/assets:

- Agent: `static/grid/edge_agent.py`
- Verify helper: `static/grid/verify_grid_release.py` (verifies signed manifest + optional pinned key + local file checksums)
- Docs:
  - `GET /grid/docs` (inspectable markdown viewer)
  - `GET /grid/docs.md` (raw markdown download)
- Post-install node check:
  - `GET /api/grid/nodes/{node_id}/status_public` (minimal node status; uses `node_id` only, never the token)
- Linux installer: `static/grid/install_edge_agent.sh` (`--print-plan`, `--uninstall`)
  - Safe to re-run: preserves existing `agent.json` (node_id + policy) by default.
  - Upgrade-only mode: `--upgrade` (skips registration; restarts service).
  - Rotate node identity (policy preserved): `--re-register`.
  - Managed caps (Linux `systemd --user`): the generated unit applies `cpu_max_percent` → `CPUQuota` and `ram_max_gb` → `MemoryMax` on install/upgrade. Custom units are preserved.
  - Optional `GRID_DISPLAY_NAME=...` sets a human-readable node name at registration time (defaults to empty for privacy).
  - Supports pinned key verification: set `GRID_PUBKEY_FPR_SHA256=<sha256 fingerprint>` to enforce the signed release-manifest public key.
  - If `GRID_PUBKEY_FPR_SHA256` is set, the installer also writes a pinned key into `agent.json`:
    - `signing_public_key_fingerprint_sha256` (observed)
    - `signing_public_key_fingerprint_sha256_expected` (pinned; agent blocks work if it does not match)
- Uninstaller: `static/grid/uninstall_edge_agent.sh`
- Registration helper: `static/grid/register_node.py`
- Windows installer: `static/grid/install_edge_agent_windows.ps1`
  - Downloads + verifies `SHA256SUMS` for `edge_agent.py`, `register_node.py`, and `verify_grid_release.py`.
  - Verifies the signed release manifest (Ed25519) via `verify_grid_release.py` (optional pinned key via `GRID_PUBKEY_FPR_SHA256` / `-PubKeyFingerprintSha256`).
  - Safe to re-run: reuses existing `agent.json` by default; use `-ReRegister` to rotate node identity (policy preserved).
- Windows uninstaller: `static/grid/uninstall_edge_agent_windows.ps1`
  - Removes agent + config directories from the user profile (does not remove Python or pip packages).
- Optional background operation:
  - macOS: LaunchAgent template: `static/grid/powersearch-grid-agent.plist` (replace `__HOME__`, then load with `launchctl`).
  - Windows: Task Scheduler (recommended) — create a user logon task to run `python edge_agent.py --config agent.json` (copyable commands are provided on `/grid`).
- Trust metadata: `GET /api/grid/assets` (hashes + signed release manifest when crypto is available)
  - Includes a stable `release_id` for the current asset set (derived from the manifest content).
  - Includes `sha256sums_ok` + `sha256sums_note` so operators can detect a stale `static/grid/SHA256SUMS` file.
    - The API serves computed checksums for the currently-served assets; if the on-disk `SHA256SUMS` is stale, `sha256sums_ok=false`.
  - Release helper: run `python3 scripts/update_grid_sha256sums.py` whenever files in `static/grid/` change.
- Policy helpers:
  - `GET /api/grid/policy/default` (default conservative policy)
  - `POST /api/grid/policy/validate` (merge + validate a policy; used by the `/grid` policy editor)

## Agent doctor (read-only diagnostics)

`edge_agent.py` supports a local doctor mode for trust-first debugging:

```bash
~/.local/share/powersearch-grid-agent/venv/bin/python ~/.local/share/powersearch-grid-agent/edge_agent.py \
  --config ~/.config/powersearch-grid/agent.json \
  --doctor
```

Doctor checks include:
- config validity + policy parse
- pinned signing key match (if configured)
- control-plane reachability (`/health`, `/api/grid/status`, `status_public`)
- Ollama reachability (best-effort)

## Trust + consent model (MVP)

Core invariants:

- **Off by default:** `policy.enabled=false` until the user explicitly opts in.
- **Emergency stop (local):** if `policy.emergency_stop_enabled=true`, creating an `EMERGENCY_STOP` file next to `agent.json` pauses work immediately (delete it to resume).
- **Pull-only:** agents poll the control-plane; no peer-to-peer pushes.
- **Typed jobs only:** `GRID_JOB_TYPES = {health_check, crawl_url, ollama_chat}`.
- **Signed manifests:** edge agents verify Ed25519 signatures on job manifests.
- **Whitelisted handlers:** the agent executes only hard-coded handlers for allowed job types.
- **Minimal telemetry (honest):** agents send heartbeats (coarse CPU/RAM/disk + agent version; no hostname by default) and job results. For `crawl_url`, the agent fetches **text content only** (robots-respecting, bounded size/time, allowlisted targets) and uploads page text for indexing. The control-plane stores the page text in `grid_docs` for search; `grid_job_results` stores metadata only (`content_omitted=true`) to keep the DB compact.
- **SSRF defense-in-depth:** for `crawl_url`, both control-plane and agents enforce domain allowlists and block private IPs by default.
- **Rate limiting:** per-node throttles on poll/heartbeat/result (token-authenticated) and per-IP throttles on public endpoints like `/api/grid/search`.

Policy keys (agent-enforced):

- `crawl_allowlist_domains`: list of allowed domains (empty → default to `base_url` host)
- `allow_private_crawl_ips`: default `false`
- `crawl_max_redirects`: default `3`

Policy keys (best-effort / heuristic):

- `idle_only`, `quiet_hours`, `plugged_in_only` (best-effort), `thermal_throttle_enabled` (Linux-only), `reserve_cores_for_user`
- `reserve_ram_gb_for_user` (Linux-only RAM reserve; other platforms treat RAM availability as unknown)

Control-plane knobs (operator env vars):

- `GRID_CRAWL_ALLOWLIST` (defaults to `PUBLIC_BASE_URL` host)
- `GRID_ALLOW_PRIVATE_CRAWL_IPS=1` to permit private IPs (LAN-only deployments)
- `GRID_REGISTRATION_TOKEN` to require a join token on public networks

## Job lifecycle map

1. **Submit (admin/operator)**: `POST /api/grid/jobs/submit` (token-gated)
2. **Validate**: `grid_validate_crawl_url` blocks off-allowlist and non-global targets
3. **Queue**: job persisted in `grid_jobs` (status `queued`)
4. **Poll**: agent calls `POST /api/grid/jobs/poll` with its current policy/capabilities
5. **Assign**: control-plane chooses a node and returns `{manifest, signature_b64}`
6. **Verify (agent)**: Ed25519 signature verification + policy allowlist checks
7. **Execute**: whitelisted handler runs:
   - `health_check` (capabilities + health)
   - `crawl_url` (bounded crawl + robots + allowlist + private-IP blocking)
   - `ollama_chat` (local Ollama; model bounded by node config)
8. **Submit result**: `POST /api/grid/jobs/{job_id}/result`
9. **Index (control-plane)**: crawl results can be indexed into `grid_docs` (+ optional FTS)

Reliability note: during polls, the control-plane best-effort **reclaims stale `assigned` jobs** (requeues them, or marks them failed when `attempts>=max_attempts`) so the queue can make progress if a node disappears mid-job. Tunables: `GRID_JOB_STALE_MULTIPLIER` (2.0), `GRID_JOB_STALE_GRACE_S` (60), `GRID_JOB_STALE_MIN_S` (180), `GRID_JOB_STALE_MAX_S` (7200), `GRID_JOB_STALE_SCAN_LIMIT` (32), `GRID_JOB_STALE_RECLAIM_LIMIT` (6), `GRID_JOB_STALE_RECLAIM_INTERVAL_S` (20).

## Search + semantic + Ask AI grounding

Grid search endpoint:

- `GET /api/grid/search` supports:
  - lexical search via SQLite FTS (when available)
    - uses BM25 ranking (title weighted higher than body)
  - optional semantic rerank with cached embeddings (`grid_doc_embeddings`)
  - trust/ops metadata in the response (best-effort):
    - `mode` (`lexical|semantic|recent`) and optional `note` / `semantic_error`
    - `dedupe` summary (`by=content_hash`, `pruned`)
    - `timings_ms` (for UI latency surfacing and operator debugging)

Ask AI over Grid results:

- The browser sends `page.sources` (URLs + `doc_id` where available) and sets `use_grid_context=true`.
- Server looks up those docs in SQLite and injects bounded excerpts as an **untrusted** context block:
  - helper: `app/_app.py:_ai_grid_docs_context_from_payload`
- Ask AI never fetches arbitrary URLs on its own for Grid context.
- The `/grid` UI renders grounding separately from the main answer:
  - “What this is based on” (high-level provenance summary)
  - “Sources” (clickable, labeled links)
  - Follow-up thread memory stores the answer body only (grounding is stripped).

## Storage / index / cache

SQLite (`app.db`) tables (MVP):

- `grid_nodes` (registration, policy snapshot, capabilities, consent, token hash)
- `grid_jobs` (queue/assignment state + signed manifest)
- `grid_docs` (indexed documents)
- `grid_docs_fts` (optional; FTS5 for lexical search)
- `grid_doc_embeddings` (optional; semantic cache)

## Operator visibility

- `GET /api/grid/status` shows:
  - node counts (online/total)
  - contributing node counts (contributing online/total)
  - work-availability (newer agents only):
    - `nodes_work_allowed_online` / `nodes_work_reported_online` (how many online nodes are currently able to pull jobs)
    - `nodes_work_blocked_reported_online` (opted-in nodes that are connected but currently blocked by local policy heuristics)
  - the heartbeat window used for “online” (`online_window_s`)
  - job counts (queued/assigned/done/failed/canceled)
    - includes best-effort queue age fields when available: `jobs.oldest_queued_utc`, `jobs.oldest_assigned_utc` (and `*_age_s`)
  - doc + embedding counts
  - a lightweight freshness block: `freshness.docs_last_indexed_utc`, `freshness.embeddings_last_updated_utc` (and `*_age_s`)
  - capacity aggregates from recent heartbeats (`capacity.online`, `capacity.contributing_online`)
    - `cpu_total`, `mem_total_bytes`, `mem_available_bytes`, `home_free_bytes`
    - `ollama_nodes_online` (best-effort)
  - recent nodes (last seen, status, contributing/paused/disconnected state)
    - may include `work_allowed` and `work_blockers` for “why isn't this opted-in node pulling jobs right now?”
  - privacy: `display_name` is omitted by default; set `GRID_PUBLIC_STATUS_INCLUDE_DISPLAY_NAME=1` to include it

Operator console (token-gated or direct-local only):

- `GET /api/grid/admin/overview` returns:
  - nodes (policy + heartbeat/capabilities summary)
  - jobs (status + attempts + result summary)
  - audit events
- `GET /api/grid/admin/jobs/{job_id}` returns:
  - per-job trace (slim job fields + assigned node summary + related audit events; best-effort)
- `POST /api/grid/jobs/submit` queues an admin job (used by the `/grid` operator console “Submit job” form)
- `POST /api/grid/admin/jobs/{job_id}/retry` re-queues a copy of a job
- `POST /api/grid/admin/jobs/{job_id}/cancel` cancels a queued **or assigned** job (status becomes `canceled`)
- `POST /api/grid/admin/jobs/reclaim-stale` forces a best-effort reclaim of stale assigned jobs (returns `{requeued, failed, scanned}`)

Auth model:

- If `GRID_ADMIN_TOKEN` (or `APP_ADMIN_TOKEN`) is set, requests must provide the token header.
- If no tokens are configured, operator endpoints allow **direct loopback only** (no forwarded headers).

## Top issues (severity × leverage)

1. Release/versioning: automate `SHA256SUMS` + signed-manifest distribution + changelog.
2. Key pinning UX: better “verify signature” flow + rotation/runbook story.
3. Policy editor UX: safe defaults + inline explanations + validation errors.
4. Node health details: show CPU/RAM/disk + last heartbeat payload on `/grid` + `/status`.
5. Operator job queue UI: submit/inspect/retry/cancel jobs (token-gated).
6. Agent upgrade path: idempotent updates + “what changed” diff view.
7. Constraint enforcement: normalize timeouts/max_bytes/redirects consistently across plane + agent.
8. Crawl pipeline: canonicalization + dedupe + content-type enforcement + snippet quality.
9. Robots handling: caching + clearer failure modes + operator overrides (still conservative).
10. Stronger consent UX: explicit resource caps + “pause/stop” UX + local logs.
11. Public endpoint hardening: rate limits + safer errors + audit trail coverage.
12. Registration safety: token requirements + replay protection + clearer “public vs LAN” modes.
13. Multi-tenant scoping: org/workspace isolation for nodes/jobs/docs (permission-aware search).
14. Search relevance: better blending of FTS + semantic; near-duplicate clustering.
15. Embedding lifecycle: TTL/invalidation + model versioning + cache health endpoints.
16. Ask AI grounding: UI to select which Grid docs are used; better citations for excerpts.
17. Degraded mode: clearer UI when grid disabled/unavailable (local-only fallback).
18. Observability: per-stage latency metrics + job success rates + node SLOs.
19. Packaging: Docker image for agent + signed releases for Linux/macOS/Windows.
20. Sandbox posture: stricter network egress for crawls (optional), safer redirect policies, headers.

## Implementation order (recommended)

1. Automate `SHA256SUMS` + signed-manifest release generation (single source of truth).
2. Ship a token-gated operator UI for nodes + jobs (status, queue, retries, audit trail).
3. Ship a user-facing policy editor + consent explainer (opt-in clarity; safe defaults).
4. Improve indexing/crawl quality (canonicalization, dedupe, content-type rules, snippets).
5. Improve Ask AI “grounded over Grid” UX (doc selection, better citations, fallbacks).
6. Add observability + SLOs (latency + job success + node health trend lines).

## Baseline measurement plan

Local checks:

- Unit tests: `bash scripts/run_unit_tests.sh`
- Regression smoke: `python3 scripts/regression_smoke.py`

Grid-specific smoke:

- `curl -fsS http://127.0.0.1:8000/api/grid/assets | head`
- `curl -fsS http://127.0.0.1:8000/api/grid/status`
- Open `http://127.0.0.1:8000/grid` and verify:
  - Download + Verify section loads checksums
  - platform tabs switch
  - Grid Search works (empty state + query)
  - Ask AI over results streams and cites Grid URLs