HQ uses a single knowledge system — knowledge_items — to store all workspace knowledge: pages, skills, files, and external sources. This replaces the earlier split between separate documents and assets tables.

Kinds

Every knowledge item has a kind that determines how it’s stored and rendered:
| Kind | What it is | Content model |
| --- | --- | --- |
| page | Rich-text document (company briefs, style guides, meeting notes) | JSON content + plain_text |
| skill | Structured procedure or SOP | JSON content + plain_text |
| file | Uploaded file (PDF, image, spreadsheet, audio, video) | file_url + mime_type + file_size |
| source | Externally synced content from a connected integration | source_connection_id + source_external_id + sync metadata |

Pages and skills support rich editing in the UI. Files go through a processing pipeline (upload, extract text, embed). Sources sync from external integrations and track their sync status.
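
As a rough illustration of the content models listed for each kind, the sketch below checks that an item carries the fields its kind expects. The field names come from the table; the REQUIRED_FIELDS mapping, the missing_fields helper, and the dict-shaped item are assumptions made for illustration, not HQ's actual schema validation.

```python
# Hypothetical mapping from kind to the fields named in the kinds table.
REQUIRED_FIELDS = {
    "page": ("content", "plain_text"),
    "skill": ("content", "plain_text"),
    "file": ("file_url", "mime_type", "file_size"),
    "source": ("source_connection_id", "source_external_id"),
}


def missing_fields(item: dict) -> list[str]:
    """Return the fields the item's kind expects but the item does not set."""
    kind = item.get("kind")
    if kind not in REQUIRED_FIELDS:
        raise ValueError(f"unknown kind: {kind!r}")
    return [name for name in REQUIRED_FIELDS[kind] if item.get(name) is None]
```

For example, a file item that only has a file_url would report mime_type and file_size as missing.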

How indexing works

When you create or edit a knowledge item, HQ automatically indexes it so your agents can find it through search. Indexing converts the document into a format optimized for semantic search, so agents can find relevant knowledge even when the exact words don’t match.

For pages and skills, indexing starts immediately after you save and typically completes within a few seconds. For files (PDF, DOCX, XLSX, CSV, PPTX, TXT), the system first extracts text from the file and then indexes it, which may take a little longer depending on file size. Editing a document automatically re-triggers indexing.

You’ll see a status indicator on each knowledge item:
  • Search ready (green) — fully indexed and searchable by agents.
  • Text ready (amber) — text has been processed; search embedding is still completing.
  • Indexing… (spinner) — indexing is in progress.
  • Index failed (red) — something went wrong. Use the retry button to re-process.
The embedder service handles indexing using a local embedding model that ships pre-loaded in the Docker image — no external API calls or additional setup required.

Source connectors

Source connections use a plugin-based connector architecture. Each provider is a self-contained folder under gateway/connectors/<provider>/ containing:
  • manifest.json: declarative config for auth, UI metadata, setup steps, and capabilities.
  • read.py: BaseConnector subclass implementing validate, browse, list, fetch, and change detection.
  • api.py: HTTP helpers for the provider’s API.
  • transforms.py: response-to-markdown conversion logic.
  • write.py (optional): BaseActionProvider subclass for write-back operations.
  • __init__.py: exports CONNECTOR (and optionally ACTION_PROVIDER).
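
One way a registry could pick up connectors from folders laid out like this is to import each provider package and look for its exported CONNECTOR. The sketch below is an assumption about how such a scan might work, not the contents of gateway/connectors/registry.py; the discover_connectors function and the hq_connector_ module-name prefix are hypothetical.

```python
import importlib.util
import sys
from pathlib import Path


def discover_connectors(root: Path) -> dict[str, object]:
    """Import every provider package under root and collect its CONNECTOR."""
    found = {}
    for init_file in sorted(root.glob("*/__init__.py")):
        provider = init_file.parent.name
        module_name = f"hq_connector_{provider}"  # hypothetical naming scheme
        spec = importlib.util.spec_from_file_location(
            module_name,
            init_file,
            submodule_search_locations=[str(init_file.parent)],
        )
        module = importlib.util.module_from_spec(spec)
        # Register before executing so imports inside the package can find it.
        sys.modules[module_name] = module
        spec.loader.exec_module(module)
        connector = getattr(module, "CONNECTOR", None)
        if connector is not None:
            found[provider] = connector
    return found
```

Folders without an __init__.py, or whose __init__.py does not export CONNECTOR, are simply skipped, which matches the idea that adding a provider folder is the only step needed.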
The manifest is the contract between the connector and the platform. A build script (scripts/build-source-manifests.mjs) generates a TypeScript module from all manifests so the UI renders provider setup forms, icons, and labels without hardcoded constants.

Auto-discovery: gateway/connectors/registry.py scans subdirectories for exported CONNECTOR instances. Adding a new provider requires only the provider folder, with no changes to existing platform code.

Credential handling: The manifest declares what credentials are needed. The platform encrypts and stores them in the secrets table, secrets_sync.py decrypts them to the gateway filesystem, and source_sync.py assembles a creds dict for the connector. Single-key credentials use {PROVIDER}_SOURCE_{ID_PREFIX}; multi-key credentials use {PROVIDER}_SOURCE_{ID_PREFIX}__{FIELD}.

Write support: Providers that support writes set supports_write: true in their manifest and implement a BaseActionProvider. The UI shows a “Write access” toggle on the connection detail page. Write operations flow through the source_write command action in the existing command queue.

Browse and validate: The UI proxies browse and validate requests to the gateway’s files API (/sources/browse and /sources/validate), so provider-specific API calls happen on the gateway side where credentials are local.

See CONTRIBUTING-SOURCES.md in the repository root for the full contributor guide.

Skills can also be created autonomously by agents during work via hq_skill_upsert.py. When an agent discovers a reusable method, it codifies the procedure as an agent-scoped skill. These appear on the agent detail page with edit reasons and recency indicators.
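
The credential naming patterns {PROVIDER}_SOURCE_{ID_PREFIX} and {PROVIDER}_SOURCE_{ID_PREFIX}__{FIELD} can be sketched as a small helper. The credential_key function is hypothetical, upper-casing every part is an assumption, and how the ID prefix is derived from a connection is not specified here, so the caller supplies it. The provider name "acme" in the usage note is a placeholder, not a real connector.

```python
from typing import Optional


def credential_key(provider: str, id_prefix: str, field: Optional[str] = None) -> str:
    """Build a secret name following the single-key or multi-key pattern.

    Upper-casing is an assumption; the id_prefix is passed in as-is because
    the rule for deriving it from a connection ID is not documented here.
    """
    base = f"{provider.upper()}_SOURCE_{id_prefix.upper()}"
    if field is None:
        return base  # single-key credential
    return f"{base}__{field.upper()}"  # multi-key credential, one name per field
```

Under these assumptions, credential_key("acme", "ab12cd34") gives ACME_SOURCE_AB12CD34, and passing field="client_secret" gives ACME_SOURCE_AB12CD34__CLIENT_SECRET.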

Scope and agent access

Every item has a scope that controls who can access it:
  • Workspace scope (scope = 'workspace'): visible to all agents. When pinned, the item is included in every agent’s boot context automatically.
  • Agent scope (scope = 'agent'): visible only to agents explicitly assigned via the knowledge_item_agents junction table. Each row links one item to one agent. This replaces the old boot:all / boot:<slug> tag convention with explicit, queryable relationships.
The scope is a column, not a tag, so it can be filtered, indexed, and enforced at the database level.
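
The access rule can be expressed as a short predicate. This is a sketch under stated assumptions: items are plain dicts with id and scope keys, junction rows are modeled as (knowledge_item_id, agent_id) pairs, and the agent_can_access function is hypothetical. In HQ itself the rule is enforced at the database level.

```python
def agent_can_access(item: dict, agent_id: str, assignments: set[tuple[str, str]]) -> bool:
    """Apply the scope rule to one item.

    assignments holds one (knowledge_item_id, agent_id) pair per row of the
    knowledge_item_agents junction table.
    """
    if item["scope"] == "workspace":
        return True  # workspace items are visible to every agent
    if item["scope"] == "agent":
        return (item["id"], agent_id) in assignments
    return False  # unknown scope values grant nothing
```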

How agents receive knowledge at boot

When an agent starts a session, the bootstrap script:
  1. Fetches all workspace-scoped items where pinned = true.
  2. Looks up the agent’s ID from its slug.
  3. Fetches all items linked to that agent via knowledge_item_agents.
  4. Deduplicates (an item can be both workspace-pinned and agent-assigned).
  5. Injects the combined set into the agent’s startup context, grouped by scope.
The gateway’s HQ bootstrap plugin renders this context with kind labels ([page], [skill], [file]) and scope grouping (Workspace Knowledge vs Your Knowledge).
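
The deduplicate-and-group steps above can be sketched in a few lines. The build_boot_context function and the dict shape of an item (id, kind, title) are assumptions for illustration, and the exact text layout the bootstrap plugin produces is not documented here. The sketch also assumes that an item found in both sets is listed once, under Workspace Knowledge.

```python
def build_boot_context(workspace_pinned: list[dict], agent_items: list[dict]) -> str:
    """Merge both item sets, drop duplicates by id, and group by scope heading."""
    seen = set()
    groups = {"Workspace Knowledge": [], "Your Knowledge": []}
    sources = (("Workspace Knowledge", workspace_pinned), ("Your Knowledge", agent_items))
    for heading, items in sources:
        for item in items:
            if item["id"] in seen:
                continue  # already listed under an earlier heading
            seen.add(item["id"])
            groups[heading].append(f"[{item['kind']}] {item['title']}")

    lines = []
    for heading, entries in groups.items():
        if entries:  # skip empty groups entirely
            lines.append(heading)
            lines.extend(entries)
    return "\n".join(lines)
```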

Folders and organization

Knowledge items live in folders (knowledge_folders). Folders support:
  • Nesting (parent/child via parent_id)
  • Custom icons and colors
  • Sort ordering
Folders are organizational; they don’t affect scope or agent access.

Knowledge items support two search paths:
  • Semantic search: vector similarity using the embedding column (384-dimensional vectors from the gateway embedder). Used by the search_knowledge_items() RPC.
  • Full-text search: PostgreSQL tsvector over title, plain_text, content, and tags. Used by the search_knowledge_items_text() RPC.
Both RPCs support filtering by tags, folder, and kind.
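
To make the idea of vector similarity concrete, the sketch below ranks items by cosine similarity between a query vector and each item's embedding. The similarity metric actually used by search_knowledge_items() is not stated here, so cosine similarity is an assumption, as are the rank_by_similarity helper and the dict shape of an item. The code works for vectors of any length, including 384 dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query_vec: list[float], items: list[dict], limit: int = 5) -> list[str]:
    """Return item ids ordered from most to least similar to the query vector."""
    scored = [(cosine_similarity(query_vec, item["embedding"]), item["id"]) for item in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:limit]]
```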

Chunks

Long-form items are split into chunks (knowledge_chunks) for granular retrieval. Each chunk has its own embedding and full-text search vector. The search_knowledge_chunks() and search_knowledge_chunks_text() RPCs search at the chunk level and join back to the parent item for metadata. Chunks reference their parent via knowledge_item_id. When an item’s content changes, the mark_knowledge_item_pending trigger resets chunk and embedding status, and the embedder re-indexes on its next cycle.
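
A minimal sketch of chunking, assuming content is split on blank-line paragraph boundaries and packed up to a character budget. The real splitting strategy and chunk size are not described here; split_into_chunks, max_chars, chunk_index, and content are hypothetical names. Only knowledge_item_id comes from the text above.

```python
def split_into_chunks(knowledge_item_id: str, text: str, max_chars: int = 1000) -> list[dict]:
    """Pack paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks = []
    current = ""
    for paragraph in (part.strip() for part in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)  # budget exceeded: close the current chunk
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Each chunk points back at its parent item.
    return [
        {"knowledge_item_id": knowledge_item_id, "chunk_index": index, "content": content}
        for index, content in enumerate(chunks)
    ]
```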

Embedding pipeline

The gateway embedder daemon handles indexing:
  1. Calls lease_knowledge_items_for_indexing() to atomically claim pending items.
  2. Generates embeddings using the local BGE model.
  3. Splits content into chunks, embeds each chunk.
  4. Calls mark_knowledge_item_indexed() on success or mark_knowledge_item_failed() on error.
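
The cycle above can be simulated in memory to show how leasing keeps two workers from processing the same item. This is a single-process sketch, not the daemon's code: the real lease is taken atomically in the database, chunking is omitted, and the run_embedder_cycle function, the leased_by field, and the "indexed" status value are assumptions.

```python
from typing import Callable


def run_embedder_cycle(
    items: list[dict],
    embed: Callable[[str], list[float]],
    worker_id: str,
    batch_size: int = 10,
) -> list[str]:
    """Lease claimable items, embed them, and record success or failure."""
    # Step 1: claim pending or failed items that no other worker holds.
    leased = []
    for item in items:
        if len(leased) >= batch_size:
            break
        if item["status"] in ("pending", "failed") and item.get("leased_by") is None:
            item["leased_by"] = worker_id
            leased.append(item)

    # Steps 2-4: embed each leased item, then mark it indexed or failed.
    for item in leased:
        try:
            item["embedding"] = embed(item["plain_text"])
            item["status"] = "indexed"
        except Exception as exc:
            item["status"] = "failed"
            item["error"] = str(exc)
        finally:
            item["leased_by"] = None  # release the lease either way
    return [item["id"] for item in leased]
```

An item that fails stays in the failed status, so a later cycle can lease it again.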
Items in pending or failed embedding status are picked up automatically. The lease mechanism prevents parallel embedders from duplicating work.

Knowledge items participate in the entity links system. Any owner entity (task, routine, collection record, agent) can link to any target entity (knowledge item, collection record, contact, organization, task, or URL). When an agent claims a task, it receives all linked entities as context. This is a universal replacement for the old task-specific attachments model. Entity links are stored in entity_links with polymorphic owner_type/target_type columns and a check constraint ensuring URL links carry a url and entity links carry a target_id.
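
The check constraint on entity_links can be mirrored in application code as a quick pre-insert check. The sketch assumes URL links are marked with target_type = "url"; that literal value and the is_valid_entity_link helper are hypothetical. It enforces only the two conditions stated above.

```python
def is_valid_entity_link(link: dict) -> bool:
    """URL links must carry a url; every other link must carry a target_id."""
    if link.get("target_type") == "url":  # assumed marker for URL links
        return bool(link.get("url"))
    return link.get("target_id") is not None
```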

Database tables

| Table | Purpose |
| --- | --- |
| knowledge_folders | Folder hierarchy for organizing items |
| knowledge_items | All knowledge content: pages, skills, files, sources |
| knowledge_item_agents | Junction table linking agent-scoped items to specific agents |
| knowledge_chunks | Chunked content with per-chunk embeddings for granular retrieval |
| source_connections | External source integrations with plugin-based providers, credentials, and sync schedules |
| source_sync_runs | Sync execution history and status tracking |
| entity_links | Universal polymorphic links between any entities |

Agent scripts

Every agent template ships with HQ skills that interact with the knowledge system:
| Script | Purpose |
| --- | --- |
| hq_session_bootstrap.py | Fetches workspace-pinned + agent-specific items at session start |
| hq_boot_docs.py | Loads boot context using scope + junction queries |
| hq_skill_upsert.py | Creates or updates agent-scoped skills with auto-embedding and junction linking |
| hq_create_doc.py | Creates a new knowledge item (page or skill) |
| hq_update_doc.py | Updates an existing knowledge item |
| hq_search_docs.py | Semantic + full-text search across all knowledge items |
| hq_get_knowledge_chunks.py | Retrieves chunks for a specific knowledge item by ID |
| hq_claim_task.py | Claims a task and resolves all entity links (knowledge items, contacts, orgs, collection records) |
| hq_inbox_process.py | Processes inbox items and resolves linked entities |

All scripts use the Supabase PostgREST API via the service role key.
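
As an illustration of what a PostgREST call with the service role key looks like, the sketch below builds (but does not send) an RPC request using only the standard library. The /rest/v1/rpc/<function> path and the apikey and Authorization headers follow the usual Supabase REST conventions. The build_rpc_request helper is hypothetical, and the parameter names each RPC accepts are not documented here, so any params passed in are assumptions.

```python
import json
import urllib.request


def build_rpc_request(
    base_url: str, service_role_key: str, function: str, params: dict
) -> urllib.request.Request:
    """Build a POST request for a PostgREST RPC function without sending it."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/rest/v1/rpc/{function}",
        data=json.dumps(params).encode("utf-8"),
        method="POST",
        headers={
            "apikey": service_role_key,
            "Authorization": f"Bearer {service_role_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned request to urllib.request.urlopen would execute the call. Because the service role key carries elevated privileges, it should stay on the gateway side and never reach a browser.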