Agents - Velt

What are Agents?

Agents are AI-driven QA tools that analyze a web page (or set of pages) and surface findings as comment annotations. You configure what an agent looks for; the platform handles context gathering, LLM invocation, post-processing, deduplication, and annotation creation. Build any check you want — brand consistency, accessibility, SEO, broken links, spelling, custom domain rules — by writing a prompt and selecting a few configuration options. No need to run your own browser pool, prompt your own model, or wire up annotation creation. You define an agent configuration once, then call Run Execution with a URL. Results are persisted as a versioned execution document plus per-URL findings, and annotations are automatically posted to the document via the Comments API.

What you get

Built-in and custom agents. Use ready-made agents (spell check, broken links, grammar, PII detection, profanity, sensitive-data) or define your own custom agents from a prompt.
Pluggable context gathering. Stack one or more strategies — page text, screenshots, HTML, CSS, links, accessibility tree, computed styles, robots.txt, sitemap, Lighthouse — and the engine extracts everything in parallel.
Versioned configurations. Every behavioral edit creates a new version. In-flight executions stay pinned to the version they started on. One-click restore rolls back the most recent change.
Async execution with polling. Run Execution returns an executionId immediately and dispatches a Cloud Task. Poll Get Execution until status !== "running".
Cross-page execution. Set crossPageExecute: true and the engine crawls the seed URL, processes up to maxUrlsToProcess pages, and aggregates findings across the site.
Match-and-merge dedup. Re-running an agent against the same document only creates new annotations for new findings; pre-existing matches are skipped, and annotations whose findings no longer appear are auto-resolved.
Token usage analytics. Per-agent, per-model, per-month token consumption tracked automatically in Firestore and exposed via the Analytics endpoint.
Agent groups. Organize related agents into named groups (e.g. “Brand QA”) and filter list responses by group.
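
The match-and-merge behavior in the list above can be pictured as a set comparison between the findings already annotated on a document and the findings from the latest run. The sketch below is only an illustration: the `Finding` fields, the `finding_key` identity rule, and the reading that an existing annotation is resolved when its finding no longer appears are all assumptions, not the engine's actual matching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical minimal shape; real findings carry more fields.
    url: str
    category: str
    target_text: str


def finding_key(f: Finding) -> tuple[str, str, str]:
    # Assumed identity rule: same URL, same category, same target text
    # after lowercasing and collapsing whitespace.
    return (f.url, f.category, " ".join(f.target_text.lower().split()))


def match_and_merge(existing: list[Finding], current: list[Finding]):
    """Split the latest run into annotations to create, skip, and resolve."""
    existing_keys = {finding_key(f) for f in existing}
    current_keys = {finding_key(f) for f in current}
    to_create = [f for f in current if finding_key(f) not in existing_keys]
    skipped = [f for f in current if finding_key(f) in existing_keys]
    # Assumption: an existing annotation is resolved when its finding is gone.
    to_resolve = [f for f in existing if finding_key(f) not in current_keys]
    return to_create, skipped, to_resolve
```

Calling `match_and_merge(previous_findings, latest_findings)` returns three lists, so a re-run only needs to write the first and update the last.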

Use cases

Brand consistency

A custom agent verifies brand colors, typography, and logo placement across your marketing pages. Re-run on every deploy.

Pre-launch QA

Spell-check, broken-links, and accessibility agents run in parallel before a release. Findings appear as comment annotations on the staging document.

Content moderation

PII detection and profanity filter agents flag sensitive content on user-generated pages. Pair with Approval Engine for human review.

Cross-page audit

Run a brand consistency agent in crossPageExecute: true mode against a marketing site. The crawler discovers internal links, then the agent checks each one.
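
To make the crawl step concrete, here is a rough sketch of a breadth-first walk over internal links that stops after a page budget, in the spirit of `maxUrlsToProcess`. The `crawl` function and the `get_links` callback are hypothetical stand-ins; the real crawler's ordering, URL normalization, and link filtering are not documented here and may differ.

```python
from collections import deque
from urllib.parse import urljoin, urlparse


def crawl(seed_url, get_links, max_urls_to_process):
    """Return up to max_urls_to_process same-host URLs, seed first.

    get_links is a caller-supplied function (an assumption for this sketch)
    that returns the href values found on a page.
    """
    seed_host = urlparse(seed_url).netloc
    queue = deque([seed_url])
    seen = {seed_url}
    ordered = []
    while queue and len(ordered) < max_urls_to_process:
        url = queue.popleft()
        ordered.append(url)
        for href in get_links(url):
            # Resolve relative links and drop the fragment.
            absolute = urljoin(url, href).split("#", 1)[0]
            if urlparse(absolute).netloc == seed_host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return ordered
```

Each URL in the returned list would then be handed to the agent for its per-page check.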

Custom QA tasks

Describe a check in plain English (e.g. “every CTA must use the primary brand color”). The Validate Prompt endpoint expands it into a structured QA task you can ship as an agent.

Workflow integration

Trigger an agent execution from an Approval Engine workflow node. The engine parks the workflow until the agent’s findings are available.

How it works

Define the agent. Call Create Agent with a name, instructions, and a configuration block (context gathering strategies, execution strategy, post-processing options). The engine creates version 1 in the agent’s versions subcollection.
Run an execution. Call Run Execution with agentId and url. The engine creates an execution document, dispatches a Cloud Task, and returns an executionId.
The pipeline runs. Validation → context gathering → LLM execution → post-processing (guardrails, match-and-merge, annotation creation, analytics) → response shaping. Every phase is bounded by phaseTimeoutMs.
Poll for results. Call Get Execution until status !== "running". Set includeResults: true to fetch per-URL findings.
Findings appear as annotations. When postProcess.annotations.enabled is true (default), each finding becomes a comment annotation on the document referenced by organizationId / documentId.
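
As a rough illustration of the first two steps above, the helpers below assemble request bodies for Create Agent and Run Execution. Only the field names quoted on this page (`name`, `instructions`, `agentId`, `url`, `organizationId`, `documentId`, `postProcess.annotations.enabled`) come from the documentation. The `configuration`, `contextGathering`, and `execution.strategy` keys and the overall nesting are assumptions made for the sketch, so check the API reference for the real schema.

```python
from typing import Optional


def build_create_agent_payload(
    name: str,
    instructions: str,
    strategies: list[str],
    execution_strategy: str = "ai",
    annotations_enabled: bool = True,
) -> dict:
    """Assemble a Create Agent body (nesting is assumed, not confirmed)."""
    return {
        "name": name,
        "instructions": instructions,
        "configuration": {  # assumed key name
            "contextGathering": list(strategies),  # assumed key name
            "execution": {"strategy": execution_strategy},  # assumed key names
            "postProcess": {"annotations": {"enabled": annotations_enabled}},
        },
    }


def build_run_execution_payload(
    agent_id: str,
    url: str,
    organization_id: Optional[str] = None,
    document_id: Optional[str] = None,
) -> dict:
    """Assemble a Run Execution body; the annotation targets are optional."""
    payload = {"agentId": agent_id, "url": url}
    if organization_id is not None:
        payload["organizationId"] = organization_id
    if document_id is not None:
        payload["documentId"] = document_id
    return payload
```

The two dictionaries can be serialized with `json.dumps` and sent with whichever HTTP client you already use.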

Mental model

Agent

An agent is a reusable QA configuration. Each agent has an identity (name, description, enabled) plus a behavioral configuration (instructions, context gathering, execution, post-processing). Identity edits update the root document; behavioral edits create a new version.
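
The split between identity edits and behavioral edits can be modeled in a few lines. This is a toy in-memory model, not the platform's storage layout: `Agent`, `create_agent`, and `apply_edit` are hypothetical names, and the only rule taken from the text is that identity fields (name, description, enabled) update in place while anything else appends a new version.

```python
from dataclasses import dataclass, field

IDENTITY_FIELDS = {"name", "description", "enabled"}


@dataclass
class Agent:
    identity: dict
    versions: list = field(default_factory=list)  # versions[0] is version 1

    @property
    def current_version(self) -> int:
        return len(self.versions)


def create_agent(name: str, instructions: str) -> Agent:
    # Creating an agent also creates version 1 of its behavioral config.
    identity = {"name": name, "description": "", "enabled": True}
    return Agent(identity=identity, versions=[{"instructions": instructions}])


def apply_edit(agent: Agent, changes: dict) -> int:
    """Apply an edit and return the version number that is current afterwards."""
    identity_changes = {k: v for k, v in changes.items() if k in IDENTITY_FIELDS}
    behavioral_changes = {k: v for k, v in changes.items() if k not in IDENTITY_FIELDS}
    agent.identity.update(identity_changes)  # in place, no new version
    if behavioral_changes:
        # Copy the latest version forward so older versions stay untouched.
        agent.versions.append({**agent.versions[-1], **behavioral_changes})
    return agent.current_version
```

Because older entries in `versions` are never mutated, an execution that recorded version 1 at dispatch can keep reading it after version 2 exists.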

Execution

An execution is one run of an agent against a seed URL. It pins the agent’s version at dispatch time, so in-flight runs are immune to mid-run config changes. Lifecycle:

running → passed | failed | error | skipped
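
A client can encode this lifecycle as a small enum so that polling code has one place to decide whether a run is finished. The status strings come from the line above; the `ExecutionStatus` class and `is_terminal` helper are illustrative names, not part of any SDK.

```python
from enum import Enum


class ExecutionStatus(str, Enum):
    RUNNING = "running"
    PASSED = "passed"
    FAILED = "failed"
    ERROR = "error"
    SKIPPED = "skipped"


def is_terminal(status: str) -> bool:
    """True once an execution has left the running state.

    Raises ValueError for a status string outside the documented lifecycle.
    """
    return ExecutionStatus(status) is not ExecutionStatus.RUNNING
```

Rejecting unknown strings is a design choice for this sketch; a more lenient client might treat them as still running.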

Finding

A finding is one issue the agent surfaced on one URL. Findings carry severity, category, target text, suggestion, HTML selector, and confidence. Findings below the guardrails confidence floor are suppressed unless that guardrail is explicitly disabled.
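
The confidence floor can be thought of as a simple filter over the finding list. The sketch below assumes confidence is a number between 0 and 1 and that the floor is inclusive; neither detail is stated on this page, and the `Finding` field names are a snake_case rendering of the attributes listed above rather than the wire format.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str
    category: str
    target_text: str
    suggestion: str
    selector: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def apply_confidence_floor(
    findings: list[Finding], floor: float, enabled: bool = True
) -> list[Finding]:
    """Drop findings below the floor, or pass everything through if disabled."""
    if not enabled:
        return list(findings)
    return [f for f in findings if f.confidence >= floor]
```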

Pipeline phases

Each agent run executes five sequential phases:

Phase	What happens
`validate`	Zod check on payload + input requirements + variable keys
`extract`	All configured context gathering strategies run (cached by tenant + content hash)
`execute`	LLM call (or registered service); results normalized to a finding array
`postProcess`	Guardrails → match-and-merge → annotation creation → analytics
`structureResponse`	Final response assembly with optional response adapter
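
One way to picture the phases in the table, together with the per-phase `phaseTimeoutMs` bound mentioned earlier, is a loop that awaits each phase handler under a timeout. This is a sketch under assumptions: the handler signature, the returned dictionary shape, and the choice to stop at the first timed-out phase are illustrative and do not describe the engine's internals.

```python
import asyncio

PHASES = ["validate", "extract", "execute", "postProcess", "structureResponse"]


async def run_pipeline(handlers: dict, payload, phase_timeout_ms: int) -> dict:
    """Run the five phases in order, feeding each result into the next.

    handlers maps a phase name to an async function (hypothetical interface).
    """
    state = payload
    for phase in PHASES:
        try:
            state = await asyncio.wait_for(
                handlers[phase](state), timeout=phase_timeout_ms / 1000
            )
        except asyncio.TimeoutError:
            # Stop at the first phase that exceeds its time budget.
            return {"completed": False, "timedOutPhase": phase}
    return {"completed": True, "result": state}
```

Run it with `asyncio.run(run_pipeline(handlers, payload, 30_000))`, where `handlers` supplies one coroutine function per phase.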

Built-in agents

Agent ID	Description	Internal
`spell-check`	Spelling and typo detection	no
`grammar-check`	Grammar checking	no
`broken-links`	Broken link validation	no
`pii-detection`	PII (personal info) detection	no
`profanity-filter`	Profanity detection	no
`sensitive-data`	Sensitive data detection	no
`screenshot`	Page screenshot capture	yes
`crawler`	Site crawling	yes

Internal agents (metadata.internal: true) power the execution pipeline and are excluded from list responses.

Context gathering strategies

Stack one or more strategies in the order you want them to run. Strategy results are cached by tenant + content hash, so re-running the same agent against the same page is cheap.

Strategy	What it returns
`web-page-text`	Visible text extracted via Puppeteer
`web-page-screenshot`	Full-page screenshot as image
`web-page-html`	Cleaned HTML content
`web-page-css`	Cleaned CSS content from stylesheets
`web-page-links`	All hyperlinks on the page
`web-page-accessibility`	Accessibility tree (ARIA roles, landmarks, headings)
`computed-styles`	Computed CSS for every element
`robots-txt`	Fetched `robots.txt`
`sitemap-data`	Discovered and parsed XML sitemaps
`lighthouse`	Google Lighthouse audit (perf / a11y / SEO)
`none`	No context gathering (for service-only agents)
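
Caching by tenant and content hash can be sketched as a dictionary keyed on both values. The example below is a guess at the general idea only: it assumes the hash is SHA-256 over the raw page bytes and that the strategy name is part of the key, and `StrategyCache` is a hypothetical in-memory stand-in for whatever store the engine actually uses.

```python
import hashlib


def cache_key(tenant_id: str, strategy: str, content: bytes) -> str:
    # Assumed key layout: tenant, strategy name, SHA-256 of the page content.
    digest = hashlib.sha256(content).hexdigest()
    return f"{tenant_id}:{strategy}:{digest}"


class StrategyCache:
    def __init__(self) -> None:
        self._store: dict = {}
        self.misses = 0

    def get_or_compute(self, tenant_id, strategy, content, compute):
        """Return the cached result, computing it only on the first request."""
        key = cache_key(tenant_id, strategy, content)
        if key not in self._store:
            self.misses += 1
            self._store[key] = compute(content)
        return self._store[key]
```

With this shape, a second run over an unchanged page is a cache hit, while any change to the page content or a different tenant produces a new key.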

Execution strategies

Strategy	Purpose
`ai`	LLM-driven analysis (default)
`service`	Delegate to a registered built-in service (`serviceId` required)
`service+ai`	Service context gathering + AI analysis
`stagehand-agent`	Autonomous browser agent via Stagehand AI

Scope

Field	Bound to
`apiKey`	Workspace-wide (default; agents are per-workspace)
`organizationId`	Annotations created on the named organization
`documentId`	Annotations created on the named document

organizationId and documentId on Run Execution control where annotations land. They are not required for the agent to execute.

Async execution

Run Execution returns immediately with { executionId } and dispatches a Cloud Task. The engine processes the URL(s) asynchronously and writes results to Firestore. Poll Get Execution to track progress.
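
The polling loop can be kept independent of any HTTP client by passing in a function that fetches the execution. In the sketch below, `get_execution` is a caller-supplied callable (hypothetical) that returns a dictionary with a `status` field. The interval and timeout defaults are arbitrary placeholders, not recommended values.

```python
import time


def poll_execution(
    get_execution,
    execution_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep=time.sleep,
    clock=time.monotonic,
) -> dict:
    """Call get_execution until status is no longer "running".

    Raises TimeoutError if the run is still going after timeout_s seconds.
    """
    deadline = clock() + timeout_s
    while True:
        execution = get_execution(execution_id)
        if execution["status"] != "running":
            return execution
        if clock() >= deadline:
            raise TimeoutError(f"execution {execution_id} still running")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; in production code the defaults are enough.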

Get started

Setup

Create an agent, run it against a URL, and read the findings end-to-end.

API Reference

All endpoints, organized into Agents, Execution, Versioning, Prompt Tools, Analytics, and Groups.
