LLM for Agents API

Unified LLM inference platform for AI agents. OpenAI-compatible. Pay-per-token.

https://api.llm4agents.com

Getting Started

Register your agent

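Create an agent account and receive your API key. Store it securely: it is shown only once. A new agent must also receive its first deposit within about 15 minutes, or it is deleted automatically (see POST /api/v1/agents/register below).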

curl -X POST https://api.llm4agents.com/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -d '{"name": "my-agent"}'

Response:

{
  "uuid": "a1b2c3d4-...",
  "apiKey": "sk-proxy-abc123...",
  "name": "my-agent",
  "createdAt": "2026-04-14T12:00:00.000Z",
  "requestId": "req_..."
}

Add funds

Generate a deposit wallet, then send USDT or USDC on Solana or Polygon. The wallet is a top-up address used exclusively to fund your LLM4Agents balance: every confirmed deposit credits the same balance that pays for chat completions, scraper, search, and image tools. Funds are credited automatically after on-chain verification. Gasless-transfer relayer fees are paid by your own EOA in the token being transferred; they do not draw from the LLM4Agents balance.

```
curl -X POST https://api.llm4agents.com/api/v1/wallets/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"chain": "solana", "token": "USDC"}'
```

Start using the API

Point any OpenAI-compatible SDK at the proxy and start making requests. The same request is shown below with the Python SDK, the Node.js SDK, and curl.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.llm4agents.com/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="anthropic/claude-3-haiku",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.llm4agents.com/v1',
  apiKey: 'your-api-key',
});

const response = await client.chat.completions.create({
  model: 'anthropic/claude-3-haiku',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

curl https://api.llm4agents.com/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-haiku",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Authentication

All authenticated endpoints require an Authorization header with a Bearer token:

Authorization: Bearer sk-proxy-abc123...

Your API key is returned only once at registration. Store it securely. Keys are hashed server-side and cannot be recovered.

TypeScript SDK

@llm4agents/sdk is the official TypeScript SDK for the platform. A single LLM4AgentsClient facade exposes chat completions, wallet management, gasless stablecoin transfers, and MCP-powered tools (scraper, search, image) — all authenticated with the same sk-proxy-... bearer key.

Repo: github.com/llmforagents/sdk
npm: @llm4agents/sdk
Runtimes: Node 18+, browser, Cloudflare Workers, Deno, Bun
Dependencies: zero runtime deps. ethers ^6 is an optional peer dep, only required for client.transfer.

The SDK is a thin, predictable layer on top of this REST API and the MCP server at mcp.llm4agents.com. No auto-retry, no hidden state — every method maps to a documented endpoint or MCP tool.

Install

npm install @llm4agents/sdk

# Optional — only required for client.transfer (gasless transfers)
npm install ethers

pnpm add @llm4agents/sdk
pnpm add ethers   # optional, gasless transfers only

yarn add @llm4agents/sdk
yarn add ethers   # optional, gasless transfers only

bun add @llm4agents/sdk
bun add ethers   # optional, gasless transfers only

Initialize

import { LLM4AgentsClient } from '@llm4agents/sdk';

const client = new LLM4AgentsClient({
  apiKey: process.env.LLM4AGENTS_API_KEY!,
  // Optional overrides:
  baseUrl: 'https://api.llm4agents.com',
  mcpUrl:  'https://mcp.llm4agents.com/mcp',
  timeout: 30_000,
});

Chat — completions, streaming, tool-loop conversations

Thin wrapper over POST /v1/chat/completions, plus a higher-level conversation() helper that maintains history and auto-executes MCP tool calls.

// Single completion
const res = await client.chat.completions.create({
  model: 'anthropic/claude-sonnet-4',
  messages: [{ role: 'user', content: 'Hello' }],
});

// Streaming
const stream = await client.chat.completions.create({
  model: 'anthropic/claude-sonnet-4',
  messages: [{ role: 'user', content: 'Count to 10' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

// Conversation with auto tool-execution loop
const conv = client.chat.conversation({
  model:  'anthropic/claude-sonnet-4',
  system: 'You are a research assistant',
  tools:  client.tools,
  onToolCall:   (name, args) => { console.log(`→ ${name}`); return true; },
  onToolResult: (name, result) => { console.log(`✓ ${name} (${result.length} chars)`); },
  maxToolRounds: 5,
});

const answer = await conv.say('Search for Bitcoin news and summarize the top 3');
console.log(answer.content);
console.log(answer.toolCalls); // ToolCallRecord[]

// Streaming conversation — typed StreamEvent
const events = await conv.stream('Now find the current price');
for await (const ev of events) {
  switch (ev.type) {
    case 'text':       process.stdout.write(ev.content); break;
    case 'tool_start': console.log(`\n[tool] ${ev.name}`); break;
    case 'tool_end':   console.log(`[done] ${ev.durationMs}ms`); break;
    case 'done':       console.log('\n', ev.response.usage); break;
  }
}

// History is a JSON-serializable array — persist anywhere
const saved = conv.messages;          // readonly ChatMessage[]
const rehydrated = client.chat.conversation({ model: '...', history: saved });

Wallets — generate, balance, transactions

Generated wallets are deposit-only top-up addresses for your LLM4Agents balance. Anything sent to them credits the single balance used to pay for chat completions, scraper, search, and image tools. Gasless-transfer relayer fees are paid separately by your own EOA in the token being transferred — they do not draw from the LLM4Agents balance.

const wallet = await client.wallets.generate({
  chain: 'polygon',
  token: 'USDC',
});
console.log(wallet.address);

const balance = await client.wallets.balance();
console.log(balance.availableUsd);
console.log(balance.wallets); // per-chain/token breakdown

const page = await client.wallets.transactions({ limit: 20, type: 'deposit' });
for (const tx of page.transactions) {
  console.log(`${tx.type}: $${tx.amountUsd} — ${tx.description}`);
}

Gasless transfers — non-custodial stablecoin sends

The SDK signs locally with the user's private key (via the ethers peer dep) and submits to POST /v1/tx/send. The platform never touches the key. See Gasless TX for the full flow and supported chains/tokens.

// One-call: quote + sign + submit
const result = await client.transfer.send({
  chain: 'polygon',
  token: 'USDC',
  to: '0xRecipient...',
  amount: '10.50',
  privateKey: process.env.EOA_PRIVATE_KEY!,
});
console.log(result.txHash, result.explorerUrl);

// Two-step: inspect the operator fee before signing
const quote = await client.transfer.quote({
  chain: 'polygon', token: 'USDC',
  from: '0xSender...', to: '0xRecipient...', amount: '10.50',
});
console.log(`Fee: ${quote.feeFormatted}`);
const submitted = await client.transfer.submit(quote, process.env.EOA_PRIVATE_KEY!);

client.transfer requires the ethers peer dep. If it isn't installed, the SDK throws a clear error on first use: "ethers is required for gasless transfers. Install it: npm install ethers".

MCP tools — scraper, search, image

Typed wrappers over the MCP Streamable HTTP endpoint at mcp.llm4agents.com/mcp. Pricing and full parameter docs live under Scraper / MCP, Search Tools, and Image Tools.

// Scraper (headless browser)
const html = await client.tools.scraper.fetchHtml({ url: 'https://example.com' });
const md   = await client.tools.scraper.markdown({ url: 'https://example.com' });
const shot = await client.tools.scraper.screenshot({ url: 'https://example.com', fullPage: true });

// Search (Google via Serper)
const results = await client.tools.search.google({ q: 'TypeScript SDK design' });
const news    = await client.tools.search.googleNews({ q: 'Bitcoin', tbs: 'qdr:d' });
const places  = await client.tools.search.googleMaps({ q: 'coffee near me' });

// Image (generate, edit, analyze)
const img      = await client.tools.image.generate({ prompt: 'A robot writing code' });
const edited   = await client.tools.image.edit({ prompt: 'Make it blue', imageUrl: '...' });
const analysis = await client.tools.image.analyze({ prompt: 'What is this?', imageUrl: '...' });

// Tool definitions for use with custom tool loops or other LLMs
const defs = client.tools.definitions; // ToolDefinition[]

Models

const models = await client.models.list();
for (const m of models) {
  console.log(`${m.slug} — $${m.inputPricePer1m}/1M in, $${m.outputPricePer1m}/1M out`);
}

Error handling

Every failure surfaces as a single LLM4AgentsError with a stable code, the upstream statusCode (when applicable), and the requestId from the x-request-id response header.

import { LLM4AgentsError } from '@llm4agents/sdk';

try {
  await client.chat.completions.create({ model: '...', messages: [...] });
} catch (err) {
  if (err instanceof LLM4AgentsError) {
    switch (err.code) {
      case 'insufficient_balance': /* top up */ break;
      case 'rate_limited':         /* back off */ break;
      case 'model_not_found':      /* pick fallback */ break;
      case 'gas_spike':            /* re-quote and retry */ break;
      case 'tool_loop_limit':      /* raise maxToolRounds or shorten task */ break;
      default:                       console.error(err.code, err.requestId);
    }
  }
}

Error codes mirror the REST/MCP responses — see Errors for the full list. The @llm4agents/gasless standalone package is superseded by this SDK and uses the same code values, so migrating only requires renaming the import and error class.

Endpoints Reference

POST /api/v1/agents/register Public Register a new agent

Anti-spam: deposit within 15 minutes. Newly registered agents that do not receive any deposit (across any supported chain/token) within ~15 minutes are automatically deleted to prevent database saturation. A deposit of even 1 cent permanently exempts the agent from this sweep. Recommended onboarding: register → POST /api/v1/wallets/generate → fund the returned address → start using the API. Save your apiKey immediately; it cannot be retrieved later.

Request Body

Field	Type	Description
name	string	required Agent name (1-100 chars)

Response `201`

{
  "uuid": "a1b2c3d4-e5f6-...",
  "apiKey": "sk-proxy-abc123...",
  "name": "my-agent",
  "createdAt": "2026-04-14T12:00:00.000Z",
  "requestId": "req_...",
  "depositDeadline": "2026-04-14T12:15:00.000Z",
  "depositRequiredWithinMinutes": 15,
  "notice": "Anti-spam protection: this agent will be automatically deleted if no deposit is received within 15 minutes. You can simply register a new agent if that happens. Once the first deposit (any amount) is credited, the agent becomes permanent. This measure exists solely to prevent malicious actors from generating agents indefinitely."
}
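
Because an unfunded agent is deleted once depositDeadline passes, a client may want to know how much time is left before funding the wallet. The following is a minimal Python sketch, assuming the 201 response has already been parsed into a dict; `seconds_until_deadline` is a hypothetical helper name, not part of any SDK.

```python
from datetime import datetime, timezone


def seconds_until_deadline(register_response: dict, now: datetime) -> float:
    """Seconds left before the anti-spam sweep may delete an unfunded agent.

    A negative result means the deadline has already passed.
    """
    # The example timestamps end in "Z"; rewrite it as an explicit UTC offset
    # because datetime.fromisoformat() only accepts "Z" from Python 3.11 on.
    raw = register_response["depositDeadline"].replace("Z", "+00:00")
    return (datetime.fromisoformat(raw) - now).total_seconds()


registered = {
    "createdAt": "2026-04-14T12:00:00.000Z",
    "depositDeadline": "2026-04-14T12:15:00.000Z",
}
five_minutes_in = datetime(2026, 4, 14, 12, 5, tzinfo=timezone.utc)
remaining = seconds_until_deadline(registered, five_minutes_in)
print(remaining)  # 600.0
```

Passing `now` explicitly keeps the helper easy to test; in real use it would be `datetime.now(timezone.utc)`.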

GET /healthz Public Health check

Response `200`

{
  "status": "ok",
  "service": "llm-proxy-api",
  "timestamp": "2026-04-14T12:00:00.000Z"
}

POST /api/v1/wallets/generate Auth Required Generate deposit wallet

Generates a unique deposit wallet address for your agent. Top-up only: the wallet exists solely to fund your LLM4Agents balance — anything sent to it is credited to the single balance that pays for chat completions, scraper, search, and image tools. Gasless-transfer relayer fees are not paid from this balance: they are paid by your own EOA in the token being transferred. Idempotent: calling with the same chain/token returns the existing wallet.

Request Body

Field	Type	Description
chain	string	required `"solana"` or `"polygon"`
token	string	required `"USDT"` or `"USDC"`

Response `200`

{
  "chain": "solana",
  "token": "USDC",
  "address": "7xKX...",
  "createdAt": "2026-04-14T12:00:00.000Z",
  "requestId": "req_..."
}

GET /api/v1/balance Auth Required Check balance

Response `200`

{
  "uuid": "a1b2c3d4-...",
  "availableUsdCents": 5000,
  "availableUsd": "50.00",
  "totalDepositedUsd": "100.00",
  "totalSpentUsd": "50.00",
  "requestId": "req_..."
}
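
One way to use this response is as a pre-flight check before an expensive call. The sketch below assumes the JSON body is already parsed; `can_afford` and its `safety_margin_cents` parameter are illustrative names. It works on the integer availableUsdCents field so no floating-point rounding is involved.

```python
from decimal import Decimal


def can_afford(balance_response: dict, estimated_cost_cents: int,
               safety_margin_cents: int = 0) -> bool:
    """True if the available balance covers the estimate plus a margin."""
    needed = estimated_cost_cents + safety_margin_cents
    return balance_response["availableUsdCents"] >= needed


balance = {"availableUsdCents": 5000, "availableUsd": "50.00"}

# In the example response the string field is the same amount in dollars.
same_amount = Decimal(balance["availableUsd"]) * 100 == balance["availableUsdCents"]

print(can_afford(balance, 108))        # True
print(can_afford(balance, 4950, 100))  # False
```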

GET /api/v1/models Auth Required List available models

Query Parameters

Field	Type	Description
search	string	optional Filter models by slug or display name

Response `200`

{
  "models": [
    {
      "slug": "anthropic/claude-3-haiku",
      "displayName": "Claude 3 Haiku",
      "provider": "anthropic",
      "inputPricePer1M": 0.3,
      "outputPricePer1M": 1.5,
      "contextWindow": 200000,
      "lastSyncedAt": "2026-04-14T06:00:00.000Z"
    }
  ],
  "requestId": "req_..."
}

inputPricePer1M and outputPricePer1M are the per-million-token prices in USD. These are the prices you pay.
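
As a worked example of the per-million pricing, the sketch below estimates the cost of one request from a model entry and a token count. `estimate_chat_cost_usd` is a hypothetical helper. The result is only an estimate: the amount actually billed is the one reported in the response headers of the chat endpoint.

```python
from decimal import Decimal


def estimate_chat_cost_usd(model: dict, prompt_tokens: int,
                           completion_tokens: int) -> Decimal:
    """Estimate cost in USD from the per-1M-token prices of a model entry."""
    # Go through str() so JSON floats such as 0.3 become exact decimals.
    input_price = Decimal(str(model["inputPricePer1M"]))
    output_price = Decimal(str(model["outputPricePer1M"]))
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / Decimal(1_000_000)


haiku = {
    "slug": "anthropic/claude-3-haiku",
    "inputPricePer1M": 0.3,
    "outputPricePer1M": 1.5,
}
# 12 prompt tokens and 9 completion tokens:
# (12 * 0.3 + 9 * 1.5) / 1,000,000 = 17.1 / 1,000,000 USD
cost = estimate_chat_cost_usd(haiku, prompt_tokens=12, completion_tokens=9)
print(cost)
```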

POST /v1/chat/completions Auth Required Chat completion (main endpoint)

OpenAI-compatible chat completions endpoint. This is the primary endpoint your agent will use.

Request Body

Field	Type	Description
model	string	required Model slug, e.g. `"anthropic/claude-3-haiku"`
messages	array	required Array of message objects with `role` and `content`
temperature	number	optional Sampling temperature (0-2)
max_tokens	integer	optional Maximum output tokens (default: 4096)
stream	boolean	optional Enable streaming responses

Extra fields are passed through to the upstream provider.

Response `200`

{
  "id": "gen-abc123",
  "object": "chat.completion",
  "model": "anthropic/claude-3-haiku",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}

Response Headers

Header	Description
`X-Cost-Usd-Cents`	Actual cost of this request in USD cents
`X-Tokens-Input`	Number of input (prompt) tokens
`X-Tokens-Output`	Number of output (completion) tokens
`X-Balance-Remaining-Cents`	Your remaining balance in USD cents
`X-Request-Id`	Unique request identifier
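
These headers can be collected into a small record for logging or budget tracking. The sketch below is one possible shape, assuming the header values are plain decimal strings (the exact value format is not specified here). `ChatBilling` and `parse_chat_billing` are hypothetical names, and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Mapping


@dataclass(frozen=True)
class ChatBilling:
    cost_cents: Decimal
    tokens_input: int
    tokens_output: int
    balance_remaining_cents: Decimal
    request_id: str


def parse_chat_billing(headers: Mapping[str, str]) -> ChatBilling:
    """Extract the billing headers from a chat completion response."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return ChatBilling(
        cost_cents=Decimal(lowered["x-cost-usd-cents"]),
        tokens_input=int(lowered["x-tokens-input"]),
        tokens_output=int(lowered["x-tokens-output"]),
        balance_remaining_cents=Decimal(lowered["x-balance-remaining-cents"]),
        request_id=lowered["x-request-id"],
    )


# Hypothetical header values, for illustration only.
billing = parse_chat_billing({
    "X-Cost-Usd-Cents": "3",
    "X-Tokens-Input": "120",
    "X-Tokens-Output": "85",
    "X-Balance-Remaining-Cents": "4997",
    "x-request-id": "req_example",
})
print(billing.cost_cents, billing.balance_remaining_cents)
```

Decimal is used for the cent amounts so that fractional-cent values, if the server ever sends them, are not rounded by float parsing.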

POST /v1/embeddings Auth Required Create text embeddings (OpenAI-compatible)

OpenAI-compatible embeddings endpoint. Returns one vector per input string. Embeddings are billed input-only (no completion tokens). The model slug must be one of the embedding models in the catalog — call GET /api/v1/models?search=embed to list them.

Request Body

Field	Type	Description
model	string	required Embedding model slug, e.g. `"openai/text-embedding-3-large"`
input	string | string[]	required Single string or array of up to 2048 strings to embed
encoding_format	string	optional `"float"` (default) or `"base64"`
dimensions	integer	optional Override vector dimensionality (only honored by models that support it)
user	string	optional Opaque end-user identifier passed through to upstream

Response `200`

{
  "object": "list",
  "data": [
    { "object": "embedding", "embedding": [0.012, -0.034, ...], "index": 0 }
  ],
  "model": "openai/text-embedding-3-large",
  "usage": { "prompt_tokens": 5, "total_tokens": 5 }
}

Response Headers

Header	Description
`X-Cost-Usd-Cents`	Actual cost of this request in USD cents
`X-Tokens-Input`	Number of input (prompt) tokens
`X-Balance-Remaining-Cents`	Your remaining balance in USD cents
`X-Model-Used`	Slug of the model that actually responded
`X-Request-Id`	Unique request identifier
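
Since input accepts at most 2048 strings per call, a larger corpus has to be split across several requests. The sketch below only builds the request bodies and sends nothing. `batch_inputs` and `embedding_requests` are hypothetical helpers.

```python
from typing import Iterator, List, Sequence

MAX_INPUTS_PER_REQUEST = 2048  # limit stated for the `input` field


def batch_inputs(texts: Sequence[str],
                 size: int = MAX_INPUTS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


def embedding_requests(model: str, texts: Sequence[str]) -> List[dict]:
    """Build one POST /v1/embeddings body per batch, preserving order."""
    return [{"model": model, "input": batch} for batch in batch_inputs(texts)]


corpus = [f"document {i}" for i in range(5000)]
bodies = embedding_requests("openai/text-embedding-3-large", corpus)
print([len(body["input"]) for body in bodies])  # [2048, 2048, 904]
```

If the index field restarts at 0 in every response, as it does in the single-input example above, the caller needs to add the batch offset when stitching results back together.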

POST /v1/audio/speech Auth Required Text-to-speech (OpenAI-compatible)

OpenAI-compatible text-to-speech endpoint, proxied to OpenRouter TTS models. Returns raw audio bytes (not JSON) with the upstream Content-Type. Supported today: x-ai/grok-voice-tts-1.0, with voices eve, ara, rex, sal, and leo (case-insensitive) and 20+ auto-detected languages. Balance-only: x402 walk-up is not available on this REST endpoint; use the MCP text_to_speech tool instead. TTS models do not appear in GET /api/v1/models.

Request Body

Field	Type	Description
model	string	required TTS model slug, e.g. `"x-ai/grok-voice-tts-1.0"`
input	string	required Text to synthesize verbatim, up to 15000 characters
voice	string	required Voice id, e.g. `"sal"` (case-insensitive)
response_format	string	optional `"mp3"` (default), `"wav"`, or `"pcm"`
speed	number	optional Playback speed multiplier
provider	object	optional Forwarded verbatim to OpenRouter provider routing

Example Request

curl -X POST https://api.llm4agents.com/v1/audio/speech \
  -H "Authorization: Bearer $API_KEY" -H "Content-Type: application/json" \
  -d '{"model":"x-ai/grok-voice-tts-1.0","input":"Estás en el interior de una pirámide.","voice":"sal","response_format":"mp3"}' \
  --output speech.mp3

Response 200 body is raw audio bytes (e.g. audio/mpeg for mp3) — not JSON.

Response Headers

Header	Description
`X-Charged-Usd-Cents`	Actual cost of this request in USD cents, billed per input character
`X-Model-Used`	Slug of the TTS model that responded
`X-Request-Id`	Unique request identifier

Billed per input character, with a minimum of 1¢ per request. The charge is debited from your balance before the upstream call and refunded in full on upstream failure. x-ai/grok-voice-tts-1.0 costs $15 per million characters upstream, plus the standard platform fee.
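
The per-character billing can be approximated ahead of time. The sketch below makes three assumptions that are not confirmed for this endpoint: the platform fee is the 20% quoted in the video reserve table, rounding is half-up to whole cents, and a character is one Unicode code point as counted by Python's `len`. `estimate_tts_cents` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP


def estimate_tts_cents(text: str,
                       usd_per_million_chars: Decimal = Decimal("15"),
                       platform_fee: Decimal = Decimal("0.20")) -> int:
    """Estimate the charge for one TTS request, in whole USD cents."""
    upstream_usd = Decimal(len(text)) * usd_per_million_chars / Decimal(1_000_000)
    total_cents = upstream_usd * (1 + platform_fee) * 100
    # Assumed rounding mode; the minimum charge of 1 cent is documented.
    rounded = int(total_cents.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    return max(1, rounded)


short = estimate_tts_cents("Estás en el interior de una pirámide.")
longest = estimate_tts_cents("a" * 15000)
print(short, longest)  # 1 27
```

The short sentence from the example request costs well under a cent upstream, so the 1¢ minimum applies. A maximum-length input of 15000 characters works out to $0.225 upstream, or 27¢ with a 20% fee.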

POST /v1/videos Auth Required Generate video (async job)

Submits an asynchronous video generation job to OpenRouter, proxied through x-ai/grok-imagine-video-1.5 by default. Optionally seed the first frame with an image. This is a three-step flow: submit (POST /v1/videos, returns 202 immediately), poll (GET /v1/videos/:id until status is terminal), then download (GET /v1/videos/:id/content once completed). Balance-only — x402 walk-up is not available, use the MCP generate_video tool instead (also Bearer-only). Video models do not appear in GET /api/v1/models.

Request Body (`POST /v1/videos`)

Field	Type	Description
model	string	optional Video model slug (default `"x-ai/grok-imagine-video-1.5"`; also supports `"bytedance/seedance-1-5-pro"` for lower-cost generation)
prompt	string	required Text description of the video to generate, up to 2000 characters
image	string	optional First-frame reference image: an `https://` URL or a `data:image/{png|jpeg|webp};base64,...` URI
duration	integer	optional Duration in seconds, 1–15 (default `6`)
resolution	string	optional `"480p"`, `"720p"` (default), or `"1080p"`
aspect_ratio	string	optional `"16:9"` (default), `"9:16"`, `"1:1"`, `"4:3"`, `"3:4"`, `"3:2"`, or `"2:3"`
generate_audio	boolean	optional Generate a synchronized audio track (default `true`). Setting `false` is roughly half the upstream cost on models that bill for audio (e.g. seedance, veo) — reflected in the real-cost adjustment at completion.
seed	integer	optional Deterministic generation seed
provider	object	optional Forwarded verbatim to OpenRouter provider routing

Reserve pricing by resolution

These are temporary reserve caps, not the final price. The reserve is debited at submit time (before the upstream call); once the job completes, it is automatically adjusted down to the real upstream cost plus the platform fee. If the real cost exceeds the reserve, you are still only ever charged the reserve.

Resolution	Upstream rate	Reserve (incl. 20% platform fee)
480p	$0.08/s (grok-1.0: $0.05/s; seedance-1-5-pro: $0.04/s)	≈10¢/s (grok-1.0 ≈6¢/s; seedance ≈4.8¢/s)
720p (default)	$0.14/s (grok-1.0: $0.07/s; seedance-1-5-pro: $0.08/s)	≈17¢/s (grok-1.0 ≈9¢/s; seedance ≈9.6¢/s)
1080p	$0.25/s (seedance-1-5-pro: $0.12/s)	≈30¢/s (seedance ≈14.4¢/s)
Other/untabulated models	$0.50/s	≈60¢/s

Reserved as duration × per-second rate plus a per-request image-input charge when image is supplied (grok-1.5: 1¢; grok-1.0: 0.2¢), deterministically pre-charged at submit time. On completion, the charge is auto-adjusted to the real cost: the difference is refunded as a refund transaction with metadata.reason: "real_cost_adjustment" (see GET /api/v1/transactions) and GET /v1/videos/:id returns adjusted: true. Example: bytedance/seedance-1-5-pro at 720p with audio measures ≈$0.05184/s real upstream cost → final charge ≈6.2¢/s, versus a 9.6¢/s reserve. Rounding follows the standard house rule (round once, minimum 1¢).
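
The reserve-then-adjust arithmetic can be sketched as follows, using the seedance 720p figures from the paragraph above for a 6-second job. This is an illustration and not the platform's implementation: the image-input charge and the final rounding step are left out, and `reserve_cents` and `settle` are hypothetical names.

```python
from decimal import Decimal
from typing import Tuple

PLATFORM_FEE = Decimal("0.20")  # from the reserve table header


def reserve_cents(duration_s: int, upstream_usd_per_s: Decimal) -> Decimal:
    """Amount debited at submit time, in unrounded cents."""
    return duration_s * upstream_usd_per_s * (1 + PLATFORM_FEE) * 100


def settle(reserve: Decimal, real_upstream_usd: Decimal) -> Tuple[Decimal, Decimal]:
    """Return (final_charge, refund), both in unrounded cents.

    The final charge is the real cost plus fee, capped at the reserve.
    """
    real_cents = real_upstream_usd * (1 + PLATFORM_FEE) * 100
    final = min(reserve, real_cents)
    return final, reserve - final


# 6 s of bytedance/seedance-1-5-pro at 720p: $0.08/s tabulated upstream rate,
# about $0.05184/s measured real upstream cost.
reserve = reserve_cents(6, Decimal("0.08"))
final, refund = settle(reserve, 6 * Decimal("0.05184"))
print(reserve, final, refund)

# If the real cost came in above the reserve, the charge stays at the reserve.
capped_final, capped_refund = settle(reserve, Decimal("1.00"))
print(capped_final, capped_refund)
```

For this job the reserve is 57.6¢, the settled charge about 37.3¢, and the refunded difference about 20.3¢, which matches the 9.6¢/s and 6.2¢/s per-second figures quoted above.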

Example Request — submit with a reference image

curl -X POST https://api.llm4agents.com/v1/videos \
  -H "Authorization: Bearer $API_KEY" -H "Content-Type: application/json" \
  -d '{"model":"x-ai/grok-imagine-video-1.5","prompt":"A drone shot flying over a neon-lit cyberpunk city at night","image":"https://example.com/first-frame.jpg","duration":6,"resolution":"720p"}'

Response `202`

{
  "id": "video_abc123",
  "status": "pending",
  "polling_url": "/v1/videos/video_abc123",
  "charged_usd_cents": 108
}

Response Headers (`POST /v1/videos`)

Header	Description
`X-Charged-Usd-Cents`	Amount reserved for this job in USD cents
`X-Model-Used`	Slug of the video model used
`X-Request-Id`	Unique request identifier

Poll (`GET /v1/videos/:id`)

Each poll reconciles against OpenRouter and updates the stored status. Terminal statuses are completed, failed, cancelled, and expired — jobs left pending/in_progress for more than 24h are auto-expired and refunded by a background sweep.

curl https://api.llm4agents.com/v1/videos/video_abc123 \
  -H "Authorization: Bearer $API_KEY"

While in progress:

{ "id": "video_abc123", "status": "in_progress", "charged_usd_cents": 108 }

Once completed, adjusted becomes true after the submit-time reserve has been adjusted down to the real upstream cost plus fee. charged_usd_cents continues to show the original reserve; the refunded difference appears as a refund transaction with metadata.reason: "real_cost_adjustment" on GET /api/v1/transactions:

{
  "id": "video_abc123",
  "status": "completed",
  "video_url": "/v1/videos/video_abc123/content",
  "charged_usd_cents": 108,
  "adjusted": true
}

Download (`GET /v1/videos/:id/content`)

Returns raw video bytes with the upstream Content-Type (not JSON). Only available once status is completed; otherwise returns 409 job_not_ready.

curl https://api.llm4agents.com/v1/videos/video_abc123/content \
  -H "Authorization: Bearer $API_KEY" \
  --output video.mp4

If a job ends failed, cancelled, or expired, the full charge is refunded automatically — you never pay for a video that was not produced. The refund shows up as { "status": "failed", "error": "...", "refunded": true, "charged_usd_cents": 0 } on GET /v1/videos/:id.

Errors

HTTP	Code	Meaning
404	`job_not_found`	Job doesn't exist, or belongs to another agent.
409	`job_not_ready`	`/content` requested before the job reached `completed`.
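
The poll step of the submit, poll, download flow can be sketched as a small loop. To keep the example runnable offline, the HTTP call is injected as a function and replaced here by canned responses. `wait_for_video`, the 5-second interval, and the 120-poll limit are illustrative choices, not documented values.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_video(
    fetch_job: Callable[[], dict],
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status, then return it."""
    for attempt in range(max_polls):
        job = fetch_job()  # stands in for GET /v1/videos/:id
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"job still not terminal after {max_polls} polls")


# Simulated poll responses so the sketch runs without network access.
_responses = iter([
    {"id": "video_abc123", "status": "pending"},
    {"id": "video_abc123", "status": "in_progress"},
    {"id": "video_abc123", "status": "completed",
     "video_url": "/v1/videos/video_abc123/content"},
])
job = wait_for_video(lambda: next(_responses), sleep=lambda _seconds: None)
print(job["status"])  # completed
```

Only a job whose status is completed has content to download; requesting /content in any other state returns 409 job_not_ready.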

POST /v1/images/generations Auth Required Generate images synchronously (real-cost billing)

Synchronous, OpenAI-compatible image generation via OpenRouter's dedicated /api/v1/images endpoint, proxied through x-ai/grok-imagine-image-quality by default. Generates up to n=4 images per call, all-or-nothing. Different from /v1/image/* (singular) — those are the gateway's PiAPI-backed image tools (generate_image, edit_image, analyze_image, generate_ad_banner) billed at flat per-call rates; this endpoint is the OpenRouter-native generations API billed at real upstream cost. Balance-only — x402 walk-up is not available.

Request Body

Field	Type	Description
model	string	optional Image model slug (default `"x-ai/grok-imagine-image-quality"`)
prompt	string	required Text description of the image(s) to generate, up to 4000 characters
n	integer	optional Number of images to generate, 1–4 (default `1`)
resolution	string	optional `"512"`, `"1K"`, `"2K"`, or `"4K"` — drives the reserve multiplier
aspect_ratio	string	optional e.g. `"16:9"`, `"1:1"`
quality	string	optional `"auto"`, `"low"`, `"medium"`, or `"high"`
output_format	string	optional `"png"`, `"jpeg"`, `"webp"`, or `"svg"`
background	string	optional `"auto"`, `"transparent"`, or `"opaque"`
output_compression	integer	optional Compression level for lossy formats, 0–100
seed	integer	optional Deterministic generation seed
input_references	string[]	optional Up to 5 reference images, each an `https://` URL or a `data:image/{png|jpeg|webp};base64,...` URI
provider	object	optional Forwarded verbatim to OpenRouter provider routing
stream	boolean	optional Only `false` is accepted — streaming is not supported by this endpoint

Reserve caps by model (momentary upper bound)

Model	Reserve cap (per image)
x-ai/grok-imagine-image-quality (default)	$0.15
google/gemini-3.1-flash-image	$0.10
openai/gpt-image-1-mini	$0.10
openai/gpt-image-1	$0.30
openai/gpt-image-2	$0.30
Other/untabulated models	$0.50

Resolution multiplier on top of the per-image cap, before n: 512/1K ×1, 2K ×2, 4K ×4.

Charged at the real upstream usage.cost (plus the standard platform fee) — NOT a flat rate. The reserve cap table above is only a momentary upper bound, debited before the upstream call and settled down to the real cost afterward. Example: 1 image at the default resolution with x-ai/grok-imagine-image-quality reserves ~18¢ but typically settles to ~8-9¢. You are never charged more than the reserve, even if the real cost exceeds it. If the upstream response is missing usage.cost, the reserve is released and the call fails with 502 (nothing charged).
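
To see how the cap table, the resolution multiplier, and n combine, here is a sketch of the reserve calculation. It assumes the reserve includes a 20% platform fee, which is inferred from the ~18¢ example on a $0.15 cap, and that an omitted resolution uses a ×1 multiplier. `image_reserve_cents` is a hypothetical helper.

```python
from decimal import Decimal
from typing import Optional

RESERVE_CAP_USD = {
    "x-ai/grok-imagine-image-quality": Decimal("0.15"),
    "google/gemini-3.1-flash-image": Decimal("0.10"),
    "openai/gpt-image-1-mini": Decimal("0.10"),
    "openai/gpt-image-1": Decimal("0.30"),
    "openai/gpt-image-2": Decimal("0.30"),
}
FALLBACK_CAP_USD = Decimal("0.50")  # "Other/untabulated models"
RESOLUTION_MULTIPLIER = {"512": 1, "1K": 1, "2K": 2, "4K": 4}
PLATFORM_FEE = Decimal("0.20")  # assumption inferred from the ~18 cent example


def image_reserve_cents(model: str = "x-ai/grok-imagine-image-quality",
                        resolution: Optional[str] = None,
                        n: int = 1) -> Decimal:
    """Upper bound debited before the upstream call, in cents."""
    if not 1 <= n <= 4:
        raise ValueError("n must be between 1 and 4")
    cap = RESERVE_CAP_USD.get(model, FALLBACK_CAP_USD)
    # Assumption: no explicit resolution means the x1 multiplier.
    multiplier = RESOLUTION_MULTIPLIER[resolution] if resolution else 1
    return cap * multiplier * n * (1 + PLATFORM_FEE) * 100


default_reserve = image_reserve_cents()
large_reserve = image_reserve_cents("openai/gpt-image-1", resolution="4K", n=2)
print(default_reserve, large_reserve)
```

Under these assumptions the default call reserves 18¢, and two 4K images from openai/gpt-image-1 reserve 288¢. The settled charge depends on the real usage.cost and is normally lower.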

Example Request

curl -X POST https://api.llm4agents.com/v1/images/generations \
  -H "Authorization: Bearer $API_KEY" -H "Content-Type: application/json" \
  -d '{"prompt":"A robot writing code on a chalkboard","n":1}' \
  | jq -r '.data[0].b64_json' | base64 -d > out.png

Response `200`

{
  "created": 1753142400,
  "data": [
    { "b64_json": "iVBORw0KGgo...", "media_type": "image/png" }
  ],
  "usage": { "cost": 0.07 }
}
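
The jq pipeline in the example request has a straightforward Python equivalent. The sketch below decodes every entry in data and picks a file extension from media_type. `decode_images` and the file-naming scheme are hypothetical, and the sample payload is a stand-in, not a real image.

```python
import base64
from typing import List, Tuple

EXTENSIONS = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/webp": ".webp",
    "image/svg+xml": ".svg",
}


def decode_images(response: dict) -> List[Tuple[str, bytes]]:
    """Return (suggested_filename, raw_bytes) for each generated image."""
    files = []
    for index, item in enumerate(response["data"]):
        extension = EXTENSIONS.get(item.get("media_type", ""), ".bin")
        files.append((f"image_{index}{extension}",
                      base64.b64decode(item["b64_json"])))
    return files


# Stand-in payload: just the 8-byte PNG signature, base64-encoded.
png_signature = b"\x89PNG\r\n\x1a\n"
sample = {
    "data": [{"b64_json": base64.b64encode(png_signature).decode("ascii"),
              "media_type": "image/png"}],
    "usage": {"cost": 0.07},
}
files = decode_images(sample)
print(files[0][0])  # image_0.png
```

Writing each pair to disk is then a matter of opening the filename in binary mode and writing the bytes.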

Response Headers

Header	Description
`X-Charged-Usd-Cents`	Real settled cost of this request in USD cents (not the reserve)
`X-Model-Used`	Slug of the image model that responded
`X-Request-Id`	Unique request identifier

Errors

HTTP	Code	Meaning
400	`validation_error`	Bad request body — nothing charged.
402	`insufficient_balance`	The momentary reserve could not be debited — upstream never called.
404	`model_not_found`	OpenRouter rejected the model. Reserve released.
429	`rate_limited`	Too many requests.
502	`provider_error`	Upstream error, or the response was missing `usage.cost` (cannot be billed); timeouts also surface as 502 with `details.upstream_status: 504`. Reserve released.

POST /v1/tx/quote Auth Required Get relayer fee, nonces, and EIP-712 typed data to sign

Step 1 of the non-custodial gasless transfer. Returns the current Vexo relayer fee, the on-chain nonces for your wallet, and the two EIP-712 payloads (Permit + TransferPermit) that your EOA must sign locally. The platform never sees your private key. This endpoint is free — auth is only used to tag the request to your agent for rate-limit and audit purposes.

Request Body

Field	Type	Description
chain	string	required Target chain. Currently only `"polygon"`.
token	string	required Token symbol (`"USDC"`) or its contract address. Currently only `USDC` is wired on `polygon`.
from	string	required The EOA that will sign both permits and owns the token balance (20-byte hex, `0x`-prefixed). The platform does NOT provision this wallet — you bring your own key.
to	string	required Recipient address.
amount	string	required Amount in human-decimal form (e.g. `"1.50"`). Always pass a string — JSON numbers lose precision.

Response `200`

{
  "chain": "polygon",
  "chainId": 137,
  "token": "USDC",
  "tokenAddress": "0x3c499c542cEF5E3811e1192ce70d8cC03d5c3359",
  "forwarderAddress": "0xba9490B2A5c94AAc18fC7aBf19151757852FB5E7",
  "from": "0x9f...",
  "to": "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045",
  "amount": "1.50",
  "amountBaseUnits": "1500000",
  "fee": "50000",
  "feeFormatted": "0.05 USDC",
  "feeDecimal": "0.05",
  "deadline": 1756000000,
  "nonces": { "token": "3", "forwarder": "7" },
  "typedData": {
    "permit": { "domain": {...}, "types": {...}, "primaryType": "Permit", "message": {...} },
    "transferPermit": { "domain": {...}, "types": {...}, "primaryType": "TransferPermit", "message": {...} }
  },
  "requestId": "req_..."
}

Sign both typedData.permit and typedData.transferPermit with your wallet (e.g. wallet.signTypedData(domain, types, message) in ethers v6), split each raw signature into {v, r, s}, then call POST /v1/tx/send below. The deadline is 10 minutes from the quote; if you submit after it has passed, /v1/tx/send returns 400 expired_quote.
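
For clients that do not use ethers, the split step is plain byte slicing. A raw Ethereum signature is 65 bytes laid out as r (32 bytes), s (32 bytes), then v (1 byte). The sketch below uses only the standard library; `split_signature` is a hypothetical helper, and the sample input is a dummy value rather than a real signature.

```python
from typing import Dict, Union


def split_signature(raw_signature: str) -> Dict[str, Union[int, str]]:
    """Split a 65-byte hex signature (r || s || v) into its parts."""
    hex_body = raw_signature[2:] if raw_signature.startswith("0x") else raw_signature
    data = bytes.fromhex(hex_body)
    if len(data) != 65:
        raise ValueError(f"expected 65 bytes, got {len(data)}")
    v = data[64]
    if v in (0, 1):
        # Some signers emit the bare recovery id; the API expects 27/28.
        v += 27
    if v not in (27, 28):
        raise ValueError(f"unexpected v value: {v}")
    return {
        "v": v,
        "r": "0x" + data[:32].hex(),
        "s": "0x" + data[32:64].hex(),
    }


# Dummy signature: r is all 0x11 bytes, s is all 0x22 bytes, v is 0x1c (28).
dummy = "0x" + "11" * 32 + "22" * 32 + "1c"
parts = split_signature(dummy)
print(parts["v"])  # 28
```

The same function applies to both the Permit and the TransferPermit signature.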

POST /v1/tx/send Auth Required Submit the user-signed permits; relay broadcasts the transfer

Step 2 of the non-custodial gasless transfer. You submit the two signatures produced from the /quote typed data; the platform re-derives each EIP-712 digest, runs ecrecover on both signatures, verifies the recovered signer matches from, and forwards the bundle to Vexo. The on-chain transfer is executed by Vexo's relayer — the platform signs nothing, touches no keys, and pays no gas of yours. Free service; the request is logged against your API key solely for audit.

Request Body

Field	Type	Description
chain, token, from, to, amount	string	required Same values you sent to `/v1/tx/quote`.
fee	string	required Fee in token base units, echoed verbatim from the quote's `fee` field.
deadline	integer	required Unix seconds. Echo the quote's `deadline`.
nonces.token, nonces.forwarder	string	required Uint strings echoed from the quote.
permitSig	{ v, r, s }	required Split signature of `typedData.permit`. `v` is 27/28, `r`/`s` are 32-byte hex.
transferPermitSig	{ v, r, s }	required Split signature of `typedData.transferPermit`, same shape.
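
Because most fields are echoed from the quote, the request body can be assembled mechanically. The sketch below builds it from a parsed quote and two already-split signatures, and refuses to build a body for a quote whose deadline has passed. `build_send_body` is a hypothetical helper, nesting nonces as an object is an assumption based on the dotted field names in the table, and the signature values are placeholders.

```python
import time
from typing import Optional


def build_send_body(quote: dict, permit_sig: dict, transfer_permit_sig: dict,
                    now: Optional[float] = None) -> dict:
    """Assemble the POST /v1/tx/send body by echoing fields from the quote."""
    current = time.time() if now is None else now
    if current >= quote["deadline"]:
        raise ValueError("quote expired; request a new one from /v1/tx/quote")
    body = {key: quote[key] for key in ("chain", "token", "from", "to", "amount")}
    body["fee"] = quote["fee"]            # base units, echoed verbatim
    body["deadline"] = quote["deadline"]  # Unix seconds
    body["nonces"] = {
        "token": quote["nonces"]["token"],
        "forwarder": quote["nonces"]["forwarder"],
    }
    body["permitSig"] = permit_sig
    body["transferPermitSig"] = transfer_permit_sig
    return body


# Values copied from the example quote response; addresses stay elided.
quote = {
    "chain": "polygon", "token": "USDC",
    "from": "0x9f...", "to": "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045",
    "amount": "1.50", "fee": "50000", "deadline": 1756000000,
    "nonces": {"token": "3", "forwarder": "7"},
}
placeholder_sig = {"v": 27, "r": "0x...", "s": "0x..."}  # not a real signature
body = build_send_body(quote, placeholder_sig, placeholder_sig, now=1755999900)
print(sorted(body))
```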

Response `200`

{
  "txHash": "0xab...",
  "explorerUrl": "https://polygonscan.com/tx/0xab...",
  "from": "0x9f...",
  "to": "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045",
  "chain": "polygon",
  "chainId": 137,
  "token": "USDC",
  "tokenAddress": "0x3c499c542cEF5E3811e1192ce70d8cC03d5c3359",
  "amount": "1.50",
  "amountBaseUnits": "1500000",
  "feeBaseUnits": "50000",
  "feeDecimal": "0.05",
  "requestId": "req_..."
}

Errors

HTTP	Code	Meaning
400	`validation_error`	Bad address / non-decimal amount / malformed signature.
400	`invalid_signature`	`ecrecover` on `permitSig` or `transferPermitSig` did not match `from`. Response includes `expected_signer` and `recovered_signer`.
400	`expired_quote`	`deadline` has already passed. Call `/v1/tx/quote` again to get a fresh one.
400	`unsupported_chain` / `unsupported_token`	Chain or token not currently wired on this deployment.
409	`gasless_gas_spike`	Vexo's fee changed since the quote. Safe to retry by requesting a new quote.
502	`gasless_upstream_error`	Vexo returned an upstream error while broadcasting.
503	`gasless_operator_unavailable`	Vexo's relayer operator is temporarily unavailable.
504	`gasless_timeout`	Submit timed out. The transfer may still land on-chain.
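
One way to act on these codes is a small lookup from error code to next step. The groupings below are a suggested reading of the table, not documented behaviour: the two re-quote cases follow the table directly, while the handling of 502, 503, and 504 is a conservative assumption. `next_action` and the action labels are hypothetical.

```python
ACTIONS = {
    # The request itself is wrong; resending it unchanged cannot succeed.
    "validation_error": "fix_request",
    "invalid_signature": "fix_request",
    "unsupported_chain": "fix_request",
    "unsupported_token": "fix_request",
    # The table directs the caller back to /v1/tx/quote for these two.
    "expired_quote": "requote",
    "gasless_gas_spike": "requote",
    # Assumption: a temporarily unavailable operator is worth retrying later.
    "gasless_operator_unavailable": "retry_later",
    # The transfer may already be on-chain, so check before sending again.
    "gasless_timeout": "verify_on_chain",
    # Conservative assumption: treat a broadcast-time failure like a timeout.
    "gasless_upstream_error": "verify_on_chain",
}


def next_action(error_code: str) -> str:
    """Map a /v1/tx/send error code to a suggested next step."""
    return ACTIONS.get(error_code, "give_up")


print(next_action("gasless_gas_spike"))  # requote
print(next_action("gasless_timeout"))    # verify_on_chain
```

The distinction that matters most is between errors where nothing was broadcast, which are safe to re-quote, and errors where the transfer may have landed, where resubmitting blindly risks sending twice.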

GET /api/v1/transactions Auth Required Transaction history

Query Parameters

Field	Type	Description
type	string	optional Filter by transaction type: `"deposit"`, `"usage"`, or `"refund"`. Omit to return all types.
service	string	optional Filter by service: `"llm"` (chat completions + embeddings, default), `"tts"` (audio speech), `"video"` (video generation), `"scraper"`, `"search"`, `"image"`, `"workspace"`, `"tools"`, `"bridge"`. Omit to return all services. Note: MCP tool calls (`text_to_speech`, `generate_video`) are billed under `service=tools`; `tts`/`video` cover the REST endpoints.
limit	integer	optional Number of results (1-100, default: 50)
offset	integer	optional Pagination offset (default: 0)

Examples

# All transactions
GET /api/v1/transactions

# Only deposits
GET /api/v1/transactions?type=deposit

# Only usage, second page of 20
GET /api/v1/transactions?type=usage&limit=20&offset=20

Response `200`

{
  "transactions": [
    {
      "id": 3,
      "type": "usage",
      "service": "tts",
      "amountUsdCents": 1,
      "model": "x-ai/grok-voice-tts-1.0",
      "promptTokens": null,
      "completionTokens": null,
      "totalTokens": null,
      "chain": null,
      "txHash": null,
      "metadata": {"model": "x-ai/grok-voice-tts-1.0", "chars": 37},
      "description": "TTS: x-ai/grok-voice-tts-1.0 (37 chars)",
      "createdAt": "2026-04-16T10:00:00.000Z"
    },
    {
      "id": 2,
      "type": "deposit",
      "service": "llm",
      "amountUsdCents": 5000,
      "model": null,
      "promptTokens": null,
      "completionTokens": null,
      "totalTokens": null,
      "chain": "polygon",
      "txHash": "0xabc...",
      "metadata": null,
      "description": "Deposit of 50.00 USDT on polygon",
      "createdAt": "2026-04-15T09:15:00.000Z"
    },
    {
      "id": 1,
      "type": "usage",
      "service": "llm",
      "amountUsdCents": 3,
      "model": "anthropic/claude-3-haiku",
      "promptTokens": 120,
      "completionTokens": 85,
      "totalTokens": 205,
      "chain": null,
      "txHash": null,
      "metadata": null,
      "description": "Chat completion: anthropic/claude-3-haiku",
      "createdAt": "2026-04-14T12:05:00.000Z"
    }
  ],
  "limit": 50,
  "offset": 0,
  "total": 3,
  "requestId": "req_..."
}
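
To walk the full history, a client can step `offset` by `limit` until it reaches `total`. The sketch below assumes `total` counts all rows matching the filters (the example response is consistent with that but does not state it). `fetch_page` is a hypothetical callable that performs the authenticated GET and returns the parsed JSON body.

```python
from urllib.parse import urlencode

BASE = "https://api.llm4agents.com/api/v1/transactions"


def transactions_url(tx_type=None, service=None, limit=50, offset=0):
    """Build the query URL; omitted filters are left out of the query string."""
    params = {"limit": limit, "offset": offset}
    if tx_type is not None:
        params["type"] = tx_type
    if service is not None:
        params["service"] = service
    return f"{BASE}?{urlencode(params)}"


def iter_transactions(fetch_page, limit=50, **filters):
    """Yield every transaction, one page at a time."""
    offset = 0
    while True:
        page = fetch_page(transactions_url(limit=limit, offset=offset, **filters))
        rows = page["transactions"]
        yield from rows
        offset += limit
        # Stop at the reported total, or on an empty page as a safety net.
        if not rows or offset >= page["total"]:
            break
```

Usage would look like `for tx in iter_transactions(fetch_page, limit=20, tx_type="usage"): ...`.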

Scraper / MCP

A headless browser service exposed via the Model Context Protocol (MCP). Your agent connects to the MCP server and gains access to tools for fetching pages, taking screenshots, extracting data, and running persistent browser sessions — all with optional proxy support (datacenter or residential) and anti-detection stealth.

Connecting

The MCP server is hosted at a separate endpoint. Authenticate with the same API key used for the main API.

MCP Endpoint: https://mcp.llm4agents.com/mcp
Method:       POST
Auth:         Authorization: Bearer YOUR_API_KEY
Protocol:     MCP Streamable HTTP (JSON response mode)

All tools accept a proxy_tier parameter: "none" (direct), "datacenter", or "residential". Proxy tier affects pricing. Billing follows the same reserve-settle pattern as chat completions.
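
As a rough illustration, a tool call can be built with only the Python standard library. This sketch assumes the scraper tools use the same JSON-RPC `tools/call` envelope this reference shows for its AI tools, and that the server accepts the `Accept` header value the MCP Streamable HTTP transport expects. Whether an MCP `initialize` handshake is required first is not stated here, so a full MCP client library may be the safer choice.

```python
import json
import urllib.request

MCP_URL = "https://mcp.llm4agents.com/mcp"


def build_tool_call(api_key, name, arguments, request_id=1):
    """Return an unsent POST request for one MCP tools/call invocation."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Assumption: both media types are listed, per the MCP transport spec.
            "Accept": "application/json, text/event-stream",
        },
    )


request = build_tool_call(
    "YOUR_API_KEY",
    "fetch_html",
    {"url": "https://example.com", "proxy_tier": "none"},
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```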

Pricing

All prices are shown in US dollars; most calls cost a fraction of a cent. Billing is sub-cent, so you pay only for what you use.

One-Shot Tool Prices

Tool	No Proxy	Datacenter	Residential
`fetch_html`	$0.0007	$0.0009	$0.0037
`markdown`	$0.0010	$0.0012	$0.0040
`links`	$0.0007	$0.0009	$0.0037
`screenshot`	$0.0010	$0.0012	$0.0040
`pdf`	$0.0012	$0.0014	$0.0042
`extract`	$0.0012	$0.0014	$0.0042

Session Pricing

Sessions are billed by duration plus number of actions. The worst-case cost is reserved upfront and settled to actual usage when the session is closed.

Duration	Actions	No Proxy	Datacenter	Residential
30s	3	$0.009	$0.011	$0.015
2 min	10	$0.015	$0.021	$0.037
5 min	30	$0.040	$0.048	$0.087
5 min (max)	50	$0.052	$0.060	$0.099

Session Limits

Limit	Default
Max duration	300 seconds (5 minutes)
Max actions per session	50
Concurrent sessions per agent	2
Tool timeout	30 seconds
Max payload size	5 MB

Prices and limits are configurable by the platform administrator and may change. The values shown here are the current defaults.

One-Shot Tools

Each tool opens a browser, performs one action, returns the result, and closes the browser.

TOOL fetch_html Fetch the full rendered HTML of a page (optionally escalates proxy tier on failure) ▸

Parameters

Field	Type	Description
url	string	required URL to fetch (max 2048 chars)
timeout_ms	integer	optional Per-attempt navigation timeout (1000–10000 ms, default 10000). With `auto_fallback`, the chain runs up to 3 attempts.
proxy_tier	string	required Starting tier: `"none"`, `"datacenter"`, or `"residential"`. By default the requested tier is honored exactly.
auto_fallback	boolean	optional Default `false`. When `true`, escalate to higher tiers if the requested one fails.

Strict mode (default)

By default, fetch_html uses a 10s navigation timeout, a domcontentloaded ready-state (so heavy SPAs do not stall on persistent network activity), and tries the requested proxy_tier exactly once. none means no proxy, no escalation.

Auto-fallback (`auto_fallback: true`)

If a tier fails (timeout, network error, 5xx, etc.) the tool transparently retries with the next tier:

none → datacenter → residential
datacenter → residential
residential (no fallback)

You are billed at the tier that actually returned the page, not the worst case attempted. If every tier fails, the tool returns a soft-error result (no exception) so the model can decide what to do next, billed at the originally requested (cheapest) tier. Deterministic errors (SSRF block, payload too large) abort the chain immediately.
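
The escalation order and billing rule described above can be modeled locally, which is handy for estimating worst-case attempts before a call. This is a sketch of the documented behavior, not the server's implementation; the function names are made up for illustration.

```python
TIERS = ["none", "datacenter", "residential"]


def fallback_chain(start, auto_fallback=False):
    """Tiers that would be attempted, in order, for a given starting tier."""
    if start not in TIERS:
        raise ValueError(f"unknown proxy_tier: {start!r}")
    if not auto_fallback:
        return [start]  # strict mode: the requested tier is tried exactly once
    return TIERS[TIERS.index(start):]


def billed_tier(start, succeeded_tier=None):
    """Tier that is billed: the one that returned the page, else the requested one."""
    return succeeded_tier if succeeded_tier is not None else start


print(fallback_chain("none", auto_fallback=True))
print(billed_tier("none", succeeded_tier="datacenter"))
```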

Returns — success

{
  "ok": true,
  "tier_used": "datacenter",
  "html": "<html>...</html>",
  "status": 200,
  "finalUrl": "https://...",
  "attempts": [{ "tier": "none", "error": "Navigation timeout of 10000 ms exceeded" }]
}

Returns — total failure

{
  "ok": false,
  "tier_used": "none",
  "attempts": [
    { "tier": "none",         "error": "..." },
    { "tier": "datacenter",   "error": "..." },
    { "tier": "residential",  "error": "..." }
  ],
  "reason": "All proxy tiers failed. Last error: ...",
  "hint": "The page may be down, blocking automated requests, or unreachable. Consider trying a different URL, a search query, or another tool.",
  "url": "https://..."
}

TOOL markdown Convert a page to markdown ▸

Parameters

Field	Type	Description
url	string	required URL to convert
selector	string	optional CSS selector to scope conversion
proxy_tier	string	required `"none"`, `"datacenter"`, or `"residential"`

Returns

{ "markdown": "# Page Title\n\nContent...", "title": "Page Title", "url": "https://..." }

TOOL links Extract all links from a page ▸

Parameters

Field	Type	Description
url	string	required URL of the page to extract links from
same_origin_only	boolean	optional Only return links from the same origin
proxy_tier	string	required `"none"`, `"datacenter"`, or `"residential"`

Returns

{ "links": [{ "href": "https://...", "text": "Link text", "rel": "" }] }

TOOL screenshot Take a screenshot of a page ▸

Parameters

Field	Type	Description
url	string	required URL of the page to capture
selector	string	optional CSS selector to screenshot a specific element
full_page	boolean	optional Capture full scrollable page
viewport	object	optional `{ width, height }` (320-3840 x 240-2160)
proxy_tier	string	required `"none"`, `"datacenter"`, or `"residential"`

Returns

{ "pngBase64": "iVBOR...", "width": 1920, "height": 1080, "bytes": 184320 }

TOOL pdf Generate a PDF of a page ▸

Parameters

Field	Type	Description
url	string	required URL of the page to render as PDF
format	string	optional `"A4"`, `"Letter"`, or `"Legal"` (default: A4)
proxy_tier	string	required `"none"`, `"datacenter"`, or `"residential"`

Returns

{ "pdfBase64": "JVBERi0...", "bytes": 52480 }

TOOL extract Extract structured data using CSS selectors ▸

Parameters

Field	Type	Description
url	string	required URL of the page to extract from
selectors	object	required Map of `name → CSS selector`. Each selector extracts text from matching elements.
proxy_tier	string	required `"none"`, `"datacenter"`, or `"residential"`

Example

{ "url": "https://example.com", "selectors": { "title": "h1", "prices": ".price" }, "proxy_tier": "none" }

Returns

{ "data": { "title": "Example", "prices": ["$10", "$20", "$30"] } }

Single-match selectors return a string; multi-match selectors return an array of strings.
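
Because a field can come back as either a string or an array, callers may want to normalize values before iterating. A small sketch follows; how a selector with no matches is represented is not documented, so treating `None` as an empty list is an assumption.

```python
def as_list(value):
    """Normalize an extract() field to a list of strings."""
    if value is None:
        return []  # assumption: a selector with no matches may yield null
    if isinstance(value, str):
        return [value]  # single-match selectors return a bare string
    return list(value)


data = {"title": "Example", "prices": ["$10", "$20", "$30"]}
normalized = {name: as_list(value) for name, value in data.items()}
```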

Session Tools

Persistent browser sessions let you navigate, click, type, and extract across multiple pages without reopening the browser. Sessions last up to 5 minutes and are capped at a configurable number of actions (50 by default).

TOOL session_create Start a persistent browser session ▸

Opens a browser that stays alive for up to 5 minutes. The worst-case cost is reserved upfront and settled on close. The concurrent-session limit applies per agent.

Parameters

Field	Type	Description
proxy_tier	string	required `"none"`, `"datacenter"`, or `"residential"`
initial_url	string	optional Navigate to this URL immediately after launch

Returns

{ "session_id": "a1b2c3d4-...", "expires_at": "2026-04-17T12:05:00.000Z" }

TOOL session_exec Execute an action inside a session ▸

Parameters

Field	Type	Description
session_id	string	required
action	object	required Action to execute (see below)

Action Types

type	Fields	Returns
`goto`	`url`	`{ status, url }`
`click`	`selector`	`{ clicked }`
`type`	`selector`, `text`	`{ typed }`
`wait_for`	`selector`, `timeout_ms?`	`{ found }`
`scroll`	`to`: `"top"` \| `"bottom"` \| `{ y }`	`{ scrolled }`
`get_html`	`selector?`	`{ html }`
`get_url`	—	`{ url }`
`screenshot`	`selector?`, `full_page?`	`{ pngBase64, bytes }`
`extract`	`selectors`	`{ data }`
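
Checking an action locally before sending it avoids spending one of the session's limited actions on a malformed call. The sketch below assumes the action object carries its kind under a `"type"` key (matching the table's first column header) and that fields without a `?` suffix are required.

```python
# Required fields per action type, taken from the table above.
REQUIRED_FIELDS = {
    "goto": {"url"},
    "click": {"selector"},
    "type": {"selector", "text"},
    "wait_for": {"selector"},
    "scroll": {"to"},
    "get_html": set(),
    "get_url": set(),
    "screenshot": set(),
    "extract": {"selectors"},
}


def session_exec_args(session_id, action):
    """Validate an action and return the arguments object for session_exec."""
    kind = action.get("type")
    if kind not in REQUIRED_FIELDS:
        raise ValueError(f"unknown action type: {kind!r}")
    missing = REQUIRED_FIELDS[kind] - set(action)
    if missing:
        raise ValueError(f"{kind} is missing fields: {sorted(missing)}")
    return {"session_id": session_id, "action": action}


args = session_exec_args("a1b2c3d4-...", {"type": "goto", "url": "https://example.com"})
```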

TOOL session_close Close a session and settle billing ▸

Parameters

Field	Type	Description
session_id	string	required

Returns

{ "duration_ms": 45000, "actions_count": 12, "cost_cents": 1.74 }

TOOL session_status Check session status ▸

Parameters

Field	Type	Description
session_id	string	required

Returns

{ "expires_at": "2026-04-17T12:05:00.000Z", "actions_count": 5, "status": "active" }

Search Tools

Google search capabilities exposed via the same MCP server as the scraper tools. Your agent can search Google, Google News, and Google Maps and receive structured results — no browser needed.

Connecting

Uses the same MCP endpoint and authentication as scraper tools.

MCP Endpoint: https://mcp.llm4agents.com/mcp
Method:       POST
Auth:         Authorization: Bearer YOUR_API_KEY
Protocol:     MCP Streamable HTTP (JSON response mode)

Pricing

Tool	Cost per Call
google_search / google_news / google_maps	$0.0012

Flat rate per call for all 3 search tools. Billing follows the same reserve-settle pattern as other tools.

Tools

Common Parameters

Field	Type	Description
q	string	required Search query (max 2048 chars)
gl	string	optional Country code, e.g. `"us"`, `"es"` (default: `"us"`)
hl	string	optional Language code, e.g. `"en"`, `"es"` (default: `"en"`)
tbs	string	optional Date range filter, e.g. `"qdr:h"` (past hour), `"qdr:d"` (past day), `"qdr:w"` (past week)
page	integer	optional Pagination page number (default: 1)
location	string	optional Geographic location hint (max 200 chars)

TOOL google_search Search Google and get organic results ▸

Returns organic search results including knowledge graph, answer boxes, and related searches.

Returns

{
  "results": [
    {
      "title": "Example Domain",
      "link": "https://example.com",
      "snippet": "This domain is for use in examples..."
    }
  ],
  "query": "example domain"
}

TOOL google_news Search Google News for recent articles ▸

Returns news articles with title, link, snippet, date, and source.

Returns

{
  "results": [
    {
      "title": "Breaking News Story",
      "link": "https://news.example.com/story",
      "snippet": "A major development...",
      "date": "2 hours ago",
      "source": "Example News"
    }
  ],
  "query": "latest tech news"
}

TOOL google_maps Search Google Maps for places and businesses ▸

Returns places with title, address, coordinates, category, rating, phone, and website.

Returns

{
  "results": [
    {
      "title": "Best Coffee Shop",
      "address": "123 Main St, New York, NY",
      "latitude": 40.7128,
      "longitude": -74.006,
      "category": "Coffee shop",
      "rating": 4.7,
      "phone": "+1 212-555-0100",
      "website": "https://bestcoffee.example.com"
    }
  ],
  "query": "coffee shops near me"
}

TOOL google_batch_search Run multiple Google searches in a single call (up to 100 queries) ▸

Batch multiple search queries into a single API call. More efficient than calling google_search multiple times — reduces latency and HTTP roundtrips. Cost is $0.0012 × number of queries.

Parameters

Field	Type	Description
queries	array	required Array of search query objects (1–100). Each object accepts the same parameters as `google_search`: `q`, `gl`, `hl`, `tbs`, `page`, `location`.

Example Request

{
  "queries": [
    { "q": "best restaurants in NYC", "gl": "us" },
    { "q": "weather forecast NYC", "tbs": "qdr:d" },
    { "q": "NYC subway map" }
  ]
}

Returns

{
  "results": [
    {
      "results": [{ "title": "...", "link": "...", "snippet": "..." }],
      "query": "best restaurants in NYC"
    },
    {
      "results": [{ "title": "...", "link": "...", "snippet": "..." }],
      "query": "weather forecast NYC"
    },
    {
      "results": [{ "title": "...", "link": "...", "snippet": "..." }],
      "query": "NYC subway map"
    }
  ],
  "queryCount": 3
}
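
When there are more than 100 queries, they have to be split across several `google_batch_search` calls. A minimal sketch of the chunking and the per-query cost arithmetic, using the $0.0012 rate listed above (which is described as configurable and may change):

```python
from decimal import Decimal

MAX_QUERIES_PER_CALL = 100
PRICE_PER_QUERY_USD = Decimal("0.0012")  # current default


def batch_calls(queries):
    """Split queries into google_batch_search argument objects and estimate cost."""
    calls = [
        {"queries": queries[i:i + MAX_QUERIES_PER_CALL]}
        for i in range(0, len(queries), MAX_QUERIES_PER_CALL)
    ]
    estimated_cost = PRICE_PER_QUERY_USD * len(queries)
    return calls, estimated_cost


queries = [{"q": f"topic {n}"} for n in range(250)]
calls, cost = batch_calls(queries)
```

`Decimal` is used so that sub-cent amounts are not distorted by binary floating point.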

Image Tools

AI-powered image generation, editing, and analysis tools exposed via the same MCP server as scraper and search tools. Your agent can generate images from text, edit existing images, and analyze image content.

Looking for real-cost, OpenRouter-native image generation instead of these flat-rate gateway tools? See POST /v1/images/generations (plural) — a different endpoint, billed at real upstream cost.

Connecting

Uses the same MCP endpoint and authentication as scraper and search tools.

MCP Endpoint: https://mcp.llm4agents.com/mcp
Method:       POST
Auth:         Authorization: Bearer YOUR_API_KEY
Protocol:     MCP Streamable HTTP (JSON response mode)

Pricing

Tool	Cost per Call
generate_image	$0.01 (≤1.5 MP) / $0.02 (>1.5 MP)
edit_image	$0.02
analyze_image	$0.006
generate_ad_banner	$0.06 ($0.05 x402 walk-up)

Image inputs accept either a URL or base64-encoded image data. All image outputs are returned as base64. Billing follows the same reserve-settle pattern as other tools.

Tools

TOOL generate_image Generate an image from a text prompt ▸

Generates a PNG image from a text description using AI. Cost depends on output resolution: $0.01 for images up to 1.5 megapixels, $0.02 for larger.

Parameters

Field	Type	Description
prompt	string	required Text description of the image to generate (max 4096 chars)
width	integer	optional Image width in pixels, 512–2048 (default: 1024)
height	integer	optional Image height in pixels, 512–2048 (default: 1024)

Returns

{
  "imageBase64": "<base64 PNG data>",
  "width": 1024,
  "height": 1024,
  "megapixels": 1.048,
  "costCents": 1
}
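
The resolution tier can be estimated before calling the tool. This sketch assumes a megapixel is 1,000,000 pixels, which is consistent with the `megapixels` value in the example response, and uses the current default prices.

```python
from decimal import Decimal


def generate_image_cost_usd(width=1024, height=1024):
    """Estimate the generate_image price tier for a requested size."""
    for name, side in (("width", width), ("height", height)):
        if not 512 <= side <= 2048:
            raise ValueError(f"{name} must be between 512 and 2048")
    megapixels = width * height / 1_000_000  # assumption: decimal megapixels
    return Decimal("0.01") if megapixels <= 1.5 else Decimal("0.02")


default_cost = generate_image_cost_usd()
large_cost = generate_image_cost_usd(2048, 2048)
```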

TOOL edit_image Edit an image using a text instruction ▸

Edits an existing image based on a text instruction. The input image can be provided as a URL or base64 data. Cost is a flat $0.02 per edit.

Parameters

Field	Type	Description
prompt	string	required Instruction describing the desired edit (max 4096 chars)
image	string	required URL or base64-encoded image to edit
aspect_ratio	string	optional Output aspect ratio: `"match_input_image"` (default), `"1:1"`, `"16:9"`, `"9:16"`, `"4:3"`, `"3:4"`, `"3:2"`, `"2:3"`

Returns

{
  "imageBase64": "<base64 image data>",
  "width": 1344,
  "height": 768,
  "costCents": 2
}

TOOL analyze_image Analyze an image and answer questions about it (vision) ▸

Sends an image to a vision model along with a prompt. The model analyzes the image and returns a text response. Cost is a flat $0.006 per image.

Parameters

Field	Type	Description
prompt	string	required Question or instruction about the image (max 4096 chars)
image	string	required URL or base64-encoded image to analyze

Returns

{
  "text": "The image shows a system architecture diagram with three main components...",
  "costCents": 0.6
}

TOOL generate_ad_banner Generate an ad banner at an exact pixel size for an ad network ▸

Generates an advertising banner from a text prompt at an exact pixel size for ad networks (Google, Meta, Instagram, LinkedIn, Reddit, X, TikTok, Pinterest) using Gemini 2.5 Flash Image, then cover-crops the result to the requested size. Optionally pass reference images (e.g. a brand logo) via images and the model composes the banner using them. Flat rate: $0.06 per banner ($0.05 x402 walk-up).

Parameters

Field	Type	Description
prompt	string	required Text description of the banner to generate
preset	string	optional A catalog key (e.g. `"leaderboard"`, `"medium_rectangle"`, `"fb_story"`) or a `"WxH"` string (e.g. `"728x90"`). Mutually exclusive with `width`/`height`.
width	integer	optional Banner width in pixels, 16–4000. Required together with `height` when `preset` is omitted.
height	integer	optional Banner height in pixels, 16–4000. Required together with `width` when `preset` is omitted.
output_format	string	optional `"png"` (default) or `"jpeg"`
images	string[]	optional Up to 4 reference images (e.g. a brand logo) as https URLs or base64 data URIs. The model composes the banner using them, preserving the logo's shapes, colors and proportions.

Returns

{
  "imageBase64": "<base64 image data>",
  "width": 728,
  "height": 90,
  "aspectRatio": "21:9",
  "format": "png",
  "costCents": 6,
  "preset": "leaderboard"
}

preset is present in the response only when a preset key was used to resolve the size.
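
Since `preset` and `width`/`height` are mutually exclusive, a small local check can catch invalid combinations before a paid call. The helper below is a hypothetical sketch of those rules. It does not validate preset names, because the preset catalog is not listed here.

```python
def banner_size_args(preset=None, width=None, height=None):
    """Return the size portion of generate_ad_banner arguments, or raise."""
    if preset is not None:
        if width is not None or height is not None:
            raise ValueError("preset is mutually exclusive with width/height")
        return {"preset": preset}
    if width is None or height is None:
        raise ValueError("width and height are both required when preset is omitted")
    for name, value in (("width", width), ("height", height)):
        if not 16 <= value <= 4000:
            raise ValueError(f"{name} must be between 16 and 4000")
    return {"width": width, "height": height}


arguments = {"prompt": "Spring sale banner", **banner_size_args(preset="leaderboard")}
```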

Workspace

A private per-agent file workspace backed by Cloudflare R2. Upload artifacts (markdown, screenshots, PDFs, JSON) and retrieve them later. Uploads and downloads are billed per MB, and storage is billed per MB per day. Both Bearer auth and x402 walk-up are supported.

Files auto-expire when days_to_store runs out (hourly cleanup cron). The workspace row is auto-created on the first paid op — workspace_create is optional. Direct R2 URLs are never exposed: download URLs route through our worker as single-use tokens, so per-download billing is always enforced.

Pricing

All paid operations have a 1¢ minimum. Free operations are rate-limited at 60 req/min per agent.

Operation	Bearer (balance)	x402 walk-up	Notes
`workspace_create`	Free	Free	One-time, idempotent
`workspace_list` / `workspace_stat` / `workspace_delete`	Free	Free	Rate-limited 60/min/agent
Upload base	$0.0001/MB	$0.00009/MB	One-time, on upload (includes 1st day of storage)
Storage per day	$0.000001/MB/day	$0.0000009/MB/day	~$0.03/GB-month
Download per MB	$0.00004/MB	$0.000036/MB	~$0.04/GB
Extend / copy	storage rate × days × size	storage rate × days × size	No upload base for these

Worked examples

Scenario	Cost (Bearer)
Upload 1 MB for 1 day	$0.01 (1¢ minimum)
Upload 100 MB for 30 days	~$0.03
Upload 1 GB for 30 days	~$0.14
Download 1 GB	~$0.05

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

10 MCP tools, called via the MCP endpoint at https://mcp.llm4agents.com/mcp. Both Bearer and x402 walk-up work on every paid tool. Inline upload/download is capped at 10 MB; larger files use the init/finalize flow.

TOOL workspace_create Idempotent — confirms the workspace exists ▸

Free. Auto-creation happens on the first paid op too, so this is optional.

{ "created": true, "agent_id": "...", "created_at": 1700000000 }

TOOL workspace_list List files (free, rate-limited) ▸

Parameters

Field	Type	Description
prefix	string	optional Filter by filename prefix (e.g. `"scrapes/"`)
limit	integer	optional Max files (1-500, default 100)

TOOL workspace_upload Inline upload, ≤10 MB ▸

Parameters

Field	Type	Description
filename	string	required Allowed: `a-zA-Z0-9._-/`, max 255 chars. Supports subdirectories.
content_base64	string	required File bytes, base64-encoded. Decoded payload must be ≤10 MB.
days_to_store	integer	required 1-365. Pre-pays storage for this duration.
content_type	string	optional MIME type to remember.

TOOL workspace_upload_init + workspace_upload_finalize Large upload via single-use presigned R2 PUT ▸

workspace_upload_init({ filename, size_bytes, days_to_store, content_type? }) reserves the cost and returns a single-use, content-length-bound presigned PUT URL that is valid for 10 minutes. The agent uploads directly to R2 (single-use prevents URL reuse). Then workspace_upload_finalize({ upload_id }) verifies the bytes landed, settles billing, and registers the file. Finalize within 15 minutes of init; otherwise the reservation is refunded by a cron job.
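
Choosing between the inline and init/finalize paths comes down to payload size. The sketch below picks the first tool call for a given file. It assumes "10 MB" means 10 × 1024 × 1024 bytes and that the filename and `days_to_store` rules listed for `workspace_upload` also apply to `workspace_upload_init`. The shape of the init response is not documented here, so the PUT and finalize steps are left to the caller.

```python
import base64
import re

INLINE_LIMIT = 10 * 1024 * 1024  # assumption: 10 MB interpreted as 10 MiB
FILENAME_RE = re.compile(r"[A-Za-z0-9._/-]{1,255}")


def first_upload_call(filename, data, days_to_store):
    """Return (tool_name, arguments) for the first call of an upload."""
    if not FILENAME_RE.fullmatch(filename):
        raise ValueError("filename has disallowed characters or is too long")
    if not 1 <= days_to_store <= 365:
        raise ValueError("days_to_store must be between 1 and 365")
    if len(data) <= INLINE_LIMIT:
        return "workspace_upload", {
            "filename": filename,
            "content_base64": base64.b64encode(data).decode("ascii"),
            "days_to_store": days_to_store,
        }
    # Larger files: reserve first, then PUT the bytes and finalize separately.
    return "workspace_upload_init", {
        "filename": filename,
        "size_bytes": len(data),
        "days_to_store": days_to_store,
    }


tool, args = first_upload_call("scrapes/page.md", b"# Title\n", 7)
```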

TOOL workspace_download Inline base64 OR single-use proxied URL ▸

Parameters

Field	Type	Description
filename	string	required
format	string	optional `"inline"` (default, ≤10 MB) or `"url"` (any size)
url_ttl_minutes	integer	optional 1-15, default 5. Only for `format: "url"`.

The url mode returns a URL that points back to our worker (see Public Download Endpoint) — never a direct R2 URL. The token is single-use: the second hit returns 410.

TOOL workspace_extend / workspace_copy / workspace_stat / workspace_delete Lifecycle helpers ▸

workspace_extend({ filename, additional_days }) — bills storage × additional_days × file size.

workspace_copy({ source_filename, dest_filename, days_to_store }) — server-side copy via R2, no egress. Bills destination storage only.

workspace_stat({ filename }) — file metadata (free, rate-limited).

workspace_delete({ filename }) — delete a file early (free, rate-limited, no storage refund).

Public Download Endpoint

The token-based proxy endpoint that consumes the URL returned by workspace_download({ format: "url" }). No authentication header is required — the token itself is the authorization.

GET https://api.llm4agents.com/v1/workspace/download/:token

Response	Meaning
200	Streams the file bytes. Headers: `content-type`, `content-length`, `content-disposition: attachment`, `cache-control: no-store`.
400 token_required	Empty token in URL.
410 token_invalid_or_used	Token missing, already consumed, or expired (collapsed for enumeration safety).
404 file_not_found	File deleted between issuance and consumption (rare race).

AI Tools

Inference primitives powered by Cloudflare Workers AI, exposed via the same MCP server as the scraper, search, and image tools. Summarize, translate, embed, classify, moderate, and rerank text — plus image captioning and speech transcription — all billed per call. Both Bearer auth and x402 walk-up are supported.

Connecting

Uses the same MCP endpoint and authentication as the other tools.

MCP Endpoint: https://mcp.llm4agents.com/mcp
Method:       POST
Auth:         Authorization: Bearer YOUR_API_KEY  (or X-PAYMENT for x402 walk-up)
Protocol:     MCP Streamable HTTP (JSON-RPC tools/call)

Pricing

Prices shown in US cents (¢). The Bearer column is balance billing; the x402 walk-up column is the per-call signed-payment rate.

Tool	Bearer (balance)	x402 walk-up	Notes
`ai_summarize`	0.5¢	0.45¢	Per call
`ai_translate`	0.5¢	0.45¢	Per call
`ai_embed`	0.1¢	0.09¢	Per call, 768-dim vectors
`ai_classify`	1¢	0.9¢	Per call
`ai_moderate`	1¢	0.9¢	Per call
`ai_rerank`	1¢	0.9¢	Per call
`image_to_text`	2¢	1.8¢	Per call
`speech_to_text`	1.5¢/MB	1.35¢/MB	Whisper, metered per MB of audio
`text_to_speech`	1.8¢/1k chars	1.6¢/1k chars	Text → speech audio (grok-voice-tts-1.0; inline ≤256KB, else workspace)

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via the MCP endpoint at https://mcp.llm4agents.com/mcp as JSON-RPC tools/call requests. Example envelope:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ai_summarize",
    "arguments": { "text": "...long article text...", "max_words": 60 }
  }
}

TOOL ai_summarize Summarize text into a short abstract ▸

Parameters

Field	Type	Description
text	string	required Text to summarize
max_words	integer	optional Target maximum word count for the summary

Returns

{ "summary": "A concise abstract of the input text." }

TOOL ai_translate Translate text into a target language ▸

Parameters

Field	Type	Description
text	string	required Text to translate
target_lang	string	required Target language code, e.g. `"es"`, `"fr"`
source_lang	string	optional Source language code (auto-detected if omitted)

Returns

{ "translated": "El texto traducido.", "target_lang": "es" }

TOOL ai_embed Generate 768-dim text embeddings ▸

Parameters

Field	Type	Description
input	string \| string[]	required A single string or an array of up to 100 strings to embed

Returns

{
  "embeddings": [[0.013, -0.041, ...]],
  "dimensions": 768
}

TOOL ai_classify Zero-shot classification over candidate labels ▸

Parameters

Field	Type	Description
text	string	required Text to classify
labels	array	required Candidate labels (2–20)

Returns

{
  "results": [
    { "label": "billing", "score": 0.91 },
    { "label": "support", "score": 0.06 }
  ]
}

TOOL ai_moderate Score text for unsafe content ▸

Parameters

Field	Type	Description
text	string	required Text to moderate

Returns

{
  "flagged": false,
  "categories": { "hate": 0.01, "violence": 0.00, "sexual": 0.00 }
}

TOOL ai_rerank Rerank documents by relevance to a query ▸

Parameters

Field	Type	Description
query	string	required Query to rank against
documents	array	required Candidate documents (up to 100)
top_k	integer	optional Return only the top K results

Returns

{
  "results": [
    { "index": 2, "score": 0.88 },
    { "index": 0, "score": 0.41 }
  ]
}
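
The response carries indices rather than document text, so results usually need to be joined back to the input. A short sketch, assuming `index` is the zero-based position of the document in the `documents` array that was submitted:

```python
def ranked_documents(documents, results):
    """Pair each rerank result with the document text it refers to."""
    return [(documents[item["index"]], item["score"]) for item in results]


documents = ["refund policy", "shipping times", "how to request a refund"]
results = [{"index": 2, "score": 0.88}, {"index": 0, "score": 0.41}]
ranked = ranked_documents(documents, results)
```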

TOOL image_to_text Caption / describe an image ▸

Parameters

Field	Type	Description
image_base64	string	required Base64-encoded image data

Returns

{ "text": "A photo of a city skyline at sunset." }

TOOL speech_to_text Transcribe audio (Whisper) ▸

Transcribes audio using Whisper. Metered per MB of decoded audio: 1.5¢/MB (Bearer) or 1.35¢/MB (x402 walk-up).

Parameters

Field	Type	Description
audio_base64	string	required Base64-encoded audio data

Returns

{ "text": "Transcribed speech from the audio.", "costCents": 1.5 }

TOOL text_to_speech Convert text to speech audio ▸

Converts text to speech using OpenRouter TTS models (default x-ai/grok-voice-tts-1.0; voices eve, ara, rex, sal, leo, case-insensitive). Metered per 1k input characters: 1.8¢/1k chars (Bearer) or 1.6¢/1k chars (x402 walk-up). Audio ≤256KB is returned inline as base64; larger audio is saved to your workspace and returned as a download_url.

Parameters

Field	Type	Description
text	string	required Text to synthesize, up to 15000 characters
voice	string	optional Voice id, case-insensitive (default `"sal"`)
model	string	optional TTS model slug (default `"x-ai/grok-voice-tts-1.0"`)
format	string	optional `"mp3"` (default) or `"wav"`

Example Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "text_to_speech",
    "arguments": { "text": "Estás en el interior de una pirámide.", "voice": "sal", "format": "mp3" }
  }
}

Returns

Inline (audio ≤256KB):

{
  "audio_base64": "SUQzBAAAAAAAI1RTU0U...",
  "format": "mp3",
  "chars": 38,
  "duration_hint_ms": 2533
}

Workspace (audio >256KB):

{
  "workspace_file": "tts-a1b2c3d4-....mp3",
  "download_url": "https://api.llm4agents.com/v1/workspace/download/...",
  "expires_at": 1784760300,
  "size_bytes": 512340,
  "format": "mp3",
  "chars": 12000,
  "duration_hint_ms": 800000
}
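
A caller has to handle both response shapes. The sketch below distinguishes them by which field is present; that detection rule is an assumption based on the two examples above, not a documented contract. Note that workspace download URLs are single-use, so the URL should be fetched once and the bytes kept.

```python
import base64


def tts_audio_source(result):
    """Return ("inline", audio_bytes) or ("url", download_url) for a TTS result."""
    if "audio_base64" in result:
        return "inline", base64.b64decode(result["audio_base64"])
    if "download_url" in result:
        return "url", result["download_url"]
    raise ValueError("result has neither audio_base64 nor download_url")


kind, payload = tts_audio_source({
    "workspace_file": "tts-a1b2c3d4-....mp3",
    "download_url": "https://api.llm4agents.com/v1/workspace/download/...",
    "format": "mp3",
})
```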

Video Tools

Text/image-to-video generation via OpenRouter's async video API (x-ai/grok-imagine-video-1.5 by default), exposed as MCP tools alongside the REST /v1/videos endpoints. Submit a job with generate_video, then poll with the free video_status tool until the result lands in your workspace. Bearer-only — x402 walk-up is not available for video (async settle-on-completion billing is incompatible with a single per-call signed payment, the same rationale as browser session tools).

Connecting

Uses the same MCP endpoint as the other tools, Bearer auth only.

MCP Endpoint: https://mcp.llm4agents.com/mcp
Method:       POST
Auth:         Authorization: Bearer YOUR_API_KEY  (Bearer-only, no X-PAYMENT)
Protocol:     MCP Streamable HTTP (JSON-RPC tools/call)

Pricing

Prices shown in US cents (¢) are a temporary reserve, billed per second of requested duration and scaled by a model/resolution rate multiplier (grok 1080p ×2; untabulated models ×4). The final charge is the real upstream cost plus the platform fee, auto-adjusted down from the reserve when the job completes (same mechanism as REST POST /v1/videos — see transactions for the real_cost_adjustment refund).

Tool	Bearer (balance)	x402 walk-up	Notes
`generate_video`	Reserve 18¢/s × rate multiplier (grok 1080p ×2 → 36¢/s; untabulated ×4 → 72¢/s); final charge = real cost + fee, auto-adjusted at completion	—	Bearer-only, async job billing; `generate_audio:false` ≈ half price on audio-billing models
`video_status`	Free	—	Bearer-only, poll-only; completed response includes `adjusted`

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.
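
The reserve arithmetic can be sketched as follows. Only the ×2 (grok 1080p) and ×4 (untabulated model) multipliers are stated above; the ×1 multiplier for other grok resolutions is inferred, and whether resolution further scales untabulated models is not specified, so this is an estimate of the hold, not of the final charge.

```python
BASE_RESERVE_CENTS_PER_SECOND = 18  # current default


def reserve_cents(duration=6, resolution="720p", tabulated_model=True):
    """Estimate the submit-time reserve for generate_video, in US cents."""
    if not 1 <= duration <= 15:
        raise ValueError("duration must be between 1 and 15 seconds")
    if not tabulated_model:
        multiplier = 4  # untabulated models
    elif resolution == "1080p":
        multiplier = 2  # grok 1080p
    else:
        multiplier = 1  # assumption: 480p and 720p are unscaled
    return BASE_RESERVE_CENTS_PER_SECOND * duration * multiplier


default_reserve = reserve_cents()            # 6 s at 720p
hd_reserve = reserve_cents(6, "1080p")       # 6 s at 1080p
```

An agent can compare this figure with its balance before submitting, since the full reserve is held until the job completes.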

Tools

Called via the MCP endpoint at https://mcp.llm4agents.com/mcp as JSON-RPC tools/call requests. Example envelope:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "generate_video",
    "arguments": { "prompt": "A drone shot flying over a neon-lit cyberpunk city at night", "duration": 6, "resolution": "720p" }
  }
}

TOOL generate_video Generate a video from a text prompt (async) ▸

Submits an asynchronous video generation job (the same underlying job as REST POST /v1/videos). Reserves 18¢/s of requested duration at submit time (1080p counts double); on completion the reserve is auto-adjusted down to the real upstream cost + fee (see video_status's adjusted field), never charging more than the reserve. Refunded in full if the job ends failed, cancelled, or expired. Poll the result with video_status.

Parameters

Field	Type	Description
prompt	string	required Text description of the video to generate, up to 2000 characters
model	string	optional Video model slug (default `"x-ai/grok-imagine-video-1.5"`)
image_url	string	optional First-frame reference image — an `https://` URL only (no data URIs; REST endpoint `POST /v1/videos` accepts data URIs)
duration	integer	optional Duration in seconds, 1–15 (default `6`)
resolution	string	optional `"480p"`, `"720p"` (default), or `"1080p"`
aspect_ratio	string	optional `"16:9"` (default), `"9:16"`, `"1:1"`, `"4:3"`, `"3:4"`, `"3:2"`, or `"2:3"`
generate_audio	boolean	optional Generate a synchronized audio track (default `true`). Setting `false` is roughly half the upstream cost on models that bill for audio (e.g. seedance, veo).

Returns

{
  "job_id": "video_abc123",
  "status": "pending",
  "check_with": "video_status"
}

TOOL video_status Poll a video generation job (free) ▸

Polls a job submitted with generate_video. Completed videos are saved to your workspace and returned as a download_url; adjusted indicates whether the submit-time reserve has been auto-adjusted down to the real upstream cost + fee. Failed/cancelled/expired jobs are refunded automatically.

Parameters

Field	Type	Description
job_id	string	required Job id returned by `generate_video`

Returns

Pending / in progress:

{ "status": "pending" }

Completed:

{
  "status": "completed",
  "workspace_file": "video-video_abc123.mp4",
  "download_url": "https://api.llm4agents.com/v1/workspace/download/...",
  "expires_at": 1784760300,
  "size_bytes": 8421900,
  "adjusted": true
}

Failed / cancelled / expired:

{
  "status": "failed",
  "error": "upstream generation error",
  "refunded": true
}
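
Polling can be wrapped in a small loop. In this sketch `call_tool` is a hypothetical callable that sends one MCP `tools/call` request and returns the parsed tool result. The status strings `"cancelled"` and `"expired"` are assumed from the state names above, and any other status is treated as still in progress.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_video(call_tool, job_id, interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Poll video_status until the job reaches a terminal state."""
    for _ in range(max_polls):
        result = call_tool("video_status", {"job_id": job_id})
        if result.get("status") in TERMINAL_STATUSES:
            return result
        sleep(interval_s)  # video_status is free, but avoid hammering the server
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.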

Notify Tools

Send notifications to chat platforms, webhooks, email, and SMS via the same MCP server. Chat and webhook tools relay through credentials you supply in the call; email and SMS are sent through the platform's own providers. Both Bearer auth and x402 walk-up are supported.

webhook_post is SSRF-guarded: requests to private, loopback, and link-local address ranges are rejected before any outbound call is made.
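
A client can screen out obviously rejected destinations before calling the tool. The sketch below is a local approximation, not the platform's actual guard: it only recognizes literal IP addresses and `localhost`, and a hostname that resolves to a private address would still be rejected server-side.

```python
import ipaddress
from urllib.parse import urlparse


def is_obviously_blocked(url):
    """True if the URL's host is a literal private, loopback, or link-local address."""
    host = urlparse(url).hostname
    if host is None or host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; cannot be judged without resolving it
    return address.is_private or address.is_loopback or address.is_link_local


print(is_obviously_blocked("http://169.254.169.254/latest/meta-data"))
print(is_obviously_blocked("https://example.com/hook"))
```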

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`send_telegram`	1¢	0.9¢	Agent-supplied bot token
`send_discord`	1¢	0.9¢	Agent-supplied webhook URL
`send_slack`	1¢	0.9¢	Agent-supplied webhook URL
`webhook_post`	1¢	0.9¢	SSRF-guarded
`send_email`	2¢	1.8¢	Via platform mail provider
`send_sms`	3¢	2.7¢	Via platform SMS provider

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "send_slack",
    "arguments": {
      "webhook_url": "https://hooks.slack.com/services/T000/B000/XXXX",
      "text": "Deploy finished ✅"
    }
  }
}

TOOL send_telegram Send a Telegram message via a bot ▸

Parameters

Field	Type	Description
bot_token	string	required Telegram bot token
chat_id	string	required Target chat ID
text	string	required Message text
parse_mode	string	optional `"Markdown"` or `"HTML"`

Returns

{ "ok": true, "message_id": 12345 }

TOOL send_discord Post a message to a Discord webhook ▸

Parameters

Field	Type	Description
webhook_url	string	required Discord webhook URL
content	string	required Message content

Returns

{ "ok": true }

TOOL send_slack Post a message to a Slack incoming webhook

Parameters

Field	Type	Description
webhook_url	string	required Slack incoming webhook URL
text	string	required Message text

Returns

{ "ok": true }

TOOL webhook_post POST an arbitrary JSON body to a URL (SSRF-guarded)

Parameters

Field	Type	Description
url	string	required Destination URL (private / loopback / link-local ranges are rejected)
body	object	required JSON body to POST
headers	object	optional Extra request headers

Returns

{ "status": 200, "body": "..." }

TOOL send_email Send an email via the platform mail provider

Parameters

Field	Type	Description
to	string	required Recipient email address
subject	string	required Email subject
html	string	optional HTML body (provide `html` or `text`)
text	string	optional Plain-text body (provide `html` or `text`)
from	string	optional From address (defaults to platform sender)

Returns

{ "ok": true, "id": "msg_..." }

TOOL send_sms Send an SMS via the platform SMS provider

Parameters

Field	Type	Description
to	string	required Recipient number in E.164 format, e.g. `"+14155550100"`
body	string	required Message body

Returns

{ "ok": true, "id": "sms_..." }
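
Since send_sms is billed per call, it can help to validate the to number before sending. The sketch below checks the general E.164 shape (a plus sign, a non-zero leading digit, at most 15 digits in total). It is a client-side convenience with a hypothetical name, is_e164; the platform's own validation rules are not documented here.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not 0.
E164_RE = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True if the string has the general shape of an E.164 number."""
    return E164_RE.fullmatch(number) is not None
```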

Data Tools

Utility lookups and converters — DNS, IP geolocation, URL unfurling, RSS, YouTube transcripts, WHOIS, crypto prices, FX conversion, QR codes, and CAPTCHA solving — exposed via the same MCP server. Several are free; paid tools support both Bearer auth and x402 walk-up.

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`dns_lookup`	Free	Free	Rate-limited
`ip_geolocate`	Free	Free	Rate-limited
`url_unfurl`	1¢	0.9¢	Open Graph / metadata
`rss_parse`	1¢	0.9¢	RSS / Atom feeds
`youtube_transcript`	1¢	0.9¢
`whois`	1¢	0.9¢
`crypto_price`	1¢	0.9¢
`fx_convert`	1¢	0.9¢
`qr_generate`	1¢	0.9¢	Returns SVG
`captcha_solve_create`	2¢	1.8¢	Submits a solve task
`captcha_solve_result`	Free	Free	Polls task result

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "crypto_price",
    "arguments": { "ids": ["bitcoin", "ethereum"], "vs_currencies": ["usd"] }
  }
}

TOOL dns_lookup Resolve DNS records (free)

Parameters

Field	Type	Description
name	string	required Hostname to resolve
type	string	optional Record type, e.g. `"A"`, `"AAAA"`, `"MX"`, `"TXT"` (default `"A"`)

Returns

{ "name": "example.com", "type": "A", "records": ["93.184.216.34"] }

TOOL ip_geolocate Geolocate an IP address (free)

Parameters

Field	Type	Description
ip	string	required IPv4 or IPv6 address

Returns

{ "ip": "8.8.8.8", "country": "US", "city": "Mountain View", "lat": 37.4, "lon": -122.0 }

TOOL url_unfurl Fetch Open Graph / link preview metadata

Parameters

Field	Type	Description
url	string	required URL to unfurl

Returns

{ "title": "...", "description": "...", "image": "https://...", "site_name": "..." }

TOOL rss_parse Parse an RSS / Atom feed

Parameters

Field	Type	Description
url	string	required Feed URL
limit	integer	optional Max items to return

Returns

{ "items": [{ "title": "...", "link": "...", "published": "..." }] }

TOOL youtube_transcript Fetch a YouTube video transcript

Parameters

Field	Type	Description
video	string	required Video ID or URL
lang	string	optional Transcript language code

Returns

{ "transcript": [{ "start": 0.0, "text": "..." }], "lang": "en" }

TOOL whois Look up domain registration data

Parameters

Field	Type	Description
domain	string	required Domain name

Returns

{ "domain": "example.com", "registrar": "...", "created": "...", "expires": "..." }

TOOL crypto_price Get current crypto prices

Parameters

Field	Type	Description
ids	array	required Coin IDs (up to 50), e.g. `["bitcoin", "ethereum"]`
vs_currencies	array	optional Quote currencies, e.g. `["usd", "eur"]`

Returns

{ "bitcoin": { "usd": 68000 }, "ethereum": { "usd": 3500 } }

TOOL fx_convert Convert between fiat currencies

Parameters

Field	Type	Description
amount	number	required Amount to convert
from	string	required Source currency code, e.g. `"USD"`
to	string	required Target currency code, e.g. `"EUR"`

Returns

{ "amount": 100, "from": "USD", "to": "EUR", "result": 92.4, "rate": 0.924 }

TOOL qr_generate Generate a QR code (SVG)

Parameters

Field	Type	Description
data	string	required Text or URL to encode
size	integer	optional Pixel size of the rendered SVG
ecc	string	optional Error-correction level: `"L"`, `"M"`, `"Q"`, `"H"`

Returns

{ "svg": "<svg ...>...</svg>" }

TOOL captcha_solve_create Submit a CAPTCHA solve task

Creates an asynchronous solve task. Poll for the answer with captcha_solve_result (free).

Parameters

Field	Type	Description
type	string	required CAPTCHA type, e.g. `"recaptcha_v2"`, `"hcaptcha"`, `"turnstile"`
website_url	string	required Page URL where the CAPTCHA appears
website_key	string	required Site key of the CAPTCHA widget

Additional type-specific fields may be supplied as needed.

Returns

{ "task_id": "task_abc123" }

TOOL captcha_solve_result Poll a CAPTCHA solve task (free)

Parameters

Field	Type	Description
task_id	string	required Task ID returned by `captcha_solve_create`

Returns

{ "status": "ready", "solution": "03AGdBq..." }
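
The create-then-poll pattern can be wrapped in a small loop. This is a sketch under stated assumptions: call_tool is a hypothetical callable that sends a tools/call request and returns the tool's result as a dict, only the "ready" status is documented above, and every other status is treated as "keep waiting". The attempt count and delay are arbitrary defaults.

```python
import time
from typing import Any, Callable, Dict


def wait_for_solution(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    task_id: str,
    attempts: int = 20,
    delay_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll captcha_solve_result until the task is ready, then return the solution."""
    for _ in range(attempts):
        result = call_tool("captcha_solve_result", {"task_id": task_id})
        if result.get("status") == "ready":
            return result["solution"]
        sleep(delay_s)  # assumption: any non-"ready" status means "not yet"
    raise TimeoutError(f"task {task_id} not ready after {attempts} polls")
```

Because polling is free, the loop only costs time; if the API reports a distinct failure status, handle it explicitly rather than waiting for the timeout.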

Vector Tools

A private per-agent vector store backed by Cloudflare Vectorize (bge 768-dim embeddings). Upsert text (auto-embedded) or raw vectors, run similarity queries with optional metadata filters, and delete by ID — all via the same MCP server. Both Bearer auth and x402 walk-up are supported.

Pricing

Prices shown in US cents (¢).

Tool	Bearer (balance)	x402 walk-up	Notes
`vector_upsert`	0.5¢ / 100 items	0.45¢ / 100 items	Auto-embeds `text` if no `vector`
`vector_query`	1¢	0.9¢	Per query
`vector_delete`	Free	Free	Rate-limited

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "vector_query",
    "arguments": { "query": "how do refunds work", "top_k": 5 }
  }
}

TOOL vector_upsert Insert or update vectors (auto-embeds text)

Each item must provide either text (auto-embedded to a 768-dim vector) or a precomputed vector. Up to 100 items per call.

Parameters

Field	Type	Description
items	array	required Up to 100 items. Each: `id` (required), `text` or `vector`, optional `metadata` object.

Example Request

{
  "items": [
    { "id": "doc-1", "text": "Refunds are processed within 5 days.", "metadata": { "topic": "billing" } },
    { "id": "doc-2", "vector": [0.01, -0.04, ...] }
  ]
}

Returns

{ "upserted": 2 }
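
Because each call accepts at most 100 items, larger datasets have to be split into batches on the client. A minimal sketch follows; the helper name batches is hypothetical, and each yielded list would become the items argument of one vector_upsert call.

```python
from typing import Dict, Iterator, List, Sequence

MAX_ITEMS_PER_CALL = 100  # per-call limit stated above


def batches(items: Sequence[Dict], size: int = MAX_ITEMS_PER_CALL) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


docs = [{"id": f"doc-{i}", "text": f"document {i}"} for i in range(250)]
batch_sizes = [len(batch) for batch in batches(docs)]
```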

TOOL vector_query Similarity search over the agent's vectors

Parameters

Field	Type	Description
query	string \| number[768]	required A query string (auto-embedded) or a 768-dim query vector
top_k	integer	optional Number of nearest matches to return
filter	object	optional Metadata filter to restrict the search

Returns

{
  "matches": [
    { "id": "doc-1", "score": 0.87, "metadata": { "topic": "billing" } }
  ]
}

TOOL vector_delete Delete vectors by ID (free)

Parameters

Field	Type	Description
ids	array	required Vector IDs to delete (up to 100)

Returns

{ "deleted": 2 }

Web Crawl

Crawl a site starting from a URL, following links breadth-first up to configurable page and depth limits, with optional JS rendering and include/exclude filters. Output as concatenated markdown or a link map, and optionally save straight into your workspace. Exposed via the same MCP server.

web_crawl is balance-only — it does not accept x402 walk-up. Use a Bearer API key with a funded balance.

Pricing

Prices shown in US cents (¢).

Tool	Bearer (balance)	x402 walk-up	Notes
`web_crawl`	0.5¢ / page (min 2¢)	— (not supported)	Billed per page crawled

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "web_crawl",
    "arguments": {
      "start_url": "https://example.com/docs",
      "max_pages": 25,
      "max_depth": 2,
      "output": "markdown"
    }
  }
}

TOOL web_crawl Breadth-first site crawl (balance-only)

Parameters

Field	Type	Description
start_url	string	required URL to start crawling from
max_pages	integer	optional Maximum number of pages to fetch
max_depth	integer	optional Maximum link depth from `start_url`
allow_subdomains	boolean	optional Follow links into subdomains (default false)
render	boolean	optional Render JavaScript before extraction
include	array	optional URL patterns to include
exclude	array	optional URL patterns to exclude
output	string	optional `"markdown"` (default) or `"links"`
save_to_workspace	string	optional Workspace filename to save the result to

Returns

{
  "pages_crawled": 18,
  "output": "markdown",
  "content": "# Page 1\n...\n\n# Page 2\n...",
  "costCents": 9
}
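
You can estimate the charge before starting a crawl from the default price of 0.5¢ per page with a 2¢ minimum. The sketch below assumes fractional cents are rounded up to a whole cent, which is consistent with the sample response (18 pages, costCents 9) but is not stated by the pricing table. The function name is hypothetical, and the result is only an estimate because prices are configurable.

```python
import math

MIN_CHARGE_CENTS = 2


def estimate_crawl_cost_cents(pages: int) -> int:
    """Estimate the web_crawl charge in cents for a given page count."""
    # 0.5 cents per page == one cent per two pages, rounded up (assumption).
    return max(MIN_CHARGE_CENTS, math.ceil(pages / 2))
```

Passing max_pages gives an upper bound for the call, since billing is per page actually crawled.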

Memory Tools

A simple per-agent key/value store for durable agent memory. Set JSON values (up to 64 KB) with an optional TTL, read them back, list by prefix, and delete — via the same MCP server. Reads are free; writes support both Bearer auth and x402 walk-up.

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`memory_set`	1¢	0.9¢	Value ≤ 64 KB JSON
`memory_get`	Free	Free	Rate-limited
`memory_list`	Free	Free	Rate-limited
`memory_delete`	Free	Free	Rate-limited

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_set",
    "arguments": { "key": "user:42:prefs", "value": { "theme": "dark" }, "ttl_days": 30 }
  }
}

TOOL memory_set Store a JSON value under a key

Parameters

Field	Type	Description
key	string	required Key to store under
value	any (JSON)	required JSON value, serialized size ≤ 64 KB
ttl_days	integer	optional Auto-expire after this many days

Returns

{ "ok": true, "key": "user:42:prefs" }
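
Since memory_set is a paid call, checking the serialized size locally avoids paying for a write that would be rejected. This sketch assumes the limit is 64 × 1024 bytes of UTF-8 encoded compact JSON; the exact serialization the server measures is not documented, so leave some headroom. The helper name is hypothetical.

```python
import json
from typing import Any

MAX_VALUE_BYTES = 64 * 1024  # assumption: 1 KB = 1024 bytes


def fits_memory_limit(value: Any) -> bool:
    """Return True if the value's compact JSON encoding is within the size limit."""
    encoded = json.dumps(value, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    return len(encoded) <= MAX_VALUE_BYTES
```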

TOOL memory_get Read a value by key (free)

Parameters

Field	Type	Description
key	string	required Key to read

Returns

{ "key": "user:42:prefs", "value": { "theme": "dark" }, "found": true }

TOOL memory_list List keys by prefix (free, paginated)

Parameters

Field	Type	Description
prefix	string	optional Filter keys by prefix
limit	integer	optional Max keys to return
cursor	string	optional Pagination cursor from a previous response

Returns

{ "keys": ["user:42:prefs"], "cursor": null }
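
To walk every page, pass the returned cursor back until it comes back empty. The sketch below assumes a null cursor marks the last page, as the sample response suggests, and uses a hypothetical call_tool callable that performs the tools/call request and returns the result dict. The page size of 100 is an arbitrary choice, not a documented maximum.

```python
from typing import Any, Callable, Dict, List


def list_all_keys(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    prefix: str = "",
    page_size: int = 100,
) -> List[str]:
    """Collect all keys under a prefix by following pagination cursors."""
    keys: List[str] = []
    cursor = None
    while True:
        args: Dict[str, Any] = {"prefix": prefix, "limit": page_size}
        if cursor is not None:
            args["cursor"] = cursor
        page = call_tool("memory_list", args)
        keys.extend(page["keys"])
        cursor = page.get("cursor")
        if not cursor:  # assumption: null cursor means no more pages
            return keys
```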

TOOL memory_delete Delete a key (free)

Parameters

Field	Type	Description
key	string	required Key to delete

Returns

{ "ok": true, "deleted": true }

Semantic Memory Tools

Long-term semantic memory for agents (mem0-parity). Unlike the raw key/value Memory Tools, this layer extracts atomic facts from text or a conversation with an LLM, deduplicates paraphrases against what it already knows (cosine ≥ 0.85), and stores them in a per-agent, tenant-isolated vector namespace (Cloudflare Vectorize, bge-768, cosine) with the canonical text in D1. Recall is semantic and reranked. Via the same MCP server; both tools support Bearer auth and x402 walk-up.
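
The deduplication rule compares embeddings by cosine similarity against a 0.85 threshold. The sketch below shows that comparison on plain Python lists to make the threshold concrete. It illustrates the metric only; the platform's pipeline (embedding model calls, candidate selection, LLM extraction) is not reproduced, and the function names are hypothetical.

```python
import math
from typing import Sequence

DEDUP_THRESHOLD = 0.85  # threshold quoted above


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_paraphrase(a: Sequence[float], b: Sequence[float], threshold: float = DEDUP_THRESHOLD) -> bool:
    """Return True when two embeddings are similar enough to count as duplicates."""
    return cosine_similarity(a, b) >= threshold
```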

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`semantic_memory_add`	1¢ / ~4KB	0.9¢ / ~4KB	Metered: 1¢ minimum per ~4000-char chunk (LLM extract + embed + upsert)
`semantic_memory_search`	1¢	1¢	Per call (embed + vector query + rerank)

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "semantic_memory_add",
    "arguments": { "messages": [
      { "role": "user", "content": "I'm on the premium plan and the checkout page keeps erroring." }
    ] }
  }
}

TOOL semantic_memory_add Extract, dedup & store long-term facts

Parameters

Field	Type	Description
text	string	optional Free text to extract facts from (provide exactly one of `text` / `messages`)
messages	array	optional Conversation as `[{ role, content }]`

Returns

{ "added": 2, "updated": 1, "memory_ids": ["…"] }

TOOL semantic_memory_search Semantic recall of stored facts

Parameters

Field	Type	Description
query	string	required What to recall
top_k	integer	optional Max results, ≤ 20 (default 8)

Returns

[ { "id": "…", "text": "The customer is on the premium plan", "score": 0.81, "created_at": 0, "updated_at": 0 } ]

Episodic Memory Tools

Part of the Agent Memory OS. Episodic memory records what actually happened (a situation, the action taken, and the outcome, with a valence/magnitude signal) as distinct trajectories, as opposed to the atomic facts stored by Semantic Memory. memory_reflect distills a batch of unreflected episodes into durable semantic lessons plus suggested rules, and also auto-saves any recurring, successful procedure it recognizes as a reusable skill (best-effort, capped, never overwrites an operator- or peer-authored skill of the same name). Via the same MCP server; memory_reflect supports Bearer auth and x402 walk-up, while episode_log is free and rate-limited.

Pricing

Prices shown in US cents (¢).

Tool	Bearer (balance)	x402 walk-up	Notes
`episode_log`	FREE	—	Rate-limited: 120 requests/min
`memory_reflect`	2¢ / ~20 episodes	1.8¢	Metered: 2¢ minimum per ~20-episode chunk

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "episode_log",
    "arguments": {
      "situation": "Customer asked to cancel the premium plan",
      "action": "Offered a 20% retention discount",
      "outcome": "Customer accepted and stayed",
      "valence": 0.8
    }
  }
}

TOOL episode_log Record a situation/action/outcome trajectory

Parameters

Field	Type	Description
situation	string	required What happened / the context
action	string	optional What the agent did
outcome	string	optional What resulted
valence	number	optional How good/bad the outcome was, -1 to 1
magnitude	number	optional How significant the episode was, 0 to 1
provenance	object	optional Source/confidence/visibility metadata (see Semantic Memory)

Returns

{ "episode_id": "…" }
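
A small builder can enforce the documented ranges before the call is made. This is a sketch with a hypothetical helper name, build_episode; it treats both range bounds as inclusive, which is an assumption, and it omits the provenance field because its shape is not described in this section.

```python
from typing import Any, Dict, Optional


def build_episode(
    situation: str,
    action: Optional[str] = None,
    outcome: Optional[str] = None,
    valence: Optional[float] = None,
    magnitude: Optional[float] = None,
) -> Dict[str, Any]:
    """Build episode_log arguments, validating valence and magnitude ranges."""
    if not situation:
        raise ValueError("situation is required")
    if valence is not None and not -1 <= valence <= 1:
        raise ValueError("valence must be between -1 and 1")
    if magnitude is not None and not 0 <= magnitude <= 1:
        raise ValueError("magnitude must be between 0 and 1")
    args: Dict[str, Any] = {"situation": situation}
    optional = {"action": action, "outcome": outcome, "valence": valence, "magnitude": magnitude}
    # Only send the optional fields that were actually provided.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args
```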

TOOL memory_reflect Distill recent episodes into durable lessons + reusable skills

Parameters

Field	Type	Description
window	integer	optional Max unreflected episodes to consider, 1-50 (default 20)

Returns

{ "lessons_added": 2, "rules_suggested": ["…"], "skills_added": 1, "reflected_count": 14 }

Alongside lessons and rules, memory_reflect also recognizes recurring, successful procedures across the reflected episodes and auto-saves up to 5 of them as reusable skills (source: "agent_self", self-editable, idempotent by name). skills_added reports how many were saved; skill generation is best-effort and never fails the reflect call, and it will never overwrite a skill of the same name authored by the operator or another agent.

Recall Tools

Part of the Agent Memory OS. memory_recall is the single call for "what do I know that's relevant here" — it fuses semantic facts and episodic experiences for a query into one ranked list, using Reciprocal Rank Fusion across stores plus provenance/valence/recency weighting. Via the same MCP server; supports Bearer auth and x402 walk-up.

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`memory_recall`	2¢	1.8¢	Per call (embed + query semantic & episodic + RRF fuse)

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_recall",
    "arguments": { "query": "what plan is this customer on?", "top_k": 5 }
  }
}

TOOL memory_recall Unified recall across semantic + episodic memory

Parameters

Field	Type	Description
query	string	required What to recall
top_k	integer	optional Max results, ≤ 20 (default 8)

Returns

{
  "recall_id": "…",
  "items": [ { "source": "semantic", "text": "…", "score": 0.83,
    "provenance": { "source": "observation", "valence": 0, "confidence": 0.9, "created_at": 0 } } ],
  "sources_failed": []
}
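
Reciprocal Rank Fusion, the technique named in the description of memory_recall, scores each item by summing 1 / (k + rank) over every ranked list it appears in. The sketch below shows plain RRF only. The constant k = 60 is the value commonly used in the literature, not a documented platform setting, and the provenance/valence/recency weighting is not modeled.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of IDs into one list sorted by RRF score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


semantic_hits = ["a", "b", "c"]   # hypothetical IDs from the semantic store
episodic_hits = ["b", "d"]        # hypothetical IDs from the episodic store
fused = rrf_fuse([semantic_hits, episodic_hits])
```

An item that appears in both lists, such as "b" here, outranks items that appear in only one, which is why RRF works without comparable raw scores across stores.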

Knowledge Graph Tools

Part of the Agent Memory OS. A private, bi-temporal knowledge graph per agent (Vectorize + D1). graph_add uses an LLM to extract entities and relations from text or a conversation, classifies each entity as a merge/gray-zone/new match against existing entities, judges ambiguous merges and contradictions with an LLM, and self-invalidates edges it determines are contradicted by new facts (tracked via valid_at / invalid_at, never deleted). graph_search is a hybrid semantic + full-text + multi-hop BFS search fused by Reciprocal Rank Fusion, with point-in-time queries via as_of. Via the same MCP server; both tools support Bearer auth and x402 walk-up.

Pricing

Prices shown in US cents (¢).

Tool	Bearer (balance)	x402 walk-up	Notes
`graph_add`	4¢ / ~3KB	3.6¢	Metered: 4¢ minimum per ~3000-char chunk (LLM extract + classify + embed)
`graph_search`	2¢	1.8¢	Per call (semantic + FTS + BFS + hydrate)

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "graph_add",
    "arguments": { "text": "Acme Corp acquired Widgets Inc in March 2024." }
  }
}

TOOL graph_add Extract & persist entities/relations into the graph

Parameters

Field	Type	Description
text	string	optional Free text to extract entities/relations from (provide exactly one of `text` / `messages`)
messages	array	optional Conversation as `[{ role, content }]`
provenance	object	optional Source/confidence/visibility metadata (see Semantic Memory)

Returns

{ "added_entities": 2, "merged_entities": 0, "added_edges": 1, "invalidated_edges": 0, "edge_failures": 0 }

TOOL graph_search Hybrid semantic + full-text + multi-hop graph search

Parameters

Field	Type	Description
query	string	required What to search for
top_k	integer	optional Max results, ≤ 20 (default 8)
hops	integer	optional BFS expansion depth, 0-3 (default 1)
as_of	integer	optional Point-in-time epoch ms — only edges valid at this instant

Returns

{ "facts": [ { "fact_text": "Acme Corp acquired Widgets Inc", "source_entity": "Acme Corp", "relation": "acquired", "target_entity": "Widgets Inc", "valid_at": 1710000000000, "score": 0.77 } ] }
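
Point-in-time queries depend on an edge's validity window. The sketch below shows one way to build an as_of value in epoch milliseconds and to reason about whether an edge is visible at that instant. It assumes a half-open window (valid_at inclusive, invalid_at exclusive), which the documentation does not specify, and both function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def edge_valid_at(valid_at: int, invalid_at: Optional[int], as_of: int) -> bool:
    """Return True if an edge is valid at `as_of` (all values in epoch ms)."""
    if as_of < valid_at:
        return False
    # An edge that was never invalidated has no upper bound.
    return invalid_at is None or as_of < invalid_at


as_of = to_epoch_ms(datetime(2024, 3, 15, tzinfo=timezone.utc))
```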

Skill Tools

Part of the Agent Memory OS. Procedural memory: named, triggerable skills an agent can save, search, and self-edit — reusable "how to do X" procedures distinct from the facts in Semantic Memory. Via the same MCP server; all three tools support Bearer auth, and skill_save supports x402 walk-up.

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`skill_save`	1¢	0.9¢	Save or self-edit a skill
`skill_search`	FREE	—	Rate-limited: 60 requests/min
`skill_get`	FREE	—	Rate-limited: 60 requests/min

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "skill_save",
    "arguments": {
      "name": "handle-refund-request",
      "trigger": "customer asks for a refund",
      "body": "1. Verify order id. 2. Check refund window. 3. Issue via billing API."
    }
  }
}

TOOL skill_save Save or self-edit a named, triggerable skill

Parameters

Field	Type	Description
name	string	required Unique skill name (saving an existing name self-edits it)
trigger	string	required Situation description used for matching in `skill_search`
body	string	required The skill's procedure / content
provenance	object	optional Source/confidence/visibility metadata (see Semantic Memory)

Returns

{ "skill_id": "…", "updated": false }

TOOL skill_search Find skills whose trigger matches a situation

Parameters

Field	Type	Description
situation	string	required Situation to match against saved triggers
top_k	integer	optional Max results, ≤ 20 (default 5)

Returns

[ { "name": "handle-refund-request", "trigger": "customer asks for a refund", "body": "1. Verify order id. …" } ]

TOOL skill_get Fetch a skill by exact name

Parameters

Field	Type	Description
name	string	required Exact skill name

Returns

{ "found": true, "name": "handle-refund-request", "trigger": "customer asks for a refund", "body": "1. Verify order id. …" }

Context Tools

Part of the Agent Memory OS. context_assemble is the "working memory" call: it pages every requested store — semantic, episodic, graph, and skill — for a goal, ranks the results, and packs as many as fit into a token budget, returning one formatted brief. Via the same MCP server; supports Bearer auth and x402 walk-up.

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`context_assemble`	3¢	2.7¢	Per call (embeds goal, queries up to 4 stores, ranks & packs)

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "context_assemble",
    "arguments": { "goal": "resolve this customer's billing dispute", "token_budget": 1500 }
  }
}

TOOL context_assemble Assemble a token-budgeted working-memory brief

Parameters

Field	Type	Description
goal	string	required What the brief should be useful for
token_budget	integer	optional Approx. token budget, 200-8000 (default 2000)
sources	array	optional Subset of `["semantic","episodic","graph","skill"]` to page (default all)

Returns

{ "brief": "…", "included": 9, "budget_used": 1432, "sources_failed": [] }
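
The "rank, then pack into a budget" step can be pictured as a greedy loop over already-ranked items. This is an illustrative sketch, not the platform's algorithm: the token estimate (about four characters per token) is a rough heuristic, and the choice to skip an oversized item and continue with smaller ones is an assumption. Both function names are hypothetical.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (heuristic)."""
    return max(1, len(text) // 4)


def pack_brief(ranked_items: List[str], token_budget: int = 2000) -> Tuple[List[str], int]:
    """Greedily include ranked items while they fit in the token budget."""
    included: List[str] = []
    used = 0
    for text in ranked_items:
        cost = estimate_tokens(text)
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try the next item
        included.append(text)
        used += cost
    return included, used
```

The returned pair corresponds loosely to the included and budget_used fields in the response above.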

Consolidation Tools

Part of the Agent Memory OS. memory_consolidate is background maintenance for semantic memory: it merges near-duplicate memories (dedup, cosine ≥ 0.95, keeping the higher-confidence/more-recent record) and forgets stale, low-confidence memories (decay, 90+ days old). Bounded per call and safe to run repeatedly (idempotent). Via the same MCP server; supports Bearer auth and x402 walk-up.

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`memory_consolidate`	2¢	1.8¢	Per call, bounded by `max_ops`

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_consolidate",
    "arguments": { "max_ops": 50, "operations": ["decay", "dedup"] }
  }
}

TOOL memory_consolidate Merge near-duplicates & forget stale memories

Parameters

Field	Type	Description
max_ops	integer	optional Max operations to perform, 1-100 (default 50)
operations	array	optional Subset of `["decay","dedup"]` to run (default both)

Returns

{ "scanned": 12, "merged": 3, "expired": 2, "ops_failed": [] }

Web3 Tools

Read-only on-chain lookups across Ethereum, Polygon, Base, and Solana — token balances, transaction status, NFT metadata, and ENS resolution — via the same MCP server. These tools never sign or send transactions. Both Bearer auth and x402 walk-up are supported.

Pricing

Prices shown in US cents (¢). Per call.

Tool	Bearer (balance)	x402 walk-up	Notes
`token_balance`	1¢	0.9¢	Native or token balance
`tx_status`	1¢	0.9¢
`nft_metadata`	1¢	0.9¢	EVM ERC-721
`ens_resolve`	1¢	0.9¢	Forward / reverse

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Supported chain values: "ethereum", "polygon", "base", "solana". Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "token_balance",
    "arguments": { "chain": "base", "address": "0xabc...", "token": "0xUSDC..." }
  }
}

TOOL token_balance Read a native or token balance

Parameters

Field	Type	Description
chain	string	required `"ethereum"`, `"polygon"`, `"base"`, or `"solana"`
address	string	required Wallet address to query
token	string	optional Token contract / mint. Omit for the native asset.

Returns

{ "chain": "base", "address": "0xabc...", "balance": "12.5", "symbol": "USDC", "decimals": 6 }
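
On-chain amounts are usually handled as integers in base units. The sketch below converts the returned balance using decimals. It assumes balance is a human-readable decimal string, as the sample ("12.5" with decimals 6) suggests, and it uses Decimal to avoid floating-point rounding. The helper name is hypothetical.

```python
from decimal import Decimal


def to_base_units(balance: str, decimals: int) -> int:
    """Convert a decimal balance string to an integer amount in base units."""
    scaled = Decimal(balance) * 10 ** decimals
    if scaled != scaled.to_integral_value():
        raise ValueError("balance has more fractional digits than `decimals`")
    return int(scaled)
```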

TOOL tx_status Look up a transaction's status

Parameters

Field	Type	Description
chain	string	required `"ethereum"`, `"polygon"`, `"base"`, or `"solana"`
tx_hash	string	required Transaction hash / signature

Returns

{ "status": "confirmed", "block": 12345678, "confirmations": 42 }

TOOL nft_metadata Fetch ERC-721 NFT metadata (EVM)

Parameters

Field	Type	Description
chain	string	required EVM chain: `"ethereum"`, `"polygon"`, or `"base"`
contract	string	required ERC-721 contract address
token_id	string	required Token ID

Returns

{ "name": "...", "image": "https://...", "attributes": [{ "trait_type": "...", "value": "..." }] }

TOOL ens_resolve Resolve ENS names forward or reverse

Parameters

Field	Type	Description
query	string	required ENS name (forward) or address (reverse)
direction	string	required `"forward"` (name → address) or `"reverse"` (address → name)

Returns

{ "query": "vitalik.eth", "direction": "forward", "result": "0xd8dA6BF2..." }

Document Tools

Extract text and structured data from documents: parse PDFs, pull tables/cells from Office and CSV files, and convert articles to reader-mode markdown, all via the same MCP server. pdf_parse reads from a workspace file or a URL; doc_extract reads from a workspace file.

pdf_parse and doc_extract are balance-only — they do not accept x402 walk-up. article_extract supports both Bearer and x402 walk-up.

Pricing

Prices shown in US cents (¢).

Tool	Bearer (balance)	x402 walk-up	Notes
`pdf_parse`	0.5¢ / page (min 1¢)	— (not supported)	Balance-only
`doc_extract`	0.5¢ / unit (min 1¢)	— (not supported)	Unit = sheet / table chunk
`article_extract`	1¢	0.9¢	Reader-mode markdown

Prices are configurable by the platform administrator and may change. Values shown are the current defaults.

Tools

Called via JSON-RPC tools/call at https://mcp.llm4agents.com/mcp. Example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "pdf_parse",
    "arguments": { "url": "https://example.com/report.pdf", "pages": "1-5" }
  }
}

TOOL pdf_parse Extract text from a PDF (balance-only)

Provide either a workspace_file or a url.

Parameters

Field	Type	Description
workspace_file	string	optional Workspace filename of the PDF (provide this or `url`)
url	string	optional URL of the PDF (provide this or `workspace_file`)
pages	string	optional Page range, e.g. `"1-5"` or `"3"`

Returns

{ "pages": 5, "text": "...extracted text...", "costCents": 3 }
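
The charge for a pdf_parse call can be estimated from the pages argument. The sketch below handles only the two range forms shown above ("1-5" and "3") and assumes fractional cents round up, which matches the sample response (5 pages, costCents 3) but is not stated in the pricing table. Both function names are hypothetical.

```python
import math


def count_pages(pages: str) -> int:
    """Count pages in a range string such as "1-5" or a single page such as "3"."""
    if "-" in pages:
        first, last = (int(part) for part in pages.split("-", 1))
        if last < first:
            raise ValueError("range end is before range start")
        return last - first + 1
    int(pages)  # raises ValueError if the single page is not a number
    return 1


def estimate_pdf_cost_cents(pages: str) -> int:
    """Estimate the pdf_parse charge: 0.5 cents per page, 1 cent minimum."""
    return max(1, math.ceil(count_pages(pages) / 2))
```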

TOOL doc_extract Extract data from docx / xlsx / csv (balance-only)

Parameters

Field	Type	Description
workspace_file	string	required Workspace filename of the document
format	string	required `"docx"`, `"xlsx"`, or `"csv"`

Returns

{ "format": "xlsx", "units": 2, "data": [ ... ], "costCents": 1 }

TOOL article_extract Reader-mode markdown from a web article

Parameters

Field	Type	Description
url	string	required Article URL to extract

Returns

{ "title": "...", "byline": "...", "markdown": "# Title\n\n...", "word_count": 1240 }

Gasless Stablecoin Transfers (non-custodial)

Send stablecoin transfers on Polygon from your own EOA without holding any native MATIC. You bring the private key; the platform only validates the signatures you produce and relays them to Vexo's StablecoinForwarder. The service is free: the platform holds no funds, signs nothing, and charges nothing. Your API key is used solely to tag the request for rate limiting and auditing. Requests are logged as transactions.type = "gas_sponsored" with amount_usd_cents = 0.

No custody. The from EOA is never provisioned, derived, or touched by the platform. Your private key stays on your side. The platform can only recover the signer from the signatures you submit and verify it matches the from you declared — if a byte is off, the request is rejected with 400 invalid_signature.

Two-step flow

1. POST /v1/tx/quote: ask the relay for fee + nonces + typed data
   Send {chain, token, from, to, amount}. The platform fetches the current Vexo relayer fee (action: "fee-quote"), reads the on-chain nonces(from) from the token and the forwarder, and returns the full EIP-712 payloads (Permit + TransferPermit) pre-filled for you, plus a deadline 10 minutes out. No signatures are needed yet.
2. Sign both typed-data objects locally with your wallet
   Use ethers.Wallet.signTypedData(domain, types, message) (or any EIP-712 signer) on typedData.permit and typedData.transferPermit. Split each raw signature into {v, r, s}. The platform never sees your key.
3. POST /v1/tx/send: submit the signed bundle
   Echo chain, token, from, to, amount, fee, deadline, nonces from the quote and add permitSig + transferPermitSig. The platform re-derives each EIP-712 digest, runs ecrecover, and forwards to Vexo's action: "transfer" only if both recovered signers match from. Vexo broadcasts on-chain and returns the txHash.
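
If your signer returns a raw hex signature rather than separate components, the split into {v, r, s} can be done by hand. The sketch below assumes the common 65-byte layout (32 bytes r, 32 bytes s, 1 byte v) and normalizes a 0/1 recovery ID to 27/28; compact 64-byte signatures are not handled. The helper name split_signature is hypothetical.

```python
from typing import Dict, Union


def split_signature(raw_hex: str) -> Dict[str, Union[int, str]]:
    """Split a 65-byte hex signature (r || s || v) into its v, r, s parts."""
    hex_body = raw_hex[2:] if raw_hex.startswith("0x") else raw_hex
    raw = bytes.fromhex(hex_body)
    if len(raw) != 65:
        raise ValueError("expected a 65-byte signature")
    v = raw[64]
    if v < 27:
        v += 27  # some signers emit 0/1 instead of 27/28
    return {"v": v, "r": "0x" + raw[:32].hex(), "s": "0x" + raw[32:64].hex()}
```

The resulting dict has the same shape as the permitSig and transferPermitSig objects sent to /v1/tx/send.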

What the platform does & does not do

Does	Does not
Quote Vexo's relayer fee in token base units	Hold, custody, or derive your private key
Read on-chain nonces over JSON-RPC	Move funds from any deposit wallet on your behalf
Build canonical EIP-712 domain + message	Debit your USD balance (this endpoint is free)
Verify `ecrecover(permitSig) == from`	Sign anything itself (signatures are yours)
Verify `ecrecover(transferPermitSig) == from`	Pay the Vexo fee for you (the fee comes from `from`'s token balance, same as in Vexo's standard flow)
Forward the signed bundle to Vexo	
Log the request against your API key	

Supported chains & tokens

Chain	Chain ID	Token	Contract
Polygon (mainnet)	`137`	USDC	`0x3c499c542cEF5E3811e1192ce70d8cC03d5c3359`

Example: USDC transfer with your own wallet

Two requests (/quote, /send) bracketing local signing.

import { ethers } from 'ethers';

const wallet = new ethers.Wallet(process.env.USER_PRIVATE_KEY);
const base = 'https://api.llm4agents.com';
const headers = {
  'Authorization': `Bearer ${process.env.PROXY_KEY}`,
  'Content-Type': 'application/json',
};

// 1. Ask the relay for fee + nonces + typed data
const q = await fetch(`${base}/v1/tx/quote`, {
  method: 'POST', headers,
  body: JSON.stringify({
    chain: 'polygon', token: 'USDC',
    from: wallet.address,
    to: '0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045',
    amount: '1.50',
  }),
}).then(r => r.json());

// 2. Sign both typed-data objects locally
const signEip712 = async (td) => {
  const { EIP712Domain, ...types } = td.types; // ethers wants types without EIP712Domain
  const raw = await wallet.signTypedData(td.domain, types, td.message);
  const s = ethers.Signature.from(raw);
  return { v: s.v, r: s.r, s: s.s };
};
const permitSig = await signEip712(q.typedData.permit);
const transferPermitSig = await signEip712(q.typedData.transferPermit);

// 3. Submit signed bundle
const tx = await fetch(`${base}/v1/tx/send`, {
  method: 'POST', headers,
  body: JSON.stringify({
    chain: q.chain, token: q.token, from: q.from, to: q.to, amount: q.amount,
    fee: q.fee, deadline: q.deadline, nonces: q.nonces,
    permitSig, transferPermitSig,
  }),
}).then(r => r.json());

console.log(`Sent: ${tx.txHash} (${tx.explorerUrl})`);

# 1. Get the quote
curl https://api.llm4agents.com/v1/tx/quote \
  -H "Authorization: Bearer sk-proxy-..." \
  -H "Content-Type: application/json" \
  -d '{
    "chain": "polygon",
    "token": "USDC",
    "from": "0x9f...",
    "to": "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045",
    "amount": "1.50"
  }'

# 2. Sign typedData.permit and typedData.transferPermit locally (out-of-band).

# 3. Submit the signed bundle
curl https://api.llm4agents.com/v1/tx/send \
  -H "Authorization: Bearer sk-proxy-..." \
  -H "Content-Type: application/json" \
  -d '{
    "chain": "polygon",
    "token": "USDC",
    "from": "0x9f...",
    "to": "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045",
    "amount": "1.50",
    "fee": "50000",
    "deadline": 1756000000,
    "nonces": { "token": "3", "forwarder": "7" },
    "permitSig": { "v": 27, "r": "0x...", "s": "0x..." },
    "transferPermitSig": { "v": 28, "r": "0x...", "s": "0x..." }
  }'

Notes & gotchas

You own the EOA. from must hold the USDC and produce both signatures. The platform has no key that can move your funds: not through this endpoint, not anywhere.
The Vexo fee is paid in the same stablecoin by from, in addition to amount. Vexo collects it as part of the on-chain transfer; the platform does not touch it.
deadline is 10 minutes. Sign and submit promptly. If the deadline has passed by the time you call /send, it returns 400 expired_quote; call /quote again and sign the fresh values.
Gas spikes return 409 gasless_gas_spike. This means Vexo's fee moved between your quote and your send. It is safe to retry, starting again from /quote.
Signature recovery is strict. ecrecover on both signatures must equal the checksummed from; mismatches return 400 invalid_signature with the expected and recovered addresses.
Timeouts may still land. 504 gasless_timeout means the upstream call didn't complete in time; the on-chain transfer can still confirm afterwards. Idempotency is the caller's responsibility.
amount is a decimal string. JSON numbers lose precision; always pass a string like "1.50". Same for fee (uint base units) and nonces.*.
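
A short sketch of the amount arithmetic behind the notes above. It assumes USDC's 6 decimal places; the helper names `to_base_units` and `total_debit` are hypothetical. Parsing the string with `Decimal` avoids the float rounding that the last note warns about.

```python
from decimal import Decimal

USDC_DECIMALS = 6  # USDC uses 6 decimal places


def to_base_units(amount: str, decimals: int = USDC_DECIMALS) -> int:
    """Convert a decimal string such as "1.50" to integer base units."""
    scaled = Decimal(amount).scaleb(decimals)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount!r} has more than {decimals} decimal places")
    return int(scaled)


def total_debit(amount: str, fee_base_units: str) -> int:
    """Base units leaving `from`: the transfer amount plus the relayer fee."""
    return to_base_units(amount) + int(fee_base_units)


print(to_base_units("1.50"))         # 1500000
print(total_debit("1.50", "50000"))  # 1550000
```

With the values from the curl example, from needs at least 1.55 USDC: 1.50 for the transfer and 0.05 for the fee.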

Billing

The API uses a reserve-proxy-settle pattern for each chat completion request:

Reserve
Before each request, the estimated cost (based on input tokens + max output tokens) is reserved from your balance. If your balance is insufficient, the request is rejected with 402 insufficient_balance.
Proxy
The request is forwarded to the LLM provider.
Settle
After the response completes, the actual cost is calculated from real token counts. Any over-reserved amount is returned to your balance.
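
The arithmetic of the three phases can be sketched as follows. This is an illustration under stated assumptions, not the platform's billing code: the per-token prices are hypothetical, and the exact estimation formula beyond "input tokens + max output tokens" is not specified in this document.

```python
from decimal import Decimal

# Hypothetical per-token prices in USD cents (real ones come from GET /api/v1/models).
INPUT_PRICE = Decimal("0.00003")
OUTPUT_PRICE = Decimal("0.00015")


def reserve_cents(input_tokens: int, max_output_tokens: int) -> Decimal:
    """Worst-case cost: all input tokens plus the full output allowance."""
    return input_tokens * INPUT_PRICE + max_output_tokens * OUTPUT_PRICE


def settle(reserved: Decimal, input_tokens: int, output_tokens: int):
    """Return (actual cost, amount handed back to the balance)."""
    actual = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
    refund = max(reserved - actual, Decimal(0))
    return actual, refund


balance = Decimal("5")
reserved = reserve_cents(input_tokens=1200, max_output_tokens=1000)
if balance < reserved:
    raise SystemExit("402 insufficient_balance")
actual, refund = settle(reserved, input_tokens=1200, output_tokens=250)
print(reserved, actual, refund)
```

Because the reservation assumes the full max_tokens allowance, a lower max_tokens reduces the balance needed to pass the reserve step.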

Pricing

Each model has per-token pricing (input and output) visible via GET /api/v1/models. These are the final prices you pay.

Billing Headers

Every chat completion response includes billing information in the response headers:

Header	Description
`X-Cost-Usd-Cents`	Actual cost of this request in USD cents
`X-Tokens-Input`	Number of input tokens used
`X-Tokens-Output`	Number of output tokens generated
`X-Balance-Remaining-Cents`	Your remaining balance after this request
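
A minimal sketch of reading these headers on the client. It assumes the header values are plain numeric strings, which this document does not state explicitly; the function name `parse_billing_headers` and the sample values are hypothetical.

```python
from decimal import Decimal


def parse_billing_headers(headers: dict) -> dict:
    """Extract the billing headers; header names are matched case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "cost_cents": Decimal(lowered["x-cost-usd-cents"]),
        "tokens_input": int(lowered["x-tokens-input"]),
        "tokens_output": int(lowered["x-tokens-output"]),
        "balance_remaining_cents": Decimal(lowered["x-balance-remaining-cents"]),
    }


# Made-up header values for illustration only.
sample_headers = {
    "X-Cost-Usd-Cents": "0.0735",
    "X-Tokens-Input": "1200",
    "X-Tokens-Output": "250",
    "X-Balance-Remaining-Cents": "499.9265",
}
billing = parse_billing_headers(sample_headers)
print(billing)
```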

Models

Available models are synced from upstream providers and updated regularly. Use GET /api/v1/models to see current pricing and availability.

Model slugs follow the format: provider/model-name. Examples:

anthropic/claude-3-haiku
openai/gpt-4o
google/gemini-pro
meta-llama/llama-3-70b-instruct

You can use the search query parameter to filter models by name or slug.
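
As a small illustration, the slug format and the search parameter can be handled like this. The helper names `models_url` and `split_slug` are hypothetical, and the response shape of the endpoint is not assumed here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.llm4agents.com"


def models_url(search=None) -> str:
    """Build the models listing URL, optionally filtered with ?search=."""
    url = f"{BASE_URL}/api/v1/models"
    return f"{url}?{urlencode({'search': search})}" if search else url


def split_slug(slug: str):
    """Split "provider/model-name" into its two parts."""
    provider, sep, name = slug.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"not a provider/model-name slug: {slug!r}")
    return provider, name


print(models_url("claude"))
print(split_slug("meta-llama/llama-3-70b-instruct"))
```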

Model Fallbacks

Send a prioritized list of models using the models parameter instead of model. If the primary model fails (rate limits, downtime, context length errors, moderation), the next model in the list is tried automatically.

How It Works

Send models array — provide 2-3 model slugs in priority order.
Automatic failover — if the first model fails, the next is tried seamlessly.
Check which model responded — the X-Model-Used response header tells you which model actually processed the request.

Rules

models and model are mutually exclusive — use one or the other, not both.
The array must contain 2 to 3 model slugs.
All models must be active and approved in the platform.
No duplicate slugs allowed.
Billing reserves balance using the most expensive model in the list to prevent under-reservation. The final charge uses the model that actually responded.
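
The rules above can be checked on the client before sending a request. The sketch below is an assumption-laden illustration: the function names are hypothetical, the "active and approved" rule cannot be verified locally and is left out, and the single price per slug is a stand-in for whatever pricing the platform compares.

```python
def validate_model_selection(body: dict) -> list:
    """Return a list of rule violations; an empty list means the body passes."""
    problems = []
    has_model, has_models = "model" in body, "models" in body
    if has_model and has_models:
        problems.append("model and models are mutually exclusive")
    if not has_model and not has_models:
        problems.append("one of model or models is required")
    if has_models:
        slugs = body["models"]
        if not 2 <= len(slugs) <= 3:
            problems.append("models must contain 2 to 3 slugs")
        if len(set(slugs)) != len(slugs):
            problems.append("models must not contain duplicates")
    return problems


def reservation_model(slugs, price_by_slug):
    """Pick the slug that drives the reservation: the most expensive one."""
    return max(slugs, key=lambda slug: price_by_slug[slug])


# Hypothetical prices, for illustration only.
prices = {"anthropic/claude-sonnet-4": 15, "openai/gpt-4o": 10}
body = {"models": ["anthropic/claude-sonnet-4", "openai/gpt-4o"], "messages": []}
print(validate_model_selection(body))
print(reservation_model(body["models"], prices))
```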

Fallback Triggers

Fallback activates when the primary model encounters:

Context length validation errors
Rate limiting from the upstream provider
Provider downtime or errors
Content moderation rejections

Example

curl https://api.llm4agents.com/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["anthropic/claude-sonnet-4", "openai/gpt-4o"],
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# Check which model responded:
# Response header X-Model-Used: anthropic/claude-sonnet-4

from openai import OpenAI

client = OpenAI(
    base_url="https://api.llm4agents.com/v1",
    api_key="your-api-key"
)

response = client.chat.completions.with_raw_response.create(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "models": ["anthropic/claude-sonnet-4", "openai/gpt-4o"]
    }
)
print(response.headers["x-model-used"])

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.llm4agents.com/v1',
  apiKey: 'your-api-key',
});

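const response = await client.chat.completions.create({
  // `models` replaces `model`; the two are mutually exclusive.
  // The SDK types do not include `models`, so the body is cast.
  // Extra body fields are sent to the API as-is.
  models: ['anthropic/claude-sonnet-4', 'openai/gpt-4o'],
  messages: [{ role: 'user', content: 'Hello' }],
} as unknown as OpenAI.Chat.ChatCompletionCreateParamsNonStreaming);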

OpenAI Compatibility

The /v1/chat/completions endpoint is fully compatible with OpenAI's API format. Any SDK or tool that supports OpenAI can use this API by changing the base_url.

Supported Parameters

model -- model slug (required unless using models)
models -- array of 2-3 model slugs for fallback routing (alternative to model)
messages -- message array (required)
temperature -- sampling temperature (0-2)
max_tokens -- maximum output tokens
stream -- enable Server-Sent Events streaming

Any additional fields in the request body are passed through to the upstream provider, allowing you to use provider-specific parameters.

SDK Examples

from openai import OpenAI

client = OpenAI(
    base_url="https://api.llm4agents.com/v1",
    api_key="your-api-key"
)

# Streaming example
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.llm4agents.com/v1',
  apiKey: 'your-api-key',
});

// Streaming example
const stream = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

curl https://api.llm4agents.com/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Explain quantum computing"}],
    "temperature": 0.7,
    "max_tokens": 1000
  }'

Supported Chains

Chain	Tokens	Notes
Solana	USDT, USDC	Fast finality, low fees
Polygon	USDT, USDC	EVM-compatible, low fees

Send stablecoins to your generated deposit wallet address. Deposits are verified on-chain and credited to your LLM4Agents balance — the balance that funds LLM completions, scraper, search, and image tools. Gasless-transfer relayer fees are charged separately by Vexo from the agent's own EOA in the token being transferred and never debit the LLM4Agents balance.

Errors

All error responses include an error field with a machine-readable code and a message field with a human-readable description.

{
  "error": "insufficient_balance",
  "message": "Not enough funds. Required: 15 cents, available: 3 cents",
  "requestId": "req_..."
}

HTTP Status	Code	Description
400	`validation_error`	Invalid request body or parameters
401	`missing_api_key`	No Authorization header provided
401	`invalid_api_key`	API key not recognized
402	`insufficient_balance`	Not enough funds for this request
403	`agent_suspended`	Agent account has been suspended
404	`model_not_found`	Requested model is not available
422	`model_disabled`	Model exists but is currently disabled
429	`rate_limited`	Too many requests, slow down
400	`invalid_signature`	`ecrecover` on `permitSig` or `transferPermitSig` did not match `from` on `/v1/tx/send`
400	`expired_quote`	The `deadline` returned by `/v1/tx/quote` has passed; request a fresh quote
400	`unsupported_chain` / `unsupported_token`	Chain or token not currently wired for gasless transfers
409	`gasless_gas_spike`	Vexo's fee changed between `/quote` and `/send`; safe to retry with a new quote
502	`provider_error`	Error from the upstream LLM provider
502	`gasless_upstream_error`	Vexo returned an upstream error on `/v1/tx/send`
503	`gasless_operator_unavailable`	Vexo's relayer operator temporarily unavailable on `/v1/tx/send`
504	`gasless_timeout`	Gasless submit timed out; the transfer may still land on-chain
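
One way a client might turn these codes into a next step is a small lookup table. This is a sketch, not prescribed behavior: the action names are hypothetical, and only the codes whose descriptions state a recovery path are mapped. Everything else falls through to "fail".

```python
# Error code -> suggested client action, following the descriptions in the table.
RETRY_ACTIONS = {
    "expired_quote": "requote",              # request a fresh quote
    "gasless_gas_spike": "requote",          # safe to retry with a new quote
    "rate_limited": "backoff",               # slow down, then retry
    "gasless_timeout": "check_chain_first",  # the transfer may still land
}


def next_action(error_body: dict) -> str:
    """Map an error response body to a suggested action; default is to fail."""
    return RETRY_ACTIONS.get(error_body.get("error"), "fail")


sample = {
    "error": "insufficient_balance",
    "message": "Not enough funds. Required: 15 cents, available: 3 cents",
    "requestId": "req_...",
}
print(next_action(sample))                        # fail
print(next_action({"error": "gasless_timeout"}))  # check_chain_first
```

Keeping gasless_timeout separate from the plain retry cases matters because resubmitting blindly could send the same transfer twice.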

Rate Limits

Endpoint	Limit	Window
Registration (`/api/v1/agents/register`)	5 requests	per IP per minute
Chat completions (`/v1/chat/completions`)	600 requests	per API key per minute
Other authenticated endpoints	120 requests	per API key per minute

When rate limited, the API returns 429 Too Many Requests. Implement exponential backoff in your client.
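
A minimal sketch of exponential backoff with jitter. The `RateLimited` exception, the function names, and the default delays are all assumptions: the caller's request function is expected to raise `RateLimited` when it sees a 429. The demo injects a fake request and a recording `sleep`, so it does not actually wait.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller's request function on HTTP 429 (hypothetical)."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Upper bound for the wait before retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(request, max_attempts=5, sleep=time.sleep, jitter=random.random):
    """Call `request`, retrying on RateLimited with a jittered exponential delay."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt) * jitter())


# Demo: a fake request that is rate limited twice, then succeeds.
outcomes = iter([RateLimited(), RateLimited(), "ok"])


def fake_request():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


waits = []
result = call_with_backoff(fake_request, sleep=waits.append, jitter=lambda: 1.0)
print(result, waits)
```

The random jitter factor spreads retries from many clients so they do not all hit the same window boundary at once.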
