Recall automatically calculates costs for all workflow executions, providing transparent pricing based on AI model usage and execution charges. Understanding these costs helps you optimize workflows and manage your budget effectively.
Credits
Recall uses credits as the unit of measurement for all usage. 1 credit = $0.005.
All plan limits, usage meters, and billing thresholds are displayed in credits throughout the Recall UI. Dollar amounts in this documentation are provided for reference.
How Costs Are Calculated
Every workflow execution includes two cost components:
- Base Execution Charge: 1 credit ($0.005) per execution
- AI Model Usage: Variable cost based on token consumption
```
modelCost = (inputTokens × inputPrice + outputTokens × outputPrice) / 1,000,000
totalCredits = baseExecutionCharge + modelCost × 200
```

AI model prices are quoted per million tokens, so the calculation divides by 1,000,000 to get the dollar cost, then multiplies by 200 to convert dollars to credits (1 credit = $0.005). Workflows without AI blocks incur only the base execution charge.
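The formula can be sketched in Python. The rates plugged in below are the hosted GPT-4o prices from the tables that follow; the function name is illustrative, not part of Recall's API:

```python
def execution_cost_credits(input_tokens, output_tokens,
                           input_price, output_price,
                           base_charge_credits=1):
    """Total credits for one execution. Prices are USD per million tokens."""
    model_cost_usd = (input_tokens * input_price +
                      output_tokens * output_price) / 1_000_000
    # 1 credit = $0.005, so multiply dollars by 200 to convert to credits
    return base_charge_credits + model_cost_usd * 200

# Example: 10,000 input + 1,000 output tokens on hosted GPT-4o
# ($2.75 / $11.00 per million tokens) → about 8.7 credits
credits = execution_cost_credits(10_000, 1_000, 2.75, 11.00)
```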
Model Breakdown in Logs
For workflows using AI blocks, you can view detailed cost information in the logs:

The model breakdown shows:
- Token Usage: Input and output token counts for each model
- Cost Breakdown: Individual costs per model and operation
- Model Distribution: Which models were used and how many times
- Total Cost: Aggregate cost for the entire workflow execution
Pricing Options
Hosted Models - Recall provides API keys with a 1.1x pricing multiplier for Agent blocks:
OpenAI
| Model | Base Price (Input/Output) | Hosted Price (Input/Output) |
|---|---|---|
| GPT-5.1 | $1.25 / $10.00 | $1.38 / $11.00 |
| GPT-5 | $1.25 / $10.00 | $1.38 / $11.00 |
| GPT-5 Mini | $0.25 / $2.00 | $0.28 / $2.20 |
| GPT-5 Nano | $0.05 / $0.40 | $0.06 / $0.44 |
| GPT-4o | $2.50 / $10.00 | $2.75 / $11.00 |
| GPT-4.1 | $2.00 / $8.00 | $2.20 / $8.80 |
| GPT-4.1 Mini | $0.40 / $1.60 | $0.44 / $1.76 |
| GPT-4.1 Nano | $0.10 / $0.40 | $0.11 / $0.44 |
| o1 | $15.00 / $60.00 | $16.50 / $66.00 |
| o3 | $2.00 / $8.00 | $2.20 / $8.80 |
| o4 Mini | $1.10 / $4.40 | $1.21 / $4.84 |
Anthropic
| Model | Base Price (Input/Output) | Hosted Price (Input/Output) |
|---|---|---|
| Claude Opus 4.5 | $5.00 / $25.00 | $5.50 / $27.50 |
| Claude Opus 4.1 | $15.00 / $75.00 | $16.50 / $82.50 |
| Claude Sonnet 4.5 | $3.00 / $15.00 | $3.30 / $16.50 |
| Claude Sonnet 4.0 | $3.00 / $15.00 | $3.30 / $16.50 |
| Claude Haiku 4.5 | $1.00 / $5.00 | $1.10 / $5.50 |
Google
| Model | Base Price (Input/Output) | Hosted Price (Input/Output) |
|---|---|---|
| Gemini 3 Pro Preview | $2.00 / $12.00 | $2.20 / $13.20 |
| Gemini 2.5 Pro | $1.25 / $10.00 | $1.38 / $11.00 |
| Gemini 2.5 Flash | $0.30 / $2.50 | $0.33 / $2.75 |
The 1.1x multiplier covers infrastructure and API management costs.
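The hosted rates in the tables above are the base rates with the 1.1x multiplier applied, rounded to the cent. A quick sketch:

```python
def hosted_price(base_price_usd, multiplier=1.1):
    """Hosted per-million-token price: base rate times the 1.1x
    multiplier, rounded to the cent (matches the tables above)."""
    return round(base_price_usd * multiplier, 2)

# GPT-5 base input price $1.25 → hosted $1.38
gpt5_input = hosted_price(1.25)
# GPT-5 Mini base input price $0.25 → hosted $0.28
gpt5_mini_input = hosted_price(0.25)
```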
Your Own API Keys - Use any model at base pricing:
| Provider | Example Models | Input / Output |
|---|---|---|
| Deepseek | V3, R1 | $0.75 / $1.00 |
| xAI | Grok 4 Latest, Grok 3 | $3.00 / $15.00 |
| Groq | Llama 4 Scout, Llama 3.3 70B | $0.11 / $0.34 |
| Cerebras | Llama 4 Scout, Llama 3.3 70B | $0.11 / $0.34 |
| Ollama | Local models | Free |
| VLLM | Local models | Free |
Pay providers directly with no markup.
Pricing shown reflects rates as of September 10, 2025. Check provider documentation for current pricing.
Bring Your Own Key (BYOK)
Use your own API keys for AI model providers instead of Recall's hosted keys to pay base prices with no markup.
Supported Providers
| Provider | Usage |
|---|---|
| OpenAI | Knowledge Base embeddings, Agent block |
| Anthropic | Agent block |
| Google | Agent block |
| Mistral | Knowledge Base OCR |
Setup
- Navigate to Settings → BYOK in your workspace
- Click Add Key for your provider
- Enter your API key and save
BYOK keys are encrypted at rest. Only workspace admins can manage keys.
When configured, workflows use your key instead of Recall's hosted keys. If removed, workflows automatically fall back to hosted keys with the multiplier.
Voice Input
Voice input uses ElevenLabs Scribe v2 Realtime for speech-to-text transcription. It is available in the Mothership chat and in deployed chat voice mode.
| Context | Cost per session | Max duration |
|---|---|---|
| Mothership (workspace) | ~5 credits ($0.024) | 3 minutes |
| Deployed chat (voice mode) | ~2 credits ($0.008) | 1 minute |
Each voice session is billed when it starts. In deployed chat voice mode, each conversation turn (speak → agent responds → speak again) is a separate session. Multi-turn conversations are billed per turn.
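Since deployed-chat voice mode bills each turn as its own session, a multi-turn conversation's cost is roughly linear in the number of turns. A rough estimate (the ~2 credits per session from the table above is approximate):

```python
# Approximate per-session cost in deployed chat voice mode (see table above)
CREDITS_PER_DEPLOYED_TURN = 2

def voice_conversation_credits(turns):
    """Rough cost of a multi-turn deployed-chat voice conversation:
    each turn starts a new billed session."""
    return turns * CREDITS_PER_DEPLOYED_TURN

# A 5-turn conversation costs roughly 10 credits
cost = voice_conversation_credits(5)
```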
Voice input requires `ELEVENLABS_API_KEY` to be configured. When the key is not set, voice input controls are hidden.
Plans
Recall has two paid plan tiers — Pro and Max. Either can be used individually or with a team. Team plans pool credits across all seats in the organization.
| Plan | Price | Credits Included | Daily Refresh |
|---|---|---|---|
| Community | $0 | 1,000 (one-time) | — |
| Pro | $25/mo | 6,000/mo | +50/day |
| Max | $100/mo | 25,000/mo | +200/day |
| Enterprise | Custom | Custom | — |
To use Pro or Max with a team, select Get For Team in subscription settings and choose the tier and number of seats. Credits are pooled across the organization at the per-seat rate (e.g. Max for Teams with 3 seats = 75,000 credits/mo pooled).
Daily Refresh Credits
Paid plans include a small daily credit allowance that does not count toward your plan limit. Each day, usage up to the daily refresh amount is excluded from billable usage. This allowance resets every 24 hours and does not carry over — use it or lose it.
| Plan | Daily Refresh |
|---|---|
| Pro | 50 credits/day ($0.25) |
| Max | 200 credits/day ($1.00) |
For team plans, the daily refresh scales with seats (e.g. Max for Teams with 3 seats = 600 credits/day).
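Under the rule described above (each day, usage up to the refresh amount is excluded from billable usage, with no carry-over), the effect on billable credits can be sketched as:

```python
def billable_credits(daily_usage, daily_refresh):
    """Credits that count toward the plan limit each day: only usage
    beyond the daily refresh allowance. Unused allowance is lost."""
    return [max(0, used - daily_refresh) for used in daily_usage]

# Pro plan (50 credits/day refresh) over three days of usage
billable = billable_credits([30, 80, 50], daily_refresh=50)
# Day 1: 0 billable (under allowance), Day 2: 30, Day 3: 0
```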
Annual Billing
All paid plans are available with annual billing at a 15% discount. Switch between monthly and annual billing in Settings → Subscription.
| Plan | Monthly | Annual (per month) | Annual Total |
|---|---|---|---|
| Pro | $25/mo | $21.25/mo | $255/yr |
| Max | $100/mo | $85/mo | $1,020/yr |
Team plans follow the same pricing per seat.
On-Demand Billing
By default, your usage is capped at the credits included in your plan. To allow usage beyond your plan's included amount, you can either enable on-demand billing or manually edit your usage limit to any value above your plan's minimum.
- Enable On-Demand: Removes the usage cap entirely. You pay for any overage at the end of the billing period.
- Edit Usage Limit: Set a specific cap above your plan's included amount to control how much overage you're willing to allow.
- Disable On-Demand: Resets your usage limit back to the plan's included amount (only available if your current usage hasn't already exceeded it).
On-demand billing is managed by workspace admins for team plans. Non-admin team members cannot toggle on-demand billing.
Plan Limits
Rate Limits
| Plan | Sync (req/min) | Async (req/min) |
|---|---|---|
| Free | 50 | 200 |
| Pro | 150 | 1,000 |
| Max | 300 | 2,500 |
| Enterprise | 600 | 5,000 |
Individual Max and team plans (Pro or Max for Teams) both use the Max-tier rate limits.
Concurrent Execution Limits
| Plan | Concurrent Executions |
|---|---|
| Free | 5 |
| Pro | 50 |
| Max / Team | 200 |
| Enterprise | 200 (customizable) |
Concurrent execution limits control how many workflow executions can run simultaneously within a workspace. When the limit is reached, new executions are queued and admitted as running executions complete. Manual runs from the editor are not subject to these limits.
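The queue-and-admit behavior can be approximated client-side with a semaphore. This is only an illustration of the admission pattern, not part of Recall's API:

```python
import threading
import time

CONCURRENCY_LIMIT = 5  # Free-plan limit from the table above

slots = threading.Semaphore(CONCURRENCY_LIMIT)
active = 0   # executions currently running
peak = 0     # highest concurrency observed
lock = threading.Lock()

def run_execution():
    global active, peak
    with slots:  # blocks (queues) when the limit is reached
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stand-in for the workflow execution itself
        with lock:
            active -= 1

# Submit 20 executions; at most CONCURRENCY_LIMIT run at once
threads = [threading.Thread(target=run_execution) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```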
File Storage
| Plan | Storage |
|---|---|
| Free | 5 GB |
| Pro | 50 GB |
| Max | 500 GB |
| Enterprise | 500 GB (customizable) |
Team plans (Pro or Max for Teams) use 500 GB.
Execution Time Limits
| Plan | Sync | Async |
|---|---|---|
| Free | 5 minutes | 90 minutes |
| Pro / Max / Team / Enterprise | 50 minutes | 90 minutes |
Sync executions run immediately and return results directly. These are triggered via the API with `async: false` (the default) or through the UI.
Async executions (triggered via the API with `async: true`, webhooks, or schedules) run in the background.
If a workflow exceeds its time limit, it will be terminated and marked as failed with a timeout error. Design long-running workflows to use async execution or break them into smaller workflows.
Billing Model
Recall uses a base subscription + overage billing model:
How It Works
Pro Plan ($25/month, 6,000 credits):
- Monthly subscription includes 6,000 credits of usage
- Usage under 6,000 credits → No additional charges
- Usage over 6,000 credits (with on-demand enabled) → Pay the overage at month end
- Example: 7,000 credits used = $25 (subscription) + $5 (overage for 1,000 extra credits at $0.005/credit)
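The Pro example works out as follows (a sketch using the plan numbers above; the function name is illustrative):

```python
def monthly_bill_usd(credits_used, included_credits=6_000,
                     subscription_usd=25.00, credit_price_usd=0.005):
    """Pro-plan monthly bill with on-demand enabled:
    subscription plus any overage at $0.005/credit."""
    overage_credits = max(0, credits_used - included_credits)
    return subscription_usd + overage_credits * credit_price_usd

# 7,000 credits used → $25 subscription + $5 overage = $30
bill = monthly_bill_usd(7_000)
```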
Team Plans:
- Usage is pooled across all team members in the organization
- Overage is calculated from total team usage against the pooled limit
- Organization owner receives one bill
Enterprise Plans:
- Fixed monthly price, no overages
- Custom usage limits per agreement
Threshold Billing
When on-demand is enabled and unbilled overage reaches $50, Recall automatically bills the full unbilled amount.
Example:
- Day 10: $70 overage → Bill $70 immediately
- Day 15: Additional $35 usage ($105 total, $35 unbilled) → Below the $50 threshold, no charge
- Day 20: Another $50 usage ($155 total, $85 unbilled) → Bill $85 immediately
This spreads large overage charges throughout the month instead of one large bill at period end.
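The threshold rule can be simulated to reproduce the example above (function and variable names are illustrative):

```python
THRESHOLD_USD = 50.0

def threshold_bills(daily_overage_usd):
    """Simulate threshold billing: whenever unbilled overage reaches
    $50, the full unbilled amount is charged. Returns the charge made
    after each day's usage (0.0 when nothing is billed)."""
    billed_total = 0.0
    running_total = 0.0
    charges = []
    for overage in daily_overage_usd:
        running_total += overage
        unbilled = running_total - billed_total
        if unbilled >= THRESHOLD_USD:
            charges.append(unbilled)
            billed_total = running_total
        else:
            charges.append(0.0)
    return charges

# The example above: $70, then $35, then $50 of overage
charges = threshold_bills([70.0, 35.0, 50.0])
# → charges of [70.0, 0.0, 85.0]
```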
Usage Monitoring
Monitor your usage and billing in Settings → Subscription:
- Current Usage: Real-time credit usage for the current billing period
- Usage Limits: Plan limits with a visual progress bar
- On-Demand Billing: Toggle on-demand billing to allow usage beyond your plan's included credits
- Plan Management: Upgrade, downgrade, or switch between monthly and annual billing
Programmatic Usage Tracking
You can query your current usage and limits programmatically using the API:
Endpoint:
Endpoint:

```
GET /api/users/me/usage-limits
```

Authentication:

- Include your API key in the `X-API-Key` header

Example Request:

```shell
curl -X GET \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  https://tryrecall.com/api/users/me/usage-limits
```

Example Response:
```json
{
  "success": true,
  "rateLimit": {
    "sync": {
      "isLimited": false,
      "requestsPerMinute": 150,
      "maxBurst": 300,
      "remaining": 300,
      "resetAt": "2025-09-08T22:51:55.999Z"
    },
    "async": {
      "isLimited": false,
      "requestsPerMinute": 1000,
      "maxBurst": 2000,
      "remaining": 2000,
      "resetAt": "2025-09-08T22:51:56.155Z"
    },
    "authType": "api"
  },
  "usage": {
    "currentPeriodCost": 12.34,
    "limit": 100,
    "plan": "pro_6000"
  }
}
```

Rate Limit Fields:

- `requestsPerMinute`: Sustained rate limit (tokens refill at this rate)
- `maxBurst`: Maximum tokens you can accumulate (burst capacity)
- `remaining`: Current tokens available (can be up to `maxBurst`)
Response Fields:

- `currentPeriodCost`: Usage in the current billing period (in dollars)
- `limit`: Derived from individual limits (Free/Pro/Max) or pooled organization limits (Team/Enterprise)
- `plan`: The highest-priority active plan associated with your user
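A minimal sketch of consuming this response, for example to back off before hitting limits. The JSON mirrors the example above; the pause policy and thresholds are illustrative, not part of Recall's API:

```python
import json

# Example response body (copied from the documented example above)
response_body = '''
{
  "success": true,
  "rateLimit": {
    "sync": {"isLimited": false, "requestsPerMinute": 150,
             "maxBurst": 300, "remaining": 300,
             "resetAt": "2025-09-08T22:51:55.999Z"},
    "async": {"isLimited": false, "requestsPerMinute": 1000,
              "maxBurst": 2000, "remaining": 2000,
              "resetAt": "2025-09-08T22:51:56.155Z"},
    "authType": "api"
  },
  "usage": {"currentPeriodCost": 12.34, "limit": 100, "plan": "pro_6000"}
}
'''

data = json.loads(response_body)
sync_remaining = data["rateLimit"]["sync"]["remaining"]
usage_fraction = data["usage"]["currentPeriodCost"] / data["usage"]["limit"]

# Illustrative policy: pause dispatching when burst tokens run low
# or usage nears the billing limit
should_pause = sync_remaining < 10 or usage_fraction > 0.9
```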
Cost Optimization Strategies
- Model Selection: Choose models based on task complexity. Simple tasks can use GPT-4.1-nano while complex reasoning might need o1 or Claude Opus.
- Prompt Engineering: Well-structured, concise prompts reduce token usage without sacrificing quality.
- Local Models: Use Ollama or VLLM for non-critical tasks to eliminate API costs entirely.
- Caching and Reuse: Store frequently used results in variables or files to avoid repeated AI model calls.
- Batch Processing: Process multiple items in a single AI request rather than making individual calls.
Next Steps
- Review your current usage in Settings → Subscription
- Learn about Logging to track execution details
- Explore the External API for programmatic cost monitoring
- Check out workflow optimization techniques to reduce costs