Recall automatically calculates costs for all workflow executions, providing transparent pricing based on AI model usage and execution charges. Understanding these costs helps you optimize workflows and manage your budget effectively.
Credits
Recall uses credits as the unit of measurement for all usage. 1 credit = $0.005.
All plan limits, usage meters, and billing thresholds are displayed in credits throughout the Recall UI. Dollar amounts in this documentation are provided for reference.
How Costs Are Calculated
Every workflow execution includes two cost components:
- Base Execution Charge: 1 credit ($0.005) per execution
- AI Model Usage: Variable cost based on token consumption
```
modelCost = (inputTokens × inputPrice + outputTokens × outputPrice) / 1,000,000
totalCredits = baseExecutionCharge + modelCost × 200
```

AI model prices are quoted per million tokens, so the calculation divides by 1,000,000 to get the dollar cost, then multiplies by 200 to convert dollars to credits (1 credit = $0.005). Workflows without AI blocks incur only the base execution charge.
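The formula can be sketched in Python. The rates plugged in below are the hosted GPT-4o prices from the tables that follow; the function name is illustrative, not part of Recall's API:

```python
def execution_cost_credits(input_tokens, output_tokens,
                           input_price, output_price,
                           base_charge_credits=1):
    """Total credits for one execution. Prices are USD per million tokens."""
    model_cost_usd = (input_tokens * input_price +
                      output_tokens * output_price) / 1_000_000
    # 1 credit = $0.005, so multiply dollars by 200 to convert to credits
    return base_charge_credits + model_cost_usd * 200

# Example: 10,000 input + 1,000 output tokens on hosted GPT-4o
# ($2.75 / $11.00 per million tokens) → about 8.7 credits
credits = execution_cost_credits(10_000, 1_000, 2.75, 11.00)
```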
Model Breakdown in Logs
For workflows using AI blocks, you can view detailed cost information in the logs:

The model breakdown shows:
- Token Usage: Input and output token counts for each model
- Cost Breakdown: Individual costs per model and operation
- Model Distribution: Which models were used and how many times
- Total Cost: Aggregate cost for the entire workflow execution
Pricing Options
Hosted Models - Recall provides API keys with a 1.1x pricing multiplier for Agent blocks:
OpenAI
| Model | Base Price (Input/Output) | Hosted Price (Input/Output) |
|---|---|---|
| GPT-5.1 | $1.25 / $10.00 | $1.38 / $11.00 |
| GPT-5 | $1.25 / $10.00 | $1.38 / $11.00 |
| GPT-5 Mini | $0.25 / $2.00 | $0.28 / $2.20 |
| GPT-5 Nano | $0.05 / $0.40 | $0.06 / $0.44 |
| GPT-4o | $2.50 / $10.00 | $2.75 / $11.00 |
| GPT-4.1 | $2.00 / $8.00 | $2.20 / $8.80 |
| GPT-4.1 Mini | $0.40 / $1.60 | $0.44 / $1.76 |
| GPT-4.1 Nano | $0.10 / $0.40 | $0.11 / $0.44 |
| o1 | $15.00 / $60.00 | $16.50 / $66.00 |
| o3 | $2.00 / $8.00 | $2.20 / $8.80 |
| o4 Mini | $1.10 / $4.40 | $1.21 / $4.84 |
Anthropic
| Model | Base Price (Input/Output) | Hosted Price (Input/Output) |
|---|---|---|
| Claude Opus 4.5 | $5.00 / $25.00 | $5.50 / $27.50 |
| Claude Opus 4.1 | $15.00 / $75.00 | $16.50 / $82.50 |
| Claude Sonnet 4.5 | $3.00 / $15.00 | $3.30 / $16.50 |
| Claude Sonnet 4.0 | $3.00 / $15.00 | $3.30 / $16.50 |
| Claude Haiku 4.5 | $1.00 / $5.00 | $1.10 / $5.50 |
Google
| Model | Base Price (Input/Output) | Hosted Price (Input/Output) |
|---|---|---|
| Gemini 3 Pro Preview | $2.00 / $12.00 | $2.20 / $13.20 |
| Gemini 2.5 Pro | $1.25 / $10.00 | $1.38 / $11.00 |
| Gemini 2.5 Flash | $0.30 / $2.50 | $0.33 / $2.75 |
The 1.1x multiplier covers infrastructure and API management costs.
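The hosted rates in the tables above are the base rates with the 1.1x multiplier applied, rounded to the cent. A quick sketch:

```python
def hosted_price(base_price_usd, multiplier=1.1):
    """Hosted per-million-token price: base rate times the 1.1x
    multiplier, rounded to the cent (matches the tables above)."""
    return round(base_price_usd * multiplier, 2)

# GPT-5 base input price $1.25 → hosted $1.38
gpt5_input = hosted_price(1.25)
# GPT-5 Mini base input price $0.25 → hosted $0.28
gpt5_mini_input = hosted_price(0.25)
```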
Your Own API Keys - Use any model at base pricing:
| Provider | Example Models | Input / Output |
|---|---|---|
| Deepseek | V3, R1 | $0.75 / $1.00 |
| xAI | Grok 4 Latest, Grok 3 | $3.00 / $15.00 |
| Groq | Llama 4 Scout, Llama 3.3 70B | $0.11 / $0.34 |
| Cerebras | Llama 4 Scout, Llama 3.3 70B | $0.11 / $0.34 |
| Ollama | Local models | Free |
| VLLM | Local models | Free |
Pay providers directly with no markup.
Pricing shown reflects rates as of September 10, 2025. Check provider documentation for current pricing.
Bring Your Own Key (BYOK)
Use your own API keys for AI model providers instead of Recall's hosted keys to pay base prices with no markup.
Supported Providers
| Provider | Usage |
|---|---|
| OpenAI | Knowledge Base embeddings, Agent block |
| Anthropic | Agent block |
| Google | Agent block |
| Mistral | Knowledge Base OCR |
Setup
- Navigate to Settings → BYOK in your workspace
- Click Add Key for your provider
- Enter your API key and save
BYOK keys are encrypted at rest. Only workspace admins can manage keys.
When configured, workflows use your key instead of Recall's hosted keys. If removed, workflows automatically fall back to hosted keys with the multiplier.
Voice Input
Voice input uses ElevenLabs Scribe v2 Realtime for speech-to-text transcription. It is available in the Mothership chat and in deployed chat voice mode.
| Context | Cost per session | Max duration |
|---|---|---|
| Mothership (workspace) | ~5 credits ($0.024) | 3 minutes |
| Deployed chat (voice mode) | ~2 credits ($0.008) | 1 minute |
Each voice session is billed when it starts. In deployed chat voice mode, each conversation turn (speak → agent responds → speak again) is a separate session. Multi-turn conversations are billed per turn.
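Since deployed-chat voice mode bills each turn as its own session, a multi-turn conversation's cost is roughly linear in the number of turns. A rough estimate (the ~2 credits per session from the table above is approximate):

```python
# Approximate per-session cost in deployed chat voice mode (see table above)
CREDITS_PER_DEPLOYED_TURN = 2

def voice_conversation_credits(turns):
    """Rough cost of a multi-turn deployed-chat voice conversation:
    each turn starts a new billed session."""
    return turns * CREDITS_PER_DEPLOYED_TURN

# A 5-turn conversation costs roughly 10 credits
cost = voice_conversation_credits(5)
```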
Voice input requires `ELEVENLABS_API_KEY` to be configured. When the key is not set, voice input controls are hidden.
Plans
Recall has two paid plan tiers — Pro and Max. Either can be used individually or with a team. Team plans pool credits across all seats in the organization.
| Plan | Price | Credits Included | Daily Refresh |
|---|---|---|---|
| Community | $0 | 1,000 (one-time) | — |
| Pro | $25/mo | 6,000/mo | +50/day |
| Max | $100/mo | 25,000/mo | +200/day |
| Enterprise | Custom | Custom | — |
To use Pro or Max with a team, select Get For Team in subscription settings and choose the tier and number of seats. Credits are pooled across the organization at the per-seat rate (e.g. Max for Teams with 3 seats = 75,000 credits/mo pooled).
Daily Refresh Credits
Paid plans include a small daily credit allowance that does not count toward your plan limit. Each day, usage up to the daily refresh amount is excluded from billable usage. This allowance resets every 24 hours and does not carry over — use it or lose it.
| Plan | Daily Refresh |
|---|---|
| Pro | 50 credits/day ($0.25) |
| Max | 200 credits/day ($1.00) |
For team plans, the daily refresh scales with seats (e.g. Max for Teams with 3 seats = 600 credits/day).
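Under the rule described above (each day, usage up to the refresh amount is excluded from billable usage, with no carry-over), the effect on billable credits can be sketched as:

```python
def billable_credits(daily_usage, daily_refresh):
    """Credits that count toward the plan limit each day: only usage
    beyond the daily refresh allowance. Unused allowance is lost."""
    return [max(0, used - daily_refresh) for used in daily_usage]

# Pro plan (50 credits/day refresh) over three days of usage
billable = billable_credits([30, 80, 50], daily_refresh=50)
# Day 1: 0 billable (under allowance), Day 2: 30, Day 3: 0
```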
Annual Billing
All paid plans are available with annual billing at a 15% discount. Switch between monthly and annual billing in Settings → Subscription.
| Plan | Monthly | Annual (per month) | Annual Total |
|---|---|---|---|
| Pro | $25/mo | $21.25/mo | $255/yr |
| Max | $100/mo | $85/mo | $1,020/yr |
Team plans follow the same pricing per seat.
On-Demand Billing
By default, your usage is capped at the credits included in your plan. To allow usage beyond your plan's included amount, you can either enable on-demand billing or manually edit your usage limit to any value above your plan's minimum.
- Enable On-Demand: Removes the usage cap entirely. You pay for any overage at the end of the billing period.
- Edit Usage Limit: Set a specific cap above your plan's included amount to control how much overage you're willing to allow.
- Disable On-Demand: Resets your usage limit back to the plan's included amount (only available if your current usage hasn't already exceeded it).
On-demand billing is managed by workspace admins for team plans. Non-admin team members cannot toggle on-demand billing.
Plan Limits
Rate Limits
| Plan | Sync (req/min) | Async (req/min) |
|---|---|---|
| Free | 50 | 200 |
| Pro | 150 | 1,000 |
| Max | 300 | 2,500 |
| Enterprise | 600 | 5,000 |
Individual Max and team plans (Pro or Max for Teams) both use the Max-tier rate limits.
Concurrent Execution Limits
| Plan | Concurrent Executions |
|---|---|
| Free | 5 |
| Pro | 50 |
| Max / Team | 200 |
| Enterprise | 200 (customizable) |
Concurrent execution limits control how many workflow executions can run simultaneously within a workspace. When the limit is reached, new executions are queued and admitted as running executions complete. Manual runs from the editor are not subject to these limits.
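The queue-and-admit behavior can be approximated client-side with a semaphore. This is only an illustration of the admission pattern, not part of Recall's API:

```python
import threading
import time

CONCURRENCY_LIMIT = 5  # Free-plan limit from the table above

slots = threading.Semaphore(CONCURRENCY_LIMIT)
active = 0   # executions currently running
peak = 0     # highest concurrency observed
lock = threading.Lock()

def run_execution():
    global active, peak
    with slots:  # blocks (queues) when the limit is reached
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stand-in for the workflow execution itself
        with lock:
            active -= 1

# Submit 20 executions; at most CONCURRENCY_LIMIT run at once
threads = [threading.Thread(target=run_execution) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```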
File Storage
| Plan | Storage |
|---|---|
| Free | 5 GB |
| Pro | 50 GB |
| Max | 500 GB |
| Enterprise | 500 GB (customizable) |
Team plans (Pro or Max for Teams) use 500 GB.
Execution Time Limits
| Plan | Sync | Async |
|---|---|---|
| Free | 5 minutes | 90 minutes |
| Pro / Max / Team / Enterprise | 50 minutes | 90 minutes |
Sync executions run immediately and return results directly. These are triggered via the API with `async: false` (the default) or through the UI.
Async executions (triggered via the API with `async: true`, webhooks, or schedules) run in the background.
If a workflow exceeds its time limit, it will be terminated and marked as failed with a timeout error. Design long-running workflows to use async execution or break them into smaller workflows.
Billing Model
Recall uses a base subscription + overage billing model:
How It Works
Pro Plan ($25/month, 6,000 credits):
- Monthly subscription includes 6,000 credits of usage
- Usage under 6,000 credits → No additional charges
- Usage over 6,000 credits (with on-demand enabled) → Pay the overage at month end
- Example: 7,000 credits used = $25 (subscription) + $5 (overage for 1,000 extra credits at $0.005/credit)
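The Pro example works out as follows (a sketch using the plan numbers above; the function name is illustrative):

```python
def monthly_bill_usd(credits_used, included_credits=6_000,
                     subscription_usd=25.00, credit_price_usd=0.005):
    """Pro-plan monthly bill with on-demand enabled:
    subscription plus any overage at $0.005/credit."""
    overage_credits = max(0, credits_used - included_credits)
    return subscription_usd + overage_credits * credit_price_usd

# 7,000 credits used → $25 subscription + $5 overage = $30
bill = monthly_bill_usd(7_000)
```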
Team Plans:
- Usage is pooled across all team members in the organization
- Overage is calculated from total team usage against the pooled limit
- Organization owner receives one bill
Enterprise Plans:
- Fixed monthly price, no overages
- Custom usage limits per agreement
Threshold Billing
When on-demand is enabled and unbilled overage reaches $50, Recall automatically bills the full unbilled amount.
Example:
- Day 10: $70 overage → Bill $70 immediately
- Day 15: Additional $35 usage ($105 total, $35 unbilled) → Below the $50 threshold, no charge
- Day 20: Another $50 usage ($155 total, $85 unbilled) → Bill $85 immediately
This spreads large overage charges throughout the month instead of one large bill at period end.
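The threshold rule can be simulated to reproduce the example above (function and variable names are illustrative):

```python
THRESHOLD_USD = 50.0

def threshold_bills(daily_overage_usd):
    """Simulate threshold billing: whenever unbilled overage reaches
    $50, the full unbilled amount is charged. Returns the charge made
    after each day's usage (0.0 when nothing is billed)."""
    billed_total = 0.0
    running_total = 0.0
    charges = []
    for overage in daily_overage_usd:
        running_total += overage
        unbilled = running_total - billed_total
        if unbilled >= THRESHOLD_USD:
            charges.append(unbilled)
            billed_total = running_total
        else:
            charges.append(0.0)
    return charges

# The example above: $70, then $35, then $50 of overage
charges = threshold_bills([70.0, 35.0, 50.0])
# → charges of [70.0, 0.0, 85.0]
```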
Usage Monitoring
Monitor your usage and billing in Settings → Subscription:
- Current Usage: Real-time credit usage for the current billing period
- Usage Limits: Plan limits with a visual progress bar
- On-Demand Billing: Toggle on-demand billing to allow usage beyond your plan's included credits
- Plan Management: Upgrade, downgrade, or switch between monthly and annual billing
Programmatic Usage Tracking
You can query your current usage and limits programmatically using the API:
Endpoint:
Endpoint:

```
GET /api/users/me/usage-limits
```

Authentication:

- Include your API key in the `X-API-Key` header

Example Request:

```shell
curl -X GET \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  https://tryrecall.com/api/users/me/usage-limits
```

Example Response:
```json
{
  "success": true,
  "rateLimit": {
    "sync": {
      "isLimited": false,
      "requestsPerMinute": 150,
      "maxBurst": 300,
      "remaining": 300,
      "resetAt": "2025-09-08T22:51:55.999Z"
    },
    "async": {
      "isLimited": false,
      "requestsPerMinute": 1000,
      "maxBurst": 2000,
      "remaining": 2000,
      "resetAt": "2025-09-08T22:51:56.155Z"
    },
    "authType": "api"
  },
  "usage": {
    "currentPeriodCost": 12.34,
    "limit": 100,
    "plan": "pro_6000"
  }
}
```

Rate Limit Fields:

- `requestsPerMinute`: Sustained rate limit (tokens refill at this rate)
- `maxBurst`: Maximum tokens you can accumulate (burst capacity)
- `remaining`: Current tokens available (can be up to `maxBurst`)
Response Fields:

- `currentPeriodCost`: Usage in the current billing period (in dollars)
- `limit`: Derived from individual limits (Free/Pro/Max) or pooled organization limits (Team/Enterprise)
- `plan`: The highest-priority active plan associated with your user
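A minimal sketch of consuming this response, for example to back off before hitting limits. The JSON mirrors the example above; the pause policy and thresholds are illustrative, not part of Recall's API:

```python
import json

# Example response body (copied from the documented example above)
response_body = '''
{
  "success": true,
  "rateLimit": {
    "sync": {"isLimited": false, "requestsPerMinute": 150,
             "maxBurst": 300, "remaining": 300,
             "resetAt": "2025-09-08T22:51:55.999Z"},
    "async": {"isLimited": false, "requestsPerMinute": 1000,
              "maxBurst": 2000, "remaining": 2000,
              "resetAt": "2025-09-08T22:51:56.155Z"},
    "authType": "api"
  },
  "usage": {"currentPeriodCost": 12.34, "limit": 100, "plan": "pro_6000"}
}
'''

data = json.loads(response_body)
sync_remaining = data["rateLimit"]["sync"]["remaining"]
usage_fraction = data["usage"]["currentPeriodCost"] / data["usage"]["limit"]

# Illustrative policy: pause dispatching when burst tokens run low
# or usage nears the billing limit
should_pause = sync_remaining < 10 or usage_fraction > 0.9
```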
Cost Optimization Strategies
- Model Selection: Choose models based on task complexity. Simple tasks can use GPT-4.1-nano while complex reasoning might need o1 or Claude Opus.
- Prompt Engineering: Well-structured, concise prompts reduce token usage without sacrificing quality.
- Local Models: Use Ollama or VLLM for non-critical tasks to eliminate API costs entirely.
- Caching and Reuse: Store frequently used results in variables or files to avoid repeated AI model calls.
- Batch Processing: Process multiple items in a single AI request rather than making individual calls.
Next Steps
- Review your current usage in Settings → Subscription
- Learn about Logging to track execution details
- Explore the External API for programmatic cost monitoring
- Check out workflow optimization techniques to reduce costs