Choose the plan that's right for you
Developer
Powerful speed and reliability to start your project
Business
A plan that scales with your production usage
Enterprise
Personalized configurations for serving at scale
Base model parameter count | $/1M tokens |
---|---|
0B - 16B | $0.20 |
16.1B - 80B | $0.90 |
MoE 0B - 56B (Mixtral 8x7B) | $0.50 |
MoE 56.1B - 176B (DBRX, Mixtral 8x22B) | $1.20 |
Per-token pricing is applied only for serverless inference. See below for dedicated deployment pricing.
LoRA models deployed to our serverless inference service are charged at the same rate as the underlying base model. There is no additional cost for serving LoRA models.
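As a rough illustration of how the per-token table above translates into a bill, the sketch below looks up a rate by parameter count and multiplies it by token volume. The function names and tier lists are hypothetical and not part of any Fireworks API. The sketch assumes the listed rate applies to every token billed for a request, and it treats any size above one tier's upper bound as belonging to the next tier.

```python
# Illustrative cost estimator for serverless text models.
# Rates are copied from the pricing table; all names here are hypothetical.

# (upper bound on parameter count in billions, USD per 1M tokens)
DENSE_TIERS = [(16.0, 0.20), (80.0, 0.90)]
MOE_TIERS = [(56.0, 0.50), (176.0, 1.20)]


def rate_per_million(params_b: float, moe: bool = False) -> float:
    """Return the listed USD rate per 1M tokens for a base model size."""
    tiers = MOE_TIERS if moe else DENSE_TIERS
    for max_params_b, rate in tiers:
        if params_b <= max_params_b:
            return rate
    raise ValueError("no serverless rate is listed for this model size")


def serverless_cost(tokens: int, params_b: float, moe: bool = False) -> float:
    """Estimate USD cost. A LoRA model is priced by its base model's size."""
    return tokens / 1_000_000 * rate_per_million(params_b, moe)
```

For example, `serverless_cost(5_000_000, params_b=7)` works out to about $1.00 under these assumptions.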
SDXL, $/step | SDXL w/ ControlNet, $/step |
---|---|
$0.0002 | $0.0003 |
For image generation models like SDXL we charge based on the number of inference steps (denoising iterations).
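A minimal sketch of the per-step arithmetic, using the two SDXL rates from the table above. The function name and the `num_images` parameter are hypothetical, and the sketch assumes each generated image runs its own full set of inference steps.

```python
# Per-step rates in USD, copied from the SDXL table.
SDXL_RATE_PER_STEP = 0.0002
SDXL_CONTROLNET_RATE_PER_STEP = 0.0003


def image_generation_cost(steps: int, num_images: int = 1,
                          controlnet: bool = False) -> float:
    """Estimate USD cost as steps * images * per-step rate."""
    rate = SDXL_CONTROLNET_RATE_PER_STEP if controlnet else SDXL_RATE_PER_STEP
    return steps * num_images * rate
```

Under these assumptions, one 30-step SDXL image costs roughly $0.006, and the same image with ControlNet costs roughly $0.009.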
For multi-modal models like LLaVA, each image is billed as 576 prompt tokens.
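The image-to-token rule can be sketched as a simple token count. The function name is hypothetical, and the resulting count would still need to be multiplied by the per-token rate of the specific model, which is not stated here.

```python
# Each image counts as a fixed number of prompt tokens (from the text above).
IMAGE_PROMPT_TOKENS = 576


def billable_prompt_tokens(text_tokens: int, num_images: int) -> int:
    """Return text prompt tokens plus 576 tokens per attached image."""
    return text_tokens + num_images * IMAGE_PROMPT_TOKENS
```

A prompt with 100 text tokens and 2 images would therefore be billed as 1,252 prompt tokens.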
Base model parameter count | $/1M input tokens |
---|---|
up to 150M | $0.008 |
150M - 350M | $0.016 |
Embedding model pricing is based on the number of input tokens processed by the model.
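The embedding table can be read the same way as the text-model table. The sketch below is illustrative only: the names are hypothetical, and it assumes a model of exactly 150M parameters falls in the lower tier.

```python
# (upper bound on parameter count in millions, USD per 1M input tokens)
EMBEDDING_TIERS = [(150, 0.008), (350, 0.016)]


def embedding_cost(input_tokens: int, params_m: float) -> float:
    """Estimate USD cost for embedding a given number of input tokens."""
    for max_params_m, rate in EMBEDDING_TIERS:
        if params_m <= max_params_m:
            return input_tokens / 1_000_000 * rate
    raise ValueError("no embedding rate is listed for this model size")
```

For instance, embedding 10M input tokens with a 110M-parameter model would come to about $0.08.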
Model | $ / 1M tokens in training |
---|---|
Models up to 16B parameters | $0.50 |
Models 16.1B - 80B | $3.00 |
Mixtral 8x7B | $2.00 |
Fireworks charges based on the total number of tokens processed during fine-tuning (dataset size in tokens * number of epochs).
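The fine-tuning formula (dataset size * number of epochs, priced per 1M tokens) can be sketched as follows. The dictionary keys and function name are hypothetical labels for the three rows of the table above.

```python
# USD per 1M training tokens, copied from the fine-tuning table.
FINE_TUNING_RATES = {
    "up_to_16b": 0.50,
    "16.1b_to_80b": 3.00,
    "mixtral_8x7b": 2.00,
}


def fine_tuning_cost(dataset_tokens: int, epochs: int, tier: str) -> float:
    """Estimate USD cost: tokens trained = dataset tokens * epochs."""
    trained_tokens = dataset_tokens * epochs
    return trained_tokens / 1_000_000 * FINE_TUNING_RATES[tier]
```

A 10M-token dataset trained for 3 epochs on a model of up to 16B parameters would cost about $15.00 under this formula.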
Usage limits
Usage limits cap how much you can spend on the Fireworks platform per calendar month. Your usage limit is determined by your total historical Fireworks spend, including both credits and past invoices. You can purchase prepaid credits at https://fireworks.ai/billing to increase your historical spend for the current calendar month.
Note: The usage limit is applied before credits, so you can hit the usage limit before all of your current credits are depleted. If this is an issue, contact Fireworks to increase your usage limit.
Tier | Usage Limit | Qualification |
---|---|---|
Tier 1 | $50 / month | Default (payment method required to use more than available credits) |
Tier 2 | $500 / month | Total historical spend of $100+ |
Tier 3 | $5,000 / month | Total historical spend of $1,000+ |
Tier 4 | $50,000 / month | Total historical spend of $10,000+ |
Custom | Contact us at inquiries@fireworks.ai |
If you do not have a payment method on file, your account will be suspended after your credits are depleted. Failure to pay a past invoice may also result in account suspension. Your usage limit will be set to $0/month in both cases.
API requests will be rejected when your account's usage limit is exceeded. Visit https://fireworks.ai/billing to add your payment method and monitor your usage and invoices.
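The tier rules above amount to a threshold lookup on total historical spend. The following sketch is an illustration only: the function and parameter names are hypothetical, the Custom tier is not modeled, and `suspended` stands for either of the two suspension cases described above.

```python
# (minimum total historical spend in USD, monthly usage limit in USD),
# ordered from the highest tier to the lowest.
TIERS = [(10_000, 50_000), (1_000, 5_000), (100, 500), (0, 50)]


def monthly_usage_limit(historical_spend: float, suspended: bool = False) -> int:
    """Return the monthly usage limit in USD for an account."""
    if suspended:
        # No payment method after credits run out, or an unpaid invoice.
        return 0
    if historical_spend < 0:
        raise ValueError("historical spend cannot be negative")
    for min_spend, limit in TIERS:
        if historical_spend >= min_spend:
            return limit
    raise AssertionError("unreachable: the lowest tier starts at 0")
```

For example, an account with $250 of historical spend would sit in Tier 2 with a $500 monthly limit.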
Dedicated deployments are billed per GPU-second. We charge $3.89 per hour for one NVIDIA A100 80 GB GPU. Pricing scales linearly when using multiple A100 GPUs. NVIDIA H100 80 GB GPUs are available to Enterprise accounts; contact us for details.
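Because billing is by GPU-second at an hourly list price, the cost of a deployment can be estimated by converting the hourly rate. This is a sketch with hypothetical names; it covers only the A100 rate quoted above, since no H100 price is given.

```python
# Hourly list price in USD for one NVIDIA A100 80 GB GPU.
A100_USD_PER_HOUR = 3.89
SECONDS_PER_HOUR = 3600


def dedicated_cost(seconds: float, num_gpus: int = 1) -> float:
    """Estimate USD cost: GPU-seconds * (hourly rate / 3600)."""
    gpu_seconds = seconds * num_gpus
    return gpu_seconds * A100_USD_PER_HOUR / SECONDS_PER_HOUR
```

Running 2 A100 GPUs for 30 minutes uses 3,600 GPU-seconds, which costs the same as one GPU for an hour: about $3.89.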