Glostarep

Vercel AI Gateway Provider Sorting: Route by Cost, Speed or Throughput

Choosing the right AI provider for each request has always required trade-offs. Cost, latency, and throughput each pull in different directions. Vercel now gives developers explicit control over that decision with a new provider sorting feature for AI Gateway. It lets teams rank providers by cost, time to first token, or throughput, and the ranking is applied automatically at request time.

Previously, AI Gateway blended provider reliability, output quality, cost, and response speed into a default ordering. That default still exists. However, developers can now override it with a single sort parameter on providerOptions.gateway. This update is especially useful for models with many available providers and noticeable variation in price or performance.
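As a minimal sketch of what that override could look like, the snippet below builds the providerOptions fragment described above. The helper name gatewaySortOptions is hypothetical and not part of any SDK, the model id in the comment is a placeholder, and the commented-out call assumes the AI SDK's generateText function; check Vercel's documentation for the exact request shape.

```typescript
// The three sort values named in the announcement.
type GatewaySort = 'cost' | 'ttft' | 'tps';

// Hypothetical helper (not part of any SDK): builds the providerOptions
// fragment that carries the sort override.
function gatewaySortOptions(sort: GatewaySort): { gateway: { sort: GatewaySort } } {
  return { gateway: { sort } };
}

// Assumed usage with the AI SDK (not executed here; model id is a placeholder):
//   const result = await generateText({
//     model: '<creator>/<model>',
//     prompt: 'Summarize this support ticket.',
//     providerOptions: gatewaySortOptions('cost'),
//   });

const exampleOptions = gatewaySortOptions('cost');
console.log(JSON.stringify(exampleOptions)); // {"gateway":{"sort":"cost"}}
```

Keeping the sort value behind a union type means a typo such as 'speed' fails at compile time rather than at request time.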

Three sort values are available. First, sort: 'cost' routes requests to the lowest-priced provider. It compares input price per million tokens and puts the cheapest option first. This works best for high-volume, cost-sensitive workloads. Second, sort: 'ttft' ranks providers by median time to first token, fastest first. Teams building latency-sensitive applications will find this most useful. Third, sort: 'tps' ranks providers by median throughput in tokens per second, highest first. This suits long-output generation, where total response time matters most.
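To make the three orderings concrete, here is a small local simulation. It is not Vercel's implementation: the ProviderStats type, its field names, and the sample numbers are all invented for illustration, and only the sort directions come from the description above.

```typescript
type SortKey = 'cost' | 'ttft' | 'tps';

// Hypothetical per-provider metrics; field names are illustrative only.
interface ProviderStats {
  name: string;
  inputPricePerMTok: number; // USD per million input tokens
  medianTtftMs: number;      // median time to first token, in milliseconds
  medianTps: number;         // median tokens per second
}

// Returns a new array ordered the way each sort value is described:
// cheapest first, fastest first token first, or highest throughput first.
function rankProviders(providers: ProviderStats[], sort: SortKey): ProviderStats[] {
  const compare: Record<SortKey, (a: ProviderStats, b: ProviderStats) => number> = {
    cost: (a, b) => a.inputPricePerMTok - b.inputPricePerMTok,
    ttft: (a, b) => a.medianTtftMs - b.medianTtftMs,
    tps: (a, b) => b.medianTps - a.medianTps,
  };
  // slice() copies the array so the caller's data is left untouched.
  return providers.slice().sort(compare[sort]);
}

// Made-up numbers, for illustration only.
const sampleProviders: ProviderStats[] = [
  { name: 'provider-a', inputPricePerMTok: 3.0, medianTtftMs: 250, medianTps: 90 },
  { name: 'provider-b', inputPricePerMTok: 2.5, medianTtftMs: 400, medianTps: 140 },
  { name: 'provider-c', inputPricePerMTok: 3.5, medianTtftMs: 180, medianTps: 60 },
];

const rankedNames = (sort: SortKey): string =>
  rankProviders(sampleProviders, sort).map((p) => p.name).join(', ');

console.log(rankedNames('cost')); // provider-b, provider-a, provider-c
console.log(rankedNames('ttft')); // provider-c, provider-a, provider-b
console.log(rankedNames('tps'));  // provider-b, provider-a, provider-c
```

In this sample the cheapest provider is also the slowest to produce a first token, which is exactly the kind of trade-off the sort value lets a team resolve per request.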

A key strength of Vercel AI Gateway provider sorting is that ranking happens at request time. Newly added providers, price changes, and shifts in observed latency all flow through automatically. Developers make no code changes to benefit from updated rankings.

The feature also composes cleanly with existing routing controls. Teams can combine sort with Zero Data Retention (ZDR) filtering. In that setup, AI Gateway first filters to only ZDR-compliant providers, then sorts the remaining ones by the chosen metric. In addition, sort works alongside the existing order parameter. Providers listed in order move to the front, while the rest follow the requested sort criterion.
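The composition described above (filter, then pin, then sort) can be sketched as a local simulation. Everything here is an assumption made for illustration: the Candidate type, the zdr flag, the zdrOnly option name, and the choice to drop a pinned provider that fails the ZDR filter are not taken from Vercel's API.

```typescript
type Metric = 'cost' | 'ttft' | 'tps';

// Hypothetical provider record; field names are illustrative only.
interface Candidate {
  name: string;
  zdr: boolean;              // assumed flag: provider is ZDR-compliant
  inputPricePerMTok: number; // USD per million input tokens
  medianTtftMs: number;      // median time to first token, in milliseconds
  medianTps: number;         // median tokens per second
}

interface RoutingOptions {
  sort: Metric;
  order?: string[];  // providers to move to the front, in this order
  zdrOnly?: boolean; // hypothetical name for the ZDR filter switch
}

function metricComparator(sort: Metric): (a: Candidate, b: Candidate) => number {
  if (sort === 'cost') return (a, b) => a.inputPricePerMTok - b.inputPricePerMTok;
  if (sort === 'ttft') return (a, b) => a.medianTtftMs - b.medianTtftMs;
  return (a, b) => b.medianTps - a.medianTps; // tps: highest first
}

function planRoute(candidates: Candidate[], opts: RoutingOptions): string[] {
  // 1. Filter: keep only ZDR-compliant providers when requested.
  const eligible = opts.zdrOnly ? candidates.filter((c) => c.zdr) : candidates.slice();

  // 2. Pin: providers listed in `order` go first, in the listed order.
  //    Assumption: a pinned provider removed by the filter is skipped.
  const pinnedNames = opts.order ?? [];
  const pinned: Candidate[] = [];
  for (const name of pinnedNames) {
    const match = eligible.filter((c) => c.name === name)[0];
    if (match !== undefined) pinned.push(match);
  }

  // 3. Sort: the remaining providers follow the requested metric.
  const rest = eligible
    .filter((c) => pinnedNames.indexOf(c.name) === -1)
    .sort(metricComparator(opts.sort));

  return pinned.concat(rest).map((c) => c.name);
}

// Made-up numbers, for illustration only.
const candidates: Candidate[] = [
  { name: 'provider-a', zdr: true, inputPricePerMTok: 3.0, medianTtftMs: 250, medianTps: 90 },
  { name: 'provider-b', zdr: false, inputPricePerMTok: 2.5, medianTtftMs: 400, medianTps: 140 },
  { name: 'provider-c', zdr: true, inputPricePerMTok: 3.5, medianTtftMs: 180, medianTps: 60 },
  { name: 'provider-d', zdr: true, inputPricePerMTok: 2.0, medianTtftMs: 300, medianTps: 110 },
];

const route = planRoute(candidates, { sort: 'cost', zdrOnly: true, order: ['provider-c'] });
console.log(route.join(', ')); // provider-c, provider-d, provider-a
```

Here provider-b is removed by the ZDR filter despite its low price, provider-c is pinned to the front, and the remaining two are ordered by cost.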

Full routing visibility comes built in. Every response includes a sort block in the routing metadata. It shows which providers were considered, the metric values used to rank them, the attempt order, and any providers skipped due to degraded health. The result is transparent, fully auditable routing.
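As a rough illustration of how a team might log that information, the sketch below summarizes a sort block into readable lines. The SortAudit shape and all of its field names are hypothetical, since the announcement does not give the actual schema; consult Vercel's documentation for the real field names before relying on this.

```typescript
// Hypothetical shape for the sort block; the real field names may differ.
interface SortAudit {
  metric: 'cost' | 'ttft' | 'tps';
  considered: { provider: string; value: number }[]; // metric value used for ranking
  attemptOrder: string[];                            // providers in the order tried
  skipped: { provider: string; reason: string }[];   // e.g. degraded health
}

// Turns the (assumed) sort block into log lines for auditing.
function summarizeSortAudit(audit: SortAudit): string[] {
  const lines = [
    'sorted by: ' + audit.metric,
    'considered: ' + audit.considered.map((c) => c.provider + '=' + c.value).join(', '),
    'attempt order: ' + audit.attemptOrder.join(' -> '),
  ];
  if (audit.skipped.length > 0) {
    lines.push('skipped: ' + audit.skipped.map((s) => s.provider + ' (' + s.reason + ')').join(', '));
  }
  return lines;
}

// Made-up example data.
const exampleAudit: SortAudit = {
  metric: 'ttft',
  considered: [
    { provider: 'provider-c', value: 180 },
    { provider: 'provider-a', value: 250 },
    { provider: 'provider-b', value: 400 },
  ],
  attemptOrder: ['provider-c', 'provider-b'],
  skipped: [{ provider: 'provider-a', reason: 'degraded health' }],
};

for (const line of summarizeSortAudit(exampleAudit)) {
  console.log(line);
}
```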

For full implementation details, see the provider filtering and ordering page in Vercel’s AI Gateway documentation.
