Intent Detection
Every request to GreatRouter starts with intent detection. The router reads your prompt and classifies the task: chat, image generation, transcription, embeddings, translation, code generation, and more. This classification happens in milliseconds and determines which category of models receives the request. For example, a photorealistic image prompt goes to image models, while a code review request goes to reasoning models with function-calling capabilities.
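As a rough illustration of the classify-then-route step, here is a toy sketch in Python. It assumes a simple keyword heuristic, which is almost certainly not how GreatRouter classifies prompts internally. The names INTENT_KEYWORDS, INTENT_TO_CATEGORY, detect_intent, and route_category are hypothetical and exist only for this example.

```python
# Toy sketch only: GreatRouter's real classifier is not described here.
# All names and keyword lists below are hypothetical.
INTENT_KEYWORDS = {
    "image_generation": ("photorealistic", "draw", "illustration"),
    "transcription": ("transcribe", "transcript"),
    "translation": ("translate",),
    "code_generation": ("code review", "refactor", "write a function"),
}

# Illustrative mapping from a detected intent to a model category.
INTENT_TO_CATEGORY = {
    "image_generation": "image",
    "transcription": "speech-to-text",
    "translation": "translation",
    "code_generation": "reasoning",
    "chat": "chat",
}


def detect_intent(prompt: str) -> str:
    """Return the first intent whose keywords appear in the prompt."""
    text = prompt.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    # Anything unrecognized is treated as plain chat in this sketch.
    return "chat"


def route_category(prompt: str) -> str:
    """Map a prompt to the model category that should handle it."""
    return INTENT_TO_CATEGORY[detect_intent(prompt)]
```

Calling route_category("Please do a code review of this diff") yields the "reasoning" category in this sketch, mirroring the code review example in the text.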
Model Selection
Once intent is classified, GreatRouter evaluates the available models in that category. It considers capability tags (vision, reasoning, web search, streaming), latency profiles, cost per token or per image, and current provider availability. The result is a recommendation for your specific request: not simply the most popular model, but the one that matches your quality expectations and budget.
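One way to picture this evaluation is a filter followed by a ranking. The sketch below assumes a weighted sum of cost and latency as the ranking score; the actual weighting GreatRouter uses is not stated, and ModelInfo, recommend, and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical catalog entry; field names are assumptions."""
    name: str
    tags: frozenset
    latency_ms: float
    cost_per_mtok: float  # dollars per million tokens (assumed unit)
    available: bool = True


def recommend(
    models: Iterable[ModelInfo],
    required_tags: Iterable[str],
    cost_weight: float = 1.0,
    latency_weight: float = 1.0,
) -> Optional[ModelInfo]:
    """Pick the available model with all required tags and the lowest score."""
    required = frozenset(required_tags)
    candidates = [m for m in models if m.available and required <= m.tags]
    if not candidates:
        return None

    def score(m: ModelInfo) -> float:
        # Lower is better: weighted cost plus weighted latency in seconds.
        return cost_weight * m.cost_per_mtok + latency_weight * m.latency_ms / 1000

    return min(candidates, key=score)


CATALOG = [
    ModelInfo("vision-large", frozenset({"vision", "reasoning", "streaming"}), 900, 10.0),
    ModelInfo("vision-small", frozenset({"vision", "streaming"}), 300, 2.0),
    ModelInfo("vision-mid", frozenset({"vision", "reasoning"}), 500, 4.0, available=False),
    ModelInfo("text-only", frozenset({"streaming"}), 100, 0.5),
]
```

With this catalog, a request that needs only vision resolves to vision-small, while one that needs both vision and reasoning resolves to vision-large because vision-mid is marked unavailable.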
Cost Optimization
GreatRouter supports two cost optimization modes. In price-optimized mode, the router automatically selects the cheapest capable model for each request. With a budget_dollars cap, you set a maximum cost per request and the router keeps each request within that ceiling. If the preferred model would exceed the cap, the router automatically falls back to a more affordable alternative.
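The two modes can be sketched as a single selection function. This is a minimal illustration under stated assumptions: only budget_dollars comes from the text above, while choose_model, estimate_cost, the price table, and the token-based cost estimate are hypothetical. Returning None when nothing fits the cap is also an assumption, since the behavior in that case is not described.

```python
from typing import Dict, List, Optional


def estimate_cost(price_per_1k_tokens: float, tokens: int) -> float:
    """Estimated request cost in dollars (assumed per-token pricing)."""
    return price_per_1k_tokens * tokens / 1000


def choose_model(
    prices: Dict[str, float],
    preferred_order: List[str],
    tokens: int,
    price_optimized: bool = False,
    budget_dollars: Optional[float] = None,
) -> Optional[str]:
    """Select a model name; every listed model is assumed to be capable."""
    if price_optimized:
        # Price-optimized mode: cheapest capable model wins.
        return min(preferred_order, key=lambda name: prices[name])
    for name in preferred_order:
        cost = estimate_cost(prices[name], tokens)
        # Budget mode: take the first preferred model that fits the cap.
        if budget_dollars is None or cost <= budget_dollars:
            return name
    # Assumption: no model fits the cap, so nothing is selected.
    return None


PRICES = {"premium": 0.03, "standard": 0.01, "economy": 0.002}
ORDER = ["premium", "standard", "economy"]
```

For a 2,000-token request, the estimated cost of premium is $0.06, so a budget_dollars value of 0.05 falls back to standard in this sketch.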
Observability and Fallback
Every routed request returns metadata: the model used, latency, cost, and capability tags. You can inspect these in the dashboard or stream them to your observability stack. When a provider goes down, GreatRouter automatically fails over to the next best model, with no configuration required. The result is a single API that stays available even when an individual provider does not.
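Failover of this kind can be pictured as trying ranked models in order and attaching metadata to the first success. The sketch below is hypothetical throughout: ProviderUnavailable, call_with_failover, the send callback, and the metadata key names are assumptions for illustration, not GreatRouter's actual response schema.

```python
import time
from typing import Callable, Dict, List


class ProviderUnavailable(Exception):
    """Raised by the (hypothetical) send callback when a provider is down."""


def call_with_failover(
    ranked_models: List[Dict],
    prompt: str,
    send: Callable[[str, str], str],
) -> Dict:
    """Try each model in ranked order; return output plus request metadata."""
    failures = []
    for model in ranked_models:
        start = time.perf_counter()
        try:
            output = send(model["name"], prompt)
        except ProviderUnavailable as exc:
            # Record the failure and move on to the next best model.
            failures.append((model["name"], str(exc)))
            continue
        latency_ms = (time.perf_counter() - start) * 1000
        return {
            "output": output,
            "metadata": {
                "model": model["name"],
                "latency_ms": latency_ms,
                "cost_dollars": model["cost_dollars"],
                "capability_tags": model["tags"],
            },
        }
    raise ProviderUnavailable(f"all providers failed: {failures}")
```

The metadata dictionary carries the four fields named in the text (model, latency, cost, capability tags), so a caller could log it or forward it to an observability pipeline.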