With So Many AI Models Out There, How Do You Choose? (Part 1)

Written by
Silas Grey
Updated on: June 15, 2025
Recommendation

Is it difficult to choose a large model? The OpenRouter rankings can help you.

Core content:
1. An introduction to OpenRouter, the large-model aggregation platform, and its rankings
2. How the rankings are based on each model's total prompt and completion tokens
3. An analysis of each model's usage share across 13 usage scenarios

 
Yang Fangxian
Founder of 53A / Tencent Cloud Most Valuable Expert (TVP)

This article sets out to answer one question: with so many large models on the market, how should we choose one for our own work scenarios?

 

Here are the OpenRouter rankings:

https://openrouter.ai/rankings

 

AI enthusiasts will be familiar with OpenRouter, an AI large-model aggregation platform that brings together top models from manufacturers such as OpenAI, Anthropic, Google, and DeepSeek.
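One reason the platform aggregates so much traffic is that it exposes an OpenAI-compatible chat-completions endpoint, so switching between models is largely a matter of changing the `model` field. A minimal sketch of building such a request body (the endpoint URL is OpenRouter's public one; the helper name and prompt are illustrative):

```python
import json

# OpenRouter's OpenAI-compatible chat endpoint (per its public docs).
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str) -> str:
    """Build the JSON body for an OpenAI-style chat completion call."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })

# The same body shape works for any model aggregated on the platform.
print(build_request("openai/gpt-4o-mini", "Say hello in French."))
```

Sending the request additionally needs an `Authorization: Bearer <your-key>` header, omitted here.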

 

Its ranking is based on the sum of prompt and completion tokens for each model, normalized using the GPT-4 tokenizer for comparison.

 

What does that mean?

Prompt and completion tokens are the number of tokens in the prompt the user sends to the model plus the number of tokens the model generates in its response.

Token counts are calculated uniformly with the GPT-4 tokenizer, which provides a single standard for comparing models.
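As a rough sketch of the metric, here a toy whitespace tokenizer stands in for the GPT-4 tokenizer (in practice OpenRouter normalizes with the real tokenizer, available through OpenAI's tiktoken library as the `cl100k_base` encoding):

```python
def count_tokens(text: str) -> int:
    # Toy stand-in: split on whitespace. The real GPT-4 tokenizer
    # splits text far more finely, into subword pieces.
    return len(text.split())

def total_usage(prompt: str, completion: str) -> int:
    # The ranking metric: prompt tokens + completion tokens.
    return count_tokens(prompt) + count_tokens(completion)

print(total_usage("Explain recursion briefly.",
                  "Recursion is when a function calls itself."))  # 3 + 7 = 10
```

Normalizing every model's traffic with one tokenizer matters because different vendors' tokenizers would otherwise count the same text differently.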

 

The picture below shows the GPT-4 tokenizer: enter any text and the corresponding token count is displayed.

Website: https://platform.openai.com/tokenizer

 

The OpenRouter ranking statistics are updated every 10 minutes.

This frequent refresh helps keep the comparison fair and current.

 

Let's take a look at the ranking, which is divided into 13 usage scenarios.

The bar chart clearly shows the usage share of the different models.

Note that this is not a ranking of large-model performance, but a ranking of large-model token usage.

It shows which large models people prefer in each field, as well as how that usage changes over time.

 

For example, in the field of programming, the most commonly used model is not Claude-3.7-Sonnet, nor the newly crowned Gemini-2.5-Pro, but GPT-4o-mini. In actual usage, the most powerful model is not necessarily the one people reach for: price, ease of use, and fit for the scenario all matter.
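That trade-off can be sketched as a small selection helper. The prices (USD per 1M tokens) and capability scores below are purely hypothetical placeholders for illustration; check each provider's pricing page for real numbers:

```python
# Hypothetical prices and rough capability scores, for illustration only.
MODELS = {
    "gpt-4o-mini":       {"price": 0.15, "capability": 7},
    "claude-3.7-sonnet": {"price": 3.00, "capability": 9},
    "gemini-2.5-pro":    {"price": 1.25, "capability": 9},
}

def pick_model(max_price: float, min_capability: int):
    """Cheapest model under a price cap that meets a capability floor."""
    candidates = [
        (name, info["price"]) for name, info in MODELS.items()
        if info["price"] <= max_price and info["capability"] >= min_capability
    ]
    return min(candidates, key=lambda kv: kv[1])[0] if candidates else None

print(pick_model(max_price=1.0, min_capability=7))
```

Under these toy numbers, a tight budget selects GPT-4o-mini even though stronger models exist, which mirrors the usage pattern the rankings show.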

 

In programming, usage of the mainstream large models is fairly evenly split, but in the fields of technology and science GPT-4o-mini dominates outright, accounting for 88.5% of usage.

If you are a scientific researcher, it is worth trying GPT-4o-mini first to see how it performs, and then comparing other large models.

 

In the frequently used translation scenario, Gemini-1.5-Flash-8B is the most used. Over the past two months, usage of Gemini-2.0-Flash has also grown, reaching about 40%.

 

With so many large models available today, choosing one suited to your own work scenario is more cost-effective than simply choosing the strongest performer.

 

The OpenRouter LLM Rankings are a good reference point when choosing a large model.