Token limits in ChatGPT models
Token limits vary by model. As of February 2024, the limits for the GPT-4 family range from 8,192 to 128,000 tokens. The limit applies to the sum of prompt and completion tokens in a single API call: for the GPT-4-32K model, that sum cannot exceed 32,768 tokens, so if the prompt is 30,000 tokens, the response cannot be more than 2,768 tokens. GPT-4-Turbo, the most recent model as of February 2024, supports 128,000 tokens, which is close to 300 pages of text in a single prompt and completion. This is a massive context window compared to its predecessor models.
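The budget arithmetic above can be sketched in a few lines. This is a simple illustration, not an API call; the figures are the ones from the example:

```python
# For a model with a 32,768-token context window (e.g., GPT-4-32K),
# the completion budget is whatever the prompt leaves over.
context_window = 32_768
prompt_tokens = 30_000

max_completion_tokens = context_window - prompt_tokens
print(max_completion_tokens)  # 2768
```

In practice, you would pass a value no larger than this remainder as the completion limit when making the API call.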
Though this can be a technical limitation, there are creative ways to work around it, such as chunking and condensing your prompts. The chunking strategies discussed in Chapter 4 can help you stay within token limits.
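A minimal sketch of the chunking idea follows. It uses whitespace-separated words as a rough stand-in for tokens; in practice, you would count tokens with a tokenizer such as OpenAI's tiktoken library. The function name and parameters are illustrative, not from any specific API:

```python
def chunk_text(text, max_tokens=3000, overlap=200):
    """Split text into pieces that fit within a model's token limit.

    Words are used as a rough proxy for tokens here; a real
    implementation would count tokens with a tokenizer (e.g., tiktoken).
    The overlap carries some context across chunk boundaries.
    """
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = min(start + max_tokens, len(words))
        chunks.append(" ".join(words[start:end]))
        if end == len(words):
            break
        start = end - overlap  # step back so adjacent chunks share context
    return chunks
```

Each chunk can then be sent in a separate API call, with the responses combined or summarized afterward.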
The following figure shows various models and token limits:
| Model | Token Limit |
| --- | --- |
| GPT-3.5-turbo | 4,096 |
| GPT-3.5-turbo-16k | 16,384 |
| GPT-3.5-turbo-0613 | 4,096 |
| GPT-3.5-turbo-16k-0613 | 16,384 |
| GPT-4 | 8,192 |
| GPT-4-0613 | 8,192 |
| GPT-4-32K | 32,768 |
| GPT-4-32K-0613 | 32,768 |
| GPT-4-Turbo | 128,000 |
Figure 5.4 – Models and associated Token Limits
For the latest updates on model limits for newer versions of models, please check the OpenAI website.
Tokens and cost considerations
The cost of using ChatGPT or similar models via an API is often tied to the number of tokens processed, encompassing both the input prompts and the model’s generated responses.
In terms of pricing, providers typically have a per-token charge, leading to a direct correlation between conversation length and cost; the more tokens processed, the higher the cost. The latest cost updates can be found on the OpenAI website.
From an optimization perspective, understanding this cost-token relationship can guide more efficient API usage. For instance, creating more succinct prompts and configuring the model for brief yet effective responses can help control token count and, consequently, manage expenses.
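The cost-token relationship can be sketched with a small helper. The per-1K-token rates below are hypothetical placeholders, not actual OpenAI prices; check the OpenAI website for current rates:

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_rate_per_1k, completion_rate_per_1k):
    """Estimate the cost of a single API call in dollars.

    Most providers price prompt and completion tokens separately,
    per 1,000 tokens. The rates passed in here are placeholders.
    """
    prompt_cost = (prompt_tokens / 1000) * prompt_rate_per_1k
    completion_cost = (completion_tokens / 1000) * completion_rate_per_1k
    return prompt_cost + completion_cost

# Using the earlier example (30,000-token prompt, 2,768-token response)
# with hypothetical rates of $0.01 and $0.03 per 1K tokens:
cost = estimate_cost(30_000, 2_768, 0.01, 0.03)
print(round(cost, 4))  # 0.383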
We hope you now have a good understanding of the key components of a prompt. You are now ready to learn about prompt engineering. In the next section, we will explore prompt engineering and effective strategies in detail, enabling you to maximize the potential of your prompts through one-shot and few-shot learning approaches.