Fun with tiktoken and NodeJS

The goal of this blog post is to explain how you can use the tiktoken library with NodeJS to convert words to tokens and tokens back to words. Why does this matter? Because when you use ChatGPT, you are charged by the token. Understanding how words are converted into tokens can help you predict the costs of using the ChatGPT APIs.

To learn more about how OpenAI charges you, see my previous blog post.

What is tiktoken?

tiktoken is an open-source library (MIT license) developed by OpenAI. There are both Python and NodeJS versions of this library. In this blog post, we focus on the NodeJS version (js-tiktoken).

You can install tiktoken by executing the following command:

npm install js-tiktoken

After you install tiktoken, you can convert words to tokens by using the following code:

import {encodingForModel} from "js-tiktoken";

// get the encoding used by GPT-4 Turbo
const enc = encodingForModel("gpt-4-turbo-preview");


// encode "Hello World!" and show the results
const result1 = enc.encode("Hello World!");
console.log(result1);

// encode "Hello World, I enjoy playing chess!" and show the results
const result2 = enc.encode("Hello, I enjoy playing chess!");
console.log(result2);

The code above uses the tiktoken encodingForModel() function to get the correct encoding for GPT-4 Turbo. Next, the tiktoken encode() method is used to convert the text “Hello World!” and “Hello, I enjoy playing chess!” into their corresponding tokens. When you run this code, you get the following output:

[ 9906, 4435, 0 ]
[ 9906, 11, 358, 4774, 5737, 33819, 0 ]

The first sentence “Hello World!” is converted into an array of 3 tokens. The first token (9906) represents “Hello”, the second token (4435) represents “ World” (including the leading space), and the third token (0) represents the exclamation mark (“!”).

The second sentence “Hello, I enjoy playing chess!” is converted into 7 tokens. Each word corresponds to a token, and the remaining 2 tokens correspond to the comma (“,”) and the exclamation mark (“!”).

Notice that the word “Hello” in both sentences is converted to the token 9906 and the exclamation mark (“!”) is converted to the token 0.
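
tiktoken also works in the other direction. The encoding object exposes a decode() method that converts an array of tokens back into text. Here is a quick sketch that decodes the token arrays from the output above:

import {encodingForModel} from "js-tiktoken";

// get the encoding used by GPT-4 Turbo
const enc = encodingForModel("gpt-4-turbo-preview");

// decode the token arrays from the output above back into text
console.log(enc.decode([9906, 4435, 0])); // Hello World!
console.log(enc.decode([9906, 11, 358, 4774, 5737, 33819, 0])); // Hello, I enjoy playing chess!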

Counting OpenAI Message Tokens

OpenAI provides sample code for predicting the number of tokens required when making a request against different ChatGPT models. Unfortunately, OpenAI provides the code in Python. But fortunately for you, I’ve converted the code to JavaScript (note that the code below also uses the official openai NPM package):

import {encodingForModel} from "js-tiktoken";
import OpenAI from "openai";

const openai = new OpenAI(); // reads your API key from the OPENAI_API_KEY environment variable

// Given an array of messages and a model name, returns token count.
// Based on Python code at https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken
function countPromptTokens(messages, modelName) {
    let tokensPerMessage; // there is a token overhead per message
    let tokensPerName; // there is a token overhead per name field

    switch (modelName) {
        case "gpt-3.5-turbo-0613":
        case "gpt-3.5-turbo-16k-0613":
        case "gpt-4-0314":
        case "gpt-4-32k-0314":
        case "gpt-4-0613":
        case "gpt-4-32k-0613":
            tokensPerMessage = 3;
            tokensPerName = 1;
            break;
        case "gpt-3.5-turbo-0301":
            tokensPerMessage = 4;
            tokensPerName = -1;
            break;
        case "gpt-3.5-turbo":
            console.log("Warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0613.");
            return countPromptTokens(messages, "gpt-3.5-turbo-0613");
        case "gpt-4":
            console.log("Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.");
            return countPromptTokens(messages, "gpt-4-0613");
        default:
            throw new Error(`countPromptTokens not implemented for model ${modelName}`);
    }

    // get the right encoder
    const encoding = encodingForModel(modelName);

    let numTokens = 0;
    for (const message of messages) {
        numTokens += tokensPerMessage;
        for (const [key, value] of Object.entries(message)) {
            numTokens += encoding.encode(value).length;
            if (key === "name") {
                numTokens += tokensPerName;
            }
        }
    }
    numTokens += 3;  // every reply is primed with <|start|>assistant<|message|>
    return numTokens;
}

// Given a message and a model name, returns a token count
function countCompletionTokens(message, modelName) {
   // get the right encoder
   const encoding = encodingForModel(modelName);

   return encoding.encode(message).length;
}


// call ChatGPT
async function promptGPT(messages, modelName) {
    const completion = await openai.chat.completions.create({
      messages: messages,
      model: modelName,
    });
  
    return completion;
}

const messages = [
    {
        "role": "system",
        "name": "example_user",
        "content": "You are a helpful assistant who is obssessed with dinosaurs.",
    },
];

let modelName = "gpt-3.5-turbo-0301";

// call GPT
const result = await promptGPT(messages, modelName);
const GPTPromptCount = result.usage.prompt_tokens;
const GPTCompletionCount = result.usage.completion_tokens;
console.log(`ChatGPT: prompt count=${GPTPromptCount}, completion count=${GPTCompletionCount}`);

// call countPromptTokens and countCompletionTokens
const myPromptCount = countPromptTokens(messages, modelName);
const myCompletionCount = countCompletionTokens(result.choices[0].message.content, modelName);
console.log(`My Count: prompt count=${myPromptCount}, completion count=${myCompletionCount}`);

There is a lot going on in this code, so let me walk you through it. There are three functions:

  1. countPromptTokens() – this function takes an array of messages and returns a token count (see the usage sketch after this list). Notice that there is some overhead associated with each message, and different models handle this overhead differently. For example, the “gpt-3.5-turbo-0613” model requires an additional 3 tokens per message.
  2. countCompletionTokens() – this function takes a single message and performs a token count. You can use this function to calculate the number of tokens associated with the completion message returned by the Chat Completions API.
  3. promptGPT() – this function calls the OpenAI Chat Completions endpoint. The Chat Completions endpoint returns a usage field which represents the actual number of tokens used in the ChatGPT request (both prompt and completion tokens).
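
For example, here is how you might call countPromptTokens() on its own to estimate the size of a conversation before sending it. (The messages below are made up for illustration.)

// estimate the prompt tokens for a conversation without calling the API
const conversation = [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the tallest dinosaur?" },
];

console.log(countPromptTokens(conversation, "gpt-4-0613"));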

When you run the full example, the same list of messages is sent to ChatGPT and passed to countPromptTokens(), and the returned completion is passed to countCompletionTokens(), so you can verify that these functions return the right results. Here is a sample of the output:

ChatGPT: prompt count=22, completion count=99
My Count: prompt count=22, completion count=99

These results should be reassuring. Both ChatGPT and the countPromptTokens() function return the same number of prompt tokens: 22. Both ChatGPT and the countCompletionTokens() function return the same number of completion tokens: 99.
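
Once you can count tokens before making a request, estimating the dollar cost is a single multiplication. Here is a minimal sketch, assuming hypothetical per-token prices (check the OpenAI pricing page for the actual rates for your model):

// hypothetical prices in dollars per 1,000 tokens; replace these with
// the current rates from the OpenAI pricing page
const PROMPT_PRICE_PER_1K = 0.0015;
const COMPLETION_PRICE_PER_1K = 0.002;

function estimateCost(promptTokens, completionTokens) {
    return (promptTokens / 1000) * PROMPT_PRICE_PER_1K
        + (completionTokens / 1000) * COMPLETION_PRICE_PER_1K;
}

// using the token counts from the sample output above
console.log(`Estimated cost: $${estimateCost(22, 99).toFixed(6)}`);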

Conclusion

The goal of this blog post was to explain how you can use the tiktoken library to convert words to tokens. By taking advantage of this library, you can predict the number of tokens required when making a call to an OpenAI endpoint. In the code above, I demonstrated how you can create both a countPromptTokens() and countCompletionTokens() function. Because OpenAI charges you per token, you can use tiktoken to predict the costs of calling OpenAI APIs.
