Tiktoken 1.0.1

Tiktoken

Nuget package dotnet License: MIT Discord

This implementation aims for maximum performance, especially in the token count operation.
There's also a benchmark console app here for easy tracking of this.
We will be happy to accept any PR.

Implemented encodings

  • cl100k_base
  • r50k_base
  • p50k_base
  • p50k_edit

Usage

var encoding = Tiktoken.Encoding.ForModel("gpt-4");
var tokens = encoding.Encode("hello world"); // [15339, 1917]
var text = encoding.Decode(tokens); // hello world
var numberOfTokens = encoding.CountTokens(text); // 2

var encoding = Tiktoken.Encoding.Get("p50k_base");
var tokens = encoding.Encode("hello world"); // [31373, 995]
var text = encoding.Decode(tokens); // hello world

Benchmarks

You can view the reports for each version here


BenchmarkDotNet=v0.13.5, OS=macOS Ventura 13.3.1 (a) (22E772610a) [Darwin 22.4.0]
Apple M1 Pro, 1 CPU, 10 logical and 10 physical cores
.NET SDK=7.0.203
  [Host]     : .NET 7.0.5 (7.0.523.17405), Arm64 RyuJIT AdvSIMD
  Job-MEZVRD : .NET 7.0.5 (7.0.523.17405), Arm64 RyuJIT AdvSIMD DEBUG

BuildConfiguration=Debug  

Method Categories Data Mean Ratio Gen0 Gen1 Gen2 Allocated Alloc Ratio
SharpTokenV1_0_28_ CountTokens 1. (...)57. [19866] 5,247,614.0 ns 1.00 601.5625 289.0625 - 3805771 B 1.00
TiktokenSharpV1_0_5_ CountTokens 1. (...)57. [19866] 1,509,944.3 ns 0.29 250.0000 125.0000 - 1571154 B 0.41
Tiktoken_ CountTokens 1. (...)57. [19866] 411,159.5 ns 0.08 49.3164 - - 309449 B 0.08
SharpTokenV1_0_28_ CountTokens Hello, World! 3,217.5 ns 1.00 0.6752 - - 4240 B 1.00
TiktokenSharpV1_0_5_ CountTokens Hello, World! 6,631.3 ns 2.06 2.1820 0.0381 - 13728 B 3.24
Tiktoken_ CountTokens Hello, World! 329.0 ns 0.10 0.0420 - - 264 B 0.06
SharpTokenV1_0_28_ CountTokens King(...)edy. [275] 62,301.4 ns 1.00 8.5449 0.3662 - 54160 B 1.00
TiktokenSharpV1_0_5_ CountTokens King(...)edy. [275] 21,541.3 ns 0.35 5.0964 0.2136 - 32096 B 0.59
Tiktoken_ CountTokens King(...)edy. [275] 4,575.6 ns 0.07 0.6409 - - 4032 B 0.07
SharpTokenV1_0_28_Encode Encode 1. (...)57. [19866] 4,962,747.6 ns 1.00 601.5625 296.8750 - 3805769 B 1.00
TiktokenSharpV1_0_5_Encode Encode 1. (...)57. [19866] 1,643,795.4 ns 0.33 251.9531 126.9531 1.9531 1571156 B 0.41
Tiktoken_Encode Encode 1. (...)57. [19866] 427,241.4 ns 0.09 59.5703 29.7852 - 375665 B 0.10
SharpTokenV1_0_28_Encode Encode Hello, World! 3,281.3 ns 1.00 0.6752 0.0038 - 4240 B 1.00
TiktokenSharpV1_0_5_Encode Encode Hello, World! 6,636.0 ns 2.02 2.1820 0.0458 - 13728 B 3.24
Tiktoken_Encode Encode Hello, World! 433.3 ns 0.13 0.1135 0.0005 - 712 B 0.17
SharpTokenV1_0_28_Encode Encode King(...)edy. [275] 62,339.9 ns 1.00 8.5449 0.4883 - 54160 B 1.00
TiktokenSharpV1_0_5_Encode Encode King(...)edy. [275] 21,609.6 ns 0.35 5.0964 0.3052 - 32096 B 0.59
Tiktoken_Encode Encode King(...)edy. [275] 5,224.8 ns 0.08 0.8011 0.0153 - 5056 B 0.09

Possible optimizations

  • Modes - Fast(without special token regex)/Strict
  • SIMD?
  • Parallelism?
  • string as dictionary key?

Support

Priority place for bugs: https://github.com/tryAGI/LangChain/issues
Priority place for ideas and general questions: https://github.com/tryAGI/LangChain/discussions
Discord: https://discord.gg/Ca2xhfBf3v

Showing the top 20 packages that depend on Tiktoken.

Packages Downloads
Anthropic
Generated C# SDK based on official Anthropic OpenAPI specification.
4
Anthropic
Generated C# SDK based on official Anthropic OpenAPI specification.
3

.NET Framework 4.6.1

.NET 6.0

  • No dependencies.

.NET 7.0

  • No dependencies.

.NET Standard 2.0

  • No dependencies.

.NET Standard 2.1

  • No dependencies.

Version Downloads Last updated
3.1.4 3 03/26/2026
3.1.3 2 03/26/2026
3.1.2 2 03/26/2026
3.1.2-alpha.0.1 2 03/26/2026
3.1.1 2 03/26/2026
3.1.1-alpha.0.1 2 03/26/2026
3.1.0 2 03/26/2026
3.1.0-rc.1.4 2 03/26/2026
3.1.0-rc.1.3 2 03/26/2026
3.1.0-rc.1.2 2 03/26/2026
3.1.0-rc.1.1 2 03/26/2026
3.1.0-rc.1 2 03/26/2026
3.0.1-alpha.0.4 2 03/26/2026
3.0.0 2 03/26/2026
2.2.0 2 03/06/2026
2.1.1 2 03/06/2026
2.1.0 2 03/06/2026
2.0.3 2 03/06/2026
2.0.2 2 03/06/2026
2.0.1 2 03/06/2026
2.0.0 2 03/06/2026
1.2.0 2 03/06/2026
1.1.3 2 03/06/2026
1.1.2 2 03/06/2026
1.1.1 2 03/06/2026
1.1.0 2 03/06/2026
1.0.2 2 03/06/2026
1.0.1 2 03/06/2026
1.0.0 2 03/06/2026
0.9.7 2 03/06/2026
0.9.6 2 03/06/2026
0.9.5 2 03/06/2026
0.9.4 2 03/06/2026
0.9.3 2 03/06/2026
0.9.2 2 03/06/2026