Home

Awesome

@anush008/tokenizers

The official Node bindings are in a jinx with limited support for Node versions and architectures. This package offers multi-arch bindings for @huggingface/tokenizers with Node v20.x supported.

Supports:

Installation

npm install @anush008/tokenizers

Features

Basic example

import { Tokenizer } from "@anush008/tokenizers";

const tokenizer = await Tokenizer.fromFile("tokenizer.json");
const wpEncoded = await tokenizer.encode("Who is John?");

console.log(wpEncoded.getLength());
console.log(wpEncoded.getTokens());
console.log(wpEncoded.getIds());
console.log(wpEncoded.getAttentionMask());
console.log(wpEncoded.getOffsets());
console.log(wpEncoded.getOverflowing());
console.log(wpEncoded.getSpecialTokensMask());
console.log(wpEncoded.getTypeIds());
console.log(wpEncoded.getWordIds());

License

MIT