minbpe-rs
Port of Andrej Karpathy's minbpe to Rust.
Quick Start
Create a Rust application crate with cargo:
$> cargo new minbpe-test
In the resulting project, add minbpe to Cargo.toml:
[dependencies]
minbpe = "0.1.0"
Refer to crates.io to select the latest version. Next, in src/main.rs:
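use std::path::Path;
use minbpe::{BasicTokenizer, Saveable, Tokenizer, Trainable};

fn main() {
    let text = "aaabdaaabac";
    let mut tokenizer = BasicTokenizer::new();
    // Target vocabulary: the 256 raw byte tokens plus 3 learned merges.
    tokenizer.train(text, 256 + 3, false);
    // Encode the training text to token ids, then decode ids back to text.
    println!("{:?}", tokenizer.encode(text));
    println!("{:?}", tokenizer.decode(&[258, 100, 258, 97, 99]));
    // Save the trained tokenizer under the name "toy" in the current directory.
    tokenizer.save(Path::new("./"), "toy");
}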
Execute the binary with cargo run:
$> cargo run
...
Compiling minbpe-test v0.1.0 (~/minbpe-test)
Finished dev [unoptimized + debuginfo] target(s) in 15.71s
Running `target/debug/minbpe-test`
[258, 100, 258, 97, 99]
"aaabdaaabac"
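To see where the ids in that output come from, the following is a minimal, dependency-free sketch of greedy byte-pair merging. It is not the minbpe implementation: the helpers `pair_counts`, `merge`, and `toy_bpe` are hypothetical names introduced only for this illustration, and the tie-breaking rule (the pair seen first in the sequence wins) is an assumption.

```rust
use std::collections::HashMap;

/// Count how often each adjacent pair of ids occurs.
fn pair_counts(ids: &[u32]) -> HashMap<(u32, u32), usize> {
    let mut counts = HashMap::new();
    for w in ids.windows(2) {
        *counts.entry((w[0], w[1])).or_insert(0) += 1;
    }
    counts
}

/// Replace each occurrence of `pair`, scanning left to right, with `new_id`.
fn merge(ids: &[u32], pair: (u32, u32), new_id: u32) -> Vec<u32> {
    let mut out = Vec::with_capacity(ids.len());
    let mut i = 0;
    while i < ids.len() {
        if i + 1 < ids.len() && (ids[i], ids[i + 1]) == pair {
            out.push(new_id);
            i += 2;
        } else {
            out.push(ids[i]);
            i += 1;
        }
    }
    out
}

/// Hypothetical helper: apply up to `num_merges` greedy merges to the
/// UTF-8 bytes of `text`. New ids start at 256, after the byte range.
fn toy_bpe(text: &str, num_merges: u32) -> Vec<u32> {
    let mut ids: Vec<u32> = text.bytes().map(u32::from).collect();
    for step in 0..num_merges {
        let counts = pair_counts(&ids);
        // Most frequent pair wins; ties go to the pair seen first (assumption).
        let mut best: Option<((u32, u32), usize)> = None;
        for w in ids.windows(2) {
            let pair = (w[0], w[1]);
            let c = counts[&pair];
            if best.map_or(true, |(_, best_count)| c > best_count) {
                best = Some((pair, c));
            }
        }
        // Fewer than two ids left: nothing more to merge.
        let Some((pair, _)) = best else { break };
        ids = merge(&ids, pair, 256 + step);
    }
    ids
}

fn main() {
    // prints [258, 100, 258, 97, 99]
    println!("{:?}", toy_bpe("aaabdaaabac", 3));
}
```

With the input "aaabdaaabac", the three merges in this sketch are (97, 97) to 256, then (256, 97) to 257, then (257, 98) to 258, so 258 stands for the bytes of "aaab" and the remaining ids 100, 97, and 99 are the raw bytes for "d", "a", and "c".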
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.