https://github.com/alasdairforsythe/tokenmonster
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & JavaScript
text-tokenization
tokenisation
tokenization
tokenize
tokenizer
tokenizing
vocabulary
vocabulary-builder
vocabulary-generator
- Added: over 1 year ago
- Last Synced: 11 months ago
- Created: May 12, 2023
- Relevant topics? true
- External users? true
- Open source license? true
- Active? true
- Fork? false
- Main Language: Go
- Commits: 168
- Committers: 1
- Issues: 25
- Pull Requests: 3
- Owner: alasdairforsythe
- Stars: 515
- Forks: 20
- Packages: 2
- Downloads: 885
