onnxruntime-extensions/operators
Sayan Shaw 7851b51ee3
Add initial tiktoken and Phi3SmallTokenizer support (#729)
* add initial tiktoken support

* add vector hash and equal for bpe ranks map

* change lambda comparator

* move phi-3-small files

* final changes

* move tiktoken files from data2 to data

* add unit test

* add tokenizer module

* merge json and tiktoken impl

* fix tiktoken encoding problem

* address comments

* remove dummy tokens

---------

Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2024-08-02 10:24:02 -07:00
audio Feature extraction C API for Whisper model (#755) 2024-07-11 11:20:36 -07:00
azure Enable using system certs on Android. (#543) 2023-08-24 12:17:07 +10:00
cuda sync to flash attention kernel 2.5.9 and add document of how to write custom op (#757) 2024-07-10 07:09:40 -07:00
cv2 Refactor OrtxStatus to be header-only implementation. (#720) 2024-05-17 15:40:11 -07:00
math Feature extraction C API for Whisper model (#755) 2024-07-11 11:20:36 -07:00
text Add CUDA build support and some code refinements (#581) 2023-10-30 21:06:30 -07:00
tokenizer Add initial tiktoken and Phi3SmallTokenizer support (#729) 2024-08-02 10:24:02 -07:00
vision Refactor the unit tests and cmake build script (#726) 2024-05-30 14:16:14 -07:00
ocos_operators_placeholder.cc Ensure noexcep_operators and ocos_operators get built always (#570) 2023-10-09 18:29:02 -07:00