Commit Graph

163 Commits

Author SHA1 Message Date
Wenbing Li be5aa773e3
Unify the image operations in extensions library (#831)
* Unify the image operations in extensions library

* fix the build configuration issue

* More build fixings

* Fix the native image codec

* fix encode_image

* Add bgr/rgb conversion for encoding image

* parity check

* build break

* update PNG encoding parameters

* build break on Linux

* using MSE to compare images

* fix the discrependency between Linux and Windows

* final code refinement

* one more change

* fix the C++ warnings

---------

Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
2024-10-30 09:17:06 -07:00
Sayan Shaw 0e6bffa201
Fix regex prefast warnings (#832)
* fix regex prefast warnings

* remove try catch

---------

Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
2024-10-29 22:36:59 -07:00
Sayan Shaw 7ab9d24cb4
Add general regex support (#822)
* Add general regex support

* add case 5 support instead of replacing with s+

* add more test cases

* address comments

* add back gpt2 and llama regex methods for efficiency

---------

Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
2024-10-21 16:29:17 -07:00
Wenbing Li 62c0a7bfda
fix the unigram detector for last HG tokenizer (#820) 2024-10-03 14:25:53 -07:00
Wenbing Li 12a9e8beb4
support sentence-piece add_dummy_prefix for all models (#819)
* add compatibility docs

continue updating the doc

updating doc 2

* support sentence-piece add_dummy_prefix for all models

* revert the flag

* initialize the add_dummy_prefx for llama model
2024-10-01 09:08:59 -07:00
Wenbing Li 2c3e936cfc
support the merges array in tokenizer.json (#817) 2024-09-26 11:01:13 -07:00
Wenbing Li f204a4c791
Add a decoder for Unigram tokenizer and unify some classes among tokenizers (#816)
* rename and formalize the file names

* add the decoder impl

* fix a typo
2024-09-25 10:25:06 -07:00
Wenbing Li 6b94f4d7a5
Fix the Unicode code discrepency on CLIP model (#814)
* refine the code structure

* more fixing on unicode

* fix the codepoint 304

* add the clip tokenizer data files abck
2024-09-23 16:49:24 -07:00
Wenbing Li 176c1d0138
Support the Unigram tokenizer kind from sentencepiece library (#811)
* initial commit

* Ugm vocab loaded is good

* test passed

* fixes unit test on win32

* finish the parity check

* code refinement

* code refinement for review
2024-09-19 15:46:13 -07:00
Sayan Shaw 0d5d19f67b
fix prefast warning (#809)
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
2024-09-15 22:34:07 -07:00
Sayan Shaw 8bc8e43da1
Add C++ regex support for Llama3, Standard Library, and Custom Cases (#804)
* add C++ standard library regex support for GPT2 case

* reorder regex handling

* try without STL

* missing case

* add llama3 regex support

* add custom regex impl

* change regex based on model

* modify tests, add docs, and code cleanup

* add regex test and const strings

---------

Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
2024-09-10 23:17:49 -07:00
Wenbing Li b8b2ebfb85
optimize spm tokenizer for long text (#799)
* optimize spm tokenizer for long text

* refine the split logic

* re-trigger CI pipeline.
2024-08-30 14:58:40 -07:00
Wenbing Li 2d02a687be
Optimize the tokenizer for efficiency (#797)
* optimize the tokenizer for efficiency

* fix the unit test failures.

* fix the api test case failures

* removed the unused code.

* More test cases fixings

* One more fixing

* fix macOS build issues

* refine the test

* add more diagnosis info.

* fix unit test in CI Linux

* fix the pp_api test failure
2024-08-27 18:57:50 -07:00
Wenbing Li 8f2c35fad0
Add more tests for pre-processing C APIs (#793)
* initial api for tokenizer

* More fixings and test data refinement

* add a simple wrapper for pre-processing APIs

* fix the test issues

* test if the tokenizer is spm based

* fix the failed test cases

* json pointer does not work
2024-08-21 16:48:39 -07:00
Wenbing Li 711a2cfa69
add a convert_token_string_to_an_id API for the prompt ids (#794)
* add a convert token string to an id API for the prompt ids

* fix the build issues on Linux
2024-08-19 16:44:07 -07:00
Sayan Shaw 7851b51ee3
Add initial tiktoken and Phi3SmallTokenizer support (#729)
* add initial tiktoken support

* add vector hash and equal for bpe ranks map

* change lambda comparator

* move phi-3-small files

* final changes

* move tiktoken files from data2 to data

* add unit test

* add tokenizer module

* merge json and tiktoken impl

* fix tiktoken encoding problem

* address comments

* remove dummy tokens

---------

Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2024-08-02 10:24:02 -07:00
Wenbing Li 8b002b86ab
Fix the case that bos_token is null (#781) 2024-07-31 17:50:20 -07:00
Wenbing Li b4ebfc9519
Fix spm converted FastTokenizer issue on non-ascii char (#778)
* Fix spm converted tokenizer issue on non-ascii char

* remove pkg_resource in python
2024-07-31 14:22:25 -07:00
Wenbing Li 8153bc1a3a
Feature extraction C API for whipser model (#755)
* Feature extraction C API for whipser model

* Update the docs

* Update the docs2

* refine the code

* fix some issues

* fix the Linux build

* fix more data consistency issue

* More code refinements
2024-07-11 11:20:36 -07:00
cao lei 95d65e4ec0
sync to flash attention kernel 2.5.9 and add document of how to write custom op (#757)
* sync to flash attention kernel 2.5.9

* support users to overload GetMayInplace and ReleaseMayInplace

* Undo the change for pybind11 dependency
2024-07-10 07:09:40 -07:00
Xavier Dupré bef5f07e33
Add custom ops ReplaceZero (#739)
* Add custom ops ReplaceZero

* fix merge conflicts
2024-06-18 11:36:14 +02:00
Xavier Dupré 05df33b302
Add missing documentation for fused kernels (#744) 2024-06-18 10:54:15 +02:00
Xavier Dupré 690bed71b6
Add operator MulSigmoid, MulMulSigmoid (#741)
* Add operator MulSigmoid

* add mul mul sigmoid

* add comments

* Apply suggestions from code review

---------

Co-authored-by: Wei-Sheng Chin <wechi@microsoft.com>
2024-06-12 10:29:42 +02:00
Xavier Dupré f5055466d5
Add custom kernel ScatterNDOfShape (#705)
* first draft

* clang

* Draft for ScatterNFOfShape

* fix build

* disable test when cuda is missing

* fix implementation

* update test

* add MaskedScatterNdOfShape

* fix merge conflicts
2024-06-11 09:59:46 +02:00
Xavier Dupré 79f3b048d4
Add custom op Transpose2DCast (#737)
* Add custom op Transpose2DCast

* fix compilation issues

* fix compilation issues
2024-06-06 17:44:21 +02:00
Xavier Dupré 1e8c1211a5
Add custom kernels AddSharedInput, MulSharedInput (#734)
* Add custom kernel AddSharedInput, MulSharedInput

* fix compilation

* compilation issue

* fix unit test
2024-06-05 10:42:22 +02:00
Wenbing Li ca433cbea7
Refactor the unit tests and cmake build script (#726)
* refine the build script

* complete the unit tests.

* remove the commented code
2024-05-30 14:16:14 -07:00
Xavier Dupré b60df02fd0
Use of OrtxStatus in kernel NegXPlus1 (#732)
* first draft for NegXPlus1

* complete

* fix unit test

* rename one test

* remove test if not cuda

* switch to OrtxStatus

---------

Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2024-05-30 10:56:06 +02:00
Xavier Dupré 95a49faabe
Add kernel NegXPlus1 = 1 - X (#709)
* first draft for NegXPlus1

* complete

* fix unit test

* rename one test

* remove test if not cuda

---------

Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2024-05-29 15:26:44 +02:00
Wenbing Li c3c5f1cbb1
Remove C++ filesystem library dependency for the compatibility of old system (#721)
* Remove C++ filesystem library dependency for the compatibility of old OS.

* Update file_sys.h
2024-05-18 07:23:45 -07:00
Wenbing Li 97ee9eb56f
Refactor OrtxStatus to be header-only implmentation. (#720) 2024-05-17 15:40:11 -07:00
Wenbing Li 311dd35401
Add ImageProcessor for Multimodel model Pre-processing (#715)
* only keep the image decoder from opencv

* initial build

* refine the code

* Add clear functions

* Update CMakeLists.txt

* Update opencv.cmake

* change the output type to float

* get the result

* align image-process with original Python

* move the LoadRawImages into library

* fix the calculation error

* fix the pipeline build issue

* fix the build breaks in ci pipeline

* support json configuration file and refactor the code.
2024-05-15 14:35:14 -07:00
Baiju Meswani 660af0d79a
Return added_tokens_ by reference (#711) 2024-05-07 11:47:57 -07:00
Wenbing Li c58c930739
Ignore all streaming output of invalid utf-8 string (#704)
* Ignore all streaming output of invalid utf-8 string

* Update bpe_streaming.hpp

* add the phi-3 tokenizer test

* add a streaming test for phi-3 model

* fix the utf-8 validation

* fix the utf-8 validation 2

* fix the utf-8 validation 3

* fix the utf-8 validation 4
2024-05-06 16:46:55 -07:00
cao lei e645cdab8d
Introduce flash attention and cutlass library (#708)
* refactor cuda ops, remove contrib folder

* introduce flash attention and cutlass

* resolve comments

---------

Co-authored-by: Lei Cao <leca@microsoft.com@onnxruntime-a10.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>
2024-05-05 22:52:28 -07:00
cao lei dfdf52e759
refactor cuda ops, remove contrib folder (#707)
Co-authored-by: Lei Cao <leca@microsoft.com@onnxruntime-a10.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>
2024-05-03 12:18:59 -07:00
Wenbing Li 8645a846fb
A tutorial of build ort-extensions from source as a static library (#703)
* The tutorial of build from source as a static library

* update test flag control

* add the tutorial
2024-05-01 13:46:27 -07:00
Tang, Cheng 3b889fc42f
update custom op v2 struct to be able to invoke from eager mode (#700)
Co-authored-by: Cheng Tang <chenta@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2024-04-30 13:53:39 -07:00
Wenbing Li a8bce4328b
Add the tokenizer C ABI (#693)
* initial checkins

* fix the selectedops build failures

* add the tokenization implementation

* update the windows DEF file for c abi in cmake file

* fix the build on linux

* fix some warnings and remove the unused code

* initial import of unit tests from tfmtok

* add streaming API support

* fix the merges loading issues

* complete export from tfmtok - needs input id fixing

* fix the unit test failures.

* fix all unit test failure

* refactor streaming code

* remove the unused code

---------

Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
2024-04-29 16:45:49 -07:00
Tang, Cheng 1f31d33ed4
Eager mode: cuda kernel support (#694)
* add UT for neg_pos_cuda in eager mode and fix build break in Windows

* fix Linux build break

* adjust argument and path

* remove old cudaContext

* add ort cuda test back

* fix cuda tests

* undo debug code

* undo useless change

---------

Co-authored-by: jslhcl <jslhcl@gmail.com>
Co-authored-by: Cheng Tang <chenta@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net>
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
2024-04-24 12:49:00 -07:00
Wenbing Li f9290e8bac
Add a status class for future tokenizer API implementation (#690)
* Add a status class for future API implementation

* Update bpe_kernels.cc

* fix the ios package pipeline

* update mistral test model name
2024-04-18 21:12:14 -07:00
Wenbing Li 646462790b
Refactor the header file directory and integrate the eager tensor implementation (#689)
* refactor the header file in include folder

* fix the basic-token eager unit test case

* a more flexible way to handle string tensor shape.

* fix the unit test path issue

* remove the multi-inherits to avoid issue during pointer casting

* add api cmake build support

* undo some temporary changes

* code refinement

* fix variadic arg

* only expose the context for ort version >= 17

* fix a shape bug

* fix the cuda build issue

* change ifdef condition of GetAllocator

* finalize the ort c abi wrapper file name

* fix the iOS build break

* align gtest version with triton

* Update ext_apple_framework.cmake for iOS header files

---------

Co-authored-by: Cheng Tang <chenta@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net>
2024-04-17 12:58:19 -07:00
cao lei 2234001184
refactor ORT-Extension for the coming GroupQueryAttention work (#674)
* refactor ORT-Extension for the coming GroupQueryAttention work

* fix typo and add #if ORT_API_VERSION >= 15 for GetOrtAllocator

* fix cuda build
2024-03-20 10:55:04 -07:00
Wenbing Li 61369fb970
Unify the spm/bpe tokenizers (#666)
* Unify the spm/bpe tokenizers

* fix the build error

* fix the decoding issue

* add model name in exported onnx

* fixing the unit tests

* revert the unneccesary file format changes
2024-03-06 10:07:05 -08:00
Adam Pocock d47a3dd2dd
Fix batching in fairseq SentencepieceTokenizer (#640)
* Move fairseq fix out of the loop.

* Tidying up the patch.
2024-01-30 09:39:16 -08:00
cao lei 44e494bab4
file name typo (#636) 2024-01-17 13:55:53 -08:00
RandySheriffH 1ccc405bab
FastGelu float16 (#621)
* support float16

* add ut for float16

* support bfloat16

* refactor

* fetch prop

* ifdef f16

* remove header

* ifdef cuda

* typename mapped type

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-12-11 14:31:17 -08:00
Wenbing Li 90067494d3
Fix some gs-failure check issues and typos. (#618) 2023-12-07 15:37:54 -08:00
RandySheriffH 9b2daf56de
Rashuai/bert cuda (#614)
* cmake fast gelu

* bridge func and cuda kernel

* tune ut

* fix build warning

* fix format

* tune ut

* drop OCOS_ENABLE_CONTRIB

* tune cmake

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-12-05 12:59:19 -08:00
Sayan Shaw 1d8b81f59b
add fairseq decoder support (#612)
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2023-11-30 11:52:31 -08:00