* add batch_mode and padding for GPT2Tokenizer
* fix text
* fix test and add doc
* fix test
* fix comments
* delete header
Co-authored-by: Ze Tao <zetao@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* add vector_to_string
* fix merge conflict
* fix building failure
* remove debug code
* fix test
* move back unicode
* fix typo
* move base64 back
* move to the right place
* support only int64_t
Co-authored-by: Ze Tao <zetao@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* Add attribute global_replace to StringRegexReplace
Signed-off-by: xavier dupré <xavier.dupre@gmail.com>
* fix potential wrong pointer
Signed-off-by: xavier dupré <xavier.dupre@gmail.com>
* update sep
Signed-off-by: xavier dupré <xavier.dupre@gmail.com>
Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
> It seems to be working now.
I enabled some less secure option in the pipeline. Let's see how it goes.
* enable c++ test on Windows
* add linux/macos platform
* no extractfile task on unix-like platform
* fixing on unix-like platform
* try a fix on macos
* basic changes
fixing build issues
running on Windows Platform
deploy the ort library in CMake
update gitignore
* Add C++ shared tests
* enable ctest
* fixing the python build issue
* remove cc test
* why does macos need the openmp package?
* test attribute
* finish improvement
Co-authored-by: Ze Tao <zetao@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* enable direct pip package build.
* some link symbols
* fixing on Windows platform
* update the build instruction
* update the ci pipeline
* Fix the Linux and MacOS build.
* Update mshost.yaml
* update the ci python version
* update the pipeline
* simplify the instruction.
* update according to the comments.
Co-authored-by: Wenbing Li <wenli@MacM1.local>
* initialize a bbpe tokenizer
* add the json library.
* gpt2 tokenizer cpp implementation.
* Tom/add tutorial (#32)
* Added getting started instructions for Windows
Signed-off-by: Tom Wildenhain <tomwi@microsoft.com>
* Created a tutorial for converting models with custom ops. WIP
* Removed long outputs
* Changed to keras syntax and added setup instructions
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* rename gpt2 test case file
* polish the symbol names in the sources
* polish it again.
* fix the build issue on macos
* another fix
* another fix 3
Co-authored-by: TomWildenhain-Microsoft <67606533+TomWildenhain-Microsoft@users.noreply.github.com>
* Added getting started instructions for Windows
Signed-off-by: Tom Wildenhain <tomwi@microsoft.com>
* Created a tutorial for converting models with custom ops. WIP
* Removed long outputs
* Changed to keras syntax and added setup instructions
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* refine the codebase
* remove the dup def file
* surface the build error in the build script.
* fix the build break for gcc
* run the test after build in Linux
* enable build on Linux
* move pssetup files into the subfolder
* clean up
* correct the package path
* enable non-python build
* add Linux CI
* Update mshost.yaml for Azure Pipelines
* change to std::complex
* Added getting started instructions for Windows
Signed-off-by: Tom Wildenhain <tomwi@microsoft.com>
* Added casting from python for additional types
* Added error for float16
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* refactoring
* remove useless include
* remove pragma once from cc files
* add custom_op_test.onnx
* remove unnecessary imports, add header in project file, run C++ unit tests
* refactor tests
* Update mshost.yaml
* Implements dummy operators with double and strings
* update CI
* Implements StringUpper C++ version
* Fix runtime issue preventing registration of multiple python ops
* add c++ operator StringJoin
* Support multi output for python and C++ operators
* remove torch in requirements-dev.txt