Changming Sun
c30df08aee
Add OneBranch.Official.yml ( #242 )
2022-06-06 10:25:44 -07:00
Changming Sun
70bfe4345c
Create onebranch pipeline for win32-wheel ( #240 )
...
For compliance work.
2022-06-03 18:26:44 -07:00
Changming Sun
36e5c474fd
Update requirements.txt to restrict protobuf version ( #241 )
2022-06-03 16:09:25 -07:00
Wenbing Li
da4784a2cc
update the bert end to end example with hftok ( #236 )
2022-06-01 10:41:42 -07:00
shaahji
49548f843d
Issue #230 : Add HuggingFace vocab format to Bert tokenizer
...
HuggingFace vocab format is newline separated (unlike GPT which is
json). Newline separated is likely to be faster and doesn't require
an external library to parse it. Instead of introducing a json based
format, added support for native HuggingFace newline separated token
format.
2022-05-26 14:17:20 -07:00
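Note on the commit above: the newline-separated vocab format it describes can be read with nothing but the standard library. The sketch below is illustrative only and is not the tokenizer's actual implementation. The function name `load_vocab` and the sample tokens are hypothetical, and it assumes the usual HuggingFace BERT convention that a token's id is its zero-based line index.

```python
import io


def load_vocab(lines):
    """Build a token -> id mapping from a newline-separated vocab.

    Assumption: one token per line, and the id is the zero-based
    line index (the common HuggingFace BERT vocab.txt convention).
    """
    vocab = {}
    for index, line in enumerate(lines):
        # Strip only line terminators so tokens keep any inner characters.
        token = line.rstrip("\r\n")
        vocab[token] = index
    return vocab


# Hypothetical in-memory vocab standing in for a vocab.txt file.
sample = io.StringIO("[PAD]\n[UNK]\nhello\nworld\n")
vocab = load_vocab(sample)
```

Any iterable of lines works here, so an open text file handle can be passed in place of the `io.StringIO` sample.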
Lucas Gomez Jimenez
21dc637a6d
Ensure custom op domain is released ( #234 )
2022-05-25 16:51:49 -07:00
Wenbing Li
471a858b56
add a patch for opencv compiling ( #233 )
2022-05-19 10:08:47 -07:00
Nat Kershaw (MSFT)
1678c0be9a
Update README.md ( #229 )
2022-05-13 15:34:20 -07:00
Wenbing Li
80c26772af
Update wheels_win32.yml for Azure Pipelines ( #228 )
2022-05-12 10:26:07 -07:00
Wenbing Li
909acb7ce4
build and packaging script improvement for release ( #218 )
...
* integrate opencv
* small fixing
* Add the opencv includes and libs
* refine a little bit
* standardize the output folder.
* fix ctest on Linux
* fix setup.py on output folder change.
* more fixings for CI pipeline
* more fixing 1
* more fixing 2
* more fixing 3
* ci pipeline fixing 1
* ci pipeline fixing 2
* a silly typo...
* ci pipeline fixing 3
* fixing the file copy issue.
* last fixing.
* re-test the fullpath in build_ext.
* One more try
* extend timeout
* mshost.yml indent
* Update mshost.yaml for Azure Pipelines
* cibuild build python versions
* Update wheels.yml
* only build python 3.8/3.9
* Update wheels.yml for Azure Pipelines
* separate the ci pipeline
2022-05-11 16:51:59 -07:00
Wenbing Li
972552126f
Update README.md
2022-05-04 15:33:09 -07:00
Wenbing Li
bfbfa5a304
An end-to-end BERT model with pre-/post- processing. ( #224 )
...
* bert demo
* add some comments
* support multiple outputs in ONNX model
* code polishing
* encoding issue on Windows platform.
2022-04-20 16:14:46 -07:00
Wenbing Li
fb378b72a0
Update ONNX version information in extensions ( #222 )
...
* upgrade the onnx versions
* onnx model op version
2022-04-13 10:02:25 -07:00
Wenbing Li
bcb41fcf0e
Update README.md
2022-03-29 15:49:02 -07:00
Wenbing Li
63076479a0
gpt2 end-end pre-processing script fixing with pytorch 1.11. ( #215 )
...
* gpt2 end-end pre-processing script fixing with pytorch 1.11.
* the fixings for CI failures.
* new dependency for gpt2bs.py
* remove the gpt2bs ci since it started failing in ORT 1.11
2022-03-29 14:37:50 -07:00
Wenbing Li
98fb96d4e8
update the tutorial ( #214 )
2022-03-23 14:28:18 -07:00
Qing
59d39123c7
fix typo ( #212 )
2022-03-22 14:31:37 -07:00
Wenbing Li
6eb41afcb2
fixing the security warning on blingfire. ( #209 )
2022-03-21 09:44:30 -07:00
TruscaPetre
6b3a97202b
Update README.md ( #210 )
...
Correcting an English syntactic mistake.
2022-03-20 17:09:48 -07:00
Wenbing Li
d09f988acd
fixing test on PyTorch 1.11 release ( #208 )
...
* fixing test on PyTorch 1.11 release
* fix the type error in the PyTorch 1.11
* set a fixed int64 data type for the test
2022-03-15 19:27:33 -07:00
Wenbing Li
febe63a5f0
Update README.md
2022-03-09 17:35:46 -08:00
Wenbing Li
9d5ce81ab9
update the docs for new API ( #207 )
2022-03-08 19:14:13 -08:00
Wenbing Li
4bb3a22c45
The new pre/post processing API, replacing ONNXCompose ( #205 )
...
* traced processing module
* before debugging.
* updates
* temporary
* the trace mode pass
* code adjusting for ci pipeline.
* only torch 1.11 support prim:pythonop
* extending sequence processing module.
2022-03-08 16:32:59 -08:00
Wenbing Li
2e2ee11772
keep the Visual Studio version pinned to 2019 ( #206 )
2022-03-07 16:36:52 -08:00
Wenbing Li
b99e0c4138
Test onnx 1.11 package on MacOS ( #202 )
2022-02-23 10:13:00 -08:00
shaahji
d14abe1461
Identified bug in SentencePieceTokenizer where encoding was ( #200 )
...
not restricted to specific token. Sentencepiece.Encode itself doesn't
clear the input vector before populating the result for the input
token.
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2022-02-22 10:52:28 -08:00
Wenbing Li
bd70ea153d
Temporarily disable the onnx 1.11
...
onnx 1.11 is failing on MacOS CI pipeline.
2022-02-17 14:29:47 -08:00
Wenbing Li
5bf46560a6
Update gpt2bs.py
2022-02-10 11:07:18 -08:00
Wenbing Li
8eb89cdd8b
clean up the unused files ( #199 )
...
* clean up the unused files
* one more file
* update the doc as well
2022-02-09 15:18:13 -08:00
Wenbing Li
fd044872b2
cp ci pipeline
2022-02-09 10:28:45 -08:00
Wenbing Li
d0ff193eec
support the sequence tensor in pnp.ProcessingModule. ( #197 )
...
* support the sequence tensor in ProcessingModule.
* version check
* version check 2
2022-02-04 14:48:47 -08:00
Wenbing Li
459c4f7d61
Update pytorch_custom_ops_tutorial.ipynb ( #196 )
...
* Update pytorch_custom_ops_tutorial.ipynb
* Update mshost.yaml
* tidy up
2022-02-02 10:34:20 -08:00
Wenbing Li
f418557d85
Update mshost.yaml ( #195 )
...
* Update mshost.yaml
* fix the build error for Android.
* add the spm tokenizer in the selected ops list
2022-01-24 10:57:18 -08:00
Wenbing Li
a074038642
support custom function in the script mode. ( #194 )
2022-01-20 10:24:01 -08:00
Wenbing Li
0374ab0f1b
support the model joining with extra inputs. ( #192 )
2022-01-14 10:43:03 -08:00
Wenbing Li
cc858e831b
Update setup.cfg
2022-01-07 16:41:37 -08:00
Wenbing Li
03abb5130a
Refine the ONNXCompose implementation ( #191 )
...
* a fixing for the sequence tensors
* Refine the ONNXCompose implementation.
2022-01-05 16:50:45 -08:00
joburkho
a9737505ca
Joburkho/change to const char star ( #187 )
...
* Correct memory reservation.
* Change output_sentences to vector of const char*.
2021-12-01 00:01:52 -08:00
Wenbing Li
d079ba4bed
Update mshost.yaml for Azure Pipelines ( #184 )
...
* Update mshost.yaml for Azure Pipelines
* Update mshost.yaml for Azure Pipelines
* hotfix for ort beam search step
2021-11-29 13:22:23 -08:00
Wenbing Li
0b2cfe7dd7
update the docs for imagenet pnp ( #186 )
2021-11-29 10:53:15 -08:00
Wenbing Li
9bd453e0f1
initial checkins for onnxcompose ( #185 )
...
* initial checkins for onnxcompose
* update ci pipeline for the test.
* add the missing quotes
* Switch to LooseVersion for torch version.
* testif
* padding_length
* skip the gpt2
* add onnxruntime 1.9 test package.
* fix a memory bug on pyop.
2021-11-22 21:02:39 -08:00
Mojimi
dddd85397d
Add android pipeline ( #183 )
...
* build locally success
* update
* fix pipeline
* fix pipeline
* fix pipeline
* fix tool chain file
* validate quickly
* fix tool chain
* update
* finished
* fix bug
* fix bugs
* remove tree
* update android ndk
* fix bugs
* remove java install
* bring back build
* fix model
* resolve conflict
* remove unnecessary file
* remove tensorflow_text version 2.6.0
Co-authored-by: Ze Tao <zetao@microsoft.com>
2021-11-09 10:25:03 -08:00
Mojimi
abdd5b1bd8
Add MaskedFill ( #182 )
...
* add StringRemove
* update
Co-authored-by: Ze Tao <zetao@microsoft.com>
2021-11-04 09:29:17 +08:00
Mojimi
b5a8a1abd9
add test ( #180 )
...
Co-authored-by: Ze Tao <zetao@microsoft.com>
2021-11-01 10:08:09 +08:00
Mojimi
537c492219
Add tests for mapping-related operators ( #179 )
...
* init
* finish vector_to_string
* add more test
Co-authored-by: Ze Tao <zetao@microsoft.com>
2021-10-29 09:00:06 +08:00
Mojimi
c1e9fdcb08
Add test for segment extraction ( #177 )
...
* add test for ECMARegex
* add empty input test case
* fix bug
* update
Co-authored-by: Ze Tao <zetao@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2021-10-28 09:14:47 +08:00
Zuwei Zhao
05f7ded825
Add check for empty input in StringJoin operator and fix empty string input error in BlingFire sentence breaker. ( #175 )
...
* Add test cases and fix empty string error in BlingFire sentence breaker.
* Throw error if input text to join is empty array.
* Fix scalar support and access violation.
* Resolve comments.
* Resolve comments.
Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
2021-10-27 20:21:16 +08:00
Mojimi
64c972fb02
Add test for StringECMARegexReplace ( #176 )
...
* add test for ECMARegex
* add empty input test case
Co-authored-by: Ze Tao <zetao@microsoft.com>
2021-10-26 10:28:06 +08:00
Mojimi
46d096f1af
Fix ::tolower error when locale is not 'C' ( #174 )
...
* add test and implement tolower
* fix locale
* fix locale
Co-authored-by: Ze Tao <zetao@microsoft.com>
2021-10-20 20:59:29 -07:00
Mojimi
448518534c
Add native test for bert tokenizer ( #173 )
...
* add native test for bert tokenizer
* add python test
* fix unicode category
Co-authored-by: Ze Tao <zetao@microsoft.com>
2021-10-19 11:09:38 -07:00