ort-customops/operators/text/string_to_vector.cc


#include <charconv>
#include "farmhash.h"
#include "string_utils.h"
#include "string_to_vector.hpp"
#include "string_tensor.h"
StringToVectorImpl::StringToVectorImpl(std::string& map, std::string& unk) {
ParseMappingTable(map);
ParseUnkownValue(unk);
}
std::vector<std::vector<int64_t>> StringToVectorImpl::Compute(const std::vector<std::string>& str_input,
const std::vector<int64_t>& input_dim,
std::vector<int64_t>& output_dim) const {
std::vector<std::vector<int64_t>> result;
// Set output dimension
output_dim = input_dim;
output_dim.push_back(vector_len_);
// Look up each input string; fall back to the unknown-value vector on a miss.
result.reserve(str_input.size());
for (const auto& key : str_input) {
auto it = map_.find(key);
if (it != map_.end()) {
result.push_back(it->second);
} else {
result.push_back(unk_value_);
}
}
return result;
}
void StringToVectorImpl::ParseMappingTable(std::string& map) {
auto lines = SplitString(map, "\n", true);
if (lines.empty()) {
return;
}
vector_len_ = ParseVectorLen(lines[0]);
if (vector_len_ == 0) {
ORTX_CXX_API_THROW(MakeString("The mapped value of string input cannot be empty: ", lines[0]),
ORT_INVALID_ARGUMENT);
}
std::vector<int64_t> values(vector_len_);
for (auto& line : lines) {
auto kv = SplitString(line, "\t", true);
if (kv.size() != 2) {
ORTX_CXX_API_THROW(MakeString("Failed to parse mapping_table when processing the line: ", line),
ORT_INVALID_ARGUMENT);
}
ParseValues(kv[1], values);
// string to vector mapping
map_[std::string{kv[0]}] = values;
}
}
void StringToVectorImpl::ParseUnkownValue(std::string& unk) {
auto unk_strs = SplitString(unk, " ", true);
if (unk_strs.size() != vector_len_) {
ORTX_CXX_API_THROW(
MakeString("Incompatible dimension: required vector length of unknown_value should be: ", vector_len_),
ORT_INVALID_ARGUMENT);
}
for (auto& str : unk_strs) {
int64_t value = 0;
auto [end, ec] = std::from_chars(str.data(), str.data() + str.size(), value);
// Reject malformed input (error code set or trailing characters) as well as out-of-range numbers.
if (ec != std::errc() || end != str.data() + str.size()) {
ORTX_CXX_API_THROW(MakeString("Failed to parse unknown_value when processing the number: ", str),
ORT_INVALID_ARGUMENT);
}
unk_value_.push_back(value);
}
}
size_t StringToVectorImpl::ParseVectorLen(const std::string_view& line) {
auto kv = SplitString(line, "\t", true);
if (kv.size() != 2) {
ORTX_CXX_API_THROW(MakeString("Failed to parse mapping_table when processing the line: ", line),
ORT_INVALID_ARGUMENT);
}
auto value_strs = SplitString(kv[1], " ", true);
return value_strs.size();
}
void StringToVectorImpl::ParseValues(const std::string_view& v, std::vector<int64_t>& values) {
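std::vector<std::string_view> value_strs = SplitString(v, " ", true);
// Each row must supply exactly as many numbers as the output buffer holds;
// otherwise values[i] would write out of bounds or keep stale entries from the previous row.
if (value_strs.size() != values.size()) {
ORTX_CXX_API_THROW(MakeString("Failed to parse map: unexpected number of values in: ", v),
ORT_INVALID_ARGUMENT);
}
for (size_t i = 0; i < value_strs.size(); i++) {
const char* first = value_strs[i].data();
const char* last = first + value_strs[i].size();
int64_t value = 0;
auto [end, ec] = std::from_chars(first, last, value);
// Reject malformed input (error code set or trailing characters) as well as out-of-range numbers.
if (ec != std::errc() || end != last) {
ORTX_CXX_API_THROW(MakeString("Failed to parse map when processing the number: ", value_strs[i]),
ORT_INVALID_ARGUMENT);
}
values[i] = value;
}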
}
OrtStatusPtr KernelStringToVector::OnModelAttach(const OrtApi& api, const OrtKernelInfo& info) {
std::string map, unk;
auto status = OrtW::GetOpAttribute(info, "map", map);
if (!status) {
status = OrtW::GetOpAttribute(info, "unk", unk);
}
if (!status) {
impl_ = std::make_shared<StringToVectorImpl>(map, unk);
}
return status;
}
OrtStatusPtr KernelStringToVector::Compute(const ortc::Tensor<std::string>& input,
ortc::Tensor<int64_t>& out) const {
// Setup input
auto& input_data = input.Data();
// Map each input string to its vector; Compute also fills in output_dim
std::vector<int64_t> output_dim;
auto mapping_result = impl_->Compute(input_data, input.Shape(), output_dim);
auto* output_data = out.Allocate(output_dim);
// Copy the nested mapping results into the flat output buffer, row by row
size_t idx = 0;
for (const auto& res : mapping_result) {
  for (int64_t value : res) {
    output_data[idx++] = value;
  }
}
return nullptr;
}