* Change WASM_COMPATIBLE_SOURCE=OFF by default
The default was WASM_COMPATIBLE_SOURCE=ON COMPILE_WASM=OFF which is a
testing configuration, not a sensible default for native or wasm.
* Always USE_WASM_COMPATIBLE_SOURCE with COMPILE_WASM
* Set CMP0077 to fix variable handling
* first attempt to enable vocabs pass as byte arrays
* pass vocabs bytes as AlignedMemory
* add vocabIndices to avoid double loading
* small fix on parameter names and documentation
* fix windows build plus tiny update on documentation
* update marian-dev submodule
* move validate model bytearray in BatchTranslator
* small refactors on validateBinaryModel()
* switch vocab memories to std::vector<marian::Ptr<AlignedMemory>>
* update marian-dev submodule
* replace marian::Ptr to std::shared_ptr for vocab memories
* add note for vocab memories
* First draft of faithful translation
* Comments explaining pre and post
* Comments on response_builder
* Updating bergamot-translator-tests with new outputs
* Cosmetic changes in response target text construction
* Replacing &(x[0]) -> x.data() to avoid illegal indices
* Removing nullptr given both branches init pointer with legal values
* pre, post -> gap(i) addressing review comments
The former pre and post functions are subsumed by gap(i), and the
algorithm in ResponseBuilder is adjusted to fit.
`x = nullptr` is back, should be harmless.
* Updating brt with paragraph outputs
* Bumping brt with updated outputs, buffer text at begin as well
* Bumping BRT with sync after bytearray collapse merge
* Pointing BRT to main after merge
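
The `&(x[0])` -> `x.data()` replacement above matters mostly for empty buffers. A minimal sketch of the difference, assuming a plain `std::vector<char>`; the helper name `countByte` is hypothetical and not part of this codebase:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only): counts occurrences of `needle`
// by walking the raw byte range of `buffer`.
inline std::size_t countByte(const std::vector<char>& buffer, char needle) {
  // buffer.data() is well defined even when buffer.empty().
  // &(buffer[0]) would index element 0 of an empty vector, which is
  // undefined behaviour.
  const char* begin = buffer.data();
  const char* end = begin + buffer.size();  // adding 0 to the pointer is fine
  assert(begin <= end);
  return static_cast<std::size_t>(std::count(begin, end, needle));
}
```

With an empty vector the range is empty and the function returns 0 without touching any element.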
Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
* Adding bytearray option
* collapse intermediate for bytearray apps
* Removing service-cli-bytearray
* Removing the bergamot bytearray app
* Bumping updates to brt collapsing apps
* Reasonable defaults and hard check when cmd enabled
* Update documentation for flags
* Bump brt with MKL check and skip
* Bumping BRT with MKL_FOUND instead of USE_MKL
* Bumping BRT with no mkl enforce
* Bumping BRT with ssse3 output
* Let's try disabling OpenBLAS
* Trying to disable apple accelerate
* Using WASM compatible BLAS can enable intgemm
* Adding a CMake -L to see what exactly is the diff
* Revert "Let's try disabling OpenBLAS"
This reverts commit 9a6b9bc53bf7dec956889f6e0b7047e5388e1b7e.
* Revert "Using WASM compatible BLAS can enable intgemm"
This reverts commit 936a592e18431c279e6c5952a278d012d18ff295.
* Restricting mac tests through tags and on GitHub CI
* Using only check-bytearray
* Bumping BRT with change of default behaviour
* Improved script that patches wasm artifacts to enable wormhole
- Made the regex pattern tolerate multiple whitespace characters between
the words of the matching pattern
* Fix for loading EN->DE vocabularies in wasm test page
- Loading vocabularies for EN->DE was failing because of
the new structure of bergamot-models
* Safe transfer of bindings through typedefs
* Removing Translation* files and bringing in counterparts
* Remove previously commented out code
* Removing commented out include
* Absorb Translation* documentation
Co-authored-by: abhi-agg <66322306+abhi-agg@users.noreply.github.com>
* Update marian-dev to the newest mac version
* Attempt windows workflow
* force workflow rerun
* Separate id
* Attempt 3 at github action
* Marian dev submodule now compiles with apple clang
* Updated ssplit version to something more recent
* Attempt to fix compile on wasm
* Do not compile subproject tests
* Fix emscripten compilation on Mac
* 99% on the way to windows compile
* Try with a different generator
* Build release not debug
* Revert CMakeLists.txt hacks
* Fix sse2 compilation failure
* MSVC settings for WIN32
* Add nodefaultlib LIBCMT
* Do not compile ssplit.cpp as it contains sys/mman.h
* Revert ab56b9aa4f4360b0ab98d5806658d4302f31db1d
* Update paths
* Set the build type to release if not set previously
* Attempt to build release with the windows workflow
* Attempt 5 at Visual Studio release build
* Attempt 6 at getting release build on MSVC generator
* The windows build is debug at the moment...
* fix ssplit for ubuntu 16.04
* Fix compilation with clang
* Compile on ubuntu16.04
* Explain what is going on
* Updated ssplit and workflow
* Bindings to load model and shortlist files as bytes
* Modified wasm test page for byte based loading of files
* Updates wasm README for byte loading based usage of TranslationModel
* Control validating the config options via a boolean flag
- parseOptions() function now validates the parsed options
based on the validate argument
* Minor syntactic fix
- AlignedMemory is AlignedVector<char> now instead of AlignedVector<const void*>
- This solves the issue of allocating 8x of the actual required memory for
loading files as bytes
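
The 8x figure above is consistent with the element size on a 64-bit target: a container sized by byte count but holding pointer-sized elements reserves `sizeof(void*)` bytes per entry. A rough sketch, using `std::vector` as a stand-in for the aligned container; `bytesAllocated` is a hypothetical helper, not a function in this repository:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only): number of bytes occupied by a
// container of `elementCount` elements of type T. std::vector stands in for
// the project's aligned container here.
template <typename T>
std::size_t bytesAllocated(std::size_t elementCount) {
  assert(elementCount > 0);
  std::vector<T> storage(elementCount);
  return storage.size() * sizeof(T);
}
```

For a 1024-byte file, `bytesAllocated<char>(1024)` is 1024, while `bytesAllocated<const void*>(1024)` is `1024 * sizeof(void*)`, which is 8192 where pointers are 8 bytes wide.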
Adds regression-tests to the workflow for native minimal/custom marian and full builds.
Co-authored-by: abhi-agg <66322306+abhi-agg@users.noreply.github.com>
* Changing Annotation to adhere to [begin, end)
* Stronger unit tests on sentences + num words, num sentences
* Hotfix with empty string view from EOS
* No more absolving empty-sentence; Added tests now defined behaviour
* Uncommenting important section in unit test
* Ensure empty string view default, beginning at end so marker points
* Further strengthen and comment unit-tests, mark exactly where empty sentence is happening
* Review comments: Dummy sentence + docs
- What should be a simple fast accessor is turning into compute.
Normally the way to deal with this, for better or worse, is to put 0 at
the beginning of sentenceEndIds_.
- Indices into what? Mentioned to be flatByteRanges_.
* Documentation updates
* More changes to docs
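
The review comments above (a leading 0 in sentenceEndIds_, indices pointing into flatByteRanges_, ranges following [begin, end)) could be sketched roughly as below. Only the two member names come from the commit messages; `AnnotationSketch`, `ByteRange`, and every method name are hypothetical and do not claim to match the real class:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open byte range [begin, end) into some source text.
struct ByteRange {
  std::size_t begin;
  std::size_t end;
  std::size_t size() const { return end - begin; }
};

// Hypothetical sketch: word ranges for all sentences are stored flat, and
// sentenceEndIds_ holds indices into flatByteRanges_.
class AnnotationSketch {
 public:
  // The leading 0 is a sentinel, so sentence s always spans
  // [sentenceEndIds_[s], sentenceEndIds_[s + 1]) with no special case for s == 0.
  AnnotationSketch() { sentenceEndIds_.push_back(0); }

  void addSentence(const std::vector<ByteRange>& words) {
    flatByteRanges_.insert(flatByteRanges_.end(), words.begin(), words.end());
    sentenceEndIds_.push_back(flatByteRanges_.size());
  }

  std::size_t numSentences() const { return sentenceEndIds_.size() - 1; }

  // An empty sentence simply yields 0 here.
  std::size_t numWords(std::size_t s) const {
    assert(s < numSentences());
    return sentenceEndIds_[s + 1] - sentenceEndIds_[s];
  }

  ByteRange word(std::size_t s, std::size_t w) const {
    assert(w < numWords(s));
    return flatByteRanges_[sentenceEndIds_[s] + w];
  }

 private:
  std::vector<ByteRange> flatByteRanges_;
  std::vector<std::size_t> sentenceEndIds_;
};
```

With the sentinel in place, each accessor is a subtraction or a single index lookup rather than a branch on whether the sentence is the first one.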
Co-authored-by: abhi-agg <66322306+abhi-agg@users.noreply.github.com>
* Removes vocabs and propagates fixes for breaks
* Prettify diff: Undoing comment shuffles due to merge conflict edits
* 20% of time actual work, 80% prettifying diff
* Histories members -> poof!
We still have Histories in the constructor, however, which we will remove
soon.
Co-authored-by: Kenneth Heafield <kpu@users.noreply.github.com>