* Fix the issue #1033
fix the issue #1033 "converting to ‘const std::unordered_set<std::basic_string<char> >’ from initializer"
* Fix the issue #1033
fix the issue #1033 "converting to ‘const std::unordered_set<std::basic_string<char> >’ from initializer".
* Fixed a g++ explicit constructor compatibility error for unordered_set.
* Change std::unordered_set<std::basic_string<char>>() to
std::unordered_set<std::string>().
* Update DMLC core to the most recent version
* Modify runtime to minimize I/O when building for SGX
* Add SGX example app
* Prefer streaming versions of the packed_func functions
* Make python BuildConfig serializable/deserializable to/from string
* Make C++ BuildConfig serializable/deserializable to/from string
* Revert "Make python BuildConfig serializable/deserializable to/from string"
This reverts commit a5e1fb3ff63a161cc0d63475d2a32816cc4c3666.
* Revert "Make C++ BuildConfig serializable/deserializable to/from string"
This reverts commit ec0c2c54543050fe6f264d06eebff33dee70370b.
* Converted BuildConfig to use TVM node system
* Fix lint
* Fix lint
* Added code to set node attributes through the C API
* Fixed bug in build_config()
* Fix lint
* Fix lint
* Fix test errors
* Reduced scope of node __setattr__ to apply only to BuildConfig
* Fix lint
* Fix lint
* Changed python BuildConfig to be immutable, with values set once on construction.
* Fix lint
* Fix C++ test
* Fixed BuildConfig setting python-side args
* Fix lint
* Removed dependency on reflection.cc to construct BuildConfig (allow use in runtime library)
* Fix lint
* Revert "Fix lint"
This reverts commit 16ed6d7a1ca5e551b035bad46e8361ea487cd45b.
* Revert "Removed dependency on reflection.cc to construct BuildConfig (allow use in runtime library)"
This reverts commit 43817c97a2ee045791e0c031d962fa97636ce8f6.
* Avoid accessing BuildConfig when using runtime lib
* Fix missing import
* Fix error running under Cython (root cause: the node handle is not valid until after __init__ has returned, so __dir__ cannot be called during __init__)
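The failure mode above can be sketched in plain Python (class names are hypothetical, not TVM's actual FFI classes): `__dir__` depends on a handle that only becomes valid at the end of `__init__`, so attribute enumeration must be deferred until construction has finished.

```python
# Hypothetical sketch of the Cython issue: __dir__ depends on a handle
# that is only assigned at the end of __init__, so introspecting the
# object during construction sees an invalid handle and fails.

class NodeBase:
    def __init__(self):
        self.handle = None  # handle not yet valid

    def __dir__(self):
        if self.handle is None:
            raise RuntimeError("node handle is not valid yet")
        return list(self._attrs)

class BuildConfigLike(NodeBase):
    def __init__(self, **kwargs):
        super().__init__()
        # BAD: calling dir(self) here would raise, because the handle
        # is only assigned below, once all attributes are known.
        self._attrs = dict(kwargs)
        self.handle = object()  # handle becomes valid only now

cfg = BuildConfigLike(opt_level=2)
print(sorted(dir(cfg)))  # safe: __init__ has returned, handle is valid
```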
* Fix error where BuildConfig._node_defaults was not copied in build_config()
* Fix lint
* Fix lint
* Fix lint
* Fix lint
* Add comments to python BuildConfig
* Ported injective schedules to C++. Added some elementwise ops.
* Fix lint errors
* Added reduction ops and schedules
* Fix lint errors
* Fix lint errors
* Fix lint errors
* Added transform ops
* Fix lint errors
* Fix lint errors
* Added softmax, log_softmax, leaky_relu and flatten ops.
Fixed issue where TVM_DECLARE_INTRIN_UNARY used the PureExtern flag
instead of PureIntrinsic.
Added softmax CUDA schedule.
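For reference, the definitions the new softmax/log_softmax ops compute can be sketched in plain Python (a numerically stable reference implementation, not the TOPI code):

```python
import math

def softmax(xs):
    # subtract the max for numerical stability before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def log_softmax(xs):
    # log-sum-exp trick: log_softmax(x) = x - (m + log(sum(exp(x - m))))
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]
```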
* Fix lint
* Fix lint
* Added binary_dense, batch_norm_inference, dense, dilate, scale_shift_*,
global_pool and pool ops.
Extended pad to allow specifying pad_value.
Fixed issue where pad would throw if padding was zero in all dimensions.
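The pad semantics above, including the all-zero-padding corner case, can be sketched in one dimension (plain-Python reference only; the real op is an N-dimensional TOPI compute):

```python
def pad(data, pad_before, pad_after, pad_value=0):
    # pad_before/pad_after give the number of elements added on each
    # side, filled with pad_value.
    if pad_before == 0 and pad_after == 0:
        return list(data)  # zero padding in all dimensions is a no-op
    return [pad_value] * pad_before + list(data) + [pad_value] * pad_after

print(pad([1, 2, 3], 1, 2, pad_value=9))  # [9, 1, 2, 3, 9, 9]
```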
* Fix lint
* Fix lint
* Added CUDA schedules for dense, pool and global_pool
* Added extern schedules for generic and CUDA
* Fix lint
* Added x86 binary schedules
* Fix lint
* Added rocm dense schedule. Added rocBLAS and cuBLAS support to dense ops
* Added pow ops. Added x86 default and injective schedules
* Fix lint
* Fix lint
* Fix lint
* Fix lint
* Fix lint
* Fix indent
* Removed schedules directory
* Changed left_shift, right_shift to operators. Changed pad_value in pad() to remove pointer usage
* Fixed usage of pad in nn/pooling.h. Fixed declaration of operator>>
* Fixed comments for shift operators
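The operator change above follows the usual expression-wrapper pattern; a hypothetical Python analogue of overloading `<<`/`>>` on an expression type (TVM's C++ exprs overload these operators similarly):

```python
# Hypothetical sketch: instead of calling left_shift(a, b) and
# right_shift(a, b), a small expression wrapper overloads << and >>
# so shifts compose like ordinary arithmetic.

class Expr:
    def __init__(self, value):
        self.value = value

    def __lshift__(self, other):   # a << b
        return Expr(self.value << other.value)

    def __rshift__(self, other):   # a >> b
        return Expr(self.value >> other.value)

a, b = Expr(4), Expr(1)
print((a << b).value)  # 8
print((a >> b).value)  # 2
```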
* Added comments to utility functions
* Added TOPI C++ library, exporting broadcast_add op
* Fix lint
* Share libinfo.py with TVM
* Fix lint
* Add other broadcast ops
* Fix lint
* Fix imports in topi
* Fix lib names
* Fixed build issue where windows builds don't apply correct definitions
* Removed TVM_EXPORTS from topi library
* Attempted CI build fix
* Add topi lib to tvm_multilib
* Fix Jenkinsfile
* Added TOPI build target to Makefile
* Fix nn op namespaces.
* Fix lint
* Renamed TOPI lib to libtvm_topi
* Removed _ffi/base.py
* Remove _ffi from topi, now shared with tvm.
* Make libtvm_topi loading optional
* Fix compiler warnings
* Fix lint
* Fix lint
* Fix lint
* Fix build error by making new libs argument to Target optional
* Added C++ Target type interop. Added registration of remaining C++ ops and schedules. Added test of broadcast ops
* Fix lint
* Fix lint
* Fix compile error
* Fix compiler warnings
* Fix compiler warnings
* Fixed int vector interop. Fixed argmin incorrectly invoking argmax. Fixed corner case in default schedules of attempting to fuse 0 length axes. Added tests for reduce ops.
* Refactored reduce builders
* Fixed typos in topi.cc. Added basic test.
* Fixed padding size error. Added dense, dilate, pooling tests
* Fixed issue where clip would output a different dtype from the input. Added split_sections op to cover the other mode of the python split op. Added tests.
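The two split modes mentioned above can be sketched in plain Python (helper names are hypothetical): splitting at explicit indices versus splitting into a number of equal sections.

```python
def split_by_indices(xs, indices):
    # split at explicit positions, like the python split op's list mode
    out, prev = [], 0
    for i in indices:
        out.append(xs[prev:i])
        prev = i
    out.append(xs[prev:])
    return out

def split_sections(xs, num_sections):
    # split into num_sections equal parts; length must divide evenly
    n = len(xs)
    assert n % num_sections == 0, "length must be divisible"
    step = n // num_sections
    return [xs[i * step:(i + 1) * step] for i in range(num_sections)]
```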
* Changed extension type numbers to avoid clash with NNVM
* Fix lint
* Fix compiler warnings
* Removed use of std::vector from the public TOPI API
* Fix lint
* Add TOPI C++ tests to CI
* Fixed detail namespacing. Improved comments.
* When there is no intrinsic function, use the body for initialization (for issue #714).
* Refine code per review comments, and add a test case.
* Fix lint issues.
* Re-organize the tensorize test cases, and add a new case for no-reset
mode.
* Fix a typo.
* Delete the unit test case, since it has already been merged into test_schedule_tensorize.py.
* always use the new tensor in its stage when rewriting for cache read
* revert previous changes to sync up with master
* support using the ptr with an original offset
* update test case and fix CI error
* [SCHEDULE] Enable partitioning const loops with a build flag (#719)
* enable partition loop with build flag
* add a testcase, and modify LoopPartition related cases
* add documentation for split_const_loop
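Conceptually, partitioning a const loop removes the per-iteration boundary check from the hot path; a plain-Python sketch of the before/after loop shapes (illustrative only, not the LoopPartition pass):

```python
def copy_with_guard(src, dst, n, tile=4):
    # unpartitioned: every element carries a boundary check
    for i in range(0, n, tile):
        for j in range(tile):
            if i + j < n:          # guard evaluated on every iteration
                dst[i + j] = src[i + j]

def copy_partitioned(src, dst, n, tile=4):
    # partitioned: a guard-free main loop plus a small tail loop
    main = (n // tile) * tile
    for i in range(0, main, tile):  # bound is provably satisfied here
        for j in range(tile):
            dst[i + j] = src[i + j]
    for k in range(main, n):        # tail handles the remainder
        dst[k] = src[k]
```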
* Port build_module.py to C++
* Fix lint errors
* Fix more lint errors
* Fix more lint errors
* Fix more lint errors
* Fix build error
* Implemented style fixes
* Fix lint errors
* Added function to construct target from string
lower now returns array
* Fix lint error
* Implemented review changes: style fixes, and Target options now use std::vector
* Fixed lint, argument alignment and added unit test
* Changed test to target LLVM, fixed sign compare warnings
* Reverted unit test to CUDA, changed Jenkinsfile to enable GPU for C++ tests
* Slight change to Jenkinsfile
* Changed build_module test from CUDA to LLVM
* Added function var() to construct a Var instance.
Changed implementation of LLVMEnabled()
* Reverted Jenkinsfile
* [WIP] C++ topi contributions
Summary:
This diff implements C++ topi contributions for:
- relu with parametric threshold
- pad with generic padBefore / padAfter specification
- matmult with transposes
- conv2d_nchw, conv2d_hwcn with runtime constant padding and strides
- depthwise_conv2d_nchw with runtime constant padding and strides
- group_conv2d_ngchw with runtime constant padding and strides
- broadcast_to a broadcastable shape
- broadcast_bop where bop is a usual binary op (+ - * / %)
Convolution padding is implemented using the pad operation.
To avoid extra memory consumption, it is generally recommended to inline the padding with the autoinliner.
Unfortunately, in its current form, the elemwise checks are too restrictive to allow inlining.
So this diff also proposes an extension to LHS-injective ops (i.e. no reduction axis in the current IR design).
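As a plain-Python reference for the padding approach described above (illustrative only: single channel, stride 1; the real ops are C++ TOPI compute definitions, and the pad stage is what a scheduler may later inline):

```python
def pad2d(x, p, value=0.0):
    # materialize a padded copy; with scheduling, this stage would be
    # inlined into the convolution to avoid the extra buffer
    h, w = len(x), len(x[0])
    out = [[value] * (w + 2 * p) for _ in range(h + 2 * p)]
    for i in range(h):
        for j in range(w):
            out[i + p][j + p] = x[i][j]
    return out

def conv2d(x, k, padding=0):
    # convolution padding implemented via the separate pad stage
    xp = pad2d(x, padding)
    kh, kw = len(k), len(k[0])
    oh = len(xp) - kh + 1
    ow = len(xp[0]) - kw + 1
    return [[sum(xp[i + di][j + dj] * k[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(ow)] for i in range(oh)]
```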
Test Plan:
Tested in C++ testsuite in a separate repository, I am looking for suggestions to quickly spin up some tests for tvm.
Reviewers: tqchen
* Review + Lint + GSG C++
* Typofix.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Probe for nvrtc in lib directory as well.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Conda build recipe for TVM.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* prefetch interface added
* prefetch python comments modified. prefetch info data structure maintained.
* start injecting prefetches. first step (domain touch) implemented.
* domain touch tested.
* Prefetch ir_mutator and ir_visitor dispatch registered.
* modify domain touched from passing a func_ref to passing a tensor
* modify domain touched from passing a func_ref to passing a tensor
* modify Tensor copy to Tensor ref
* temp commit for rebase
* debug info removed, typo fixed, ready to rebase
* prefetch flatten test added
* roll back builtin functions to side effect functions
* lint error fixed!
* add cache line size to storage flatten argument
* add forgotten modifications
* change code style to dmlc-like; get rid of can_prove, use manual computation instead
* python lint error fixed
* modify intrinsic name to pass tests
* [TEST] get rid of str(), replace them by accessing attributes
* change map to list comprehension
* redundant numpy import removed
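The "domain touched" step above can be illustrated with a simplified interval computation (hypothetical helper; the real analysis handles general affine accesses over multi-dimensional tensors):

```python
# For accesses a[i + c] inside a loop i in [0, extent), the region of a
# touched by the loop is the interval [min(c), extent - 1 + max(c)].
# A prefetch for the next iteration can then target that region.

def domain_touched(extent, offsets):
    lo = 0 + min(offsets)
    hi = (extent - 1) + max(offsets)
    return (lo, hi)

print(domain_touched(8, [0, 1, 4]))  # (0, 11)
```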
* Support for batch ComputeOp
* Support for batch ComputeOp
* Fix CrossThreadReduction
* Fix lint
* Add UpdateArray, remove support for batch reduce
* Tuple input support for reduce
* rfactor works with multiple reducers; support multiple reducers with different result types
* Small fix
* Small fix
* Change return type of rfactor to Array<Expr>
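A plain-Python sketch of what tuple-input reduction enables (illustrative, not the TVM API): argmax carries (index, value) pairs through a commutative combiner, producing two outputs of different types, which is also why rfactor returns an array of results.

```python
def argmax_combine(a, b):
    # a, b are (index, value) pairs; keep the pair with the larger value
    return a if a[1] >= b[1] else b

def argmax_reduce(values):
    # identity element of the reducer: invalid index, -inf value
    acc = (-1, float("-inf"))
    for pair in enumerate(values):
        acc = argmax_combine(acc, pair)
    return acc  # (index, value): two results with different types
```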
* Fix lint
* Improve
* Add tutorial
* Improve tutorial
* Improve tutorial
* [LANG] CommReducer
* Reorganize c_api
* Remove InitValue and Combine; refactor Functor
* Make CommReducer an Expr
* Make comm_reducer type independent
* Make CommReducerNode a Node
* Small fix
* Refine
* Refine front-end API; add integration test cases for min/max
* Fix python
* Refine
* Fix lint and add example
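The comm_reducer idea above can be sketched in plain Python (conceptual only; TVM's comm_reducer builds IR nodes, not Python closures): a reducer is a commutative combine function plus its identity element, and min, max, and sum are all instances.

```python
def comm_reducer(fcombine, fidentity):
    # build a reduction from a commutative combiner and its identity
    def reduce(values):
        acc = fidentity()
        for v in values:
            acc = fcombine(acc, v)
        return acc
    return reduce

min_reducer = comm_reducer(lambda x, y: x if x < y else y,
                           lambda: float("inf"))
max_reducer = comm_reducer(lambda x, y: x if x > y else y,
                           lambda: float("-inf"))
sum_reducer = comm_reducer(lambda x, y: x + y, lambda: 0)
```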