the original rb_wasm_setjmp implementation always unwinds to the root
call frame to have setjmp compatible interface, and simulate sjlj's
undefined behavior. Therefore, every vm_exec call unwinds to main, and
a deep call stack makes setjmp call very expensive. The following
snippet from optcarrot takes 5s even though it takes less than 0.3s on
native.
```
[0x0, 0x4, 0x8, 0xc].map do |attr|
(0..7).map do |j|
(0...0x10000).map do |i|
clr = i[15 - j] * 2 + i[7 - j]
clr != 0 ? attr | clr : 0
end
end
end
```
This patch adds a WASI specialized vm_exec which uses lightweight
try-catch API without unwinding to the root frame. After this patch, the
above snippet takes only 0.5s.
configure.ac: setup build tools and register objects
main.c: wrap main with rb_wasm_rt_start to handle asyncify unwinds
tool/m4/ruby_wasm_tools.m4: setup default command based on WASI_SDK_PATH
environment variable. checks wasm-opt which is used for asyncify.
tool/wasm-clangw wasm/wasm-opt: a clang wrapper which replaces real
wasm-opt with do-nothing wasm-opt to avoid misoptimization before
asyncify. asyncify is performed at POSTLINK, but clang linker driver
tries to run optimization by wasm-opt unconditionally. inlining pass
at wasm level breaks asyncify's assumption, so should not optimize
before POSTLIK.
wasm/GNUmakefile.in: wasm specific rules to compile objects