Because we want to flush pending stale units before unloading units, the
pending_stale_p check is implemented in this waiting loop.
However, once all methods are called more than --jit-min-calls,
mjit_worker_wakeup will not be signaled again. As a result, when
mjit_recompile is called after that and pending_stale_p becomes true,
MJIT stops processing methods in the unit queue even if the queue is
very long and MJIT does nothing, waiting for the signal.
There should be a better way to handle this, but as a fix to be
backported to Ruby 3.0, let me make an obvious simple commit here.
556a728508 was not good maybe because it
wasn't using list_for_each_safe. If list_for_each_safe is safe for
list_del for any nodes (is that true?), this should be fine.
Revert "Lock the entire active_units loop"
This reverts commit 5c2ff88be2.
Revert "Lock active_units references on compaction"
This reverts commit 556a728508.
Revert "Wait for GC before unload_units"
This reverts commit a8f16df615.
Well, the previous revert actually didn't fix it, but this series of
reverts seems to rollback the situation a little.
The compilation for JIT compaction is very heavy. Triggering a second
compaction to include one more new method is probably not worth it. So
this triggers JIT compaction for ten more new methods after each
compaction.
to avoid "Too many JIT code, but skipped unloading units for JIT compaction".
Now we can forget the `in_compact` locking.
Moving some functions from mjit.c to mjit_worker.c because mjit_worker.c
should have functions executed in the JIT worker.
convert_unit_to_func's c_func / so_func construction is unnecessarily
complicated while it's not really safer than what compact_all_jit_code
does. So I changed convert_unit_to_func to be consistent with
compact_all_jit_code.
This has been a TODO since 79df14c04b. While adcf0316d1 covered the
root_fiber of the initial thread, it didn't cover root_fibers of other
threads. Now it's hooked properly in rb_threadptr_root_fiber_setup.
With regards to "XXX: Is this mjit_cont `mjit_cont_free`d?", when
rb_threadptr_root_fiber_release is called, although I'm not sure when
th->root_fiber is truthy, fiber_free seems to call cont_free and
mjit_cont_free. So mjit_conts of root_fibers seem to be freed properly.
_MSC_VER used to be the macro to switch JIT compaction. However, since
d4381d2ceb, the correct macro to switch it was changed from _MSC_VER
to _WIN32. As I didn't properly replace all relevant _MSC_VER usages
to _WIN32, these macros have been used inconsistently.
nobu replaced _WIN32 with USE_HEADER_TRANSFORMATION in 5eb446d12f.
Therefore we had USE_HEADER_TRANSFORMATION and _MSC_VER. This commit
makes sure such inconsistent _MSC_VER usages will be unified to the new
header, also renaming it to USE_JIT_COMPACTION to be more precise about
the requirements. The header transformation itself is not quite relevant
to places changed in this commit.
Isn't setting `in_compact = true` enough to avoid a race condition
between JIT compaction and unload_units? Now I think it is.
This change will make it easier to spend more time on compile_compact_jit_code.
For now it seems to take only 0.0723ms though.
This reverts commit 6cb6d5abc3.
This reverts commit 1484b786ae.
I think we don't need these assertions anymore. I believe the problem
is solved by abf678a439
We are seeing an error where code that is generated with MJIT contains
references to objects that have been moved. I believe this is due to a
race condition in the compaction function.
`gc_compact` has two steps:
1. Run a full GC to pin objects
2. Compact / update references
Step one is executed with `garbage_collect`. `garbage_collect` calls
`gc_enter` / `gc_exit`, these functions acquire a JIT lock and release a
JIT lock. So a lock is held for the duration of step 1.
Step two is executed by `gc_compact_after_gc`. It also holds a JIT
lock.
I believe the problem is that the JIT is free to execute between step 1
and step 2. It copies call cache values, but doesn't pin them when it
copies them. So the compactor thinks it's OK to move the call cache
even though it is not safe.
We need to hold a lock for the duration of `garbage_collect` *and*
`gc_compact_after_gc`. This patch introduces a lock level which
increments and decrements. The compaction function can increment and
decrement the lock level and prevent MJIT from executing during both
steps.
This is a temporary commit to try to find a GC issue. It seems like
mjit is pointing at a moved address in the call cache. I want to assert
that they aren't TMOVED or garbage objects at the time they get copied
MinGW test_jit fails with no error message. Perhaps linker flags should
not be passed when compilation is happening.
Anyway splitting these stages doesn't matter for performance. So let me
just split it to fix the issue. Probably this helps Solaris's issue too.
It's to avoid memory leak for actual usage (because they don't get
unloaded properly), but also for fixing CI timed out due to JIT
compaction taking too long time on --jit-wait (which runs every time)
http://ci.rvm.jp/results/trunk-mjit-wait@silicon-docker/2911601