This commit adds `free_empty_pages` which frees all empty heap pages and
moves the number of pages freed to the allocatable pages counter. This
is used in Process.warmup to improve performance because page
invalidation from copy-on-write is slower than allocating a new page.
This both save time for when it will be eventually needed,
and avoid mutating heap pages after a potential fork.
Instrumenting some large Rails app, I've witnessed up to
58% of String instances having their coderange still unknown.
[Feature #18885]
For now, the optimizations performed are:
- Run a major GC
- Compact the heap
- Promote all surviving objects to oldgen
Other optimizations may follow.
* Add rb_io_path and rb_io_open_descriptor.
* Use rb_io_open_descriptor to create PTY objects
* Rename FMODE_PREP -> FMODE_EXTERNAL and expose it
FMODE_PREP I believe refers to the concept of a "pre-prepared" file, but
FMODE_EXTERNAL is clearer about what the file descriptor represents and
aligns with language in the IO::Buffer module.
* Ensure that rb_io_open_descriptor closes the FD if it fails
If FMODE_EXTERNAL is not set, then it's guaranteed that Ruby will be
responsible for closing your file, eventually, if you pass it to
rb_io_open_descriptor, even if it raises an exception.
* Rename IS_EXTERNAL_FD -> RUBY_IO_EXTERNAL_P
* Expose `rb_io_closed_p`.
* Add `rb_io_mode` to get IO mode.
---------
Co-authored-by: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
[Feature #18885]
For now, the optimizations performed are:
- Run a major GC
- Compact the heap
- Promote all surviving objects to oldgen
Other optimizations may follow.
[Feature #19443]
Until recently most libc would cache `getpid()` so this was a cheap check to make.
However as of glibc version 2.25 the PID cache is removed and calls to getpid() always
invoke the actual system call which significantly degrades the performance of existing applications.
The reason glibc removed the cache is that some libraries were bypassing fork(2)
by issuing system calls themselves, causing stale cache issues.
That isn't a concern for Ruby as bypassing MRI's primitive for forking would
render the VM unusable, so we can safely cache the PID.
rb_pid_t is 32 bits on some platforms, which will cause a warning on GCC
due to POSFIXABLE always returning true.
include/ruby/internal/arithmetic/fixnum.h:43:31: warning: comparison is always true due to limited range of data type [-Wtype-limits]
[Feature #19443]
It's not uncommon for database client and similar network libraries
to protect themselves from Process.fork by regularly checking Process.pid
Until recently most libc would cache `getpid()` so this was a cheap
check to make.
However as of glibc version 2.25 the PID cache is removed and calls to
`getpid()` always invoke the actual system call which significantly degrades
the performance of existing applications.
The reason glibc removed the cache is that some libraries were bypassing
`fork(2)` by issuing system calls themselves, causing stale cache issues.
That isn't a concern for Ruby as bypassing MRI's primitive for forking
would render the VM unusable, so we can safely cache the PID.
* Revert "Remove special handling of `SIGCHLD`. (#7482)"
This reverts commit 44a0711eab.
* Revert "Remove prototypes for functions that are no longer used. (#7497)"
This reverts commit 4dce12bead.
* Revert "Remove SIGCHLD `waidpid`. (#7476)"
This reverts commit 1658e7d966.
* Fix change to rjit variable name.
The last part of the sentence was accidentally put in enumeration, It
made an impression that it's one of the rules, not the result of
applying the rules.
Replacing `...` with `*pids` seems to clarify the expected variadic arguments.
Note that the expected arguments are two or more with a signal and pids.
That is, the method must have at least one pid, which cannot be omitted:
```console
% ruby -e 'Process.kill(0)'
-e:1:in `kill': wrong number of arguments (given 1, expected 2+) (ArgumentError)
from -e:1:in `<main>'
```
The forked child process is a grandchild process from the viewpoint of
the process which invoked the caller process. That means the child is
detached at that point, and it does not need to fork twice.
First, rb_mjit_fork should call rb_thread_atfork to stop threads after
fork in the child process. Unfortunately, we cannot use rb_fork_ruby to
prevent this kind of mistakes because MJIT needs special handling of
waiting_pid and mjit_pause/resume.
Second, mjit_waitpid_finished should be checked regardless of
trap_interrupt. It doesn't seem like the flag is not set when SIGCHLD is
handled for an MJIT child process.
The argument of `rb_syswait` is now `rb_pid_t` which may differ from
`int`. Also it is an undefined behavior to take the result of casted
void function (in `rb_protect`).
As discussed in [Bug #18911], I'm adding some documentation to
`Process._fork` to clarify that it is not expected to cover
calls to `Process.daemon`.
[Bug #18911]: https://bugs.ruby-lang.org/issues/18911
Co-authored-by: Yusuke Endoh <mame@ruby-lang.org>
Currently the calculation only counts the size of the struct. This commit adds the size of the associated st tables, id tables, and linked lists.
Still missing is the size of the ractors and (potentially) the size of the object space.
* On some platforms (e.g., macOS), the user's default group access
list may exceed `NGROUPS_MAX`.
* Use upcase "GID" instead of "gid" for other than variable names.
* process.c: Add Process._fork
This API is supposed for application monitoring libraries to hook fork
event.
[Feature #17795]
Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Instead of looking for Object::ENV (which can be overwritten),
directly look for the envtbl variable. As that is static in hash.c,
and the lookup code is in process.c, add a couple non-static
functions that will return envtbl (or envtbl#to_hash).
Fixes [Bug #18164]
All occurrences of rb_fork_ruby are followed by a call rb_thread_fork in
the created child process.
This is refactoring and a potential preparation for [Feature #17795].
(rb_fork_ruby may be wrapped by Process._fork_.)
* Rename `rb_scheduler` to `rb_fiber_scheduler`.
* Use public interface if available.
* Use `rb_check_funcall` where possible.
* Don't use `unblock` unless the fiber was non-blocking.