[Bug #20169]
Embedded strings are not safe for system calls without the GVL because
compaction can cause pages to be locked causing the operation to fail
with EFAULT. This commit changes io_fwrite to use rb_str_tmp_frozen_no_embed_acquire,
which guarantees that the return string is not embedded.
The macro SafeStringValue() became just StringValue() in c5c05460ac,
and it is deprecated nowadays.
This patch replaces remaining macro usage. Some occurrences are left in
ext/stringio and ext/win32ole, they should be fixed upstream.
The macro itself is not deleted, because it may be used in extensions.
When opening a file with `File.open`, and then setting the encoding with
`IO#set_encoding`, it still correctly performs CRLF -> LF conversion on
Windows when reading files with a CRLF line ending in them (in text
mode).
However, the file is opened instead with either the `rb_io_fdopen` or
`rb_file_open` APIs from C, the CRLF conversion is _NOT_ set up
correctly; it works if the encoding is not specified, but if
`IO#set_encoding` is called, the conversion stops happening. This seems
to be because the encflags never get ECONV_DEFAULT_NEWLINE_DECORATOR
set in these codepaths.
Concretely, this means that the conversion doesn't happen in the
following circumstances:
* When loading ruby files with require (that calls rb_io_fdopen)
* When parsing ruuby files with RubyVM::AbstractSyntaxTree (that calls
rb_file_open).
This then causes the ErrorHighlight tests to fail on windows if git has
checked them out with CRLF line endings - the error messages it's
testing wind up with literal \r\n sequences in them because the iseq
text from the parser contains un-newline-converted strings.
This commit fixes the problem by copy-pasting the relevant snippet which
sets this up in `rb_io_extract_modeenc` (for the File.open path) into
the relevant codepaths for `rb_io_fdopen` and `rb_file_open`.
[Bug #20101]
Before this patch, the MN scheduler waits for the IO with the
following steps:
1. `poll(fd, timeout=0)` to check fd is ready or not.
2. if fd is not ready, waits with MN thread scheduler
3. call `func` to issue the blocking I/O call
The advantage of advanced `poll()` is we can wait for the
IO ready for any fds. However `poll()` becomes overhead
for already ready fds.
This patch changes the steps like:
1. call `func` to issue the blocking I/O call
2. if the `func` returns `EWOULDBLOCK` the fd is `O_NONBLOCK`
and we need to wait for fd is ready so that waits with MN
thread scheduler.
In this case, we can wait only for `O_NONBLOCK` fds. Otherwise
it waits with blocking operations such as `read()` system call.
However we don't need to call `poll()` to check fd is ready
in advance.
With this patch we can observe performance improvement
on microbenchmark which repeats blocking I/O (not
`O_NONBLOCK` fd) with and without MN thread scheduler.
```ruby
require 'benchmark'
f = open('/dev/null', 'w')
f.sync = true
TN = 1
N = 1_000_000 / TN
Benchmark.bm{|x|
x.report{
TN.times.map{
Thread.new{
N.times{f.print '.'}
}
}.each(&:join)
}
}
__END__
TN = 1
user system total real
ruby32 0.393966 0.101122 0.495088 ( 0.495235)
ruby33 0.493963 0.089521 0.583484 ( 0.584091)
ruby33+MN 0.639333 0.200843 0.840176 ( 0.840291) <- Slow
this+MN 0.512231 0.099091 0.611322 ( 0.611074) <- Good
```
This commit moves IO#readline to Ruby. In order to call C functions,
keyword arguments must be converted to hashes. Prior to this commit,
code like `io.readline(chomp: true)` would allocate a hash. This
commits moves the keyword "denaturing" to Ruby, allowing us to send
positional arguments to the C API and avoiding the hash allocation.
Here is an allocation benchmark for the method:
```
x = GC.stat(:total_allocated_objects)
File.open("/usr/share/dict/words") do |f|
f.readline(chomp: true) until f.eof?
end
p ALLOCATIONS: GC.stat(:total_allocated_objects) - x
```
Before this commit, the output was this:
```
$ make run
./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin22-fake ./test.rb
{:ALLOCATIONS=>707939}
```
Now it is this:
```
$ make run
./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin22-fake ./test.rb
{:ALLOCATIONS=>471962}
```
[Bug #19890] [ruby-core:114803]
Deprecate Kernel#open and IO support for subprocess creation and
forking. This deprecates subprocess creation and forking in
- Kernel#open
- URI.open
- IO.binread
- IO.foreach
- IO.readlines
- IO.read
- IO.write
This behavior is slated to be removed in Ruby 4.0
[Feature #19630]
Because a thread calling IO#close now blocks in a native condvar wait,
it's possible for there to be _no_ threads left to actually handle
incoming signals/ubf calls/etc.
This manifested as failing tests on Solaris 10 (SPARC), because:
* One thread called IO#close, which sent a SIGVTALRM to the other
thread to interrupt it, and then waited on the condvar to be notified
that the reading thread was done.
* One thread was calling IO#read, but it hadn't yet reached the actual
call to select(2) when the SIGVTALRM arrived, so it never unblocked
itself.
This results in a deadlock.
The fix is to use a real Ruby mutex for the close lock; that way, the
closing thread goes into sigwait-sleep and can keep trying to interrupt
the select(2) thread.
See the discussion in: https://github.com/ruby/ruby/pull/7865/
* Add rb_io_path and rb_io_open_descriptor.
* Use rb_io_open_descriptor to create PTY objects
* Rename FMODE_PREP -> FMODE_EXTERNAL and expose it
FMODE_PREP I believe refers to the concept of a "pre-prepared" file, but
FMODE_EXTERNAL is clearer about what the file descriptor represents and
aligns with language in the IO::Buffer module.
* Ensure that rb_io_open_descriptor closes the FD if it fails
If FMODE_EXTERNAL is not set, then it's guaranteed that Ruby will be
responsible for closing your file, eventually, if you pass it to
rb_io_open_descriptor, even if it raises an exception.
* Rename IS_EXTERNAL_FD -> RUBY_IO_EXTERNAL_P
* Expose `rb_io_closed_p`.
* Add `rb_io_mode` to get IO mode.
---------
Co-authored-by: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
* Documentation consistency.
* Improve consistency of `pread`/`pwrite` implementation when given length.
* Remove HAVE_PREAD / HAVE_PWRITE - it is no longer optional.
When one thread is closing a file descriptor whilst another thread is
concurrently reading it, we need to wait for the reading thread to be
done with it to prevent a potential EBADF (or, worse, file descriptor
reuse).
At the moment, that is done by keeping a list of threads still using the
file descriptor in io_close_fptr. It then continually calls
rb_thread_schedule() in fptr_finalize_flush until said list is empty.
That busy-looping seems to behave rather poorly on some OS's,
particulary FreeBSD. It can cause the TestIO#test_race_gets_and_close
test to fail (even with its very long 200 second timeout) because the
closing thread starves out the using thread.
To fix that, I introduce the concept of struct rb_io_close_wait_list; a
list of threads still using a file descriptor that we want to close. We
call `rb_notify_fd_close` to let the thread scheduler know we're closing
a FD, which fills the list with threads. Then, we call
rb_notify_fd_close_wait which will block the thread until all of the
still-using threads are done.
This is implemented with a condition variable sleep, so no busy-looping
is required.