This _partially_ reverts commit
d2ba8ea54a, but for UDP sockets only.
With TCP sockets (and other things which use `rsock_init_inetsock`), the
order of operations is to call `getaddrinfo(3)` with AF_UNSPEC, look at
the returned addresses, pick one, and then call `socket(2)` with the
family for that address (i.e. AF_INET or AF_INET6).
With UDP sockets, however, this is reversed; `UDPSocket.new` takes an
address family as an argument, and then calls `socket(2)` with that
family. A subsequent call to UDPSocket#connect will then call
`getaddrinfo(3)` with that family.
The problem here is that...
* If you are in a networking situation that _only_ has loopback addrs,
* And you want to look up a name like "localhost" (or NULL)
* And you pass AF_INET or AF_INET6 as the ai_family argument to
getaddrinfo(3),
* And you pass AI_ADDRCONFIG to the hints argument as well,
then glibc on Linux will not return an address. This is because
AI_ADDRCONFIG is supposed to return addresses for families we actually
have an address for and could conceivably connect to, but also is
documented to explicitly ignore localhost in that situation.
It honestly doesn't make a ton of sense to pass AI_ADDRCONFIG if you're
explicitly passing the address family anyway, because you're not looking
for "an address for this name we can connect to"; you're looking for "an
IPv(4|6) address for this name". And the original glibc bug that
d2ba8ea5 was supposed to work around was related to parallel issuance of
A and AAAA queries, which of course won't happen if an address family is
explicitly specified.
So, we fix this by not passing AI_ADDRCONFIG for calls to
`rsock_addrinfo` that we also pass an explicit family to (i.e. for
UDPsocket).
[Bug #20048]
Our current implementation of rb_postponed_job_register suffers from
some safety issues that can lead to interpreter crashes (see bug #1991).
Essentially, the issue is that jobs can be called with the wrong
arguments.
We made two attempts to fix this whilst keeping the promised semantics,
but:
* The first one involved masking/unmasking when flushing jobs, which
was believed to be too expensive
* The second one involved a lock-free, multi-producer, single-consumer
ringbuffer, which was too complex
The critical insight behind this third solution is that essentially the
only user of these APIs are a) internal, or b) profiling gems.
For a), none of the usages actually require variable data; they will
work just fine with the preregistration interface.
For b), generally profiling gems only call a single callback with a
single piece of data (which is actually usually just zero) for the life
of the program. The ringbuffer is complex because it needs to support
multi-word inserts of job & data (which can't be atomic); but nobody
actually even needs that functionality, really.
So, this comit:
* Introduces a pre-registration API for jobs, with a GVL-requiring
rb_postponed_job_prereigster, which returns a handle which can be
used with an async-signal-safe rb_postponed_job_trigger.
* Deprecates rb_postponed_job_register (and re-implements it on top of
the preregister function for compatability)
* Moves all the internal usages of postponed job register
pre-registration
When making an outgoing TCP or UDP connection, set AI_ADDRCONFIG in the
hints we send to getaddrinfo(3) (if supported). This will prompt the
resolver to _NOT_ issue A or AAAA queries if the system does not
actually have an IPv4 or IPv6 address (respectively).
This makes outgoing connections marginally more efficient on
non-dual-stack systems, since we don't have to try connecting to an
address which can't possibly work.
More importantly, however, this works around a race condition present
in some older versions of glibc on aarch64 where it could accidently
send the two outgoing DNS queries with the same DNS txnid, and get
confused when receiving the responses. This manifests as outgoing
connections sometimes taking 5 seconds (the DNS timeout before retry) to
be made.
Fixes#19144
> https://github.com/flori/json/pull/525
> Rename escape_slash in script_safe and also escape E+2028 and E+2029
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
> https://github.com/flori/json/pull/454
> Remove unnecessary initialization of create_id in JSON.parse()
Co-authored-by: Watson <watson1978@gmail.com>
It is rather common to directly interpolate JSON string inside
<script> tags in HTML as to provide configuration or parameters to a
script.
However this may lead to XSS vulnerabilities, to prevent that 3
characters need to be escaped:
- `/` (forward slash)
- `U+2028` (LINE SEPARATOR)
- `U+2029` (PARAGRAPH SEPARATOR)
The forward slash need to be escaped to prevent closing the script
tag early, and the other two are valid JSON but invalid Javascript
and can be used to break JS parsing.
Given that the intent of escaping forward slash is the same than escaping
U+2028 and U+2029, I chos to rename and repurpose the existing `escape_slash`
option.
Previously in the JSON::Ext parser, when we encountered an "Infinity"
token (and weren't allowing NaN/Infinity) we would try to display the
"unexpected token" at the character before.
https://github.com/flori/json/commit/42ac170712
According to https://bugs.openjdk.org/browse/JDK-8268605, pthread_create
may fail spuriously. This change implements a simple retry as a modest
measure, which is also used by JDK.
This entirely changes how it is tested. Rather than to use counters
we now record the timeline of events with associated threads which
makes it much easier to assert that certains events are only preceded
by a specific event, and makes it much easier to debug unexpected
timelines.
Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
Co-Authored-By: JP Camara <jp@jpcamara.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
* Before this it was compiled but not used, because TruffleRuby has
a stringio.rb in stdlib and .rb has precedence over .so.
In fact that extension never worked on TruffleRuby,
because rb_io_extract_modeenc() has never been defined on TruffleRuby.
* So this just skip compiling the extension since compilation of it now fails:
https://github.com/ruby/openssl/issues/699https://github.com/ruby/stringio/commit/d791b63df6
Node has not been managed by GC from Ruby 2.5.
Therefore these codes are not needed. If ObjectSpace depends on Node,
it needs to update the file when node type is updated. Delete node
related codes to avoid such update.
Context: https://github.com/ivoanjo/gvl-tracing/pull/4
Some hooks may want to collect data on a per thread basis.
Right now the only way to identify the concerned thread is to
use `rb_nativethread_self()` or similar, but even then because
of the thread cache or MaNy, two distinct Ruby threads may report
the same native thread id.
By passing `thread->self`, hooks can use it as a key to store
the metadata.
NB: Most hooks are executed outside the GVL, so such data collection
need to use a thread-safe data-structure, and shouldn't use the
reference in other ways from inside the hook.
They must also either pin that value or handle compaction.
After a pthread for getaddrinfo is detached, we cannot predict when the
thread will exit. It would lead to a segfault by setting
pthread_setaffinity to the terminated pthread. I guess this problem
would be more likely to occur in high-load environments.
This change detaches the pthread after pthread_setaffinity is called.
[Feature #19965]
Add a new API rb_profile_thread_frames(), which is essentialy a
per-thread version of rb_profile_frames().
While the original rb_profile_frames() always returns results about the
current active thread obtained by GET_EC(), this new API takes a Thread
to be profiled as an argument.
This should come in handy when profiling I/O-bound programs such as
webapps, since this new API allows us to learn about Threads performing
I/O (which do not have the GVL).
Profiling worker threads (such as Sidekiq workers) may be another
application.
Implements [Feature #10602]
Co-authored-by: Mike Perham <mike@perham.net>
extconf.rb
(https://github.com/ruby/zlib/pull/69)
The android NDK (android-ndk-r21e) does not have crc32_z, adler32_z, nor
z_size_t in its zlib.h header file. However, it _does_ have the crc32_z
and adler32_z symbols in the libz.a static library!
mkmf performs two tests for have_func:
* It sees if a program that includes the header and takes the address of
the symbol can compile
* It sees if a program that defines the symbol as `extern void sym_name()`
and calls it can be linked
If either test works, it considers the function present. The
android-ndk-r21e is passing the second test but not the first for
crc32_z/adler32_z. So, we define HAVE_ZLIB_SIZE_T_FUNCS, but then can't
actually compile the extension (since the prototypes aren't in the
header file).
We can keep this working how it was working before by _also_ checking
for `have_type("z_size_t", "zlib.h")`. The have_type check _only_ looks
in the header file for the type; if a program including the header file
and using the type can't compile, the type is considered absent
regardless of what might be in libz.a.
https://github.com/ruby/zlib/commit/3b9fe962d8
We use the Cloudflare fork of zlib
(https://github.com/cloudflare/zlib), which we find gives improved
performance on AWS Graviton ARM instances. That fork does not define
crc32_z and alder32_z functions.
Until two days ago, Ruby's zlib gem worked fine, because cloudflare zlib
_also_ did not define z_size_t, which meant Ruby did not try and use
these functions.
Since a3ba99596d
however, cloudflare zlib _does_ define z_size_t (but NOT crc32_z or
alder32_z). The zlib gem would try and use these nonexistant
functions and not compile.
This patch fixes it by actually specifically detecting the functions
that the gem wants to call, rather than just the presence of the
z_size_t type.
https://github.com/ruby/zlib/commit/c96e8b9a57
When pthread_create is available, rb_getaddrinfo creates a pthread and
executes getaddrinfo(3) in it. The caller thread waits for the pthread
to complete, but detaches it if interrupted. This allows name resolution
to be interuppted by Timeout.timeout, etc. even if it takes a long time
(for example, when the DNS server does not respond). [Feature #19965]
I don't think it's possible to create a CI with a mid which would need
escaping to be in a JSON string, but I think we might as well not rely
on that assumption.