It was found that a feature to check and add ruby2_keywords flag to an
existing Hash is needed when arguments are serialized and deserialized.
It is possible to do the same without explicit APIs, but it would be
good to provide them as a core feature.
https://github.com/rails/rails/pull/38105#discussion_r361863767
Hash.ruby2_keywords_hash?(hash) checks if hash is flagged or not.
Hash.ruby2_keywords_hash(hash) returns a duplicated hash that has a
ruby2_keywords flag.
[Bug #16486]
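A minimal sketch of how the two APIs behave, assuming Ruby 2.7 or later. The `Capture` module and its `call` method are hypothetical helpers used only to obtain a hash flagged by a ruby2_keywords method.

```ruby
# Hypothetical helper: a ruby2_keywords method flags the trailing keyword hash.
module Capture
  class << self
    ruby2_keywords def call(*args)
      args
    end
  end
end

plain   = { a: 1 }
flagged = Hash.ruby2_keywords_hash(plain) # duplicate of `plain` carrying the flag

plain_flag    = Hash.ruby2_keywords_hash?(plain)
flagged_flag  = Hash.ruby2_keywords_hash?(flagged)
captured_flag = Hash.ruby2_keywords_hash?(Capture.call(a: 1).last)
```

A deserializer could record the result of `Hash.ruby2_keywords_hash?` alongside the arguments and call `Hash.ruby2_keywords_hash` on the restored hash when the recorded flag was true.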
ar_table (the Hash representation for 8 or fewer entries) can use the
transient heap, so its memory area can move. We therefore need to restore
the `pair` pointer after the `func` call (which can run arbitrary code),
because the table may have moved.
Saves committers' daily life by no longer #include-ing everything from
internal.h, making each file include what it needs instead. This would
significantly speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very
first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef
5. Everything else, sorted alphabetically.
Exceptions are the win32-related headers, which tend not to be
self-contained (the headers have inclusion order dependencies).
Reduce macros to make them inline functions, as well as mark
MJIT_FUNC_EXPORTED functions explicitly as such.
Definition of ar_hint_t is simplified. This has been the only possible
definition so far.
Akatsuki reported that setting ENV['TZ'] = 'UTC' made the following code 7x-8x faster.
t = Time.now; 100000.times { Time.new(2019) }; Time.now - t
https://hackerslab.aktsk.jp/2019/12/01/141551
commit 4bc1669127 (reduce tzset) dramatically improved this situation. But still,
TZ=UTC is faster than the default.
This patch removes the unnecessary tzset() call completely.
Performance check
----------------------
test program: t = Time.now; 100000.times { Time.new(2019) }; Time.now - t
before: 0.387sec
before(w/ TZ): 0.197sec
after: 0.162sec
after(w/ TZ): 0.165sec
OK. Now Time creation is 2x faster *and* TZ=UTC doesn't improve anything.
We can forget this hack completely. :)
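The measurement above can be repeated with a small script like the following. This is only a sketch: `time_creation_seconds` is a hypothetical helper name, the iteration count is reduced to keep the run short, and absolute numbers will differ per machine and Ruby version.

```ruby
# Hypothetical helper: time `iterations` calls of Time.new(2019),
# optionally with TZ set, and restore the previous TZ afterwards.
def time_creation_seconds(iterations, tz: nil)
  saved = ENV["TZ"]
  ENV["TZ"] = tz if tz
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { Time.new(2019) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
ensure
  ENV["TZ"] = saved # assigning nil removes the variable again
end

default_elapsed = time_creation_seconds(10_000)
utc_elapsed     = time_creation_seconds(10_000, tz: "UTC")
```

With this patch applied, the two timings are expected to be close to each other.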
Side note:
This patch slightly changes Time.new(t) behavior. Before this patch, it might change the
default timezone implicitly. After this patch, it doesn't; you need to reset TZ
explicitly (I mean ENV['TZ'] = nil).
But I don't think this has a big impact. Don't try to change /etc/localtime at runtime.
Side note2: following test might be useful for testing "ENV['TZ'] = nil".
-----------------------------------------
% cat <<'End' | sudo sh -s
rm -f /etc/localtime-; cp -a /etc/localtime /etc/localtime-
rm /etc/localtime; ln -s /usr/share/zoneinfo/Asia/Tokyo /etc/localtime
./ruby -e '
p Time.new(2000).zone # JST
File.unlink("/etc/localtime"); File.symlink("/usr/share/zoneinfo/America/Los_Angeles", "/etc/localtime")
p Time.new(2000).zone # JST (ruby does not follow /etc/localtime modification automatically)
ENV["TZ"] = nil
p Time.new(2000).zone # PST (ruby detect /etc/localtime modification)
'
rm /etc/localtime; cp -a /etc/localtime- /etc/localtime; rm /etc/localtime-
End
These functions are used from within a compilation unit so we can
make them static, for better binary size. This changeset reduces
the size of generated ruby binary from 26,590,128 bytes to
26,584,472 bytes on my machine.
This removes the related tests, and puts the related specs behind
version guards. This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
This removes the security features added by $SAFE = 1, and warns for access
or modification of $SAFE from Ruby-level, as well as warning when calling
all public C functions related to $SAFE.
This modifies some internal functions that took a safe level argument
to no longer take the argument.
rb_require_safe now warns, rb_require_string has been added as a
version that takes a VALUE and does not warn.
One public C function that still takes a safe level argument and that
this doesn't warn for is rb_eval_cmd. We may want to consider
adding an alternative method that does not take a safe level argument,
and warn for rb_eval_cmd.
Looking at the list of symbols inside of libruby-static.a, I found
hundreds of functions that are defined, but used from nowhere.
There can be reasons for each of them (e.g. some functions are
specific to some platform, some are useful when debugging, etc).
However it seems the functions deleted here exist for no reason.
This changeset reduces the size of ruby binary from 26,671,456
bytes to 26,592,864 bytes on my machine.
This changes object_id from being based on the objects location in
memory (or a nearby memory location in the case of a conflict) to be
based on an always increasing number.
This number is a Ruby Integer which allows it to overflow the size of a
pointer without issue (very unlikely to happen in real programs
especially on 64-bit, but a nice guarantee).
This changes obj_to_id_tbl and id_to_obj_tbl to both be maps of Ruby
objects to Ruby objects (previously they were Ruby object to C integer)
which simplifies updating them after compaction as we can run them
through gc_update_table_refs.
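A short sketch of what stays observable from Ruby after this change: IDs are Integers, unique per object, and stable for the lifetime of the object. The variable names are illustrative only.

```ruby
a = Object.new
b = Object.new

id_a = a.object_id # assigned from the counter on first request
id_b = b.object_id

GC.start # the ID no longer depends on where the object lives in memory
id_a_after_gc = a.object_id
```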
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
* Stop making a redundant hash copy in Hash#dup
It was making a copy of the hash without rehashing, then created an
extra copy of the hash to do the rehashing. Since rehashing creates
a new copy already, this change just uses that rehashing to make
the copy.
[Bug #16121]
* Remove redundant Check_Type after to_hash
* Fix freeing and clearing destination hash in Hash#initialize_copy
The code was assuming the state of the destination hash based on the
source hash for clearing any existing table on it. If these don't match,
then that can cause the old table to be leaked. This can be seen by
compiling hash.c with `#define HASH_DEBUG 1` and running the following
script, which will crash from a debug assertion.
```ruby
h = 9.times.map { |i| [i, i] }.to_h
h.send(:initialize_copy, {})
```
* Remove dead code paths in rb_hash_initialize_copy
Given that `RHASH_ST_TABLE_P(h)` is defined as `(!RHASH_AR_TABLE_P(h))`
it shouldn't be possible for a hash to be neither of these, so there
is no need for the removed `else if` blocks.
* Share implementation between Hash#replace and Hash#initialize_copy
This also fixes key rehashing for small hashes backed by an array
table for Hash#replace. This used to be done consistently in ruby
2.5.x, but stopped being done for small arrays in ruby 2.6.x.
This also brings the optimization improvements that were done for
Hash#initialize_copy to Hash#replace.
* Add the Hash#dup benchmark
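For reference, a minimal sketch of the Ruby-level behavior that Hash#dup and Hash#replace keep after these changes. It only illustrates copy semantics and does not exercise the internal table handling described above.

```ruby
src  = { a: 1, b: 2 }

copy = src.dup
copy[:c] = 3             # modifies only the copy

dst = { x: 0 }
dst.replace(src)         # dst now has the contents of src
```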
This fixes instance_exec and similar methods. It also fixes
Enumerator::Yielder#yield, rb_yield_block, and a couple of cases
with Proc#{<<,>>}.
This support requires the addition of rb_yield_values_kw, similar to
rb_yield_values2, for passing the keyword flag.
Unlike earlier attempts at this, this does not modify the rb_block_call_func
type or add a separate function type. The functions of type
rb_block_call_func are called by Ruby with a separate VM frame, and we can
get the keyword flag information from the VM frame flags, so it doesn't need
to be passed as a function argument.
These changes require the following VM functions accept a keyword flag:
* vm_yield_with_cref
* vm_yield
* vm_yield_with_block
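A sketch of two of the affected cases as seen from Ruby, assuming Ruby 2.7 or later. The block and proc bodies are illustrative only.

```ruby
receiver = Object.new

# Keywords given to instance_exec reach the block as keywords.
exec_result = receiver.instance_exec(1, b: 2) { |a, b: 0| [a, b] }

# Keywords given to a composed proc reach the first proc in the chain.
add    = proc { |a, b: 0| a + b }
double = proc { |x| x * 2 }
composed_result = (add >> double).call(1, b: 2)
```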
Previously, calling transform_values would call rb_hash_aset for each
key, needing to rehash it and look up its location.
Instead, we can use rb_hash_stlike_foreach_with_replace to replace the
values as we iterate without rehashing the keys.
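The visible behavior is unchanged. A small sketch with illustrative data:

```ruby
prices = { apple: 100, pear: 250 }

doubled = prices.transform_values { |v| v * 2 } # new hash, same keys
prices.transform_values! { |v| v + 1 }          # values replaced in place
```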
Treat the ** syntax as passing a copy of the hash as the last
positional argument. If the hash being double splatted is empty, do
not add a positional argument.
Remove rb_no_keyword_hash, no longer needed.
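A sketch of the resulting behavior for a method that accepts only positional arguments, assuming Ruby 2.7 or later. `collect` is a hypothetical method used for illustration.

```ruby
# Hypothetical method that simply returns its positional arguments.
def collect(*args)
  args
end

empty = {}
opts  = { a: 1 }

no_args   = collect(**empty) # empty double splat adds no positional argument
with_hash = collect(**opts)  # non-empty double splat passes a copy of the hash
with_hash.last[:b] = 2       # mutating the received hash leaves `opts` alone
```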
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct. This commit adds function prototypes
for rb_hash_foreach / st_foreach_safe. Also fixes some prototype
mismatches.