Граф коммитов

429 Коммитов

Автор SHA1 Сообщение Дата
Aaron Patterson edc7af48ac Stop transitioning to UNDEF when undefining an instance variable
Cases like this:

```ruby
obj = Object.new
loop do
  obj.instance_variable_set(:@foo, 1)
  obj.remove_instance_variable(:@foo)
end
```

can cause us to use many more shapes than we want (and even run out).
This commit changes the code such that when an instance variable is
removed, we'll walk up the shape tree, find the shape, then rebuild any
child nodes that happened to be below the "targetted for removal" IV.

This also requires moving any instance variables so that indexes derived
from the shape tree will work correctly.

Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
Co-authored-by: John Hawthorn <jhawthorn@github.com>
2022-12-07 09:57:11 -08:00
S-H-GAMELINKS 1f4f6c9832 Using UNDEF_P macro 2022-11-16 18:58:33 +09:00
John Hawthorn 02f1554224
Implement object shapes for T_CLASS and T_MODULE (#6637)
* Avoid RCLASS_IV_TBL in marshal.c
* Avoid RCLASS_IV_TBL for class names
* Avoid RCLASS_IV_TBL for autoload
* Avoid RCLASS_IV_TBL for class variables
* Avoid copying RCLASS_IV_TBL onto ICLASSes
* Use object shapes for Class and Module IVs
2022-10-31 14:05:37 -07:00
Jemma Issroff ad63b668e2
Revert "Revert "This commit implements the Object Shapes technique in CRuby.""
This reverts commit 9a6803c90b.
2022-10-11 08:40:56 -07:00
Aaron Patterson 9a6803c90b
Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-30 16:01:50 -07:00
Jemma Issroff d594a5a8bd
This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-28 08:26:21 -07:00
Aaron Patterson 06abfa5be6
Revert this until we can figure out WB issues or remove shapes from GC
Revert "* expand tabs. [ci skip]"

This reverts commit 830b5b5c35.

Revert "This commit implements the Object Shapes technique in CRuby."

This reverts commit 9ddfd2ca00.
2022-09-26 16:10:11 -07:00
Jemma Issroff 9ddfd2ca00 This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-26 09:21:30 -07:00
S-H-GAMELINKS 79f50b9d02 Using is_broken_string function 2022-09-10 09:32:51 +09:00
Nobuyoshi Nakada 5044832fec
`w_bigfixnum` is used only for large FIXNUM 2022-09-02 22:11:12 +09:00
Nobuyoshi Nakada 92d2476208
Adjust styles [ci skip] 2022-09-02 14:49:42 +09:00
John Hawthorn 0608a9a086
Optimize Marshal dump/load for large (> 31-bit) FIXNUM (#6229)
* Optimize Marshal dump of large fixnum

Marshal's FIXNUM type only supports 31-bit fixnums, so on 64-bit
platforms the 63-bit fixnums need to be represented in Marshal's
BIGNUM.

Previously this was done by converting to a bugnum and serializing the
bignum object.

This commit avoids allocating the intermediate bignum object, instead
outputting the T_FIXNUM directly to a Marshal bignum. This maintains the
same representation as the previous implementation, including not using
LINKs for these large fixnums (an artifact of the previous
implementation always allocating a new BIGNUM).

This commit also avoids unnecessary st_lookups on immediate values,
which we know will not be in that table.

* Fastpath for loading FIXNUM from Marshal bignum

* Run update-deps
2022-08-15 16:14:12 -07:00
Nobuyoshi Nakada 58c8b6e862
Adjust styles [ci skip] 2022-08-06 10:13:20 +09:00
Peter Zhu efb91ff19b Rename rb_ary_tmp_new to rb_ary_hidden_new
rb_ary_tmp_new suggests that the array is temporary in some way, but
that's not true, it just creates an array that's hidden and not on the
transient heap. This commit renames it to rb_ary_hidden_new.
2022-07-26 09:12:09 -04:00
Takashi Kokubun 5b21e94beb Expand tabs [ci skip]
[Misc #18891]
2022-07-21 09:42:04 -07:00
S-H-GAMELINKS 420f3ced4d Using is_ascii_string to check encoding 2022-06-17 12:02:50 +09:00
Kazuhiro NISHIYAMA 04f07713d1
Fix typos [ci skip] 2021-12-25 10:33:49 +09:00
Jean Boussier afcbb501ac marshal.c Marshal.load accepts a freeze: true option.
Fixes [Feature #18148]

When set, all the loaded objects are returned as frozen.

If a proc is provided, it is called with the objects already frozen.
2021-10-05 18:34:56 +02:00
S.H dc9112cf10
Using NIL_P macro instead of `== Qnil` 2021-10-03 22:34:45 +09:00
Nobuyoshi Nakada d087214658 Restore Hash#compare_by_identity mode [Bug #18171] 2021-10-02 11:43:35 +09:00
Jean byroot Boussier 529fc204af
marshal.c: don't call the proc with partially initialized objects. (#4866)
For cyclic objects, it requires to keep a st_table of the partially
initialized objects.
2021-09-30 16:50:31 +02:00
Nobuyoshi Nakada 8226c33bb5
Add symname_equal_lit for comparison with a string literal 2021-09-23 22:07:52 +09:00
Nobuyoshi Nakada 49af9012a2
Prohibit invalid encoding symbols [Bug #18184] 2021-09-23 16:02:44 +09:00
Nobuyoshi Nakada 7cec727612
Check instance variable count overflow 2021-09-23 14:01:08 +09:00
Nobuyoshi Nakada 64bdad5991
Extract ruby2_keywords predicate and setter 2021-09-23 13:59:45 +09:00
Nobuyoshi Nakada 842a4cb915
Turned to_be_skipped_id to an inline function 2021-09-23 10:55:49 +09:00
Nobuyoshi Nakada 552728a23a
Check the encoding of `ruby2_keywords_flag` [Bug #18184] 2021-09-23 00:07:17 +09:00
Nobuyoshi Nakada 7c0230b05d
Check the entire name as `ruby2_keywords_flag` [Bug #18184] 2021-09-22 19:04:55 +09:00
Jean Boussier 89242279e6 Marshal.load: do not call the proc until strings have their encoding
Ref: https://bugs.ruby-lang.org/issues/18141
2021-09-15 08:00:18 +09:00
S.H 71ba09632b
Remove unneeded declarations 2021-03-20 21:00:29 +09:00
Stefan Stüben 8c2e5bbf58 Don't redefine #rb_intern over and over again 2020-10-21 12:45:18 +09:00
Nobuyoshi Nakada 5c49bb5486
Removed useless casts 2020-09-05 17:34:12 +09:00
卜部昌平 ff30358d13 RARRAY_AREF: convert into an inline function
RARRAY_AREF has been a macro for reasons.  We might not be able to
change that for public APIs, but why not relax the situation internally
to make it an inline function.
2020-08-15 12:09:26 +09:00
卜部昌平 41703fcfab r_object0: do not goto into a branch
I'm not necessarily against every goto in general, but jumping into a
branch is definitely a bad idea.  Better refactor.
2020-06-29 11:05:41 +09:00
Jeremy Evans 98286e9850 Ensure origins for all included, prepended, and refined modules
This fixes various issues when a module is included in or prepended
to a module or class, and then refined, or refined and then included
or prepended to a module or class.

Implement by renaming ensure_origin to rb_ensure_origin, making it
non-static, and calling it when refining a module.

Fix Module#initialize_copy to handle origins correctly.  Previously,
Module#initialize_copy did not handle origins correctly.  For example,
this code:

```ruby
module B; end
class A
  def b; 2 end
  prepend B
end
a = A.dup.new
class A
  def b; 1 end
end
p a.b
```

Printed 1 instead of 2.  This is because the super chain for
a.singleton_class was:

```
a.singleton_class
A.dup
B(iclass)
B(iclass origin)
A(origin) # not A.dup(origin)
```

The B iclasses would not be modified, so the includer entry would be
still be set to A and not A.dup.

This modifies things so that if the class/module has an origin,
all iclasses between the class/module and the origin are duplicated
and have the correct includer entry set, and the correct origin
is created.

This requires other changes to make sure all tests still pass:

* rb_undef_methods_from doesn't automatically handle classes with
  origins, so pass it the origin for Comparable when undefing
  methods in Complex. This fixed a failure in the Complex tests.

* When adding a method, the method cache was not cleared
  correctly if klass has an origin.  Clear the method cache for
  the klass before switching to the origin of klass.  This fixed
  failures in the autoload tests related to overridding require,
  without breaking the optimization tests.  Also clear the method
  cache for both the module and origin when removing a method.

* Module#include? is fixed to skip origin iclasses.

* Refinements are fixed to use the origin class of the module that
  has an origin.

* RCLASS_REFINED_BY_ANY is removed as it was only used in a single
  place and is no longer needed.

* Marshal#dump is fixed to skip iclass origins.

* rb_method_entry_make is fixed to handled overridden optimized
  methods for modules that have origins.

Fixes [Bug #16852]
2020-06-03 09:50:37 -07:00
卜部昌平 9e41a75255 sed -i 's|ruby/impl|ruby/internal|'
To fix build failures.
2020-05-11 09:24:08 +09:00
卜部昌平 d7f4d732c1 sed -i s|ruby/3|ruby/impl|g
This shall fix compile errors.
2020-05-11 09:24:08 +09:00
卜部昌平 9e6e39c351
Merge pull request #2991 from shyouhei/ruby.h
Split ruby.h
2020-04-08 13:28:13 +09:00
Yusuke Endoh b23fd59cbb marshal.c: Support dump and load of a Hash with the ruby2_keywords flag
It is useful for a program that dumps and load arguments (like drb).
In future, they should deal with both positional arguments and keyword
ones explicitly, but until ruby2_keywords is deprecated, it is good to
support the flag in marshal.

The implementation is similar to String's encoding; it is dumped as a
hidden instance variable.

[Feature #16501]
2020-01-17 17:20:19 +09:00
Nobuyoshi Nakada 012f297311
Get rid of use of magic number 'E' 2020-01-11 20:19:58 +09:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Jeremy Evans ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
卜部昌平 7b6fde4258 drop-in type check for rb_define_module_function
We can check the function pointer passed to rb_define_module_function
like how we do so in rb_define_method.  The difference is that this
changeset reveales lots of atiry mismatches.
2019-08-29 18:34:09 +09:00
卜部昌平 50f5a0a8d6 rb_hash_foreach now free from ANYARGS
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct.  This commit adds function prototypes
for rb_hash_foreach / st_foreach_safe.  Also fixes some prototype
mismatches.
2019-08-27 15:52:26 +09:00
卜部昌平 6dd60cf114 st_foreach now free from ANYARGS
After 5e86b005c0, I now think ANYARGS is
dangerous and should be extinct.  This commit deletes ANYARGS from
st_foreach.  I strongly believe that this commit should have had come
with b0af0592fd, which added extra
parameter to st_foreach callbacks.
2019-08-27 15:52:26 +09:00
Nobuyoshi Nakada ffdef3674a
Warn instance variable `E`
It is not dumped, as it is a short alias for `:encoding`.
2019-08-10 13:18:41 +09:00
git 3c3783ac88 * expand tabs. 2019-08-10 11:43:07 +09:00
Nobuyoshi Nakada ab31693af1
Share caches for short encoding ivar name. 2019-08-10 11:38:49 +09:00
git e315f3a134 * expand tabs. 2019-07-31 10:22:47 +09:00
Koichi Sasada 72825c35b0 Use 1 byte hint for ar_table [Feature #15602]
On ar_table, Do not keep a full-length hash value (FLHV, 8 bytes)
but keep a 1 byte hint from a FLHV (lowest byte of FLHV).
An ar_table only contains at least 8 entries, so hints consumes
8 bytes at most. We can store hints in RHash::ar_hint.

On 32bit CPU, we use 4 entries ar_table.

The advantages:
* We don't need to keep FLHV so ar_table only consumes
  16 bytes (VALUEs of key and value) * 8 entries = 128 bytes.
* We don't need to scan ar_table, but only need to check hints
  in many cases. Especially we don't need to access ar_table
  if there is no match entries (in many cases).
  It will increase memory cache locality.

The disadvantages:
* This technique can increase `#eql?` time because hints can
  conflicts (in theory, it conflicts once in 256 times).
  It can introduce incompatibility if there is a object x where
  x.eql? returns true even if hash values are different.
  I believe we don't need to care such irregular case.
* We need to re-calculate FLHV if we need to switch from ar_table
  to st_table (e.g. exceeds 8 entries).
  It also can introduce incompatibility, on mutating key objects.
  I believe we don't need to care such irregular case too.

Add new debug counters to measure the performance:
* artable_hint_hit - hint is matched and eql?#=>true
* artable_hint_miss - hint is not matched but eql?#=>false
* artable_hint_notfound - lookup counts
2019-07-31 09:52:03 +09:00