Previously, Array#uniq would return subclass instance if the
length of the array were 2 or greater, and would return Array
instance if the length of the array were 0 or 1.
Fixes [Bug #7768]
My previous attempt to correct #2068 apparently failed and the confusing
wording ("instances") was merged into trunk instead.
This should address any potential confusion.
rb_str_buf_new always allocates at least 127 bytes of capacity, even
when less is requested.
> ObjectSpace.dump(%w[a b c].join)
{"address":"0x7f935f06ebf0", "type":"STRING", "class":"0x7f935d8b7bb0", "bytesize":3, "capacity":127, "value":"abc", "encoding":"UTF-8", "memsize":168, "flags":{"wb_protected":true}}
Instead, by using rb_str_new and then setting the length to 0, we can
allocate the exact amount of memory needed, without extra capacity.
> ObjectSpace.dump(%w[a b c].join)
{"address":"0x7f903fcab530", "type":"STRING", "class":"0x7f903f8b7988", "embedded":true, "bytesize":3, "value":"abc", "encoding":"UTF-8", "memsize":40, "flags":{"wb_protected":true}}
ARY_SHARED_P and ARY_EMBED_P included:
assert(!FL_TEST((ary), ELTS_SHARED) || !FL_TEST((ary), RARRAY_EMBED_FLAG)),
The two predicate macros are used in many other assert conditions,
which caused memory bloat during C compilation.
This change factors out the assertion above to a function.
Now gcc consumes 160 MB instead of 250 MB to compile array.c.
The assertion blows up gcc 8 by consuming approx. 1.8 GB memory.
This change reduces the amount of memory required to about 200 MB.
A follow-up of ae750799c1.
Shared arrays created by Array#dup and so on points
a shared_root object to manage lifetime of Array buffer.
However, sometimes shared_root is called only shared so
it is confusing. So I fixed these wording "shared" to "shared_root".
* RArray::heap::aux::shared -> RArray::heap::aux::shared_root
* ARY_SHARED() -> ARY_SHARED_ROOT()
* ARY_SHARED_NUM() -> ARY_SHARED_ROOT_REFCNT()
Also, add some debug_counters to count shared array objects.
* ary_shared_create: shared ary by Array#dup and so on.
* ary_shared: finished in shard.
* ary_shared_root_occupied: shared_root but has only 1 refcnt.
The number (ary_shared - ary_shared_root_occupied) is meaningful.
Array#minmax was previous not implemented, so calling #minmax on
array was actually calling Enumerable#minmax. This is a simple
implementation of #minmax by just calling rb_ary_min and
rb_ary_max, which improves performance significantly.
Fixes [Bug #15929]
For array.c (Array#sort) and enum.c (Enumerable#sort_by),
add comments mentioning that sort.reverse! / sort_by { ... }.reverse!
can/should be used to reverse the result. [ci skip]
* array.c (rb_ary_join_m): warn use of non-nil $,.
* io.c (rb_output_fs_setter): warn when set to non-nil value.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67606 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Because hard to specify commits related to r67479 only.
So please commit again.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67499 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Currently we are not explicit enough regarding the potentially confusing
behavior of `Array#-` and `Array#difference` when it comes to duplicate items
within receiver arrays.
Although the original documentation for these methods does use an array with
multiple instance of the same integers, the explanation for the behavior is
actually imprecise.
> removing any items that also appear in +other_ary+
Not only does `Array#-` remove any items that also appear in `other_ary` but
it also remove any instance of any item in `other_ary`.
One may expect `Array#-` to behave like mathematical subtraction or difference
when it doesn't. One could be forgiven to expect the following behavior:
```ruby
[1,1,2,2,3,3,4,4] - [1,2,3,4]
=> [1,2,3,4]
```
In reality this is the result:
```ruby
[1,1,2,2,3,3,4,4] - [1,2,3,4]
=> []
```
I hope that I've prevented this potential confusion with the clarifications
in this change. I can offer this as evidence of likeliness for confusion:
https://twitter.com/olivierlacan/status/1084930269533085696
I'll freely admit I was surprised by this behavior myself since I needed to
obtain an Array with only one instance of each item in the argument array
removed.
[Fix GH-2068] [ci skip]
From: Olivier Lacan <hi@olivierlacan.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Before this patch, if `reject!` is called on a shared array it can
mutate the shared array rather than a copy. This patch marks the array
as "going to be modified" so that the shared source array isn't
mutated.
[Bug #15479] [ruby-core:90781]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* array.c (rb_ary_splice): do not use RARRAY_PTR() here because it can cause
GC because of rb_ary_detransient(). Here ary can contain T_NONE object
because of increasing capacity and not initialized yet.
error log: http://ci.rvm.jp/results/trunk-test@ruby-sky1/1557174
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* internal.h: rename the following names:
* li_table -> ar_table. "li" means linear (from linear search),
but we use the word "array" (from data layout).
* RHASH_ARRAY -> RHASH_AR_TABLE. AR_TABLE is more clear.
* rb_hash_array_* -> rb_hash_ar_table_*.
* RHASH_TABLE_P() -> RHASH_ST_TABLE_P(). more clear.
* RHASH_CLEAR() -> RHASH_ST_CLEAR().
* hash.c: rename "linear_" prefix functions to "ar_" prefix.
* hash.c (linear_init_table): rename to ar_alloc_table.
* debug_counter.h: rename obj_hash_array to obj_hash_ar.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66390 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Before this patch Array#all? was not implemented in Array class
and alternatively Enumerable#all? was used, while #any? has its
own method entry in Array class. Similarly, Array#none? and #one?
also lacks its own implementation.
This patch provides Array-specific implementations for above three
methods to enable faster method lookup.
[Fix GH-2041]
From: Koji Onishi <fursich0@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/ruby.h: de-transient at
`RARRAY_PTR_USE` and `RARRAY_PTR_USE_START`.
Introduce `RARRAY_PTR_USE_TRANSIENT` and
`RARRAY_PTR_USE_START_TRANSIENT` if you don't want to
de-transient an array. Generally, it is difficult
so C-extension writers should not use them.
* array.c: use `RARRAY_PTR_USE_TRANSIENT` if possible.
* hash.c: ditto.
* enum.c (enum_sort_by): remove `rb_ary_transient_heap_evacuate()`
because `RARRAY_PTR_USE` do de-transient.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66165 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
In this example code:
```ruby
def foo
[1, 2, 3, 4]
end
```
The array literal uses a `duparray` instruction. Before this patch,
`rb_ary_resurrect` would malloc and memcpy a new array buffer. This
patch changes `rb_ary_resurrect` to use `ary_make_partial` so that the
new array object shares the underlying buffer with the array stored in
the instruction sequences.
Before this patch, the new array object is not shared:
```
$ ruby -r objspace -e'p ObjectSpace.dump([1, 2, 3, 4])'
"{\"address\":\"0x7fa2718372d0\", \"type\":\"ARRAY\", \"class\":\"0x7fa26f8b0010\", \"length\":4, \"memsize\":72, \"flags\":{\"wb_protected\":true}}\n"
```
After this patch:
```
$ ./ruby -r objspace -e'p ObjectSpace.dump([1, 2, 3, 4])'
"{\"address\":\"0x7f9a76883638\", \"type\":\"ARRAY\", \"class\":\"0x7f9a758af900\", \"length\":4, \"shared\":true, \"references\":[\"0x7f9a768837c8\"], \"memsize\":40, \"flags\":{\"wb_protected\":true}}\n"
```
[Feature #15289] [ruby-core:90097]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66095 b2dd03c8-39d4-4d8f-98ff-823fe69b080e