RARRAY_AREF has been a macro for reasons. We might not be able to
change that for public APIs, but why not relax the situation internally
to make it an inline function.
Adds a full discussion of #dig, along with links from Array, Hash, Struct, and OpenStruct.
CSV::Table and CSV::Row are over in ruby/csv. I'll get to them soon.
The art to the thing is to figure out how much (or how little) to say at each #dig.
* Enhanced RDoc for Array#fill
* Update array.c
There's one more at 5072. I'll get it.
Co-authored-by: Eric Hodel <drbrain@segment7.net>
* Update array.c
Co-authored-by: Eric Hodel <drbrain@segment7.net>
* Update array.c
Co-authored-by: Eric Hodel <drbrain@segment7.net>
* Update array.c
Co-authored-by: Eric Hodel <drbrain@segment7.net>
* Update array.c
Co-authored-by: Eric Hodel <drbrain@segment7.net>
* Update array.c
Co-authored-by: Eric Hodel <drbrain@segment7.net>
For the most common cases of `rotate!` one place to the right or to the
left, instead of doing some reversals of the array we just keep a single
value in a temporary value, use memmove and then put the temporary
value where it should be.
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead. This would significantly
speed up incremental builds.
We take the following inclusion order in this changeset:
1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very
first thing among everything).
2. RUBY_EXTCONF_H if any.
3. Standard C headers, sorted alphabetically.
4. Other system headers, maybe guarded by #ifdef
5. Everything else, sorted alphabetically.
Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
These functions are used from within a compilation unit so we can
make them static, for better binary size. This changeset reduces
the size of generated ruby binary from 26,590,128 bytes to
26,584,472 bytes on my macihne.
This removes the related tests, and puts the related specs behind
version guards. This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
The declaration of local variable in loop, it will initialize local variable for each run of the loop with clang generated code.
So, it shouldn't declare the local variable in heavy loop.
Array#sum with float elements will be faster around 30%.
* Before
user system total real
3.320000 0.010000 3.330000 ( 3.336088)
* After
user system total real
2.590000 0.010000 2.600000 ( 2.602399)
* Test code
require 'benchmark'
Benchmark.bmbm do |x|
ary = []
10000.times { ary << Random.rand }
x.report do
50000.times do
ary.sum
end
end
end
to suppress the following warning:
```
compiling cxxanyargs.cpp
In file included from cxxanyargs.cpp:1:
In file included from ../../.././include/ruby/ruby.h:2150:
../../.././include/ruby/intern.h:56:19: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
void rb_mem_clear(register VALUE*, register long);
^~~~~~~~~
../../.././include/ruby/intern.h:56:36: warning: 'register' storage class specifier is deprecated and incompatible with C++17 [-Wdeprecated-register]
void rb_mem_clear(register VALUE*, register long);
^~~~~~~~~
```
Previously, Array#uniq would return subclass instance if the
length of the array were 2 or greater, and would return Array
instance if the length of the array were 0 or 1.
Fixes [Bug #7768]
My previous attempt to correct #2068 apparently failed and the confusing
wording ("instances") was merged into trunk instead.
This should address any potential confusion.
rb_str_buf_new always allocates at least 127 bytes of capacity, even
when less is requested.
> ObjectSpace.dump(%w[a b c].join)
{"address":"0x7f935f06ebf0", "type":"STRING", "class":"0x7f935d8b7bb0", "bytesize":3, "capacity":127, "value":"abc", "encoding":"UTF-8", "memsize":168, "flags":{"wb_protected":true}}
Instead, by using rb_str_new and then setting the length to 0, we can
allocate the exact amount of memory needed, without extra capacity.
> ObjectSpace.dump(%w[a b c].join)
{"address":"0x7f903fcab530", "type":"STRING", "class":"0x7f903f8b7988", "embedded":true, "bytesize":3, "value":"abc", "encoding":"UTF-8", "memsize":40, "flags":{"wb_protected":true}}
ARY_SHARED_P and ARY_EMBED_P included:
assert(!FL_TEST((ary), ELTS_SHARED) || !FL_TEST((ary), RARRAY_EMBED_FLAG)),
The two predicate macros are used in many other assert conditions,
which caused memory bloat during C compilation.
This change factors out the assertion above to a function.
Now gcc consumes 160 MB instead of 250 MB to compile array.c.
The assertion blows up gcc 8 by consuming approx. 1.8 GB memory.
This change reduces the amount of memory required to about 200 MB.
A follow-up of ae750799c1.
Shared arrays created by Array#dup and so on points
a shared_root object to manage lifetime of Array buffer.
However, sometimes shared_root is called only shared so
it is confusing. So I fixed these wording "shared" to "shared_root".
* RArray::heap::aux::shared -> RArray::heap::aux::shared_root
* ARY_SHARED() -> ARY_SHARED_ROOT()
* ARY_SHARED_NUM() -> ARY_SHARED_ROOT_REFCNT()
Also, add some debug_counters to count shared array objects.
* ary_shared_create: shared ary by Array#dup and so on.
* ary_shared: finished in shard.
* ary_shared_root_occupied: shared_root but has only 1 refcnt.
The number (ary_shared - ary_shared_root_occupied) is meaningful.
Array#minmax was previous not implemented, so calling #minmax on
array was actually calling Enumerable#minmax. This is a simple
implementation of #minmax by just calling rb_ary_min and
rb_ary_max, which improves performance significantly.
Fixes [Bug #15929]
For array.c (Array#sort) and enum.c (Enumerable#sort_by),
add comments mentioning that sort.reverse! / sort_by { ... }.reverse!
can/should be used to reverse the result. [ci skip]
* array.c (rb_ary_join_m): warn use of non-nil $,.
* io.c (rb_output_fs_setter): warn when set to non-nil value.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67606 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Because hard to specify commits related to r67479 only.
So please commit again.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67499 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Currently we are not explicit enough regarding the potentially confusing
behavior of `Array#-` and `Array#difference` when it comes to duplicate items
within receiver arrays.
Although the original documentation for these methods does use an array with
multiple instance of the same integers, the explanation for the behavior is
actually imprecise.
> removing any items that also appear in +other_ary+
Not only does `Array#-` remove any items that also appear in `other_ary` but
it also remove any instance of any item in `other_ary`.
One may expect `Array#-` to behave like mathematical subtraction or difference
when it doesn't. One could be forgiven to expect the following behavior:
```ruby
[1,1,2,2,3,3,4,4] - [1,2,3,4]
=> [1,2,3,4]
```
In reality this is the result:
```ruby
[1,1,2,2,3,3,4,4] - [1,2,3,4]
=> []
```
I hope that I've prevented this potential confusion with the clarifications
in this change. I can offer this as evidence of likeliness for confusion:
https://twitter.com/olivierlacan/status/1084930269533085696
I'll freely admit I was surprised by this behavior myself since I needed to
obtain an Array with only one instance of each item in the argument array
removed.
[Fix GH-2068] [ci skip]
From: Olivier Lacan <hi@olivierlacan.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Before this patch, if `reject!` is called on a shared array it can
mutate the shared array rather than a copy. This patch marks the array
as "going to be modified" so that the shared source array isn't
mutated.
[Bug #15479] [ruby-core:90781]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* array.c (rb_ary_splice): do not use RARRAY_PTR() here because it can cause
GC because of rb_ary_detransient(). Here ary can contain T_NONE object
because of increasing capacity and not initialized yet.
error log: http://ci.rvm.jp/results/trunk-test@ruby-sky1/1557174
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* internal.h: rename the following names:
* li_table -> ar_table. "li" means linear (from linear search),
but we use the word "array" (from data layout).
* RHASH_ARRAY -> RHASH_AR_TABLE. AR_TABLE is more clear.
* rb_hash_array_* -> rb_hash_ar_table_*.
* RHASH_TABLE_P() -> RHASH_ST_TABLE_P(). more clear.
* RHASH_CLEAR() -> RHASH_ST_CLEAR().
* hash.c: rename "linear_" prefix functions to "ar_" prefix.
* hash.c (linear_init_table): rename to ar_alloc_table.
* debug_counter.h: rename obj_hash_array to obj_hash_ar.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66390 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Before this patch Array#all? was not implemented in Array class
and alternatively Enumerable#all? was used, while #any? has its
own method entry in Array class. Similarly, Array#none? and #one?
also lacks its own implementation.
This patch provides Array-specific implementations for above three
methods to enable faster method lookup.
[Fix GH-2041]
From: Koji Onishi <fursich0@gmail.com>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* include/ruby/ruby.h: de-transient at
`RARRAY_PTR_USE` and `RARRAY_PTR_USE_START`.
Introduce `RARRAY_PTR_USE_TRANSIENT` and
`RARRAY_PTR_USE_START_TRANSIENT` if you don't want to
de-transient an array. Generally, it is difficult
so C-extension writers should not use them.
* array.c: use `RARRAY_PTR_USE_TRANSIENT` if possible.
* hash.c: ditto.
* enum.c (enum_sort_by): remove `rb_ary_transient_heap_evacuate()`
because `RARRAY_PTR_USE` do de-transient.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66165 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
In this example code:
```ruby
def foo
[1, 2, 3, 4]
end
```
The array literal uses a `duparray` instruction. Before this patch,
`rb_ary_resurrect` would malloc and memcpy a new array buffer. This
patch changes `rb_ary_resurrect` to use `ary_make_partial` so that the
new array object shares the underlying buffer with the array stored in
the instruction sequences.
Before this patch, the new array object is not shared:
```
$ ruby -r objspace -e'p ObjectSpace.dump([1, 2, 3, 4])'
"{\"address\":\"0x7fa2718372d0\", \"type\":\"ARRAY\", \"class\":\"0x7fa26f8b0010\", \"length\":4, \"memsize\":72, \"flags\":{\"wb_protected\":true}}\n"
```
After this patch:
```
$ ./ruby -r objspace -e'p ObjectSpace.dump([1, 2, 3, 4])'
"{\"address\":\"0x7f9a76883638\", \"type\":\"ARRAY\", \"class\":\"0x7f9a758af900\", \"length\":4, \"shared\":true, \"references\":[\"0x7f9a768837c8\"], \"memsize\":40, \"flags\":{\"wb_protected\":true}}\n"
```
[Feature #15289] [ruby-core:90097]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66095 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
This args[1]-- overflows when it is zero. Should do that only
when we can say it is nonzero.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65798 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c, internal.h: support theap for small Hash.
Introduce RHASH_ARRAY (li_table) besides st_table and small Hash
(<=8 entries) are managed by an array data structure.
This array data can be managed by theap.
If st_table is needed, then converting array data to st_table data.
For st_table using code, we prepare "stlike" APIs which accepts hash value
and are very similar to st_ APIs.
This work is based on the GSoC achievement
by tacinight <tacingiht@gmail.com> and refined by ko1.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65454 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
(re-commit of r65444)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
theap is designed for Ruby's object system. theap is like Eden heap
on generational GC terminology. theap allocation is very fast because
it only needs to bump up pointer and deallocation is also fast because
we don't do anything. However we need to evacuate (Copy GC terminology)
if theap memory is long-lived. Evacuation logic is needed for each type.
See [Bug #14858] for details.
* array.c: Now, theap for T_ARRAY is supported.
ary_heap_alloc() tries to allocate memory area from theap. If this trial
sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
We don't need to free theap ptr.
* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
if ary is allocated at theap, force evacuation to malloc'ed memory.
It makes programs slow, but very compatible with current code because
theap memory can be evacuated (theap memory will be recycled).
If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
will occur, use RARRAY_CONST_PTR().
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* array.c: [DOC] small doc fixes for Array#difference and Array#-.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65184 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* array.c: [DOC] use `<code>other_ary</code>s' instead of `+other_ary+s',
which is not rendered correctly.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* array.c (yield_indexed_values): use RARRAY_AREF/ASET instead of
using RARRAY_PTR().
* enum.c (nmin_filter): ditto.
* proc.c (rb_sym_to_proc): ditto.
* enum.c (rb_nmin_run): use RARRAY_PTR_USE() instead of RARRAY_PTR().
It is safe because they don't make new referecen from an array.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64986 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
I introduce a `difference` method equivalent to the `-` operator, but
which accept more than array as argument. This improved readability, and
it is also coherent with the `+` operator, which has a similar `concat`
method. The method doesn't modify the original object and return a new
object instead. I plan to introduce a `difference!` method as well.
Tests and documentation are included.
It solves partially https://bugs.ruby-lang.org/issues/14097
From: Ana María Martínez Gómez <ammartinez@suse.de>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64921 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Avoid repeating code and improve readability in `rb_ary_or` and
`rb_ary_union_multi`. Similaty as done with `rb_ary_union`.
[Fix GH-1747] [Feature #14097]
From: Ana María Martínez Gómez <ammartinez@suse.de>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64790 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
`Array#uniq` is not really related with `Array#|`, so I replaced it by
`Array#union`.
[Fix GH-1747] [Feature #14097]
From: Ana María Martínez Gómez <ammartinez@suse.de>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64789 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Avoid repeating code and improve readability in `rb_ary_or` and
`rb_ary_union_multi`.
[Fix GH-1747] [Feature #14097]
From: Ana María Martínez Gómez <ammartinez@suse.de>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64788 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
I introduce a `union` method equivalent to the `|` operator, but which
accept more than array as argument. This improved readability, and it
is also coherent with the `+` operator, which has a similar `concat`
method. The method doesn't modify the original object and return a new
object instead. It is plan to introduce a `union!` method as well.
Tests and documentation are included.
It solves partially https://bugs.ruby-lang.org/issues/14097
[Fix GH-1747] [Feature #14097]
From: Ana María Martínez Gómez <ammartinez@suse.de>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64787 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* vm_args.c: rb_ary_dup(args->rest) to be used at most once during
parameter setup. [Feature #15010]
A patch by chopraanmol1 (Anmol Chopra) <chopraanmol1@gmail.com>.
* array.c (rb_ary_behead): added to remove first n elements.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64583 b2dd03c8-39d4-4d8f-98ff-823fe69b080e