github/ruby - ruby

Commit Graph

Author	SHA1	Message	Date
shyouhei	74fe1cc3d9	string.c: this assumption is false [ci skip] Looking at the lines right above, it is clear than a blue sky that we cannot assume `p` to be aligned at all when UNALIGNED_WORD_ACCESS is true. It is a wrong idea to use __builtin_assume_aligned for that situation. See also: https://travis-ci.org/ruby/ruby/jobs/451710732#L2007 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65592 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-07 05:23:03 +00:00
shyouhei	4a80c0540f	adopt sanitizer API These APIs are much like <valgrind/memcheck.h>. Use them to fine-grain annotate the usage of our memory. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-11-06 10:06:07 +00:00
ko1	870363886f	fix type. * string.c (rb_str_format_m): should pass `int`. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65456 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-30 22:16:26 +00:00
ko1	312b105d0e	introduce TransientHeap. [Bug #14858 ] * transient_heap.c, transient_heap.h: implement TransientHeap (theap). theap is designed for Ruby's object system. theap is like Eden heap on generational GC terminology. theap allocation is very fast because it only needs to bump up pointer and deallocation is also fast because we don't do anything. However we need to evacuate (Copy GC terminology) if theap memory is long-lived. Evacuation logic is needed for each type. See [Bug #14858] for details. * array.c: Now, theap for T_ARRAY is supported. ary_heap_alloc() tries to allocate memory area from theap. If this trial sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on. We don't need to free theap ptr. * ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that if ary is allocated at theap, force evacuation to malloc'ed memory. It makes programs slow, but very compatible with current code because theap memory can be evacuated (theap memory will be recycled). If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT() instead of RARRAY_CONST_PTR(). If you can't understand when evacuation will occur, use RARRAY_CONST_PTR(). (re-commit of r65444) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-30 21:53:56 +00:00
svn	69b8ffcd5b	* expand tabs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-30 21:02:12 +00:00
ko1	7d359f9b69	revert r65444 and r65446 because of commit miss git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65447 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-30 21:01:55 +00:00
ko1	90ac549fa6	introduce TransientHeap. [Bug #14858 ] * transient_heap.c, transient_heap.h: implement TransientHeap (theap). theap is designed for Ruby's object system. theap is like Eden heap on generational GC terminology. theap allocation is very fast because it only needs to bump up pointer and deallocation is also fast because we don't do anything. However we need to evacuate (Copy GC terminology) if theap memory is long-lived. Evacuation logic is needed for each type. See [Bug #14858] for details. * array.c: Now, theap for T_ARRAY is supported. ary_heap_alloc() tries to allocate memory area from theap. If this trial sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on. We don't need to free theap ptr. * ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that if ary is allocated at theap, force evacuation to malloc'ed memory. It makes programs slow, but very compatible with current code because theap memory can be evacuated (theap memory will be recycled). If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT() instead of RARRAY_CONST_PTR(). If you can't understand when evacuation will occur, use RARRAY_CONST_PTR(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-30 20:46:24 +00:00
stomar	8f0eb44d93	string.c: improve docs for String#strip and related * string.c: [DOC] improve docs for String#{strip,lstrip,rstrip}{,!}: small clarification, avoid referring to the receiver as `str' (does not appear in the call-seq of the generated HTML docs), enable links for cross-references, simplify rdoc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65382 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-26 20:30:09 +00:00
stomar	af7f9de4b9	array.c, file.c, string.c: [DOC] fix typos git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-19 21:35:51 +00:00
nobu	569fe2922f	string.c: grapheme cluster regexp failure * string.c (get_reg_grapheme_cluster): show error info and relax to rb_fatal from rb_bug. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-16 09:11:12 +00:00
stomar	41a486e966	string.c: [DOC] add example code for String#strip! git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65068 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-13 19:04:02 +00:00
stomar	ee49d04540	string.c: small doc improvement * string.c: [DOC] move unaltered case for String#strip to the end, similar to other strip methods. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-13 19:02:51 +00:00
nobu	fa8b08b424	Prefer `rb_fstring_lit` over `rb_fstring_cstr` The former states explicitly that the argument must be a literal, and can optimize away `strlen` on all compilers. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65059 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-13 09:59:22 +00:00
nobu	83a01e6f52	Added comments to rb_setup_fake_str and rb_fstring_new [ci skip] `ptr` for these functions must refer constant string literals. Otherwise, the result string's content can be modified/discarded unexpectedly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65058 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-10-13 09:23:56 +00:00
marcandre	2521b079fa	[DOC] Improve String#strip documentation. Patch by Josh Goldberg. [Fix GH-1933] [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64757 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-09-16 02:49:44 +00:00
shyouhei	22444ae9b1	move function declarations from insns.def to internal.h Just avoid being loose. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63755 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-06-27 00:57:16 +00:00
stomar	7215cecfb5	string.c: [DOC] grammar fixes git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63632 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-06-11 20:16:27 +00:00
nobu	46d7dc1162	[Docs] Improve documentation of String#lines * Document about optional getline arguments * Add examples, especially for the demonstration of `chomp: true` [Fix GH-1886] From: Koki Takahashi <hakatasiloving@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63610 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-06-08 10:45:01 +00:00
normal	256411b47f	String#uminus dedupes unconditionally [Feature #14478] [ruby-core:85669] Thanks-to: Sam Saffron <sam.saffron@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63566 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-06-04 23:26:03 +00:00
nobu	ce2f4f8526	string.c: trivial optimizations * string.c (rb_str_aset): prefer BUILTIN_TYPE over TYPE after SPECIAL_CONST_P check. * string.c (rb_str_start_with): prefer RB_TYPE_P over switch by TYPE. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63543 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-06-01 06:53:26 +00:00
nobu	87ccf7e50a	string.c: doc for [Feature #13712 ] * string.c (rb_str_start_with): [DOC] start_with? example with regexp. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63541 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-06-01 06:37:14 +00:00
normal	2fd1525b0f	string.c: MAYBE_UNUSED to suppress warnings for `old` Building with HAVE_MALLOC_USABLE_SIZE currently makes SIZED_REALLOC_N ignore the old size arg. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63487 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-05-22 01:58:47 +00:00
normal	0a4be5beda	string.c: size hints for free and realloc calls Another part of the plan to reduce dependencies on malloc_usable_size: https://bugs.ruby-lang.org/issues/10238 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63485 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-05-22 01:13:08 +00:00
nobu	703a5dd3e0	string.c: adjust to rb_str_upto_each * range.c (range_each_func): adjust the signature of the callback function to rb_str_upto_each, and exit the loop if the callback returned non-zero. * string.c (rb_str_upto_endless_each): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-04-28 11:16:54 +00:00
nobu	2c8f16e6c0	string.c: fix scanned substring with `\K` * string.c (scan_once): fix the matched substring with `\K`, the beginning of that string may differ from the matched position. [ruby-core:86663] [Bug #14707] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63252 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-04-24 12:25:46 +00:00
mame	7f95eed19e	Introduce endless range [Feature#12912] Typical usages: ``` p ary[1..] # drop the first element; identical to ary[1..-1] (1..).each {\|n\|...} # iterate forever from 1; identical to 1.step{...} ``` git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63192 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-04-19 15:18:50 +00:00
nobu	7c35618c53	string.c: suppress warning * string.c (str_undump): get rid of warning C4129 by VC. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63170 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-04-17 04:12:57 +00:00
nobu	cea438b0ca	string.c: fix dumped suffix * string.c (rb_str_dump): get rid of an error on evaling with frozen-string-literal enabled. [ruby-core:86539] [Bug #14687] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-04-16 07:12:06 +00:00
nobu	5b2b1130cf	string.c: fix checking order * string.c (str_undump): check for suffix before if Unicode escape conflicts with it. the message "but used force_encoding" sounds strange when it is not used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63162 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-04-16 06:37:42 +00:00
stomar	5e99863393	string.c: [DOC] fix typo git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63160 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-04-14 16:50:42 +00:00
naruse	42f1b58964	Factor out get_reg_grapheme_cluster git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62893 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-22 07:58:39 +00:00
naruse	41b2ef4685	fix each_grapheme_cluster's size [Bug #14363 ] From: Hugo Peixoto <hugo.peixoto@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62892 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-22 07:58:38 +00:00
naruse	6e0f5b8407	Revert "each_grapheme_cluster shouldn't return size [Bug #14363 ]" This reverts commit r62887. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62891 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-22 07:58:37 +00:00
naruse	613decd088	each_grapheme_cluster shouldn't return size [Bug #14363 ] From: Stefan Schüßler <mail@stefanschuessler.de> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62888 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-22 06:59:54 +00:00
nobu	7506fde3e9	Improve documentation for 'text '.split The documentation didn't mention trailing spaces and the example only demonstrated the case with leading spaces. [Fix GH-1845] From: Rodrigo Rosenfeld Rosas <rr.rosas@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62881 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-21 16:02:26 +00:00
nobu	1bf9dec04c	string.c: [DOC] split with block [ci skip] * string.c (rb_str_split_m): [DOC] about split with block. [Feature #4780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62790 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-17 04:13:26 +00:00
nobu	2258a97fe2	string.c: split with block * string.c (rb_str_split_m): yield each split substrings if the block is given, instead of returning the array. [Feature #4780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-15 11:08:04 +00:00
nobu	c05fa459bb	quote symbols * sprintf.c (ruby__sfvextra): quote symbols as identifiers. * string.c (rb_id_quote_unprintable): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-14 02:35:51 +00:00
k0kubun	288b44328d	Export some missing symbols for MJIT tool/ruby_vm/views/_insn_name_info.erb: on Linux, rb_vm_insn_name_offset was needed to compile with --jit-debug (Usually --jit-debug requires more symbols than the situation without --jit-debug because -O2 skips some functions to compile). vm.c: when running transform_mjit_header.rb with --jit-wait, rb_source_location_cstr was repoted to be missing. string.c: ditto, for rb_str_eql numeric.c: ditto, for rb_float_eql git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-02-08 13:54:37 +00:00
k0kubun	ed935aa5be	mjit_compile.c: merge initial JIT compiler which has been developed by Takashi Kokubun <takashikkbn@gmail> as YARV-MJIT. Many of its bugs are fixed by wanabe <s.wanabe@gmail.com>. This JIT compiler is designed to be a safe migration path to introduce JIT compiler to MRI. So this commit does not include any bytecode changes or dynamic instruction modifications, which are done in original MJIT. This commit even strips off some aggressive optimizations from YARV-MJIT, and thus it's slower than YARV-MJIT too. But it's still fairly faster than Ruby 2.5 in some benchmarks (attached below). Note that this JIT compiler passes `make test`, `make test-all`, `make test-spec` without JIT, and even with JIT. Not only it's perfectly safe with JIT disabled because it does not replace VM instructions unlike MJIT, but also with JIT enabled it stably runs Ruby applications including Rails applications. I'm expecting this version as just "initial" JIT compiler. I have many optimization ideas which are skipped for initial merging, and you may easily replace this JIT compiler with a faster one by just replacing mjit_compile.c. `mjit_compile` interface is designed for the purpose. common.mk: update dependencies for mjit_compile.c. internal.h: declare `rb_vm_insn_addr2insn` for MJIT. vm.c: exclude some definitions if `-DMJIT_HEADER` is provided to compiler. This avoids to include some functions which take a long time to compile, e.g. vm_exec_core. Some of the purpose is achieved in transform_mjit_header.rb (see `IGNORED_FUNCTIONS`) but others are manually resolved for now. Load mjit_helper.h for MJIT header. mjit_helper.h: New. This is a file used only by JIT-ed code. I'll refactor `mjit_call_cfunc` later. vm_eval.c: add some #ifdef switches to skip compiling some functions like Init_vm_eval. win32/mkexports.rb: export thread/ec functions, which are used by MJIT. 
include/ruby/defines.h: add MJIT_FUNC_EXPORTED macro alis to clarify that a function is exported only for MJIT. array.c: export a function used by MJIT. bignum.c: ditto. class.c: ditto. compile.c: ditto. error.c: ditto. gc.c: ditto. hash.c: ditto. iseq.c: ditto. numeric.c: ditto. object.c: ditto. proc.c: ditto. re.c: ditto. st.c: ditto. string.c: ditto. thread.c: ditto. variable.c: ditto. vm_backtrace.c: ditto. vm_insnhelper.c: ditto. vm_method.c: ditto. I would like to improve maintainability of function exports, but I believe this way is acceptable as initial merging if we clarify the new exports are for MJIT (so that we can use them as TODO list to fix) and add unit tests to detect unresolved symbols. I'll add unit tests of JIT compilations in succeeding commits. Author: Takashi Kokubun <takashikkbn@gmail.com> Contributor: wanabe <s.wanabe@gmail.com> Part of [Feature #14235] --- * Known issues * Code generated by gcc is faster than clang. The benchmark may be worse in macOS. Following benchmark result is provided by gcc w/ Linux. * Performance is decreased when Google Chrome is running * JIT can work on MinGW, but it doesn't improve performance at least in short running benchmark. * Currently it doesn't perform well with Rails. We'll try to fix this before release. 
--- * Benchmark reslts Benchmarked with: Intel 4.0GHz i7-4790K with 16GB memory under x86-64 Ubuntu 8 Cores - 2.0.0-p0: Ruby 2.0.0-p0 - r62186: Ruby trunk (early 2.6.0), before MJIT changes - JIT off: On this commit, but without `--jit` option - JIT on: On this commit, and with `--jit` option Optcarrot fps Benchmark: https://github.com/mame/optcarrot \| \|2.0.0-p0 \|r62186 \|JIT off \|JIT on \| \|:--------\|:--------\|:--------\|:--------\|:--------\| \|fps \|37.32 \|51.46 \|51.31 \|58.88 \| \|vs 2.0.0 \|1.00x \|1.38x \|1.37x \|1.58x \| MJIT benchmarks Benchmark: https://github.com/benchmark-driver/mjit-benchmarks (Original: https://github.com/vnmakarov/ruby/tree/rtl_mjit_branch/MJIT-benchmarks) \| \|2.0.0-p0 \|r62186 \|JIT off \|JIT on \| \|:----------\|:--------\|:--------\|:--------\|:--------\| \|aread \|1.00 \|1.09 \|1.07 \|2.19 \| \|aref \|1.00 \|1.13 \|1.11 \|2.22 \| \|aset \|1.00 \|1.50 \|1.45 \|2.64 \| \|awrite \|1.00 \|1.17 \|1.13 \|2.20 \| \|call \|1.00 \|1.29 \|1.26 \|2.02 \| \|const2 \|1.00 \|1.10 \|1.10 \|2.19 \| \|const \|1.00 \|1.11 \|1.10 \|2.19 \| \|fannk \|1.00 \|1.04 \|1.02 \|1.00 \| \|fib \|1.00 \|1.32 \|1.31 \|1.84 \| \|ivread \|1.00 \|1.13 \|1.12 \|2.43 \| \|ivwrite \|1.00 \|1.23 \|1.21 \|2.40 \| \|mandelbrot \|1.00 \|1.13 \|1.16 \|1.28 \| \|meteor \|1.00 \|2.97 \|2.92 \|3.17 \| \|nbody \|1.00 \|1.17 \|1.15 \|1.49 \| \|nest-ntimes\|1.00 \|1.22 \|1.20 \|1.39 \| \|nest-while \|1.00 \|1.10 \|1.10 \|1.37 \| \|norm \|1.00 \|1.18 \|1.16 \|1.24 \| \|nsvb \|1.00 \|1.16 \|1.16 \|1.17 \| \|red-black \|1.00 \|1.02 \|0.99 \|1.12 \| \|sieve \|1.00 \|1.30 \|1.28 \|1.62 \| \|trees \|1.00 \|1.14 \|1.13 \|1.19 \| \|while \|1.00 \|1.12 \|1.11 \|2.41 \| Discourse's script/bench.rb Benchmark: https://github.com/discourse/discourse/blob/v1.8.7/script/bench.rb NOTE: Rails performance was somehow a little degraded with JIT for now. We should fix this. (At least I know opt_aref is performing badly in JIT and I have an idea to fix it. Please wait for the fix.) 
* JIT off Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 17 75: 18 90: 22 99: 29 home_admin: 50: 21 75: 21 90: 27 99: 40 topic_admin: 50: 17 75: 18 90: 22 99: 32 categories: 50: 35 75: 41 90: 43 99: 77 home: 50: 39 75: 46 90: 49 99: 95 topic: 50: 46 75: 52 90: 56 99: 101 *** JIT on Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 19 75: 21 90: 25 99: 33 home_admin: 50: 24 75: 26 90: 30 99: 35 topic_admin: 50: 19 75: 20 90: 25 99: 30 categories: 50: 40 75: 44 90: 48 99: 76 home: 50: 42 75: 48 90: 51 99: 89 topic: 50: 49 75: 55 90: 58 99: 99 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-02-04 11:22:28 +00:00
mame	552a5a993c	string.c (rb_str_format_m): Fix the example code of the doc Change `%08x` to `%016x` because of two reasons: * `%016x` demonstrates that we can use two or more digits here. * Currently, many people uses 64-bit environment. (I'm unsure if object_id is a good example here, though...) I'm unsure if git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-29 08:40:22 +00:00
nobu	9237049efe	string.c: clear substring code range * string.c (str_substr): substring of broken code range string may be valid or broken. patch by tommy (Masahiro Tomita) at [ruby-dev:50430] [Bug #14388]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62040 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-25 13:10:14 +00:00
shyouhei	dc1e6f17ba	sizeof(uintptr_t) != sizeof(uintptr_t *) Reported by mame. Thanks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-16 03:09:53 +00:00
shyouhei	39cfa67b4f	__builtin_assume_aligned for (foo ) casts These casts are guarded. Must be safe to assume alignments. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61829 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-15 02:35:18 +00:00
nobu	6b5e0bd98c	exclude flexible array size with old compilers git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61814 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-14 11:19:18 +00:00
mame	982e9e6235	string.c (struct mapping_buffer): Use FLEX_ARY_LEN git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61811 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-13 13:08:05 +00:00
usa	b87571100a	should cause preprocess error as other cases * string.c (NONASCII_MASK): should cause preprocess error immediately if the compiler does not satisfy our assumptions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-10 03:54:02 +00:00
nobu	e9cb552ec9	internal.h: remove dependency on ruby/encoding.h git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61713 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-09 06:24:11 +00:00
nobu	ee85a6e72b	internal.h: remove dependency on ruby/io.h git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61712 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-09 06:24:10 +00:00
nobu	e043ae7348	string.c: out-of-bounds access * string.c (rb_str_enumerate_lines): fix out-of-bounds access when record separator is longer than the last element. [Bug #14257] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61636 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-06 08:44:17 +00:00
shyouhei	beaf2ace87	ULL suffix is a C99ism Don't assume long long == 8 bytes. If you can assume C99, there are macros named UINT64_C and such for appropriate integer literal suffixes. If you can't, no way but do a bitwise or. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61594 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-04 07:51:17 +00:00
nobu	a1bbb2a780	Fix doc typo in Symbol#to_proc [Fix GH-1785] [ci skip] From: Dimitris Zorbas <dimitrisplusplus@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61588 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-04 00:44:40 +00:00
nobu	634a48c5c1	string.c: chomp rs at the end * string.c (rb_str_enumerate_lines): should chomp record separator only, but not a newline, at the end of the receiver as well as middle, if the separator is given. [ruby-core:84552] [Bug #14257] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-29 12:19:03 +00:00
kazu	c9bdee3e0d	[DOC] Fix typos in downcase [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61488 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-27 00:04:30 +00:00
nobu	e2479cc43f	encoding.c: rb_enc_find_index2 * string.c (str_undump): use rb_enc_find_index2 to find encoding by unterminated string. check the format before encoding name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61396 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-22 01:03:17 +00:00
nobu	168c019998	string.c: fix memory leak git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61386 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 07:59:00 +00:00
naruse	05d1d29d1f	Don't allow mixed escape git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61381 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 05:09:17 +00:00
naruse	188d85934b	move dump format validation into parsing epilogue git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61380 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 05:09:16 +00:00
naruse	29c6ca423c	fix escapes in undump git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61379 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 05:08:57 +00:00
nobu	7c18db61a1	string.c: multiple codepoints * string.c (undump_after_backslash): fix multiple codepoints in braces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-16 00:30:52 +00:00
nobu	ae18c8f5b6	string.c: suppress warning * string.c (str_undump): suppress maybe-uninitialized warning by gcc 7 and later. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61289 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-16 00:03:51 +00:00
tadd	bbec11d329	Implement String#undump to unescape String#dump-ed string [Feature #12275] [close GH-1765] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61228 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-14 08:47:13 +00:00
nobu	a1692f7fdf	string.c: fix rb_external_str_new_with_enc * string.c (rb_external_str_new_with_enc): do not search non-ascii by NULL pointer. [ruby-core:84055] [Bug #14150] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60979 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-02 07:09:16 +00:00
nobu	73e41247b9	string.c: prefer rb_syserr_fail git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60761 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-14 03:02:58 +00:00
rhe	a82aaea719	string.c: fix up r60748 An #ifdef was missing in r60748 and build broke on systems without crypt_r(). https://rubyci.org/logs/rubyci.s3.amazonaws.com/unstable11s/ruby-trunk/log/20171112T162503Z.fail.html.gz git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-12 17:10:29 +00:00
rhe	0b845a8458	string.c: fix memory leak in String#crypt Use ALLOCV to allocate struct crypt_data for slightly cleaner and less error-prone code. It is currently possible it leaks when an invalid argument is passed to String#crypt or rb_str_new_cstr() fails to allocate memory. SIZEOF_CRYPT_DATA macro in missing/crypt.h is removed since it is not used any longer. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60748 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-12 15:55:04 +00:00
stomar	8b1c1c55a9	string.c: improve docs for String#{concat,<<} * string.c: [DOC] remove a misleading call-seq for String#concat, which suggests that all arguments must be Integers in this case; also clarify in the example that the receiver is modified; fix grammar for String#<<; move references to the end. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60712 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-07 20:15:59 +00:00
stomar	3262f4910f	string.c: fix typos * string.c: [DOC] fix typos in doxygen comments. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60707 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-07 20:11:09 +00:00
stomar	51b0230a9b	string.c: improve docs * string.c: [DOC] fix rdoc for cross reference; fix grammar. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60574 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-29 21:43:36 +00:00
watson1978	b03a44c4ac	string.c: Improve String#prepend performance if only one argument is given * string.c (rb_str_prepend_multi): Prepend the string without generating temporary String object if only one argument is given. This is very similar with https://github.com/ruby/ruby/pull/1634 String#prepend -> 47.5 % up [Fix GH-1670] [ruby-core:82195] [Bug #13773] * Before String#prepend 1.517M (± 1.8%) i/s - 7.614M in 5.019819s * After String#prepend 2.236M (± 3.4%) i/s - 11.234M in 5.029716s * Test code require 'benchmark/ips' Benchmark.ips do \|x\| x.report "String#prepend" do \|loop\| loop.times { "!".prepend("hello") } end end git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60480 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-27 14:55:03 +00:00
nobu	2050b50d85	string.c: comment layout [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-22 00:00:41 +00:00
svn	4b8c94dd84	* remove trailing spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60329 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:49:36 +00:00
sonots	84616bf979	* string.c: [DOC] Split rdoc of String#<< and String#concat [ci skip] Split String#<< and String#concat docs to reflect single and multiple arguments patched by MSP-Greg [fix GH-1614] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60328 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:49:35 +00:00
sonots	d9e11970f8	* string.c: Remove errant "the" in gsub documentation patched by jlmuir (J. Lewis Muir) [fix GH-1679] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60324 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:35:40 +00:00
nobu	80c50308f9	Improve performance of string interpolation This patch will add pre-allocation in string interpolation. By this, unecessary capacity resizing is avoided. For small strings, optimized `rb_str_resurrect` operation is faster, so pre-allocation is done only when concatenated strings are large. `MIN_PRE_ALLOC_SIZE` was decided by experimenting with local machine (x86_64-apple-darwin 16.5.0, Apple LLVM version 8.1.0 (clang - 802.0.42)). String interpolation will be faster around 72% when large string is created. * Before ``` Calculating ------------------------------------- Large string interpolation 1.276M (± 5.9%) i/s - 6.358M in 5.002022s Small string interpolation 5.156M (± 5.5%) i/s - 25.728M in 5.005731s ``` * After ``` Calculating ------------------------------------- Large string interpolation 2.201M (± 5.8%) i/s - 11.063M in 5.043724s Small string interpolation 5.192M (± 5.7%) i/s - 25.971M in 5.020516s ``` * Test code ```ruby require 'benchmark/ips' Benchmark.ips do \|x\| x.report "Large string interpolation" do \|t\| a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo" b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld" t.times do "#{a}, #{b}!" end end x.report "Small string interpolation" do \|t\| a = "Hello" b = "World" t.times do "#{a}, #{b}!" end end end ``` [Fix GH-1626] From: Nao Minami <south37777@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:21:05 +00:00
hsbt	2c27e52f8e	Add documentation for `chomp` option. https://github.com/ruby/ruby/pull/1717 Patch by @ksss [fix GH-1717] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 16:11:58 +00:00
sonots	ec30bc5930	* string.c (deleted_prefix_length, deleted_suffix_length): Add doxygen comment. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60254 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 10:33:25 +00:00
naruse	6187b0001b	[Feature #13712 ] String#start_with? supports regexp git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60234 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 06:51:01 +00:00
glass	8320be1007	string.c: avoid unnecessary call of str_strlen() * string.c (rb_strseq_index): refactor and avoid call of str_strlen() when offset == 0. it will improve performance of String#index and #include? * benchmark/bm_string_index.rb: benchmark for this change git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-01 13:44:49 +00:00
nobu	16759238ad	string.c: fix ASCII-only on succ * string.c (str_succ): clear coderange cache when no alpha-numeric character case, carried part may become ASCII-only. [ruby-core:83062] [Bug #13952] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-30 00:01:23 +00:00
nobu	2d42119903	string.c: ASCII-incompatible is not ASCII only * string.c (tr_trans): ASCII-incompatible encoding strings cannot be ASCII-only even if valid. [ruby-core:83056] [Bug #13950] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60060 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-29 08:15:50 +00:00
nobu	8c59fdb8d8	dup String#split return value * string.c (rb_str_split): return duplicated receiver, when no splits. patched by tompng (tomoya ishida) in [ruby-core:82911], and the test case by Seiei Miyagi <hanachin@gmail.com>. [Bug#13925] [Fix GH-1705] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-23 07:09:07 +00:00
nobu	e1be1d0c38	dup String#rpartition return value * string.c (rb_str_rpartition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60001 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-23 07:09:06 +00:00
nobu	b0326bce01	dup String#partition return value * string.c (rb_str_partition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-23 07:09:05 +00:00
nobu	b2da3824c5	refinements in string interpolation * compile.c (iseq_compile_each0): insert to_s method call, so that refinements activated at the caller should take place. [Feature #13812] * insns.def (tostring): fix up converted object to a string, infect and fallback. * insns.def (branchiftype): new instruction for conversion. branches if TOS is an instance of the given type. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59950 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-18 02:27:13 +00:00
kazu	0f25c6d7d5	Fix a typo [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 13:46:31 +00:00
nobu	bd10ce165c	string.c: fix false coderange * string.c (rb_enc_str_scrub): enc can differ from the actual encoding of the string, the cached coderange is useless then. [ruby-core:82674] [Bug #13874] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 13:11:44 +00:00
nobu	faa26f5570	string.c: optimize enumerate_grapheme_clusters * string.c (rb_str_enumerate_grapheme_clusters): optimize when single byte only. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59762 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 12:50:10 +00:00
nobu	e568455826	string.c: grapheme clusters on frozen string * string.c (rb_str_enumerate_grapheme_clusters): enumerate on shared frozen string. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-04 14:04:54 +00:00
nobu	805d6f6f3c	string.c: enumerator_element * string.c (enumerator_element): push or yield elements, and return 1 if needs checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59742 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-04 14:04:50 +00:00
nobu	ded7e1c2a1	string.c: make array in WANTARRAY * string.c (WANTARRAY): make array for the result in method functions and pass it to enumerator functions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 13:21:07 +00:00
nobu	adac779218	string.c: enumerator_wantarray * string.c (enumerator_wantarray): show warnings at method functions for proper method names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 02:08:55 +00:00
nobu	71de56621e	string.c: fix for non-Unicode encodings * string.c (rb_str_enumerate_grapheme_clusters): should enumerate chars for non-Unicode encodings. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59731 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 01:47:19 +00:00
nobu	fc1bf16696	string.c: suppress a warning * string.c (rb_str_enumerate_grapheme_clusters): suppress a maybe-uninitialized warning by old gcc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59730 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 00:39:23 +00:00
nobu	bb03f02805	string.c: adjust indent [ci skip] * string.c (rb_str_enumerate_grapheme_clusters): adjust indent. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-31 08:07:59 +00:00
naruse	df49fc659e	String#each_grapheme_cluster and String#grapheme_clusters added to enumerate grapheme clusters [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-31 06:35:28 +00:00
glass	b3c70d4c7e	string.c: fix potential bug in String#split * string.c (rb_str_split_m): fix potential bug when rb_memsearch() matches an octet in the middle of a multi-byte character sequence. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-28 10:55:37 +00:00
naruse	d067b263c7	Add optimization for creating zerofill string ``` require 'benchmark' n = 1 * 1024 * 1024 * 1024 Benchmark.bmbm do \|x\| x.report("*") { 0.chr * n } x.report("ljust") { String.new(capacity: n).ljust(n, "\0") } end ``` Before ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.358396 0.392753 0.751149 ( 1.134231) ljust 0.203277 0.389223 0.592500 ( 0.594816) -------------------------------- total: 1.343649sec user system total real * 0.282647 0.304600 0.587247 ( 0.589205) ljust 0.201834 0.283801 0.485635 ( 0.487617) ``` After ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.000522 0.000021 0.000543 ( 0.000534) ljust 0.208551 0.321030 0.529581 ( 0.542083) -------------------------------- total: 0.530124sec user system total real * 0.000069 0.000006 0.000075 ( 0.000069) ljust 0.206698 0.301032 0.507730 ( 0.517674) ``` git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59614 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-17 16:34:40 +00:00
nobu	2b770b4674	string.c: improve String#scan * string.c (rb_str_scan): improve the performance by 50% for a string pattern, and by 10% for a regexp pattern. get rid of making MatchData in middle, which is not used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-04 04:39:53 +00:00
nobu	8458e709ab	string.c: rb_str_initialize * string.c (rb_str_initialize): new function to (re)initialize a string with data and encoding. extracted from rb_external_str_new_with_enc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-30 02:56:29 +00:00
sonots	510957df33	string.c: add String#delete_suffix and String#delete_suffix! to remove trailing suffix [Feature #13665] [Fix GH-1661] * string.c (rb_str_delete_suffix_bang): add a new method to remove suffix destructively. * string.c (rb_str_delete_suffix): add a new method to remove suffix non-destructively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59377 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-20 16:29:19 +00:00
normal	0493b1ce3a	revert r59359, r59356, r59355, r59354 These caused numerous CI failures I haven't been able to reproduce [ruby-core:82102] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-19 01:35:04 +00:00
normal	86e266bb60	string: preserve taint flag with String#-@ (uminus) * string.c (tainted_fstr_update): move up (rb_fstring): support registering tainted strings (register_fstring_tainted): extract from rb_fstring_existing0 (rb_tainted_fstring_existing): use register_fstring_tainted instead git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-18 09:52:55 +00:00
normal	d04c085b3c	hash: keep fstrings of tainted strings for string keys The same hash keys may be loaded from tainted data sources frequently (e.g. parsing headers from socket or loading YAML data from a file). If a non-tainted fstring already exists (because the application expects the hash key), cache and deduplicate the tainted version in the new tainted_frozen_strings table. For non-embedded strings, this also allows sharing with the underlying malloc-ed data. * vm_core.h (rb_vm_struct): add tainted_frozen_strings * vm.c (ruby_vm_destruct): free tainted_frozen_strings (Init_vm_objects): initialize tainted_frozen_strings (rb_vm_tfstring_table): accessor for tainted_frozen_strings * internal.h: declare rb_fstring_existing, rb_vm_tfstring_table * hash.c (fstring_existing_str): remove (moved to string.c) (hash_aset_str): use rb_fstring_existing * string.c (rb_fstring_existing): new, based on fstring_existing_str (tainted_fstr_update): new (rb_fstring_existing0): new, based on fstring_existing_str (rb_tainted_fstring_existing): new, special case for tainted strings (rb_str_free): delete from tainted_frozen_strings table * test/ruby/test_optimization.rb (test_hash_reuse_fstring): new test [ruby-core:82012] [Bug #13737] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-18 02:29:59 +00:00
rhe	06a3a10acf	string.c: preserve coderange in String#setbyte Fix a wrong jump so replacing a byte in an ASCII-only string with an ASCII character won't clear the coderange. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-06 07:21:17 +00:00
rhe	2c8cec96cc	string.c: remove dead code in str_fill_term() The length of a string never exceeds the capacity. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-06 07:21:16 +00:00
sonots	10082360b9	string.c: add String#delete_prefix and String#delete_prefix! to remove leading substr [Feature #12694] [fix GH-1632] * string.c (rb_str_delete_prefix_bang): add a new method to remove prefix destructively. * string.c (rb_str_delete_prefix): add a new method to remove prefix non-destructively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59132 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-06-21 07:43:26 +00:00
nobu	f5052d45be	string.c: check just before modification * string.c (rb_str_chomp_bang): check if modifiable after checking an argument and just before modification, as it can get frozen during the argument conversion to String. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-06-18 04:38:01 +00:00
stomar	038c2e52d8	string.c: docs for String#split * string.c: [DOC] clarify docs for String#split when called with limit and capture groups. Reported by Cichol Tsai. [ruby-core:81505] [Bug #13621] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-06-02 21:29:27 +00:00
watson1978	d0015e4ac6	Improve performance of implicit type conversion To convert the object implicitly, it has had two parts in convert_type() which are 1. looking up the method's id 2. calling the method Seems that strncmp() and strcmp() in convert_type() are slightly heavy to look up the method's id for type conversion. This patch will add and use internal APIs (rb_convert_type_with_id, rb_check_convert_type_with_id) to call the method without looking up the method's id when converting the object. Array#flatten -> 19 % up Array#+ -> 3 % up [ruby-dev:50024] [Bug #13341] [Fix GH-1537] ### Before Array#flatten 104.119k (± 1.1%) i/s - 525.690k in 5.049517s Array#+ 1.993M (± 1.8%) i/s - 10.010M in 5.024258s ### After Array#flatten 124.005k (± 1.0%) i/s - 624.240k in 5.034477s Array#+ 2.058M (± 4.8%) i/s - 10.302M in 5.019328s ### Test Code require 'benchmark/ips' class Foo def to_ary [1,2,3] end end Benchmark.ips do \|x\| ary = [] 100.times { \|i\| ary << i } array = [ary] x.report "Array#flatten" do \|i\| i.times { array.flatten } end x.report "Array#+" do \|i\| obj = Foo.new i.times { array + obj } end end git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-31 12:30:57 +00:00
nobu	a77cb8c80f	string.c: adjust style [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-26 06:39:06 +00:00
k0kubun	592c3f9b10	string.c: Optimize String#concat when argc is 1 Optimize performance regression introduced in r56021. * Benchmark (i7-4790K @ 4.00GHz, x86_64 GNU/Linux) Benchmark.ips do \|x\| x.report("String#concat (1)") { "a".concat("b") } if RUBY_VERSION >= "2.4.0" x.report("String#concat (2)") { "a".concat("b", "c") } end end * Ruby 2.3 Calculating ------------------------------------- String#concat (1) 6.003M (± 5.2%) i/s - 30.122M in 5.031646s * Ruby 2.4 (Before this patch) Calculating ------------------------------------- String#concat (1) 4.458M (± 8.9%) i/s - 22.298M in 5.058084s String#concat (2) 3.660M (± 5.6%) i/s - 18.314M in 5.020527s * Ruby 2.4 (After this patch) Calculating ------------------------------------- String#concat (1) 6.448M (± 5.2%) i/s - 32.215M in 5.010833s String#concat (2) 3.633M (± 9.0%) i/s - 18.056M in 5.022603s [fix GH-1631] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-25 11:14:40 +00:00
nobu	7db534a20c	vm_insnhelper.c: rb_eql_opt should call eql? * vm_insnhelper.c (rb_eql_opt): should call #eql? on Float and String, not #==. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-25 05:29:35 +00:00
normal	2079b71000	string.c: fix String#crypt leak introduced in r58866 * string.c (rb_str_crypt): define LARGE_CRYPT_DATA when allocating git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 21:26:14 +00:00
nobu	92261511b6	string.c: for small crypt_data * string.c (rb_str_crypt): struct crypt_data defined in missing/crypt.h is small enough. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 06:55:09 +00:00
ko1	9e1624cfe8	Add debug counters. * debug_counter.h: add the following counters to measure object types. obj_free: freed count obj_str_ptr: freed count of Strings that have an extra buffer. obj_str_embed: freed count of Strings that don't have an extra buffer. obj_str_shared: freed count of Strings that have a shared extra buffer. obj_str_nofree: freed count of Strings that are marked as nofree. obj_str_fstr: freed count of Strings that are marked as fstr. obj_ary_ptr: freed count of Arrays that have an extra buffer. obj_ary_embed: freed count of Arrays that don't have an extra buffer. obj_obj_ptr: freed count of Objects (T_OBJECT) that have an extra buffer. obj_obj_embed: freed count of Objects that don't have an extra buffer. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 06:46:44 +00:00
normal	144e067007	string.c (rb_str_crypt): fix excessive stack use with crypt_r "struct crypt_data" is 131232 bytes on x86-64 GNU/Linux, making it unsafe to use tiny Fiber stack sizes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58864 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 03:01:44 +00:00
stomar	40bc846bf8	string.c: fix String#{casecmp,casecmp?} for non-string arguments * string.c: make String#{casecmp,casecmp?} return nil for non-string arguments instead of raising a TypeError. * test/ruby/test_string.rb: add tests. Reported by Marcus Stollsteimer. Based on a patch by Shingo Morita. [ruby-core:80145] [Bug #13312] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58837 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-21 19:28:48 +00:00
nobu	f3a49ebc92	string.c: cut down intermediate string * string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58709 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-14 00:21:00 +00:00
nobu	7323de517d	revert r58703 & r58705 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58708 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 16:04:05 +00:00
nobu	a3960d0a60	string.c: fix up r58703 * string.c (rb_external_str_new_with_enc): fix the case of conversion failure. when conversion failed for some reason, just ignores the default internal encoding and returns in the given encoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58705 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 14:20:19 +00:00
nobu	7678c0d702	string.c: cut down intermediate string * string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58703 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 13:34:39 +00:00
nobu	7d52ad9e17	string.c: fix one-off bug * string.c (rb_str_cat_conv_enc_opts): fix one-off bug. `ofs` equals `olen` when appending at the end. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58702 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 12:31:01 +00:00
nobu	2c0baa97a9	string.c: remove bare Unicode. * string.c (rb_str_unicode_normalize): remove bare Unicode. do not assume that all compilers can handle UTF-8. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58688 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-12 15:29:55 +00:00
stomar	8303bc6cf6	string.c: docs for String#match * string.c: [DOC] add example for String#match with pos argument. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-11 18:59:45 +00:00
stomar	4a0eaeeb4d	string.c: docs for Symbol * string.c: [DOC] adopt call-seq's for Symbol#{match,match?} from String methods; other small improvements for Symbol docs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58668 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-11 18:58:27 +00:00
stomar	477cb2b8d8	string.c: docs for Symbol#{match,match?} * string.c: [DOC] mention pos argument for Symbol#{match,match?}. Patch by Yuki Kurihara (ksss). [Fix GH-1606] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-11 18:56:32 +00:00
nobu	208240870a	string.c: fix r58618 * string.c (unicode_normalize_common): aggregation type cannot be initialized with dynamic values, in C89. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-09 14:11:46 +00:00
duerst	a4301ec214	replace hand-written argument check by call to rb_scan_args in unicode_normalize_common In string.c, replace hand-written argument count check by call to rb_scan_args. This allows to use rb_funcallv once, rather than using rb_funcall twice. Thanks to Hanmac (Hans Mackowiak) for the idea, see https://bugs.ruby-lang.org/issues/11078#note-7. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-09 11:13:45 +00:00
nobu	d7f2c72322	string.c: fix types * string.c (id_normalize, id_normalized_p): fix types, IDs should be ID. * string.c (unicode_normalize_common): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-06 01:01:52 +00:00
stomar	6ae3cf02f7	string.c: [DOC] improve docs for String.new git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58569 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 13:19:43 +00:00
ktsj	4e6daedc70	string.c: [DOC] Properly refer to keyword argument by its name git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 08:59:01 +00:00
duerst	f47033e237	refactor common parts of unicode normalization functions into unicode_normalize_common In string.c, refactor the common parts (requiring of unicode_normalize/normalize.rb, check of number of arguments) of the unicode normalization functions (rb_str_unicode_normalize, rb_str_unicode_normalize_bang, rb_str_unicode_normalized_p) into the new function unicode_normalize_common. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58558 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 02:16:27 +00:00
duerst	140560e4ee	move definition of String#unicode_normalized? to C to make sure it is documented * lib/unicode_normalize.rb: Remove definition of String#unicode_normalized? (including documentation). Leave a comment explaining that the file is now empty. * string.c: Define String#unicode_normalized? in rb_str_unicode_normalized_p in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalized? to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58555 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 02:00:19 +00:00
duerst	90ab1ee023	move definition of String#unicode_normalize! to C to make sure it is documented * lib/unicode_normalize.rb: Remove definition of String#unicode_normalize! (including documentation) * string.c: Define String#unicode_normalize! in rb_str_unicode_normalize_bang in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalize! to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58553 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 01:36:52 +00:00
svn	7abf8bae23	* remove trailing spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-03 12:18:37 +00:00
duerst	5fee67c9ba	move definition of String#unicode_normalize to C to make sure it is documented * lib/unicode_normalize.rb: Remove definition of String#unicode_normalize (including documentation) * string.c: Define String#unicode_normalize in rb_str_unicode_normalize in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalize to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58550 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-03 12:18:37 +00:00
nobu	cc68af3d02	string.c: improve insertion performance * string.c (rb_str_splice_0): improve performance of single byte optimizable cases, insertion 7bit string to 7bit string. [ruby-dev:49984] [Bug #13228] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-04-17 13:38:34 +00:00
sorah	31a755e4f2	string.c: Suppress logical-op-parentheses warning * string.c(rb_str_upcase_bang): Suppress logical-op-parentheses warning Patch by Fukuo Kadota <fukuo-kadota@cookpad.com>, Closes [GH-1570] [Bug #13387]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58211 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-29 11:33:59 +00:00
nobu	f0e6e47999	string.c: use the usable size * string.c (rb_str_change_terminator_length): when called after the content has been copied, old terminator length no longer makes sense. use the whole usable size instead of capacity without terminator. [ruby-core:80257] [Bug #13339] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58042 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-21 05:28:38 +00:00
duerst	a5330fa9ea	fix accidental reversal of r57997 in r58000 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58009 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-18 01:35:03 +00:00
duerst	67c1197835	clarify 'codepoint' in documentation of String#each_codepoint Make sure it's clear that the returned values are not Unicode codepoints for encodings other than UTF-8/UTF-16(BE\|LE)/UTF-32(BE\|LE). [ci skip] [Bug #13321] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-17 02:24:53 +00:00
normal	9eb94b4dc1	deduplicate static rb_str_format format strings Anybody who hits these code paths can hit them again in the future, so try deduplicating across multiple runs of these methods to reduce garbage. * string.c (str_upto_each): fstring on "%.d" strftime.c (rb_strftime_with_timespec): fstring on "%0*d" git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57997 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-17 00:55:55 +00:00
nobu	8c661ba264	string.c: shortcut argument check * string.c (str_casecmp, str_casecmp_p): split to skip argument check when it is a String certainly. * string.c (sym_casecmp, sym_casecmp_p): shortcut argument checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-15 07:57:11 +00:00
nobu	9fa56026e5	string.c: use rb_check_string_type * string.c (rb_str_cmp_m): use rb_check_string_type for check and conversion, instead of calling the conversion method directly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57965 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-14 03:42:43 +00:00
stomar	e6dec8f92e	docs for Symbol#casecmp and Symbol#casecmp? * string.c: [DOC] improve docs of Symbol#casecmp and Symbol#casecmp? according to the similar String methods; fix RDoc markup and typos; fix call-seq's for Symbol#{upcase,downcase,capitalize,swapcase}. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57963 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-13 20:20:40 +00:00
nobu	bd17e25588	string.c (rb_str_set_len): pathological check git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57961 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-13 11:47:45 +00:00
nobu	16e804117c	string.c: $; is a GC-root * string.c (Init_String): $; must be a GC-root, not to be collected. [ruby-core:79582] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57958 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-13 09:12:05 +00:00
stomar	605b472d2d	docs for String#casecmp and String#casecmp? * string.c: [DOC] specify when String#casecmp and String#casecmp? return nil; modify examples to better show difference to <=>; fix RDoc markup and typos. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-11 20:01:55 +00:00
normal	e064879e5a	string.c (str_uminus): update doc for deduplication As of r57698, String#-@ can return pre-existing strings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-08 21:24:24 +00:00
nobu	e7f4d90930	fix paren * string.c (str_byte_substr): fix misplaced parenthesis at r56155. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57809 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-08 08:19:56 +00:00
kazu	d0708e9e2a	string.c: [DOC] Fix a typo in String#dump [Fix GH-1531][ci skip] Author: Alex Semyonov <alex@semyonov.us> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57802 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 13:04:39 +00:00
nobu	d69d98f61a	string.c: negation of LONG_MIN * string.c (rb_str_update): do not use negation of LONG_MIN, which is negative too. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 09:13:41 +00:00
nobu	f4d13801b6	string.c: fix integer overflow * string.c (str_byte_substr): fix another integer overflow which can happen only when SHARABLE_MIDDLE_SUBSTRING is enabled. [ruby-core:79951] [Bug #13289] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 09:07:57 +00:00
nobu	72f8df158f	string.c: fix integer overflow * string.c (rb_str_subpos): fix integer overflow which can happen only when SHARABLE_MIDDLE_SUBSTRING is enabled. incorporate https://github.com/mruby/mruby/commit/7db0786abdd243ba031e24683f git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57797 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 05:48:15 +00:00
stomar	3ca1cbecc6	string.c: [DOC] fix doc formatting for String#==, #=== git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-04 20:08:04 +00:00
stomar	a698d99703	string.c: restore documentation for String#<< * string.c: [DOC] restore documentation for String#<< which became undocumented with r56021; fix a typo. [ruby-core:79865] [Bug #13268] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57758 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-02 10:31:56 +00:00
normal	4e90dcc9d7	string.c (str_uminus): deduplicate strings This exposes the rb_fstring internal function to return a deduped and frozen string when a non-frozen string is given. This is useful for writing all sorts of record processing key values maybe stored, but certain keys and values are often duplicated at a high frequency, so memory savings can be noticeable. Use cases are many: * email/NNTP header processing There are some standard header keys everybody uses (From/To/Cc/Date/Subject/Received/Message-ID/References/In-Reply-To), as well as common ones specific to a certain lists: (ruby-core has X-Redmine-* headers) It is also useful to dedupe values, as most inboxes have multiple messages from the same sender, or MUA. * package management systems - things like RubyGems stores identical strings for licenses, dependency names, author names/emails, etc * HTTP headers/trailers - standard headers (Host/Accept/Accept-Encoding/User-Agent/...) are common, but there are also uncommon ones. Values may be deduped, as well, as it is likely a user agent will make multiple/parallel requests to the same server. * version control systems - this can be useful for deduplicating names of frequent committers (like "nobu" :) In linux.git and git.git, there are also common trailers such as Signed-Off-By/Acked-by/Reviewed-by/Fixes/... as well as less common ones. * audio metadata - There are commonly used tags (Artist/Album/Title/Tracknumber), but Vorbis comments allow arbitrary key values to be stored. Music collections contain songs by the same artist or multiple songs from the same album, so deduplicating values will be helpful there, too. * JSON, YAML, XML, HTML processing Certain fields, tags and attributes are commonly used across the same and multiple documents There is no security concern in this being a DoS vector by causing immortal strings. The fstring table is not a GC-root and not walked during the mark phase. GC-able dynamic symbols since Ruby 2.2 are handled in the same manner, and that implementation also relies on the non-immortality of fstrings. [Feature #13077] [ruby-core:79663] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-24 01:01:23 +00:00
nobu	7de42daa21	string.c: assertion * string.c (str_shared_replace): use RUBY_ASSERT for pre-condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-14 12:29:56 +00:00
nobu	957e6e4b14	initialize variables * string.c (rb_str_enumerate_lines): initialize conditionally used variable. * thread.c (rb_fd_no_init): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-14 07:52:30 +00:00
nobu	959aac29e7	suppress warnings * string.c (rb_str_enumerate_lines): hint to suppress a maybe-uninitialized warning by gcc. * thread.c (rb_fd_no_init): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-13 05:44:15 +00:00
normal	a6b9b360ce	doc: Add example for Symbol#to_s * string.c: add example for Symbol#to_s. The docs for Symbol#to_s only include an example for Symbol#id2name, but not for #to_s which is an alias; the docs should include examples for both methods. From: Marcus Stollsteimer <sto.mar@web.de> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57536 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-05 00:22:03 +00:00
normal	b0cfa46bce	symbol.c (rb_id2str): eliminate branch to set class Since the fstring table encompasses all strings in the symbol table, we may reuse the fstring table walk to set the class and eliminate the branch in rb_id2str. * string.c (Init_String): use rb_cString immediately after definition * symbol.c (rb_id2str): eliminate branch to set class git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57521 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-03 23:55:06 +00:00
normal	5c988df0dd	string.c (rb_str_tmp_frozen_release): release embedded strings Handle the embedded case first, since we may have an embedded duplicate and non-embedded original string. * string.c (rb_str_tmp_frozen_release): handled embedded strings * test/ruby/test_io.rb (test_write_no_garbage): new test [ruby-core:78898] [Bug #13085] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57471 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-30 21:54:32 +00:00
normal	9c4ba969a5	io.c: recycle garbage on write * string.c (STR_IS_SHARED_M): new flag to mark shared multiple times (STR_SET_SHARED): set STR_IS_SHARED_M (rb_str_tmp_frozen_acquire, rb_str_tmp_frozen_release): new functions (str_new_frozen): set/unset STR_IS_SHARED_M as appropriate * internal.h: declare new functions * io.c (fwrite_arg, fwrite_do, fwrite_end): new (io_fwrite): use new functions Introduce rb_str_tmp_frozen_acquire and rb_str_tmp_frozen_release to manage a hidden, frozen string. Reuse one bit of the embed length for shared strings as STR_IS_SHARED_M to indicate a string has been shared multiple times. In the common case, the string is only shared once so the object slot can be reclaimed immediately. minimum results in each 3 measurements. (time and size) Execution time (sec) name trunk built io_copy_stream_write 0.682 0.254 io_copy_stream_write_socket 1.225 0.751 Speedup ratio: compare with the result of `trunk' (greater is better) name built io_copy_stream_write 2.680 io_copy_stream_write_socket 1.630 Memory usage (last size) (B) name trunk built io_copy_stream_write 95436800.000 6512640.000 io_copy_stream_write_socket 117628928.000 7127040.000 Memory consuming ratio (size) with the result of `trunk' (greater is better) name built io_copy_stream_write 14.654 io_copy_stream_write_socket 16.505 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-30 20:40:18 +00:00
shugo	d33726b837	string.c: rindex(//) should set $~. This seems a bug introduced by r520 (1.4.0). [ruby-core:79110] [Bug #13135] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57374 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-19 08:13:03 +00:00
nobu	803621f6d7	file.c: refine message * file.c (rb_get_path_check_convert): refine the error message when the path name contains null byte. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57336 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-16 02:43:55 +00:00
nobu	9029464175	string.c: replacement and block * string.c (rb_enc_str_scrub): only one of replacement and block is allowed. [ruby-core:79038] [Bug #13119] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57304 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-11 02:31:02 +00:00
nobu	a3aa4da773	string.c: yield invalid part * string.c (rb_enc_str_scrub): yield the invalid part only with ASCII-incompatible. [ruby-core:79039] [Bug #13120] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57303 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-11 02:18:45 +00:00
nobu	c763f0fb9b	string.c: block for scrub with ASCII-incompatible * string.c (rb_enc_str_scrub): honor the given block with ASCII-incompatible encoding. [ruby-core:79039] [Bug #13120] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57302 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-11 01:03:37 +00:00
nobu	10bd48e402	string.c: CRLF in paragraph mode * string.c (rb_str_enumerate_lines): allow CRLF to separate paragraphs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-25 23:56:55 +00:00
nobu	091f99b4b9	string.c: consistent paragraph mode with IO * string.c (rb_str_enumerate_lines): in paragraph mode, do not include newlines which separate paragraphs, so that it will be consistent with IO#each_line. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57184 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-25 23:50:09 +00:00
nobu	d124fa3a35	string.c: suppress a warning * string.c (rb_str_casecmp_p): [DOC] use Unicode escape form to get rid of warning C4819 by Microsoft Visual C++. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-22 22:16:19 +00:00
rhe	44ba4fd362	string.c: add missing size_t cast Add size_t cast to avoid signed integer overflow. r56157 ("string.c: avoid signed integer overflow", 2016-09-13) missed this. Suppresses UBSan. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57122 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-20 06:53:45 +00:00
nobu	5d6292809f	no crypt.h on FreeBSD 12 * string.c (crypt.h): crypt_r() was added in FreeBSD 12.0 but is declared in unistd.h. [ruby-core:78664] [Bug #13038] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-16 05:05:42 +00:00
nobu	75755ef159	fix chomping newline only line * string.c (chomp_newline): fix chomping newline only line. rb_enc_prev_char return NULL if no previous character and must not call rb_enc_ascget on it. a patch by Ary Borenszweig <asterite AT gmail.com> at [ruby-core:78666]. [Bug #13037] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-16 01:12:09 +00:00
nobu	c95388a58d	string.c: fix method name in rdoc [ci skip] * string.c (rb_str_equal): [DOC] fix fallback method name. the peer's == method will be used, not ===. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57056 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-12 07:12:07 +00:00
nobu	6dd5ee752a	String#match? and Symbol#match? * string.c (rb_str_match_m_p): inverse of Regexp#match?. based on the patch by Herwin Weststrate <herwin@snt.utwente.nl>. [Fix GH-1483] [Feature #12898] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57053 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-12 02:56:12 +00:00
nobu	d95f5bc81a	string.c: chomp option * string.c (rb_str_enumerate_lines): implement chomp option. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56972 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-03 14:18:03 +00:00
duerst	dacf977a42	Fix/improve documentation of String/Symbol#casecmp[?] Fix documentation of String#casecmp? (examples didn't have the '?'). Add an example with non-ASCII characters. Clarify that casecmp, unlike casecmp?, only does case-insensitivity on A-Z/a-z. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56926 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-29 10:45:54 +00:00
nobu	07fb750fd0	string.c: use xmalloc * string.c (rb_str_casemap): use xmalloc simply instead of ALLOC_N. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56920 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-29 03:06:01 +00:00
nobu	78b0d7ac1c	string.c: fix zero-length array * string.c (mapping_buffer): get rid of zero-length array member, which is not a part of C90. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-28 13:16:00 +00:00
nobu	196e8b4480	string.c: enable rdoc * string.c (rb_str_casecmp_p): [DOC] move forward declaration of rb_str_downcase to enable rdoc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56913 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-28 09:37:19 +00:00
duerst	ad619e02c4	implement String/Symbol#casecmp? including Unicode case folding * string.c: Implement String#casecmp? and Symbol#casecmp? by using String#downcase :fold for Unicode case folding. This does not include options such as :turkic, because these currently cannot be combined with the :fold option. This implements feature #12786. * test/ruby/test_string.rb/test_symbol.rb: Tests for above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-28 08:37:32 +00:00
nobu	a2144bd72a	chomp option * io.c (extract_getline_opts): extract chomp option. [Feature #12553] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56581 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-05 07:28:09 +00:00
nobu	4e44f6ef86	[DOC] replace Fixnum with Integer [ci skip] * numeric.c: [DOC] update document for Integer class. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-26 06:11:23 +00:00
nobu	3ba353fc1a	Fixed typo [ci skip] * string.c (rb_str_sub, rb_str_gsub): [DOC] 'backlash' should read 'backslash'. [Fix GH-1461] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56460 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-21 02:34:19 +00:00
usa	c2dd2d268e	* internal.h (ST2FIX): new macro to convert st_index_t to Fixnum. a hash value of Object might be Bignum, but it causes much trouble, especially when the Object is used as a key of a hash, so I've given up on doing so. * array.c (rb_ary_hash): use above macro. * bignum.c (rb_big_hash): ditto. * hash.c (rb_obj_hash, rb_hash_hash): ditto. * numeric.c (rb_dbl_hash): ditto. * proc.c (proc_hash): ditto. * re.c (rb_reg_hash, match_hash): ditto. * string.c (rb_str_hash_m): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-04 16:25:01 +00:00
nobu	63d77c2a1b	string.c: negative hash * string.c (rb_str_hash_m): hash values may be negative. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56321 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-01 22:51:23 +00:00
usa	7a44019031	* string.c (rb_str_hash_m): st_index_t is not guaranteed as the same size with int, and of course also not guaranteed the value can be Fixnum. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-01 17:06:21 +00:00
nobu	8d501ec021	string.c: fast path of lstrip_offset * string.c (lstrip_offset): add a fast path in the case of single byte optimizable strings, as well as rstrip_offset. [ruby-core:77392] [Feature #12788] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56250 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-26 05:10:56 +00:00
rhe	537fea9921	string.c: fix integer overflow in enc_strlen() and rb_enc_strlen_cr() * string.c (enc_strlen, rb_enc_strlen_cr): Avoid signed integer overflow. The result type of a pointer subtraction may have the same size as long. This fixes String#size returning a negative value on an i686-linux environment: str = "\x00" * ((1<<31)-2); str.slice!(-3, 3); str.force_encoding("UTF-32BE"); str << 1234; p str.size git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56247 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-26 02:09:50 +00:00
shyouhei	2fc5210f31	* internal.h (WARN_UNUSED_RESULT): moved to configure.in, to actually check its availability rather to check GCC's version. * configure.in (WARN_UNUSED_RESULT): moved to here. * configure.in (RUBY_FUNC_ATTRIBUTE): change function declaration to return int rather than void, because it makes no sense for a warn_unused_result attributed function to return void. Funny thing however is that it also makes no sense for noreturn attributed function to return int. So there is a fundamental conflict between them. While I tested this, I confirmed both GCC 6 and Clang 3.8 prefers int over void to correctly detect necessary attributes under this setup. Maybe subject to change in future. * internal.h (UNINITIALIZED_VAR): renamed to MAYBE_UNUSED, then moved to configure.in for the same reason we move WARN_UNUSED_RESULT. * configure.in (MAYBE_UNUSED): moved to here. * internal.h (__has_attribute): deleted, because it has no use now. * string.c (rb_str_enumerate_lines): refactor macro rename. * string.c (rb_str_enumerate_bytes): ditto. * string.c (rb_str_enumerate_chars): ditto. * string.c (rb_str_enumerate_codepoints): ditto. * thread.c (do_select): ditto. * vm_backtrace.c (rb_debug_inspector_open): ditto. * vsnprintf.c (BSD_vfprintf): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-16 06:15:55 +00:00
rhe	00fcd967d9	string.c: avoid signed integer overflow The behavior on signed integer overflow is undefined. On platforms with sizeof(long)==4, 'len + termlen' can overflow fairly easily, where len is the string length and termlen is the terminator length. So, prevent the integer overflow by avoiding adding to a string length, or casting to size_t before adding where the total size is passed to {RE,}ALLOC(). string.c (STR_HEAP_SIZE, RESIZE_CAPA_TERM, str_new0, rb_str_buf_new, str_shared_replace, rb_str_init, str_make_independent_expand, rb_str_resize): Avoid overflow by casting the length to size_t. size_t should be able to represent LONG_MAX+termlen. * string.c (rb_str_modify_expand): Check that the new length is in the range of long before resizing. Also refactor to use RESIZE_CAPA_TERM macro. * string.c (str_buf_cat): Fix so that it does not create a negative length String. Also fix the condition for 'string sizes too big', the total length can be up to LONG_MAX. * string.c (rb_str_plus): Check the resulting String length does not exceed LONG_MAX. * string.c (rb_str_dump): Fix integer overflow. The dump result will be longer than the original String. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56157 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 12:33:16 +00:00
rhe	eaba77154f	string.c: rename STR_EMBEDABLE_P to STR_EMBEDDABLE_P * string.c (STR_EMBEDDABLE_P): Renamed from STR_EMBEDABLE_P(). And use it in more places. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56155 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 12:28:54 +00:00
nobu	2608f7d9c5	string.c: STR_EMBEDABLE_P * string.c (STR_EMBEDABLE_P): extract the predicate macro to tell if the given length is capable in an embedded string, and fix possible integer overflow. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56151 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 12:11:57 +00:00
nobu	8a64787632	string.c: fix integer overflow * string.c (rb_str_change_terminator_length): fix integer overflow in the case growing the terminator length and the string length is around LONG_MAX. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56149 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 08:12:54 +00:00
rhe	be3baa4380	string.c: fix buffer overflow check condition in rb_str_set_len() * string.c (rb_str_set_len): The buffer overflow check is wrong. The space for termlen is allocated outside the capacity returned by rb_str_capacity(). This fixes r41920 ("string.c: multi-byte terminator", 2013-07-11). [ruby-core:77257] [Bug #12757] * test/-ext-/string/test_set_len.rb (test_capacity_equals_to_new_size): Test for this change. Applying only the test will trigger [BUG]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 07:08:15 +00:00
akr	577de1e93d	replace fixnum by integer in documents. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56102 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-08 04:57:49 +00:00
nobu	9387ff7315	multiple arguments * array.c (rb_ary_concat_multi): take multiple arguments. based on the patch by Satoru Horie. [Feature #12333] * string.c (rb_str_concat_multi, rb_str_prepend_multi): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-27 01:26:17 +00:00
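
Several of the String changes listed above are user-visible: String#casecmp? (ad619e02c4, documented in dacf977a42), String#match? (6dd5ee752a), and the chomp option for line enumeration (d95f5bc81a). The following is a minimal Ruby sketch of how they behave, assuming Ruby 2.4 or later. The helper name `string_api_demo` is hypothetical and is not part of ruby/ruby; it only collects the results in one place.

```ruby
# Illustrative only: string_api_demo is a hypothetical helper, not part of ruby/ruby.
def string_api_demo
  lower = "\u00e4\u00f6\u00fc" # a-umlaut, o-umlaut, u-umlaut (lowercase)
  upper = "\u00c4\u00d6\u00dc" # the same letters in uppercase

  # match? only reports whether the pattern matches; it does not set $~.
  matched = "hello".match?(/ll/)

  {
    # casecmp folds only A-Z/a-z, so ASCII-only strings compare equal (0)...
    casecmp_ascii: "aBcDeF".casecmp("abcdef"),
    # ...while non-ASCII letters differing only in case compare as unequal (nonzero).
    casecmp_non_ascii: lower.casecmp(upper),
    # casecmp? uses Unicode case folding and returns a boolean.
    casecmp_p: lower.casecmp?(upper),
    match_p: matched,
    # Still nil here, because match? left the last-match data untouched.
    last_match: Regexp.last_match,
    # chomp: true strips the line terminator from each yielded line.
    chomped_lines: "a\nb\n".each_line(chomp: true).to_a,
  }
end

p string_api_demo
```

The sketch exercises only the public String methods; it says nothing about how string.c implements them.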


1624 commits