github/ruby - ruby

Граф коммитов

Автор	SHA1	Сообщение	Дата
nobu	c05fa459bb	quote symbols * sprintf.c (ruby__sfvextra): quote symbols as identifiers. * string.c (rb_id_quote_unprintable): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62747 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-03-14 02:35:51 +00:00
k0kubun	288b44328d	Export some missing symbols for MJIT tool/ruby_vm/views/_insn_name_info.erb: on Linux, rb_vm_insn_name_offset was needed to compile with --jit-debug (Usually --jit-debug requires more symbols than the situation without --jit-debug because -O2 skips some functions to compile). vm.c: when running transform_mjit_header.rb with --jit-wait, rb_source_location_cstr was repoted to be missing. string.c: ditto, for rb_str_eql numeric.c: ditto, for rb_float_eql git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62313 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-02-08 13:54:37 +00:00
k0kubun	ed935aa5be	mjit_compile.c: merge initial JIT compiler which has been developed by Takashi Kokubun <takashikkbn@gmail> as YARV-MJIT. Many of its bugs are fixed by wanabe <s.wanabe@gmail.com>. This JIT compiler is designed to be a safe migration path to introduce JIT compiler to MRI. So this commit does not include any bytecode changes or dynamic instruction modifications, which are done in original MJIT. This commit even strips off some aggressive optimizations from YARV-MJIT, and thus it's slower than YARV-MJIT too. But it's still fairly faster than Ruby 2.5 in some benchmarks (attached below). Note that this JIT compiler passes `make test`, `make test-all`, `make test-spec` without JIT, and even with JIT. Not only it's perfectly safe with JIT disabled because it does not replace VM instructions unlike MJIT, but also with JIT enabled it stably runs Ruby applications including Rails applications. I'm expecting this version as just "initial" JIT compiler. I have many optimization ideas which are skipped for initial merging, and you may easily replace this JIT compiler with a faster one by just replacing mjit_compile.c. `mjit_compile` interface is designed for the purpose. common.mk: update dependencies for mjit_compile.c. internal.h: declare `rb_vm_insn_addr2insn` for MJIT. vm.c: exclude some definitions if `-DMJIT_HEADER` is provided to compiler. This avoids to include some functions which take a long time to compile, e.g. vm_exec_core. Some of the purpose is achieved in transform_mjit_header.rb (see `IGNORED_FUNCTIONS`) but others are manually resolved for now. Load mjit_helper.h for MJIT header. mjit_helper.h: New. This is a file used only by JIT-ed code. I'll refactor `mjit_call_cfunc` later. vm_eval.c: add some #ifdef switches to skip compiling some functions like Init_vm_eval. win32/mkexports.rb: export thread/ec functions, which are used by MJIT. include/ruby/defines.h: add MJIT_FUNC_EXPORTED macro alis to clarify that a function is exported only for MJIT. array.c: export a function used by MJIT. bignum.c: ditto. class.c: ditto. compile.c: ditto. error.c: ditto. gc.c: ditto. hash.c: ditto. iseq.c: ditto. numeric.c: ditto. object.c: ditto. proc.c: ditto. re.c: ditto. st.c: ditto. string.c: ditto. thread.c: ditto. variable.c: ditto. vm_backtrace.c: ditto. vm_insnhelper.c: ditto. vm_method.c: ditto. I would like to improve maintainability of function exports, but I believe this way is acceptable as initial merging if we clarify the new exports are for MJIT (so that we can use them as TODO list to fix) and add unit tests to detect unresolved symbols. I'll add unit tests of JIT compilations in succeeding commits. Author: Takashi Kokubun <takashikkbn@gmail.com> Contributor: wanabe <s.wanabe@gmail.com> Part of [Feature #14235] --- * Known issues * Code generated by gcc is faster than clang. The benchmark may be worse in macOS. Following benchmark result is provided by gcc w/ Linux. * Performance is decreased when Google Chrome is running * JIT can work on MinGW, but it doesn't improve performance at least in short running benchmark. * Currently it doesn't perform well with Rails. We'll try to fix this before release. --- * Benchmark reslts Benchmarked with: Intel 4.0GHz i7-4790K with 16GB memory under x86-64 Ubuntu 8 Cores - 2.0.0-p0: Ruby 2.0.0-p0 - r62186: Ruby trunk (early 2.6.0), before MJIT changes - JIT off: On this commit, but without `--jit` option - JIT on: On this commit, and with `--jit` option Optcarrot fps Benchmark: https://github.com/mame/optcarrot \| \|2.0.0-p0 \|r62186 \|JIT off \|JIT on \| \|:--------\|:--------\|:--------\|:--------\|:--------\| \|fps \|37.32 \|51.46 \|51.31 \|58.88 \| \|vs 2.0.0 \|1.00x \|1.38x \|1.37x \|1.58x \| MJIT benchmarks Benchmark: https://github.com/benchmark-driver/mjit-benchmarks (Original: https://github.com/vnmakarov/ruby/tree/rtl_mjit_branch/MJIT-benchmarks) \| \|2.0.0-p0 \|r62186 \|JIT off \|JIT on \| \|:----------\|:--------\|:--------\|:--------\|:--------\| \|aread \|1.00 \|1.09 \|1.07 \|2.19 \| \|aref \|1.00 \|1.13 \|1.11 \|2.22 \| \|aset \|1.00 \|1.50 \|1.45 \|2.64 \| \|awrite \|1.00 \|1.17 \|1.13 \|2.20 \| \|call \|1.00 \|1.29 \|1.26 \|2.02 \| \|const2 \|1.00 \|1.10 \|1.10 \|2.19 \| \|const \|1.00 \|1.11 \|1.10 \|2.19 \| \|fannk \|1.00 \|1.04 \|1.02 \|1.00 \| \|fib \|1.00 \|1.32 \|1.31 \|1.84 \| \|ivread \|1.00 \|1.13 \|1.12 \|2.43 \| \|ivwrite \|1.00 \|1.23 \|1.21 \|2.40 \| \|mandelbrot \|1.00 \|1.13 \|1.16 \|1.28 \| \|meteor \|1.00 \|2.97 \|2.92 \|3.17 \| \|nbody \|1.00 \|1.17 \|1.15 \|1.49 \| \|nest-ntimes\|1.00 \|1.22 \|1.20 \|1.39 \| \|nest-while \|1.00 \|1.10 \|1.10 \|1.37 \| \|norm \|1.00 \|1.18 \|1.16 \|1.24 \| \|nsvb \|1.00 \|1.16 \|1.16 \|1.17 \| \|red-black \|1.00 \|1.02 \|0.99 \|1.12 \| \|sieve \|1.00 \|1.30 \|1.28 \|1.62 \| \|trees \|1.00 \|1.14 \|1.13 \|1.19 \| \|while \|1.00 \|1.12 \|1.11 \|2.41 \| Discourse's script/bench.rb Benchmark: https://github.com/discourse/discourse/blob/v1.8.7/script/bench.rb NOTE: Rails performance was somehow a little degraded with JIT for now. We should fix this. (At least I know opt_aref is performing badly in JIT and I have an idea to fix it. Please wait for the fix.) * JIT off Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 17 75: 18 90: 22 99: 29 home_admin: 50: 21 75: 21 90: 27 99: 40 topic_admin: 50: 17 75: 18 90: 22 99: 32 categories: 50: 35 75: 41 90: 43 99: 77 home: 50: 39 75: 46 90: 49 99: 95 topic: 50: 46 75: 52 90: 56 99: 101 *** JIT on Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 19 75: 21 90: 25 99: 33 home_admin: 50: 24 75: 26 90: 30 99: 35 topic_admin: 50: 19 75: 20 90: 25 99: 30 categories: 50: 40 75: 44 90: 48 99: 76 home: 50: 42 75: 48 90: 51 99: 89 topic: 50: 49 75: 55 90: 58 99: 99 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-02-04 11:22:28 +00:00
mame	552a5a993c	string.c (rb_str_format_m): Fix the example code of the doc Change `%08x` to `%016x` because of two reasons: * `%016x` demonstrates that we can use two or more digits here. * Currently, many people uses 64-bit environment. (I'm unsure if object_id is a good example here, though...) I'm unsure if git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-29 08:40:22 +00:00
nobu	9237049efe	string.c: clear substring code range * string.c (str_substr): substring of broken code range string may be valid or broken. patch by tommy (Masahiro Tomita) at [ruby-dev:50430] [Bug #14388]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62040 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-25 13:10:14 +00:00
shyouhei	dc1e6f17ba	sizeof(uintptr_t) != sizeof(uintptr_t *) Reported by mame. Thanks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-16 03:09:53 +00:00
shyouhei	39cfa67b4f	__builtin_assume_aligned for (foo ) casts These casts are guarded. Must be safe to assume alignments. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61829 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-15 02:35:18 +00:00
nobu	6b5e0bd98c	exclude flexible array size with old compilers git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61814 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-14 11:19:18 +00:00
mame	982e9e6235	string.c (struct mapping_buffer): Use FLEX_ARY_LEN git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61811 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-13 13:08:05 +00:00
usa	b87571100a	should cause preprocess error as other cases * string.c (NONASCII_MASK): should cause preprocess error immediately if the compiler does not satisfy our assumptions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61756 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-10 03:54:02 +00:00
nobu	e9cb552ec9	internal.h: remove dependecy on ruby/encoding.h git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61713 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-09 06:24:11 +00:00
nobu	ee85a6e72b	internal.h: remove dependecy on ruby/io.h git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61712 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-09 06:24:10 +00:00
nobu	e043ae7348	string.c: out-of-bounds access * string.c (rb_str_enumerate_lines): fix out-of-bounds access when record separator is longer than the last element. [Bug #14257] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61636 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-06 08:44:17 +00:00
shyouhei	beaf2ace87	ULL suffix is a C99ism Don't assume long long == 8 bytes. If you can assume C99, there are macros named UINT64_C and such for appropriate integer literal suffixes. If you can't, no way but do a bitwise or. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61594 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-04 07:51:17 +00:00
nobu	a1bbb2a780	Fix doc typo in Symbol#to_proc [Fix GH-1785] [ci skip] From: Dimitris Zorbas <dimitrisplusplus@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61588 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2018-01-04 00:44:40 +00:00
nobu	634a48c5c1	string.c: chomp rs at the end * string.c (rb_str_enumerate_lines): should chomp record separator only, but not a newline, at the end of the receiver as well as middle, if the separator is given. [ruby-core:84552] [Bug #14257] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-29 12:19:03 +00:00
kazu	c9bdee3e0d	[DOC] Fix typos in downcase [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61488 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-27 00:04:30 +00:00
nobu	e2479cc43f	encoding.c: rb_enc_find_index2 * string.c (str_undump): use rb_enc_find_index2 to find encoding by unterminated string. check the format before encoding name. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61396 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-22 01:03:17 +00:00
nobu	168c019998	string.c: fix memory leak git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61386 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 07:59:00 +00:00
naruse	05d1d29d1f	Don't allow mixed escape git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61381 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 05:09:17 +00:00
naruse	188d85934b	move dump format validation into parsing epilogue git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61380 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 05:09:16 +00:00
naruse	29c6ca423c	fix escapes in undump git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61379 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-21 05:08:57 +00:00
nobu	7c18db61a1	string.c: multiple codepoints * string.c (undump_after_backslash): fix multiple codepoints in braces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-16 00:30:52 +00:00
nobu	ae18c8f5b6	string.c: suppress warning * string.c (str_undump): suppress maybe-uninitialized warning by gcc 7 and later. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61289 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-16 00:03:51 +00:00
tadd	bbec11d329	Implement String#undump to unescape String#dump-ed string [Feature #12275] [close GH-1765] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61228 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-14 08:47:13 +00:00
nobu	a1692f7fdf	string.c: fix rb_external_str_new_with_enc * string.c (rb_external_str_new_with_enc): do not search non-ascii by NULL pointer. [ruby-core:84055] [Bug #14150] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60979 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-12-02 07:09:16 +00:00
nobu	73e41247b9	string.c: prefer rb_syserr_fail git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60761 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-14 03:02:58 +00:00
rhe	a82aaea719	string.c: fix up r60748 An #ifdef was missing in r60748 and build broke on systems without crypt_r(). https://rubyci.org/logs/rubyci.s3.amazonaws.com/unstable11s/ruby-trunk/log/20171112T162503Z.fail.html.gz git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60749 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-12 17:10:29 +00:00
rhe	0b845a8458	string.c: fix memory leak in String#crypt Use ALLOCV to allocate struct crypt_data for slightly cleaner and less error-prone code. It is currently possible it leaks when an invalid argument is passed to String#crypt or rb_str_new_cstr() fails to allocate memory. SIZEOF_CRYPT_DATA macro in missing/crypt.h is removed since it is not used any longer. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60748 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-12 15:55:04 +00:00
stomar	8b1c1c55a9	string.c: improve docs for String#{concat,<<} * string.c: [DOC] remove a misleading call-seq for String#concat, which suggests that all arguments must be Integers in this case; also clarify in the example that the receiver is modified; fix grammar for String#<<; move references to the end. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60712 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-07 20:15:59 +00:00
stomar	3262f4910f	string.c: fix typos * string.c: [DOC] fix typos in doxygen comments. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60707 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-11-07 20:11:09 +00:00
stomar	51b0230a9b	string.c: improve docs * string.c: [DOC] fix rdoc for cross reference; fix grammar. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60574 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-29 21:43:36 +00:00
watson1978	b03a44c4ac	string.c: Improve String#prepend performance if only one argument is given * string.c (rb_str_prepend_multi): Prepend the string without generating temporary String object if only one argument is given. This is very similar with https://github.com/ruby/ruby/pull/1634 String#prepend -> 47.5 % up [Fix GH-1670] [ruby-core:82195] [Bug #13773] * Before String#prepend 1.517M (± 1.8%) i/s - 7.614M in 5.019819s * After String#prepend 2.236M (± 3.4%) i/s - 11.234M in 5.029716s * Test code require 'benchmark/ips' Benchmark.ips do \|x\| x.report "String#prepend" do \|loop\| loop.times { "!".prepend("hello") } end end git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60480 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-27 14:55:03 +00:00
nobu	2050b50d85	string.c: comment layout [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-22 00:00:41 +00:00
svn	4b8c94dd84	* remove trailing spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60329 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:49:36 +00:00
sonots	84616bf979	* string.c: [DOC] Split rdoc of String#<< and String#concat [ci skip] Split String#<< and String#concat docs to reflect single and multiple arguments patched by MSP-Greg [fix GH-1614] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60328 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:49:35 +00:00
sonots	d9e11970f8	* string.c: Remove errant "the" in gsub documentation patched by jlmuir (J. Lewis Muir) [fix GH-1679] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60324 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:35:40 +00:00
nobu	80c50308f9	Improve performance of string interpolation This patch will add pre-allocation in string interpolation. By this, unecessary capacity resizing is avoided. For small strings, optimized `rb_str_resurrect` operation is faster, so pre-allocation is done only when concatenated strings are large. `MIN_PRE_ALLOC_SIZE` was decided by experimenting with local machine (x86_64-apple-darwin 16.5.0, Apple LLVM version 8.1.0 (clang - 802.0.42)). String interpolation will be faster around 72% when large string is created. * Before ``` Calculating ------------------------------------- Large string interpolation 1.276M (± 5.9%) i/s - 6.358M in 5.002022s Small string interpolation 5.156M (± 5.5%) i/s - 25.728M in 5.005731s ``` * After ``` Calculating ------------------------------------- Large string interpolation 2.201M (± 5.8%) i/s - 11.063M in 5.043724s Small string interpolation 5.192M (± 5.7%) i/s - 25.971M in 5.020516s ``` * Test code ```ruby require 'benchmark/ips' Benchmark.ips do \|x\| x.report "Large string interpolation" do \|t\| a = "Hellooooooooooooooooooooooooooooooooooooooooooooooooooo" b = "Wooooooooooooooooooooooooooooooooooooooooooooooooooorld" t.times do "#{a}, #{b}!" end end x.report "Small string interpolation" do \|t\| a = "Hello" b = "World" t.times do "#{a}, #{b}!" end end end ``` [Fix GH-1626] From: Nao Minami <south37777@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 23:21:05 +00:00
hsbt	2c27e52f8e	Add documentation for `chomp` option. https://github.com/ruby/ruby/pull/1717 Patch by @ksss [fix GH-1717] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 16:11:58 +00:00
sonots	ec30bc5930	* string.c (deleted_prefix_length, deleted_suffix_length): Add doxygen comment. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60254 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 10:33:25 +00:00
naruse	6187b0001b	[Feature #13712 ] String#start_with? supports regexp git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60234 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-21 06:51:01 +00:00
glass	8320be1007	string.c: avoid unnecessary call of str_strlen() * string.c (rb_strseq_index): refactor and avoid call of str_strlen() when offset == 0. it will improve performance of String#index and #include? * benchmark/bm_string_index.rb: benchmark for this change git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60086 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-10-01 13:44:49 +00:00
nobu	16759238ad	string.c: fix ASCII-only on succ * string.c (str_succ): clear coderange cache when no alpha-numeric character case, carried part may become ASCII-only. [ruby-core:83062] [Bug #13952] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60066 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-30 00:01:23 +00:00
nobu	2d42119903	string.c: ASCII-incompatible is not ASCII only * string.c (tr_trans): ASCII-incompatible encoding strings cannot be ASCII-only even if valid. [ruby-core:83056] [Bug #13950] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60060 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-29 08:15:50 +00:00
nobu	8c59fdb8d8	dup String#split return value * string.c (rb_str_split): return duplicated receiver, when no splits. patched by tompng (tomoya ishida) in [ruby-core:82911], and the test case by Seiei Miyagi <hanachin@gmail.com>. [Bug#13925] [Fix GH-1705] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-23 07:09:07 +00:00
nobu	e1be1d0c38	dup String#rpartition return value * string.c (rb_str_rpartition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60001 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-23 07:09:06 +00:00
nobu	b0326bce01	dup String#partition return value * string.c (rb_str_partition): return duplicated receiver, when no splits. [ruby-core:82911] [Bug#13925] Author: Seiei Miyagi <hanachin@gmail.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-23 07:09:05 +00:00
nobu	b2da3824c5	refinements in string interpolation * compile.c (iseq_compile_each0): insert to_s method call, so that refinements activated at the caller should take place. [Feature #13812] * insns.def (tostring): fix up converted object to a string, infect and fallback. * insns.def (branchiftype): new instruction for conversion. branches if TOS is an instance of the given type. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59950 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-18 02:27:13 +00:00
kazu	0f25c6d7d5	Fix a typo [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 13:46:31 +00:00
nobu	bd10ce165c	string.c: fix false coderange * string.c (rb_enc_str_scrub): enc can differ from the actual encoding of the string, the cached coderange is useless then. [ruby-core:82674] [Bug #13874] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 13:11:44 +00:00
nobu	faa26f5570	string.c: optimize enumerate_grapheme_clusters * string.c (rb_str_enumerate_grapheme_clusters): optimize when single byte only. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59762 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-06 12:50:10 +00:00
nobu	e568455826	string.c: grapheme clusters on frozen string * string.c (rb_str_enumerate_grapheme_clusters): enumerate on shared frozen string. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59743 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-04 14:04:54 +00:00
nobu	805d6f6f3c	string.c: enumerator_element * string.c (enumerator_element): push or yield elements, and return 1 if needs checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59742 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-04 14:04:50 +00:00
nobu	ded7e1c2a1	string.c: make array in WANTARRAY * string.c (WANTARRAY): make array for the result in method functions and pass it to enumerator functions. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 13:21:07 +00:00
nobu	adac779218	string.c: enumerator_wantarray * string.c (enumerator_wantarray): show warnings at method functions for proper method names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59732 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 02:08:55 +00:00
nobu	71de56621e	string.c: fix for non-Unicode encodings * string.c (rb_str_enumerate_grapheme_clusters): should enumerate chars for non-Unicode encodings. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59731 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 01:47:19 +00:00
nobu	fc1bf16696	string.c: suppress a warning * string.c (rb_str_enumerate_grapheme_clusters): suppress a maybe-uninitialized warning by old gcc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59730 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-09-03 00:39:23 +00:00
nobu	bb03f02805	string.c: adjust indent [ci skip] * string.c (rb_str_enumerate_grapheme_clusters): adjust indent. [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59700 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-31 08:07:59 +00:00
naruse	df49fc659e	String#each_grapheme_cluster and String#grapheme_clusters added to enumerate grapheme clusters [Feature #13780] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-31 06:35:28 +00:00
glass	b3c70d4c7e	string.c: fix potential bug in String#split * string.c (rb_str_split_m): fix potential bug when rb_memsearch() matches a octet in the middle of a multi-byte character sequence. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-28 10:55:37 +00:00
naruse	d067b263c7	Add optimization for creating zerofill string ``` require 'benchmark' n = 1 * 1024 * 1024 * 1024 Benchmark.bmbm do \|x\| x.report("") { 0.chr n } x.report("ljust") { String.new(capacity: n).ljust(n, "\0") } end ``` Before ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.358396 0.392753 0.751149 ( 1.134231) ljust 0.203277 0.389223 0.592500 ( 0.594816) -------------------------------- total: 1.343649sec user system total real * 0.282647 0.304600 0.587247 ( 0.589205) ljust 0.201834 0.283801 0.485635 ( 0.487617) ``` After ```% ./ruby test.rb Rehearsal ----------------------------------------- * 0.000522 0.000021 0.000543 ( 0.000534) ljust 0.208551 0.321030 0.529581 ( 0.542083) -------------------------------- total: 0.530124sec user system total real * 0.000069 0.000006 0.000075 ( 0.000069) ljust 0.206698 0.301032 0.507730 ( 0.517674) ``` git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59614 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-17 16:34:40 +00:00
nobu	2b770b4674	string.c: improve String#scan * string.c (rb_str_rstrip_bang): improve the performance in 50% for a string pattern, and in 10% for a regexp pattern. get rid of making MatchData in middle, which is not used. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-08-04 04:39:53 +00:00
nobu	8458e709ab	string.c: rb_str_initialize * string.c (rb_str_initialize): new function to (re)initialize a string with data and encoding. extracted from rb_external_str_new_with_enc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-30 02:56:29 +00:00
sonots	510957df33	string.c: add String#delete_suffix and String#delete_suffix! to remove trailing suffix [Feature #13665] [Fix GH-1661] * string.c (rb_str_delete_suffix_bang): add a new method to remove suffix destuctively. * string.c (rb_str_delete_suffix): add a new method to remove suffix non-destuctively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59377 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-20 16:29:19 +00:00
normal	0493b1ce3a	revert r59359, r59356, r59355, r59354 These caused numerous CI failures I haven't been able to reproduce [ruby-core:82102] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59364 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-19 01:35:04 +00:00
normal	86e266bb60	string: preserve taint flag with String#-@ (uminus) * string.c (tainted_fstr_update): move up (rb_fstring): support registering tainted strings (register_fstring_tainted): extract from rb_fstring_existing0 (rb_tainted_fstring_existing): use register_fstring_tainted instead git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-18 09:52:55 +00:00
normal	d04c085b3c	hash: keep fstrings of tainted strings for string keys The same hash keys may be loaded from tainted data sources frequently (e.g. parsing headers from socket or loading YAML data from a file). If a non-tainted fstring already exists (because the application expects the hash key), cache and deduplicate the tainted version in the new tainted_frozen_strings table. For non-embedded strings, this also allows sharing with the underlying malloc-ed data. * vm_core.h (rb_vm_struct): add tainted_frozen_strings * vm.c (ruby_vm_destruct): free tainted_frozen_strings (Init_vm_objects): initialize tainted_frozen_strings (rb_vm_tfstring_table): accessor for tainted_frozen_strings * internal.h: declare rb_fstring_existing, rb_vm_tfstring_table * hash.c (fstring_existing_str): remove (moved to string.c) (hash_aset_str): use rb_fstring_existing * string.c (rb_fstring_existing): new, based on fstring_existing_str (tainted_fstr_update): new (rb_fstring_existing0): new, based on fstring_existing_str (rb_tainted_fstring_existing): new, special case for tainted strings (rb_str_free): delete from tainted_frozen_strings table * test/ruby/test_optimization.rb (test_hash_reuse_fstring): new test [ruby-core:82012] [Bug #13737] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59354 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-18 02:29:59 +00:00
rhe	06a3a10acf	string.c: preserve coderange in String#setbyte Fix a wrong jump so replacing a byte in an ASCII-only string with an ASCII character won't clear the coderange. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-06 07:21:17 +00:00
rhe	2c8cec96cc	string.c: remove dead code in str_fill_term() The length of a string never exceeds the capacity. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-07-06 07:21:16 +00:00
sonots	10082360b9	string.c: add String#delete_prefix and String#delete_prefix! to remove leading substr [Feature #12694] [fix GH-1632] * string.c (rb_str_delete_prefix_bang): add a new method to remove prefix destuctively. * string.c (rb_str_delete_prefix): add a new method to remove prefix non-destuctively. * test/ruby/test_string.rb: add tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59132 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-06-21 07:43:26 +00:00
nobu	f5052d45be	string.c: check just before modification * string.c (rb_str_chomp_bang): check if modifiable after checking an argument and just before modification, as it can get frozen during the argument conversion to String. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-06-18 04:38:01 +00:00
stomar	038c2e52d8	string.c: docs for String#split * string.c: [DOC] clarify docs for String#split when called with limit and capture groups. Reported by Cichol Tsai. [ruby-core:81505] [Bug #13621] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59002 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-06-02 21:29:27 +00:00
watson1978	d0015e4ac6	Improve performance of implicit type conversion To convert the object implicitly, it has had two parts in convert_type() which are 1. lookink up the method's id 2. calling the method Seems that strncmp() and strcmp() in convert_type() are slightly heavy to look up the method's id for type conversion. This patch will add and use internal APIs (rb_convert_type_with_id, rb_check_convert_type_with_id) to call the method without looking up the method's id when convert the object. Array#flatten -> 19 % up Array#+ -> 3 % up [ruby-dev:50024] [Bug #13341] [Fix GH-1537] ### Before Array#flatten 104.119k (± 1.1%) i/s - 525.690k in 5.049517s Array#+ 1.993M (± 1.8%) i/s - 10.010M in 5.024258s ### After Array#flatten 124.005k (± 1.0%) i/s - 624.240k in 5.034477s Array#+ 2.058M (± 4.8%) i/s - 10.302M in 5.019328s ### Test Code require 'benchmark/ips' class Foo def to_ary [1,2,3] end end Benchmark.ips do \|x\| ary = [] 100.times { \|i\| ary << i } array = [ary] x.report "Array#flatten" do \|i\| i.times { array.flatten } end x.report "Array#+" do \|i\| obj = Foo.new i.times { array + obj } end end git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-31 12:30:57 +00:00
nobu	a77cb8c80f	string.c: adjust style [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-26 06:39:06 +00:00
k0kubun	592c3f9b10	string.c: Optimize String#concat when argc is 1 Optimize performance regression introduced in r56021. * Benchmark (i7-4790K @ 4.00GH, x86_64 GNU/Linux) Benchmark.ips do \|x\| x.report("String#concat (1)") { "a".concat("b") } if RUBY_VERSION >= "2.4.0" x.report("String#concat (2)") { "a".concat("b", "c") } end end * Ruby 2.3 Calculating ------------------------------------- String#concat (1) 6.003M (± 5.2%) i/s - 30.122M in 5.031646s * Ruby 2.4 (Before this patch) Calculating ------------------------------------- String#concat (1) 4.458M (± 8.9%) i/s - 22.298M in 5.058084s String#concat (2) 3.660M (± 5.6%) i/s - 18.314M in 5.020527s * Ruby 2.4 (After this patch) Calculating ------------------------------------- String#concat (1) 6.448M (± 5.2%) i/s - 32.215M in 5.010833s String#concat (2) 3.633M (± 9.0%) i/s - 18.056M in 5.022603s [fix GH-1631] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-25 11:14:40 +00:00
nobu	7db534a20c	vm_insnhelper.c: rb_eql_opt should call eql? * vm_insnhelper.c (rb_eql_opt): should call #eql? on Float and String, not #==. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-25 05:29:35 +00:00
normal	2079b71000	string.c: fix String#crypt leak introduced in r58866 * string.c (rb_str_crypt): define LARGE_CRYPT_DATA when allocating git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58876 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 21:26:14 +00:00
nobu	92261511b6	string.c: for small crypt_data * string.c (rb_str_crypt): struct crypt_data defined in missing/crypt.h is small enough. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 06:55:09 +00:00
ko1	9e1624cfe8	Add debug counters. * debug_counter.h: add the following counters to measure object types. obj_free: freed count obj_str_ptr: freed count of Strings they have extra buff. obj_str_embed: freed count of Strings they don't have extra buff. obj_str_shared: freed count of Strings they have shared extra buff. obj_str_nofree: freed count of Strings they are marked as nofree. obj_str_fstr: freed count of Strings they are marked as fstr. obj_ary_ptr: freed count of Arrays they have extra buff. obj_ary_embed: freed count of Arrays they don't have extra buff. obj_obj_ptr: freed count of Objects (T_OBJECT) they have extra buff. obj_obj_embed: freed count of Objects they don't have extra buff. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58865 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 06:46:44 +00:00
normal	144e067007	string.c (rb_str_crypt): fix excessive stack use with crypt_r "struct crypt_data" is 131232 bytes on x86-64 GNU/Linux, making it unsafe to use tiny Fiber stack sizes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58864 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-24 03:01:44 +00:00
stomar	40bc846bf8	string.c: fix String#{casecmp,casecmp?} for non-string arguments * string.c: make String#{casecmp,casecmp?} return nil for non-string arguments instead of raising a TypeError. * test/ruby/test_string.rb: add tests. Reported by Marcus Stollsteimer. Based on a patch by Shingo Morita. [ruby-core:80145] [Bug #13312] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58837 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-21 19:28:48 +00:00
nobu	f3a49ebc92	string.c: cut down intermediate string * string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58709 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-14 00:21:00 +00:00
nobu	7323de517d	revert r58703 & r58705 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58708 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 16:04:05 +00:00
nobu	a3960d0a60	string.c: fix up r58703 * string.c (rb_external_str_new_with_enc): fix the case of conversion failure. when conversion failed for some reason, just ignores the default internal encoding and returns in the given encoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58705 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 14:20:19 +00:00
nobu	7678c0d702	string.c: cut down intermediate string * string.c (rb_external_str_new_with_enc): cut down intermediate string for conversion source, by appending with conversion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58703 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 13:34:39 +00:00
nobu	7d52ad9e17	string.c: fix one-off bug * string.c (rb_str_cat_conv_enc_opts): fix one-off bug. `ofs` equals `olen` when appending at the end. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58702 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-13 12:31:01 +00:00
nobu	2c0baa97a9	string.c: remove bare Unicode. * string.c (rb_str_unicode_normalize): remove bare Unicode. do not assume that all compilers can handle UTF-8. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58688 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-12 15:29:55 +00:00
stomar	8303bc6cf6	string.c: docs for String#match * string.c: [DOC] add example for String#match with pos argument. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-11 18:59:45 +00:00
stomar	4a0eaeeb4d	string.c: docs for Symbol * string.c: [DOC] adopt call-seq's for Symbol#{match,match?} from String methods; other small improvements for Symbol docs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58668 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-11 18:58:27 +00:00
stomar	477cb2b8d8	string.c: docs for Symbol#{match,match?} * string.c: [DOC] mention pos argument for Symbol#{match,match?}. Patch by Yuki Kurihara (ksss). [Fix GH-1606] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58666 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-11 18:56:32 +00:00
nobu	208240870a	string.c: fix r58618 * string.c (unicode_normalize_common): aggregation type cannot be initialized with dynamic values, in C89. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58621 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-09 14:11:46 +00:00
duerst	a4301ec214	replace hand-written argument check by call to rb_scan_args in unicode_normalize_common In string.c, replace hand-written argument count check by call to rb_scan_args. This allows to use rb_funcallv once, rather than using rb_funcall twice. Thanks to Hanmac (Hans Mackowiak) for the idea, see https://bugs.ruby-lang.org/issues/11078#note-7. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-09 11:13:45 +00:00
nobu	d7f2c72322	string.c: fix types * string.c (id_normalize, id_normalized_p): fix types, IDs should be ID. * string.c (unicode_normalize_common): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-06 01:01:52 +00:00
stomar	6ae3cf02f7	string.c: [DOC] improve docs for String.new git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58569 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 13:19:43 +00:00
ktsj	4e6daedc70	string.c: [DOC] Properly refer to keyword argument by its name git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58567 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 08:59:01 +00:00
duerst	f47033e237	refactor common parts of unicode normalization functions into unicode_normalize_common In string.c, refactor the common parts (requiring of unicode_normalize/normalize.rb, check of number of arguments) of the unicode normalization functions (rb_str_unicode_normalize, rb_str_unicode_normalize_bang, rb_str_unicode_normalized_p) into the new function unicode_normalize_common. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58558 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 02:16:27 +00:00
duerst	140560e4ee	move definition of String#unicode_normalized? to C to make sure it is documented * lib/unicode_normalize.rb: Remove definition of String#unicode_normalized? (including documentation). Leave a comment explaining that the file is now empty. * string.c: Define String#unicode_normalized? in rb_str_unicode_normalized_p in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalized? to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58555 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 02:00:19 +00:00
duerst	90ab1ee023	move definition of String#unicode_normalize! to C to make sure it is documented * lib/unicode_normalize.rb: Remove definition of String#unicode_normalize! (including documentation) * string.c: Define String#unicode_normalize! in rb_str_unicode_normalize_bang in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalize! to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58553 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-04 01:36:52 +00:00
svn	7abf8bae23	* remove trailing spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-03 12:18:37 +00:00
duerst	5fee67c9ba	move definition of String#unicode_normalize to C to make sure it is documented * lib/unicode_normalize.rb: Remove definition of String#unicode_normalize (including documentation) * string.c: Define String#unicode_normalize in rb_str_unicode_normalize in C, (including documentation) * lib/unicode_normalize/normalize.rb: Remove (re)definition of String#unicode_normalize to avoid warnings (when $VERBOSE==true) and problems when String is frozen git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58550 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-05-03 12:18:37 +00:00
nobu	cc68af3d02	string.c: improve insertion performace * string.c (rb_str_splice_0): improve performace of single byte optimizable cases, insertion 7bit string to 7bit string. [ruby-dev:49984] [Bug #13228] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-04-17 13:38:34 +00:00
sorah	31a755e4f2	string.c: Supress logical-op-parentheses warning * string.c(rb_str_upcase_bang): Supress logical-op-parentheses warning Patch by Fukuo Kadota <fukuo-kadota@cookpad.com>, Closes [GH-1570] [Bug #13387]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58211 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-29 11:33:59 +00:00
nobu	f0e6e47999	string.c: use the usable size * string.c (rb_str_change_terminator_length): when called after the content has been copied, old terminator length no longer makes sense. use the whole usable size instead of capacity without terminator. [ruby-core:80257] [Bug #13339] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58042 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-21 05:28:38 +00:00
duerst	a5330fa9ea	fix accidental reversal of r57997 in r58000 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58009 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-18 01:35:03 +00:00
duerst	67c1197835	clarifiy 'codepoint' in documentation of String#each_codepoint Make sure it's clear that the returned values are not Unicode codepoints for encodings other than UTF-8/UTF-16(BE\|LE)/UTF-32(BE\|LE). [ci skip] [Bug #13321] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58000 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-17 02:24:53 +00:00
normal	9eb94b4dc1	deduplicate static rb_str_format format strings Anybody who hits these code paths can hit them again in the future, so try deduplicating across multiple runs of these methods to reduce garbage. * string.c (str_upto_each): fstring on "%.d" strftime.c (rb_strftime_with_timespec): fstring on "%0*d" git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57997 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-17 00:55:55 +00:00
nobu	8c661ba264	string.c: shortcut argument check * string.c (str_casecmp, str_casecmp_p): split to skip argument check when it is a String certainly. * string.c (sym_casecmp, sym_casecmp_p): shortcut argument checks. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57978 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-15 07:57:11 +00:00
nobu	9fa56026e5	string.c: use rb_check_string_type * string.c (rb_str_cmp_m): use rb_check_string_type for check and conversion, instead of calling the conversion method directly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57965 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-14 03:42:43 +00:00
stomar	e6dec8f92e	docs for Symbol#casecmp and Symbol#casecmp? * string.c: [DOC] improve docs of Symbol#casecmp and Symbol#casecmp? according to the similar String methods; fix RDoc markup and typos; fix call-seq's for Symbol#{upcase,downcase,capitalize,swapcase}. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57963 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-13 20:20:40 +00:00
nobu	bd17e25588	string.c (rb_str_set_len): pathological check git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57961 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-13 11:47:45 +00:00
nobu	16e804117c	string.c: $; is a GC-root * string.c (Init_String): $; must be a GC-root, not to be collected. [ruby-core:79582] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57958 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-13 09:12:05 +00:00
stomar	605b472d2d	docs for String#casecmp and String#casecmp? * string.c: [DOC] specify when String#casecmp and String#casecmp? return nil; modify examples to better show difference to <=>; fix RDoc markup and typos. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57886 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-11 20:01:55 +00:00
normal	e064879e5a	string.c (str_uminus): update doc for deduplication As of r57698, String#-@ can return pre-existing strings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57813 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-08 21:24:24 +00:00
nobu	e7f4d90930	fix paren * string.c (str_byte_substr): fix misplaced parenthesis at r56155. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57809 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-08 08:19:56 +00:00
kazu	d0708e9e2a	string.c: [DOC] Fix a typo in String#dump [Fix GH-1531][ci skip] Author: Alex Semyonov <alex@semyonov.us> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57802 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 13:04:39 +00:00
nobu	d69d98f61a	string.c: negation of LONG_MIN * string.c (rb_str_update): do not use negation of LONG_MIN, which is negative too. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 09:13:41 +00:00
nobu	f4d13801b6	string.c: fix integer overflow * string.c (str_byte_substr): fix another integer overflow which can happen only when SHARABLE_MIDDLE_SUBSTRING is enabled. [ruby-core:79951] [Bug #13289] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57799 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 09:07:57 +00:00
nobu	72f8df158f	string.c: fix integer overflow * string.c (rb_str_subpos): fix integer overflow which can happen only when SHARABLE_MIDDLE_SUBSTRING is enabled. incorpolate https://github.com/mruby/mruby/commit/7db0786abdd243ba031e24683f git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57797 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-07 05:48:15 +00:00
stomar	3ca1cbecc6	string.c: [DOC] fix doc formatting for String#==, #=== git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-04 20:08:04 +00:00
stomar	a698d99703	string.c: restore documentation for String#<< * string.c: [DOC] restore documentation for String#<< which became undocumented with r56021; fix a typo. [ruby-core:79865] [Bug #13268] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57758 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-03-02 10:31:56 +00:00
normal	4e90dcc9d7	string.c (str_uminus): deduplicate strings This exposes the rb_fstring internal function to return a deduped and frozen string when a non-frozen string is given. This is useful for writing all sorts of record processing key values maybe stored, but certain keys and values are often duplicated at a high frequency, so memory savings can noticeable. Use cases are many: * email/NNTP header processing There are some standard header keys everybody uses (From/To/Cc/Date/Subject/Received/Message-ID/References/In-Reply-To), as well as common ones specific to a certain lists: (ruby-core has X-Redmine-* headers) It is also useful to dedupe values, as most inboxes have multiple messages from the same sender, or MUA. * package management systems - things like RubyGems stores identical strings for licenses, dependency names, author names/emails, etc * HTTP headers/trailers - standard headers (Host/Accept/Accept-Encoding/User-Agent/...) are common, but there are also uncommon ones. Values may be deduped, as well, as it is likely a user agent will make multiple/parallel requests to the same server. * version control systems - this can be useful for deduplicating names of frequent committers (like "nobu" :) In linux.git and git.git, there are also common trailers such as Signed-Off-By/Acked-by/Reviewed-by/Fixes/... as well as less common ones. * audio metadata - There are commonly used tags (Artist/Album/Title/Tracknumber), but Vorbis comments allows arbitrary key values to be stored. Music collections contain songs by the same artist or mutiple songs from the same album, so deduplicating values will be helpful there, too. * JSON, YAML, XML, HTML processing Certain fields, tags and attributes are commonly used across the same and multiple documents There is no security concern in this being a DoS vector by causing immortal strings. The fstring table is not a GC-root and not walked during the mark phase. GC-able dynamic symbols since Ruby 2.2 are handled in the same manner, and that implementation also relies on the non-immortality of fstrings. [Feature #13077] [ruby-core:79663] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57698 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-24 01:01:23 +00:00
nobu	7de42daa21	string.c: assertion * string.c (str_shared_replace): use RUBY_ASSERT for pre-condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57628 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-14 12:29:56 +00:00
nobu	957e6e4b14	initialize variables * string.c (rb_str_enumerate_lines): initialize conditionally used variable. * thread.c (rb_fd_no_init): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-14 07:52:30 +00:00
nobu	959aac29e7	suppress warnings * string.c (rb_str_enumerate_lines): hint to suppress a maybe-uninitialized warning by gcc. * thread.c (rb_fd_no_init): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57618 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-13 05:44:15 +00:00
normal	a6b9b360ce	doc: Add example for Symbol#to_s * string.c: add example for Symbol#to_s. The docs for Symbol#to_s only include an example for Symbol#id2name, but not for #to_s which is an alias; the docs should include examples for both methods. From: Marcus Stollsteimer <sto.mar@web.de> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57536 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-05 00:22:03 +00:00
normal	b0cfa46bce	symbol.c (rb_id2str): eliminate branch to set class Since the fstring table encompasses all strings in the symbol table, we may reuse the fstring table walk to set the class and eliminate the branch in rb_id2str. * string.c (Init_String): use rb_cString immediately after definition * symbol.c (rb_id2str): eliminate branch to set class git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57521 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-02-03 23:55:06 +00:00
normal	5c988df0dd	string.c (rb_str_tmp_frozen_release): release embedded strings Handle the embedded case first, since we may have an embedded duplicate and non-embedded original string. * string.c (rb_str_tmp_frozen_release): handled embedded strings * test/ruby/test_io.rb (test_write_no_garbage): new test [ruby-core:78898] [Bug #13085] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57471 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-30 21:54:32 +00:00
normal	9c4ba969a5	io.c: recycle garbage on write * string.c (STR_IS_SHARED_M): new flag to mark shared mulitple times (STR_SET_SHARED): set STR_IS_SHARED_M (rb_str_tmp_frozen_acquire, rb_str_tmp_frozen_release): new functions (str_new_frozen): set/unset STR_IS_SHARED_M as appropriate * internal.h: declare new functions * io.c (fwrite_arg, fwrite_do, fwrite_end): new (io_fwrite): use new functions Introduce rb_str_tmp_frozen_acquire and rb_str_tmp_frozen_release to manage a hidden, frozen string. Reuse one bit of the embed length for shared strings as STR_IS_SHARED_M to indicate a string has been shared multiple times. In the common case, the string is only shared once so the object slot can be reclaimed immediately. minimum results in each 3 measurements. (time and size) Execution time (sec) name trunk built io_copy_stream_write 0.682 0.254 io_copy_stream_write_socket 1.225 0.751 Speedup ratio: compare with the result of `trunk' (greater is better) name built io_copy_stream_write 2.680 io_copy_stream_write_socket 1.630 Memory usage (last size) (B) name trunk built io_copy_stream_write 95436800.000 6512640.000 io_copy_stream_write_socket 117628928.000 7127040.000 Memory consuming ratio (size) with the result of `trunk' (greater is better) name built io_copy_stream_write 14.654 io_copy_stream_write_socket 16.505 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-30 20:40:18 +00:00
shugo	d33726b837	string.c: rindex(//) should set $~. This seems a bug introduced by r520 (1.4.0). [ruby-core:79110] [Bug #13135] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57374 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-19 08:13:03 +00:00
nobu	803621f6d7	file.c: refine message * file.c (rb_get_path_check_convert): refine the error message when the path name contains null byte. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57336 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-16 02:43:55 +00:00
nobu	9029464175	string.c: replacement and block * string.c (rb_enc_str_scrub): only one of replacement and block is allowed. [ruby-core:79038] [Bug #13119] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57304 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-11 02:31:02 +00:00
nobu	a3aa4da773	string.c: yield invalid part * string.c (rb_enc_str_scrub): yield the invalid part only with ASCII-incompatible. [ruby-core:79039] [Bug #13120] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57303 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-11 02:18:45 +00:00
nobu	c763f0fb9b	string.c: block for scrub with ASCII-incompatible * string.c (rb_enc_str_scrub): honor the given block with ASCII-incompatible encoding. [ruby-core:79039] [Bug #13120] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57302 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2017-01-11 01:03:37 +00:00
nobu	10bd48e402	string.c: CRLF in paragraph mode * string.c (rb_str_enumerate_lines): allow CRLF to separate paragraphs. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-25 23:56:55 +00:00
nobu	091f99b4b9	string.c: consistent paragraph mode with IO * string.c (rb_str_enumerate_lines): in paragraph mode, do not include newlines which separate paragraphs, so that it will be consistent with IO#each_line. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57184 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-25 23:50:09 +00:00
nobu	d124fa3a35	string.c: suppress a warning * string.c (rb_str_casecmp_p): [DOC] use Unicode escape form to get rid of warning C4819 by Microsoft Visual C++. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-22 22:16:19 +00:00
rhe	44ba4fd362	string.c: add missing size_t cast Add size_t cast to avoid signed integer overflow. r56157 ("string.c: avoid signed integer overflow", 2016-09-13) missed this. Suppresses UBSan. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57122 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-20 06:53:45 +00:00
nobu	5d6292809f	no crypt.h on FreeBSD 12 * string.c (crypt.h): crypt_r() was added in FreeBSD 12.0 but is declared in unistd.h. [ruby-core:78664] [Bug #13038] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-16 05:05:42 +00:00
nobu	75755ef159	fix chomping newline only line * string.c (chomp_newline): fix chomping newline only line. rb_enc_prev_char return NULL if no previous character and must not call rb_enc_ascget on it. a patch by Ary Borenszweig <asterite AT gmail.com> at [ruby-core:78666]. [Bug #13037] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57088 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-16 01:12:09 +00:00
nobu	c95388a58d	string.c: fix method name in rdoc [ci skip] * string.c (rb_str_equal): [DOC] fix fallback method name. the peer's == method will be used, not ===. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57056 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-12 07:12:07 +00:00
nobu	6dd5ee752a	String#match? and Symbol#match? * string.c (rb_str_match_m_p): inverse of Regexp#match?. based on the patch by Herwin Weststrate <herwin@snt.utwente.nl>. [Fix GH-1483] [Feature #12898] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57053 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-12 02:56:12 +00:00
nobu	d95f5bc81a	string.c: chomp option * string.c (rb_str_enumerate_lines): implement chomp option. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56972 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-12-03 14:18:03 +00:00
duerst	dacf977a42	Fix/improve documentation of String/Symbol#casecmp[?] Fix documentation of String#casecmp? (examples didn't have the '?'). Add an example with non-ASCII characters. Clarify that casecmp, unlike casecmp?, only does case-insensitivity on A-Z/a-z. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56926 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-29 10:45:54 +00:00
nobu	07fb750fd0	string.c: use xmalloc * string.c (rb_str_casemap): use xmalloc simply instead of ALLOC_N. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56920 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-29 03:06:01 +00:00
nobu	78b0d7ac1c	string.c: fix zero-length array * string.c (mapping_buffer): get rid of zero-length array member, which is not a part of C90. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-28 13:16:00 +00:00
nobu	196e8b4480	string.c: enable rdoc * string.c (rb_str_casecmp_p): [DOC] move forward declaration of rb_str_downcase to enable rdoc. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56913 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-28 09:37:19 +00:00
duerst	ad619e02c4	implement String/Symbol#casecmp? including Unicode case folding * string.c: Implement String#casecmp? and Symbol#casecmp? by using String#downcase :fold for Unicode case folding. This does not include options such as :turkic, because these currently cannot be combined with the :fold option. This implements feature #12786. * test/ruby/test_string.rb/test_symbol.rb: Tests for above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-28 08:37:32 +00:00
nobu	a2144bd72a	chomp option * io.c (extract_getline_opts): extract chomp option. [Feature #12553] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56581 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-11-05 07:28:09 +00:00
nobu	4e44f6ef86	[DOC] replace Fixnum with Integer [ci skip] * numeric.c: [DOC] update document for Integer class. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-26 06:11:23 +00:00
nobu	3ba353fc1a	Fixed typo [ci skip] * string.c (rb_str_sub, rb_str_gsub): [DOC] 'backlash' should read 'backslash'. [Fix GH-1461] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56460 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-21 02:34:19 +00:00
usa	c2dd2d268e	* internal.h (ST2FIX): new macro to convert st_index_t to Fixnum. a hash value of Object might be Bignum, but it causes many troubles expecially the Object is used as a key of a hash. so I've gave up to do so. * array.c (rb_ary_hash): use above macro. * bignum.c (rb_big_hash): ditto. * hash.c (rb_obj_hash, rb_hash_hash): ditto. * numeric.c (rb_dbl_hash): ditto. * proc.c (proc_hash): ditto. * re.c (rb_reg_hash, match_hash): ditto. * string.c (rb_str_hash_m): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56340 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-04 16:25:01 +00:00
nobu	63d77c2a1b	string.c: negative hash * string.c (rb_str_hash_m): hash values may be negative. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56321 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-01 22:51:23 +00:00
usa	7a44019031	* string.c (rb_str_hash_m): st_index_t is not guaranteed as the same size with int, and of course also not guaranteed the value can be Fixnum. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56320 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-10-01 17:06:21 +00:00
nobu	8d501ec021	string.c: fast path of lstrip_offset * string.c (lstrip_offset): add a fast path in the case of single byte optimizable strings, as well as rstrip_offset. [ruby-core:77392] [Feature #12788] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56250 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-26 05:10:56 +00:00
rhe	537fea9921	string.c: fix integer overflow in enc_strlen() and rb_enc_strlen_cr() * string.c (enc_strlen, rb_enc_strlen_cr): Avoid signed integer overflow. The result type of a pointer subtraction may have the same size as long. This fixes String#size returning an negative value on i686-linux environment: str = "\x00" * ((1<<31)-2)) str.slice!(-3, 3) str.force_encoding("UTF-32BE") str << 1234 p str.size git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56247 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-26 02:09:50 +00:00
shyouhei	2fc5210f31	* internal.h (WARN_UNUSED_RESULT): moved to configure.in, to actually check its availability rather to check GCC's version. * configure.in (WARN_UNUSED_RESULT): moved to here. * configure.in (RUBY_FUNC_ATTRIBUTE): change function declaration to return int rather than void, because it makes no sense for a warn_unused_result attributed function to return void. Funny thing however is that it also makes no sense for noreturn attributed function to return int. So there is a fundamental conflict between them. While I tested this, I confirmed both GCC 6 and Clang 3.8 prefers int over void to correctly detect necessary attributes under this setup. Maybe subject to change in future. * internal.h (UNINITIALIZED_VAR): renamed to MAYBE_UNUSED, then moved to configure.in for the same reason we move WARN_UNUSED_RESULT. * configure.in (MAYBE_UNUSED): moved to here. * internal.h (__has_attribute): deleted, because it has no use now. * string.c (rb_str_enumerate_lines): refactor macro rename. * string.c (rb_str_enumerate_bytes): ditto. * string.c (rb_str_enumerate_chars): ditto. * string.c (rb_str_enumerate_codepoints): ditto. * thread.c (do_select): ditto. * vm_backtrace.c (rb_debug_inspector_open): ditto. * vsnprintf.c (BSD_vfprintf): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56169 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-16 06:15:55 +00:00
rhe	00fcd967d9	string.c: avoid signed integer overflow The behavior on signed integer overflow is undefined. On platform with sizeof(long)==4, it's fairly easy that 'len + termlen' overflows, where len is the string length and termlen is the terminator length. So, prevent the integer overflow by avoiding adding to a string length, or casting to size_t before adding where the total size is passed to {RE,}ALLOC(). string.c (STR_HEAP_SIZE, RESIZE_CAPA_TERM, str_new0, rb_str_buf_new, str_shared_replace, rb_str_init, str_make_independent_expand, rb_str_resize): Avoid overflow by casting the length to size_t. size_t should be able to represent LONG_MAX+termlen. * string.c (rb_str_modify_expand): Check that the new length is in the range of long before resizing. Also refactor to use RESIZE_CAPA_TERM macro. * string.c (str_buf_cat): Fix so that it does not create a negative length String. Also fix the condition for 'string sizes too big', the total length can be up to LONG_MAX. * string.c (rb_str_plus): Check the resulting String length does not exceed LONG_MAX. * string.c (rb_str_dump): Fix integer overflow. The dump result will be longer then the original String. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56157 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 12:33:16 +00:00
rhe	eaba77154f	string.c: rename STR_EMBEDABLE_P to STR_EMBEDDABLE_P * string.c (STR_EMBEDDABLE_P): Renamed from STR_EMBEDABLE_P(). And use it in more places. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56155 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 12:28:54 +00:00
nobu	2608f7d9c5	string.c: STR_EMBEDABLE_P * string.c (STR_EMBEDABLE_P): extract the predicate macro to tell if the given length is capable in an embedded string, and fix possible integer overflow. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56151 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 12:11:57 +00:00
nobu	8a64787632	string.c: fix integer overflow * string.c (rb_str_change_terminator_length): fix integer overflow in the case growing the terminator length and the string length is around LONG_MAX. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56149 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 08:12:54 +00:00
rhe	be3baa4380	string.c: fix buffer overflow check condition in rb_str_set_len() * string.c (rb_str_set_len): The buffer overflow check is wrong. The space for termlen is allocated outside the capacity returned by rb_str_capacity(). This fixes r41920 ("string.c: multi-byte terminator", 2013-07-11). [ruby-core:77257] [Bug #12757] * test/-ext-/string/test_set_len.rb (test_capacity_equals_to_new_size): Test for this change. Applying only the test will trigger [BUG]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-13 07:08:15 +00:00
akr	577de1e93d	replace fixnum by integer in documents. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56102 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-09-08 04:57:49 +00:00
nobu	9387ff7315	multiple arguments * array.c (rb_ary_concat_multi): take multiple arguments. based on the patch by Satoru Horie. [Feature #12333] * string.c (rb_str_concat_multi, rb_str_prepend_multi): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-27 01:26:17 +00:00
nobu	c2bf7e6f7d	string.c: rb_fs_setter * string.c (rb_fs_setter): check and convert $; value at assignment. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55990 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-23 01:15:04 +00:00
nobu	bd6fe32691	string.c: $; name in error message * string.c (rb_str_split_m): show $; name in error message when it is a wrong object. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55986 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-08-22 17:10:00 +00:00
duerst	31040a307e	* string.c (String#downcase), NEWS: Mentioned that case mapping for all of ISO-8859-1~16 is now supported. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55777 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-30 03:13:28 +00:00
nobu	c463366dfd	rb_funcallv * *.c: rename rb_funcall2 to rb_funcallv, except for extensions which are/will be/may be gems. [Fix GH-1406] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55773 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-29 11:57:14 +00:00
ko1	9f60791a04	* vm_core.h: revisit the structure of frame, block and env. [Bug #12628] This patch introduce many changes. * Introduce concept of "Block Handler (BH)" to represent passed blocks. * move rb_control_frame_t::flag to ep[0] (as a special local variable). This flags represents not only frame type, but also env flags such as escaped. * rename `rb_block_t` to `struct rb_block`. * Make Proc, Binding and RubyVM::Env objects wb-protected. Check [Bug #12628] for more details. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-28 11:02:30 +00:00
nobu	a325876ad3	Fix Issues reported by PVS-Studio static analyzer * vm.c (vm_set_main_stack): remove unnecessary check. toplevel binding must be initialized. [Bug #12611] (N1) * win32/win32.c (w32_symlink): fix return type. [Bug #12611] (N3) * string.c (rb_str_split_m): simplify the condition. [Bug #12611](N4) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55729 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-22 10:55:22 +00:00
duerst	c6692d9410	* string.c (String#dump): Change escaping of non-ASCII characters in UTF-8 to use upper-case four-digit hexadecimal escapes without braces where possible [Feature #12419]. * test/ruby/test_string.rb (test_dump): Add tests for above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-22 08:13:38 +00:00
ngoto	20c4461d86	* string.c (str_buf_cat): Fix potential interger overflow of capa. In addition, termlen is used instead of +1. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55692 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 13:08:54 +00:00
ngoto	2bb292fccf	* string.c (str_buf_cat): Fix capa size for embed string. Fix bug in r55547. [Bug #12536] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55691 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-15 12:35:52 +00:00
normal	ed5401a696	string.c: reduce malloc overhead for default buffer size * string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127 [ruby-core:76371] [Feature #12025] * string.c (rb_str_buf_new): adjust for above reduction From Jeremy Evans <code@jeremyevans.net>: This changes the minimum buffer size for string buffers from 128 to 127. The underlying C buffer is always 1 more than the ruby buffer, so this changes the actual amount of memory used for the minimum string buffer from 129 to 128. This makes it much easier on the malloc implementation, as evidenced by the following code (note that time -l is used here, but Linux systems may need time -v). $ cat bench_mem.rb i = ARGV.first.to_i Array.new(1000000){" " * i} $ /usr/bin/time -l ruby bench_mem.rb 128 3.10 real 2.19 user 0.46 sys 289080 maximum resident set size 72673 minor page faults 13 block output operations 29 voluntary context switches $ /usr/bin/time -l ruby bench_mem.rb 127 2.64 real 2.09 user 0.27 sys 162720 maximum resident set size 40966 minor page faults 2 block output operations 4 voluntary context switches To try to ensure a power-of-2 growth, when a ruby string capacity needs to be increased, after doubling the capacity, add one. This ensures the ruby capacity will be odd, which means actual amount of memory used will be even, which is probably better than the current case of the ruby capacity being even and the actual amount of memory used being odd. A very similar patch was proposed 4 years ago in feature #5875. It ended up being rejected, because no performance increase was shown. One reason for that is that ruby does not use STR_BUF_MIN_SIZE unless rb_str_buf_new is called, and that previously did not have a ruby API, only a C API, so unless you were using a C extension that called it, there would be no performance increase. With the recently proposed feature #12024, String.buffer is added, which is a ruby API for creating string buffers. Using String.buffer(100) wastes much less memory with this patch, as the malloc implementation can more easily deal with the power-of-2 sized memory usage. As measured above, memory usage is 44% less, and performance is 17% better. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-14 23:30:29 +00:00
ngoto	5eff15d1bd	* string.c (rb_str_change_terminator_length): New function to change termlen and resize heap for the terminator. This is split from rb_str_fill_terminator (str_fill_term) because filling terminator and changing terminator length are different things. [Bug #12536] * internal.h: declaration for rb_str_change_terminator_length. * string.c (str_fill_term): Simplify only to zero-fill the terminator. For non-shared strings, it assumes that (capa + termlen) bytes of heap is allocated. This partially reverts r55557. * encoding.c (rb_enc_associate_index): rb_str_change_terminator_length is used, and it should be called whenever the termlen is changed. * string.c (str_capacity): New static function to return capacity of a string with the given termlen, because the termlen may sometimes be different from TERM_LEN(str) especially during changing termlen or filling terminator with specific termlen. * string.c (rb_str_capacity): Use str_capacity. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-05 10:45:23 +00:00
ngoto	3418a277d8	* string.c: Partially reverts r55547 and r55555. ChangeLog about the reverted changes are also deleted in this file. [Bug #12536] [ruby-dev:49699] [ruby-dev:49702] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55559 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-01 18:11:11 +00:00
ngoto	61f2ee0d90	* string.c (str_fill_term): When termlen increases, re-allocation of memory for termlen should always be needed. In this fix, if possible, decrease capa instead of realloc. [Bug #12536] [ruby-dev:49699] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55557 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-01 17:32:21 +00:00
ngoto	a92a537bf4	* string.c: Specify termlen as far as possible. Additional fix for [Bug #12536] [ruby-dev:49699]. * string.c (rb_usascii_str_new, rb_utf8_str_new): Specify termlen which is apparently 1 for the encodings. * string.c (str_new0_cstr): New static function to create a String object from a C string with specifying termlen. * string.c (rb_usascii_str_new_cstr, rb_utf8_str_new_cstr): Specify termlen by using new str_new0_cstr(). * string.c (str_new_static): Specify termlen from the given encoding when creating a new String object is needed. * string.c (rb_tainted_str_new_with_enc): New function to create a tainted String object with the given encoding. This means that the termlen is correctly specified. Curretly static function. The function name might be renamed to rb_tainted_enc_str_new or rb_enc_tainted_str_new. * string.c (rb_external_str_new_with_enc): Use encoding by using the above rb_tainted_str_new_with_enc(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55555 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-01 11:24:11 +00:00
ngoto	10e28726a1	* string.c (rb_str_subseq, str_substr): When RSTRING_EMBED_LEN_MAX is used, TERM_LEN(str) should be considered with it because embedded strings are also processed by TERM_FILL. Additional fix for [Bug #12536] [ruby-dev:49699]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55552 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-01 04:50:38 +00:00
ngoto	6734a0c3d9	string.c: Add parentheses to avoid C source code ambiguity. [Bug #12536 ] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-07-01 03:58:51 +00:00
ngoto	f2ee22371b	* string.c: Fix memory corruptions when using UTF-16/32 strings. [Bug #12536] [ruby-dev:49699] * string.c (TERM_LEN_MAX): Macro for the longest TERM_FILL length, the same as largest value of rb_enc_mbminlen(enc) among encodings. * string.c (str_new, rb_str_buf_new, str_shared_replace): Allocate +TERM_LEN_MAX bytes instead of +1. This change may increase memory usage. * string.c (rb_str_new_with_class): Use TERM_LEN of the "obj". * string.c (rb_str_plus, rb_str_justify): Use str_new0 which is aware of termlen. * string.c (str_shared_replace): Copy +termlen bytes instead of +1. * string.c (rb_str_times): termlen should not be included in capa. * string.c (RESIZE_CAPA_TERM): When using RSTRING_EMBED_LEN_MAX, termlen should be counted with it because embedded strings are also processed by TERM_FILL. * string.c (rb_str_capacity, str_shared_replace, str_buf_cat): ditto. * string.c (rb_str_drop_bytes, rb_str_setbyte, str_byte_substr): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55547 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-30 10:20:23 +00:00
nobu	bcf0a198f1	CASEMAP_DEBUG [ci skip] * string.c (rb_str_casemap, rb_str_ascii_casemap): move debug/tuning messages under a preprocessor condition, CASEMAP_DEBUG. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-21 08:19:59 +00:00
nobu	3a6bb56029	Fix garbage allocation * string.c (rb_str_casemap): do not put code with side effects inside RSTRING_PTR() macro which evaluates the argument multiple times. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55481 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-21 07:38:16 +00:00
naruse	8272729977	* string.c (rb_str_casemap): fix memory leak. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55480 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-21 07:14:05 +00:00
naruse	9d291c82e5	* string.c (rb_str_casemap): int is too small for string size. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55479 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-21 07:14:04 +00:00
nobu	1cbc622ea7	string.c: adjust buffer size * string.c (tr_trans): adjust buffer size by processed and rest lengths, instead of doubling repeatedly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55428 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-16 03:17:54 +00:00
nobu	cc9f1e9195	string.c: fix terminator * string.c (tr_trans): consider terminator length and fix heap overflow. reported by Guido Vranken <guido AT guidovranken.nl>. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55427 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-16 02:15:27 +00:00
nobu	aaf8c09900	Fix typo in string.c [ci skip] * string.c (rb_str_oct): [DOC] fix typo, hornored -> honored. [Fix GH-1379] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55378 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-11 06:02:46 +00:00
duerst	02f7ad6237	* enc/iso_8859_1.c: Implement non-ASCII case mapping. * test/ruby/enc/test_case_comprehensive.rb: Tests for above. * string.c: Add iso-8859-1 to supported encodings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55373 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-11 00:46:21 +00:00
duerst	10174c295b	* string.c: Special-case :ascii option in rb_str_capitalize_bang and rb_str_swapcase_bang. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55361 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-10 08:35:17 +00:00
duerst	13f576d6b9	* string.c: Special-case :ascii option in rb_str_upcase_bang (retry). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-10 08:12:28 +00:00
nobu	2667d1b38f	hash.c: ensure NUL-terminated for ENV * hash.c (get_env_cstr): ensure NUL-terminated. [ruby-dev:49655] [Bug #12475] * string.c (rb_str_fill_terminator): return the pointer to the NUL-terminated content. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55345 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-10 05:48:38 +00:00
kazu	075cf3d2e8	string.c (rb_str_ascii_casemap): fix compile error. error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55332 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-08 14:11:17 +00:00
duerst	872f9a498f	* string.c: Revert previous commit (possibility of endless loop). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-08 13:22:28 +00:00
duerst	5eb73eeda8	* string.c: Special-case :ascii option in rb_str_upcase_bang. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55330 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-08 12:57:44 +00:00
duerst	f0fc6ec872	* string.c: New static function rb_str_ascii_casemap; special-casing :ascii option in rb_str_upcase_bang and rb_str_downcase_bang. * regenc.c: Fix a bug (wrong use of unnecessary slack at end of string). * regenc.h -> include/ruby/oniguruma.h: Move declaration of onigenc_ascii_only_case_map so that it is visible in string.c. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55329 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-08 12:28:42 +00:00
duerst	8743f010c6	* string.c (rb_str_upcase_bang, rb_str_capitalize_bang, rb_str_swapcase_bang): Switch to use primitive. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-07 08:18:42 +00:00
duerst	53a3e3ddd9	* string.c (rb_str_downcase_bang): Switch to use primitive except if conversion can be done ASCII-only. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-07 07:44:19 +00:00
duerst	ab5f23f26c	* string.c: Added UTF-16BE/LE and UTF-32BE/LE to supported encodings for Unicode case mapping. * test/ruby/enc/test_case_comprehensive.rb: Tests for above functionality; fixed an encoding issue in assertion error message. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55296 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-06 09:36:36 +00:00
duerst	2f49aa8f62	* string.c Change rb_str_casemap to use encoding primitive case_map instead of directly calling onigenc_unicode_case_map. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55293 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-06 04:37:10 +00:00
duerst	c5ea268264	* string.c: Remove :lithuanian guard for Unicode case mapping. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55277 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-05 05:46:37 +00:00
nobu	40c3c3ec6c	crypt.h: remove initialized * missing/crypt.h (struct crypt_data): remove unnecessary member "initialized". * missing/crypt.c (des_setkey_r): nothing to be initialized in crypt_data. * configure.in (struct crypt_data): check for "initialized" in struct crypt_data, which may be only in glibc, and isn't on AIX at least. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-04 01:54:54 +00:00
duerst	3dd98b2446	* string.c: Raise ArgumentError when invalid string is detected in case mapping methods. * enc/unicode.c: Check for invalid string and signal with negative length value. * test/ruby/enc/test_case_mapping.rb: Add tests for above. * test/ruby/test_m17n_comb.rb: Add a message to clarify test failure. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-02 01:24:52 +00:00
nobu	a94201243e	string.c: fallback to crypt_r * string.c: prefer crypt_r to crypt iff system crypt nor crypt_r are not provided. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55250 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-01 13:17:31 +00:00
nobu	a8bfa9bdf1	use system crypt * configure.in: revert r55237. replace crypt, not crypt_r, and check if crypt is broken more. * missing/crypt.c: move crypt_r.c * string.c (rb_str_crypt): use crypt_r if provided by the system. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55245 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-01 06:58:21 +00:00
nobu	3c31685e11	use crypt_r * string.c (rb_str_crypt): use reentrant crypt_r. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55237 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-06-01 00:48:08 +00:00
naruse	e6ff652ce8	Revert r55225 Run test-all before large commit: "* string.c: Activate full Unicode case mapping for UTF-8 by removing" This reverts commit `3fb0fcd1e8`. http://rubyci.s3.amazonaws.com/centos5-64/ruby-trunk/log/20160531T013303Z.fail.html.gz git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55226 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-31 02:56:09 +00:00
duerst	3fb0fcd1e8	* string.c: Activate full Unicode case mapping for UTF-8 by removing the protective check for the presence of an option. Update documentation. * test/ruby/enc/test_case_comprehensive.rb: Adjust tests for above change. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55225 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-31 01:10:06 +00:00
duerst	ae4fba3167	* string.c: Document current behavior for other case mapping methods on String. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-30 12:15:41 +00:00
duerst	85950c5257	* string.c: Document current situation for String#downcase. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55215 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-30 11:00:26 +00:00
nobu	79a85b18cc	string.c: return reallocated pointer * string.c (str_fill_term): return new pointer reallocated by filling terminator. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-30 07:20:28 +00:00
nobu	9ac5f9135a	string.c: get rid of unnecessary empty string * string.c (str_substr, rb_str_aref): refactor not to create unnecessary empty string. * string.c (str_byte_substr, str_byte_aref): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55209 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-30 05:50:27 +00:00
nobu	e3e8cae9be	string.c: check in the order * string.c (rb_str_aref_m, rb_str_byteslice): check arguments in the left-to-right order. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55208 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-30 05:41:02 +00:00
nobu	4fad63da01	transcode.c: scrub in the given encoding * transcode.c (str_transcode0): scrub in the given encoding when the source encoding is given, not in the encoding of the receiver. [ruby-core:75732] [Bug #12431] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55181 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-27 08:09:46 +00:00
nobu	b493d156de	string.c: integer overflow * string.c (rb_str_modify_expand): check integer overflow. [ruby-core:75592] [Bug #12390] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55054 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-18 05:52:40 +00:00
nobu	4a9705d6e3	ruby.h: RB_INTEGER_TYPE_P * include/ruby/ruby.h (RB_INTEGER_TYPE_P): new macro and underlying inline function to check if the object is an Integer (Fixnum or Bignum). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55044 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-18 01:17:43 +00:00
naruse	28f5e12c24	* configure.in: check function attirbute const and pure, and define CONSTFUNC and PUREFUNC if available. Note that I don't add those options as default because it still shows many false-positive (it seems not to consider longjmp). * vm_eval.c (stack_check): get rb_thread_t* as an argument to avoid duplicate call of GET_THREAD(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-08 17:44:51 +00:00
yui-knk	deca1d8007	* string.c (rb_str_sub): Fix a special match variable name. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-05 05:39:35 +00:00
naruse	cdef0bc833	* string.c (count_utf8_lead_bytes_with_word): Use __builtin_popcount only if it can use SSE 4.2 POPCNT whose latency is 3 cycle. * internal.h (rb_popcount64): use __builtin_popcountll because now it is in fast path. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54894 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-03 13:14:30 +00:00
nobu	c353ec0c9e	string.c: shortcut * string.c (rb_str_concat): shortcut concatenation to ASCII-8BIT as well as US-ASCII. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54882 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-02 03:58:28 +00:00
nobu	321c6df89b	string.c: fix doc * string.c (rb_str_concat): [DOC] fix the indefinite article, for replacement from Fixnum to Integer. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54881 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-02 03:53:34 +00:00
nobu	0e3475a6d9	string.c: fix braces * string.c (search_nonascii): fix braces unmatched by a preprocessing condition. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54879 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-02 00:06:04 +00:00
naruse	2fc973796a	fix mixed declaration on non UNALIGNED_WORD_ACCESS git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54877 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-01 18:27:41 +00:00
naruse	64837f778a	fix for where UNALIGNED_WORD_ACCESS is not allowed git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54867 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-01 14:19:02 +00:00
naruse	db2c32778d	Use WORDS_BIGENDIAN git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54862 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-05-01 09:07:14 +00:00
naruse	0f0121fe1a	* string.c (search_nonascii): use nlz on big endian environments. * internal.h (nlz_intpr): defined. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54859 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-04-30 22:32:05 +00:00
naruse	424a706afe	More optimization for r54854's search_nonascii git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54857 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-04-30 16:32:36 +00:00
naruse	4cf460a7bb	* string.c (search_nonascii): unroll and use ntz * configure.in (__builtin_ctz): check. * configure.in (__builtin_ctzll): check. * internal.h (rb_popcount32): defined for ntz_int32. it can use __builtin_popcount but this function is not used on GCC environment because it uses __builtin_ctz. When another function uses this, using __builtin_popcount should be re-considered. * internal.h (rb_popcount64): ditto. * internal.h (ntz_int32): defined for ntz_intptr. * internal.h (ntz_int64): defined for ntz_intptr. * internal.h (ntz_intptr): defined as ntz for uintptr_t. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54854 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-04-30 15:39:02 +00:00
nobu	a491508753	string.c: rb_str_concat_literals * string.c (rb_str_concat_literals): concatenate literal string fragments. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54490 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-04-05 08:15:22 +00:00
nobu	0f32783976	string.c: skip invalid char gap * string.c (enc_succ_alnum_char): try to skip an invalid character gap between GREEK CAPITAL RHO and SIGMA. [ruby-core:74478] [Bug #12204] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54210 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-03-21 10:09:33 +00:00
nobu	49a272d728	string.c: Symbol#match * string.c (sym_match_m): delegate to String#match but not String#=~. [ruby-core:72864] [Bug #11991] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-18 12:06:20 +00:00
nobu	5a6a502ef9	string.c: fix rb_str_init * string.c (rb_str_init): fix segfault and memory leak, consider wide char encoding terminator. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53855 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-17 11:24:09 +00:00
naruse	d092fc5398	Additional fix and tests for r53851 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53854 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-17 10:15:28 +00:00
nobu	b6053df008	remove unnecessary declaration so that rdoc works git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53852 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-17 07:37:20 +00:00
naruse	49dee548f4	fix rubyspec error from r53850 http://rubyci.s3.amazonaws.com/tk2-243-31075/ruby-trunk/log/20160217T061402Z.fail.html.gz git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53851 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-17 07:24:13 +00:00
naruse	d46e2aea71	* string.c (rb_str_init): introduce String.new(capacity: size) [Feature #12024] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-17 03:21:35 +00:00
duerst	2ca7569c6d	* string.c, enc/unicode.c: Disassociating ONIGENC_CASE_FOLD flag from ONIGENC_CASE_DOWNCASE. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53778 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-08 11:44:12 +00:00
nobu	1bea5a6127	string.c: remove magic number * string.c (rb_str_dump): share same string literal instead of a magic number. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53774 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-08 03:44:48 +00:00
nobu	6442f02176	string.c: use encoding index * string.c (rb_external_str_with_enc, rb_str_concat, rb_str_dump): use encoding index as shortcut without rb_encoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53773 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-08 03:41:16 +00:00
nobu	94c70c7d72	fstring_enc_new * string.c (rb_fstring_enc_new, rb_fstring_enc_cstr): functions to make fstring with encoding. * re.c (rb_reg_initialize): make fstring without copying. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53736 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-04 06:35:34 +00:00
naruse	040ce05610	* string.c (str_new_frozen): if the given string is embeddedable but not embedded, embed a new copied string. [Bug #11946] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53724 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-03 04:52:13 +00:00
naruse	21daa56b2a	* re.c: Introduce RREGEXP_PTR. patch by dbussink. partially merge https://github.com/ruby/ruby/pull/497 * include/ruby/ruby.h: ditto. * gc.c: ditto. * ext/strscan/strscan.c: ditto. * parse.y: ditto. * string.c: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53715 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-02-02 04:39:44 +00:00
nobu	439224a590	RUBY_ASSERT * error.c (rb_assert_failure): assertion with stack dump. * ruby_assert.h (RUBY_ASSERT): new header for the assertion. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53615 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-22 08:33:55 +00:00
hsbt	4c6713f374	* string.c: fix a typo. [fix GH-1202][ci skip] Patch by @sunboshan git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53571 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-18 02:48:24 +00:00
duerst	e580847ce8	* string.c: Any kind of option is now taking the new code path for upcase/downcase/capitalize/swapcase. :lithuanian can be used for testing if no specific option is desired. * test/ruby/enc/test_case_mapping.rb: Adjusted to above. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53565 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-17 11:40:46 +00:00
duerst	959bbb6f72	* enc/unicode.c: Removed artificial expansion for Turkic, added hand-coded support for Turkic, fixed logic for swapcase. * string.c: Made use of new case mapping code possible from upcase, capitalize, and swapcase (with :lithuanian as a guard). * test/ruby/enc/test_case_mapping.rb: Adjusted for above. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53562 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-17 08:42:16 +00:00
duerst	c12af76763	* enc/unicode.c: Artificial mapping to test buffer expansion code. * string.c: Fixed buffer expansion logic. * test/ruby/enc/test_case_mapping.rb: Tests for above. (with Kimihito Matsui) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53554 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-16 08:24:58 +00:00
hsbt	219467abde	* enc/unicode.c: fix implicit conversion error with clang. fixup r53548. * string.c: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53552 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-16 01:51:58 +00:00
svn	72fa5a8ee5	* remove trailing spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53549 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-16 01:24:04 +00:00
duerst	be897c2507	* string.c, enc/unicode.c: New code path as a preparation for Unicode-wide case mapping. The code path is currently guarded by the :lithuanian option to avoid accidental problems in daily use. * test/ruby/enc/test_case_mapping.rb: Test for above. * string.c: function 'check_case_options': fixed logical errors git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-16 01:24:03 +00:00
duerst	4a5d3572e6	string.c: made a variable name more grammatically correct git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53510 b2dd03c8-39d4-4d8f-98ff-823fe69b080e	2016-01-12 09:42:07 +00:00

... 3 4 5 6 7 ...

1637 Коммитов