github/ruby - ruby

Commit graph

Author	SHA1	Message	Date
Peter Zhu	176c4bb3c7	Fix corruption of internal encoding string [Bug #20598] Just like [Bug #20595], Encoding#name_list and Encoding#aliases can have their strings corrupted when Encoding.default_internal is set to nil. Co-authored-by: Matthew Valentine-House <matt@eightbitraptor.com>	2024-06-27 14:06:40 -04:00
Peter Zhu	c6a0d03649	Fix corruption of encoding name string [Bug #20595] enc_set_default_encoding will free the C string if the encoding is nil, but the C string can be used by the encoding name string. This will cause the encoding name string to be corrupted. Consider the following code: `Encoding.default_internal = Encoding::ASCII_8BIT; names = Encoding.default_internal.names; p names; Encoding.default_internal = nil; p names` It outputs: `["ASCII-8BIT", "BINARY", "internal"]` and then `["ASCII-8BIT", "BINARY", "\x00\x00\x00\x00\x00\x00\x00\x00"]` Co-authored-by: Matthew Valentine-House <matt@eightbitraptor.com>	2024-06-27 09:47:22 -04:00
Jean Boussier	3a7846b1aa	Add a hint of `ASCII-8BIT` being `BINARY` [Feature #18576] Since outright renaming `ASCII-8BIT` is deemed too backward incompatible, the next best thing would be to only change its `#inspect`, particularly in exception messages.	2024-04-18 10:17:26 +02:00
Jean Boussier	d4f3dcf4df	Refactor VM root modules This `st_table` is used to both mark and pin classes defined from the C API. But `vm->mark_object_ary` already does both much more efficiently. Currently a Ruby process starts with 252 rooted classes, which uses `7224B` in an `st_table` or `2016B` in an `RArray`. So a baseline of 5kB saved, but since `mark_object_ary` is preallocated with `1024` slots but only uses `405` of them, it's a net `7kB` save. `vm->mark_object_ary` is also being refactored. Prior to this change, `mark_object_ary` was a regular `RArray`, but since this allows for references to be moved, it was marked a second time from `rb_vm_mark()` to pin these objects. This has the detrimental effect of marking these references on every minor GC even though it's a mostly append-only list. By using a custom TypedData we can avoid having to mark all the references on minor GC runs. Additionally, immediate values are now ignored and not appended to `vm->mark_object_ary` as it's just wasted space.	2024-03-06 15:33:43 -05:00
Peter Zhu	c7ce2f537f	Fix memory leak in setting encodings There is a memory leak in Encoding.default_external= and Encoding.default_internal= because the duplicated name is not freed when overwriting. 10.times do 1_000_000.times do Encoding.default_internal = nil end puts `ps -o rss= -p #{$$}` end Before: 25664 41504 57360 73232 89168 105056 120944 136816 152720 168576 After: 9648 9648 9648 9680 9680 9680 9680 9680 9680 9680	2024-01-03 13:31:43 -05:00
Adam Hess	6816e8efcf	Free everything at shutdown when the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown. Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org> Co-authored-by: Peter Zhu <peter@peterzhu.ca>	2023-12-07 15:52:35 -05:00
Jean Boussier	60c924770d	Mark Encoding as Write Barrier protected It doesn't even have a mark function. It's only about a hundred objects, but there is no reason to scan them every time.	2023-02-07 11:48:57 +01:00
Benoit Daloze	6abe20e87b	Remove Encoding#replicate	2023-01-11 13:41:41 +01:00
Koichi Sasada	dbf77d420d	suppress warning now that `enc_table->list` is not a pointer.	2022-12-16 11:12:37 +09:00
Koichi Sasada	ae19ac5b5b	fixed encoding table This reduces global lock acquiring for reading. https://bugs.ruby-lang.org/issues/18949	2022-12-16 10:04:37 +09:00
Benoit Daloze	6525b6f760	Remove get_actual_encoding() and the dynamic endian detection for dummy UTF-16/UTF-32 * And simplify callers of get_actual_encoding(). * See [Feature #18949]. * See https://github.com/ruby/ruby/pull/6322#issuecomment-1242758474	2022-09-12 14:02:34 +02:00
Benoit Daloze	14bcf69c9c	Deprecate Encoding#replicate * See [Feature #18949].	2022-09-10 19:02:15 +02:00
Takashi Kokubun	5b21e94beb	Expand tabs [ci skip] [Misc #18891]	2022-07-21 09:42:04 -07:00
Jean Boussier	d084585f01	Rename ENCINDEX_ASCII to ENCINDEX_ASCII_8BIT Otherwise it's way too easy to confuse it with US_ASCII.	2022-07-19 08:48:56 +02:00
Burdette Lamar	81741690a0	[DOC] Main doc for encodings moved from encoding.c to doc/encodings.rdoc (#5748) Main doc for encodings moved from encoding.c to doc/encodings.rdoc	2022-04-01 20:41:04 -05:00
Nobuyoshi Nakada	7459a32af3	suppress warnings for probable NULL dereferences	2021-10-24 19:24:50 +09:00
卜部昌平	5112a54846	include/ruby/encoding.h: convert macros into inline functions Less macros == huge win.	2021-10-05 14:18:23 +09:00
Jeremy Evans	3f7da458a7	Make encoding loading not issue warning Instead of relying on setting and unsetting ruby_verbose, which is not thread-safe, restructure require_internal and load_lock to accept a warn argument for whether to warn, and add rb_require_internal_silent to require without warnings. Use rb_require_internal_silent when loading encoding. Note this does not modify ruby_debug and errinfo handling, those remain thread-unsafe. Also use silent requires when loading transcoders.	2021-10-02 05:51:29 -09:00
S-H-GAMELINKS	18031f4102	Add rb_encoding_check function	2021-08-22 10:39:14 +09:00
S.H	378e8cdad6	Using RBOOL macro	2021-08-02 12:06:44 +09:00
Jean Boussier	7e8a9af9db	rb_enc_interned_str: handle autoloaded encodings If called with an autoloaded encoding that was not yet initialized, `rb_enc_interned_str` would crash with a NULL pointer exception. See: https://github.com/ruby/ruby/pull/4119#issuecomment-800189841	2021-03-22 21:37:48 +09:00
Koichi Sasada	2a3324fcd2	No sync on ASCII/US_ASCII/UTF-8 rb_enc_from_index(index) doesn't need locking if index specifies ASCII/US_ASCII/UTF-8. rb_enc_from_index() is called frequently so it has impact. user system total real r_parallel/miniruby 174 0.000209 0.000000 5.559872 ( 1.811501) r_parallel/master_mini 175 0.000238 0.000000 12.664707 ( 3.523641) (repeat x1000 `s.split(/,/)` where s = '0,,' * 1000)	2020-12-17 03:44:23 +09:00
Lars Kanis	94b6933d1c	Set default for Encoding.default_external to UTF-8 on Windows (#2877) * Use UTF-8 as default for Encoding.default_external on Windows * Document UTF-8 change on Windows to Encoding.default_external fix https://bugs.ruby-lang.org/issues/16604	2020-12-08 01:48:37 +09:00
Koichi Sasada	5e3259ea74	fix public interface To make some kind of Ractor related extensions, some functions should be exposed. * include/ruby/thread_native.h * rb_native_mutex_* * rb_native_cond_* * include/ruby/ractor.h * RB_OBJ_SHAREABLE_P(obj) * rb_ractor_shareable_p(obj) * rb_ractor_std() rb_cRactor and rm ractor_pub.h and rename srcdir/ractor.h to srcdir/ractor_core.h (to avoid conflict with include/ruby/ractor.h)	2020-11-18 03:52:41 +09:00
Stefan Stüben	8c2e5bbf58	Don't redefine #rb_intern over and over again	2020-10-21 12:45:18 +09:00
Koichi Sasada	a76a30724d	Revert "reduce lock for encoding" This reverts commit `de17e2dea1`. This patch can introduce race condition because of conflicting read/write access for enc_table::default_list. Maybe we need to freeze default_list at the end of Init_encdb() in enc/encdb.c.	2020-10-20 01:34:17 +09:00
Koichi Sasada	de17e2dea1	reduce lock for encoding To reduce the amount of locking for encoding manipulation, enc_table::list is split into ::default_list and ::additional_list. ::default_list is pre-allocated and needs no locking to access. If additional encoding space is needed, ::additional_list is used, and this list needs locking. However, in most cases ::default_list is enough.	2020-10-19 14:06:40 +09:00
Nobuyoshi Nakada	7ffd14a18c	Check encoding name to replicate https://hackerone.com/reports/954433	2020-10-15 16:48:25 +09:00
Koichi Sasada	102c2ba65f	freeze Encoding objects Encoding objects can be accessed in multi-ractors and there is no state to mutate. So we can mark it as frozen and shareable. [Bug #17188]	2020-10-14 14:02:06 +09:00
Koichi Sasada	11c2f0f36c	sync enc_table and rb_encoding_list enc_table manages Encoding information. rb_encoding_list also manages Encoding objects. Both are accessed/modified by ractors simultaneously, so they should be synchronized. For enc_table, this patch introduces GLOBAL_ENC_TABLE_ENTER/LEAVE/EVAL to access this table with the VM lock. As a shortcut, three new global variables global_enc_ascii, global_enc_utf_8, global_enc_us_ascii are also introduced. For rb_encoding_list, we split it into rb_default_encoding_list (256 entries) and rb_additional_encoding_list. rb_default_encoding_list is a fixed-size Array so we don't need to synchronize it (and most apps only need it). To manage 257 or more encoding objects, they are stored in rb_additional_encoding_list. To access rb_additional_encoding_list, the VM lock is needed.	2020-10-14 14:02:06 +09:00
Nobuyoshi Nakada	9e67a38fde	Fallback to built-in UTF-8 for miniruby Source code encoding is defaulted to UTF-8 now too.	2020-05-16 17:36:30 +09:00
卜部昌平	9e41a75255	sed -i 's\|ruby/impl\|ruby/internal\|' To fix build failures.	2020-05-11 09:24:08 +09:00
卜部昌平	d7f4d732c1	sed -i s\|ruby/3\|ruby/impl\|g This shall fix compile errors.	2020-05-11 09:24:08 +09:00
Nobuyoshi Nakada	5d430c1b34	Added more NORETURN declarations	2020-05-11 00:40:14 +09:00
卜部昌平	9e6e39c351	Merge pull request #2991 from shyouhei/ruby.h Split ruby.h	2020-04-08 13:28:13 +09:00
Nobuyoshi Nakada	fce667ed08	Get rid of warnings/exceptions at cleanup After the encoding index instance variable is removed when all instance variables are removed in `obj_free`, then `rb_str_free` causes uninitialized instance variable warning and nil-to-integer conversion exception. Both cases result in object allocation during GC, and crashes.	2020-02-13 12:46:48 +09:00
卜部昌平	f83781c8c1	rb_enc_str_asciionly_p expects T_STRING This `str2` variable can be non-string (regexp etc.) but the previous code passed it directly to rb_enc_str_asciionly_p(), which expects its argument be a string. Let's enforce that constraint.	2020-02-10 12:19:30 +09:00
卜部昌平	115fec062c	more on NULL versus functions. Function pointers are not void*. See also `ce4ea956d2` `8427fca49b`	2020-02-07 14:24:19 +09:00
Lars Kanis	a4fca28b80	Fix description of Encoding.default_(in\|ex)ternal Data written to files is not transcoded per default, but only when default_internal is set. The default for default_internal is nil and doesn't depend on the source file encoding.	2020-02-03 08:42:01 -08:00
卜部昌平	5e22f873ed	decouple internal.h headers Saves committers' daily life by no longer #include-ing everything from internal.h and making each file do so instead. This would significantly speed up incremental builds. We take the following inclusion order in this changeset: 1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing among everything). 2. RUBY_EXTCONF_H if any. 3. Standard C headers, sorted alphabetically. 4. Other system headers, maybe guarded by #ifdef 5. Everything else, sorted alphabetically. Exceptions are those win32-related headers, which tend not to be self-contained (headers have inclusion order dependencies).	2019-12-26 20:45:12 +09:00
QuestionDriven	54be15f325	[Doc] Fix sample in Encoding#names	2019-12-22 23:01:45 +09:00
QuestionDriven	9654241d5d	[Doc] Fix wrong example in Encoding.aliases	2019-12-22 23:01:45 +09:00
Jeremy Evans	ffd0820ab3	Deprecate taint/trust and related methods, and make the methods no-ops This removes the related tests, and puts the related specs behind version guards. This affects all code in lib, including some libraries that may want to support older versions of Ruby.	2019-11-18 01:00:25 +02:00
Jeremy Evans	c5c05460ac	Warn on access/modify of $SAFE, and remove effects of modifying $SAFE This removes the security features added by $SAFE = 1, and warns for access or modification of $SAFE from Ruby-level, as well as warning when calling all public C functions related to $SAFE. This modifies some internal functions that took a safe level argument to no longer take the argument. rb_require_safe now warns, rb_require_string has been added as a version that takes a VALUE and does not warn. One public C function that still takes a safe level argument and that this doesn't warn for is rb_eval_cmd. We may want to consider adding an alternative method that does not take a safe level argument, and warn for rb_eval_cmd.	2019-11-18 01:00:25 +02:00
Nobuyoshi Nakada	8869384367	Moved Init_encoding from wrong place [Bug #16292]	2019-11-05 10:28:01 +09:00
卜部昌平	7e0ae1698d	avoid overflow in integer multiplication This changeset basically replaces `ruby_xmalloc(x * y)` with `ruby_xmalloc2(x, y)`. Some convenience functions are also provided, for instance `rb_xmalloc_mul_add(x, y, z)`, which allocates x * y + z bytes.	2019-10-09 12:12:28 +09:00
Lars Kanis	9311656914	Better wording for __ENCODING__ "locale encoding" is misleading since it doesn't mean Encoding.find("locale") but the encoding used to interpret the script file. It's therefore better to call it "script encoding" as in the paragraphs above. Closes: https://github.com/ruby/ruby/pull/1611	2019-08-04 09:03:46 +09:00
Lourens Naudé	009ec37a47	Let the index boundary check in rb_enc_from_index be flagged as unlikely [Misc #15806] Closes: https://github.com/ruby/ruby/pull/2128	2019-07-23 16:45:54 +09:00
Lourens Naudé	6546aed475	Explicitly initialise encodings on init to remove branches on encoding lookup [Misc #15806] Closes: https://github.com/ruby/ruby/pull/2128	2019-07-23 16:45:54 +09:00
Koichi Sasada	8ac1c6eb48	respect RUBY_DEBUG too	2019-07-15 12:06:25 +09:00
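
The reproduction described in commit c6a0d03649 [Bug #20595] can be sketched as a small Ruby script. The method name `names_after_reset` is hypothetical and exists only for this illustration; the body follows the steps quoted in the commit message, with added cleanup so the process-wide setting is restored afterwards.

```ruby
# Hypothetical helper (not part of Ruby): captures Encoding#names while
# `encoding` is the default internal encoding, then resets default_internal
# to nil and returns the array that was captured earlier.
def names_after_reset(encoding)
  saved_internal = Encoding.default_internal
  saved_verbose = $VERBOSE
  $VERBOSE = nil # assigning default_internal warns when $VERBOSE is true

  Encoding.default_internal = encoding
  names = Encoding.default_internal.names
  p names                          # first print, as in the commit message

  Encoding.default_internal = nil  # the step that freed the C string before the fix
  names
ensure
  # Restore global state so the sketch has no lasting side effects.
  Encoding.default_internal = saved_internal
  $VERBOSE = saved_verbose
end

p names_after_reset(Encoding::ASCII_8BIT) # second print, after the reset
```

On a build that includes the fix, the two printed arrays would be expected to match; on an affected build, the commit message reports that the last element is replaced by NUL bytes.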


400 commits