Граф коммитов

26 Коммитов

Автор SHA1 Сообщение Дата
yui-knk 899d9f79dd Rename `vast` to `ast_value`
There is an English word "vast".
This commit changes the name to be more clear name to avoid confusion.
2024-05-03 12:40:35 +09:00
HASUMI Hitoshi 2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, considers it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
HASUMI Hitoshi 9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
Étienne Barrié 12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduce "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than to raise a
`FrozenError`.

Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explictly enabled
or disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
Jean Boussier d4f3dcf4df Refactor VM root modules
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.

Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.

So a baseline of 5kB saved, but since `mark_object_ary` is
preallocated with `1024` slots but only use `405` of them,
it's a net `7kB` save.

`vm->mark_object_ary` is also being refactored.

Prior to this changes, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.

This has the detrimental effect of marking these references on every
minors even though it's a mostly append only list.

But using a custom TypedData we can save from having to mark
all the references on minor GC runs.

Addtionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
2024-03-06 15:33:43 -05:00
Nobuyoshi Nakada 127b19ab56 Use line numbers as builtin-index
The order of iseq may differ from the order of tokens, typically
`while`/`until` conditions are put after the body.

These orders can match by using line numbers as builtin-indexes, but
at the same time, it introduces the restriction that multiple `cexpr!`
and `cstmt!` cannot appear in the same line.

Another possible idea is to use `RubyVM::AbstractSyntaxTree` and
`node_id` instead of ripper, with making BASERUBY 3.1 or later.
2024-01-22 19:39:34 +09:00
Nobuyoshi Nakada 40fc9b070c
[DOC] No document for internal or debug methods 2023-12-18 20:17:45 +09:00
Takashi Kokubun 38be9a9b72
Clean up OPT_STACK_CACHING (#8132) 2023-07-27 17:27:05 -07:00
yui-knk 1740482d06 Fix rb_compile_option_t comments [ci skip] 2023-06-18 14:39:15 +09:00
Nobuyoshi Nakada 677c3228d0 Check loading built-in binaries 2023-03-08 13:59:21 +09:00
Samuel Williams 75cf29f60d Rework `first_lineno` to be `int`. 2022-09-26 00:41:16 +13:00
git c1a7e86f40 * expand tabs. [ci skip]
Tabs were expanded because the file did not have any tab indentation in unedited lines.
Please update your editor config, and use misc/expand_tabs.rb in the pre-commit hook.
2021-06-17 06:09:26 +09:00
John Hawthorn c10d5085a2 Enable frozen_string_literal in builtin_iseq_load
Currently this has a fairly minor effect as strings are not used heavily
inside the builtins (outside of warnings, requires, and errors).
Hopefully this allows us to use strings in the future where appropriate.
2021-06-16 14:09:09 -07:00
卜部昌平 5e22f873ed decouple internal.h headers
Saves comitters' daily life by avoid #include-ing everything from
internal.h to make each file do so instead.  This would significantly
speed up incremental builds.

We take the following inclusion order in this changeset:

1.  "ruby/config.h", where _GNU_SOURCE is defined (must be the very
    first thing among everything).
2.  RUBY_EXTCONF_H if any.
3.  Standard C headers, sorted alphabetically.
4.  Other system headers, maybe guarded by #ifdef
5.  Everything else, sorted alphabetically.

Exceptions are those win32-related headers, which tend not be self-
containing (headers have inclusion order dependencies).
2019-12-26 20:45:12 +09:00
Koichi Sasada 97a17a51b8 readable function names for inline functions.
Now, C functions written by __builtin_cexpr!(code) and others are
named as "__builtin_inline#{n}". However, it is difficult to know
what the function is. This patch rename them into
"__builtin_foo_#{lineno}" when cexpr! is in 'foo' method.
2019-12-13 17:55:45 +09:00
Koichi Sasada 40026a408d support cross-compilation.
On cross-compilation, compiled binary can no be created because
compiled binary should be created by same interpreter (on cross-
compilation, host ruby is used to build ruby (BASERUBY)).
So that cross-compilation system loads required scripts in text.
It is same as miniruby.
2019-12-11 11:24:42 +09:00
Koichi Sasada 9c2807b2df remove prelude.c
prelude.c is an automatically generated file by template/prelude.c.tmpl.
However it does not contain any required functions. So remove it from
dependency.

Also miniprelude.c is included by mini_builtin.c and does not need
to make miniprelude.o.
2019-12-11 11:24:42 +09:00
Koichi Sasada 2c5c60754c use compiled binary for gem_prelude.rb.
`gem_prelude.rb` is not compiled yet. This patch compile it to
compiled binary.
2019-12-11 11:24:42 +09:00
Koichi Sasada 71fee9bc72 vm_invoke_builtin_delegate with start index.
opt_invokebuiltin_delegate and opt_invokebuiltin_delegate_leave
invokes builtin functions with same parameters of the method.
This technique eliminate stack push operations. However, delegation
parameters should be completely same as given parameters.
(e.g. `def foo(a, b, c) __builtin_foo(a, b, c)` is okay, but
__builtin_foo(b, c) is not allowed)

This patch relaxes this restriction. ISeq has a local variables
table which includes parameters. For example, the method defined
as `def foo(a, b, c) x=y=nil`, then local variables table contains
[a, b, c, x, y]. If calling builtin-function with arguments which
are sub-array of the lvar table, use opt_invokebuiltin_delegate
instruction with start index. For example, `__builtin_foo(b, c)`,
`__builtin_bar(c, x, y)` is okay, and so on.
2019-11-18 10:16:11 +09:00
Yuichiro Kaneko ef03d48cb2 Fix a typo 2019-11-10 22:21:40 +09:00
Nobuyoshi Nakada e3c8524411
Full-path of builtin scripts no longer needed 2019-11-09 19:43:14 +09:00
Nobuyoshi Nakada dfaac2b372
Embed builtin ruby scripts in miniprelude.c
Instead of reading from the files by the full-path at runtime.  As
rbinc files need to be included in distributed tarballs, the
full-paths at the packaging are unavailable at compilation times.
2019-11-09 19:28:45 +09:00
Koichi Sasada b5d8849220 Revert "don't embed full-path."
This reverts commit dfac2e9eb3.

It does not work if cwd is different from builddir...
2019-11-09 07:09:01 +09:00
Koichi Sasada dfac2e9eb3 don't embed full-path.
miniruby load *.rb from srcdir. To specify file path,
tool/mk_builtin_loader.rb embed full path of each *.rb file.
However it prevent to pre-generation of required files for tarball.
This patch generate srcdir/*.rb from __FILE__ information.
2019-11-09 06:57:58 +09:00
Nobuyoshi Nakada 61cff5c51c
Removed BOM [ci skip] 2019-11-08 13:21:29 +09:00
Koichi Sasada 46acd0075d support builtin features with Ruby and C.
Support loading builtin features written in Ruby, which implement
with C builtin functions.
[Feature #16254]

Several features:

(1) Load .rb file at boottime with native binary.

Now, prelude.rb is loaded at boottime. However, this file is contained
into the interpreter as a text format and we need to compile it.
This patch contains a feature to load from binary format.

(2) __builtin_func() in Ruby call func() written in C.

In Ruby file, we can write `__builtin_func()` like method call.
However this is not a method call, but special syntax to call
a function `func()` written in C. C functions should be defined
in a file (same compile unit) which load this .rb file.

Functions (`func` in above example) should be defined with
  (a) 1st parameter: rb_execution_context_t *ec
  (b) rest parameters (0 to 15).
  (c) VALUE return type.
This is very similar requirements for functions used by
rb_define_method(), however `rb_execution_context_t *ec`
is new requirement.

(3) automatic C code generation from .rb files.

tool/mk_builtin_loader.rb creates a C code to load .rb files
needed by miniruby and ruby command. This script is run by
BASERUBY, so *.rb should be written in BASERUBY compatbile
syntax. This script load a .rb file and find all of __builtin_
prefix method calls, and generate a part of C code to export
functions.

tool/mk_builtin_binary.rb creates a C code which contains
binary compiled Ruby files needed by ruby command.
2019-11-08 09:09:29 +09:00