This commit is contained in:
Koichi Sasada 2020-12-24 17:41:48 +09:00
Parent a89932799c
Commit 8664c3ddef
1 changed file: 129 additions and 81 deletions


@ -8,14 +8,14 @@ Ractor is designed to provide a parallel execution feature of Ruby without threa
You can make multiple Ractors and they run in parallel.
* Ractors run in parallel.
* `Ractor.new{ expr }` creates a new Ractor and `expr` is run in parallel on a parallel computer.
* Interpreter invokes with the first Ractor (called *main Ractor*).
* If the main Ractor terminates, all Ractors receive a terminate request, as with Threads (if the main thread (the first invoked Thread) terminates, the Ruby interpreter asks all running threads to terminate).
* Each Ractor has 1 or more Threads.
* Threads in a Ractor shares a Ractor-wide global lock like GIL (GVL in MRI terminology), so they can't run in parallel (without releasing GVL explicitly in C-level).
* Threads in a Ractor share a Ractor-wide global lock like the GIL (GVL in MRI terminology), so they can't run in parallel (without releasing the GVL explicitly at the C level). Threads in different ractors run in parallel.
* The overhead of creating a Ractor is similar to overhead of one Thread creation.
### Limited sharing
### Limited sharing between multiple ractors
Ractors don't share everything, unlike threads.
@ -24,7 +24,8 @@ Ractors don't share everything, unlike threads.
* Immutable objects: frozen objects which don't refer to unshareable-objects.
* `i = 123`: `i` is an immutable object.
* `s = "str".freeze`: `s` is an immutable object.
* `a = [1, [2], 3].freeze`: `a` is not an immutable object because `a` refer unshareable-object `[2]` (which is not frozen).
* `a = [1, [2], 3].freeze`: `a` is not an immutable object because `a` refers to the unshareable object `[2]` (which is not frozen).
* `h = {c: Object}.freeze`: `h` is an immutable object because `h` refers to the Symbol `:c` and the `Object` class object, which is shareable even though it is not frozen.
* Class/Module objects
* Special shareable objects
* Ractor object itself.
@ -35,27 +36,28 @@ Ractors don't share everything, unlike threads.
Ractors communicate with each other and synchronize the execution by message exchanging between Ractors. There are two message exchange protocols: push type (message passing) and pull type.
* Push type message passing: `Ractor#send(obj)` and `Ractor.receive()` pair.
* Sender ractor passes the `obj` to receiver Ractor.
* Sender knows a destination Ractor (the receiver of `r.send(obj)`) and the receiver does not know the sender (accept all message from any ractors).
* Receiver has infinite queue and sender enqueues the message. Sender doesn't block to put message.
* This type is based on actor model
* Sender ractor passes the `obj` to the ractor `r` by `r.send(obj)` and receiver ractor receives the message with `Ractor.receive`.
* Sender knows the destination Ractor `r` and the receiver does not know the sender (accept all message from any ractors).
* Receiver has infinite queue and sender enqueues the message. Sender doesn't block to put message into this queue.
* This type of message exchange is employed by many other Actor-based languages.
* `Ractor.receive_if{ filter_expr }` is a variant of `Ractor.receive` to select a message.
* Pull type communication: `Ractor.yield(obj)` and `Ractor#take()` pair.
* Sender ractor declare to yield the `obj` and receiver Ractor take it.
* Sender doesn't know a destination Ractor and receiver knows the sender (the receiver of `r.take`).
* Sender ractor declares to yield the `obj` by `Ractor.yield(obj)` and receiver Ractor takes it with `r.take`.
* Sender doesn't know a destination Ractor and receiver knows the sender Ractor `r`.
* Sender or receiver will block if there is no other side.
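The push type protocol described above can be sketched as follows. This is a minimal illustration, not part of the commit: the names `worker` and `reply_to` are hypothetical, and it uses only `Ractor#send` and `Ractor.receive` so that the reply also travels over the push path.

```ruby
# Push type: the sender names the destination Ractor explicitly,
# while the receiver accepts messages from any Ractor.
main = Ractor.current

# Ractor objects are shareable, so `main` can be passed as an argument.
worker = Ractor.new(main) do |reply_to|
  msg = Ractor.receive        # blocks until a message is in the incoming queue
  reply_to.send(msg.upcase)   # push the result back to the main Ractor
end

worker.send("hello")          # does not block: the incoming queue is unbounded
reply = Ractor.receive        # blocks until the worker has replied
puts reply
```

The worker never learns who sent `"hello"`; it can only answer because the main Ractor was handed to it explicitly.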
### Copy & Move semantics to send messages
To send unshareable objects as messages, objects are copied or moved.
* Copy: use deep-copy (like dRuby)
* Move: move membership
* Copy: use deep-copy.
* Move: move membership.
* Sender can not access the moved object after moving the object.
* Guarantees that only 1 Ractor can access the object at a time.
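The difference between the two semantics can be sketched from the sender's side. This is an illustrative example under the assumption that a moved object raises `Ractor::MovedError` when the sender touches it; the names `sink`, `copied` and `moved` are hypothetical.

```ruby
# A Ractor that simply consumes two messages.
sink = Ractor.new do
  2.times { Ractor.receive }
end

# Copy (default): the receiver gets a deep copy, the sender keeps its object.
copied = "copied".dup
sink.send(copied)
copied << "!"                 # still usable on the sender side

# Move: membership is transferred, the sender loses access.
moved = "moved".dup
sink.send(moved, move: true)

moved_error =
  begin
    moved.upcase              # any method call on a moved object fails
    nil
  rescue Ractor::MovedError => e
    e.class
  end

p copied
p moved_error
```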
### Thread-safety
Ractor helps to write a thread-safe program, but we can make thread-unsafe programs with Ractors.
Ractor helps to write a thread-safe concurrent program, but we can make thread-unsafe programs with Ractors.
* GOOD: Sharing limitation
* Most objects are unshareable, so we can't make data-racy and race-conditional programs.
@ -68,18 +70,18 @@ Ractor helps to write a thread-safe program, but we can make thread-unsafe progr
* Some kind of shareable objects can introduce transactions (STM, for example). However, misusing transactions will generate inconsistent state.
Without Ractor, we need to trace all of state-mutations to debug thread-safety issues.
With Ractor, you can concentrate to suspicious
With Ractor, you can concentrate on the suspicious code which is shared between Ractors.
## Creation and termination
### `Ractor.new`
* `Ractor.new do expr end` generates another Ractor.
* `Ractor.new{ expr }` generates another Ractor.
```ruby
# Ractor.new with a block creates new Ractor
r = Ractor.new do
# This block will be run in parallel
# This block will be run in parallel with other ractors
end
# You can name a Ractor with `name:` argument.
@ -93,15 +95,11 @@ r.name #=> 'test-name'
### Given block isolation
The Ractor execute given `expr` in a given block.
Given block will be isolated from outer scope by `Proc#isolate`.
Given block will be isolated from outer scope by `Proc#isolate`. To prevent sharing unshareable objects between ractors, block outer-variables, `self` and other information are isolated.
The given block will be isolated by the `Proc#isolate` method (not exposed yet for Ruby users). `Proc#isolate` is called at Ractor creation time (when `Ractor.new` is called). If the given Proc object cannot be isolated because of outer variables and so on, an error will be raised.
```ruby
# To prevent sharing unshareable objects between ractors,
# block outer-variables, `self` and other information are isolated.
# Given block will be isolated by `Proc#isolate` method.
# `Proc#isolate` is called at Ractor creation timing (`Ractor.new` is called)
# and it can cause an error if block accesses outer variables.
begin
a = true
r = Ractor.new do
@ -116,6 +114,7 @@ end
```ruby
r = Ractor.new do
p self.class #=> Ractor
self.object_id
end
r.take == self.object_id #=> false
@ -177,18 +176,23 @@ end
## Communication between Ractors
Communication between Ractors is achieved by sending and receiving messages.
Communication between Ractors is achieved by sending and receiving messages. There are two ways to communicate with each other.
* (1) Message sending/receiving
* (1-1) push type send/receive (sender knows receiver). similar to the Actor model.
* (1-2) pull type yield/take (receiver knows sender).
* (2) Using shareable container objects (not implemented yet)
* (2) Using shareable container objects
* Ractor::TVar gem ([ko1/ractor-tvar](https://github.com/ko1/ractor-tvar))
* more?
Users can control blocking on (1), but should not control on (2) (only manage as critical section).
Users can control program execution timing with (1), but should not control with (2) (only manage as critical section).
For message sending and receiving, there are two types APIs: push type and pull type.
* (1-1) send/receive (push type)
* `Ractor#send(obj)` (`Ractor#<<(obj)` is an alias) sends a message to the Ractor's incoming port. The incoming port is connected to an infinite-size incoming queue, so `Ractor#send` will never block.
* `Ractor.receive` dequeues a message from its own incoming queue. If the incoming queue is empty, the `Ractor.receive` call will block.
* `Ractor.receive_if{|msg| filter_expr }` is a variant of `Ractor.receive`. `receive_if` only receives a message for which `filter_expr` is true (so `Ractor.receive` is the same as `Ractor.receive_if{ true }`).
* (1-2) yield/take (pull type)
* `Ractor.yield(obj)` sends a message via the outgoing port to a Ractor which is calling `Ractor#take`. If no Ractors are waiting for it, the `Ractor.yield(obj)` will block. If multiple Ractors are waiting for `Ractor.yield(obj)`, only one Ractor can receive the message.
* `Ractor#take` receives a message which the specified Ractor is waiting to hand over with `Ractor.yield(obj)`. If the Ractor has not called `Ractor.yield` yet, the `Ractor#take` call will block.
@ -196,13 +200,13 @@ Users can control blocking on (1), but should not control on (2) (only manage as
* You can close the incoming port or outgoing port.
* You can close them with `Ractor#close_incoming` and `Ractor#close_outgoing`.
* If the incoming port is closed for a Ractor, you can't `send` to the Ractor. If `Ractor.receive` is blocked for the closed incoming port, then it will raise an exception.
* If the outgoing port is closed for a Ractor, you can't call `Ractor#take` and `Ractor.yield` on the Ractor. If `Ractor#take` is blocked for the Ractor, then it will raise an exception.
* If the outgoing port is closed for a Ractor, you can't call `Ractor#take` and `Ractor.yield` on the Ractor. If ractors are blocking by `Ractor#take` or `Ractor.yield`, closing outgoing port will raise an exception on these blocking ractors.
* When a Ractor is terminated, the Ractor's ports are closed.
* There are 3 methods to send an object as a message
* (1) Send a reference: Send a shareable object, send only a reference to the object (fast)
* (2) Copy an object: Send an unshareable object by copying deeply and send copied object (slow). Note that you can not send an object which is not support deep copy. Current implementation uses Marshal protocol to get deep copy.
* (3) Move an object: Send an unshareable object reference with a membership. Sender Ractor can not access moved objects anymore (raise an exception). Current implementation makes new object as a moved object for receiver Ractor and copy references of sending object to moved object.
* You can choose "Copy" and "Send" as a keyword for `Ractor#send(obj)` and `Ractor.yield(obj)` (default is "Copy").
* There are 3 ways to send an object as a message
* (1) Send a reference: Sending a shareable object sends only a reference to the object (fast)
* (2) Copy an object: Sending an unshareable object by copying the object deeply (slow). Note that you can not send an object which does not support deep copy. Some `T_DATA` objects are not supported.
* (3) Move an object: Sending an unshareable object reference with a membership. The sender Ractor can not access moved objects anymore (an exception is raised) after moving them. The current implementation makes a new object as a moved object for the receiver Ractor and copies references of the sending object to the moved object.
* You can choose "Copy" and "Move" by the `move:` keyword, `Ractor#send(obj, move: true/false)` and `Ractor.yield(obj, move: true/false)` (default is `false` (COPY)).
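The difference between (1) and (2) can be observed by echoing objects through another Ractor and comparing identities. The following is a small sketch; the `echo` Ractor and the variable names are hypothetical and only `Ractor#send` and `Ractor.receive` are used.

```ruby
# A Ractor that sends back whatever it receives, twice in total.
echo = Ractor.new(Ractor.current) do |main|
  2.times { main.send(Ractor.receive) }
end

# (1) A shareable object (frozen String) travels as a reference.
shared = "frozen".freeze
echo.send(shared)
same_ref = Ractor.receive.equal?(shared)

# (2) An unshareable object (mutable String) is deep-copied on each send.
mutable = "mutable".dup
echo.send(mutable)
back = Ractor.receive

p same_ref               # identity is preserved for the shareable object
p back.equal?(mutable)   # the copy is a different object
p back == mutable        # but it has the same content
```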
### Sending/Receiving ports
@ -222,13 +226,13 @@ Each Ractor has _incoming-port_ and _outgoing-port_. Incoming-port is connected
Connection example: r2.send obj on r1, Ractor.receive on r2
+----+ +----+
* r1 |-----* r2 *
* r1 |---->* r2 *
+----+ +----+
Connection example: Ractor.yield(obj) on r1, r1.take on r2
+----+ +----+
* r1 *------ r2 *
* r1 *---->- r2 *
+----+ +----+
Connection example: Ractor.yield(obj) on r1 and r2,
@ -237,7 +241,7 @@ Connection example: Ractor.yield(obj) on r1 and r2,
+----+
* r1 *------+
+----+ |
+----- Ractor.select(r1, r2)
+----> Ractor.select(r1, r2)
+----+ |
* r2 *------|
+----+
@ -252,6 +256,19 @@ Connection example: Ractor.yield(obj) on r1 and r2,
r.take # Receive from r's outgoing port
```
The last example shows the following ractor network.
```
+------+ +---+
* main |------> * r *---+
+-----+ +---+ |
^ |
+-------------------+
```
This code can be rewritten more simply by using an argument for `Ractor.new`.
```ruby
# Actual argument 'ok' for `Ractor.new()` will be sent to the created Ractor.
r = Ractor.new 'ok' do |msg|
@ -265,6 +282,25 @@ Connection example: Ractor.yield(obj) on r1 and r2,
r.take #=> `ok`
```
### Return value of a block for `Ractor.new`
As already explained, the return value of the block given to `Ractor.new` (the evaluated value of `expr` in `Ractor.new{ expr }`) can be taken by `Ractor#take`.
```ruby
Ractor.new{ 42 }.take #=> 42
```
When the block return value is available, the Ractor is already dead, so no Ractor other than the taking Ractor can touch the return value. Any value can therefore be sent over this communication path without modification.
```ruby
r = Ractor.new do
a = "hello"
binding
end
r.take.eval("p a") #=> "hello" (other communication path can not send a Binding object directly)
```
### Wait for multiple Ractors with `Ractor.select`
You can wait for multiple Ractors' `yield` with `Ractor.select(*ractors)`.
@ -358,14 +394,13 @@ TODO: `select` syntax of go-language uses round-robin technique to make fair sch
* `Ractor#close_incoming/outgoing` close incoming/outgoing ports (similar to `Queue#close`).
* `Ractor#close_incoming`
* `r.send(obj) ` where `r`'s incoming port is closed, will raise an exception.
* When the incoming queue is empty and incoming port is closed, `Ractor.receive` raise an exception. If the incoming queue is not empty, it dequeues an object.
* When the incoming queue is empty and the incoming port is closed, `Ractor.receive` raises an exception. If the incoming queue is not empty, it dequeues an object without exceptions.
* `Ractor#close_outgoing`
* `Ractor.yield` on a Ractor which has closed its outgoing port will raise an exception.
* `Ractor#take` for a Ractor which has closed its outgoing port will raise an exception. If `Ractor#take` is blocking, it will raise an exception.
* When a Ractor terminates, the ports are closed automatically.
* Return value of the Ractor's block will be yielded as `Ractor.yield(ret_val)`, even if the implementation terminates the based native thread.
Example (try to take from closed Ractor):
```ruby
@ -415,7 +450,7 @@ end
obj.object_id == r.take #=> false
```
Current implementation uses Marshal protocol (similar to dRuby). We can not send Marshal unsupported objects.
Some objects do not support copying, and sending them raises an exception.
```ruby
obj = Thread.new{}
@ -424,7 +459,7 @@ begin
msg
end
rescue TypeError => e
e.message #=> no _dump_data is defined for class Thread
e.message #=> allocator undefined for Thread
else
'ng' # unreachable here
end
@ -474,11 +509,15 @@ end
end
```
Now only `T_FILE`, `T_STRING` and `T_ARRAY` objects are supported.
Some objects do not support moving, and an exception will be raised.
* `T_FILE` (`IO`, `File`): support to send accepted socket etc.
* `T_STRING` (`String`): support to send a huge string without copying (fast).
* `T_ARRAY` (`Array`): support to send a huge Array without re-allocating the array's buffer. However, all of the referred objects from the array should be moved, so it is not so fast.
```ruby
r = Ractor.new do
Ractor.receive
end
r.send(Thread.new{}, move: true) #=> allocator undefined for Thread (TypeError)
```
To achieve the access prohibition for moved objects, _class replacement_ technique is used to implement it.
@ -491,43 +530,17 @@ The following objects are shareable.
* Frozen native objects
* Numeric objects: `Float`, `Complex`, `Rational`, big integers (`T_BIGNUM` in internal)
* All Symbols.
* Frozen `String` and `Regexp` objects (which does not have instance variables)
* In future, "Immutable" objects (frozen and only refer to shareable objects) will be supported (TODO: introduce an `immutable` flag for objects?)
* Frozen `String` and `Regexp` objects (their instance variables should refer only to shareable objects)
* Class, Module objects (`T_CLASS`, `T_MODULE` and `T_ICLASS` in internal)
* `Ractor` and other objects which care about synchronization.
* `Ractor` and other special objects which care about synchronization.
Implementation: Now shareable objects (`RVALUE`) have `FL_SHAREABLE` flag. This flag can be added lazily.
```ruby
r = Ractor.new do
while v = Ractor.receive
Ractor.yield v
end
end
class C
end
shareable_objects = [1, :sym, 'xyzzy'.to_sym, 'frozen'.freeze, 1+2r, 3+4i, /regexp/, C]
shareable_objects.map{|o|
r << o
o2 = r.take
[o, o.object_id == o2.object_id]
}
#=> [[1, true], [:sym, true], [:xyzzy, true], [\"frozen\", true], [(3/1), true], [(3+4i), true], [/regexp/, true], [C, true]]
unshareable_objects = ['mutable str'.dup, [:array], {hash: true}].map{|o|
r << o
o2 = r.take
[o, o.object_id == o2.object_id]
}
#+> "[[\"mutable str\", false], [[:array], false], [{:hash=>true}, false]]]"
```
To make shareable objects, the `Ractor.make_shareable(obj)` method is provided. It tries to make `obj` shareable by freezing `obj` and all objects recursively reachable from it. This method accepts the `copy:` keyword (default value is false). `Ractor.make_shareable(obj, copy: true)` tries to make a deep copy of `obj` and make the copied object shareable.
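A short sketch of how `Ractor.make_shareable` and `Ractor.shareable?` interact, including the `copy:` keyword. The variable names are illustrative only.

```ruby
# A frozen Array that still refers to an unfrozen inner Array.
nested = [1, [2], 3].freeze
before = Ractor.shareable?(nested)     # the inner [2] is not frozen yet

# make_shareable freezes every object reachable from `nested` in place.
Ractor.make_shareable(nested)
after = Ractor.shareable?(nested)
inner_frozen = nested[1].frozen?

# With copy: true the original is left untouched and a deep copy is frozen.
original = { list: ["a", "b"] }
shared = Ractor.make_shareable(original, copy: true)

p before
p after
p inner_frozen
p original.frozen?
p Ractor.shareable?(shared)
```

Without `copy: true`, the call has a side effect on the caller's objects, which is worth keeping in mind when the argument is still used as mutable data elsewhere.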
## Language changes to isolate unshareable objects between Ractors
To isolate unshareable objects between Ractors, we introduced additional language semantics on multi-Ractor.
To isolate unshareable objects between Ractors, we introduced additional language semantics on multi-Ractor Ruby programs.
Note that without using Ractors, these additional semantics are not needed (100% compatible with Ruby 2).
@ -548,6 +561,8 @@ Only the main Ractor (a Ractor created at starting of interpreter) can access gl
end
```
Note that some special global variables are ractor-local, like `$stdin`, `$stdout`, `$stderr`. See [[Bug #17268]](https://bugs.ruby-lang.org/issues/17268) for more details.
### Instance variables of shareable objects
Only the main Ractor can access instance variables of shareable objects.
@ -567,7 +582,7 @@ Only the main Ractor can access instance variables of shareable objects.
begin
r.take
rescue => e
e.class #=> RuntimeError
e.class #=> Ractor::IsolationError
end
```
@ -582,7 +597,7 @@ Only the main Ractor can access instance variables of shareable objects.
begin
r.take
rescue Ractor::RemoteError => e
e.cause.message #=> can not access instance variables of shareable objects from non-main Ractors
e.cause.message #=> can not access instance variables of shareable objects from non-main Ractors (Ractor::IsolationError)
end
```
@ -607,7 +622,7 @@ Only the main Ractor can access class variables.
begin
r.take
rescue => e
e.class #=> RuntimeError
e.class #=> Ractor::IsolationError
end
```
@ -625,7 +640,7 @@ Only the main Ractor can read constants which refer to the unshareable object.
begin
r.take
rescue => e
e.class #=> NameError
e.class #=> Ractor::IsolationError
end
```
@ -640,14 +655,48 @@ Only the main Ractor can define constants which refer to the unshareable object.
begin
r.take
rescue => e
e.class #=> NameError
e.class #=> Ractor::IsolationError
end
```
To make a library that supports multiple ractors, its constants should refer only to shareable objects.
```ruby
TABLE = {a: 'ko1', b: 'ko2', c: 'ko3'}
```
In this case, `TABLE` references an unshareable Hash object, so other ractors can not refer to the `TABLE` constant. To make it shareable, we can use `Ractor.make_shareable()` like this.
```ruby
TABLE = Ractor.make_shareable( {a: 'ko1', b: 'ko2', c: 'ko3'} )
```
To make it easy, Ruby 3.0 introduced new `shareable_constant_value` Directive.
```ruby
# shareable_constant_value: literal
TABLE = {a: 'ko1', b: 'ko2', c: 'ko3'}
#=> Same as: TABLE = Ractor.make_shareable( {a: 'ko1', b: 'ko2', c: 'ko3'} )
```
`shareable_constant_value` directive accepts the following modes (descriptions use the example: `CONST = expr`):
* none: Do nothing. Same as: `CONST = expr`
* literal:
* if `expr` consists of literals, replaced to `CONST = Ractor.make_shareable(expr)`.
* otherwise: replaced to `CONST = expr.tap{|o| raise unless Ractor.shareable?(o)}`.
* experimental_everything: replaced to `CONST = Ractor.make_shareable(expr)`.
* experimental_copy: replaced to `CONST = Ractor.make_shareable(expr, copy: true)`.
Except in the `none` mode (default), it is guaranteed that the assigned constants refer only to shareable objects.
See [doc/syntax/comment.rdoc](syntax/comment.rdoc) for more details.
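As a rough end-to-end sketch of the `literal` mode, a constant defined under the directive can be read from a non-main Ractor. The example assumes it is run as a script file, with the directive written as a magic comment on its own line; the `reader` name is hypothetical.

```ruby
# shareable_constant_value: literal
TABLE = {a: 'ko1', b: 'ko2', c: 'ko3'}
# shareable_constant_value: none

# The constant now refers to a deeply frozen, shareable Hash,
# so a non-main Ractor is allowed to read it.
reader = Ractor.new(Ractor.current) do |main|
  main.send(TABLE[:b])
end

value = Ractor.receive
p Ractor.shareable?(TABLE)
p value
```

Switching back to `none` after the definitions limits the directive to the constants that are meant to be shared.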
## Implementation note
* Each Ractor has its own thread, it means each Ractor has at least 1 native thread.
* Each Ractor has its own ID (`rb_ractor_t::id`).
* Each Ractor has its own ID (`rb_ractor_t::pub::id`).
* On debug mode, all unshareable objects are labeled with current Ractor's id, and it is checked to detect unshareable object leak (access an object from different Ractor) in VM.
## Examples
@ -655,7 +704,7 @@ Only the main Ractor can define constants which refer to the unshareable object.
### Traditional Ring example in Actor-model
```ruby
RN = 1000
RN = 1_000
CR = Ractor.current
r = Ractor.new do
@ -802,7 +851,6 @@ r.send "r0"
p Ractor.receive #=> "r0r10r9r8r7r6r5r4r3r2r1"
r.send "r0"
p Ractor.select(*rs, Ractor.current) #=> [:receive, "r0r10r9r8r7r6r5r4r3r2r1"]
[:receive, "r0r10r9r8r7r6r5r4r3r2r1"]
r.send "e0"
p Ractor.select(*rs, Ractor.current)
#=>