|
1 | | -> Quotations are used for notes. |
2 | | -
|
3 | | -> This is an outdated version of |
4 | | -> [this document](https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit?usp=sharing) |
5 | | -> maintained on Google documents. |
6 | | -
|
7 | | -> This document is work-in-progress. |
8 | | -> The intentions of this effort and document are: to summarize the behavior
9 | | -> of Ruby in concurrent and parallel environments, initiate discussion,
10 | | -> identify problems in the document, find flaws in the Ruby
11 | | -> implementations if any, suggest what has to be enhanced in Ruby itself,
12 | | -> and cooperate towards the goal in all implementations (using
13 | | -> `concurrent-ruby` as a compatibility layer).
14 | | ->
15 | | -> It's not the intention of this effort to introduce high-level concurrency
16 | | -> abstractions like actors to the language, but rather to improve low-level
17 | | -> concurrency support so that many more concurrency abstractions can be added through gems.
18 | | -
|
19 | 1 | # Synchronization |
20 | 2 |
|
21 | | -This layer provides tools to write concurrent abstractions independent of any
22 | | -particular Ruby implementation. It is built on top of the Ruby memory model,
23 | | -which is also described here. `concurrent-ruby` abstractions are built using
24 | | -this layer.
25 | | - |
26 | | -**Why?** Ruby is a great, expressive language, but it lacks support for
27 | | -well-defined low-level concurrent and parallel computation. It's hoped that this
28 | | -document will lay the groundwork for Ruby to become as good in this area as
29 | | -in others.
30 | | - |
31 | | -Without a memory model and this layer it's very hard to write concurrent
32 | | -abstractions for Ruby. Writing a proper concurrent abstraction often means
33 | | -reimplementing it more than once for different Ruby runtimes, which is very
34 | | -time-consuming and error-prone.
35 | | - |
36 | | -# Ruby memory model |
37 | | - |
38 | | -The Ruby memory model is a framework for reasoning about programs in
39 | | -concurrent and parallel environments. It defines which variable writes can be
40 | | -observed by a particular variable read, which is essential to be able to
41 | | -determine if a program is correct. This is achieved by defining which subset of
42 | | -all possible program execution orders is allowed.
43 | | - |
44 | | -Memory model sources:
45 | | - |
46 | | -- [Java memory model](http://www.cs.umd.edu/~pugh/java/memoryModel/), |
47 | | - and its [FAQ](http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html) |
48 | | -- [Java Memory Model Pragmatics](http://shipilev.net/blog/2014/jmm-pragmatics/) |
49 | | -- [atomic<> Weapons 1](https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2) |
50 | | -and |
51 | | -[2](https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2) |
52 | | - |
53 | | -Sources on the concurrent behavior of Ruby implementations:
54 | | -
55 | | -- Source code.
56 | | -- [JRuby's wiki page](https://github.com/jruby/jruby/wiki/Concurrency-in-jruby) |
57 | | -- [Rubinius's wiki page](http://rubini.us/doc/en/systems/concurrency/) |
58 | | - |
59 | | -> A similar document for MRI was not found. The key fact about MRI is the GVL (Global
60 | | -> VM Lock), which ensures that only one thread can interpret Ruby code at any
61 | | -> given time. When the GVL is handed from one thread to another, a mutex is
62 | | -> released by the first thread and acquired by the second, implying that everything
63 | | -> done by the first thread is visible to the second thread. See
64 | | -> [thread_pthread.c](https://github.com/ruby/ruby/blob/ruby_2_2/thread_pthread.c#L101-L107) |
65 | | -> and |
66 | | -> [thread_win32.c](https://github.com/ruby/ruby/blob/ruby_2_2/thread_win32.c#L95-L100). |
67 | | -
|
68 | | -This memory model was created by comparing
69 | | -[MRI](https://www.ruby-lang.org/en/), [JRuby](http://jruby.org/),
70 | | -[JRuby+Truffle](https://github.com/jruby/jruby/wiki/Truffle), and
71 | | -[Rubinius](http://rubini.us/); taking into account the limitations of the implementations
72 | | -or their platforms; and drawing inspiration from other existing memory models (Java,
73 | | -C++11). This is not a formal model.
74 | | - |
75 | | -Key properties are: |
76 | | - |
77 | | -- **volatility (V)** - A written value is immediately visible to any
78 | | - subsequent volatile read of the same variable on any Thread. (It has the same
79 | | - meaning as in Java.)
80 | | -- **atomicity (A)** - An operation is either done or not done as a whole.
81 | | -- **serialized (S)** - Operations are serialized in some order (they
82 | | - cannot disappear). This is a new property not mentioned in other memory
83 | | - models, since Java and C++ do not have dynamically defined fields. All
84 | | - operations listed in a single row of the tables below are serialized with
85 | | - each other.
86 | | - |
87 | | -### Core behavior: |
88 | | - |
89 | | -| Operation | V | A | S | Notes | |
90 | | -|:----------|:-:|:-:|:-:|:-----| |
91 | | -| local variable read/write/definition | - | x | x | Local variables are determined during parsing, they are not usually dynamically added (with exception of `local_variable_set`). Therefore definition is quite rare. | |
92 | | -| instance variable read/write/(un)definition | - | x | x | Newly defined instance variables have to become visible eventually. | |
93 | | -| class variable read/write/(un)definition | x | x | x || |
94 | | -| global variable read/write/definition | x | x | x | un-define is not possible currently. |
95 | | -| constant variable read/write/(un)definition | x | x | x || |
96 | | -| `Thread` local variable read/write/definition | - | x | x | un-define is not possible currently. | |
97 | | -| `Fiber` local variable read/write/definition | - | x | x | un-define is not possible currently. | |
98 | | -| method creation/redefinition/removal | x | x | x || |
99 | | -| include/extend | x | x | x | If `AClass` includes `AModule`, `AClass` gets all of `AModule`'s methods at once. |
100 | | - |
101 | | - |
102 | | -Notes: |
103 | | - |
104 | | -- A variable read reads a value from a preexisting variable.
105 | | -- A variable definition creates a new variable (the operation is serialized with
106 | | - writes, which implies an update cannot be lost).
107 | | -- A Module or a Class definition is actually a constant definition.
108 | | - The definition is atomic: it assigns the Module or the Class to the
109 | | - constant, then its methods are defined atomically one by one.
110 | | -- `||=`, `+=`, etc. are actually two operations, a read and a write, which implies
111 | | - that they are not atomic. See volatile variables
112 | | - with compare-and-set.
113 | | -- Method invocation does not have any special properties; that includes
114 | | - object initialization.
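
The note on `+=` above can be illustrated with a minimal sketch in plain Ruby (standard library only). The names `THREADS`, `ITERATIONS`, `counter` and `lock` are illustrative assumptions and do not come from this document. Without the lock, whether updates are actually lost depends on the Ruby implementation and its scheduling; with the lock, the read and the write form one atomic step on every implementation.

```ruby
# Illustrative names only; nothing here is part of concurrent-ruby.
THREADS    = 4
ITERATIONS = 10_000

counter = 0
lock    = Mutex.new

workers = Array.new(THREADS) do
  Thread.new do
    ITERATIONS.times do
      # `counter += 1` on its own is a read followed by a write; another
      # thread may write between the two steps and that update would be lost.
      # Holding the lock makes the read-write pair atomic.
      lock.synchronize { counter += 1 }
    end
  end
end
workers.each(&:join)

puts counter
```

Because every increment happens under the same `Mutex`, the final count equals `THREADS * ITERATIONS`.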
115 | | - |
116 | | -Current implementation differences from the model:
117 | | -
118 | | -- MRI: everything is volatile.
119 | | -- JRuby: `Thread` and `Fiber` local variables are volatile. Instance
120 | | - variables are volatile on x86 and people may intentionally or unintentionally depend
121 | | - on that fact.
122 | | -- Class variables require investigation. |
123 | | - |
124 | | -> TODO: update with specific versions of the implementations.
125 | | -
|
126 | | -### Threads |
127 | | - |
128 | | -> TODO: add description of `Thread.new`, `#join`, etc. |
129 | | -
|
130 | | -### Source loading: |
131 | | - |
132 | | -| Operation | V | A | S | Notes | |
133 | | -|:----------|:-:|:-:|:-:|:-----| |
134 | | -| requiring | x | x | x | A file will not be required twice, but classes and modules are still defined gradually. |
135 | | -| autoload | x | x | - | Only one autoload at a time for a given constant; others will be blocked until the first triggered autoload is done. Different constants may be loaded concurrently. |
136 | | - |
137 | | -Notes: |
138 | | - |
139 | | -- Beware of requiring and autoloading in concurrent programs: it's possible to
140 | | - see partially defined classes. Eager loading or blocking until a class is
141 | | - fully loaded should be used to mitigate this.
142 | | - |
143 | | -### Core classes |
144 | | - |
145 | | -`Mutex`, `Monitor`, `Queue` have to work correctly on each implementation. Ruby
146 | | -implementation VMs should not crash when, for example, `Array` or `Hash` is used
147 | | -in a parallel environment, but they may lose updates or raise exceptions. (If
148 | | -`Array` or `Hash` were synchronized, they would have too much overhead when used
149 | | -in a single thread.)
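
As a small sketch of the guarantee above, the following plain-Ruby example collects results from several threads through a `Queue` rather than a shared `Array`. The names `WORKERS`, `results` and `collected` are illustrative assumptions. The `Array` is only touched after all threads have been joined, so it is never used concurrently.

```ruby
# Illustrative names only.
WORKERS = 8
results = Queue.new # Queue synchronizes push and pop internally

threads = Array.new(WORKERS) do |i|
  Thread.new do
    # Each worker pushes 100 distinct integers.
    100.times { |n| results << (i * 100 + n) }
  end
end
threads.each(&:join)

# Single-threaded from here on, so a plain Array is safe to use.
collected = []
collected << results.pop until results.empty?

puts collected.size
```

No update is lost: every pushed value ends up in `collected` exactly once, on any implementation on which `Queue` works as required above.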
150 | | - |
151 | | -> `concurrent-ruby` contains synchronized versions of `Array` and `Hash` and
152 | | -> other thread-safe data structures.
153 | | -
|
154 | | -> TODO: This section needs more work: e.g. `Thread.raise` and similar are an open
155 | | -> issue and are better not used.
156 | | -
|
157 | | -### Standard libraries |
158 | | - |
159 | | -Standard libraries were written for MRI, so unless they are rewritten for a
160 | | -particular Ruby implementation they may contain hidden problems. Therefore it's
161 | | -better to assume that they are not safe.
162 | | - |
163 | | -> TODO: This section needs more work. |
164 | | -
|
165 | | -# Extensions |
166 | | - |
167 | | -The memory model described above is quite weak, e.g. a thread-safe immutable
168 | | -object cannot be created. That requires final or volatile instance variables.
169 | | - |
170 | | -## Final instance variable |
171 | | - |
172 | | -Objects inheriting from `Synchronization::Object` provide a way to ensure
173 | | -that all instance variables that are set only once in the constructor (therefore
174 | | -effectively final) are safely published to all readers (assuming proper
175 | | -construction: the object instance does not escape during construction).
176 | | - |
177 | | -``` ruby |
178 | | -class ImmutableTreeNode < Concurrent::Synchronization::Object |
179 | | - # mark this class to publish final instance variables safely |
180 | | - safe_initialization! |
181 | | - |
182 | | - def initialize(left, right) |
183 | | - # Call super to allow proper initialization. |
184 | | - super() |
185 | | - # By convention final instance variables have CamelCase names |
186 | | - # to distinguish them from ordinary instance variables. |
187 | | - @Left = left |
188 | | - @Right = right |
189 | | - end |
190 | | - |
191 | | - # Define thread-safe readers. |
192 | | - def left |
193 | | - # No need to synchronize or otherwise protect, it's already |
194 | | - # guaranteed to be visible. |
195 | | - @Left |
196 | | - end |
197 | | - |
198 | | - def right |
199 | | - @Right |
200 | | - end |
201 | | -end |
202 | | -``` |
203 | | - |
204 | | -Once `safe_initialization!` is called on a class it transitively applies to all |
205 | | -its children. |
206 | | - |
207 | | -> It's implemented by adding `new`, when `safe_initialization!` is called, as |
208 | | -> follows: |
209 | | -> |
210 | | -> ``` ruby |
211 | | -> def self.new(*) |
212 | | -> object = super |
213 | | -> ensure |
214 | | -> object.ensure_ivar_visibility! if object |
215 | | -> end |
216 | | -> ``` |
217 | | -> |
218 | | -> therefore `new` should not be overridden. |
219 | | -
|
220 | | -## Volatile instance variable |
221 | | -
|
222 | | -`Synchronization::Object` children can have volatile instance variables. A Ruby
223 | | -library cannot alter the meaning of the `@a_name` expression, therefore when
224 | | -`attr_volatile :a_name` is called, declaring the instance variable named
225 | | -`a_name` to be volatile, it creates accessor methods.
226 | | -
|
227 | | -> However, there is a Ruby [issue](https://redmine.ruby-lang.org/issues/11539)
228 | | -> filed to address this.
229 | | -
|
230 | | -``` ruby |
231 | | -# Simple counter with cheap reads. |
232 | | -class Counter < Concurrent::Synchronization::Object |
233 | | - # Declare instance variable value to be volatile and its |
234 | | - # reader and writer to be private. `attr_volatile` returns |
235 | | - # names of created methods. |
236 | | - private *attr_volatile(:value) |
237 | | - safe_initialization! |
238 | | -
|
239 | | - def initialize(value) |
240 | | - # Call super to allow proper initialization. |
241 | | - super() |
242 | | - # Create a reentrant lock instance held in final ivar |
243 | | - # to be able to protect writer. |
244 | | - @Lock = Concurrent::Synchronization::Lock.new |
245 | | - # volatile write |
246 | | - self.value = value |
247 | | - end |
248 | | -
|
249 | | - # Very cheap reader of the Counter's current value, just a volatile read. |
250 | | - def count |
251 | | - # volatile read |
252 | | - value |
253 | | - end |
254 | | -
|
255 | | - # Safely increments the value without losing updates
256 | | - # (as would happen if just += were used).
257 | | - def increment(add) |
258 | | - # Wrap the two volatile operations to make them atomic. |
259 | | - @Lock.synchronize do |
260 | | - # volatile write and read |
261 | | - self.value = value + add
262 | | - end |
263 | | - end |
264 | | -end |
265 | | -``` |
266 | | -
|
267 | | -> This is currently planned to be migrated to a module to be able to add
268 | | -> volatile fields to any object, not just `Synchronization::Object` children. The
269 | | -> instance variable itself is named `"@volatile_#{name}"` to distinguish it and
270 | | -> to prevent direct access by name.
271 | | -
|
272 | | -## Volatile instance variable with compare-and-set |
273 | | - |
274 | | -Some concurrent abstractions may need to do compare-and-set on their volatile
275 | | -instance variables to avoid synchronization; in that case `attr_volatile_with_cas` is
276 | | -used.
277 | | - |
278 | | -``` ruby |
279 | | -# Simplified version of Clojure's Atom
280 | | -class Atom < Concurrent::Synchronization::Object |
281 | | - safe_initialization! |
282 | | - # Make all methods private |
283 | | - private *attr_volatile_with_cas(:value) |
284 | | - # with exception of reader |
285 | | - public :value |
286 | | - |
287 | | - def initialize(value, validator = -> (v) { true }) |
288 | | - # Call super to allow proper initialization. |
289 | | - super() |
290 | | - # volatile write |
291 | | - self.value = value |
292 | | - @Validator = validator |
293 | | - end |
294 | | - |
295 | | - # Swaps the current value for one computed from the old value by the function,
296 | | - # without using blocking synchronization.
297 | | - def swap(*args, &function) |
298 | | - loop do |
299 | | - old_value = self.value # volatile read |
300 | | - begin |
301 | | - # compute new value |
302 | | - new_value = function.call(old_value, *args) |
303 | | - # return old_value if validation fails |
304 | | - break old_value unless valid?(new_value) |
305 | | - # return new_value only if compare-and-set is successful |
306 | | - # on value instance variable, otherwise repeat |
307 | | - break new_value if compare_and_set_value(old_value, new_value) |
308 | | - rescue |
309 | | - break old_value |
310 | | - end |
311 | | - end |
312 | | - end |
313 | | - |
314 | | - private |
315 | | - |
316 | | - def valid?(new_value) |
317 | | - @Validator.call(new_value) rescue false |
318 | | - end |
319 | | -end |
320 | | -``` |
321 | | - |
322 | | -`attr_volatile_with_cas` defines five methods for a given instance variable |
323 | | -name. For name `value` they are: |
324 | | - |
325 | | -``` ruby |
326 | | -self.value #=> the_value |
327 | | -self.value=(new_value) #=> new_value |
328 | | -self.swap_value(new_value) #=> old_value |
329 | | -self.compare_and_set_value(expected, new_value) #=> true || false |
330 | | -self.update_value(&function) #=> function.call(old_value) |
331 | | -``` |
332 | | - |
333 | | -Three of them were used in the example above. |
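
The semantics of the five methods listed above can be sketched in plain Ruby. This is a hypothetical stand-in, not the `concurrent-ruby` implementation: the class name `CasCell` is invented for illustration, a `Mutex` replaces the lock-free compare-and-set so the sketch runs on any Ruby without gems, and comparing by object identity (`equal?`) is an assumption.

```ruby
# Hypothetical stand-in for the methods generated for the name `value`.
class CasCell
  def initialize(value)
    @lock  = Mutex.new
    @value = value
  end

  def value
    @lock.synchronize { @value }
  end

  def value=(new_value)
    @lock.synchronize { @value = new_value }
  end

  # Unconditionally stores new_value and returns the previous value.
  def swap_value(new_value)
    @lock.synchronize do
      old_value = @value
      @value = new_value
      old_value
    end
  end

  # Stores new_value only if the current value is `expected`
  # (identity comparison is an assumption of this sketch).
  def compare_and_set_value(expected, new_value)
    @lock.synchronize do
      return false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end

  # Recomputes and retries until no other thread changed the value
  # between the read and the compare-and-set.
  def update_value
    loop do
      old_value = value
      new_value = yield(old_value)
      return new_value if compare_and_set_value(old_value, new_value)
    end
  end
end

cell = CasCell.new(1)
cell.compare_and_set_value(1, 2) # succeeds, the value becomes 2
cell.compare_and_set_value(1, 3) # fails, the value stays 2
cell.update_value { |v| v + 10 } # the value becomes 12
```

The retry loop in `update_value` is the same pattern that `Atom#swap` uses in the example above: read, compute, and publish only if nobody else wrote in the meantime.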
| 3 | +[This document](https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit?usp=sharing) |
| 4 | +has moved to Google Docs. It will be moved here once it is final and stabilized.
334 | 5 |
|
335 | | -> The current implementation relies on final instance variables in which an instance of
336 | | -> `AtomicReference` is held to provide compare-and-set operations. That creates
337 | | -> an extra indirection, which is hoped to be removed over time when a better
338 | | -> implementation becomes available in Ruby implementations. The
339 | | -> instance variable itself is named `"@VolatileCas#{camelized name}"` to
340 | | -> distinguish it and to prevent direct access by name.