Fix transaction on_close and Java and Python block on close() (#792)
## Usage and product changes
We noticed that calling `transaction.close()` does not wait until the
server has freed up resources. This makes quick sequences, such as tests
where transactions are opened and then followed by database deletes,
unreliable. Further investigation showed that workarounds using the existing
`on_close` callbacks in Python and Java caused segfaults. We fix both:
1) `Transaction.close()` in Python and Java now blocks for 1 round trip.
In Rust, this now returns a promise/future. In Java/Python, we pick the
most relevant default and resolve the promise from Java/Python.
2) We fix segfaults that occur when the Rust driver calls into
Python/Java once the user attaches `.on_close` callbacks to
transactions.
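The first point can be illustrated with a minimal, std-only sketch. It is not the driver's implementation: `Transaction`, `ClosePromise`, `close_blocking`, and the simulated server thread are all hypothetical names, and a plain channel stands in for the real promise/future and gRPC round trip. It only shows the shape of the idea: `close()` hands back something that resolves once the server confirms, and the blocking variant simply resolves it on the caller's behalf.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical stand-in for a promise: resolved once the server confirms the close.
pub struct ClosePromise {
    done: Receiver<()>,
}

impl ClosePromise {
    /// Block until the confirmation arrives, or fail if the server went away.
    pub fn resolve(self) -> Result<(), String> {
        self.done
            .recv()
            .map_err(|_| "connection dropped before close was confirmed".to_string())
    }
}

/// Hypothetical transaction handle talking to a simulated server thread.
pub struct Transaction {
    requests: Sender<Sender<()>>,
}

impl Transaction {
    /// Open a transaction against a simulated server that acknowledges
    /// every close request after (notionally) freeing its resources.
    pub fn open() -> Self {
        let (requests, inbox) = channel::<Sender<()>>();
        thread::spawn(move || {
            for ack in inbox {
                // ... the server would free the transaction's resources here ...
                let _ = ack.send(());
            }
        });
        Transaction { requests }
    }

    /// Rust-style close: send the request and return a promise.
    pub fn close(&self) -> ClosePromise {
        let (ack, done) = channel();
        // If the server is gone, `ack` is dropped and `resolve()` reports an error.
        let _ = self.requests.send(ack);
        ClosePromise { done }
    }

    /// Java/Python-style close: block for one round trip.
    pub fn close_blocking(&self) -> Result<(), String> {
        self.close().resolve()
    }
}
```

With this shape, a test that closes a transaction and then deletes the database can call `close_blocking()` (or resolve the promise) and know the server has finished with the transaction before continuing.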
We also fix nondeterministic errors:
1) adding `on_close` callbacks must return a promise, since the
implementation injects the callback into our lowest-level listener loop,
which may register the callback later. If the `on_close()` registration
is not awaited and the transaction is closed immediately afterwards, the
callback is executed only some of the time.
2) we add `keepalive` to the channel, without which messages sometimes
get "stuck" on the client-side receiving end of responses from the
server. No further clues found as to why this happens. See comments for
more detail.
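The registration race in the first point can be sketched as follows. This is a simplified model under stated assumptions, not the driver's code: `ListenedTransaction` and its methods are hypothetical, a background thread stands in for the lowest-level listener loop, and a channel receiver stands in for the returned promise. Registration is submitted through a queue and only recorded when the listener gets to it, so `on_close` returns a handle that resolves once the callback has actually been stored.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

type Callback = Box<dyn FnOnce() + Send>;

/// Hypothetical transaction whose callbacks are recorded by a listener thread.
pub struct ListenedTransaction {
    registrations: Sender<(Callback, Sender<()>)>,
    callbacks: Arc<Mutex<Vec<Callback>>>,
}

impl ListenedTransaction {
    pub fn open() -> Self {
        let (registrations, inbox) = channel::<(Callback, Sender<()>)>();
        let callbacks: Arc<Mutex<Vec<Callback>>> = Arc::new(Mutex::new(Vec::new()));
        let recorded = Arc::clone(&callbacks);
        // Stand-in for the listener loop: it records callbacks some time
        // after they were submitted, then acknowledges the registration.
        thread::spawn(move || {
            for (callback, ack) in inbox {
                recorded.lock().unwrap().push(callback);
                let _ = ack.send(());
            }
        });
        ListenedTransaction { registrations, callbacks }
    }

    /// Submit a callback; the returned receiver yields once it is recorded.
    pub fn on_close(&self, callback: Callback) -> Receiver<()> {
        let (ack, registered) = channel();
        let _ = self.registrations.send((callback, ack));
        registered
    }

    /// Run whatever callbacks have been recorded so far.
    pub fn close(self) {
        let callbacks: Vec<Callback> = std::mem::take(&mut *self.callbacks.lock().unwrap());
        for callback in callbacks {
            callback();
        }
    }
}
```

In this model, calling `close()` right after `on_close(...)` without waiting on the returned receiver may or may not run the callback, depending on whether the listener thread has recorded it yet. Waiting on the receiver first (`registered.recv()`) makes the callback's execution deterministic.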
We also add one major feature enhancement: configurable logging. All
logging should now go through the `tracing` crate. We can configure
logging levels for just the driver library with the `TYPEDB_DRIVER_LOG`
or general `RUST_LOG` environment variables. By default we set it to
`info`.
## Implementation
- Fix and enhance on_close callbacks:
  - on attaching a callback, we don't return until the callback is
  actually registered (it used to be submitted into an async channel, but
  was not necessarily recorded)
  - this is also sped up by having the lowest-level registration listener
  listen in an async context instead of a polling context
  - we fix the segfault that occurred on invoking the callback from
  Rust, mostly by enabling threading in the SWIG `.i` layer!
- Make `close()` a promise in Rust, which can be awaited, and a blocking
operation in Java and Python, which awaits a signal from the server that
the transaction is actually closed and the resources are freed up.
- We add on_close callback integration tests for Python, Java, and Rust
- Add `keepalive` to the channel, which prevents some nondeterministic
message delays/delivery failures.
## Further notes
**Mysterious lost responses**
It appears that server responses (in particular, the transaction open
response) sometimes never get delivered into our code. This is only
reproducible in the localstack demo
https://github.com/typedb-osi/typedb-localstack-demo, and even there
only non-deterministically!
We see:
- Driver: send open transaction request
- Server: receive open txn request OpenTransaction.Req
- Server: open txn, response with OpenTransaction.Res
These are confirmed with Wireshark.
The client side actually receives __something__. If we add logging into
`stub.rs`:
```
let stream = this
    .grpc
    .transaction(UnboundedReceiverStream::new(receiver))
    .map(...)
    .await;
// Logged as soon as the gRPC call hands back a stream
trace!("Received response to txn open request!");
```
This actually returns a usable gRPC stream successfully. However, the
initial OpenTransaction.Res message doesn't arrive until "something
else" happens, such as the stream closing or a keepalive ping being sent.
It's very strange, but setting the keepalive ping to 3 seconds does
force the message to arrive at some point...