Take charge of parsing and evaluating inputs #960

lionel- · 2025-11-06T13:54:52Z

Progress towards posit-dev/positron#1766
Progress towards posit-dev/positron#3078
Progress towards posit-dev/positron#9156
Addresses posit-dev/positron#6553
Closes #598
Closes #722
Closes #840

This PR does two things:

It takes charge of Reading/Parsing and Evaluation of R code. (First commit)
It refactors and simplifies the code, and fixes a number of issues. (All other commits)

Doing parsing and evaluation in Ark is a significant departure from how read_console(), the heart of the kernel, is currently structured. This handler is hooked to the R frontend method ReadConsole, which takes user inputs as strings. The R REPL gets these strings, parses them, evaluates them, prints the result, then loops back. With this PR, we take control of both parsing and evaluation.

There are no functional changes here except for a few bugfixes. In the future though, this will allow us to:

Parse all inputs sent from editors with source references and injected breakpoints.
When debugging, evaluate code in the frame selected in the IDE.

New Features

Selections of code are now parsed in a single pass. A minor consequence is that all functions evaluated as part of the same selection share a common virtual document when step-debugging through them.
recover() is now an alias for browser(). The R version of recover() was buggy (options(error = recover) seems broken positron#9156) and it was hard to reconcile its behaviour with the requirements of the Jupyter protocol. By aliasing it to browser(), we can leverage the IDE's call stack pane to provide the same functionality as the original recover() function (R: Frame-aware custom debug REPL positron#3078).

Fixes

Minor: We no longer have any size limitation on lines of inputs. They were previously limited to 4096 bytes, the maximum size of the ReadConsole input buffer.
Restart during debug sessions now works as expected (Restart the R kernel during debugging session leads to bad state positron#6553).
We now interrupt ongoing evaluations before restarting. This shouldn't have any impact on Positron, which sends an interrupt before restarting, but should make restarts behave better with other frontends. This is also a nice defensive measure even with Positron.
We now flush autoprint output when an error is thrown by a print method. I was quite confused by this bug because it also swallowed debugging output!
Syntax errors that throw an R error instead of returning an error code no longer leak a backtrace in the console (Sending a pair of empty backticks to the kernel results in an internal error showing up in the console #598, Weird error with "\s" #722).

These fixes (and more) are covered by integration tests.

Approach

We parse inputs into a vector of complete expressions. These expressions are stored in the read-console state in a new PendingInputs struct that implements a stack interface, with a pop method. This stack of pending inputs replaces the stack of pending lines we implemented in Split inputs by newlines #536.
When read-console is called by R for a new input, the next pending expression is evaluated and the result is stored in base::.ark_last_value. We store it in the base environment for robust access from (almost) any evaluation environment (global at top level, but the debugger can evaluate in any environment). We only require the presence of :: so that we can reach into base.
Read-console returns to R with the string input "base::.ark_last_value" if evaluation returned visibly, or "base::invisible(base::.ark_last_value)" if it returned invisibly. The R REPL then parses this expression and evaluates it.
Because we are still hooked into the R REPL, it has a chance to do top-level stuff like printing warnings, running task callbacks, updating .Last.value, etc.
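As a rough illustration of the steps above, here is a minimal sketch of a pending-inputs stack and of the string handed back to the R REPL. It is not the code from this PR: the `Expr` type, the `new()` constructor, and the `reply_for_r()` helper are hypothetical names, and expressions are modelled as plain strings rather than R objects. Only `PendingInputs`, its pop method, and the two reply strings come from the description.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a parsed R expression. The real struct holds
/// R objects; a string keeps this sketch self-contained.
#[derive(Debug, Clone, PartialEq)]
pub struct Expr(pub String);

/// Complete expressions waiting to be evaluated, in source order.
#[derive(Debug, Default)]
pub struct PendingInputs {
    exprs: VecDeque<Expr>,
}

impl PendingInputs {
    /// Assumes the caller has already parsed the input into complete
    /// expressions (the parsing step itself is not modelled here).
    pub fn new(exprs: Vec<Expr>) -> Self {
        Self { exprs: exprs.into() }
    }

    /// Take the next expression to evaluate, if any.
    pub fn pop(&mut self) -> Option<Expr> {
        self.exprs.pop_front()
    }

    pub fn is_empty(&self) -> bool {
        self.exprs.is_empty()
    }
}

/// The input returned to the R REPL once the result of our own evaluation
/// has been stored in `base::.ark_last_value`.
pub fn reply_for_r(visible: bool) -> &'static str {
    if visible {
        "base::.ark_last_value"
    } else {
        "base::invisible(base::.ark_last_value)"
    }
}
```

Both reply strings are short and fixed, so in this sketch the size of the user's input never constrains what is written back to R.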

While this approach works, I hit two main issues that we need to work around:

When an error happens, the R_EvalDepth variable is not reset before the next evaluation (it's normally reset here: https://github.com/r-devel/r-svn/blob/811080fb/src/main/main.c#L260). This could be bad especially if that error is an exceeded evaluation depth error and we're looping back to evaluate a new expression without having reset it.
When a nested debug REPL returns to a parent REPL, the R_ConsoleIob parse buffer is normally reset after returning from eval() (here: https://github.com/r-devel/r-svn/blob/811080fb/src/main/main.c#L279). With this approach, it doesn't have a chance to reset before entering read-console again.

To solve the first issue, we evaluate a dummy input after an error occurs ("base::.Last.value", which keeps the last value stable; this is tested). This gives R a chance to fully reset state. To solve the second issue, we send a dummy parse causing a PARSE_NULL every time we detect that a nested read-console has returned. This causes R to reset the parse buffer here: https://github.com/r-devel/r-svn/blob/811080fb/src/main/main.c#L238.

These workarounds are unfortunate but have been mostly contained in the r_read_console() method, which keeps track of all these details and other state.
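The two workarounds can be pictured as a small decision on what to hand back to R next. The sketch below is only illustrative: the `LastEvent` enum and `next_input_for_r()` function are hypothetical names, and the use of an empty line as the input that produces a PARSE_NULL is an assumption, since the description does not say which dummy input is sent.

```rust
/// Hypothetical summary of what happened since read-console last
/// returned to R.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LastEvent {
    /// Our own evaluation succeeded, visibly or invisibly.
    EvalOk { visible: bool },
    /// Our own evaluation threw an error.
    EvalError,
    /// A nested (debug) read-console has just returned to its parent.
    NestedConsoleReturned,
}

/// Pick the string to return to the R REPL for its own parse/eval step.
pub fn next_input_for_r(event: LastEvent) -> &'static str {
    match event {
        // Workaround 1: a dummy evaluation lets R reset its state
        // (e.g. R_EvalDepth) while keeping `.Last.value` stable.
        LastEvent::EvalError => "base::.Last.value",
        // Workaround 2: a dummy parse resulting in PARSE_NULL lets R reset
        // its parse buffer. ASSUMPTION: an empty line is used here.
        LastEvent::NestedConsoleReturned => "\n",
        LastEvent::EvalOk { visible: true } => "base::.ark_last_value",
        LastEvent::EvalOk { visible: false } => {
            "base::invisible(base::.ark_last_value)"
        }
    }
}
```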

Reorganisation

Read-console is both an event loop (when we're waiting for input from the frontend) and a state machine (to manage execution of inputs). The event loop part is now extracted in a new method run_event_loop(). We no longer wait for both input replies (when we're asking the frontend to reply to a readline() or menu() question) and execution requests (when we're waiting for the frontend to send us the next bit of code to execute) at the same time. It's either one or the other, which simplifies the code. The handler for input requests now invokes the event loop itself, instead of returning to the top level of read-console to fall through the event loop.
The state machine part of read-console has been simplified in other ways. We now inspect the state from the top level and pass it down through arguments, which makes it easier to reason about. See in particular the new take_exception() and take_result() methods, which are called from the top level and whose results are passed down to the handler that replies to an active request when there are no inputs left to evaluate. This allowed simplifying things quite a bit.
We now keep track of the number of nested consoles (due to active debug sessions) on the stack. This isn't used for anything yet, but it seems like good information to keep track of. To support this (and other state bookkeeping), we use a new wrapper around R_ExecWithCleanup().
The handling of debugger commands has been consolidated into a single location.
A benefit of parsing expressions ourselves is that it is no longer possible for R to send us an incomplete prompt. This has been completely removed from the possible states read-console can be in, which is another nice simplification.
The DummyFrontend for integration tests now has a few more methods like execute_request() that capture common patterns of messaging. This removes quite a lot of boilerplate.
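To make the "either one or the other" shape of the event loop concrete, here is a minimal sketch using standard library channels. It is not the signature used in Ark: the `WaitFor`, `Event`, `ExecuteRequest`, and `InputReply` types are hypothetical, and only the name run_event_loop() comes from the description above.

```rust
use std::sync::mpsc::{Receiver, RecvError};

/// Hypothetical message types; the real ones carry Jupyter messages.
#[derive(Debug, PartialEq)]
pub struct ExecuteRequest {
    pub code: String,
}

#[derive(Debug, PartialEq)]
pub struct InputReply {
    pub value: String,
}

/// What the caller is waiting for. Only one kind of message is awaited
/// at a time.
pub enum WaitFor<'a> {
    ExecuteRequest(&'a Receiver<ExecuteRequest>),
    InputReply(&'a Receiver<InputReply>),
}

#[derive(Debug, PartialEq)]
pub enum Event {
    Execute(ExecuteRequest),
    Input(InputReply),
}

/// Block until the awaited kind of message arrives. Messages of the other
/// kind stay queued on their own channel and are not consumed.
pub fn run_event_loop(wait_for: WaitFor<'_>) -> Result<Event, RecvError> {
    match wait_for {
        WaitFor::ExecuteRequest(rx) => rx.recv().map(Event::Execute),
        WaitFor::InputReply(rx) => rx.recv().map(Event::Input),
    }
}
```

In this sketch, a handler for readline() would call the loop with `WaitFor::InputReply` itself, while the top level of read-console would call it with `WaitFor::ExecuteRequest`.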

DavisVaughan

Looks good! I've been running this for a few days and have not had any major issues.

In addition to my comments below, two other things:

I still think Option would be useful around srcrefs to show the reader that they are optional. I confused myself about them again today. It doesn't seem too hard to do this: #961

There is one regression I noted with debugonce() when inside a package that you have done load_all() with:

devtools::load_all()
debugonce(vec_count)
vec_count(1:5)

If you take a step and then run 1:2 in the Console, then with this PR you get jumped to a fallback view of vec_count, which is pretty jarring and isn't right compared with the current behavior of staying in the deparsed srcref file we created for vctrs.

This seems worth fixing before merging.

This PR:

Screen.Recording.2025-11-11.at.9.15.22.AM.mov

Current main:

Screen.Recording.2025-11-11.at.9.16.28.AM.mov

crates/libr/src/r.rs

crates/ark/src/sys/unix/interface.rs

crates/harp/src/parser/srcref.rs

crates/amalthea/tests/client.rs

crates/ark/tests/kernel.rs

crates/ark/src/interface.rs

From R's perspective, the handler runs _after_ the error was emitted. That's why the user is able to see the error message before the recover prompt. From our perspective though, we have to run the handler _before_ emitting the error, because all executed code needs to be nested in an execution request so that we can properly match output to a prompt on the frontend side. The Jupyter protocol does not really support orphan side effects (streams, input requests).

lionel- force-pushed the feature/read-srcref branch 3 times, most recently from ae27d23 to b4839c2 Compare November 7, 2025 08:39

lionel- requested a review from DavisVaughan November 7, 2025 09:05

DavisVaughan mentioned this pull request Nov 11, 2025

Make it clear that srcrefs is not always present #961

Closed

DavisVaughan approved these changes Nov 11, 2025

lionel- force-pushed the feature/read-srcref branch 3 times, most recently from 955733f to dc8c5cb Compare November 19, 2025 13:51

lionel- mentioned this pull request Nov 19, 2025

Use new Ark's REPL engine posit-dev/positron#10661

Merged

lionel- force-pushed the feature/read-srcref branch from 5765c07 to 8441b97 Compare November 20, 2025 14:06

lionel- added 19 commits November 27, 2025 12:29

Take charge of parsing and evaluation

fc5724e

More caller tracking

c5a8365

Consolidate debugger states

99dfd7b

Extract handle_input_request()

5ebe1a4

Extract handle_pending_input()

6b4d285

Rename finalize_call_text() to handle_read_console()

f5dd8df

Make read() a constructor method on PendingInputs

4e7c625

Refactor console error and result handling

bb8f71e

Cancel pending inputs when we get in the debugger

a0d2e0b

Add test for invalid syntax

e3cd0ca

Tweak documentation

73bb637

Remove into_protected() method

b8b3f89

Make reply_execute_request() a free function

96d8c7a

Create Jupyter exception in the global condition handler

fe06d3a

Fully remove incomplete prompts heuristic since they are now impossible

c50273c

Extract ReadConsole event loop into method

03aa0b5

Return to base REPL in case of error

38c2090

I couldn't produce any issues with `R_EvalDepth` but it seems more sound to let R reset state before evaluation

Rename eval_pending() to eval()

5699d2f

Don't clear pending expressions in browser sessions

115240b

Would be hard to get right in the case of nested browser sessions

lionel- added 26 commits November 27, 2025 12:29

Opt out of Shutdown tests on Windows

ca9dbbc

Improve naming a bit

ce0a7e6

Don't include backtrace in syntax errors

e21ce2c

Fix backtraces in special syntax errors

96f22ac

Disable error entracing in sensitive tests

c6699eb

Extract FrontendDummy::execute_request() and variants

99fa836

Respect getOption("keep.source") in ReadConsole parser

7269231

wip

Add harp::once!

8294fc3

Restore R_Srcref on exit to avoid changing the DAP's top frame

11f3708

Adjust for recent changes on main

4f3f235

Add closure variant of IOPub Stream assertion

086c445

Collect IOPub streams until end matches

036a924

Better handle errors in options(error = )

5b377e2

Simplify call stack

5ccb65c

Hook recover() to call browser()

8adaf78

The `recover()` functionality is provided by the call stack panes of frontends

Tweak comments

e158510

Consolidate RMain-related DAP state in RMain

cd03255

Rename to read_console_nested_return_next_input

44fa355

Use self.is_empty()

88e4955

Rework srcref getters

f95a17e

Use existing list getter

c021bd4

Fix timing of error buffer peeking

ef63f57

Tweak control flow

ae0b428

Tweak more comments

7f6fa3a

Move r_cleanup_for_tests() to Unix file

fbd76e2

lionel- force-pushed the feature/read-srcref branch from 8441b97 to fbd76e2 Compare November 27, 2025 16:48

lionel- merged commit 2c2e57f into main Nov 27, 2025
8 checks passed

lionel- deleted the feature/read-srcref branch November 27, 2025 18:12

github-actions bot locked and limited conversation to collaborators Nov 27, 2025
