@korydraughn korydraughn commented Aug 14, 2025

This PR updates support for iRODS by replacing Jargon (legacy iRODS library) with irods4j.

The foundational work was implemented by @MINGYJ, a recent iRODS intern. My commits are mainly polish and corrections around the use of the irods4j library.

Basic functionality is working - i.e. single stream uploads/downloads, renames, editing, etc.

The parallel transfer implementation doesn't appear to be working as intended. This is likely due to not having a full understanding of how the Cyberduck components fit together - i.e. Read/WriteFeature vs Upload/DownloadFeature.

Here are the steps for performing parallel transfer (i.e. multipart uploads) in iRODS.

Putting in draft for now. Feedback and guidance on how to implement proper support for parallel transfer would be greatly appreciated.

Resolves #14449.

@dkocher changed the title from "[#14449] Update support for iRODS" to "Update support for iRODS" on Sep 1, 2025
@dkocher
Contributor

dkocher commented Sep 1, 2025

> Putting in draft for now. Feedback and guidance on how to implement proper support for parallel transfer would be greatly appreciated.

Discussed in #14449 (comment)

@korydraughn
Author

TODO: Update documentation for iRODS - https://docs.cyberduck.io/protocols/irods/

@trel

trel commented Oct 9, 2025

updating the docs is a separate repo, would have its own PR

@korydraughn
Author

korydraughn commented Oct 9, 2025

Rebased PR on top of master.

Verified the following:

  • Parallel and single-stream uploads and downloads work
  • Config options defined in new iRODS profile are honored
  • PAM authentication works via the pam_password auth scheme
  • Other operations work too - rename, delete, etc.

With this PR, users will not be able to calculate checksums on data in an iRODS zone. Cyberduck will report checksums in their hex form if they exist in iRODS. We could add an option to the iRODS profile which allows users to instruct Cyberduck to calculate checksums following an upload, but that didn't feel like the correct approach, mainly because profiles seem to be loaded once on program start and never reread.

As for the unit/integration tests, I'm not sure how the implementation can be tested without a real iRODS server.

Finally, this PR bumps the minimum iRODS version requirement to 4.3.2. A new version of irods4j will be needed before this is merged. See irods/irods4j#126. I'll get a new version released this week.

@korydraughn
Author

@dkocher When I move a file in or out of a directory, I see a new iRODS connection get created. It disappears eventually, but not cleanly.

Can you explain why Cyberduck creates a new connection every time a file is moved?

The move class is very similar to other implementations so it's not clear to me what causes the new connection.

@dkocher
Contributor

dkocher commented Oct 14, 2025

> @dkocher When I move a file in or out of a directory, I see a new iRODS connection get created. It disappears eventually, but not cleanly.
>
> Can you explain why Cyberduck creates a new connection every time a file is moved?
>
> The move class is very similar to other implementations so it's not clear to me what causes the new connection.

You will need to return Statefulness.stateless in IRODSProtocol#getStatefulness.
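
For readers unfamiliar with the hook named above, here is a minimal sketch of what such an override could look like. `Statefulness` and `IRODSProtocolSketch` are stand-ins defined here only so the snippet compiles on its own; the real types live in Cyberduck and are not reproduced. Note that later in this thread the protocol is kept stateful, so treat this purely as an illustration of the suggestion.

```java
/** Stand-in for the Statefulness enum referenced above; redefined so the sketch compiles alone. */
enum Statefulness { stateful, stateless }

/** Hypothetical, stripped-down stand-in for IRODSProtocol (not the real class). */
class IRODSProtocolSketch {
    /** Reports the protocol as stateless, as suggested in the comment above. */
    public Statefulness getStatefulness() {
        return Statefulness.stateless;
    }
}
```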

@korydraughn
Author

@dkocher When would someone choose stateless or stateful?
If I change the protocol to stateless, how will that affect other operations?
If I continue using stateful, what needs to change to make sure the additional connections are closed immediately?

To provide some context, a single iRODS connection cannot be used to execute multiple API operations in parallel. A single request is sent and the server returns a response.

@dkocher
Contributor

dkocher commented Oct 15, 2025

If you can share credentials with us for your test environment we can run daily integration tests from our CI.

@dkocher
Contributor

dkocher commented Oct 15, 2025

> @dkocher When would someone choose stateless or stateful? If I change the protocol to stateless, how will that affect other operations? If I continue using stateful, what needs to change to make sure the additional connections are closed immediately?
>
> To provide some context, a single iRODS connection cannot be used to execute multiple API operations in parallel. A single request is sent and the server returns a response.

We will have to keep it stateful then; otherwise Cyberduck would attempt to use a single connection for multiple actions in parallel, e.g. when the user is browsing folders or transferring multiple files in parallel.

I will need to review how we can still support the native copy feature implementation for iRODS that would not require a new connection.

@korydraughn
Author

> If you can share credentials with us for your test environment we can run daily integration tests from our CI.

We don't have a CI system for people to hook into yet. We have a small set of tools which make it easy to launch one or more iRODS servers for testing. It's likely overkill for your needs though.

With that said, building a containerized environment for testing iRODS is pretty easy. I'm happy to put together a Docker Compose project for the iRODS component. That will allow you to launch it on a local computer and have it available for testing.

> We will have to keep it stateful then as otherwise it will be attempted to use a single connection for multiple actions in parallel, i.e. when the user is browsing folders or a file transfer with multiple files in parallel.
>
> I will need to review how we can still support the native copy feature implementation for iRODS that would not require a new connection.

So, because the IRODSProtocol class reports iRODS as being stateful, does that mean every operation results in a new iRODS connection being created?

If that's true, why is it that the Move operation results in connections which do not disconnect? No other operation shares that behavior.

I added some log statements to my local build to try and box in what is leading to the additional connections, but it didn't help. However, it did reveal a high number of instantiations of IRODSMoveFeature. Any idea why the move operation is instantiated so much?

@dkocher
Contributor

dkocher commented Oct 15, 2025

We have other usages of Docker Compose containers in integration tests, thus that should be feasible.

@korydraughn
Author

Where should I place the Docker Compose project? Is the test directory for irods appropriate?

I figure you can move things around if needed.

@korydraughn
Author

@dkocher What triggers the Copy implementation?

Using the Duplicate option within the context menu doesn't appear to trigger it.

@korydraughn
Author

Docker Compose project added under the test directory.

Squashing everything down.

@dkocher
Contributor

dkocher commented Dec 2, 2025

> So far, all of my GUI testing has been "click button, wait for operation to finish, check results and log". I have not tried executing multiple operations in parallel using the GUI.

The simplest way to test manually is to expand multiple folders at once.

@dkocher
Contributor

dkocher commented Dec 2, 2025

> 1. Due to the IRODSProtocol implementation reporting as "stateful", Cyberduck will not use the same iRODS connection for parallel operations, correct?

That is correct. Connections are leased from a pool.

> 1. Will Cyberduck block when the user attempts to execute another operation while the connection is in use?

No, a second connection will be opened.
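
The leasing behavior described in the two answers above can be pictured with a toy pool. This is an illustrative sketch only, not Cyberduck's pool implementation: `FakeConnection` and `LeasingPool` are hypothetical names, and details such as pool limits or connection validation are deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an iRODS connection; not a Cyberduck or irods4j type. */
final class FakeConnection {
    final int id;

    FakeConnection(int id) {
        this.id = id;
    }
}

/** Toy pool: lease an idle connection if one exists, otherwise open a new one. */
final class LeasingPool {
    private final Deque<FakeConnection> idle = new ArrayDeque<>();
    private final AtomicInteger opened = new AtomicInteger();

    /** Never blocks: if every connection is in use, a new one is opened. */
    synchronized FakeConnection lease() {
        FakeConnection connection = idle.poll();
        return connection != null ? connection : new FakeConnection(opened.incrementAndGet());
    }

    /** Returns a connection to the pool so a later lease can reuse it. */
    synchronized void release(FakeConnection connection) {
        idle.push(connection);
    }

    /** Total number of connections opened so far. */
    int openedCount() {
        return opened.get();
    }
}
```

In this model, two overlapping operations each get their own connection, so no single connection ever carries two requests at once. A connection is reused only after it has been released.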

@korydraughn
Author

korydraughn commented Dec 3, 2025

> > So far, all of my GUI testing has been "click button, wait for operation to finish, check results and log". I have not tried executing multiple operations in parallel using the GUI.
>
> The simplest way to test manually is to expand multiple folders at once.

I'm probably misunderstanding, but wouldn't I need to expand multiple folders fast enough such that the operations overlap?

And if that's the case, is there a trick to doing that in the GUI?

@dkocher
Contributor

dkocher commented Dec 3, 2025

> > > So far, all of my GUI testing has been "click button, wait for operation to finish, check results and log". I have not tried executing multiple operations in parallel using the GUI.
> >
> > The simplest way to test manually is to expand multiple folders at once.
>
> I'm probably misunderstanding, but wouldn't I need to expand multiple folders fast enough such that the operations overlap?
>
> And if that's the case, is there a trick to doing that in the GUI?

Select multiple folders and press the right arrow key to expand.

@korydraughn
Author

Wow, can't believe I forgot about multi-select.

Will give that a try.

@korydraughn
Author

NTS: The rebase resulted in changes to SHAs. Here's the new mapping for squashing (old sha -> sha after rebase).

@korydraughn
Author

korydraughn commented Dec 4, 2025

Removed the integration stanza in irods/pom.xml, as mentioned in #17341 (comment).

Waiting for GitHub Actions to complete. If everything succeeds, I'll squash the commits so that they are clean.


Side note: Opening multiple directories via the GUI worked just fine. I did not encounter any issues stemming from concurrent/parallel use of an iRODS connection.

@korydraughn
Author

GitHub Actions reported success.

The commits have been squashed. I think this is ready.

@dkocher
Contributor

dkocher commented Dec 5, 2025

> GitHub Actions reported success.
>
> The commits have been squashed. I think this is ready.

Sometimes the tests fail with

[ERROR]   IRODSPamAuthenticationTest.start:104 » Runtime org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching '.*"log_message":"Initializing delay server.".*'

Should one increase the timeout to wait for the container?

@korydraughn
Author

Increasing the timeout might be enough to resolve it. Will give it a try.

@korydraughn
Author

Looks like increasing the wait time to 5 minutes resolved the issue.

Squashing changes.

@korydraughn
Author

Hmm, it timed out again. Perhaps the timeout option I set isn't the correct option.

@korydraughn
Author

korydraughn commented Dec 5, 2025

Looks like I have to apply it to the object returned by Wait.forLogMessage().

Trying again with updated code.

@korydraughn
Author

Squashing changes and waiting to see if GitHub Actions is still happy.

@korydraughn
Author

That's two successful runs of GitHub Actions in a row. I can't trigger any more reruns myself, so I think this PR is complete.

@dkocher Maybe you can rerun the Ubuntu GitHub Action a few more times to see how well the latest tweak holds up?

Switching to wrapping up the profiles PR.

Labels

irods IRODS Protocol Implementation

Development

Successfully merging this pull request may close these issues.

PAM passwords not handled correctly