Commit 5c929ce

feat(pins): add support for custom storage options in board_s3 (#237)
* feat(pins): add support for custom storage options in `board_s3`

- Introduce the ability to pass arbitrary storage options to the underlying fsspec S3FileSystem in `board_s3`.
- This enhancement allows for greater flexibility when connecting to S3-compatible services by enabling the specification of custom credentials, endpoints, and other S3FileSystem parameters.
- Added documentation and examples to illustrate how to use the new `storage_options` parameter, including an example for connecting to Backblaze B2.

This change enables users to more easily integrate with a variety of S3-compatible storage solutions, improving the library's versatility and user experience.

* fix(pins): correct kwargs reference in board_s3 constructor

- Replace `**kwargs` with `**storage_options` to accurately reflect the intended parameter in `board_s3` function, ensuring the correct handling of storage options.

* docs(pins): add missing import statement in board_s3 example

- This commit adds an import statement for the `pins` module in the docstring example of the `board_s3` function. This change ensures that the example is self-contained and can be executed without prior context, improving the documentation's usability for new users.

* style(pins): format board_s3 function definition for better readability

- Adjusted the function definition of `board_s3` to span multiple lines, improving code readability and maintainability.
- Ensured consistency with project coding standards for function definitions.

* feat(pins): add warning for non-zero listings_expiry_time in board_s3

- Implemented a warning for users setting `listings_expiry_time` to a non-zero value in `board_s3` function to alert them about potential unexpected cache behaviour.
- Refactored the handling of `storage_options` to ensure `listings_expiry_time` is explicitly set, either by the user or to a default of 0, to improve clarity and maintainability of the code.

This change aims to guide users towards optimal performance settings and enhance the reliability of cache operations with S3 boards.
1 parent 2c53d3c commit 5c929ce
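
The option handling described in the commit message can be sketched in isolation. The function below is a hypothetical stand-in (`resolve_s3_options` is not part of pins) that mirrors only the merge-and-warn logic, under the assumption that user options are copied as-is and `listings_expiry_time` falls back to 0 when not supplied. The warning text is shortened here.

```python
import warnings


def resolve_s3_options(**storage_options):
    """Hypothetical stand-in for the option handling inside board_s3."""
    # Use the caller's value if one was given, otherwise fall back to 0.
    listings_expiry_time = storage_options.get("listings_expiry_time", 0)

    # A non-zero value is allowed, but the caller is warned about it.
    if listings_expiry_time != 0:
        warnings.warn(
            "Non-zero `listings_expiry_time` may lead to unexpected "
            "behaviour with cache operations."
        )

    # Copy the caller's options, then set listings_expiry_time explicitly
    # so the key is always present in the result.
    opts = {**storage_options}
    opts["listings_expiry_time"] = listings_expiry_time
    return opts
```

Under these assumptions, a call with only `endpoint_url` set returns that option plus `listings_expiry_time` set to 0 and emits no warning, while a call with `listings_expiry_time=30` keeps the value 30 and emits one warning.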

pins/constructors.py

Lines changed: 36 additions & 5 deletions
@@ -1,6 +1,7 @@
 import fsspec
 import os
 import tempfile
+import warnings

 from .boards import BaseBoard, BoardRsConnect, BoardManual, board_deparse
 from .cache import PinsCache, PinsRscCacheMapper, PinsAccessTimeCache, prefix_cache
@@ -432,7 +433,9 @@ def board_connect(
 board_rsconnect = board_connect


-def board_s3(path, versioned=True, cache=DEFAULT, allow_pickle_read=None):
+def board_s3(
+    path, versioned=True, cache=DEFAULT, allow_pickle_read=None, **storage_options
+):
     """Create a board to read and write pins from an AWS S3 bucket folder.

     Parameters
@@ -453,19 +456,47 @@ def board_s3(path, versioned=True, cache=DEFAULT, allow_pickle_read=None):
         You can enable reading pickles by setting this to `True`, or by setting the
         environment variable `PINS_ALLOW_PICKLE_READ`. If both are set, this argument
         takes precedence.
+    storage_options:
+        Additional keyword arguments to be passed to the underlying fsspec S3FileSystem.

     Notes
     -----
     The s3 board uses the fsspec library (s3fs) to handle interacting with AWS S3.
     In order to authenticate, set the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`,
-    and (optionally) `AWS_REGION` environment variables.
+    and (optionally) `AWS_REGION` environment variables. If you are using an
+    s3-compatible storage service that is not from AWS, you can pass in the necessary
+    credentials to the `storage_options` dictionary, such as `endpoint_url`, `key`, and
+    `secret`. We recommend setting these as environment variables. An example using
+    Backblaze B2 would look like:
+
+    Examples
+    --------
+    >>> import pins
+    >>> board = pins.board_s3(
+    ...     "pins-test",
+    ...     endpoint_url=os.getenv("FSSPEC_S3_ENDPOINT_URL"),
+    ...     key=os.getenv("FSSPEC_S3_KEY"),
+    ...     secret=os.getenv("FSSPEC_S3_SECRET"),
+    ... )

     See <https://github.com/fsspec/s3fs>

     """
-    # TODO: user should be able to specify storage options here?
-
-    opts = {"listings_expiry_time": 0}
+    # Warn user about the use of non-zero listings_expiry_time
+    listings_expiry_time = storage_options.get("listings_expiry_time", 0)
+    if listings_expiry_time != 0:
+        warning_msg = """
+        Non-zero `listings_expiry_time` may lead to unexpected behaviour with cache operations.
+        We're not discouraging you from setting it to be a non-zero value,
+        but we strongly recommend setting it to 0 for optimal performance.
+        """
+        warnings.warn(warning_msg)
+
+    # Set options to pass in. Start with storage options provided by user.
+    opts = {**storage_options}
+    # Set listings_expiry_time based on what's provided by user
+    # or the default value of 0.
+    opts.update({"listings_expiry_time": listings_expiry_time})
     return board("s3", path, versioned, cache, allow_pickle_read, storage_options=opts)


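
The docstring example in this diff reads credentials with `os.getenv`, so `os` has to be imported before the example can run. One possible way to gather those values is sketched below. The helper `s3_options_from_env` is hypothetical and not part of pins. The environment variable names are the ones used in the docstring example, and skipping unset variables is an assumption made here so that no `None` values are passed on.

```python
import os


def s3_options_from_env(environ=os.environ):
    """Hypothetical helper: collect S3FileSystem kwargs from environment variables."""
    # Option name expected by the filesystem -> environment variable holding it.
    mapping = {
        "endpoint_url": "FSSPEC_S3_ENDPOINT_URL",
        "key": "FSSPEC_S3_KEY",
        "secret": "FSSPEC_S3_SECRET",
    }
    # Only include options whose variable is actually set.
    return {opt: environ[var] for opt, var in mapping.items() if var in environ}
```

With pins and s3fs installed, the result could then be forwarded as keyword arguments, for example `pins.board_s3("pins-test", **s3_options_from_env())`.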