Commit 2150182

Merge pull request #153 from cthoyt-forks-and-packages/package-cli
Move Selene CLI from an external script to within the package
2 parents be6d2e1 + 1aff5e2 commit 2150182

9 files changed: +58 -39 lines changed

README.md

Lines changed: 0 additions & 4 deletions
@@ -51,10 +51,6 @@ If you would like to locally install Selene, you can run
 python setup.py install
 ```

-### Additional dependency for the CLI
-
-Please install `docopt` before running the command-line script `selene_cli.py` provided in the repository.
-
 ## About Selene

 Selene is composed of a command-line interface and an API (the `selene-sdk` Python package).

docs/source/overview/cli.md

Lines changed: 5 additions & 2 deletions
@@ -1,5 +1,8 @@
 # Selene CLI operations and outputs
-Selene provides a command-line interface (CLI) that takes in a user-specified configuration file containing the operations the user wants to run and the parameters required for these operations. (See [Operations](#operations) for more detail.)
+Selene provides a command-line interface (CLI) that takes in a user-specified configuration file containing the operations the user wants to run and the parameters required for these operations.
+It is automatically installed using a setuptools entrypoint so it can be called with the bash function
+`selene_sdk` from anywhere in the bash. It can also be called with `python -m selene_sdk`.
+See [Operations](#operations) for more detail.

 The sections that follow describe in detail how the various components that make up the configuration file are specified. For operation-specific sections (e.g. training, evaluation), we also explain what the expected outputs are.

@@ -54,7 +57,7 @@ Note that there should not be any commas at the end of these lines.
 - `random_seed`: Set a random seed for `torch` and `torch.cuda` (if using CUDA-enabled GPUs) for reproducibility.
 - `output_dir`: The output directory to use for all operations. If no `output_dir` is specified, Selene assumes that the `output_dir` is specified in all relevant function-type values for operations in Selene. (More information on what function-type values are in [later sections](#a-note-for-the-following-sections).) We recommend using this parameter for `train` and `evaluate` operations.
 - `create_subdirectory`: If True, creates a directory within `output_dir` with the name formatted as `%Y-%m-%d-%H-%M-%S`---the date/time when Selene was run. (This is only applicable if `output_dir` has been specified.)
-- `lr`: The learning rate. If you use our [CLI script](https://github.com/FunctionLab/selene/blob/master/selene_cli.py), you can pass this in as a command-line argument rather than having it specified in the configuration file.
+- `lr`: The learning rate. If you use the CLI (`selene_sdk`), you can pass this in as a command-line argument rather than having it specified in the configuration file.
 - `load_test_set`: This is only applicable if you have specified `ops: [train, evaluate]`. You can set this parameter to True (by default it is False and the test set is only loaded when training ends) if you would like to load the test set into memory before training begins---and therefore save the test data generated by a sampler to a .bed file. You would find this useful if you want to save a test dataset (see [Samplers used for training](#samplers-used-for-training-and-evaluation-optionally)) and you do not know if your model will finish training and evaluation within the allotted time that your job is run. You should also be running Selene on a machine that can support such an increase in memory usage (on the order of GBs, depending on how many classes your model predicts, how large the test dataset is, etc.).

 ## Model architecture
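
The documentation change above describes a CLI that takes a configuration file path and an optional `--lr` value. The sketch below mirrors that interface using only the standard library. It uses `argparse` as a stand-in, not the `click` command this commit actually adds, so that it runs without third-party packages. The names `existing_file` and `build_parser` and the placeholder configuration content are assumptions made for illustration.

```python
import argparse
import os
import tempfile


def existing_file(value):
    """Accept only paths that point at an existing regular file."""
    if not os.path.isfile(value):
        raise argparse.ArgumentTypeError(f"{value!r} is not an existing file")
    return value


def build_parser():
    # Hypothetical stand-in for the click command added in this commit.
    parser = argparse.ArgumentParser(prog="selene_sdk")
    parser.add_argument("path", type=existing_file,
                        help="configuration file to load")
    parser.add_argument("--lr", type=float, default=None,
                        help="if training, the optimizer learning rate")
    return parser


# Write a placeholder configuration file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".yml", delete=False) as handle:
    handle.write("ops: [train]\n")
    config_path = handle.name

try:
    with_lr = build_parser().parse_args([config_path, "--lr", "0.01"])
    without_lr = build_parser().parse_args([config_path])
finally:
    os.remove(config_path)

print(with_lr.lr)     # 0.01, supplied on the command line
print(without_lr.lr)  # None, no command-line override was given
```

How Selene combines a command-line `--lr` with an `lr` entry in the configuration file is not shown in this diff, so the sketch stops at parsing.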

selene-cpu.yml

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ channels:
 - conda-forge
 dependencies:
 - cython=0.29.3
+- click==7.1.2
 - docopt=0.6.2
 - h5py=2.9.0
 - matplotlib=2.0.2

selene-gpu-snapshot.yml

Lines changed: 1 addition & 0 deletions
@@ -97,6 +97,7 @@ dependencies:
 - pytorch=1.0.1=py3.6_cuda10.0.130_cudnn7.4.2_2
 - torchvision=0.2.2=py_3
 - pip:
+  - click==7.1.2
   - pkginfo==1.4.2
   - requests-toolbelt==0.8.0
   - torch==1.0.1.post2

selene-gpu.yml

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ channels:
 - defaults
 dependencies:
 - cython=0.29.3
+- click==7.1.2
 - docopt=0.6.2
 - h5py=2.9.0
 - pandas=0.20.3

selene_cli.py

Lines changed: 0 additions & 32 deletions
This file was deleted.

selene_sdk/__main__.py

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+"""Entrypoint module, in case you use `python -m selene_sdk`.
+
+Why does this file exist, and why __main__? For more info, read:
+- https://www.python.org/dev/peps/pep-0338/
+- https://docs.python.org/3/using/cmdline.html#cmdoption-m
+"""
+
+from .cli import main
+
+if __name__ == '__main__':
+    main()

selene_sdk/cli.py

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+"""Command line interface for Selene.
+
+Why does this file exist, and why not put this in ``__main__``? You might be tempted to import things from ``__main__``
+later, but that will cause problems--the code will get executed twice:
+
+- When you run ``python3 -m selene_sdk`` python will execute``__main__.py`` as a script. That means there won't be any
+  ``selene_sdk.__main__`` in ``sys.modules``.
+- When you import __main__ it will get executed again (as a module) because
+  there's no ``selene_sdk.__main__`` in ``sys.modules``.
+
+.. seealso:: http://click.pocoo.org/5/setuptools/#setuptools-integration
+"""
+
+import click
+
+from selene_sdk import __version__
+from selene_sdk.utils import load_path, parse_configs_and_run
+
+
+@click.command()
+@click.version_option(__version__)
+@click.argument('path', type=click.Path(exists=True, file_okay=True, dir_okay=False))
+@click.option('--lr', type=float, help='If training, the optimizer learning rate', show_default=True)
+def main(path, lr):
+    """Build the model and trains it using user-specified input data."""
+    configs = load_path(path, instantiate=False)
+    parse_configs_and_run(configs, lr=lr)
+
+
+if __name__ == "__main__":
+    main()
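
The docstring in `selene_sdk/cli.py` warns that code placed in `__main__.py` runs twice if that file is later imported. A minimal sketch of that behavior follows, assuming a throwaway package built in a temporary directory. `demo_pkg` and its `executions` list are hypothetical names, and `runpy.run_module` is used here as an approximation of `python -m demo_pkg`.

```python
import os
import runpy
import sys
import tempfile

# Hypothetical throwaway package; `demo_pkg` and `executions` are illustrative.
INIT_SOURCE = "executions = []\n"
MAIN_SOURCE = (
    "import demo_pkg\n"
    "demo_pkg.executions.append(__name__)\n"
    "if __name__ == '__main__':\n"
    "    # Importing the file that is already running executes it again.\n"
    "    import demo_pkg.__main__\n"
)

with tempfile.TemporaryDirectory() as root:
    package_dir = os.path.join(root, "demo_pkg")
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write(INIT_SOURCE)
    with open(os.path.join(package_dir, "__main__.py"), "w") as handle:
        handle.write(MAIN_SOURCE)

    sys.path.insert(0, root)
    try:
        # Roughly what `python -m demo_pkg` does: run __main__.py as a script.
        runpy.run_module("demo_pkg", run_name="__main__")
        executions = list(sys.modules["demo_pkg"].executions)
    finally:
        # Undo the changes made to the interpreter state.
        sys.path.remove(root)
        for name in ("demo_pkg.__main__", "demo_pkg"):
            sys.modules.pop(name, None)

print(executions)  # ['__main__', 'demo_pkg.__main__']
```

The file body ran once under the name `__main__` and once under `demo_pkg.__main__`, which is why the commit keeps the command in `cli.py` and leaves `__main__.py` as a thin wrapper.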

setup.py

Lines changed: 8 additions & 1 deletion
@@ -50,6 +50,7 @@
     cmdclass=cmdclass,
     install_requires=[
         "cython>=0.27.3",
+        'click',
         "h5py",
         "matplotlib>=2.2.3",
         "numpy",
@@ -63,4 +64,10 @@
         "seaborn",
         "statsmodels",
         "torch>=0.4.1, <=1.4.0",
-    ])
+    ],
+    entry_points={
+        'console_scripts': [
+            'selene_sdk = selene_sdk.cli:main',
+        ],
+    },
+)
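
The `console_scripts` entry above maps a command name to a `module:function` target. As a rough, simplified sketch of how such a string can be resolved to a callable, the helper below parses the specification and imports the target. `resolve_entry_point` is a hypothetical name, and the example resolves a standard-library function (`json:dumps`) because `selene_sdk` may not be installed where this is run. The real wrapper script generated at install time does more than this.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = package.module:attr' and return (name, target object)."""
    name, _, target = (part.strip() for part in spec.partition("="))
    module_name, _, attr_path = target.partition(":")
    obj = importlib.import_module(module_name)
    if attr_path:
        # Walk dotted attribute paths such as 'Class.method'.
        for attr in attr_path.split("."):
            obj = getattr(obj, attr)
    return name, obj


# Stand-in specification; the commit's own entry is
# 'selene_sdk = selene_sdk.cli:main'.
name, func = resolve_entry_point("dump_json = json:dumps")
print(name)                # dump_json
print(func({"lr": 0.01}))  # {"lr": 0.01}
```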
