Commit 704aff8

Merge pull request #23: New timestamp pattern option
The pull request wasn't exactly complete (the code couldn't have run as written, although it showed the general idea clearly enough), so I decided to treat #23 as more of a feature suggestion. However, there was no reason not to merge the pull request and use it as a base for my changes, which is why I did so despite rewriting the code.

Changes from the pull request:

- Renamed 'timestamp' to 'timestamp_pattern' (less ambiguous).

- Added validation that custom patterns define named capture groups
  corresponding to all of the required date components.

- Rewrote mapping from capture groups to datetime.datetime() arguments:

  - Previously positional datetime.datetime() arguments were used, which
    depended on the order of capture groups in the hard coded regular
    expression pattern to function correctly.

  - Now that users can define their own patterns, this is no longer a
    reasonable approach. As such the code now constructs and passes a
    dictionary of keyword arguments to datetime.datetime().

- Updated documentation and command line interface usage message.

- Added tests for the new behavior.
2 parents 06d57c4 + bc88870 commit 704aff8
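
To illustrate the point about positional versus keyword arguments, here is a minimal sketch using only the standard library. The day-first pattern and the filename are hypothetical and are not taken from the pull request. They stand in for a user-defined pattern whose capture groups are not in year, month, day order.

```python
import datetime
import re

# Hypothetical custom pattern: groups appear in day, month, year order.
pattern = re.compile(r'(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{4})')
match = pattern.search('backup-31.12.2009.tar.gz')

# Positional approach: the values are passed in group order, so this
# effectively calls datetime.datetime(31, 12, 2009) and fails because
# 2009 is not a valid day of the month.
try:
    datetime.datetime(*(int(group, 10) for group in match.groups('0')))
    positional_ok = True
except ValueError:
    positional_ok = False

# Keyword approach: the group names decide which argument each value feeds,
# so the order of the groups in the pattern no longer matters.
keywords = {name: int(value, 10) for name, value in match.groupdict().items()}
timestamp = datetime.datetime(**keywords)
print(positional_ok, timestamp)
```

With the hard coded default pattern the positional call happened to work because the groups were already in constructor order. Once patterns are user supplied, only the named approach is robust.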

5 files changed: +169 -20 lines changed

README.rst

Lines changed: 28 additions & 1 deletion
@@ -153,6 +153,10 @@ intended you have no right to complain ;-).
153153
usage of the ``-H``, ``--hourly`` option for details about ``COUNT``."
154154
"``-y``, ``--yearly=COUNT``","Set the number of yearly backups to preserve during rotation. Refer to the
155155
usage of the ``-H``, ``--hourly`` option for details about ``COUNT``."
156+
"``-t``, ``--timestamp-pattern=PATTERN``","Customize the regular expression pattern that is used to match and extract
157+
timestamps from filenames. ``PATTERN`` is expected to be a Python compatible
158+
regular expression that must define the named capture groups 'year',
159+
'month' and 'day' and may define 'hour', 'minute' and 'second'."
156160
"``-I``, ``--include=PATTERN``","Only process backups that match the shell pattern given by ``PATTERN``. This
157161
argument can be repeated. Make sure to quote ``PATTERN`` so the shell doesn't
158162
expand the pattern before it's received by rotate-backups."
@@ -368,6 +372,28 @@ Supported configuration options
368372
``weekly``, ``monthly`` and ``yearly`` options, these options support the
369373
same values as documented for the command line interface.
370374

375+
- The ``timestamp-pattern`` option can be used to customize the regular
376+
expression that's used to extract timestamps from filenames. The value is
377+
expected to be a Python compatible regular expression that must contain the
378+
named capture groups 'year', 'month' and 'day' and may contain the groups
379+
'hour', 'minute' and 'second'. As an example here is the default regular
380+
expression::
381+
382+
# Required components.
383+
(?P<year>\d{4} ) \D?
384+
(?P<month>\d{2}) \D?
385+
(?P<day>\d{2} ) \D?
386+
(?:
387+
# Optional components.
388+
(?P<hour>\d{2} ) \D?
389+
(?P<minute>\d{2}) \D?
390+
(?P<second>\d{2})?
391+
)?
392+
393+
Note how this pattern spans multiple lines: Regular expressions are compiled
394+
using the `re.VERBOSE`_ flag which means whitespace (including newlines) is
395+
ignored.
396+
371397
- The ``include-list`` and ``exclude-list`` options define a comma separated
372398
list of filename patterns to include or exclude, respectively:
373399

@@ -384,7 +410,7 @@ Supported configuration options
384410
used to remove backups.
385411

386412
- The ``ionice`` option expects one of the I/O scheduling class names ``idle``,
387-
``best-effort`` or ``realtime``.
413+
``best-effort`` or ``realtime`` (or the corresponding numbers).
388414

389415
- The ``ssh-user`` option can be used to override the name of the remote SSH
390416
account that's used to connect to a remote system.
@@ -436,6 +462,7 @@ This software is licensed under the `MIT license`_.
436462
.. _peter@peterodding.com: peter@peterodding.com
437463
.. _PyPI: https://pypi.python.org/pypi/rotate-backups
438464
.. _Python Package Index: https://pypi.python.org/pypi/rotate-backups
465+
.. _re.VERBOSE: https://docs.python.org/3/library/re.html#re.VERBOSE
439466
.. _Read the Docs: https://rotate-backups.readthedocs.org
440467
.. _rsync: http://en.wikipedia.org/wiki/rsync
441468
.. _virtual environments: http://docs.python-guide.org/en/latest/dev/virtualenvs/
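
As a rough check of the README text above, the default expression can be compiled with `re.VERBOSE` and tried against sample names. This is a sketch only: the two filenames are made up, and the pattern is copied from the README hunk with indentation added for readability, which verbose mode ignores.

```python
import re

# The default pattern as quoted in the README hunk (whitespace and
# '#' comments are ignored because of re.VERBOSE).
DEFAULT_PATTERN = r'''
    # Required components.
    (?P<year>\d{4} ) \D?
    (?P<month>\d{2}) \D?
    (?P<day>\d{2}  ) \D?
    (?:
        # Optional components.
        (?P<hour>\d{2}  ) \D?
        (?P<minute>\d{2}) \D?
        (?P<second>\d{2})?
    )?
'''
compiled = re.compile(DEFAULT_PATTERN, re.VERBOSE)

# Hypothetical filenames: one with a date only, one with a date and time.
date_only = compiled.search('backup-2020-02-14.tar.gz')
with_time = compiled.search('backup-2020-02-14_10-15-00.tar.gz')

print(date_only.groupdict())
print(with_time.groupdict())
```

For the date-only name the optional groups simply do not participate in the match, so `date_only.group('hour')` is `None` rather than an empty string.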

docs/conf.py

Lines changed: 1 addition & 0 deletions
@@ -74,6 +74,7 @@
7474
'python3': ('https://docs.python.org/3/', None),
7575
'dateutil': ('https://dateutil.readthedocs.io/en/latest/', None),
7676
'executor': ('https://executor.readthedocs.io/en/latest/', None),
77+
'humanfriendly': ('https://humanfriendly.readthedocs.io/en/latest/', None),
7778
'propertymanager': ('https://property-manager.readthedocs.io/en/latest/', None),
7879
'updatedotdee': ('https://update-dotdee.readthedocs.io/en/latest/', None),
7980
}

rotate_backups/__init__.py

Lines changed: 93 additions & 13 deletions
@@ -1,7 +1,7 @@
11
# rotate-backups: Simple command line interface for backup rotation.
22
#
33
# Author: Peter Odding <peter@peterodding.com>
4-
# Last Change: February 13, 2020
4+
# Last Change: February 14, 2020
55
# URL: https://github.com/xolox/python-rotate-backups
66

77
"""
@@ -26,7 +26,7 @@
2626
from executor import ExternalCommandFailed
2727
from executor.concurrent import CommandPool
2828
from executor.contexts import RemoteContext, create_context
29-
from humanfriendly import Timer, coerce_boolean, format_path, parse_path, pluralize
29+
from humanfriendly import Timer, coerce_boolean, coerce_pattern, format_path, parse_path, pluralize
3030
from humanfriendly.text import concatenate, split
3131
from natsort import natsort
3232
from property_manager import (
@@ -36,6 +36,7 @@
3636
lazy_property,
3737
mutable_property,
3838
required_property,
39+
set_property,
3940
)
4041
from simpleeval import simple_eval
4142
from six import string_types
@@ -48,6 +49,9 @@
4849
# Initialize a logger for this module.
4950
logger = VerboseLogger(__name__)
5051

52+
DEFAULT_REMOVAL_COMMAND = ['rm', '-fR']
53+
"""The default removal command (a list of strings)."""
54+
5155
ORDERED_FREQUENCIES = (
5256
('minutely', relativedelta(minutes=1)),
5357
('hourly', relativedelta(hours=1)),
@@ -57,14 +61,29 @@
5761
('yearly', relativedelta(years=1)),
5862
)
5963
"""
60-
A list of tuples with two values each:
64+
An iterable of tuples with two values each:
6165
6266
- The name of a rotation frequency (a string like 'hourly', 'daily', etc.).
6367
- A :class:`~dateutil.relativedelta.relativedelta` object.
6468
6569
The tuples are sorted by increasing delta (intentionally).
6670
"""
6771

72+
SUPPORTED_DATE_COMPONENTS = (
73+
('year', True),
74+
('month', True),
75+
('day', True),
76+
('hour', False),
77+
('minute', False),
78+
('second', False),
79+
)
80+
"""
81+
An iterable of tuples with two values each:
82+
83+
- The name of a date component (a string).
84+
- :data:`True` for required components, :data:`False` for optional components.
85+
"""
86+
6887
SUPPORTED_FREQUENCIES = dict(ORDERED_FREQUENCIES)
6988
"""
7089
A dictionary with rotation frequency names (strings) as keys and
@@ -89,9 +108,6 @@
89108
filenames.
90109
"""
91110

92-
DEFAULT_REMOVAL_COMMAND = ['rm', '-fR']
93-
"""The default removal command (a list of strings)."""
94-
95111

96112
def coerce_location(value, **options):
97113
"""
@@ -197,15 +213,21 @@ def load_config_file(configuration_file=None, expand=True):
197213
rotation_scheme = dict((name, coerce_retention_period(items[name]))
198214
for name in SUPPORTED_FREQUENCIES
199215
if name in items)
200-
options = dict(include_list=split(items.get('include-list', '')),
201-
exclude_list=split(items.get('exclude-list', '')),
202-
io_scheduling_class=items.get('ionice'),
203-
strict=coerce_boolean(items.get('strict', 'yes')),
204-
prefer_recent=coerce_boolean(items.get('prefer-recent', 'no')))
216+
options = dict(
217+
exclude_list=split(items.get('exclude-list', '')),
218+
include_list=split(items.get('include-list', '')),
219+
io_scheduling_class=items.get('ionice'),
220+
prefer_recent=coerce_boolean(items.get('prefer-recent', 'no')),
221+
strict=coerce_boolean(items.get('strict', 'yes')),
222+
)
205223
# Don't override the value of the 'removal_command' property unless the
206224
# 'removal-command' configuration file option has a value set.
207225
if items.get('removal-command'):
208226
options['removal_command'] = shlex.split(items['removal-command'])
227+
# Don't override the value of the 'timestamp_pattern' property unless the
228+
# 'timestamp-pattern' configuration file option has a value set.
229+
if items.get('timestamp-pattern'):
230+
options['timestamp_pattern'] = items['timestamp-pattern']
209231
# Expand filename patterns?
210232
if expand and location.have_wildcards:
211233
logger.verbose("Expanding filename pattern %s on %s ..", location.directory, location.context)
@@ -425,6 +447,40 @@ def strict(self):
425447
"""
426448
return True
427449

450+
@mutable_property
451+
def timestamp_pattern(self):
452+
"""
453+
The pattern used to extract timestamps from filenames (defaults to :data:`TIMESTAMP_PATTERN`).
454+
455+
The value of this property is a compiled regular expression object.
456+
Callers can provide their own compiled regular expression which
457+
makes it possible to customize the compilation flags (see the
458+
:func:`re.compile()` documentation for details).
459+
460+
The regular expression pattern is expected to be a Python compatible
461+
regular expression that contains the named capture groups 'year',
462+
'month' and 'day' and may contain 'hour', 'minute' and 'second'.
463+
464+
String values are automatically coerced to compiled regular expressions
465+
by calling :func:`~humanfriendly.coerce_pattern()`, in this case only
466+
the :data:`re.VERBOSE` flag is used.
467+
468+
If the caller provides a custom pattern it will be validated
469+
to confirm that the pattern contains named capture groups
470+
corresponding to each of the required date components
471+
defined by :data:`SUPPORTED_DATE_COMPONENTS`.
472+
"""
473+
return TIMESTAMP_PATTERN
474+
475+
@timestamp_pattern.setter
476+
def timestamp_pattern(self, value):
477+
"""Coerce the value of :attr:`timestamp_pattern` to a compiled regular expression."""
478+
pattern = coerce_pattern(value, re.VERBOSE)
479+
for component, required in SUPPORTED_DATE_COMPONENTS:
480+
if component not in pattern.groupindex and required:
481+
raise ValueError("Pattern is missing required capture group! (%s)" % component)
482+
set_property(self, 'timestamp_pattern', pattern)
483+
428484
def rotate_concurrent(self, *locations, **kw):
429485
"""
430486
Rotate the backups in the given locations concurrently.
@@ -587,7 +643,7 @@ def collect_backups(self, location):
587643
logger.info("Scanning %s for backups ..", location)
588644
location.ensure_readable(self.force)
589645
for entry in natsort(location.context.list_entries(location.directory)):
590-
match = TIMESTAMP_PATTERN.search(entry)
646+
match = self.timestamp_pattern.search(entry)
591647
if match:
592648
if self.exclude_list and any(fnmatch.fnmatch(entry, p) for p in self.exclude_list):
593649
logger.verbose("Excluded %s (it matched the exclude list).", entry)
@@ -597,7 +653,7 @@ def collect_backups(self, location):
597653
try:
598654
backups.append(Backup(
599655
pathname=os.path.join(location.directory, entry),
600-
timestamp=datetime.datetime(*(int(group, 10) for group in match.groups('0'))),
656+
timestamp=self.match_to_datetime(match),
601657
))
602658
except ValueError as e:
603659
logger.notice("Ignoring %s due to invalid date (%s).", entry, e)
@@ -607,6 +663,30 @@ def collect_backups(self, location):
607663
logger.info("Found %i timestamped backups in %s.", len(backups), location)
608664
return sorted(backups)
609665

666+
def match_to_datetime(self, match):
667+
"""
668+
Convert a regular expression match to a :class:`~datetime.datetime` value.
669+
670+
:param match: A regular expression match object.
671+
:returns: A :class:`~datetime.datetime` value.
672+
:raises: :exc:`exceptions.ValueError` when a required date component is
673+
not captured by the pattern, the captured value is an empty
674+
string or the captured value cannot be interpreted as a
675+
base-10 integer.
676+
677+
.. seealso:: :data:`SUPPORTED_DATE_COMPONENTS`
678+
"""
679+
kw = {}
680+
for component, required in SUPPORTED_DATE_COMPONENTS:
681+
value = match.group(component)
682+
if value:
683+
kw[component] = int(value, 10)
684+
elif required:
685+
raise ValueError("Missing required date component! (%s)" % component)
686+
else:
687+
kw[component] = 0
688+
return datetime.datetime(**kw)
689+
610690
def group_backups(self, backups):
611691
"""
612692
Group backups collected by :func:`collect_backups()` by rotation frequencies.
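
The validation in the `timestamp_pattern` setter and the conversion in `match_to_datetime()` can be tried outside the package. The following is a standalone sketch under stated assumptions: `validate_pattern` and `to_datetime` are hypothetical names, plain `re.compile()` stands in for `coerce_pattern()`, and `groupdict().get()` is used instead of `match.group()`. The last choice belongs to this sketch, not to the commit; it is there because `match.group()` raises `IndexError` for a group name the pattern does not define.

```python
import datetime
import re

# Mirrors SUPPORTED_DATE_COMPONENTS from the diff: (name, required).
COMPONENTS = (
    ('year', True), ('month', True), ('day', True),
    ('hour', False), ('minute', False), ('second', False),
)


def validate_pattern(value):
    """Compile `value` if it is a string and check the required named groups."""
    pattern = re.compile(value, re.VERBOSE) if isinstance(value, str) else value
    for name, required in COMPONENTS:
        if required and name not in pattern.groupindex:
            raise ValueError("Pattern is missing required capture group! (%s)" % name)
    return pattern


def to_datetime(match):
    """Build a datetime from named groups, defaulting optional ones to zero."""
    groups = match.groupdict()
    kw = {}
    for name, required in COMPONENTS:
        value = groups.get(name)  # None when the group is undefined or unmatched
        if value:
            kw[name] = int(value, 10)
        elif required:
            raise ValueError("Missing required date component! (%s)" % name)
        else:
            kw[name] = 0
    return datetime.datetime(**kw)


# Usage with a hypothetical date-only pattern and filename.
pattern = validate_pattern(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
stamp = to_datetime(pattern.search('db-2020-02-14.sql'))
print(stamp)
```

Checking `groupindex` when the pattern is assigned means a pattern that lacks a required group is rejected once, up front, instead of failing for every filename during a scan.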

rotate_backups/cli.py

Lines changed: 15 additions & 5 deletions
@@ -1,7 +1,7 @@
11
# rotate-backups: Simple command line interface for backup rotation.
22
#
33
# Author: Peter Odding <peter@peterodding.com>
4-
# Last Change: February 13, 2020
4+
# Last Change: February 14, 2020
55
# URL: https://github.com/xolox/python-rotate-backups
66

77
"""
@@ -69,6 +69,13 @@
6969
Set the number of yearly backups to preserve during rotation. Refer to the
7070
usage of the -H, --hourly option for details about COUNT.
7171
72+
-t, --timestamp-pattern=PATTERN
73+
74+
Customize the regular expression pattern that is used to match and extract
75+
timestamps from filenames. PATTERN is expected to be a Python compatible
76+
regular expression that must define the named capture groups 'year',
77+
'month' and 'day' and may define 'hour', 'minute' and 'second'.
78+
7279
-I, --include=PATTERN
7380
7481
Only process backups that match the shell pattern given by PATTERN. This
@@ -233,11 +240,12 @@ def main():
233240
selected_locations = []
234241
# Parse the command line arguments.
235242
try:
236-
options, arguments = getopt.getopt(sys.argv[1:], 'M:H:d:w:m:y:I:x:jpri:c:C:uS:fnvqh', [
243+
options, arguments = getopt.getopt(sys.argv[1:], 'M:H:d:w:m:y:t:I:x:jpri:c:C:uS:fnvqh', [
237244
'minutely=', 'hourly=', 'daily=', 'weekly=', 'monthly=', 'yearly=',
238-
'include=', 'exclude=', 'parallel', 'prefer-recent', 'relaxed',
239-
'ionice=', 'config=', 'removal-command=', 'use-sudo', 'syslog=',
240-
'force', 'dry-run', 'verbose', 'quiet', 'help',
245+
'timestamp-pattern=', 'include=', 'exclude=', 'parallel',
246+
'prefer-recent', 'relaxed', 'ionice=', 'config=',
247+
'removal-command=', 'use-sudo', 'syslog=', 'force',
248+
'dry-run', 'verbose', 'quiet', 'help',
241249
])
242250
for option, value in options:
243251
if option in ('-M', '--minutely'):
@@ -252,6 +260,8 @@ def main():
252260
rotation_scheme['monthly'] = coerce_retention_period(value)
253261
elif option in ('-y', '--yearly'):
254262
rotation_scheme['yearly'] = coerce_retention_period(value)
263+
elif option in ('-t', '--timestamp-pattern'):
264+
kw['timestamp_pattern'] = value
255265
elif option in ('-I', '--include'):
256266
kw['include_list'].append(value)
257267
elif option in ('-x', '--exclude'):
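
How the new `t:` short option and `timestamp-pattern=` long option reach the `kw` dictionary can be sketched with `getopt` alone. The argument vector below is hypothetical and only a small subset of the real options is declared.

```python
import getopt

# Hypothetical command line; the trailing colon in 't:' and the trailing
# '=' in 'timestamp-pattern=' both mean "this option takes a value".
argv = ['-t', r'(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})', '--daily=7', '/backups']
options, arguments = getopt.getopt(argv, 'd:t:', ['daily=', 'timestamp-pattern='])

kw = {}
for option, value in options:
    if option in ('-t', '--timestamp-pattern'):
        # The raw string is stored as-is; in the commit, coercion to a
        # compiled pattern happens later in the property setter.
        kw['timestamp_pattern'] = value

print(kw, arguments)
```

On a real shell the pattern should be quoted so that characters such as `(`, `?` and `\` reach the program unchanged.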

rotate_backups/tests.py

Lines changed: 32 additions & 1 deletion
@@ -1,7 +1,7 @@
11
# Test suite for the `rotate-backups' Python package.
22
#
33
# Author: Peter Odding <peter@peterodding.com>
4-
# Last Change: February 12, 2020
4+
# Last Change: February 14, 2020
55
# URL: https://github.com/xolox/python-rotate-backups
66

77
"""Test suite for the `rotate-backups` package."""
@@ -455,6 +455,37 @@ def test_filename_patterns(self):
455455
assert any(location.directory == os.path.join(root, 'laptop') for location in available_locations)
456456
assert any(location.directory == os.path.join(root, 'vps') for location in available_locations)
457457

458+
def test_custom_timestamp_pattern(self):
459+
"""Test that custom timestamp patterns are properly supported."""
460+
with TemporaryDirectory(prefix='rotate-backups-', suffix='-test-suite') as root:
461+
custom_backup_filename = os.path.join(root, 'My-File--2009-12-31--23-59-59.txt')
462+
touch(custom_backup_filename)
463+
program = RotateBackups(
464+
rotation_scheme=dict(monthly='always'),
465+
timestamp_pattern=r'''
466+
(?P<year>\d{4}) - (?P<month>\d{2}) - (?P<day>\d{2})
467+
--
468+
(?P<hour>\d{2}) - (?P<minute>\d{2}) - (?P<second>\d{2})
469+
''',
470+
)
471+
backups = program.collect_backups(root)
472+
assert backups[0].pathname == custom_backup_filename
473+
assert backups[0].timestamp.year == 2009
474+
assert backups[0].timestamp.month == 12
475+
assert backups[0].timestamp.day == 31
476+
assert backups[0].timestamp.hour == 23
477+
assert backups[0].timestamp.minute == 59
478+
assert backups[0].timestamp.second == 59
479+
480+
def test_invalid_timestamp_pattern(self):
481+
"""Test that the capture groups in custom timestamp patterns are validated."""
482+
self.assertRaises(
483+
ValueError,
484+
RotateBackups,
485+
rotation_scheme=dict(monthly='always'),
486+
timestamp_pattern=r'(?P<year>\d{4})-(?P<month>\d{2})',
487+
)
488+
458489
def create_sample_backup_set(self, root):
459490
"""Create a sample backup set to be rotated."""
460491
for name in SAMPLE_BACKUP_SET:
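
The two patterns used by the new tests can also be inspected with the standard library alone, without creating files or instantiating `RotateBackups`. This is a sketch, not part of the test suite: it only shows what the regular expressions capture and which required group the second pattern lacks.

```python
import re

# Pattern and filename from test_custom_timestamp_pattern(); with re.VERBOSE
# the spaces and newlines are ignored, the literal '-' characters are not.
CUSTOM = r'''
    (?P<year>\d{4}) - (?P<month>\d{2}) - (?P<day>\d{2})
    --
    (?P<hour>\d{2}) - (?P<minute>\d{2}) - (?P<second>\d{2})
'''
match = re.compile(CUSTOM, re.VERBOSE).search('My-File--2009-12-31--23-59-59.txt')
fields = {name: int(value, 10) for name, value in match.groupdict().items()}

# Pattern from test_invalid_timestamp_pattern(): it compiles fine, but it
# does not define one of the required groups.
incomplete = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})', re.VERBOSE)
missing = [name for name in ('year', 'month', 'day') if name not in incomplete.groupindex]

print(fields)
print(missing)
```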
