Commit 8236a8e

Merge pull request #109 from timvaillancourt/MCB_1.0-bugfix18
MCB_1.0: Use callbacks to check multiprocessing.Pool threads, README updates for release
2 parents a865aa5 + dbc79fc commit 8236a8e

6 files changed: 137 additions and 68 deletions

README.rst

Lines changed: 70 additions & 33 deletions
@@ -4,27 +4,28 @@ MongoDB Consistent Backup Tool - mongodb-consistent-backup
 About
 ~~~~~

-Creates cluster-consistent point-in-time backups of MongoDB via wrapping
-'mongodump'. Backups are remotely-pulled and outputted onto the host
-running the tool.
+Creates cluster-consistent point-in-time backups of MongoDB with optional
+archiving, compression/de-duplication, encryption and upload functionality

 Features
 ~~~~~~~~

 - Works on a single replset (2+ members) or a sharded cluster
 - Auto-discovers healthy members for backup by considering replication
-  lag, replication 'priority' and by preferring 'hidden' members.
+  lag, replication 'priority' and by preferring 'hidden' members
 - Creates cluster-consistent backups across many separate shards
+- 'mongodump' is the default *(and currently only)* backup method. Other methods coming soon!
 - Transparent restore process (*just add --oplogReplay flag to your
   mongorestore command*)
-- Archiving and compression of backups
+- Archiving and compression of backups (*optional*)
 - Block de-duplication and optional AES encryption at rest via `ZBackup <http://zbackup.org/>`__
-  archiving method
-- AWS S3 Secure/HTTPS Multipart backup uploads (*optional*)
+  archiving method (*optional*)
+- `AWS S3 <https://aws.amazon.com/s3/>`__ Secure Multipart backup uploads (*optional*)
 - `Nagios NSCA <https://sourceforge.net/p/nagios/nsca>`__ push
   notification support (*optional*)
 - Modular backup, archiving, upload and notification components
 - Multi-threaded, single executable
+- Auto-scales to number of available CPUs by default

 Current Limitations
 ~~~~~~~~~~~~~~~~~~~
@@ -38,47 +39,56 @@ Requirements:
 ~~~~~~~~~~~~~

 - Backup consistency depends on consistent server time across all
-  hosts. Server time **must be synchronized on all nodes** using ntpd
+  hosts! Server time **must be synchronized on all nodes** using ntpd
   and a consistent time source or virtualization guest agent that
   syncs time
 - Must have 'mongodump' installed and specified if not at default:
   */usr/bin/mongodump*. Even if you do not run MongoDB 3.2+, it is
-  strongly recommended to use MongoDB 3.2+ binaries due to inline
-  compression, parallelism, etc
+  strongly recommended to use MongoDB 3.2+ mongodump binaries due
+  to inline compression and parallelism features
 - Must have Python 2.7 installed

+Releases
+~~~~~~~~
+
+Pre-built release binaries and packages are available on our `GitHub Releases Page <https://github.com/Percona-Lab/mongodb_consistent_backup/releases>`__. We recommend most users deploy mongodb_consistent_backup using these packages.
+
 Build/Install
 ~~~~~~~~~~~~~

-To build on CentOS/RedHat, you wil need the following packages (see
-command):
+To build on CentOS/RedHat, you will need the following packages installed:

 ::

-    yum install python python-devel python-virtualenv gcc git make libffi-devel openssl-devel
+    $ yum install python python-devel python-virtualenv gcc git make libffi-devel openssl-devel

-To install to default '*/usr/local/bin/mongodb-consistent-backup*\ ':
+To build an CentOS/RedHat RPM of the tool *(recommended)*:

 ::

-    cd path/to/mongo_backup
-    make
-    make install
+    $ cd /path/to/mongodb_consistent_backup
+    $ make rpm
+
+To build and install from source *(to default '/usr/local/bin/mongodb-consistent-backup')*:

-Use the PREFIX= variable to change the installation path (*default:
-/usr/local*), ie: ``make PREFIX=/usr install`` to install to:
-'*/usr/bin/mongodb-consistent-backup*\ '.
+::
+
+    $ cd /path/to/mongodb_consistent_backup
+    $ make
+    $ make install
+
+Use the PREFIX= variable to change the installation path (*default: /usr/local*), ie: ``make PREFIX=/usr install`` to install to: '*/usr/bin/mongodb-consistent-backup*'.

 MongoDB Authorization
 ~~~~~~~~~~~~~~~~~~~~~

-If your replset/cluster uses `Authentication <https://docs.mongodb.com/manual/core/authentication>`__, you must add a user with the "backup" and "clusterMonitor" built-in auth roles.
+If your replset/cluster uses `Authentication <https://docs.mongodb.com/manual/core/authentication>`__, you must add a user with the `"backup" <https://docs.mongodb.com/manual/reference/built-in-roles/#backup>`__ and `"clusterMonitor" <https://docs.mongodb.com/manual/reference/built-in-roles/#clusterMonitor>`__ built-in auth roles.

 To create a user, execute the following **replace the 'pwd' field with a secure password!**:

 ::

-    db.createUser({
+    db.getSiblingDB("admin").createUser({
         user: "mongodb_consistent_backup",
         pwd: "PASSWORD-HERE",
         roles: [
@@ -94,36 +104,44 @@ Run a Backup

 **Using Command-Line Flags**

+*Note: username+password is visible in process lists when set using the command-line flags. Use a config file (below) to hide credentials!*
+
 ::

-    $ mongodb-consistent-backup -H mongos1.example.com -P 27018 -u mongodb-consistent-backup -p s3cr3t -n prodwebsite -l /opt/mongobackups
+    $ mongodb-consistent-backup -H mongos1.example.com -P 27018 -u mongodb-consistent-backup -p s3cr3t -n prodwebsite -l /var/lib/mongodb-consistent-backup
     ...
     ...
     $ ls /opt/mongobackups
     prodwebsite

 **Using a Config File**

+The tool supports a YAML-based config file for settings. The config file is loaded first and any additional command-line arguments override the file based config settings.
+
 ::

     $ mongodb-consistent-backup --config /etc/mongodb-consistent-backup.yml
     ...

+An example *(with comments)* of the YAML-based config file is here: `conf/mongodb-consistent-backup.example.conf <conf/mongodb-consistent-backup.example.conf>`__.
+
+A description of all available config settings can also be listed by passing the '--help' flag to the tool.
+
 Restore a Backup
 ~~~~~~~~~~~~~~~~

-The backups are mongorestore compatible. The *--oplogReplay* flag **MUST** be present to replay the oplogs to ensure consistency.
+The backups are mongorestore compatible and stored in a directory per backup. The *--oplogReplay* flag **MUST** be present to replay the oplogs to ensure consistency.

 ::

     $ tar xfvz <shardname>.tar.gz
     ...
-    $ mongorestore --host mongod12.example.com --port 27017 -u admin -p 123456 --oplogReplay --dir /path/to/backup/dump
+    $ mongorestore --host mongod12.example.com --port 27017 -u admin -p 123456 --oplogReplay --dir /var/lib/mongodb-consistent-backup/default/20170424_0000/rs0/dump

 Run as Docker Container (Experimental)
 ~~~~~~~~~~~~~~~~~~~~~~~

-*Note: you need to use persistent volumes to store backups long-term on disk when using Docker. Data in Docker containers is destroyed when the container is deleted.*
+*Note: you need to use persistent volumes to store backups and/or config files long-term when using Docker. Data in Docker containers is destroyed when the container is deleted.*

 **Via Docker Hub**

@@ -143,12 +161,16 @@ Run as Docker Container (Experimental)
 ZBackup Archiving (Optional)
 ~~~~~~~

-`ZBackup <http://zbackup.org/>`__ *(with LZMA compression)* is an optional archive method for mongodb_consistent_backup. This archive method significantly reduces disk usage for backups via deduplication and compression.
+*Note: the ZBackup archive method is not yet compatible with the 'Upload' phase. Disable uploading by setting 'upload.method' to 'none' in the meantime.*
+
+`ZBackup <http://zbackup.org/>`__ *(with LZMA compression)* is an optional archive method for mongodb_consistent_backup. This archive method significantly reduces disk usage for backups via de-duplication and compression.

-ZBackup offers block de-duplication and compression of backups and optionally supports AES-128 encryption at rest. The ZBackup archive method causes backups to be stored via ZBackup at archive time.
+ZBackup offers block de-duplication and compression of backups and optionally supports AES-128 *(CBC mode with PKCS#7 padding)* encryption at rest. The ZBackup archive method causes backups to be stored via ZBackup at archive time.

 To enable, ZBackup must be installed on your system and the 'archive.method' config file variable *(or --archive.method flag=)* must be set to 'zbackup'.

+ZBackup's compression is most efficient when compression is disabled in the backup phase, to do this set 'backup.<method>.compression' to 'none'.
+
 **Install on CentOS/RHEL**

 ::
@@ -161,21 +183,36 @@ To enable, ZBackup must be installed on your system and the 'archive.method' con

     $ apt-get install zbackup

-ZBackup data is stored in a repository directory named *mongodb_consistent_backup-zbackup* and must be restored using a 'zbackup restore ...' command.

-**Get Backup from ZBackup Repo**
+**Get Backup from ZBackup**
+
+ZBackup data is stored in a storage directory named *'mongodb_consistent_backup-zbackup'* and must be restored using a 'zbackup restore ...' command.
+
+::
+
+    $ zbackup restore --password-file /etc/zbackup.passwd /mnt/backup/default/mongodb_consistent_backup-zbackup/backups/20170424_0000.tar | tar -xf
+
+**Delete Backup from ZBackup**
+
+To remove a backup, first delete the .tar file in 'backups' subdir of the ZBackup storage directory. After, run a 'zbackup gc full' garbage collection to remove unused data.

 ::

-    $ zbackup restore --password-file /etc/zbackup.passwd /mnt/backup/default/mongodb_consistent_backup-zbackup/backups/20170424_0000.tar
+    $ rm -f /mnt/backup/default/mongodb_consistent_backup-zbackup/backups/20170424_0000.tar
+    $ zbackup gc full --password-file /etc/zbackup.passwd /mnt/backup/default/mongodb_consistent_backup-zbackup

 Roadmap
 ~~~~~~~

+- More testing: this project has many flows that probably need more in-depth testing. Please submit and bugs and/or bugfixes!
 - "Distributed Mode" for running backup on remote hosts *(vs. only on one host)*
-- Support more notification methods *(Prometheus, PagerDuty, etc)* and upload methods *(Google Cloud Storage, Rsync, etc)*
+- Upload compatibility for ZBackup archive phase *(upload unsupported today)*
+- Backup retention/rotation *(eg: delete old backups)*
+- Support more notification methods *(Prometheus, PagerDuty, etc)*
+- Support more upload methods *(Google Cloud Storage, Rsync, etc)*
 - Support SSL MongoDB connections
-- Unit tests
+- Documentation for running under Docker with persistent volumes
+- Python unit tests

 Contact
 ~~~~~~~

mongodb_consistent_backup/Archive/Tar/Tar.py

Lines changed: 19 additions & 16 deletions
@@ -2,7 +2,8 @@
 import logging

 from copy_reg import pickle
-from multiprocessing import Pool, TimeoutError
+from multiprocessing import Pool
+from time import sleep
 from types import MethodType

 from TarThread import TarThread
@@ -30,20 +31,23 @@ def __init__(self, manager, config, timer, base_dir, backup_dir, **kwargs):
         self._pool = None
         self._pooled = []

+    def done(self, done_dir):
+        if done_dir in self._pooled:
+            logging.debug("Archiving completed for: %s" % done_dir)
+            self._pooled.remove(done_dir)
+        else:
+            raise OperationError("Unexpected response from tar thread: %s" % done_dir)
+
     def wait(self):
         if len(self._pooled) > 0:
             self._pool.close()
-            logging.debug("Waiting for tar threads to stop")
-            while len(self._pooled) > 0:
-                try:
-                    item = self._pooled[0]
-                    path, result = item
-                    result.get(1)
-                    logging.debug("Archiving completed for directory: %s" % path)
-                    self._pooled.remove(item)
-                except TimeoutError:
-                    continue
+            while len(self._pooled):
+                logging.debug("Waiting for %i tar thread(s) to stop" % len(self._pooled))
+                sleep(2)
+            self._pool.terminate()
+            logging.debug("Stopped all tar threads")
             self.stopped = True
+            self.running = False

     def run(self):
         try:
@@ -64,20 +68,19 @@ def run(self):
                     output_file = "%s.tar" % subdir_name
                     if self.do_gzip():
                         output_file = "%s.tgz" % subdir_name
-                    result = self._pool.apply_async(TarThread(subdir_name, output_file, self.do_gzip(), self.verbose, self.binary).run)
-                    self._pooled.append((subdir_name, result))
+                    self._pool.apply_async(TarThread(subdir_name, output_file, self.do_gzip(), self.verbose, self.binary).run, callback=self.done)
+                    self._pooled.append(subdir_name)
             except Exception, e:
                 self._pool.terminate()
                 logging.fatal("Could not create tar archiving thread! Error: %s" % e)
                 raise Error(e)
             finally:
                 self.wait()
-            self.completed = True
+                self.completed = True

     def close(self, code=None, frame=None):
-        logging.debug("Stopping tar archiving threads")
         if not self.stopped and self._pool is not None:
+            logging.debug("Stopping tar archiving threads")
             self._pool.terminate()
-            self._pool.join()
             logging.info("Stopped all tar archiving threads")
             self.stopped = True
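
The callback bookkeeping that this change introduces in Tar.py can be sketched in isolation roughly as follows. This is a minimal illustration, not the project's code: `PooledArchiver`, `submit` and `archive` are hypothetical names, and `multiprocessing.pool.ThreadPool` stands in for `Pool` so the example needs no pickling setup (both accept `apply_async(..., callback=...)`). Unlike the diff, the sketch registers each name before submitting the task and guards the list with a lock, on the assumption that a fast worker could otherwise report back before its name is recorded.

```python
import logging
import threading
from multiprocessing.pool import ThreadPool
from time import sleep


def archive(directory):
    # Placeholder for the real work; it returns its input so the
    # callback can match the result against the pending list.
    return directory


class PooledArchiver(object):
    """Hypothetical stand-in for the archiver class, for illustration only."""

    def __init__(self, workers=2):
        self._pool = ThreadPool(processes=workers)
        self._pooled = []          # names of tasks still outstanding
        self._lock = threading.Lock()
        self.completed = []

    def done(self, done_dir):
        # Invoked by the pool's result-handler thread when a task returns.
        with self._lock:
            if done_dir in self._pooled:
                self._pooled.remove(done_dir)
                self.completed.append(done_dir)
            else:
                logging.error("Unexpected response from worker: %s", done_dir)

    def submit(self, directory):
        # Record the task first so done() never sees an unknown name.
        with self._lock:
            self._pooled.append(directory)
        self._pool.apply_async(archive, (directory,), callback=self.done)

    def wait(self, poll_secs=0.05):
        # Stop accepting work, then poll until every callback has fired.
        self._pool.close()
        while True:
            with self._lock:
                if not self._pooled:
                    break
            sleep(poll_secs)
        self._pool.join()
```

One caveat applies to this pattern in general: the success callback only fires when a task returns normally, so a worker that raises would leave its name in the pending list and the polling loop would not exit. On Python 3, `apply_async` also accepts an `error_callback` that could cover that case.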

mongodb_consistent_backup/Archive/Tar/TarThread.py

Lines changed: 1 addition & 0 deletions
@@ -50,3 +50,4 @@ def run(self):
             else:
                 logging.fatal("Output file: %s already exists!" % self.output_file)
                 sys.exit(1)
+            return self.backup_dir

mongodb_consistent_backup/Common/LocalCommand.py

Lines changed: 1 addition & 1 deletion
@@ -58,6 +58,6 @@ def run(self):
         # return exit code from mongodump
         return self._process.returncode

-    def close(self):
+    def close(self, frame=None, code=None):
         if self._process:
             self._process.terminate()
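
The optional arguments added to `close()` here and in the other classes suggest that these methods may be registered as signal handlers, which Python calls as `handler(signum, frame)`. The diff does not show any registration, so that is an assumption. A minimal sketch of a method usable both ways, with `Closable` and `register` as hypothetical names:

```python
import signal


class Closable(object):
    """Hypothetical component; only the close() signature mirrors the diff."""

    def __init__(self):
        self.stopped = False
        self.last_signal = None

    def close(self, code=None, frame=None):
        # Signal handlers receive (signal number, current stack frame).
        # Defaulting both to None keeps a plain close() call working too.
        if not self.stopped:
            self.last_signal = code
            self.stopped = True


def register(component, signum=signal.SIGTERM):
    # Assumption: how the project wires up handlers is not shown in the diff.
    # signal.signal() must be called from the main thread.
    return signal.signal(signum, component.close)
```

Because the interpreter passes these arguments positionally (signal number first, frame second), the differing parameter order between `LocalCommand.close(frame, code)` and the `close(code, frame)` used elsewhere in this commit affects only the names, not the call itself.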

mongodb_consistent_backup/Oplog/Resolver/Resolver.py

Lines changed: 37 additions & 7 deletions
@@ -5,6 +5,7 @@
 from bson.timestamp import Timestamp
 from copy_reg import pickle
 from multiprocessing import Pool
+from time import sleep
 from types import MethodType

 from ResolverThread import ResolverThread
@@ -34,16 +35,23 @@ def __init__(self, manager, config, timer, base_dir, backup_dir, tailed_oplogs,
         self.resolver_summary = {}
         self.resolver_state = {}

+        self.running = False
+        self.stopped = False
+        self.completed = False
+        self._pool = None
+        self._pooled = []
         try:
             self._pool = Pool(processes=self.threads(None, 2))
         except Exception, e:
             logging.fatal("Could not start oplog resolver pool! Error: %s" % e)
             raise Error(e)

-    def close(self):
-        if self._pool:
+    def close(self, code=None, frame=None):
+        if self._pool and not self.stopped:
+            logging.debug("Stopping all oplog resolver threads")
             self._pool.terminate()
-            self._pool.join()
+            logging.info("Stopped all oplog resolver threads")
+            self.stopped = True

     def get_consistent_end_ts(self):
         ts = None
@@ -54,9 +62,28 @@ def get_consistent_end_ts(self):
                     ts = Timestamp(instance['last_ts'].time, 0)
         return ts

+    def done(self, done_uri):
+        if done_uri in self._pooled:
+            logging.debug("Resolving completed for: %s" % done_uri)
+            self._pooled.remove(done_uri)
+        else:
+            raise OperationError("Unexpected response from resolver thread: %s" % done_uri)
+
+    def wait(self):
+        if len(self._pooled) > 0:
+            self._pool.close()
+            while len(self._pooled):
+                logging.debug("Waiting for %i oplog resolver thread(s) to stop" % len(self._pooled))
+                sleep(2)
+            self._pool.terminate()
+            logging.debug("Stopped all oplog resolve threads")
+            self.stopped = True
+            self.running = False
+
     def run(self):
-        logging.info("Resolving oplogs (options: threads=%s,compression=%s)" % (self.threads(), self.compression()))
+        logging.info("Resolving oplogs (options: threads=%s, compression=%s)" % (self.threads(), self.compression()))
         self.timer.start(self.timer_name)
+        self.running = True

         for shard in self.backup_oplogs:
             backup_oplog = self.backup_oplogs[shard]
@@ -80,14 +107,17 @@ def run(self):
                         backup_oplog.copy(),
                         self.get_consistent_end_ts(),
                         self.do_gzip()
-                    ).run)
+                    ).run, callback=self.done)
+                    self._pooled.append(uri.str())
                 except Exception, e:
                     logging.fatal("Resolve failed for %s! Error: %s" % (uri, e))
                     raise Error(e)
             else:
                 logging.info("No tailed oplog for host %s" % uri)
-        self._pool.close()
-        self._pool.join()
+        self.wait()
+        self.running = False
+        self.stopped = True
+        self.completed = True

         self.timer.stop(self.timer_name)
         logging.info("Oplog resolving completed in %.2f seconds" % self.timer.duration(self.timer_name))
