Commit b6c4649

Fixes for migrator to allow for incremental updates and update the link in the email to the new review app
1 parent b01078f commit b6c4649

14 files changed

+308
-60
lines changed

migrator/README.md

Lines changed: 53 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -70,15 +70,59 @@
7070
CREATED_BY=my-migrator node src/index.js member-profiles
7171
```
7272

73+
## 🕒 Incremental Migration with Start Date
74+
75+
The `--start-date` parameter unlocks incremental migrations by filtering records on their creation and update timestamps. Run a full import in advance, then rerun with a start date to pick up only the recent changes—keeping the final cut-over fast and predictable.
76+
77+
### 🛠️ Usage Examples
78+
79+
```bash
80+
node src/index.js resources ./data/challenge-api.resources.json --start-date=2024-01-15
81+
node src/index.js resource-roles --start-date=2024-03-01
82+
node src/migrateAll.js --start-date=2024-02-20
83+
```
84+
85+
### 🔍 Filtering Behavior
86+
87+
- Records are imported when `created`/`createdAt` **or** `updatedAt` is on or after the provided start date
88+
- Records missing **both** date fields are skipped when a start date is supplied
89+
- Records are also skipped when every date field that is present is older than the start date (a missing field never qualifies a record on its own)
90+
- Dates must use the ISO `YYYY-MM-DD` format
91+
- Without `--start-date`, all records are imported (default behavior)
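
The rules above can be condensed into a single predicate. The following is a minimal sketch, not the code from this commit: `toDate` and `shouldImport` are hypothetical names, and the sketch assumes a record exposes `created` or `createdAt` plus `updatedAt` as described in this section.

```javascript
// Hypothetical helpers for illustration; the migrators in this commit inline equivalent logic.
function toDate(value) {
  const d = value ? new Date(value) : null;
  // Treat unparseable timestamps the same as missing ones.
  return d && !Number.isNaN(d.getTime()) ? d : null;
}

function shouldImport(record, startDate) {
  if (!startDate) return true; // no --start-date: import everything
  const start = new Date(startDate);
  const created = toDate(record.created ?? record.createdAt);
  const updated = toDate(record.updatedAt);
  // Import when either timestamp is on or after the start date.
  return Boolean((created && created >= start) || (updated && updated >= start));
}

console.log(shouldImport({ createdAt: '2024-01-10T00:00:00Z' }, '2024-01-15')); // false
console.log(shouldImport({ created: '2024-01-10T00:00:00Z', updatedAt: '2024-02-01T00:00:00Z' }, '2024-01-15')); // true
```

A record with no usable timestamp returns `false` whenever a start date is given, which matches the "missing both date fields" rule.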
92+
93+
### 🔄 Incremental Migration Workflow
94+
95+
1. **Bulk load ahead of time (Jan 1)** — seed the database without a start date:
96+
97+
```bash
98+
node src/migrateAll.js
99+
```
100+
101+
2. **Wait until the cut-over window (Mar 15)** — allow upstream systems to keep producing data as usual.
102+
103+
3. **Refresh with recent updates** — bring the target in sync by importing only records touched since Jan 1:
104+
105+
```bash
106+
node src/migrateAll.js --start-date=2024-01-01
107+
```
108+
109+
This staggered approach turns a multi-hour bulk migration into a quick incremental catch-up that typically completes in minutes.
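
The workflow only catches every change if the start date is no later than the moment the bulk load began. One way to derive the flag is sketched below, assuming you recorded when the bulk load started; `toStartDateArg` and the sample timestamp are hypothetical and not part of the migrator.

```javascript
// Hypothetical helper: format a Date as the YYYY-MM-DD value expected by --start-date.
// toISOString() always reports UTC, so the result is the UTC calendar day.
function toStartDateArg(date) {
  return `--start-date=${date.toISOString().slice(0, 10)}`;
}

// Assumed timestamp of the earlier bulk load (made up for this example).
const bulkLoadStartedAt = new Date('2024-01-01T08:30:00Z');
console.log(toStartDateArg(bulkLoadStartedAt)); // --start-date=2024-01-01
```

Rounding down to the start of the day re-imports a few records that were already loaded, which is the safer direction for a catch-up run.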
110+
111+
### 📌 Model Support
112+
113+
- `resources`: filters on `created` and `updatedAt`
114+
- `resource-roles`: filters on `createdAt` and `updatedAt`
115+
- `resource-role-phase-dependencies`: filters on `createdAt` and `updatedAt`
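
Because `resources` uses `created` while the other two models use `createdAt`, a per-model lookup keeps the filter logic uniform. The sketch below takes the field names from the list above; the `DATE_FIELDS` table and `pickTimestamps` function are illustrative and do not appear in the commit.

```javascript
// Field names come from the Model Support list; the lookup table itself is hypothetical.
const DATE_FIELDS = {
  'resources': { created: 'created', updated: 'updatedAt' },
  'resource-roles': { created: 'createdAt', updated: 'updatedAt' },
  'resource-role-phase-dependencies': { created: 'createdAt', updated: 'updatedAt' },
};

function pickTimestamps(model, record) {
  const fields = DATE_FIELDS[model];
  if (!fields) throw new Error(`No start-date support for model: ${model}`);
  return {
    created: record[fields.created] ?? null,
    updated: record[fields.updated] ?? null,
  };
}

console.log(pickTimestamps('resources', { created: '2024-01-10', updatedAt: '2024-02-01' }));
```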
116+
73117
## 🧩 Available Migration Steps
74118

75-
| Step | Auto Strategy | Description |
76-
|-------------------------------------|---------------|---------------------------------------------------------------------------------------------------|
77-
| `member-profiles` || Auto strategy: uses `stream-json` (batch) for files larger than 3MB, and `loadJSON` (simple) otherwise |
78-
| `member-stats` || Auto strategy: uses `stream-json` (batch) for files larger than 3MB, and `loadJSON` (simple) otherwise |
79-
| `resource-roles` || Simple in-memory migration using `loadJSON`, not expected to be large |
80-
| `resource-role-phase-dependencies` || Simple in-memory migration using `loadJSON`, not expected to be large |
81-
| `resources` || Auto strategy for NDJSON files: uses `readline` + batch for files > 3 MB, otherwise simple line-by-line |
119+
| Step | Auto Strategy | Start Date Support | Description |
120+
|-------------------------------------|---------------|--------------------|---------------------------------------------------------------------------------------------------|
121+
| `member-profiles` || | Auto strategy: uses `stream-json` (batch) for files larger than 3MB, and `loadJSON` (simple) otherwise |
122+
| `member-stats` || | Auto strategy: uses `stream-json` (batch) for files larger than 3MB, and `loadJSON` (simple) otherwise |
123+
| `resource-roles` || | Simple in-memory migration using `loadJSON`, not expected to be large |
124+
| `resource-role-phase-dependencies` || | Simple in-memory migration using `loadJSON`, not expected to be large |
125+
| `resources` || | Auto strategy for NDJSON files: uses `readline` + batch for files > 3 MB, otherwise simple line-by-line |
82126

83127
> ⚙️ **Why Auto Strategy?**
84128
>
@@ -110,9 +154,10 @@ You can now run all migration steps sequentially using the following script:
110154
111155
```bash
112156
node src/migrateAll.js
157+
node src/migrateAll.js --start-date=2024-01-15
113158
```
114159
115-
This script will automatically execute each step in order (`resource-roles`, `resource-role-phase-dependencies`, `resources`), logging progress and duration for each. Ideal for full dataset migration in one command.
160+
This script will automatically execute each step in order (`resource-roles`, `resource-role-phase-dependencies`, `resources`), logging progress and duration for each. When provided, the start date is passed to every step, so only records touched on or after the target date are processed—ideal for full imports followed by a quick incremental catch-up in one command.
116161
117162
## 📒 Error Logs
118163
All failed migrations are logged under the `logs/` folder by model:
@@ -224,5 +269,3 @@ node src/validation/validateMemberProfiles.js > logs/memberprofile_validation.lo
224269
225270
---
226271
227-
228-

migrator/src/index.js

Lines changed: 31 additions & 10 deletions
@@ -14,28 +14,49 @@ const { migrateResourceAuto } = require('./migrators/migrateResourceAuto'); //
1414
// Map each step name to its corresponding function.
1515
// Each function logs execution time and calls its migrator.
1616
const steps = {
17-
'resource-roles': async (filePath) => {
17+
'resource-roles': async (filePath, startDate) => {
1818
const start = Date.now();
19-
await migrateResourceRoleAuto(filePath);
19+
await migrateResourceRoleAuto(filePath, startDate);
2020
console.log(`⏱️ Duration: ${((Date.now() - start) / 1000).toFixed(2)}s.`);
2121
},
22-
'resource-role-phase-dependencies': async (filePath) => {
22+
'resource-role-phase-dependencies': async (filePath, startDate) => {
2323
const start = Date.now();
24-
await migrateResourceRolePhaseDependencyAuto(filePath);
24+
await migrateResourceRolePhaseDependencyAuto(filePath, startDate);
2525
console.log(`⏱️ Duration: ${((Date.now() - start) / 1000).toFixed(2)}s.`);
2626
},
27-
'resources': async (filePath) => {
27+
'resources': async (filePath, startDate) => {
2828
const start = Date.now();
29-
await migrateResourceAuto(filePath);
29+
await migrateResourceAuto(filePath, startDate);
3030
console.log(`⏱️ Duration: ${((Date.now() - start) / 1000).toFixed(2)}s.`);
3131
}
3232
};
3333

3434
// === EXECUTION ENTRYPOINT ===
3535
// Determines which migration step to execute and handles file path input
3636
(async () => {
37-
const step = process.argv[2]; // First argument: migration step name
38-
const customPath = process.argv[3]; // Second argument (optional): custom file path
37+
const args = process.argv.slice(2);
38+
const step = args[0];
39+
let customPath = null;
40+
let startDate = null;
41+
const startDatePattern = /^\d{4}-\d{2}-\d{2}$/;
42+
43+
for (const arg of args.slice(1)) {
44+
if (arg.startsWith('--start-date=')) {
45+
const value = arg.split('=')[1];
46+
if (!value || !startDatePattern.test(value)) {
47+
console.error('❌ Invalid --start-date format. Expected YYYY-MM-DD.');
48+
process.exit(1);
49+
}
50+
const parsedDate = new Date(value);
51+
if (isNaN(parsedDate.getTime())) {
52+
console.error('❌ Invalid --start-date value.');
53+
process.exit(1);
54+
}
55+
startDate = value;
56+
} else if (!arg.startsWith('-') && !customPath) {
57+
customPath = arg;
58+
}
59+
}
3960

4061
// Default file paths for each step
4162
const defaultPaths = {
@@ -47,7 +68,7 @@ const steps = {
4768
// Show help if step is invalid
4869
if (!steps[step]) {
4970
console.log('❌ Invalid migration step.\nUsage:');
50-
console.log(' node src/index.js <step-name> [custom-path]');
71+
console.log(' node src/index.js <step-name> [custom-path] [--start-date=YYYY-MM-DD]');
5172
console.log('\nAvailable steps:');
5273
console.log(Object.keys(steps).map(s => ` - ${s}`).join('\n'));
5374
process.exit(1);
@@ -64,7 +85,7 @@ const steps = {
6485
// Run selected migration step
6586
try {
6687
console.log(`🚀 Starting ${step} migration from ${filePath}`);
67-
await steps[step](filePath);
88+
await steps[step](filePath, startDate);
6889
console.log(`Step '${step}' completed successfully.`);
6990
} catch (error) {
7091
console.error(`❌ Error during '${step}':`, error.message);
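
The argument loop in this file mixes parsing with `process.exit`, which makes it awkward to unit test. A possible pure variant is sketched below; `parseCliArgs` is a hypothetical name, and it returns an `error` field instead of exiting so the caller decides how to fail.

```javascript
const START_DATE_PATTERN = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical pure variant of the parsing loop: no console output, no process.exit.
function parseCliArgs(args) {
  const result = { step: args[0], customPath: null, startDate: null, error: null };
  for (const arg of args.slice(1)) {
    if (arg.startsWith('--start-date=')) {
      const value = arg.slice('--start-date='.length);
      if (!START_DATE_PATTERN.test(value) || Number.isNaN(new Date(value).getTime())) {
        result.error = 'Invalid --start-date. Expected YYYY-MM-DD.';
        return result;
      }
      result.startDate = value;
    } else if (!arg.startsWith('-') && !result.customPath) {
      // First non-flag argument after the step name is the custom file path.
      result.customPath = arg;
    }
  }
  return result;
}

console.log(parseCliArgs(['resources', './data/challenge-api.resources.json', '--start-date=2024-01-15']));
```

The entrypoint could then print `result.error` and exit with code 1, keeping the same behavior from the user's point of view.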

migrator/src/migrateAll.js

Lines changed: 15 additions & 1 deletion
@@ -22,7 +22,21 @@ const steps = [
2222
];
2323

2424
(async () => {
25+
const startDateArg = process.argv.find(arg => arg.startsWith('--start-date='));
26+
let startDate = null;
27+
if (startDateArg) {
28+
startDate = startDateArg.split('=')[1];
29+
const startDatePattern = /^\d{4}-\d{2}-\d{2}$/;
30+
if (!startDate || !startDatePattern.test(startDate)) {
31+
console.error('❌ Invalid --start-date format. Expected YYYY-MM-DD.');
32+
process.exit(1);
33+
}
34+
}
35+
2536
console.log('🚀 Starting full migration (all steps)...\n');
37+
if (startDate) {
38+
console.log(`🔁 Incremental mode enabled. Filtering records from ${startDate}.\n`);
39+
}
2640

2741
for (const step of steps) {
2842
const filePath = path.resolve(defaultPaths[step.name]);
@@ -36,7 +50,7 @@ const steps = [
3650
const start = Date.now();
3751

3852
try {
39-
await step.fn(filePath);
53+
await step.fn(filePath, startDate);
4054
const duration = ((Date.now() - start) / 1000).toFixed(2);
4155
console.log(`✅ '${step.name}' completed in ${duration}s\n`);
4256
} catch (error) {
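
As shown in this diff, `migrateAll.js` checks only the `YYYY-MM-DD` pattern, so a string such as `2024-13-45` passes the regex even though it is not a real date. A stricter check could round-trip the value through `Date`, as in this sketch; `isRealCalendarDate` is a hypothetical helper, not part of the commit.

```javascript
// Hypothetical stricter check: the string must match the pattern and survive
// a round trip through Date, which rejects impossible months and days.
function isRealCalendarDate(value) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

console.log(isRealCalendarDate('2024-02-29')); // true (2024 is a leap year)
console.log(isRealCalendarDate('2024-13-45')); // false
```

The round-trip comparison matters because some engines roll an out-of-range day over into the next month instead of returning an invalid date.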

migrator/src/migrators/migrateResource.js

Lines changed: 26 additions & 2 deletions
@@ -10,7 +10,7 @@ const fs = require('fs');
1010
const readline = require('readline');
1111
const prisma = require('../clients/prismaClient');
1212

13-
async function migrateResource(filePath) {
13+
async function migrateResource(filePath, startDate) {
1414
const fileStream = fs.createReadStream(filePath);
1515
const rl = readline.createInterface({
1616
input: fileStream,
@@ -19,13 +19,37 @@ async function migrateResource(filePath) {
1919

2020
let successCount = 0;
2121
let failCount = 0;
22+
let skippedCount = 0;
23+
const startDateObj = startDate ? new Date(startDate) : null;
2224

2325
for await (const line of rl) {
2426
if (!line.trim()) continue;
2527

2628
try {
2729
const jsonLine = JSON.parse(line);
2830
const data = jsonLine._source;
31+
32+
const createdRaw = data.created;
33+
const updatedRaw = data.updatedAt;
34+
35+
let createdDate = createdRaw ? new Date(createdRaw) : null;
36+
if (createdDate && isNaN(createdDate.getTime())) {
37+
createdDate = null;
38+
}
39+
40+
let updatedDate = updatedRaw ? new Date(updatedRaw) : null;
41+
if (updatedDate && isNaN(updatedDate.getTime())) {
42+
updatedDate = null;
43+
}
44+
45+
const createdBeforeOrMissing = !createdDate || (startDateObj ? createdDate < startDateObj : false);
46+
const updatedBeforeOrMissing = !updatedDate || (startDateObj ? updatedDate < startDateObj : false);
47+
48+
if (startDateObj && createdBeforeOrMissing && updatedBeforeOrMissing) {
49+
skippedCount++;
50+
continue;
51+
}
52+
2953
const createdBy = data.createdBy || process.env.CREATED_BY;
3054
const phaseChangeNotifications = Object.prototype.hasOwnProperty.call(data, 'phaseChangeNotifications')
3155
? data.phaseChangeNotifications
@@ -72,7 +96,7 @@ async function migrateResource(filePath) {
7296
}
7397
}
7498

75-
console.log(`✅ Resource migration finished: ${successCount} success, ${failCount} failed`);
99+
console.log(`✅ Resource migration finished: ${successCount} success, ${failCount} failed, ${skippedCount} skipped`);
76100
}
77101

78102
module.exports = { migrateResource };
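
The filter compares `Date` objects, so the cut-off is midnight UTC of the given day rather than local midnight. The short sketch below illustrates the boundary; the sample timestamps are made up for the example.

```javascript
// A date-only ISO string is interpreted as midnight UTC by the Date constructor.
const start = new Date('2024-01-15');
console.log(start.toISOString()); // 2024-01-15T00:00:00.000Z

// Sample timestamps (invented for illustration) on either side of the boundary.
const justBefore = new Date('2024-01-14T23:59:59Z');
const withOffset = new Date('2024-01-14T23:30:00-05:00'); // 04:30 UTC on Jan 15

console.log(justBefore < start); // true  -> counts as before the start date
console.log(withOffset < start); // false -> on or after the start date
```

If source timestamps carry local offsets, picking a start date one day earlier than strictly needed avoids missing records near the boundary.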

migrator/src/migrators/migrateResourceAuto.js

Lines changed: 3 additions & 3 deletions
@@ -16,16 +16,16 @@ const { migrateResource: migrateBatch } = require('./migrateResourceBatch');
1616

1717
const FILE_SIZE_THRESHOLD = 3 * 1024 * 1024; // 3 MB
1818

19-
async function migrateResourceAuto(filePath) {
19+
async function migrateResourceAuto(filePath, startDate) {
2020
const stats = fs.statSync(filePath);
2121
const fileSize = stats.size;
2222

2323
if (fileSize < FILE_SIZE_THRESHOLD) {
2424
// Using normal migration (in-memory)
25-
await migrateResource(filePath);
25+
await migrateResource(filePath, startDate);
2626
} else {
2727
// Using batch migration (streaming)
28-
await migrateBatch(filePath);
28+
await migrateBatch(filePath, startDate);
2929
}
3030
}
3131

migrator/src/migrators/migrateResourceBatch.js

Lines changed: 31 additions & 1 deletion
@@ -13,7 +13,12 @@ const prisma = require('../clients/prismaClient');
1313
const { countFileLines } = require('../utils/countFileLines');
1414
const { createSimpleProgressBar } = require('../utils/progressLogger');
1515

16-
async function migrateResource(filePath) {
16+
const parseTimestamp = (value) => {
17+
const t = value ? new Date(value) : null;
18+
return t && !isNaN(t.getTime()) ? t : null;
19+
};
20+
21+
async function migrateResource(filePath, startDate) {
1722
  // Estimate the total number of lines in the NDJSON file
1823
const totalRecords = await countFileLines(filePath);
1924
const progress = createSimpleProgressBar(Math.ceil(totalRecords / 100));
@@ -28,6 +33,14 @@ async function migrateResource(filePath) {
2833
let batch = [];
2934
let successCount = 0;
3035
let failCount = 0;
36+
let skippedCount = 0;
37+
const startDateObj = startDate ? new Date(startDate) : null;
38+
const isInvalidStartDate = startDate && isNaN(startDateObj.getTime());
39+
const filterStartDate = !isInvalidStartDate ? startDateObj : null;
40+
41+
if (isInvalidStartDate) {
42+
console.warn('migrateResource: invalid startDate provided; disabling date filter.', startDate);
43+
}
3144

3245
async function processBatch(batch) {
3346
const results = await Promise.allSettled(
@@ -83,6 +96,20 @@ async function migrateResource(filePath) {
8396
try {
8497
const jsonLine = JSON.parse(line);
8598
const data = jsonLine._source;
99+
if (filterStartDate) {
100+
const createdDate = parseTimestamp(data.created);
101+
const updatedAtDate = parseTimestamp(data.updatedAt);
102+
103+
const shouldSkip =
104+
(!createdDate && !updatedAtDate) ||
105+
(((createdDate && createdDate < filterStartDate) || !createdDate) &&
106+
((updatedAtDate && updatedAtDate < filterStartDate) || !updatedAtDate));
107+
108+
if (shouldSkip) {
109+
skippedCount++;
110+
continue;
111+
}
112+
}
86113
batch.push(data);
87114

88115
if (batch.length >= batchSize) {
@@ -102,6 +129,9 @@ async function migrateResource(filePath) {
102129

103130
progress.done();
104131
console.log(`✅ Resource migration finished: ${successCount} success, ${failCount} failed`);
132+
if (skippedCount > 0) {
133+
console.log(`ℹ️ Resource migration skipped ${skippedCount} record(s) before the start date`);
134+
}
105135
}
106136

107137
module.exports = { migrateResource };
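
The `shouldSkip` expression in the batch migrator appears to reduce to a shorter form, because its first clause is already covered by the second. The sketch below compares the two over a small set of sample values; `asWritten` and `simplified` are names chosen for this example, and the simplification is a suggestion rather than something the commit applies.

```javascript
// Condition as written in the batch migrator (c = created, u = updatedAt, s = start date).
const asWritten = (c, u, s) =>
  (!c && !u) || (((c && c < s) || !c) && ((u && u < s) || !u));

// Candidate simplification: each field is either missing or before the start date.
const simplified = (c, u, s) => (!c || c < s) && (!u || u < s);

const s = new Date('2024-01-15');
const samples = [null, new Date('2024-01-01'), new Date('2024-01-15'), new Date('2024-02-01')];

// Exhaustively compare both forms over every pair of sample values.
for (const c of samples) {
  for (const u of samples) {
    if (Boolean(asWritten(c, u, s)) !== Boolean(simplified(c, u, s))) {
      throw new Error('conditions disagree');
    }
  }
}
console.log('both conditions agree on all sample pairs');
```

The shorter form also mirrors the `createdBeforeOrMissing && updatedBeforeOrMissing` check in `migrateResource.js`, which would make the two migrators easier to keep in sync.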
