Commit bb9e3c3

Merge branch 'develop' of github.com:topcoder-platform/resource-api-v6 into develop

2 parents 6c21063 + dd020f5

File tree: 70 files changed, +11828 −32176 lines


.DS_Store (binary, 0 bytes, not shown)

ReadMe.md — 5 additions & 4 deletions

````diff
@@ -68,7 +68,7 @@ Configuration for testing is at `config/test.js`, only add such new configuratio
 - COPILOT_CREDENTIALS_PASSWORD: The user's password with copilot role
 - USER_CREDENTIALS_USERNAME: The user's username with user role
 - USER_CREDENTIALS_PASSWORD: The user's password with user role
-- AUTOMATED_TESTING_REPORTERS_FORMAT: indicates reporters format. It is an array of the formats. e.g. `['html']` produces html format. `['cli', 'json', 'junit', 'html']` is the full format.
+- AUTOMATED_TESTING_REPORTERS_FORMAT: indicates reporters format. It is an array of the formats. e.g. `['html']` produces html format. `['cli', 'json', 'junit', 'html']` is the full format.
 *For the details of the supported format, please refer to https://www.npmjs.com/package/newman#reporters*.

 ## Available commands
@@ -106,15 +106,16 @@ You can also use docker to start it directly.
 ```bash
 docker pull postgres:16.8

-docker run -d --name resourcedb -p 5432:5432 \
--e POSTGRES_USER=johndoe -e POSTGRES_DB=resourcedb \
+docker run -d --name resourceapi -p 5532:5432 \
+-e POSTGRES_USER=johndoe -e POSTGRES_DB=resourceapi \
 -e POSTGRES_PASSWORD=mypassword \
 postgres:16.8
 ```

 After that, please run
 ```bash
-export DATABASE_URL="postgresql://johndoe:mypassword@localhost:5432/resourcedb?schema=public&statement_timeout=60000"
+export DATABASE_URL="postgresql://johndoe:mypassword@localhost:5532/resourceapi?schema=public&statement_timeout=60000"
+export MEMBER_DB_URL="postgresql://johndoe:mypassword@localhost:5632/memberdb"
 ```

 ### Create Tables
````
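The new connection string can be sanity-checked with Node's built-in WHATWG URL parser before exporting it. This is just a sketch; the credentials are the sample values from the README, not real secrets. Note the host port 5532, which `docker run -p 5532:5432` maps to the container's default 5432:

```javascript
// Parse the sample DATABASE_URL and confirm each component points where
// you expect it to.
const dbUrl = new URL(
  'postgresql://johndoe:mypassword@localhost:5532/resourceapi?schema=public&statement_timeout=60000'
)

console.log(dbUrl.username)                              // johndoe
console.log(dbUrl.port)                                  // 5532
console.log(dbUrl.pathname.slice(1))                     // resourceapi
console.log(dbUrl.searchParams.get('statement_timeout')) // 60000
```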

app-bootstrap.js — 3 additions & 0 deletions

```diff
@@ -10,3 +10,6 @@ Joi.optionalId = () => Joi.string().uuid()
 Joi.id = () => Joi.optionalId().required()
 Joi.page = () => Joi.number().integer().min(1).default(1)
 Joi.perPage = () => Joi.number().integer().min(1).max(10000).default(config.DEFAULT_PAGE_SIZE)
+
+// eslint-disable-next-line
+BigInt.prototype.toJSON = function () { return this.toString() }
```
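The `BigInt.prototype.toJSON` patch above matters because `JSON.stringify` throws `TypeError: Do not know how to serialize a BigInt` on any BigInt value, which would break API responses for BigInt database columns. A standalone demonstration of the same patch:

```javascript
// JSON.stringify consults toJSON() on each value; BigInt has none by
// default, so stringifying a BigInt throws. Adding one globally makes
// BigInt values serialize as strings.

// eslint-disable-next-line no-extend-native
BigInt.prototype.toJSON = function () { return this.toString() }

const payload = { id: 9007199254740993n, name: 'resource' }
console.log(JSON.stringify(payload))
// → {"id":"9007199254740993","name":"resource"}
```

Serializing as a string is the safe choice here, since values above `Number.MAX_SAFE_INTEGER` (like the one above) would lose precision as plain JSON numbers.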

docs/topcoder-challenge-resource-api.postman_environment.json — 1 addition & 1 deletion

```diff
@@ -105,7 +105,7 @@
 {
 "enabled": true,
 "key": "MEMBER_ID",
-"value": "16096823",
+"value": "22742764",
 "type": "text"
 },
 {
```

env.sh — 3 additions & 1 deletion

```diff
@@ -26,4 +26,6 @@ export USER_CREDENTIALS_PASSWORD=

 export AUTH_SECRET=

-export DATABASE_URL="postgresql://johndoe:mypassword@localhost:5432/resourcedb?schema=public&statement_timeout=60000"
+export DATABASE_URL="postgresql://johndoe:mypassword@localhost:5532/resourceapi?schema=public&statement_timeout=60000"
+
+export MEMBER_DB_URL="postgresql://johndoe:mypassword@localhost:5632/memberdb"
```

migrator/.DS_Store (binary, 6 KB, not shown)

migrator/README.md — 118 additions & 50 deletions (whitespace-only changes on blank lines omitted for readability)

````diff
@@ -1,79 +1,77 @@
 # Topcoder Resources Data Migration Tool

 This tool is designed to **migrate data from DynamoDB (JSON format) to PostgreSQL** using **Prisma ORM**. It covers five key models of the Topcoder Resources API:

-- `MemberProfile`
-- `MemberStats`
 - `ResourceRole`
 - `ResourceRolePhaseDependency`
 - `Resource`

 ## 📦 Technologies Used
 - **Node.js** (backend scripting)
 - **Prisma ORM** (PostgreSQL schema management)
 - **PostgreSQL 16.3** (Dockerized database)
 - **Docker & Docker Compose** (for DB setup)
 - **stream-json / readline** (for streaming JSON migration)
 - **Jest** (unit testing framework)

 ## ⚙️ Environment Configuration
 Create a `.env` file in the root directory:

 ```env
 DATABASE_URL="postgresql://postgres:postgres@localhost:5432/resourcesdb"
 CREATED_BY="resources-api-db-migration"
 ```

 > The `CREATED_BY` field can be overridden at runtime:
 ```bash
 CREATED_BY=eduardo node src/index.js member-stats ./data/MemberStats_test.json
 ```

 ## 🚀 How to Run

 This tool expects a running PostgreSQL instance defined in `docker-compose.yml`.

 1. Clone the repo and install dependencies:

 ```bash
 npm install
 ```

 2. Start PostgreSQL with Docker Compose:

 ```bash
 docker-compose up -d
 ```

 To tear it down completely (including the volume):

 ```bash
 docker-compose down -v
 ```

 > The database runs on port `5432` with credentials `postgres:postgres`, and is mapped to `resourcesdb`.

 3. Push the Prisma schema to the database:

 ```bash
 npx prisma db push
 ```

 4. Run a migration step (with optional file override):

 ```bash
 node src/index.js member-stats
 node src/index.js resources ./data/challenge-api.resources.json
 ```

 You can override the default `createdBy` value:

 ```bash
 CREATED_BY=my-migrator node src/index.js member-profiles
 ```

 ## 🧩 Available Migration Steps

 | Step | Auto Strategy | Description |
 |-------------------------------------|---------------|---------------------------------------------------------------------------------------------------|
 | `member-profiles` || Auto strategy: uses `stream-json` (batch) for files larger than 3MB, and `loadJSON` (simple) otherwise |
````
````diff
@@ -89,72 +87,142 @@
 > - For **smaller files**, it defaults to **simple in-memory processing** (`loadJSON`) for faster performance.
 >
 > This approach ensures optimal balance between **efficiency** and **stability**, especially when working with hundreds of thousands of records (e.g., over 850,000 for MemberProfile).
````
````diff
 ### 📁 Default Input Files per Migration Step

 The following files are used by default for each step, unless a custom path is provided via the CLI:

 | Step | Default File Path |
 |-------------------------------------|----------------------------------------------------------------|
-| `member-profiles` | `./data/MemberProfile_dynamo_data.json` |
-| `member-stats` | `./data/MemberStats_dynamo_data.json` |
 | `resource-roles` | `./data/ResourceRole_dynamo_data.json` |
 | `resource-role-phase-dependencies` | `./data/ResourceRolePhaseDependency_dynamo_data.json` |
 | `resources` | `./data/Resource_data.json` ← requires NDJSON format |

 💡 **Note:** If you're using the original ElasticSearch export file (`challenge-api.resources.json`) provided in the forum ([link here](https://drive.google.com/file/d/1F8YW-fnKjn8tt5a0_Z-QenZIHPiP3RK7/view?usp=sharing)), you must explicitly provide its path when running the migration:

 ```bash
 node src/index.js resources ./data/challenge-api.resources.json
 ```

+### 🔁 Run All Migrations at Once
+
+You can now run all migration steps sequentially using the following script:
+
+```bash
+node src/migrateAll.js
+```
+
+This script will automatically execute each step in order (`resource-roles`, `resource-role-phase-dependencies`, `resources`), logging progress and duration for each. Ideal for full dataset migration in one command.
````
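A sequential runner like the `src/migrateAll.js` described above could take roughly this shape. This is a sketch only; the real script's internals are not shown in the diff, and `runStep` is a hypothetical stand-in for invoking one migrator:

```javascript
// Step order as documented in the README.
const steps = [
  'resource-roles',
  'resource-role-phase-dependencies',
  'resources'
]

// Run each step in order, awaiting completion and logging the duration.
async function migrateAll (runStep) {
  for (const step of steps) {
    const started = Date.now()
    console.log(`▶ ${step} starting`)
    await runStep(step)
    console.log(`✔ ${step} finished in ${Date.now() - started}ms`)
  }
}

// Example wiring (module layout hypothetical):
// migrateAll(step => require(`./migrators/${step}`).run())
```

Running steps strictly in sequence matters here because `resources` rows reference `resource-roles` rows, so the roles must land first.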
````diff
 ## 📒 Error Logs
 All failed migrations are logged under the `logs/` folder by model:

-- `logs/memberprofile_errors.log` ← from `MemberProfile_dynamo_data.json` *(7 migrations failed)*
-- `logs/memberstats_errors.log` ← from `MemberStats_dynamo_data.json` *(1 migration failed)*
 - `logs/rrpd_errors.log` ← from `ResourceRolePhaseDependency_dynamo_data.json` *(17 migrations failed)*

 > ✅ Most migrations complete successfully. Errors are logged for further review and debugging.

 ## ✅ Verification
 You can verify successful migration with simple SQL queries, for example:
 ```sql
-SELECT COUNT(*) FROM "MemberProfile";
 SELECT COUNT(*) FROM "Resource";
 ```
 To connect:
 ```bash
 docker exec -it resources_postgres psql -U postgres -d resourcesdb
 ```

 ## 📸 Screenshots
 See `/docs/` for evidence of a fully mounted database.
 ![Screenshot from 2025-04-14 16-58-20](https://github.com/user-attachments/assets/8fb66fb8-3db1-4b51-bb29-c1db7b207689)

 ## 🧪 Testing

 Run all test suites with:

 ```bash
 npm test
 ```

 Each migrator has a corresponding unit test with mock input files under `src/test/mocks/`. Jest is used as the testing framework.
````
````diff
 ---

 ### 📂 Data Files Not Included

-The official DynamoDB dataset files provided in the forum (e.g., `MemberProfile_dynamo_data.json`, `challenge-api.resources.json`, etc.) are **not included** in this submission due to size constraints.
+The official DynamoDB dataset files provided in the forum (e.g., `challenge-api.resources.json`, etc.) are **not included** in this submission due to size constraints.

 Please download them manually from the official challenge forum and place them under the `/data/` directory.

 🔗 [Official Data Files (Google Drive)](https://drive.google.com/file/d/1F8YW-fnKjn8tt5a0_Z-QenZIHPiP3RK7/view?usp=sharing)

 > 🧪 This project **includes lightweight mock data files** under `src/test/mocks/` for testing purposes and sample execution. Full data is only required for production migration.

 ---

 ✅ All requirements of the challenge have been implemented, including logs, unit tests, schema adherence, and configurability.

+## 🔧 Integrated Fixes & Enhancements
+
+Several improvements and refinements have been implemented throughout the migration tool to ensure performance, reliability, and clarity:
+
+### ✅ Progress Bar for Batch Processes
+
+A custom CLI progress bar was added using `utils/progressLogger.js`. This applies only to **batch-based migrations**, and provides a visual representation of migration progress based on the total number of records or batches processed:
+- Implemented for: `resources`
+- Skipped for small or in-memory migrations like `resource-roles` and `resource-role-phase-dependencies`
+
+### ✅ Validation for All Models
+
+Additional validation scripts were also developed for:
+- `Resource`
+- `ResourceRole`
+- `ResourceRolePhaseDependency`
+
+While binary search was not applicable for these due to non-numeric or unordered IDs, the validation was still efficiently implemented using `Map`-based lookups with the `id` as the key.
````
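The `Map`-based lookup mentioned above can be sketched as follows (illustrative data; the real scripts live under `src/validation/` and read the migrated rows from PostgreSQL):

```javascript
// Build an id-keyed Map from the migrated rows once, then check each
// source record with an O(1) lookup instead of rescanning the table.
function validateById (sourceRecords, migratedRows) {
  const byId = new Map(migratedRows.map(row => [row.id, row]))
  const missing = []
  for (const record of sourceRecords) {
    if (!byId.has(record.id)) missing.push(record.id)
  }
  return missing
}

// Illustrative usage with fake ids:
const source = [{ id: 'a' }, { id: 'b' }, { id: 'c' }]
const migrated = [{ id: 'a' }, { id: 'c' }]
console.log(validateById(source, migrated)) // → [ 'b' ]
```

This is why unordered string UUIDs are no obstacle: a hash-backed `Map` does not need sorted keys, unlike binary search.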
````diff
+### ✅ Cleaner Code & Utility Reuse
+
+A reusable utility module `utils/batchMigrator.js` was created to consolidate the logic for:
+- Streamed reading of large JSON and NDJSON files
+- Batch-based record processing with `Promise.allSettled`
+- Progress tracking and error logging
+- Automatic detection of input format size
+
+This approach:
+- Avoids code duplication
+- Allows for consistent logging and error handling
+- Simplifies future extensions
````
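The batch pattern consolidated in `utils/batchMigrator.js` can be sketched like this. `insertFn` is a hypothetical stand-in for a Prisma `create` call; the point is that `Promise.allSettled` lets one bad record fail without aborting the batch:

```javascript
// Process records in fixed-size batches, collecting per-record failures
// for the error log instead of throwing on the first one.
async function processInBatches (records, insertFn, batchSize = 100) {
  const failures = []
  for (let i = 0; i < records.length; i += batchSize) {
    const batch = records.slice(i, i + batchSize)
    // allSettled never rejects: every outcome is inspected individually.
    const results = await Promise.allSettled(batch.map(insertFn))
    results.forEach((res, idx) => {
      if (res.status === 'rejected') {
        failures.push({ record: batch[idx], reason: res.reason })
      }
    })
  }
  return failures
}
```

The returned `failures` array is what would feed a per-model log such as `logs/rrpd_errors.log`.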
````diff
+### ✅ Default Field Logic (createdAt, updatedAt, etc.)
+
+- Fields like `createdAt`, `updatedAt`, `createdBy`, and `updatedBy` are now conditionally set based on whether values exist in the original JSON.
+- If `updatedAt` or `updatedBy` are missing from the source, they are explicitly set to `null`, rather than omitted or auto-filled—ensuring data integrity.
````
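The rule described above might look like this in code (a sketch; `auditFields` is a hypothetical helper name, and `CREATED_BY` mirrors the `.env` value from the README):

```javascript
const CREATED_BY = process.env.CREATED_BY || 'resources-api-db-migration'

// Reuse source audit fields when present; fall back for created*, but set
// updated* to null explicitly when the source omits them, so missing data
// is recorded as missing rather than auto-filled.
function auditFields (src) {
  return {
    createdAt: src.createdAt ? new Date(src.createdAt) : new Date(),
    createdBy: src.createdBy ?? CREATED_BY,
    updatedAt: src.updatedAt ? new Date(src.updatedAt) : null,
    updatedBy: src.updatedBy ?? null
  }
}
```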
````diff
+### ✅ FullAccess Compatibility Fix
+
+In `ResourceRole`, the original dataset sometimes includes only a `fullAccess` flag instead of `fullReadAccess` or `fullWriteAccess`.
+
+Logic was added to:
+- Derive `fullReadAccess` and `fullWriteAccess` from `fullAccess` when the specific fields are missing.
+- Ensure fallback to `.env` defaults only if neither are provided.
+
+```js
+const fullReadAccess = role.fullReadAccess ?? (role.fullAccess ?? DEFAULT_READ_ACCESS);
+const fullWriteAccess = role.fullWriteAccess ?? (role.fullAccess ?? DEFAULT_WRITE_ACCESS);
+```
+
+> 🚩 **Important Note:** Some records in the source data had `fullWriteAccess: true` but `fullReadAccess: false`, which is logically inconsistent. This was **not auto-corrected**, but a warning was added in the README for awareness during validation.
+
+### 📄 Validation Logs
+
+All validation scripts write their outputs and mismatches to `console.log`. You can redirect them to a file using:
+
+```bash
+node src/validation/validateMemberProfiles.js > logs/memberprofile_validation.log
+```
+
+---
````
migrator/docs/progressBars.png (image, 90.7 KB, not shown)

migrator/package-lock.json — 13 additions & 5 deletions (generated file, diff not rendered)

migrator/package.json — 1 addition & 0 deletions

```diff
@@ -11,6 +11,7 @@
 "description": "",
 "dependencies": {
 "@prisma/client": "^6.6.0",
+"cli-progress": "^3.12.0",
 "dotenv": "^16.5.0",
 "prisma": "^6.6.0",
 "stream-json": "^1.9.1"
```

0 commit comments