
Commit a43cefa

Updated status values to match challenge API V6

1 parent a0982c1 · commit a43cefa

39 files changed: +37,300 −3 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions

```diff
@@ -62,3 +62,5 @@ typings/
 
 # next.js build output
 .next
+
+migrator/data
```

.nvmrc

Lines changed: 1 addition & 0 deletions

```diff
@@ -0,0 +1 @@
+18.19.0
```

app-constants.js

Lines changed: 2 additions & 2 deletions

```diff
@@ -10,8 +10,8 @@ const UserRoles = {
 }
 
 const ChallengeStatuses = {
-  Completed: 'Completed',
-  Active: 'Active'
+  Completed: 'COMPLETED',
+  Active: 'ACTIVE'
 }
 
 module.exports = {
```
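The change uppercases the enum values while keeping the keys, so call sites that compare against the constant keep working; only raw string literals like `'Active'` would break. A minimal sketch (the `isActive` helper is illustrative, not from the repo):

```javascript
// Status values as of challenge API v6: keys unchanged, values uppercased.
const ChallengeStatuses = {
  Completed: 'COMPLETED',
  Active: 'ACTIVE'
}

// Hypothetical consumer: comparisons against the constant still work,
// but a hardcoded 'Active' string elsewhere would now silently fail.
function isActive (challenge) {
  return challenge.status === ChallengeStatuses.Active
}

console.log(isActive({ status: 'ACTIVE' })) // true
console.log(isActive({ status: 'Active' })) // false
```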

migrator/.gitignore

Lines changed: 3 additions & 0 deletions

```diff
@@ -0,0 +1,3 @@
+node_modules
+# Keep environment variables out of version control
+.env
```

migrator/README.md

Lines changed: 160 additions & 0 deletions

# Topcoder Resources Data Migration Tool

This tool is designed to **migrate data from DynamoDB (JSON format) to PostgreSQL** using **Prisma ORM**. It covers five key models of the Topcoder Resources API:

- `MemberProfile`
- `MemberStats`
- `ResourceRole`
- `ResourceRolePhaseDependency`
- `Resource`

## 📦 Technologies Used

- **Node.js** (backend scripting)
- **Prisma ORM** (PostgreSQL schema management)
- **PostgreSQL 16.3** (Dockerized database)
- **Docker & Docker Compose** (for DB setup)
- **stream-json / readline** (for streaming JSON migration)
- **Jest** (unit testing framework)

## ⚙️ Environment Configuration

Create a `.env` file in the root directory:

```env
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/resourcesdb"
CREATED_BY="resources-api-db-migration"
```

> The `CREATED_BY` value can be overridden at runtime:

```bash
CREATED_BY=eduardo node src/index.js member-stats ./data/MemberStats_test.json
```
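The override works because the migrator can read the value from the environment, falling back to the `.env` default. A minimal sketch of that lookup (the function name is illustrative, not the repo's actual code):

```javascript
// Sketch: resolve the createdBy audit value from the environment,
// falling back to the default shown in the sample .env above.
function resolveCreatedBy (env = process.env) {
  return env.CREATED_BY || 'resources-api-db-migration'
}

console.log(resolveCreatedBy({ CREATED_BY: 'eduardo' })) // eduardo
console.log(resolveCreatedBy({}))                        // resources-api-db-migration
```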
## 🚀 How to Run

This tool expects a running PostgreSQL instance defined in `docker-compose.yml`.

1. Clone the repo and install dependencies:

```bash
npm install
```

2. Start PostgreSQL with Docker Compose:

```bash
docker-compose up -d
```

To tear it down completely (including the volume):

```bash
docker-compose down -v
```

> The database runs on port `5432` with credentials `postgres:postgres`, and the database name is `resourcesdb`.

3. Push the Prisma schema to the database:

```bash
npx prisma db push
```

4. Run a migration step (with an optional file override):

```bash
node src/index.js member-stats
node src/index.js resources ./data/challenge-api.resources.json
```

You can override the default `createdBy` value:

```bash
CREATED_BY=my-migrator node src/index.js member-profiles
```

## 🧩 Available Migration Steps

| Step | Auto Strategy | Description |
|-------------------------------------|---------------|-------------|
| `member-profiles` | ✅ | Auto strategy: uses `stream-json` (batch) for files larger than 3 MB, and `loadJSON` (simple) otherwise |
| `member-stats` | ✅ | Auto strategy: uses `stream-json` (batch) for files larger than 3 MB, and `loadJSON` (simple) otherwise |
| `resource-roles` | — | Simple in-memory migration using `loadJSON`; input is not expected to be large |
| `resource-role-phase-dependencies` | — | Simple in-memory migration using `loadJSON`; input is not expected to be large |
| `resources` | ✅ | Auto strategy for NDJSON files: uses `readline` + batch for files > 3 MB, otherwise simple line-by-line processing |

> ⚙️ **Why Auto Strategy?**
>
> For models that involve large datasets (`member-profiles`, `member-stats`, and `resources`), the tool selects a strategy automatically based on file size:
> - If the input file is **larger than 3 MB**, the migration runs in **batch mode using streaming** (`stream-json` or `readline`) to reduce memory usage.
> - For **smaller files**, it defaults to **simple in-memory processing** (`loadJSON`) for faster performance.
>
> This approach balances **efficiency** and **stability**, especially when working with hundreds of thousands of records (e.g., over 850,000 for `MemberProfile`).
### 📁 Default Input Files per Migration Step

The following files are used by default for each step, unless a custom path is provided via the CLI:

| Step | Default File Path |
|-------------------------------------|--------------------------------------------------------|
| `member-profiles` | `./data/MemberProfile_dynamo_data.json` |
| `member-stats` | `./data/MemberStats_dynamo_data.json` |
| `resource-roles` | `./data/ResourceRole_dynamo_data.json` |
| `resource-role-phase-dependencies` | `./data/ResourceRolePhaseDependency_dynamo_data.json` |
| `resources` | `./data/Resource_data.json` (requires NDJSON format) |

💡 **Note:** If you're using the original ElasticSearch export file (`challenge-api.resources.json`) provided in the forum ([link here](https://drive.google.com/file/d/1F8YW-fnKjn8tt5a0_Z-QenZIHPiP3RK7/view?usp=sharing)), you must explicitly provide its path when running the migration:

```bash
node src/index.js resources ./data/challenge-api.resources.json
```
## 📒 Error Logs

All failed migrations are logged under the `logs/` folder by model:

- `logs/memberprofile_errors.log` ← from `MemberProfile_dynamo_data.json` *(7 migrations failed)*
- `logs/memberstats_errors.log` ← from `MemberStats_dynamo_data.json` *(1 migration failed)*
- `logs/rrpd_errors.log` ← from `ResourceRolePhaseDependency_dynamo_data.json` *(17 migrations failed)*

> ✅ Most migrations complete successfully. Errors are logged for further review and debugging.
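A sketch of the failure handling these logs imply: the batch keeps migrating past bad records and collects errors for the per-model log file (names and record shapes here are illustrative):

```javascript
// Illustrative batch runner: continue after individual failures and
// collect them for writing to logs/<model>_errors.log afterwards.
async function migrateBatch (records, insertFn) {
  const errors = []
  for (const record of records) {
    try {
      await insertFn(record)
    } catch (err) {
      errors.push({ id: record.id, message: err.message })
    }
  }
  return errors
}
```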
## ✅ Verification

You can verify a successful migration with simple SQL queries, for example:

```sql
SELECT COUNT(*) FROM "MemberProfile";
SELECT COUNT(*) FROM "Resource";
```

To connect:

```bash
docker exec -it resources_postgres psql -U postgres -d resourcesdb
```

## 📸 Screenshots

See `/docs/` for evidence of a fully mounted database.

![Screenshot from 2025-04-14 16-58-20](https://github.com/user-attachments/assets/8fb66fb8-3db1-4b51-bb29-c1db7b207689)

## 🧪 Testing

Run all test suites with:

```bash
npm test
```

Each migrator has a corresponding unit test with mock input files under `src/test/mocks/`. Jest is used as the testing framework.

---

### 📂 Data Files Not Included

The official DynamoDB dataset files provided in the forum (e.g., `MemberProfile_dynamo_data.json`, `challenge-api.resources.json`) are **not included** in this submission due to size constraints.

Please download them manually from the official challenge forum and place them under the `/data/` directory.

🔗 [Official Data Files (Google Drive)](https://drive.google.com/file/d/1F8YW-fnKjn8tt5a0_Z-QenZIHPiP3RK7/view?usp=sharing)

> 🧪 This project **includes lightweight mock data files** under `src/test/mocks/` for testing purposes and sample execution. Full data is only required for a production migration.

---

✅ All requirements of the challenge have been implemented, including logs, unit tests, schema adherence, and configurability.

migrator/docker-compose.yml

Lines changed: 18 additions & 0 deletions

```diff
@@ -0,0 +1,18 @@
+version: '3.9'
+
+services:
+  postgres:
+    image: postgres:16.3
+    container_name: resources_postgres
+    restart: unless-stopped
+    environment:
+      POSTGRES_USER: postgres
+      POSTGRES_PASSWORD: postgres
+      POSTGRES_DB: resourcesdb
+    ports:
+      - "5432:5432"
+    volumes:
+      - pgdata:/var/lib/postgresql/data
+
+volumes:
+  pgdata:
```

migrator/docs/ss_moutedDB.png

205 KB (binary image, not rendered)
