-
Notifications
You must be signed in to change notification settings - Fork 4
Closed
Labels
Description
We currently have an ednpoint-level check like
# Check that no other job with the same dataset_id is SUBMITTED
stm = (
select(JobV2)
.where(JobV2.dataset_id == dataset_id)
.where(JobV2.status == JobStatusType.SUBMITTED)
)
res = await db.execute(stm)
if res.scalars().all():
raise HTTPException(
status_code=status.HTTP_422_UNPROCESSABLE_CONTENT,
detail=(
f"Dataset {dataset_id} is already in use in submitted job(s)."
),
)but we observed at least once a situation where two "submitted" jobs co-existed.
It'd be best to make this more robust, by also having a postgres unique index on "jobs with dataset_id = X and status = submitted (where X is not null)".
I think a partial index should be appropriate:
- https://docs.sqlalchemy.org/en/20/dialects/postgresql.html#partial-indexes
- "Example 11.3. Setting up a Partial Unique Index" section of https://www.postgresql.org/docs/current/indexes-partial.html
If we are able to implement this at db level, we could:
- leave the current check in-place, to provide a user-friendly error message
- rely on the db check to avoid inconsistent state in some unexpected scenarios (this would result in a 500 response and
IntegrityErrorinternal error - which is anyway much better than having two submitted jobs)