Skip to content

Add database constraint for jobs (through partial index) #3040

@tcompa

Description

@tcompa

We currently have an ednpoint-level check like

    # Check that no other job with the same dataset_id is SUBMITTED
    stm = (
        select(JobV2)
        .where(JobV2.dataset_id == dataset_id)
        .where(JobV2.status == JobStatusType.SUBMITTED)
    )
    res = await db.execute(stm)
    if res.scalars().all():
        raise HTTPException(
            status_code=status.HTTP_422_UNPROCESSABLE_CONTENT,
            detail=(
                f"Dataset {dataset_id} is already in use in submitted job(s)."
            ),
        )

but we observed at least once a situation where two "submitted" jobs co-existed.


It'd be best to make this more robust, by also having a postgres unique index on "jobs with dataset_id = X and status = submitted (where X is not null)".
I think a partial index should be appropriate:

If we are able to implement this at db level, we could:

  • leave the current check in-place, to provide a user-friendly error message
  • rely on the db check to avoid inconsistent state in some unexpected scenarios (this would result in a 500 response and IntegrityError internal error - which is anyway much better than having two submitted jobs)

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions