IGV Update API tends to crash GCE VM #262

@EddieLF

Describe the bug
The IGV diff API and update API write new IgvSample records to the db and link these records to read files (CRAMs) found in the GCP buckets.

  • The IGV diff API checks which individuals already have an IgvSample record and which are missing one. This is used to determine which reads need to be added. Code link.
  • The update IGV API adds the records to each individual via the individual's guid, making one call per individual. When syncing from Metamist, we limit this to 5 or 10 calls at a time, with sleep commands in between. Even with this chunking and waiting, this many calls seems to cause the VM to crash. Code link.

Scope of the bug
Yes. If too many IGV updates occur in a short period, the VM crashes and requests return 502 for a few minutes until the VM restarts.

To fix?
Whatever is happening with these API calls must be really poorly optimised.

The first call should not be too difficult to optimise. It should just query the db for all families / individuals in a project, filter for those with no IgvSample record attached, and return that list of participants. This should be a very simple Django query.
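To illustrate the single-query idea, here is a rough sketch using the standard-library `sqlite3` module rather than the real Django models. The table and column names (`family`, `individual`, `igv_sample`, `project`) are hypothetical stand-ins, not the actual schema. The point is only that "individuals with no IgvSample" is one anti-join, not one lookup per individual.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only; the real tables differ.
SCHEMA = """
CREATE TABLE family (id INTEGER PRIMARY KEY, project TEXT);
CREATE TABLE individual (
    id INTEGER PRIMARY KEY, guid TEXT, family_id INTEGER REFERENCES family(id));
CREATE TABLE igv_sample (
    id INTEGER PRIMARY KEY, individual_id INTEGER REFERENCES individual(id),
    file_path TEXT);
"""


def individuals_missing_igv(conn, project):
    """Return guids of individuals in `project` with no igv_sample row (one query)."""
    # Assumed Django equivalent (model and relation names are guesses):
    #   Individual.objects.filter(family__project=project, igvsample__isnull=True)
    rows = conn.execute(
        """
        SELECT i.guid
        FROM individual AS i
        JOIN family AS f ON f.id = i.family_id
        LEFT JOIN igv_sample AS s ON s.individual_id = i.id
        WHERE f.project = ? AND s.id IS NULL
        ORDER BY i.guid
        """,
        (project,),
    ).fetchall()
    return [guid for (guid,) in rows]


def build_demo_db():
    """Create an in-memory database with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO family VALUES (?, ?)", [(1, "proj-a"), (2, "proj-b")])
    conn.executemany(
        "INSERT INTO individual VALUES (?, ?, ?)",
        [(1, "I1", 1), (2, "I2", 1), (3, "I3", 2)],
    )
    conn.execute("INSERT INTO igv_sample VALUES (1, 1, 'gs://bucket/I1.cram')")
    return conn
```

With the demo data, only `I2` is returned for `proj-a`, since `I1` already has a record.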

The second call might be harder to optimise, since it seems like it has to do one call per individual. Maybe we just need to space it out more in the sync module.
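If spacing out the calls is the route taken, the sync side could look something like the sketch below. The function name, the `update_one` callable, and the default batch size and pause are all hypothetical; the real sync module may be structured differently.

```python
import time


def update_in_batches(guids, update_one, batch_size=5, pause_seconds=10.0,
                      sleep=time.sleep):
    """Call `update_one(guid)` for each guid, pausing between batches.

    `update_one` stands in for whatever makes the per-individual update call.
    `sleep` is injectable so the pacing can be tested without waiting.
    """
    results = {}
    for start in range(0, len(guids), batch_size):
        if start:
            # Pause before every batch except the first.
            sleep(pause_seconds)
        for guid in guids[start:start + batch_size]:
            results[guid] = update_one(guid)
    return results
```

Making `batch_size` and `pause_seconds` parameters would at least let the pacing be tuned without code changes, though it does not address why each call is so expensive.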

Sample log entry for creating one IgvSample
Several gsutil ls requests are made for the CRAM / index objects that are part of the update, each one checking for the existence of an individual CRAM or index file. Perhaps these could be collapsed into a single request, so that the existence of all files in the update request is checked at once rather than with one gsutil ls request per file.
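A minimal sketch of the collapsed check, assuming the files for an update mostly share a bucket directory. `find_missing_files` and `list_objects` are hypothetical names; `list_objects(bucket, prefix)` is any callable that returns the object names under a prefix. In practice it could wrap something like `google.cloud.storage.Client.list_blobs(bucket, prefix=prefix)`, but that wiring is not shown here.

```python
from collections import defaultdict


def split_gs_path(path):
    """Split 'gs://bucket/dir/file' into (bucket, 'dir/', 'dir/file')."""
    if not path.startswith("gs://"):
        raise ValueError(f"not a gs:// path: {path}")
    bucket, _, name = path[len("gs://"):].partition("/")
    prefix = (name.rsplit("/", 1)[0] + "/") if "/" in name else ""
    return bucket, prefix, name


def find_missing_files(paths, list_objects):
    """Return the gs:// paths that do not exist, using one listing per directory."""
    by_dir = defaultdict(set)
    for path in paths:
        bucket, prefix, name = split_gs_path(path)
        by_dir[(bucket, prefix)].add(name)

    missing = []
    for (bucket, prefix), names in by_dir.items():
        # One listing call covers every expected file under this prefix.
        existing = set(list_objects(bucket, prefix))
        missing.extend(f"gs://{bucket}/{name}" for name in names - existing)
    return sorted(missing)
```

One trade-off: if a directory holds far more objects than the update needs, a single listing may return a lot of unused names, so this is only a win when the expected files are a reasonable fraction of the prefix.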

Labels: bug (Something isn't working)
