30 changes: 26 additions & 4 deletions docs/content/en/open_source/upgrading/2.53.md
@@ -2,7 +2,7 @@
title: 'Upgrading to DefectDojo Version 2.53.x'
toc_hide: true
weight: -20251103
description: "Helm chart: changes for initializer annotations + Replaced Redis with Valkey + HPA & PDB support"
description: "Helm chart: changes for initializer annotations + Replaced Redis with Valkey + HPA & PDB support + Batch Deduplication"
---

## Helm Chart Changes
@@ -17,9 +17,9 @@ Added Helm chart support for Celery and Django deployments for Horizontal Pod Au

### Breaking changes

#### Valkey

##### Renamed values

The following Helm values have been renamed:
- `createRedisSecret` → `createValkeySecret`
@@ -40,7 +40,7 @@ If an external Redis instance is being used, set the parameter `valkey.enabled`
0. As always, perform a backup of your instance
1. If you would like to be 100% sure that you do not miss any async event (triggered deduplication, email notification, ...) it is recommended to perform the following substeps (if your system is not in production and/or you are willing to miss some notifications or postpone deduplication to a later time, feel free to skip these substeps)
0. Perform the following steps with your previous version of HELM chart (not with the upgraded one - you might lose your data)
1. Downscale all producers of async tasks:
- Set `django.replicas` to 0 (if you used HPA, adjust it based on your needs)
- Set `celery.beat.replicas` to 0 (if you used HPA, adjust it based on your needs)
- Do not change `celery.worker.replicas` (they are responsible for processing your async tasks)
@@ -89,4 +89,26 @@ Both `extraAnnotations` and `initializer.podAnnotations` will now be properly ap

Reimport will update existing findings `fix_available` and `fix_version` fields based on the incoming scan report.

## Batch Deduplication

Before 2.53.0, DefectDojo deduplicated new or updated findings one by one. This works well for small imports and keeps the codebase and test suite easy to understand. For larger imports, however, performance is poor and resource usage is very high: an import of 1000+ findings can cause a Celery worker to spend minutes on deduplication.

PR [13491](https://github.com/DefectDojo/django-DefectDojo/pull/13491) changes the deduplication process for import and reimport to run in batches. The biggest benefit is that there is now 1 database query per batch (1000 findings) instead of 1 query per finding (1000 queries).
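
The difference in query count can be illustrated with a small, self-contained sketch. This is not DefectDojo's implementation: `FakeFindingStore`, `find_existing`, and the use of plain string hashes as the deduplication key are hypothetical stand-ins chosen for illustration. Only the batch size of 1000 is taken from the text above.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


class FakeFindingStore:
    """Hypothetical stand-in for the database; it only counts lookups."""

    def __init__(self, existing_hashes):
        self.existing = set(existing_hashes)
        self.queries = 0

    def find_existing(self, hashes):
        # Counts as one "query", no matter how many hashes are passed in.
        self.queries += 1
        return self.existing.intersection(hashes)


def dedupe_one_by_one(store, new_hashes):
    # One lookup per finding: the pre-2.53.0 pattern described above.
    return [h for h in new_hashes if store.find_existing([h])]


def dedupe_in_batches(store, new_hashes, batch_size=1000):
    # One lookup per batch of findings.
    duplicates = []
    for batch in chunked(new_hashes, batch_size):
        found = store.find_existing(batch)
        duplicates.extend(h for h in batch if h in found)
    return duplicates
```

Under these assumptions, both functions return the same duplicates for 1000 incoming findings, but the first issues 1000 lookups against the store and the second issues 1.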

A quick test with the `jfrog_xray_unified/very_many_vulns.json` sample scan (10k findings) shows a large improvement in deduplication time. Note that the goal is not only better performance, but also a reduction in the resources (cloud cost) needed to run DefectDojo.

Initial import (no duplicates):

| branch | import time | dedupe time | total time |
|--------|:-----------:|:-----------:|:-----------:|
| dev | ~200s | ~400s | ~600s |
| dedupe-batching | ~190s | _~12s_ | ~200s |

Second import into the same product (all duplicates):

| branch | import time | dedupe time | total time |
|--------|:-----------:|:-----------:|:-----------:|
| dev | ~200s | ~400s | ~600s |
| dedupe-batching | ~190s | _~180s_ | ~370s |
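
As a rough reading aid, the approximate timings in the two tables above can be turned into speedup factors. The numbers are copied from the tables and are themselves approximations, so the ratios are indicative only. The `timings` dictionary and the `speedups` helper are introduced here purely for illustration.

```python
# Approximate (dedupe, total) timings in seconds, copied from the tables above.
timings = {
    "initial import": {"dev": (400, 600), "dedupe-batching": (12, 200)},
    "second import": {"dev": (400, 600), "dedupe-batching": (180, 370)},
}


def speedups(scenario):
    """Return (dedupe speedup, total speedup) of dedupe-batching over dev."""
    dev_dedupe, dev_total = timings[scenario]["dev"]
    new_dedupe, new_total = timings[scenario]["dedupe-batching"]
    return round(dev_dedupe / new_dedupe, 1), round(dev_total / new_total, 1)


for name in timings:
    dedupe_x, total_x = speedups(name)
    print(f"{name}: dedupe ~{dedupe_x}x faster, total ~{total_x}x faster")
```

The total-time gain is smaller than the deduplication gain because the import time itself (about 190s to 200s) is largely unchanged between the two branches.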


There are no other special instructions for upgrading to 2.53.x. Check the [Release Notes](https://github.com/DefectDojo/django-DefectDojo/releases/tag/2.53.0) for the contents of the release.