docs/content/en/open_source/upgrading/2.53.md
26 additions & 4 deletions
@@ -2,7 +2,7 @@
 title: 'Upgrading to DefectDojo Version 2.53.x'
 toc_hide: true
 weight: -20251103
-description: "Helm chart: changes for initializer annotations + Replaced Redis with Valkey + HPA & PDB support"
+description: "Helm chart: changes for initializer annotations + Replaced Redis with Valkey + HPA & PDB support + Batch Deduplication"
 ---
 
 ## Helm Chart Changes
@@ -17,9 +17,9 @@ Added Helm chart support for Celery and Django deployments for Horizontal Pod Au
 
 ### Breaking changes
 
-#### Valkey
+#### Valkey
 
-##### Renamed values
+##### Renamed values
 
 HELM values had been changed to the following:
 - `createRedisSecret` → `createValkeySecret`
@@ -40,7 +40,7 @@ If an external Redis instance is being used, set the parameter `valkey.enabled`
 0. As always, perform a backup of your instance
 1. If you would like to be 100% sure that you do not miss any async event (triggered deduplication, email notification, ...) it is recommended to perform the following substeps (if your system is not in production and/or you are willing to miss some notifications or postpone deduplication to a later time, feel free to skip these substeps)
 0. Perform the following steps with your previous version of HELM chart (not with the upgraded one - you might lose your data)
-1. Downscale all producers of async tasks:
+1. Downscale all producers of async tasks:
 - Set `django.replicas` to 0 (if you used HPA, adjust it based on your needs)
 - Set `celery.beat.replicas` to 0 (if you used HPA, adjust it based on your needs)
 - Do not change `celery.worker.replicas` (they are responsible for processing your async tasks)
@@ -89,4 +89,26 @@ Both `extraAnnotations` and `initializer.podAnnotations` will now be properly ap
 
 Reimport will update existing findings `fix_available` and `fix_version` fields based on the incoming scan report.
 
+## Batch Deduplication
+
+Before 2.53.0, DefectDojo deduplicated new or updated findings one by one. This works well for small imports and keeps the codebase and test suite easy to understand. For larger imports, however, performance is poor and resource usage is very high: an import of 1000+ findings can cause a Celery worker to spend minutes on deduplication.
+
+PR [13491](https://github.com/DefectDojo/django-DefectDojo/pull/13491) changes the deduplication process for import and reimport to run in batches. The biggest benefit is that there is now 1 database query per batch (1000 findings) instead of 1 query per finding (1000 queries).
+
+A quick test with the `jfrog_xray_unified/very_many_vulns.json` sample scan (10k findings) shows a large improvement in deduplication time. Note that the goal is not only better performance but also a reduction in the resources (cloud cost) needed to run DefectDojo.
+
+Initial import (no duplicates):
+| branch | import time | dedupe time | total time |
There are no other special instructions for upgrading to 2.53.x. Check the [Release Notes](https://github.com/DefectDojo/django-DefectDojo/releases/tag/2.53.0) for the contents of the release.
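
The batching idea described under "Batch Deduplication" can be illustrated with a small, self-contained sketch. This is not DefectDojo's implementation: the `finding` table, the `hash_code` column as the only match key, and the function names below are assumptions made for illustration, and SQLite stands in for the real database. It only shows why one lookup per batch issues far fewer queries than one lookup per finding.

```python
import sqlite3


def make_db(existing_hashes):
    """Create an in-memory table of already-stored findings (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE finding (id INTEGER PRIMARY KEY, hash_code TEXT)")
    conn.executemany(
        "INSERT INTO finding (hash_code) VALUES (?)",
        [(h,) for h in existing_hashes],
    )
    return conn


def find_duplicates_one_by_one(conn, new_hashes):
    """Issue one lookup query per incoming finding."""
    queries = 0
    duplicates = set()
    for h in new_hashes:
        row = conn.execute(
            "SELECT 1 FROM finding WHERE hash_code = ? LIMIT 1", (h,)
        ).fetchone()
        queries += 1
        if row is not None:
            duplicates.add(h)
    return duplicates, queries


def find_duplicates_batched(conn, new_hashes, batch_size=500):
    """Issue one lookup query per batch of incoming findings.

    The upgrade notes mention batches of 1000; 500 is used here only to stay
    under the 999 bound-parameter limit of older SQLite builds.
    """
    queries = 0
    duplicates = set()
    for start in range(0, len(new_hashes), batch_size):
        batch = new_hashes[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        rows = conn.execute(
            f"SELECT DISTINCT hash_code FROM finding WHERE hash_code IN ({placeholders})",
            batch,
        ).fetchall()
        queries += 1
        duplicates.update(r[0] for r in rows)
    return duplicates, queries


if __name__ == "__main__":
    stored = [f"hash-{i}" for i in range(0, 2000, 2)]
    incoming = [f"hash-{i}" for i in range(1500)]
    conn = make_db(stored)
    _, q_single = find_duplicates_one_by_one(conn, incoming)
    _, q_batch = find_duplicates_batched(conn, incoming)
    print(f"one-by-one queries: {q_single}, batched queries: {q_batch}")
```

Both functions return the same set of duplicate hashes; only the number of round trips to the database differs, which is where the savings described above would be expected to come from.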