Commit 3f2dc20

feat(DD_DEDUPLICATION_ALGORITHM_PER_PARSER + DD_HASHCODE_FIELDS_PER_SCANNER): Add checker of values (#11244)
1 parent 310c881 commit 3f2dc20

3 files changed

Lines changed: 21 additions & 5 deletions

docs/content/en/usage/features.md

Lines changed: 4 additions & 4 deletions
@@ -244,7 +244,7 @@ The environment variable will override the settings in `settings.dist.py`, repla
 
 The available algorithms are:
 
-DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL
+DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL (value for `DD_DEDUPLICATION_ALGORITHM_PER_PARSER`: `unique_id_from_tool`)
 : The deduplication occurs based on
 finding.unique_id_from_tool which is a unique technical
 id existing in the source tool. Few scanners populate this
@@ -266,12 +266,12 @@ DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL
 able to recognise that findings found in previous
 scans are actually the same as the new findings.
 
-DEDUPE_ALGO_HASH_CODE
+DEDUPE_ALGO_HASH_CODE (value for `DD_DEDUPLICATION_ALGORITHM_PER_PARSER`: `hash_code`)
 : The deduplication occurs based on finding.hash_code. The
 hash_code itself is configurable for each scanner in
 parameter `HASHCODE_FIELDS_PER_SCANNER`.
 
-DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE
+DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE (value for `DD_DEDUPLICATION_ALGORITHM_PER_PARSER`: `unique_id_from_tool_or_hash_code`)
 : A finding is a duplicate with another if they have the same
 unique_id_from_tool OR the same hash_code.
 
@@ -284,7 +284,7 @@ DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE
 cross-parser deduplication
 
 
-DEDUPE_ALGO_LEGACY
+DEDUPE_ALGO_LEGACY (value for `DD_DEDUPLICATION_ALGORITHM_PER_PARSER`: `legacy`)
 : This is algorithm that was in place before the configuration
 per parser was made possible, and also the default one for
 backward compatibility reasons.
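
The four values documented above are the strings expected inside the JSON object passed through `DD_DEDUPLICATION_ALGORITHM_PER_PARSER`. As a minimal sketch of what such a value could look like, assuming the variable holds a JSON object mapping scanner names to algorithm values (the scanner names below are hypothetical placeholders, not real parser names):

```python
import json

# Values listed in the documentation change above.
VALID_VALUES = [
    "legacy",
    "unique_id_from_tool",
    "hash_code",
    "unique_id_from_tool_or_hash_code",
]

# Hypothetical scanner names, used for illustration only.
override = {
    "Example Scanner A": "hash_code",
    "Example Scanner B": "unique_id_from_tool_or_hash_code",
}

# The environment variable carries the mapping as a JSON string.
env_value = json.dumps(override)
print(f"DD_DEDUPLICATION_ALGORITHM_PER_PARSER='{env_value}'")

# Round-trip check: every configured value is one of the documented ones.
parsed = json.loads(env_value)
assert all(value in VALID_VALUES for value in parsed.values())
```

Note that the values are the lowercase strings, not the `DEDUPE_ALGO_*` constant names.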
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-09169f6d20ebf2f37347156111c3670a5b207c3530583a53ed9ac59ae4221188
+f09caa2d4e41f44b7cd6ecf2f1400817d4776e703bd039c8d857f1356382e1f3

dojo/settings/settings.dist.py

Lines changed: 16 additions & 0 deletions
@@ -1296,6 +1296,12 @@ def saml2_attrib_map_format(dict):
 if len(env("DD_HASHCODE_FIELDS_PER_SCANNER")) > 0:
     env_hashcode_fields_per_scanner = json.loads(env("DD_HASHCODE_FIELDS_PER_SCANNER"))
     for key, value in env_hashcode_fields_per_scanner.items():
+        if not isinstance(value, list):
+            msg = f"Fields definition '{value}' for hashcode calculation of '{key}' is not valid. It needs to be list of strings but it is {type(value)}."
+            raise TypeError(msg)
+        if not all(isinstance(field, str) for field in value):
+            msg = f"Fields for hashcode calculation for {key} are not valid. It needs to be list of strings. Some of fields are not string."
+            raise AttributeError(msg)
         if key in HASHCODE_FIELDS_PER_SCANNER:
             logger.info(f"Replacing {key} with value {value} (previously set to {HASHCODE_FIELDS_PER_SCANNER[key]}) from env var DD_HASHCODE_FIELDS_PER_SCANNER")
             HASHCODE_FIELDS_PER_SCANNER[key] = value
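
The check added in this hunk can be exercised in isolation. The following is a rough standalone sketch, not the project's code: `validate_hashcode_fields` is a hypothetical helper name, the messages are shortened, and the scanner and field names in the usage lines are placeholders. It mirrors the exception types used in the hunk (`TypeError` for a non-list value, `AttributeError` for a list containing non-strings).

```python
import json


def validate_hashcode_fields(raw: str) -> dict:
    """Parse a JSON object and check that every value is a list of strings.

    Hypothetical helper that mirrors the checks in the hunk above.
    """
    parsed = json.loads(raw)
    for key, value in parsed.items():
        if not isinstance(value, list):
            msg = f"Fields for '{key}' must be a list of strings, got {type(value)}."
            raise TypeError(msg)
        if not all(isinstance(field, str) for field in value):
            msg = f"Fields for '{key}' must all be strings."
            raise AttributeError(msg)
    return parsed


# Placeholder scanner and field names, for illustration only.
ok = validate_hashcode_fields('{"Example Scanner": ["title", "severity"]}')
print(ok)

try:
    validate_hashcode_fields('{"Example Scanner": "title"}')
except TypeError as exc:
    print(f"rejected: {exc}")
```

Failing at settings load time means a malformed value surfaces at startup rather than later, when a hash code is computed.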
@@ -1377,6 +1383,13 @@ def saml2_attrib_map_format(dict):
 # Makes it possible to deduplicate on a technical id (same parser) and also on some functional fields (cross-parsers deduplication)
 DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE = "unique_id_from_tool_or_hash_code"
 
+DEDUPE_ALGOS = [
+    DEDUPE_ALGO_LEGACY,
+    DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL,
+    DEDUPE_ALGO_HASH_CODE,
+    DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE,
+]
+
 # Allows to deduplicate with endpoints if endpoints is not included in the hashcode.
 # Possible values are: scheme, host, port, path, query, fragment, userinfo, and user. For a details description see https://hyperlink.readthedocs.io/en/latest/api.html#attributes.
 # Example:
@@ -1526,6 +1539,9 @@ def saml2_attrib_map_format(dict):
 if len(env("DD_DEDUPLICATION_ALGORITHM_PER_PARSER")) > 0:
     env_dedup_algorithm_per_parser = json.loads(env("DD_DEDUPLICATION_ALGORITHM_PER_PARSER"))
     for key, value in env_dedup_algorithm_per_parser.items():
+        if value not in DEDUPE_ALGOS:
+            msg = f"DEDUP algorithm '{value}' for '{key}' is not valid. Use one of following values: {', '.join(DEDUPE_ALGOS)}"
+            raise AttributeError(msg)
         if key in DEDUPLICATION_ALGORITHM_PER_PARSER:
             logger.info(f"Replacing {key} with value {value} (previously set to {DEDUPLICATION_ALGORITHM_PER_PARSER[key]}) from env var DD_DEDUPLICATION_ALGORITHM_PER_PARSER")
             DEDUPLICATION_ALGORITHM_PER_PARSER[key] = value
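
To see how the membership check in this hunk behaves, here is a small standalone sketch. It is an approximation under stated assumptions rather than the project's implementation: `validate_dedupe_algorithms` is a hypothetical helper name, the string values are taken from the documentation change in this commit, and the scanner names are placeholders.

```python
import json

# String values as documented in this commit's change to features.md.
DEDUPE_ALGOS = [
    "legacy",
    "unique_id_from_tool",
    "hash_code",
    "unique_id_from_tool_or_hash_code",
]


def validate_dedupe_algorithms(raw: str) -> dict:
    """Parse a JSON object and reject values that are not known algorithms.

    Hypothetical helper that mirrors the check in the hunk above.
    """
    parsed = json.loads(raw)
    for key, value in parsed.items():
        if value not in DEDUPE_ALGOS:
            msg = (
                f"DEDUP algorithm '{value}' for '{key}' is not valid. "
                f"Use one of: {', '.join(DEDUPE_ALGOS)}"
            )
            raise AttributeError(msg)
    return parsed


# Placeholder scanner names, for illustration only.
print(validate_dedupe_algorithms('{"Example Scanner": "hash_code"}'))

try:
    # Passing the constant name instead of its value is rejected.
    validate_dedupe_algorithms('{"Example Scanner": "DEDUPE_ALGO_HASH_CODE"}')
except AttributeError as exc:
    print(f"rejected: {exc}")
```

The second call shows the mistake the check guards against: supplying the Python constant name where the lowercase value is expected.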
