Eliminate MD5 usage by adopting project-wide SHA-256 checksums #951

Open
in-manishkr wants to merge 1 commit into openmainframeproject:master from in-manishkr:enhance_checksum_algo

@in-manishkr

Contributor

Summary

Replace MD5-based checksum generation with SHA-256 and standardize MD5-specific naming to generic checksum terminology throughout the codebase.

MD5 is no longer considered secure due to known collision vulnerabilities. This change improves file integrity verification by adopting SHA-256 and aligns the codebase with modern security practices.

In addition, a database migration script has been added to migrate existing image metadata and recalculate checksums for restored databases.

Changes

Core implementation

  • Replace hashlib.md5() with hashlib.sha256()
  • Rename _get_md5sum() to _get_checksum()
  • Update image import, capture, and file upload workflows to use SHA-256 checksums
  • Replace MD5-specific references with generic checksum naming
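
As an illustration of the core change, here is a minimal sketch of a chunked SHA-256 file checksum. The function name `file_sha256` and the chunk size are assumptions made for this example; the actual `_get_checksum()` in the PR may be structured differently.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    Hypothetical helper: reads the file in chunks so that large
    image files do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned digest is 64 hexadecimal characters, compared with 32 for MD5.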

Database

  • Rename image table column from md5sum to checksum
  • Update database APIs, queries, and mappings to use the new column name
  • Update image record creation and retrieval logic to use checksum values

API and validation

  • Rename API parameter and response fields from md5sum to checksum
  • Update checksum validation from 32-character MD5 hashes to 64-character SHA-256 hashes
  • Update API documentation and examples accordingly
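
The validation change can be pictured with the following sketch. It assumes the rule is "exactly 64 hexadecimal characters"; the names `SHA256_HEX` and `is_valid_checksum` are hypothetical and do not come from the PR.

```python
import re

# 64 hexadecimal characters, the length of a SHA-256 hex digest
SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


def is_valid_checksum(value):
    """Return True if `value` looks like a SHA-256 hex digest."""
    return isinstance(value, str) and SHA256_HEX.fullmatch(value) is not None
```

Under this rule a legacy 32-character MD5 value is rejected because of its length alone.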

Tests

  • Update unit tests to use SHA-256 checksum values
  • Rename MD5-specific test methods, mocks, and assertions
  • Update validation and API handler test coverage
  • Update database and SMT client test cases

Documentation

  • Replace MD5-specific references with generic checksum terminology
  • Update REST API documentation, parameter definitions, and examples
  • Update image import/export documentation and sample payloads

Migration tooling

  • Add feilong/database_migration_md5_to_sha256.py
  • Provide support for migrating existing image metadata from MD5 to SHA-256
  • Recalculate checksums for existing image records during migration

Security Impact

This change eliminates the use of MD5 for checksum generation and verification. SHA-256 provides significantly stronger collision resistance and better aligns with current security recommendations and compliance requirements.

Benefits include:

  • Improved protection against hash collision attacks
  • Stronger file integrity verification
  • Alignment with modern cryptographic best practices
  • Removal of dependency on a deprecated hashing algorithm

Compatibility Notes

Breaking Change

  • API field md5sum has been renamed to checksum
  • Database column md5sum has been renamed to checksum
  • SHA-256 checksums are now expected (64 hexadecimal characters)
  • Existing integrations that submit or consume MD5 values must be updated
  • Existing databases require migration before use with this change

Database Migration Requirement

For deployments restoring or upgrading an existing sdk_image.sqlite database, the following migration script must be executed after the database is restored:

python feilong/database_migration_md5_to_sha256.py

The migration script:

  • Renames the database column from md5sum to checksum
  • Recalculates SHA-256 checksums for all existing image records
  • Updates image metadata to match the new checksum format
  • Ensures compatibility with the updated SDK schema

Failure to run the migration script after database restoration may result in schema mismatches or invalid checksum data.
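
For readers who want a feel for what such a migration involves, here is a rough sketch under stated assumptions. It is not the contents of feilong/database_migration_md5_to_sha256.py. The `imagename` column, the `path_for_image` callback, and the function names are hypothetical; only the `image` table and the `md5sum`/`checksum` column names come from the description above.

```python
import hashlib
import sqlite3


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_checksums(conn, path_for_image):
    """Rename md5sum to checksum, then recompute every checksum.

    `path_for_image` is a hypothetical callable that maps an image
    name to the path of its file on disk.
    """
    columns = [row[1] for row in conn.execute("PRAGMA table_info(image)")]
    if "md5sum" in columns and "checksum" not in columns:
        # RENAME COLUMN needs SQLite 3.25.0 or newer
        conn.execute("ALTER TABLE image RENAME COLUMN md5sum TO checksum")
    names = conn.execute("SELECT imagename FROM image").fetchall()
    for (name,) in names:
        conn.execute(
            "UPDATE image SET checksum = ? WHERE imagename = ?",
            (sha256_of_file(path_for_image(name)), name),
        )
    conn.commit()
```

The column check makes the sketch safe to run twice: the rename is skipped on the second run and the checksums are simply recomputed.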

Upgrade Considerations

Operators upgrading existing environments should:

  1. Back up the existing database.
  2. Restore the sdk_image.sqlite database if required.
  3. Run feilong/database_migration_md5_to_sha256.py.
  4. Verify image records contain valid SHA-256 checksums.
  5. Upgrade API clients to use the checksum field and SHA-256 values.
  6. Validate image import/export workflows after upgrade.
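
Step 4 could be checked with something like the sketch below. It assumes an `image` table with `imagename` and `checksum` columns; the `imagename` column and the function name are hypothetical. It only checks the format of each stored value, not whether it matches the image file.

```python
import re
import sqlite3

SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


def find_invalid_checksums(conn):
    """Return the names of image records whose checksum is not
    a 64-character hexadecimal string."""
    rows = conn.execute("SELECT imagename, checksum FROM image")
    return [
        name
        for name, checksum in rows
        if not (isinstance(checksum, str) and SHA256_HEX.fullmatch(checksum))
    ]
```

An empty list means every record has a well-formed SHA-256 value. Leftover 32-character MD5 values or NULLs are reported by name.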

Testing

  • Updated unit tests for checksum generation and validation
  • Verified image import workflows with SHA-256 checksums
  • Verified file upload checksum generation
  • Verified API validation accepts SHA-256 and rejects legacy MD5 values
  • Updated database-related test coverage
  • Updated documentation examples and API references

in-manishkr force-pushed the enhance_checksum_algo branch 3 times, most recently from c029d90 to 1d116be, on June 11, 2026 18:00
Rajat-0 requested a review from Bischoff on June 12, 2026 05:28

@Bischoff Bischoff left a comment

Contributor

Excellent, thanks. I did not see problems with the code and I am approving it.

WARNING: I am afraid that other products depending on Feilong will need to be updated because of the API change. I am thinking in particular of the Go Connector for Feilong and the Terraform provider for Feilong, both of which I maintain, but the ICIC team is probably affected too, for the OpenStack code that uses Feilong.

PLEASE: When this is merged, close issue #888.

@in-manishkr

Contributor Author

I agree with you, which is why I have included a Python script, database_migration_md5_to_sha256.py, that recalculates the checksums for all existing images and updates the checksum column with the newly calculated values.

@Bischoff

Contributor

Thank you for that database upgrade script, Manish.

But this is not only about database contents. Your PR is also a breaking change for the API, which means that every program that uses the Feilong API will have to be rewritten.

This could be mitigated, though, by accepting both the old parameter name (image_md5sum) and the new parameter name (image_checksum) when parsing an API call.
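
A minimal sketch of that idea, assuming the request parameters arrive as a plain dict. The function name `pick_image_checksum` and the deprecation warning are illustrative choices, not part of the PR.

```python
import warnings


def pick_image_checksum(params):
    """Return the checksum from API parameters, preferring the new name.

    Hypothetical helper: accepts `image_checksum` and falls back to the
    legacy `image_md5sum`, warning callers that still use the old name.
    """
    if "image_checksum" in params:
        return params["image_checksum"]
    if "image_md5sum" in params:
        warnings.warn(
            "image_md5sum is deprecated; use image_checksum",
            DeprecationWarning,
            stacklevel=2,
        )
        return params["image_md5sum"]
    return None
```

Accepting the old name does not by itself make an old value usable: a client that still sends a 32-character MD5 digest under `image_md5sum` would fail a 64-character SHA-256 validation.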

@in-manishkr

Contributor Author

Yes, that's a good suggestion. I will incorporate these changes as soon as possible for backward compatibility.

Replace MD5-based checksum generation with SHA-256 to address
known MD5 collision vulnerabilities and improve security.
Also update MD5-specific references to generic checksum naming
where applicable.

Signed-off-by: Manish Kumar <Manish.Kumar176@ibm.com>
in-manishkr force-pushed the enhance_checksum_algo branch from 1d116be to f99b689 on June 16, 2026 07:57
@in-manishkr

Contributor Author

@Bischoff

I have incorporated the requested changes as a fallback mechanism:

expect_checksum = image_meta.get('checksum', image_meta.get('md5sum'))

Please re-review.
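
To make the behaviour of that fallback expression concrete, here is a small sketch. The wrapper function names are hypothetical, and the second variant is an alternative shown only for comparison, not something the PR contains.

```python
def expected_checksum(image_meta):
    # Same expression as in the comment above: the fallback is used
    # only when the 'checksum' key is absent from the dict.
    return image_meta.get('checksum', image_meta.get('md5sum'))


def expected_checksum_lenient(image_meta):
    # Comparison variant (an assumption, not from the PR): also falls
    # back when 'checksum' is present but None or an empty string.
    return image_meta.get('checksum') or image_meta.get('md5sum')
```

The two differ only for metadata such as `{'checksum': None, 'md5sum': '...'}`: the first returns `None`, the second returns the `md5sum` value. Which behaviour is wanted depends on whether callers can send an explicit null.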

@Bischoff

Bischoff commented Jun 16, 2026

Contributor

Thanks. You might also need to accept both values in the validation code (off the top of my head: zvmsdk/sdkwsgi/validation/parameter_types.py, but there might be other places as well).
