Imagine you're developing a Python package for genome sequence analysis. Everything works perfectly today, but without proper versioning of its dependencies, the future is uncertain:
❌ Your results may become non-reproducible as dependencies change.
❌ Collaborators may struggle to run your code.
❌ Future updates could silently break your software.
The solution? Managing dependencies properly using pyproject.toml. For example:
[project]
dependencies = [
"tensorflow>=2.1,<3.0",
"pandas>=1.3,<2.0"
]
Explicitly specifying version ranges tells installers such as pip which releases your software is known to work with, so a new, incompatible major release of a dependency is not pulled in silently.
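To make the meaning of a range such as ">=2.1,<3.0" concrete, here is a minimal Python sketch. The helper names parse_version and satisfies are hypothetical, and the parser only understands plain numeric versions; real installers apply the full PEP 440 rules (for example through the packaging library), so treat this as an illustration rather than how pip works internally.

```python
import operator

# Comparison operators supported by this simplified checker.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text, width=3):
    """Turn '2.1' into (2, 1, 0). Handles plain numeric versions only."""
    parts = [int(piece) for piece in text.strip().split(".")]
    # Pad with zeros so that '2.1' and '2.1.0' compare as equal.
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so '>=' is not read as '>'.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = clause[len(symbol):]
                if not OPERATORS[symbol](parse_version(version), parse_version(bound)):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


if __name__ == "__main__":
    print(satisfies("2.15.0", ">=2.1,<3.0"))  # inside the tensorflow range
    print(satisfies("3.0.0", ">=2.1,<3.0"))   # outside: next major release
```

The upper bound is what protects you here: a future 3.0 release fails the check, so it is never selected without you deciding to widen the range.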
But versioning isn’t just about managing your own dependencies—it also makes your software reliable for others. Proper versioning allows seamless integration into other workflows, ensuring stability and long-term compatibility.
In this blog post, we'll explore best practices for versioning scientific software to enhance reliability and reproducibility.
A human-first approach keeps people at the center: changelogs are written by hand, peer-reviewed, and verified before release. Automation then handles the repetitive steps (bumping the version, tagging, and stamping the release entry), so every release is produced the same way and is easy for collaborators to follow.
To showcase this approach, I've created a demo repository, https://github.com/ricky-lim/versioning, demonstrating how to automate version bumps and Git tags while keeping the changelog human-curated.
With this approach, each version carries semantic meaning, includes a Git tag for easy rollbacks, and provides a clear changelog to communicate changes effectively.
I use bump-my-version to automate version updates. This tool updates the version in pyproject.toml and can also be configured directly within that file for seamless integration.
Version bumps trigger automatically when merging to main. Additionally, bump-my-version automates Git tag creation, ensuring each new version is properly tagged for easy tracking and rollbacks.
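To go back to a previous version, use git checkout <tag>. For example, run git checkout v1.0.0 to check out the code exactly as it was at version 1.0.0. Git puts you in a detached HEAD state, so your branches are left untouched.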
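The bump-and-tag step described above can be sketched in a few lines of Python. This is not how bump-my-version is implemented; the functions bump and tag_name are hypothetical helpers that only show the semantic versioning arithmetic and the v-prefixed tag naming used in the example (v1.0.0).

```python
def bump(version, part):
    """Return the next semantic version after bumping `part`.

    `version` is a plain 'MAJOR.MINOR.PATCH' string and `part` is one of
    'major', 'minor', or 'patch'. Lower-order parts reset to zero.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


def tag_name(version):
    """Build the Git tag for a version, assuming a 'v' prefix as in v1.0.0."""
    return f"v{version}"


if __name__ == "__main__":
    new_version = bump("1.1.3", "minor")
    print(new_version, tag_name(new_version))
```

A breaking change bumps the major part, a backwards-compatible feature bumps the minor part, and a bug fix bumps the patch part, which is what gives each version number its meaning.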
Using common-changelog, developers manually curate CHANGELOG.md, collecting upcoming changes under a NEXT heading:
## NEXT
### Added
- New genome analysis pipeline
### Fixed
- Memory leak in sequence processing
### Changed
- Updated TensorFlow to 2.15.0
GitHub Actions automatically transforms this into a release entry:
## [1.2.0] - 2025-02-26
### Added
- New genome analysis pipeline
### Fixed
- Memory leak in sequence processing
### Changed
- Updated TensorFlow to 2.15.0
This approach combines human-curated changes with automated versioning.
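The transformation from the NEXT section to a dated release entry is simple enough to sketch in Python. The demo repository does this inside a GitHub Actions workflow; the function below, release_changelog, is a hypothetical stand-in that assumes the heading is written exactly as "## NEXT", and it is not the repository's actual script.

```python
from datetime import date


def release_changelog(text, version, released):
    """Replace the first '## NEXT' heading with a dated release heading.

    `text` is the full content of CHANGELOG.md, `version` is the new
    version string, and `released` is a datetime.date.
    """
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "## NEXT":
            lines[index] = f"## [{version}] - {released.isoformat()}"
            result = "\n".join(lines)
            # Preserve the trailing newline if the input had one.
            return result + "\n" if text.endswith("\n") else result
    raise ValueError("no '## NEXT' heading found in changelog")


if __name__ == "__main__":
    draft = "## NEXT\n### Added\n- New genome analysis pipeline\n"
    print(release_changelog(draft, "1.2.0", date(2025, 2, 26)))
```

Only the heading changes. The entries written and reviewed by people pass through untouched, which is the point of keeping the curation manual and the stamping automatic.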
Automate versioning with two essential workflows:
1. pre-commit.yml
2. version_bump.yml
This human-centered versioning strategy ensures reliable and reproducible scientific software while effectively communicating changes to users, contributors, and our future selves.