Versioning and Lock Files¶
Version numbers mean something (sometimes). Lock files ensure reproducibility (when you use them). This chapter explains how versioning works, why lock files matter, and how to avoid the common pitfalls.
Aspirational Spec
Semver is a promise that everyone makes and nobody keeps perfectly. I've seen "minor" releases that broke production. I've seen "patches" that changed behavior in subtle, infuriating ways. The version number tells you what the maintainer intended. Reality requires verification.
Semantic Versioning: The Promise¶
Most modern packages follow semantic versioning (semver): MAJOR.MINOR.PATCH
| Component | Increments When | Example |
|---|---|---|
| MAJOR | Breaking changes | 1.0.0 → 2.0.0 |
| MINOR | New features (backwards compatible) | 1.0.0 → 1.1.0 |
| PATCH | Bug fixes (backwards compatible) | 1.0.0 → 1.0.1 |
The promise: if you're using version 1.2.3, you can safely update to 1.2.4 (patch) or 1.3.0 (minor). Only 2.0.0 (major) requires attention.
This enables automated updates. If the contract holds, you can auto-update patches without fear.
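To make the contract concrete, here is a minimal sketch of how an update bot might apply it. The function names (`parse_version`, `bump_kind`, `safe_to_auto_update`) are hypothetical, and the parser assumes plain MAJOR.MINOR.PATCH strings with no pre-release or build tags.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string into a tuple of ints."""
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)


def bump_kind(current: str, candidate: str) -> str:
    """Return which semver component changed first: major, minor, patch, or none."""
    old, new = parse_version(current), parse_version(candidate)
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    if new[2] != old[2]:
        return "patch"
    return "none"


def safe_to_auto_update(current: str, candidate: str) -> bool:
    """True only for forward patch or minor bumps on a 1.0+ release."""
    old, new = parse_version(current), parse_version(candidate)
    # Semver makes no compatibility promise for 0.x, and downgrades are never automatic.
    if old[0] == 0 or new <= old:
        return False
    return bump_kind(current, candidate) in ("patch", "minor")


print(safe_to_auto_update("1.2.3", "1.2.4"))  # patch bump
print(safe_to_auto_update("1.2.3", "2.0.0"))  # major bump
```

This only encodes what the version number claims. It says nothing about whether the release actually kept the promise.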
Semantic Versioning: The Reality¶
The promise is aspirational. Reality is messier.
"Minor" Releases That Break Everything¶
# Worked in 1.2.0
result = library.process(data)
# 1.3.0 "minor" release
# Oops, the function signature changed
result = library.process(data, new_required_param) # Breaks
Some maintainers increment MINOR when they should increment MAJOR. Some don't count changes to edge-case behavior as breaking. Some make genuine mistakes.
Pre-1.0 Means Anything Goes¶
Semver itself exempts 0.x.y versions from its compatibility rules. The spec says:
Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable.
That 0.9.0 package might break everything in 0.10.0. The version number doesn't protect you.
Fast-Moving Ecosystems¶
In some ecosystems, breaking changes are constant:
- Frontend frameworks with major releases every year
- ML libraries with rapid API evolution
- Early-stage languages with shifting best practices
Semver provides guidance, but you still need to test.
Version Specifiers¶
When you declare dependencies, you specify which versions are acceptable.
The Specifiers¶
| Specifier | Meaning | Example | Matches |
|---|---|---|---|
| == | Exact version | ==1.2.3 | Only 1.2.3 |
| >= | Minimum | >=1.2.0 | 1.2.0, 1.3.0, 2.0.0... |
| <= | Maximum | <=2.0.0 | 2.0.0, 1.9.0, 1.0.0... |
| ~= | Compatible | ~=1.2.0 | 1.2.*, not 1.3.0 |
| ^ | Caret (npm) | ^1.2.3 | ≥1.2.3, <2.0.0 |
| ~ | Tilde (npm) | ~1.2.3 | ≥1.2.3, <1.3.0 |
| * | Any | * | Anything (dangerous) |
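The caret and tilde rows are easier to reason about as half-open ranges. The sketch below computes the exclusive upper bound for each, assuming full x.y.z versions with no pre-release tags. The helper names are hypothetical, and the 0.x handling follows npm's caret rule, which bumps the leftmost non-zero component.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a plain x.y.z string into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def caret_upper_bound(base: Version) -> Version:
    """Exclusive upper bound for ^: bump the leftmost non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def tilde_upper_bound(base: Version) -> Version:
    """Exclusive upper bound for ~ with a full x.y.z version: the next minor."""
    major, minor, _ = base
    return (major, minor + 1, 0)


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a ^x.y.z or ~x.y.z range (other forms unsupported)."""
    operator, base = spec[0], parse_version(spec[1:])
    if operator == "^":
        upper = caret_upper_bound(base)
    elif operator == "~":
        upper = tilde_upper_bound(base)
    else:
        raise ValueError(f"unsupported specifier: {spec!r}")
    return base <= parse_version(version) < upper


print(satisfies("1.9.9", "^1.2.3"))  # inside the caret range
print(satisfies("1.3.0", "~1.2.3"))  # outside the tilde range
```

Note that ^0.2.3 stops at 0.3.0, not 1.0.0, so pre-1.0 packages get a narrower caret range.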
The npm Footguns¶
npm defaults are particularly surprising:
By default, npm install records a new dependency in package.json with a caret range, such as "lodash": "^4.17.0". The ^ means "4.17.0 or higher, but less than 5.0.0." This allows 4.18.0, 4.19.0, and so on, any of which could break your code.
Without a lock file, running npm install on different days can give you different versions. That's... not great for reproducibility.
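One way to see how exposed a project is: list every dependency whose spec is a range. This is a rough sketch, assuming a standard package.json layout. The function name and the sample manifest are made up for illustration, and anything that is not a bare x.y.z (including tags and URLs) counts as floating.

```python
import json
import re
from typing import Dict

# An exact pin is a bare x.y.z with no range operator in front of it.
EXACT_VERSION = re.compile(r"\d+\.\d+\.\d+")


def find_floating_dependencies(package_json_text: str) -> Dict[str, str]:
    """Return dependencies whose spec is anything other than an exact x.y.z version."""
    manifest = json.loads(package_json_text)
    floating = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.fullmatch(spec):
                floating[name] = spec
    return floating


# Illustrative manifest, not taken from a real project.
example = """
{
  "dependencies": {
    "lodash": "^4.17.0",
    "axios": "1.6.2"
  },
  "devDependencies": {
    "jest": "~29.7.0"
  }
}
"""

print(find_floating_dependencies(example))
# {'lodash': '^4.17.0', 'jest': '~29.7.0'}
```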
Safe Defaults¶
For production code, prefer tighter constraints:
# pip - exact versions
pip install pandas==2.0.3
# npm - exact versions
npm install --save-exact lodash@4.17.21
Trade the convenience of automatic updates for the stability of known versions.
Lock Files Are Not Optional¶
Lock files record the exact versions that were actually installed—not the ranges you specified, but the specific versions that were resolved.
What Lock Files Capture¶
A lock file includes:
- Exact versions — lodash@4.17.21, not ^4.17.0
- Transitive dependencies — Everything, not just direct deps
- Integrity hashes — Checksums to verify downloads
- Resolution context — Which registry, what platform
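The integrity hash is the part people tend to skip over, so here is a small sketch of the idea. It assumes the "algorithm-base64digest" string format that npm records in the integrity field of package-lock.json. The function names are hypothetical, and real installers do more than this (multiple hashes, streaming downloads).

```python
import base64
import hashlib

SUPPORTED_ALGORITHMS = ("sha256", "sha384", "sha512")


def compute_integrity(data: bytes, algorithm: str = "sha512") -> str:
    """Return '<algorithm>-<base64 digest>' for the given bytes."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-" + base64.b64encode(digest).decode("ascii")


def verify_integrity(data: bytes, expected: str) -> bool:
    """Recompute the hash of downloaded bytes and compare it with the recorded value."""
    algorithm, _, _ = expected.partition("-")
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported hash algorithm: {algorithm!r}")
    return compute_integrity(data, algorithm) == expected


# Pretend these bytes are a downloaded package tarball.
recorded = compute_integrity(b"package tarball bytes")
print(verify_integrity(b"package tarball bytes", recorded))   # same bytes
print(verify_integrity(b"tampered tarball bytes", recorded))  # different bytes
```

If the bytes on the registry change after the lock file was written, the comparison fails and the install should stop.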
Why This Matters¶
Without a lock file:
# Monday
$ pip install "pandas>=2.0"
# Resolves to pandas==2.0.3
# Tuesday (new release happened)
$ pip install "pandas>=2.0"
# Resolves to pandas==2.1.0 (different behavior, possible bugs)
With a lock file:
# Monday
$ pip install "pandas>=2.0"
# Resolves to pandas==2.0.3
$ pip freeze > requirements.txt
# Records the exact versions that were installed
# Tuesday
$ pip install -r requirements.txt
# Still installs pandas==2.0.3 (reproducible)
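The Monday/Tuesday difference can be simulated with a toy resolver. This is a deliberately simplified sketch: real resolvers handle many packages and conflicting constraints, and the `resolve` function and the release lists here are invented for illustration.

```python
from typing import List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.0.3' into (2, 0, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resolve(minimum: str, available: List[str], locked: Optional[str] = None) -> str:
    """Pick the locked version if one is given, else the newest release >= minimum."""
    if locked is not None:
        if locked not in available:
            raise LookupError(f"locked version {locked} is not available")
        return locked
    candidates = [v for v in available if parse_version(v) >= parse_version(minimum)]
    if not candidates:
        raise LookupError(f"no release satisfies >={minimum}")
    return max(candidates, key=parse_version)


# Hypothetical registry contents on two consecutive days.
monday = ["1.5.3", "2.0.2", "2.0.3"]
tuesday = monday + ["2.1.0"]

print(resolve("2.0", monday))                   # 2.0.3
print(resolve("2.0", tuesday))                  # 2.1.0
print(resolve("2.0", tuesday, locked="2.0.3"))  # 2.0.3
```

The range did not change between the two days. The registry did, and only the locked call is immune to that.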
Lock Files by Ecosystem¶
| Ecosystem | Lock File | Notes |
|---|---|---|
| npm | package-lock.json | Auto-generated, commit it |
| Yarn | yarn.lock | Yarn's equivalent |
| pnpm | pnpm-lock.yaml | pnpm's equivalent |
| pip | None built-in | Use pip freeze or tools |
| pip-tools | requirements.txt | Generated by pip-compile |
| Poetry | poetry.lock | Full dependency resolver |
| uv | uv.lock | Modern, fast |
| Bundler | Gemfile.lock | Ruby's lock file |
| Cargo | Cargo.lock | Rust's lock file |
| Go | go.mod + go.sum | go.mod pins versions, go.sum holds checksums |
Python's Lock File Problem¶
Python's ecosystem is fragmented. pip doesn't have a native lock file—requirements.txt is typically used, but it's weak:
# requirements.txt (not a real lock file)
pandas>=2.0 # Version range, not locked
# requirements.txt (better, but manual)
pandas==2.0.3 # Pinned, but transitive deps not captured
# requirements.txt (pip freeze output)
numpy==1.24.3
pandas==2.0.3
python-dateutil==2.8.2
pytz==2023.3
six==1.16.0
tzdata==2023.3
# Better! But no hashes, manual updates are error-prone
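A quick way to tell which kind of requirements.txt you have is to scan it for lines that are not pinned with ==. The sketch below is a rough check, not a full requirements parser: the function name is hypothetical, and it ignores environment markers, URLs, and editable installs.

```python
import re
from typing import List

# A pinned line looks like name==version, optionally with extras: name[extra]==version
PINNED_LINE = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def unpinned_requirements(requirements_text: str) -> List[str]:
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):      # skip blanks and pip options
            continue
        if not PINNED_LINE.match(line):
            problems.append(line)
    return problems


sample = """
pandas>=2.0      # version range, not locked
numpy==1.24.3
requests
"""

print(unpinned_requirements(sample))
# ['pandas>=2.0', 'requests']
```

An empty result means every line is pinned. It does not mean transitive dependencies are captured or that hashes are present.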
For proper lock files in Python, use tools:
pip-tools:
# requirements.in (what you want)
pandas>=2.0
# pip-compile generates requirements.txt (locked)
pip-compile --generate-hashes requirements.in
# requirements.txt includes exact versions; --generate-hashes adds the hashes
Poetry:
# pyproject.toml (what you want)
[tool.poetry.dependencies]
pandas = "^2.0"
# poetry.lock (auto-generated, includes everything)
uv:
# pyproject.toml (what you want)
uv lock
# uv.lock (auto-generated, includes everything)
The Cardinal Rules¶
1. Lock Files Go in Version Control¶
# YES
git add package-lock.json
git commit -m "Update dependencies"
# NO - never ignore lock files
echo "package-lock.json" >> .gitignore
The lock file is part of your project. It defines what your project actually uses.
2. CI Installs from Lock File¶
Your CI should reproduce exactly what you tested locally:
# npm - use ci, not install
npm ci # Installs exactly what's in lock file
# pip-tools
pip install -r requirements.txt # The compiled file
# Poetry
poetry install # Uses poetry.lock
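npm install may re-resolve dependencies and rewrite the lock file. npm ci installs exactly what the lock file records and fails if it disagrees with package.json.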
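If one CI script serves several kinds of project, the choice of install command can be made from whichever lock file is present. This is only a sketch of that idea: the mapping and the `ci_install_command` name are hypothetical, and treating requirements.txt as a lock file assumes it was compiled or frozen rather than written by hand.

```python
from pathlib import Path
from typing import List

# Lock file -> install command that respects it (the commands listed above).
LOCKED_INSTALL_COMMANDS = {
    "package-lock.json": ["npm", "ci"],
    "poetry.lock": ["poetry", "install"],
    "requirements.txt": ["pip", "install", "-r", "requirements.txt"],
}


def ci_install_command(project_dir: str) -> List[str]:
    """Return the install command for the first known lock file found in project_dir."""
    root = Path(project_dir)
    for lock_name, command in LOCKED_INSTALL_COMMANDS.items():
        if (root / lock_name).is_file():
            return command
    # Refusing to continue is the point: CI should never resolve versions fresh.
    raise FileNotFoundError(f"no lock file found in {root}")
```

A CI step could pass the result to `subprocess.run(command, check=True, cwd=project_dir)`. Failing loudly when no lock file exists is deliberate.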
3. Update Lock Files Intentionally¶
Don't let CI regenerate lock files. Updates should be deliberate:
# Update a specific package
npm update lodash
git diff package-lock.json # Review what changed
git add package-lock.json
git commit -m "Update lodash to 4.17.21 for security fix"
4. Review Lock File Changes in PRs¶
When lock files change, ask:
- What versions changed?
- Were new dependencies added?
- Were any removed?
- Was this intentional?
Lock file changes can hide surprises—new transitive dependencies, unexpected version jumps, or even supply chain attacks.
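Those review questions can be answered mechanically for simple lock formats. The sketch below compares two pip freeze style files, assuming one name==version entry per line. The function names are hypothetical, and JSON or TOML lock files would need their own parser.

```python
from typing import Dict, List


def parse_pins(lock_text: str) -> Dict[str, str]:
    """Read 'name==version' lines into a {name: version} dict."""
    pins = {}
    for raw_line in lock_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def diff_pins(old_text: str, new_text: str) -> Dict[str, List]:
    """Report which packages were added, removed, or changed between two lock files."""
    old, new = parse_pins(old_text), parse_pins(new_text)
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted((n, old[n], new[n]) for n in shared if old[n] != new[n]),
    }


before = "numpy==1.24.3\npandas==2.0.3\nsix==1.16.0\n"
after = "numpy==1.24.3\npandas==2.1.0\ntzdata==2023.3\n"

print(diff_pins(before, after))
# {'added': ['tzdata'], 'removed': ['six'], 'changed': [('pandas', '2.0.3', '2.1.0')]}
```

A new name under "added" that nobody asked for is exactly the kind of surprise worth a second look.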
Vendoring: The Nuclear Option¶
Vendoring means copying dependencies directly into your project:
my-project/
├── src/
├── vendor/
│ ├── lodash/
│ ├── axios/
│ └── ... (all dependencies)
└── package.json
When to Vendor¶
- High-security environments — Air-gapped networks, compliance requirements
- Reproducibility-critical — Research that must be reproducible indefinitely
- Registry distrust — Can't rely on npm/PyPI being available
- Audit requirements — Need to review every line of dependency code
The Tradeoffs¶
| Advantage | Disadvantage |
|---|---|
| Complete control | You own all updates |
| Registry-independent | Repo bloat |
| Auditable | Manual security patches |
| Reproducible forever | Significant maintenance |
Vendoring is powerful but expensive. For most projects, lock files provide sufficient reproducibility without the maintenance burden.
Modern Alternatives¶
Instead of vendoring, consider:
- Private registry mirrors — Cache approved packages internally
- Artifact storage — Store built artifacts alongside code
- Reproducible builds — Verify you can rebuild from source
Not a Prayer
I've debugged enough "it works on my machine" problems to have strong opinions about lock files.
The pattern is always the same. Developer A writes code. It works. Developer B clones the repo. It doesn't work. Hours of debugging later, someone notices: different dependency versions.
Without lock files, you're not sharing a project, you're sharing a wish: "I hope you get the same versions I had." That's not engineering. That's prayer.
AI-assisted coding makes this worse. The AI generates code with imports but no lock files, with dependencies but no version constraints. It's wishes built on wishes.
Lock files aren't bureaucracy. They're communication. They say: "This exact combination of code and dependencies worked. Use this." When something breaks, you can diff the lock files and see what changed.
The five minutes you spend setting up proper lock file handling saves days of debugging "it worked yesterday" problems.
Quick Reference¶
Python Lock File Options¶
| Tool | Command | Lock File |
|---|---|---|
| pip freeze | pip freeze > requirements.txt | requirements.txt |
| pip-tools | pip-compile requirements.in | requirements.txt |
| Poetry | poetry lock | poetry.lock |
| uv | uv lock | uv.lock |
| PDM | pdm lock | pdm.lock |
npm/Node Lock File Commands¶
| Task | Command |
|---|---|
| Install exactly what the lock file records | npm ci |
| Install and possibly rewrite the lock file | npm install |
| Update one package deliberately | npm update lodash |
| Add a dependency pinned to an exact version | npm install --save-exact lodash@4.17.21 |
Lock File Checklist¶
- Lock file exists
- Lock file is in version control
- CI uses lock file (not resolving fresh)
- Lock file changes are reviewed
- Update process is documented
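The first two checklist items are easy to automate. Here is a minimal sketch, assuming the lock file names from the ecosystem table and a .gitignore that lists files by exact name. The function name is hypothetical, wildcard patterns are not handled, and requirements.txt is left out because its name alone does not prove it is locked.

```python
from pathlib import Path
from typing import List

KNOWN_LOCK_FILES = [
    "package-lock.json", "yarn.lock", "pnpm-lock.yaml", "poetry.lock",
    "uv.lock", "pdm.lock", "Gemfile.lock", "Cargo.lock",
]


def check_lock_files(project_dir: str) -> List[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    root = Path(project_dir)
    present = [name for name in KNOWN_LOCK_FILES if (root / name).is_file()]
    if not present:
        return ["no lock file found"]

    ignored = set()
    gitignore = root / ".gitignore"
    if gitignore.is_file():
        # Exact-name matching only; glob patterns such as *.lock are not evaluated.
        ignored = {line.strip().lstrip("/") for line in gitignore.read_text().splitlines()}

    return [f"{name} is listed in .gitignore" for name in present if name in ignored]
```

The remaining items (CI usage, review, documentation) are process questions that a script like this cannot verify.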