Dependency Management

We have seen in previous chapters that Nix makes it easy to construct complex build dependencies with several benefits:

  • Unrelated dependencies can be built in parallel.
  • Reproducible builds make it easy to cache and reuse dependencies.

However, Nix does not provide any mechanism for dependency resolution, that is, choosing among multiple available versions of a dependency and determining the most suitable one.

As an example, we will build hypothetical Nix packages that resemble Minecraft crafting recipes, with versions that follow semver. Let's start with the first version of pickaxe, which is made of wood:

pickaxe
  - 1.0.0
    - stick ^1.0.1
    - planks ~2.1.0
stick
  - 1.0.3
  - 1.1.2
  - 2.0.0
planks
  - 2.1.0
  - 2.1.1
  - 2.2.1

The first step in deciding which versions to use to build pickaxe-1.0.0 is to rule out invalid versions. stick-2.0.0 is ruled out because it is outside the ^1.0.1 range, which allows versions from 1.0.1 up to but not including 2.0.0. Similarly, planks-2.2.1 is outside the ~2.1.0 range, which allows versions from 2.1.0 up to but not including 2.2.0.

After filtering out the invalid versions, there are still multiple versions of stick and planks available. As a result there can be multiple version candidates for building pickaxe-1.0.0. For example, we can use stick-1.0.3 and planks-2.1.1. But are those the best versions to be used?

Depending on the dependency resolution algorithm used, we may get different answers. In general, though, we can expect the algorithm to choose the latest versions within the required ranges, so we should expect to get stick-1.1.2 and planks-2.1.1.
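The filtering and "pick the latest" steps can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper names (parse, satisfies, candidates, latest) are made up for this example, only the ^ and ~ operators are handled, and the major version is assumed to be at least 1. Real package managers support many more constraint forms.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn '1.0.3' into (1, 0, 3) so that versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against a caret (^) or tilde (~) range."""
    operator, lower = constraint[0], parse(constraint[1:])
    if operator == "^":
        # ^1.0.1 allows >=1.0.1 and <2.0.0 (assumes major >= 1).
        upper = (lower[0] + 1, 0, 0)
    elif operator == "~":
        # ~2.1.0 allows >=2.1.0 and <2.2.0.
        upper = (lower[0], lower[1] + 1, 0)
    else:
        raise ValueError(f"unsupported constraint: {constraint}")
    return lower <= parse(version) < upper


def candidates(available: List[str], constraint: str) -> List[str]:
    """Keep only the versions inside the range, oldest first."""
    return sorted((v for v in available if satisfies(v, constraint)), key=parse)


def latest(available: List[str], constraint: str) -> str:
    """Pick the newest version inside the range."""
    return candidates(available, constraint)[-1]


stick = ["1.0.3", "1.1.2", "2.0.0"]
planks = ["2.1.0", "2.1.1", "2.2.1"]

print(candidates(stick, "^1.0.1"))   # ['1.0.3', '1.1.2']
print(candidates(planks, "~2.1.0"))  # ['2.1.0', '2.1.1']
print(latest(stick, "^1.0.1"), latest(planks, "~2.1.0"))  # 1.1.2 2.1.1
```

Versions are compared as tuples of integers rather than as strings, so that a version such as 1.10.0 would sort after 1.9.0.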

Nested Dependencies

In reality, dependency resolution can be more complicated because of nested dependencies. Let's say stick and planks both depend on wood:

stick
  - 1.0.3
    - wood ^1.5.0
  - 1.1.2
    - wood ~2.0.0

planks
  - 2.1.0
    - wood ^2.0.0
  - 2.1.1
    - wood ~2.3.0

wood
  - 1.5.0
  - 2.0.1
  - 2.3.2

In this case, the only solution is to use stick-1.1.2 and planks-2.1.0 together with wood-2.0.1, because no other combination has a wood version that both stick and planks can use.
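We can check this claim with a brute-force search over every combination. The sketch below is not how real resolvers work (they use much smarter search strategies). It simply enumerates all triples and keeps those where a single wood version satisfies both constraints. The names STICK, PLANKS, WOOD and solutions are made up for this example, and the range check again assumes only ^ and ~ with a major version of at least 1.

```python
from itertools import product
from typing import Iterator, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '2.0.1' into (2, 0, 1) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, constraint: str) -> bool:
    """Caret (^) and tilde (~) ranges only; assumes major version >= 1."""
    lower = parse(constraint[1:])
    if constraint[0] == "^":
        upper = (lower[0] + 1, 0, 0)
    elif constraint[0] == "~":
        upper = (lower[0], lower[1] + 1, 0)
    else:
        raise ValueError(f"unsupported constraint: {constraint}")
    return lower <= parse(version) < upper


# Each candidate version mapped to its constraint on wood.
STICK = {"1.0.3": "^1.5.0", "1.1.2": "~2.0.0"}
PLANKS = {"2.1.0": "^2.0.0", "2.1.1": "~2.3.0"}
WOOD = ["1.5.0", "2.0.1", "2.3.2"]


def solutions() -> Iterator[Tuple[str, str, str]]:
    """Yield every (stick, planks, wood) where one wood version fits both."""
    for (stick, stick_wood), (planks, planks_wood), wood in product(
        STICK.items(), PLANKS.items(), WOOD
    ):
        if satisfies(wood, stick_wood) and satisfies(wood, planks_wood):
            yield (stick, planks, wood)


print(list(solutions()))  # [('1.1.2', '2.1.0', '2.0.1')]
```

Note that the search has to try 2 × 2 × 3 = 12 combinations to find the single valid one, which hints at why resolution gets expensive as the dependency graph grows.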

Package Managers

Dependency resolution is a complex topic in its own right. Different languages have their own package managers that handle dependency resolution differently, e.g. cabal-install, npm, mvn, etc. There are also OS-level package managers that have to deal with dependency resolution, e.g. apt (for Debian and Ubuntu), rpm (Fedora), pacman (Arch Linux, Manjaro), etc.

Because it supports package management across multiple languages and platforms, Nix faces its own unique challenges in managing dependencies. At this point, Nix itself does not provide any mechanism for resolving dependencies. Instead, Nix users have to come up with their own higher-level design patterns to resolve dependencies, as nixpkgs does.

Package Registry

For a dependency resolution algorithm to determine which versions of a dependency to use, it must first consult a registry that lists all available versions of all packages. Each package manager has its own registry, e.g. Hackage, the npm registry, the Debian package archive, etc.

Package registries are usually mutable databases that are constantly updated. This creates an issue with reproducibility: the result given back from a dependency resolution algorithm depends on the mutable state of the registry at the time the algorithm is executed.

In other words, if we resolve the dependencies of pickaxe-1.0.0 today, we may get stick-1.1.2 and planks-2.1.0. But if we resolve the same dependencies tomorrow, we might get stick-1.1.3 because a new version of stick has been published. To make matters worse, stick-1.1.3 may contain unexpected changes that cause pickaxe-1.0.0 to break.

Version Pinning

Even without Nix, there is a strong case for pinning versions to a particular snapshot of the registry. This makes sure that, no matter when we resolve the dependencies, we always get back the same ones.

Package Lock

One common approach is to create a lock file containing the result of running the dependency resolution algorithm, and to check the lock file into the version control system (e.g. Git). For instance, the lock file could be package-lock.json (for npm) or cabal.project.freeze (for Haskell projects). With the lock file available, we can even skip dependency resolution in the future and just use the result recorded in it.
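The idea behind a lock file can be sketched as follows. This is a toy model, not the format or behavior of any real package manager: the file name pickaxe.lock.json, the function resolve_with_lock, and the two stand-in resolver functions are all hypothetical.

```python
import json
import os
import tempfile
from typing import Callable, Dict

Resolution = Dict[str, str]


def resolve_with_lock(lock_path: str, resolve: Callable[[], Resolution]) -> Resolution:
    """Reuse the lock file when present; otherwise resolve once and write it."""
    if os.path.exists(lock_path):
        with open(lock_path) as lock_file:
            return json.load(lock_file)
    resolution = resolve()
    with open(lock_path, "w") as lock_file:
        json.dump(resolution, lock_file, indent=2, sort_keys=True)
    return resolution


def resolve_today() -> Resolution:
    """Stand-in for a resolver run against today's registry."""
    return {"stick": "1.1.2", "planks": "2.1.0", "wood": "2.0.1"}


def resolve_tomorrow() -> Resolution:
    """Stand-in for a resolver run after stick-1.1.3 has been published."""
    return {"stick": "1.1.3", "planks": "2.1.0", "wood": "2.0.1"}


with tempfile.TemporaryDirectory() as workdir:
    lock_path = os.path.join(workdir, "pickaxe.lock.json")
    first = resolve_with_lock(lock_path, resolve_today)
    # The registry has changed, but the lock file takes precedence.
    second = resolve_with_lock(lock_path, resolve_tomorrow)

print(first == second)   # True
print(second["stick"])   # 1.1.2
```

Because the resolver is only consulted when no lock file exists, upgrading a dependency becomes an explicit action: delete or regenerate the lock file and commit the result.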

Registry Snapshot

An alternative approach is to pin the snapshot of the package registry itself. For example, cabal accepts an index-state option that specifies the timestamp of the Hackage snapshot to resolve dependencies from. With that, we can specify the time at which we first built our dependencies and not worry about new package versions being added in the future.
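To see why a timestamp is enough to freeze the registry, consider the toy model below. It is not how cabal implements index-state. It only illustrates the principle that versions published after the pinned time are invisible to the resolver. The publication dates and the names STICK_PUBLISHED and snapshot are invented for this example.

```python
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical publication dates for each stick version.
STICK_PUBLISHED: Dict[str, date] = {
    "1.0.3": date(2021, 1, 10),
    "1.1.2": date(2021, 3, 5),
    "1.1.3": date(2021, 6, 20),
}


def version_key(version: str) -> Tuple[int, ...]:
    """Sort key that compares versions numerically."""
    return tuple(int(part) for part in version.split("."))


def snapshot(published: Dict[str, date], index_state: date) -> List[str]:
    """Versions visible in the registry as of index_state, oldest first."""
    visible = [v for v, day in published.items() if day <= index_state]
    return sorted(visible, key=version_key)


pinned = date(2021, 4, 1)
print(snapshot(STICK_PUBLISHED, pinned)[-1])             # 1.1.2
print(snapshot(STICK_PUBLISHED, date(2021, 7, 1))[-1])   # 1.1.3
```

As long as the pinned date stays the same, the set of visible versions stays the same, no matter how many new versions are published afterwards.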

However, other variables can still affect the outcome. For example, the package manager itself may change its dependency resolution algorithm, so we may still get different results depending on the version of the package manager used.

Upgrading Dependencies

The strategies for pinning dependencies do not eliminate the need to resolve the dependencies in the first place, or the need to upgrade or install new dependencies.

In an ideal world, we would be able to just specify the dependencies we need and have the package manager give us the best versions that just work. But reality is messy, and dependencies can have breaking changes at any time.

Versioning Schemes

There have been many attempts at versioning schemes that carry breakage information, such as semver and PVP. However, they rely on package authors to follow the rules manually, and rules can be broken, intentionally or not.

Exponential Versions

In reality, each combination of dependency versions produces a unique version of the full package that needs to be tested. There is never just one version of pickaxe-1.0.0, but an exponential number of versions of pickaxe-1.0.0, depending on the versions of stick, planks, and their transitive dependencies.

To make matters worse, real-world software also tends to have implicit dependencies on the runtime environment, such as executables, shared libraries, and operating system APIs.

So for each version of pickaxe-1.0.0 with pinned dependencies, we would also have multiple versions of that software for different platforms, e.g. Linux, macOS, Windows, Android, iOS, etc. Each of these platforms also has multiple releases, e.g. Debian 10, Ubuntu 20.04, macOS Big Sur, etc.
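A quick back-of-the-envelope calculation shows how fast the numbers grow. The sketch below computes only an upper bound: it multiplies the number of candidate versions per dependency and ignores constraints between dependencies, which would rule some combinations out. The function name count_builds and the 20-dependency package are hypothetical.

```python
from math import prod
from typing import Dict


def count_builds(candidate_counts: Dict[str, int], platforms: int = 1) -> int:
    """Upper bound on distinct builds: one choice per dependency, per platform."""
    return prod(candidate_counts.values()) * platforms


# Candidate counts from the pickaxe example: two sticks, two planks, three woods.
pickaxe = {"stick": 2, "planks": 2, "wood": 3}
print(count_builds(pickaxe))               # 12
print(count_builds(pickaxe, platforms=5))  # 60

# A hypothetical package with 20 transitive dependencies, 3 candidates each.
large = {f"dep{i}": 3 for i in range(20)}
print(count_builds(large) == 3 ** 20)      # True
```

Even with only three candidate versions per dependency, twenty dependencies already give billions of possible combinations, far more than anyone could test.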

Mono Versioning

Despite all this complexity, we still like to pretend that only one or a few versions of pickaxe-1.0.0 ever exist. One way to tame this complexity is mono versioning.

Monorepo

The simplest kind of mono versioning is a single repository that contains all components and dependencies. For each commit in this repository, there is exactly one version of each component and dependency. We simply ignore the possibility of other valid combinations of component versions and do not support them.

Lockfiles in Monorepo

Package managers such as godep check the source code of dependencies into the repository. As an alternative, we can check just the lockfiles into the repository and have the package manager fetch the dependencies separately.

Checking in the lockfile is still effectively mono-versioning the dependencies. For each commit in the repository, we support only the exact dependencies specified in the lockfile of that commit. We simply pretend that no other versions of the dependencies are available.

Mono Registry

Taking the idea to the extreme, we can also freeze all dependencies in a package registry and provide only one version of each dependency at any point in time.

This is the approach taken by registries such as Stackage, where each snapshot provides exactly one version of each Haskell package, chosen so that the packages build together.

A mono registry tends to work cleanly together with a monorepo. In a project, we can specify just the snapshot version of the mono registry that we are using, and there is no need for messy details such as generating lockfiles in the first place.

Mono Environment

Nixpkgs is also a mono registry for the standard packages in Nix. For each version of nixpkgs, there is exactly one version of packages such as bash, gcc, etc. Since such packages are traditionally provided by the operating system, we can say that nixpkgs also provides a mono environment for our software.

When we create a monorepo with pinned nixpkgs, we are providing not only exactly one version of each dependency, but also exactly one version of the environment to run on.

A mono environment narrows the specification of our software. Instead of claiming that it runs on any version of Linux, or on any release of Debian 10, we pretend that there is exactly one version of the OS: the one specified in nixpkgs.

Pros and Cons of Mono Versioning

There is a fundamental difference in philosophy between multi-versioning and mono-versioning that makes it difficult for the two camps to reconcile. At its core, there are a few factors involved.

Stability

Mono-versioning places a much higher value on stability. People in this camp want to make sure each version of the software always works. They achieve that by significantly limiting the number of versions of the software, and by thoroughly testing the software before upgrading any version.

Mono-versioning tends to emphasize LTS (long-term support) releases, whose components are guaranteed not to have breaking changes and are supported for years.

Rapid Releases

Mono-versioning tends to suffer from slower releases. When a new version of a component becomes available, it has to be tested against the other components to make sure it breaks none of them before it can be released.

In contrast, multi-versioning allows a new component to be released immediately. Each piece of software can then upgrade its dependencies independently, at the risk that the new version breaks some of the software that depends on it.

Blurring the Line

There is no clear-cut answer as to whether mono-versioning or multi-versioning is better. In practice, we tend to take a hybrid approach in large software projects.

For example, in a company, each team may have its own monorepo in which it manages its own dependencies. The full release of the software suite is then a multi-versioned combination of each team's projects, which can break during development. Finally, in production, each subproject is pinned to a specific version before deployment.

Although Nix is more suitable for mono-versioning development, some of its features also make it easier to manage multi-versioned projects, by building a mono "Nixified" version of the projects.