Yarn Zero-Installs & Plug’n’Play

[Source: https://yarnpkg.com/features/zero-installs]

Zero-Installs

While not a feature in itself, the term “Zero Install” encompasses a lot of Yarn features tailored around one specific goal – to make your projects as stable and fast as possible by removing the main source of entropy from the equation: Yarn itself.

Important: Zero-install is an optional philosophy. It has some drawbacks, and while we believe this workflow to be a defining feature for professional-grade projects we don’t have any plans to ignore or deprecate the typical yarn install workflow in any way, now or in the future.

How does Yarn impact a project’s stability?

Yarn does its best to guarantee that running yarn install twice will give you the same result in both cases. The main way it does this is through a lockfile, which contains all the information needed for a project to be installed in a reproducible way across systems. But is it good enough?

While Yarn does its best to guarantee that what works now will keep working, there’s always the off chance that a future Yarn release will introduce a bug that will prevent you from installing your project. Or maybe your production environments will change and yarn install won’t be able to write in the temporary directories anymore. Or maybe the network will fail and your packages won’t be available anymore. Or maybe your credentials will rotate and you will start getting authentication issues. Or … so many things can go wrong, and not all of them are things we can control.

Note that these challenges are not unique to Yarn — you may remember a time when npm used to erase production servers due to a bug that reached one of their releases. This is exactly what we mean: any code that runs is code that can fail. And thanks to Murphy’s law, we know that something that can fail will eventually fail. From there, it becomes clear that the only sure way to prevent such issues is to run as little code as possible.

How do you reach this “zero-install” state you’re advocating for?

In order to make a project zero-install, you must be able to use it as soon as you clone it. This is very easy starting from Yarn 2!

  • First, ensure that your project is using Plug’n’Play to resolve dependencies via the cache folder and not from node_modules.
    • While in theory you could check in your node_modules folder rather than the cache, in practice node_modules contains a gigantic number of files that frequently change location and interfere with Git’s optimizations. By contrast, the Yarn cache contains exactly one file per package, and each file only changes when the package itself changes.
  • The cache folder is by default stored within your project folder (in .yarn/cache). Just make sure you add it to your repository (see also, Offline Cache).
    • Again, this whole workflow is optional. If at some point you decide that in the end you prefer to keep using a global cache, just toggle on enableGlobalCache in the yarnrc settings and it’ll be back to normal.
  • When running yarn install, Yarn will generate a .pnp.cjs file. Add it to your repository as well – it contains the dependency tree that Node will use to load your packages.
  • Depending on whether your dependencies have install scripts or not (we advise you to avoid them if you can, and prefer wasm-powered alternatives) you may also want to add the .yarn/unplugged entries.

And that’s it! Push your changes to your repository, check out a fresh clone somewhere, and check whether running yarn start (or whatever other script you’d normally use) works.
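
As a rough way to check the steps above on a fresh clone, the following sketch reports which of the committed artifacts are missing. The helper name `missingZeroInstallArtifacts` is hypothetical, and the check assumes the default project-local cache location (.yarn/cache) described above; it is not something Yarn itself provides.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: lists the zero-install artifacts named in the steps
// above that are absent from a project checkout.
function missingZeroInstallArtifacts(projectDir) {
  // Assumes the default project-local cache folder (.yarn/cache).
  const required = [".pnp.cjs", path.join(".yarn", "cache")];
  return required.filter((rel) => !fs.existsSync(path.join(projectDir, rel)));
}

// Usage: inspect the current working directory.
const missing = missingZeroInstallArtifacts(process.cwd());
console.log(
  missing.length === 0
    ? "zero-install artifacts present"
    : `missing: ${missing.join(", ")}`
);
```

An empty result only shows that the files exist; whether the project actually starts without yarn install still has to be verified by running it.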

Concerns

Is it different from just checking-in the node_modules folder?

Yes, very much. To give you an idea, a node_modules folder of 135k uncompressed files (for a total of 1.2GB) gives a Yarn cache of 2k binary archives (for a total of 139MB). Git simply cannot support the former, while the latter is perfectly fine.

Another huge difference is the number of changes. Back in Yarn 1, when updating a package, a huge amount of files had to be recreated, or even simply moved. When the same happens in a Yarn 2 install, you get a very predictable result: exactly one changed file for each added/removed package. This in turn has beneficial side effects in terms of performance and security, since you can easily spot the invalid checksums on a per-package basis.

Does it have security implications?

Note that, by design, this setup requires that you trust people modifying your repository. In particular, projects accepting PRs from external users will have to be careful that the PRs affecting the package archives are legitimate (since it would otherwise be possible for a malicious user to send a PR for a new dependency after having altered its archive content). The best way to do this is to add a CI step (for untrusted PRs only) that uses the --check-cache flag:

$> yarn install --check-cache

This way Yarn will re-download the package files from whatever their remote location would be and will report any mismatching checksum.
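
The idea behind such a check can be illustrated with a small sketch: hash each archive again and compare it against a recorded digest. This is an illustration only. The function names and the plain SHA-512 hex digest are assumptions, and Yarn's actual checksum format and download logic are not reproduced here.

```javascript
const crypto = require("crypto");

// Illustration only: a plain SHA-512 hex digest, not Yarn's real checksum format.
function sha512Hex(buffer) {
  return crypto.createHash("sha512").update(buffer).digest("hex");
}

// Returns the names of archives whose content no longer matches the recorded digest.
function findMismatches(archives, recorded) {
  return Object.keys(archives).filter(
    (name) => sha512Hex(archives[name]) !== recorded[name]
  );
}

// Usage with made-up archive contents.
const original = Buffer.from("original archive bytes");
const recorded = { "left-pad.zip": sha512Hex(original) };
const tampered = { "left-pad.zip": Buffer.from("tampered bytes") };
console.log(findMismatches(tampered, recorded));
```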

Plug’n’Play


PnP API

Are you a library author trying to make your library compatible with the Plug’n’Play installation strategy? Do you want to use the PnP API for something awesome? If the answer to any of these questions is yes, make sure to visit the PnP API page after reading the introduction!

Unveiled in September 2018, Plug’n’Play is an innovative installation strategy for Node. Based on prior work in other languages (for example autoload for PHP), it presents interesting characteristics that build upon the regular CommonJS require workflow in an almost completely backward-compatible way.

The node_modules problem

The way installs used to work was simple: when running yarn install Yarn would generate a node_modules directory that Node was then able to consume thanks to its built-in Node Resolution Algorithm. In this context, Node didn’t have to know the first thing about what a “package” was: it only reasoned in terms of files. “Does this file exist here? No: Ok, let’s look in the parent node_modules then. Does it exist here? Still no: Ok …”, and it kept going until it found the right one. This process was vastly inefficient for several reasons:

  • The node_modules directories typically contained gargantuan amounts of files. Generating them could account for more than 70% of the time needed to run yarn install. Even having preexisting installations wouldn’t save you, as package managers still had to diff the contents of node_modules against what they should contain.
  • Because the node_modules generation was an I/O-heavy operation, package managers didn’t have much leeway to optimize it beyond a simple file copy. Even though they could have used hardlinks or copy-on-write when possible, they would still have needed to diff the current state of the filesystem before making a bunch of syscalls to manipulate the disk.
  • Because Node had no concept of packages, it also didn’t know whether a file was meant to be accessed. It was entirely possible that the code you wrote worked one day in development but broke later in production because you forgot to list one of your dependencies in your package.json.
  • Even at runtime, the Node resolution had to make a bunch of stat and readdir calls to figure out where to load every single required file from. It was extremely wasteful and was part of why booting Node applications took so much time.
  • Finally, the very design of the node_modules folder was impractical in that it didn’t allow package managers to properly de-duplicate packages. Even though some algorithms could be employed to optimize the tree layout (hoisting), we still ended up unable to optimize some particular patterns – causing not only the disk usage to be higher than needed, but also some packages to be instantiated multiple times in memory.
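
The directory walk described at the start of this section can be sketched as follows. This is a simplified model of the lookup (it only lists the node_modules folders that would be probed, and uses POSIX paths); the function name is hypothetical.

```javascript
const path = require("path");

// Simplified sketch: lists the node_modules directories that would be probed,
// from the requiring file's directory up to the filesystem root.
function nodeModulesCandidates(fromDir) {
  const candidates = [];
  let dir = path.posix.resolve(fromDir);
  while (true) {
    // Directories that are themselves named node_modules are skipped.
    if (path.posix.basename(dir) !== "node_modules") {
      candidates.push(path.posix.join(dir, "node_modules"));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root
    dir = parent;
  }
  return candidates;
}

console.log(nodeModulesCandidates("/home/app/src/lib"));
```

Each candidate may cost one or more filesystem calls per required package, which is the runtime overhead the list above refers to.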

Fixing node_modules

Yarn already knows everything there is to know about your dependency tree – it even installs it on the disk for you. So, why is it up to Node to find where your packages are? Instead, it should be the package manager’s job to inform the interpreter about the location of the packages on the disk and manage any dependencies between packages and even versions of packages. This is why Plug’n’Play was created.

In this install mode (the default starting from Yarn 2.0), Yarn generates a single .pnp.cjs file instead of the usual node_modules folder containing copies of various packages. The .pnp.cjs file contains various maps: one linking package names and versions to their location on the disk and another one linking package names and versions to their list of dependencies. With these lookup tables, Yarn can instantly tell Node where to find any package it needs to access, as long as they are part of the dependency tree, and as long as this file is loaded within your environment (more on that in the next section).
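
To make the two maps concrete, here is a minimal sketch of a resolver built on them. The table contents, the key format, and the function name are hypothetical; the real .pnp.cjs file uses a different layout.

```javascript
// Hypothetical, simplified tables; the real .pnp.cjs layout is different.
const packageLocations = new Map([
  ["app@workspace", "/project/"],
  ["lodash@4.17.21", "/project/.yarn/cache/lodash.zip/node_modules/lodash/"],
]);
const packageDependencies = new Map([
  ["app@workspace", new Map([["lodash", "4.17.21"]])],
  ["lodash@4.17.21", new Map()],
]);

// Resolves `request` as seen from `issuer` using only the two tables,
// without touching the filesystem.
function resolveFromTables(issuer, request) {
  const deps = packageDependencies.get(issuer);
  if (!deps) throw new Error(`Unknown issuer: ${issuer}`);
  const version = deps.get(request);
  if (version === undefined) {
    throw new Error(`${issuer} does not list ${request} as a dependency`);
  }
  return packageLocations.get(`${request}@${version}`);
}

console.log(resolveFromTables("app@workspace", "lodash"));
```

Because every lookup is a map access, a package that is not listed fails immediately with a descriptive error instead of triggering a search through parent directories.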

This approach has various benefits:

  • Installs are now nearly instantaneous. Yarn only needs to generate a single text file (instead of potentially tens of thousands). The main bottleneck becomes the number of dependencies in a project rather than disk performance.
  • Installs are more stable and reliable due to reduced I/O operations. Especially on Windows (where writing and removing files in batches may trigger various unintended interactions with Windows Defender and similar tools), I/O heavy node_modules operations were more prone to failure.
  • Perfect optimization of the dependency tree (aka perfect hoisting) and predictable package instantiations.
  • The generated .pnp.cjs file can be committed to your repository as part of the Zero-Installs effort, removing the need to run yarn install in the first place.
  • Faster application startup! The Node resolution doesn’t have to iterate over the filesystem hierarchy nearly as much as before (and soon won’t have to do it at all!).

Initializing PnP

Yarn generates a single .pnp.cjs file that needs to be installed for Node to know where to find the relevant packages. This registration is generally transparent: any direct or indirect node command executed through one of your scripts entries will automatically register the .pnp.cjs file as a runtime dependency. For the vast majority of use cases, the following will work just as you would expect:

{
  "scripts": {
    "start": "node ./server.js",
    "test": "jest"
  }
}

For some remaining edge cases, a small setup may be required:

  • If you need to run an arbitrary Node script, use yarn node as the interpreter, instead of node. This will be enough to register the .pnp.cjs file as a runtime dependency.
yarn node ./server.js
  • If you operate on a system that automatically executes a Node script (for instance on Google Cloud Platform), simply require the PnP file at the top of your init script and call its setup function.
require('./.pnp.cjs').setup();

As a quick tip, all yarn node typically does is set the NODE_OPTIONS environment variable to use the --require option from Node, associated with the path of the .pnp.cjs file. You can easily apply this operation yourself if you prefer:

node -r ./.pnp.cjs ./server.js
NODE_OPTIONS="--require $(pwd)/.pnp.cjs" node ./server.js
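
If you launch child processes from your own tooling, the same operation can be done programmatically. The sketch below builds a NODE_OPTIONS value that preloads the PnP file, and checks `process.versions.pnp` to see whether the PnP runtime is already loaded in the current process. The helper names are hypothetical, and the sketch assumes the project path contains no spaces.

```javascript
const path = require("path");

// Builds a NODE_OPTIONS value that preloads the PnP file, keeping any
// options that were already set.
// Assumption: the project path contains no spaces (quoting is not handled).
function withPnpRequire(projectDir, existingOptions = "") {
  const pnpFile = path.resolve(projectDir, ".pnp.cjs");
  return `${existingOptions} --require ${pnpFile}`.trim();
}

// The PnP runtime sets process.versions.pnp once it has been loaded.
function isPnpLoaded() {
  return typeof process.versions.pnp !== "undefined";
}

console.log(withPnpRequire(process.cwd(), process.env.NODE_OPTIONS));
console.log(`PnP loaded: ${isPnpLoaded()}`);
```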

PnP loose mode

Because the hoisting heuristics aren’t standardized and predictable, PnP operating under strict mode will prevent packages from requiring dependencies that are not explicitly listed, even if other dependencies also depend on them. This may cause issues with some packages.

To address this problem, Yarn ships with a “loose” mode which will cause the PnP linker to work in tandem with the node-modules hoister – we will first generate the list of packages that would have been hoisted to the top level in a typical node_modules install, then remember this list as what we call the “fallback pool”.

Note that because the loose mode directly calls the node-modules hoister, it follows the exact same implementation as the true algorithm used by the node-modules linker!

At runtime, packages that require unlisted dependencies will still be allowed to access them if any version of the dependency ended up in the fallback pool (which packages exactly are allowed to rely on the fallback pool can be tweaked with pnpFallbackMode).

Note that the content of the fallback pool is undetermined. If a dependency tree contains multiple versions of the same package, there is no means to determine which one will be hoisted to the top-level. Therefore, a package accessing the fallback pool will still generate a warning (via the process.emitWarning API).

This mode provides a compromise between the strict PnP linker and the node_modules linker.
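
The behavior described above can be modeled with a short sketch. The data and function names are hypothetical and the real PnP runtime is more involved; the sketch only shows the order of checks: listed dependencies first, then the fallback pool with a warning, then an error.

```javascript
// Hypothetical data: `listed` holds each package's declared dependencies,
// `fallbackPool` the packages a node_modules install would have hoisted
// to the top level.
const listed = new Map([["some-plugin", new Set(["lodash"])]]);
const fallbackPool = new Set(["lodash", "debug"]);

// `warn` defaults to process.emitWarning, as mentioned in the text above.
function resolveLoose(issuer, request, warn = (msg) => process.emitWarning(msg)) {
  const deps = listed.get(issuer) || new Set();
  if (deps.has(request)) return "listed";
  if (fallbackPool.has(request)) {
    // Loose mode lets the access through, but flags it.
    warn(`${issuer} requires ${request} without listing it`);
    return "fallback";
  }
  throw new Error(`${issuer} cannot access ${request}`);
}

console.log(resolveLoose("some-plugin", "lodash"));
```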

In order to enable loose mode, make sure that the nodeLinker option is set to pnp (the default) and add the following into your local .yarnrc.yml file:

pnpMode: loose

More information about the pnpMode option.

Caveat

Because we emit warnings (instead of throwing errors) on resolution errors, applications can’t catch them. This means that the common pattern of trying to require an optional peer dependency inside a try/catch block will print a warning at runtime if the dependency is missing, even though it shouldn’t. The only runtime implication is that such a warning can cause confusion, but it can safely be ignored.
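
For reference, the try/catch pattern in question usually looks like the sketch below. The helper name and the package name are hypothetical. Under loose mode, a lookup like this may print a warning even though the code handles the missing dependency gracefully.

```javascript
// Loads an optional dependency, returning null when it is not installed.
function requireOptional(name) {
  try {
    return require(name);
  } catch (error) {
    // Only swallow "module not found"; rethrow anything else.
    if (error.code === "MODULE_NOT_FOUND") return null;
    throw error;
  }
}

// "some-optional-peer" is a placeholder package name.
const optionalPeer = requireOptional("some-optional-peer");
console.log(
  optionalPeer === null ? "running without the optional peer" : "optional peer found"
);
```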

For this reason, PnP loose mode won’t be the default starting with version 2.1 (as we originally planned). It will continue to be supported as an alternative, hopefully easing the transition to the default and recommended workflow: PnP strict mode.

Alternatives

In the years leading up to Plug’n’Play being ratified as the main install strategy, other projects came up with alternative implementations of the Node Resolution Algorithm – usually to circumvent shortcomings of the require.resolve API. Examples include Webpack (enhanced-resolve), Babel (resolve), Jest (jest-resolve), and Metro (metro-resolver). These alternatives should be considered as superseded by proper integration with Plug’n’Play.

Compatibility Table

The following compatibility table gives you an idea of the integration status with various tools from the community. Note that only CLI tools are listed here, as frontend libraries (such as react, vue, lodash, …) don’t reimplement the Node resolution and as such don’t need any special logic to take advantage of Plug’n’Play:


Native support

Many common frontend tools now support Plug’n’Play natively!

Project name       | Note
Angular            | Starting from 13+
Babel              | Starting from resolve 1.9
Create-React-App   | Starting from 2.0+
Docusaurus         | Starting from 2.0.0-beta.14
ESLint             | Some compatibility issues w/ shared configs (fixable using @rushstack/eslint-patch)
Gatsby             | Supported with version ≥2.15.0, ≥3.7.0
Gulp               | Supported with version 4.0+
Husky              | Starting from 4.0.0-1+
Jest               | Starting from 24.1+
Next.js            | Starting from 9.1.2+
Parcel             | Starting from 2.0.0-nightly.212+
Preact CLI         | Starting from 3.1.0+
Prettier           | Starting from 1.17+
Rollup             | Starting from resolve 1.9+
Storybook          | Starting from 6.0+
TypeScript         | Via plugin-compat (enabled by default)
TypeScript-ESLint  | Starting from 2.12+
VSCode-Stylelint   | Starting from 1.1+
WebStorm           | Starting from 2019.3+; See Editor SDKs
Webpack            | Starting from 5+ (plugin available for 4.x)

Support via plugins

Project name   | Note
ESBuild        | Via @yarnpkg/esbuild-plugin-pnp
VSCode-ESLint  | Follow Editor SDKs
VSCode         | Follow Editor SDKs
Webpack 4.x    | Via pnp-webpack-plugin (native starting from 5+)

Incompatible

The following tools cannot be used with pure Plug’n’Play install (even under loose mode).

Important: Even if a tool is incompatible with Plug’n’Play, you can still enable the node-modules plugin. Just follow the instructions and you’ll be ready to go in a minute 🙂

Project name                     | Note
Flow                             | Follow yarnpkg/berry#634
React Native                     | Follow react-native-community/cli#27
Pulumi                           | Follow pulumi/pulumi#3586
VSCode Extension Manager (vsce)  | Use the vsce-yarn-patch fork with the node-modules plugin enabled. The fork is required until microsoft/vscode-vsce#493 is merged, as vsce currently uses the removed yarn list command
Hugo                             | Hugo pipes expect a node-modules dir. Enable the node-modules plugin
ReScript                         | Follow rescript-lang/rescript-compiler#3276

This list is kept up-to-date based on the latest release we’ve published starting from v2. In case you notice something off in your own project please try to upgrade Yarn and the problematic package first, then feel free to file an issue. And maybe a PR? 😊

Frequently Asked Questions

Why not use import maps?

Yarn Plug’n’Play provides semantic errors (explaining the exact reason why a package isn’t reachable from another) and a sensible JS API to solve various shortcomings with require.resolve. These are features that import maps wouldn’t solve by themselves. This is answered in more detail in this thread.

A main reason we’re in this mess today is that the original node_modules design tried to abstract packages away in order to provide a generic system that would work without any notion of packages. This became a challenge that prompted many implementers to come up with their own interpretations. Import maps suffer from the same flaw.

Packages are stored inside Zip archives: How can I access their files?

When using PnP, packages are stored and accessed directly inside the Zip archives from the cache. The PnP runtime (.pnp.cjs) automatically patches Node’s fs module to add support for accessing files inside Zip archives. This way, you don’t have to do anything special:

const {readFileSync} = require(`fs`);

// Looks similar to `/path/to/.yarn/cache/lodash-npm-4.17.11-1c592398b2-8b49646c65.zip/node_modules/lodash/ceil.js`
const lodashCeilPath = require.resolve(`lodash/ceil`);

console.log(readFileSync(lodashCeilPath));

Fallback Mode

Back when PnP was first implemented, the compatibility wasn’t as good as it is now. To help with the transition, we designed a fallback mechanism: if a package tries to access an unlisted dependency, it’s still allowed to resolve it if the top-level package lists it as a dependency. We allow this because there’s no resolution ambiguity, as there’s a single top-level package in any project. Unfortunately, this may cause confusing behaviors depending on how your project is set up. When that happens, PnP is always right, and the only reason the code works when not in a workspace is this extra leniency.

This behavior was just a patch, and will eventually be removed to clear up any confusion. You can prepare for that now by setting pnpFallbackMode to none, which will disable the fallback mechanism altogether.
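
The fallback decision can be sketched roughly as follows. This assumes the three pnpFallbackMode values from the yarnrc settings (none, dependencies-only, all), with dependencies-only meaning that workspaces may not use the fallback; the data and function name are hypothetical, and the real runtime logic is more involved.

```javascript
// Hypothetical project description, not the real PnP data model:
// the dependencies listed by the single top-level package.
const topLevelDependencies = new Set(["react"]);

// `mode` mirrors the pnpFallbackMode values; `issuerIsWorkspace` says whether
// the requiring package is one of the project's own workspaces.
// Assumption: "dependencies-only" excludes workspaces from the fallback.
function mayUseTopLevelFallback(mode, issuerIsWorkspace, request) {
  if (mode === "none") return false;
  if (mode === "dependencies-only" && issuerIsWorkspace) return false;
  return topLevelDependencies.has(request);
}

console.log(mayUseTopLevelFallback("dependencies-only", false, "react"));
console.log(mayUseTopLevelFallback("none", false, "react"));
```

Setting the mode to none makes every unlisted access fail in the same way, which is the behavior to prepare for once the fallback is removed.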