feat: implement caching

currently every operation performed through this module has to maintain its own cache. pacote half implements some form of caching, but now that we don't keep integrity values for locally built tarballs (i.e. git repositories) pacote will never actually use the cache.

to resolve this, i propose that we implement git-specific caching semantics in this module.

these are just some thoughts i had about how this could work, which may or may not be useful when we implement it.

ideally, we use `cacache` so that we stay consistent in our caching locations. this gives us a bit of a challenge, however, in that we would need to cache specific entities in different ways:

1. `package.json` (including corgis, so this is really two entities in one key, similar to how `make-fetch-happen` handles different accept headers and content-types)
2. `npm-shrinkwrap.json`, if the package has one, since we need it to be accessible
3. a built tarball, meaning we've cloned the repo, checked out the appropriate reference, installed dependencies and run prepare scripts (if necessary), and run `npm pack`
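one possible way to lay out keys for these entities is sketched below. everything here is an assumption for illustration: `gitCacheKey`, the `git-cache:` prefix, and the `:full` / `:corgi` suffixes are hypothetical names, not part of `cacache` or pacote. the sketch approximates "two entities in one key" with a shared prefix plus a variant suffix, which may not be how we would actually want to store corgis.

```javascript
// Hypothetical key layout for git cache entries. Nothing here is an
// existing cacache or pacote API; it only illustrates the idea.
const COMMIT_KINDS = ['package.json', 'npm-shrinkwrap.json', 'tarball']

function gitCacheKey (repo, kind, { commit, corgi = false } = {}) {
  // The raw repository is keyed by repo alone, since it spans all commits.
  if (kind === 'repo') {
    return `git-cache:${repo}:repo`
  }
  if (!COMMIT_KINDS.includes(kind)) {
    throw new TypeError(`unknown entity kind: ${kind}`)
  }
  // Everything else is tied to one resolved commit (sha1 or sha256 length).
  if (!/^[0-9a-f]{40}([0-9a-f]{24})?$/.test(commit || '')) {
    throw new TypeError('a full commit hash is required')
  }
  // package.json has two variants: the full document and the corgi.
  const variant = kind === 'package.json' ? (corgi ? ':corgi' : ':full') : ''
  return `git-cache:${repo}#${commit}:${kind}${variant}`
}
```

keying the built artifacts by resolved commit rather than by the requested ref means a moving ref such as a branch name can never return a stale entry.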

where we start to venture into uncharted waters is in how we identify and correctly handle stale repositories. to this end, i think one approach we could take is to also store a tarball of the raw git repository in the cache. this tarball would be used to determine whether the requested reference has changed. the flow would look something like this:

- request the raw git repository from `cacache` or clone it
- extract the raw git repository (if we just cloned it, skip this)
- update the raw git repository (`git fetch`, again skip if we just cloned)
- resolve the requested git ref to a commit hash
- attempt to read the requested file from the cache, use the commit hash as an etag of sorts
- if the file does not exist, do what is necessary to retrieve it and store it
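the steps above could be sketched roughly as follows. this is only an illustration under stated assumptions: `cache` and `git` are hypothetical interfaces injected by the caller (not the real `cacache` or git helper APIs), `build` is a hypothetical callback that produces the requested entity, and writing the refreshed repository back to the cache after every fetch is my own assumption rather than something the flow spells out.

```javascript
// Sketch of the proposed flow. `cache`, `git` and `build` are hypothetical
// injected dependencies; `cache.get` is assumed to resolve to undefined on a miss.
async function readFromGitCache ({ cache, git, repo, ref, file, build }) {
  const repoKey = `${repo}:repo`
  let dir

  // request the raw git repository from the cache, or clone it
  const repoTarball = await cache.get(repoKey)
  if (repoTarball !== undefined) {
    // extract and update the cached copy (skipped when we just cloned)
    dir = await git.extract(repoTarball)
    await git.fetch(dir)
  } else {
    dir = await git.clone(repo)
  }
  // assumption: store the refreshed repository for the next request
  await cache.put(repoKey, await git.pack(dir))

  // resolve the requested ref to a commit hash
  const commit = await git.revParse(dir, ref)

  // read the requested file, using the commit hash as an etag of sorts
  const key = `${repo}#${commit}:${file}`
  const cached = await cache.get(key)
  if (cached !== undefined) {
    return { commit, data: cached, hit: true }
  }

  // not cached yet: do what is necessary to produce it, then store it
  const data = await build(dir, commit, file)
  await cache.put(key, data)
  return { commit, data, hit: false }
}
```

injecting `cache` and `git` keeps the sketch independent of any particular storage or git wrapper, and makes the hit/miss behaviour easy to exercise with in-memory fakes.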

this would mean that every installation of a git repository requires us to extract a tarball of the repository and do a `git fetch` before retrieving anything, but that's considerably better than what we have today, where we clone the entire repository from scratch every time we need something from it.

feat: implement caching #74

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

feat: implement caching #74

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions