currently every operation performed through this module has to maintain its own cache. pacote half implements some form of caching, but now that we don't keep integrity values for locally built tarballs (i.e. git repositories) pacote will never actually use the cache.
in order to resolve this, i propose that we implement git specific caching semantics in this module.
these are just some thoughts i had about how this could work, that may or may not be useful when we implement.
ideally, we use cacache so that we maintain consistency in our caching locations. this gives us a bit of a challenge, however, in that we would need to cache specific entities in different ways:
- `package.json` (including corgis, so this is really two entities in one key, similar to how `make-fetch-happen` handles different accept headers and content-types)
- `npm-shrinkwrap.json`: if the package has one present, we need it accessible
- a built tarball, meaning we've cloned the repo, checked out the appropriate reference, installed dependencies and run prepare scripts (if necessary), and run `npm pack`
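
as a rough illustration of the entities above, here is a minimal sketch of how cache keys could be derived. everything in it is hypothetical: the `git-cache:` prefix, the `gitCacheKey` name, and the `corgi` flag are assumptions made for illustration, not part of cacache or pacote. it also simplifies by giving the corgi variant its own key instead of storing two entities under one key.

```typescript
// Hypothetical key scheme for the cached entities; none of these names
// come from cacache, pacote, or this module.
type GitCacheEntity = 'package.json' | 'npm-shrinkwrap.json' | 'tarball';

interface GitCacheKeyOptions {
  // Assumption: the corgi (abbreviated) manifest is stored under its own key.
  corgi?: boolean;
}

function gitCacheKey(
  repo: string,
  commit: string,
  entity: GitCacheEntity,
  options: GitCacheKeyOptions = {}
): string {
  // Only package.json has a corgi variant; the flag is ignored otherwise.
  const variant = entity === 'package.json' && options.corgi ? ':corgi' : '';
  // The commit hash is part of the key, so a moved ref never hits stale data.
  return `git-cache:${repo}#${commit}:${entity}${variant}`;
}
```

keying on the resolved commit rather than the requested ref means a branch or tag that moves simply produces a new key, which is the "etag of sorts" behaviour described in the flow in this issue.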
where we start to venture into uncharted waters is with regards to how we identify and correctly handle stale repositories. to this end, i think one approach we could take is to also store a tarball consisting of the raw git repository in the cache. this tarball would be used as a means of determining whether the requested reference has changed. the flow would look something like this:
- request the raw git repository from `cacache` or clone it
- extract the raw git repository (if we just cloned it, skip this)
- update the raw git repository (`git fetch`, again skip if we just cloned)
- resolve the requested git ref to a commit hash
- attempt to read the requested file from the cache, use the commit hash as an etag of sorts
- if the file does not exist, do what is necessary to retrieve it and store it
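
the steps above can be sketched roughly as follows. this is only a sketch under stated assumptions: `GitCacheDeps`, `getFromGitCache`, and every method name are hypothetical, and the cache and git operations are injected so the example does not claim any particular cacache or git api. the `saveRepo` step is also an assumption; the flow above does not say when the raw repository tarball gets written back.

```typescript
// Hypothetical interface: each method stands in for a cache or git
// operation that the real implementation would have to provide.
interface GitCacheDeps {
  // Extracts the cached raw repository; resolves to null on a cache miss.
  extractCachedRepo(repo: string): Promise<string | null>;
  cloneRepo(repo: string): Promise<string>;
  fetchRepo(dir: string): Promise<void>;
  // Assumption: re-store the raw repository after a clone or fetch.
  saveRepo(repo: string, dir: string): Promise<void>;
  resolveRef(dir: string, ref: string): Promise<string>;
  readEntry(key: string): Promise<string | null>;
  buildEntry(dir: string, commit: string, file: string): Promise<string>;
  writeEntry(key: string, data: string): Promise<void>;
}

async function getFromGitCache(
  deps: GitCacheDeps,
  repo: string,
  ref: string,
  file: string
): Promise<string> {
  // Steps 1-3: use the cached repository and fetch, or clone from scratch.
  let dir = await deps.extractCachedRepo(repo);
  if (dir === null) {
    dir = await deps.cloneRepo(repo);
  } else {
    await deps.fetchRepo(dir);
  }
  await deps.saveRepo(repo, dir);

  // Step 4: resolve the requested ref to a commit hash.
  const commit = await deps.resolveRef(dir, ref);

  // Step 5: the commit hash acts as an etag of sorts inside the key.
  const key = `${repo}#${commit}:${file}`;
  const cached = await deps.readEntry(key);
  if (cached !== null) {
    return cached;
  }

  // Step 6: on a miss, produce the file and store it for next time.
  const built = await deps.buildEntry(dir, commit, file);
  await deps.writeEntry(key, built);
  return built;
}
```

injecting the operations keeps the control flow testable without a real repository. a second request for the same ref would fetch instead of clone and, if the commit hash is unchanged, return the stored entry without rebuilding.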
this would mean that every installation of a git repository will require us to extract a tarball of the repository and do a `git fetch` before retrieving anything, but that's a considerably better state than we have today, where we clone the entire repository from scratch every time we need something from it.