diff --git a/README.md b/README.md
index 8a2e1dbc..ee3ecd01 100644
--- a/README.md
+++ b/README.md
@@ -453,17 +453,21 @@ This tells Docker to use the
 
 We built Plz following these principles:
 
-- Code and data must be stored for future reference.
+- Data that isn't reproducible is worthless.
+- You don't know the value of your data at the time of creation.
 - Whatever part of the running environment can be captured by Plz, we capture
   it as to make jobs repeatable.
+- Hardware is expensive.
+- Code is a means to an end. What matters is the outcome you obtain from running
+  your code.
 - Functionality is based on standard mechanisms like files and environment
   variables. You don't need to add extra dependencies to your code or learn
   how to read/write your data in specific ways.
 - The tool must be flexible enough so that no unnecessary restrictions are
-  imposed by the architecture. You should be able to do with Plz whatever you
-  can do by running a program manually. It was surprising to find out how many
-  issues, mostly around running jobs in the cloud, could be solved only by
-  tweaking the configuration, without requiring any changes to the code.
+  imposed by its architecture. You should be able to do with Plz whatever you
+  can do by running a program manually. It was surprising to find out how much
+  of the friction around running jobs in the cloud could be removed just by
+  tweaking the configuration, without requiring any changes to Plz code.
 
 Plz is routinely used at `prodo.ai` to train ML models on AWS, some of them
 taking days to run in the most powerful instances available. We trust it to
@@ -471,6 +475,143 @@ start and terminate these instances as needed, and to manage our spot
 instances, allowing us to get a much better price than if we were using
 on-demand instances all the time.
 
+## How does Plz help
+
+Without Plz, the steps you'd need to run your code on an AWS instance would
+be:
+
+- go to the AWS console and start an instance (or create a launch template
+  beforehand and then use the CLI)
+- wait until the instance is up
+- get the IP address of the instance from the console
+- copy your code and data to the instance over ssh
+- ssh into the instance and run your job, preferably inside Docker so that a
+  dropped connection doesn't kill it (but if you want Docker you also have to
+  maintain a `Dockerfile` and build the image)
+- each time the connection drops or you turn off your computer, ssh in again;
+  if you didn't use Docker, you've lost your terminal and your job has very
+  likely died
+- watch your job until it finishes (and lose money whenever it has already
+  finished but the instance keeps running because you didn't check often
+  enough)
+- copy your results back to your machine over ssh, being disciplined about
+  where you store them and making sure you can link them to the (version of
+  the) code that produced them if you have several runs to compare; or, if
+  you started from a program that was running locally, change it to write to
+  a non-ephemeral location
+- if you care about your standard output/logs, gather and retrieve them
+  somehow
+- make a note of your results (like stats or accuracy), or copy the files
+  containing them
+
+All of that is reduced to `plz run`. If you stopped following the output of
+`plz run` (by hitting Ctrl-C, or by turning off your computer), you can run
+`plz output` to get the output at any time.
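+
+As a rough illustration of how much of the list above a single `plz run`
+replaces, here is what just the first few steps look like when scripted by
+hand with `boto3` (the AWS SDK for Python). This is only a sketch for
+comparison, not code that ships with Plz; the AMI id, key name, user and
+paths are made up:
+
+```python
+"""Illustration only: part of the manual workflow that `plz run` replaces."""
+import subprocess
+
+import boto3
+
+ec2 = boto3.client("ec2")
+
+# Start an instance (in the console this is several clicks plus a key pair).
+instance = ec2.run_instances(
+    ImageId="ami-0123456789abcdef0",  # hypothetical AMI
+    InstanceType="p3.2xlarge",
+    KeyName="my-key",                 # hypothetical key pair
+    MinCount=1,
+    MaxCount=1,
+)["Instances"][0]
+instance_id = instance["InstanceId"]
+
+# Wait until the instance is up, then get its IP address.
+ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
+description = ec2.describe_instances(InstanceIds=[instance_id])
+ip = description["Reservations"][0]["Instances"][0]["PublicIpAddress"]
+
+# Copy code and data, then run the job. This still leaves building a Docker
+# image, surviving dropped connections, collecting results and logs, and
+# remembering to terminate the instance.
+subprocess.run(["scp", "-r", ".", f"ubuntu@{ip}:job/"], check=True)
+subprocess.run(["ssh", f"ubuntu@{ip}", "cd job && python main.py"], check=True)
+```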
+
+If you want to rerun your job later (for instance, to try different
+parameters), you would need to have saved a copy of the code (or have been
+very disciplined with your git history, with tags or commits for every single
+one-line tweak you try; more about that
+[below](#why-is-plz-the-way-it-is)), and possibly also the same data you
+used. You'd need to retrieve the code from wherever you keep it (for
+instance, you may need to find the git branch and switch to it, possibly
+after creating another copy of the repo if you don't want to stop working on
+what you are doing).
+
+Another important benefit is that Plz gives you a standard way to run your
+code. Just as seeing a Makefile tells you that you can type `make`, seeing a
+`plz.config.json` tells you that you can do `plz run`. Your code can then be
+launched from whatever machine a teammate happens to be sitting at
+(especially if the job runs in the cloud). Teammates need to install `plz`,
+sure, but after a couple of installs your team will know how to do it, and
+that's one program to set up per team member instead of one setup per
+project.
+
+## Why is Plz the way it is
+
+This section is an attempt to describe the rationale behind the high-level
+architecture of Plz.
+
+- why Docker: it simplifies input and output, which leads to concrete
+  simplifications like log handling: we obtain a stream of logs from running
+  jobs just by calling the Docker API, with facilities to filter by time.
+  Running commands over ssh requires either keeping the connection open to
+  gather the output, or redirecting the output and reading it later from a
+  file. In general, Docker provides not only isolation, but also an
+  environment where the job runs autonomously with controlled inputs and
+  outputs
+- why not use git to store code snapshots (and to transfer code to the
+  instance): because it's very common for users to make changes that they
+  don't necessarily want in their commit history. For instance, when users
+  try to make their job run in the cloud, or to run it at a different scale
+  than they are used to (say, with far more data than they use locally), they
+  might try several one-line tweaks. These commits (possibly paired with
+  messages that will be meaningless in a month, like "Change foobar from 0
+  to 1") are hardly useful and pollute the repo history. Plz could also
+  create a different branch for each job, but (in order to allow for
+  `plz rerun`) those branches would have to be kept, would be listed in
+  `git branch`, and so on. _A good summary answer to the question would be:
+  users want to commit things that "work" (commits you can revert to, use for
+  reference, etc.), and you don't know whether something works until you've
+  run it._ The solution for code storage we implemented, based on Docker
+  images, is quite simple to implement and to understand, as the Docker API
+  lets you create an image just by sending the files as a tarball (see the
+  sketch below). If we were using git then, for private repos, we would need
+  to handle git credentials on the instance, which would actually be more
+  complicated than using Docker. Docker images are given a name so that they
+  can be referenced later, which also makes `plz rerun` easy to implement.
+  The code can be retrieved by looking inside the image, which is a reliable
+  source of truth, as it stores the code that was actually run
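+
+Below is a minimal sketch of that image-from-tarball idea using the Docker
+SDK for Python. It illustrates the mechanism rather than Plz's actual
+implementation; the base image, paths and tag handling are made up:
+
+```python
+"""Illustration: snapshot code by sending it to Docker as a build context."""
+import io
+import tarfile
+
+import docker  # Docker SDK for Python
+
+# A made-up Dockerfile; Plz builds on whatever image the user configures.
+DOCKERFILE = b"FROM python:3.10-slim\nCOPY src /src\nWORKDIR /src\n"
+
+
+def snapshot(code_dir: str, tag: str) -> None:
+    # Pack a Dockerfile plus the user's code into an in-memory tarball.
+    context = io.BytesIO()
+    with tarfile.open(fileobj=context, mode="w") as tar:
+        dockerfile = tarfile.TarInfo(name="Dockerfile")
+        dockerfile.size = len(DOCKERFILE)
+        tar.addfile(dockerfile, io.BytesIO(DOCKERFILE))
+        tar.add(code_dir, arcname="src")
+    context.seek(0)
+
+    # custom_context tells the daemon that the file object is a full build
+    # context (a tarball), not just a Dockerfile.
+    client = docker.from_env()
+    client.images.build(fileobj=context, custom_context=True, tag=tag)
+```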
+
+### Could Plz be smaller?
+
+- why do we need a controller/server: one reason is to manage locks (for
+  instance, to avoid two job requests using the same instance). It's true
+  that locking could be done with just a Redis server (so instead of a
+  controller/Plz server, the CLI could perhaps point to a Redis server that
+  takes care of locks), but that would force the tool to assume that everyone
+  uses it collaboratively (one could engineer an altered CLI that locks every
+  instance, for example). We make that assumption now, but we are not forced
+  to keep it in the future. Another reason for a controller is a feature we
+  have considered for a while: rerunning jobs whose spot instances were
+  terminated because they were overbid. For that, we need something running
+  permanently in the cloud, as there might not be a CLI running at the moment
+  the spot instance is terminated. In general, if you want to do anything
+  serious with a bunch of permanently running instances, you'll eventually
+  need a coordinator/controller. Even if the current features might not
+  strictly require a controller, it's good that the features that do require
+  one won't need a major refactor. Needless to say, a controller-less Plz
+  cannot be obtained by just deleting the controller: a major effort would be
+  needed so that the tasks the controller performs (setting inputs,
+  collecting outputs, etc.) are done by, for instance, a wrapper around the
+  program being run by the user
+- why collect information from the running program: while it would be
+  possible to leave it to user programs to write to whatever non-ephemeral
+  storage they choose, that would burden the Plz user with significantly
+  changing a program that they already run locally (for instance, using the
+  AWS API to write to S3 instead of writing local files). With the current
+  Plz mechanism, as long as there is a single point in your program where you
+  can set the output directory (and if your program doesn't have such a
+  point, it's a good idea to implement it anyway), you can just write files
+  and Plz will make them non-ephemeral for you (see the sketch after this
+  list). Also, with the current mechanism, team members know how to access
+  the outcomes of your job even if they don't know the details (`plz output`
+  for "blobs" and `plz measures` for structured outputs), and can read them
+  using standard tools, as every computer setup can process JSON and plain
+  files (as opposed to, say, running SQL queries in the cloud)
+- why manage the instances ourselves / why not use Kubernetes: because
+  autoscaling mechanisms (whether Kubernetes or autoscaling groups) don't
+  cover the case of "interactive users", who want to see instances spawn when
+  they launch a job and see them terminate when they stop it. Autoscaling
+  mechanisms specify cooldown periods so that scaling changes don't happen
+  all the time and degrade performance, but these make operations
+  non-immediate and non-deterministic, which can be really annoying when
+  working interactively. We learned all of this because our first attempt
+  used AWS autoscaling groups, and that version of Plz was a pain to use and
+  to test manually ("did AWS get that we want to terminate this instance?
+  Let's wait; sometimes it takes 5 minutes to take it down"). As for
+  Kubernetes specifically, when we started Plz, the managed Kubernetes
+  offering on AWS (EKS) wasn't available yet. There is a Kubernetes feature
+  in the works: the plan is that users will be able to either specify a
+  Kubernetes cluster to which the execution will be sent (to support the
+  non-interactive case), or, as we currently do, specify an instance type, so
+  that an instance is started and managed by Plz (to support the interactive
+  case)
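+
+As an illustration of the output mechanism described in the list above, this
+is roughly what a user program looks like. The environment variable name and
+file names here are assumptions made for the example (check the Plz
+documentation for the exact conventions); the point is that the only
+Plz-specific part is reading one environment variable:
+
+```python
+"""Illustration: a job whose only 'integration' with Plz is one env var."""
+import json
+import os
+
+
+def train_model() -> float:
+    # Stand-in for the user's real training code.
+    return 0.9
+
+
+# Assumed for this example: the output location arrives in an environment
+# variable; when running locally we simply fall back to a directory of ours.
+output_directory = os.environ.get("OUTPUT_DIRECTORY", "output")
+os.makedirs(output_directory, exist_ok=True)
+
+accuracy = train_model()
+
+# A plain file ("blob") that can later be fetched with `plz output`.
+with open(os.path.join(output_directory, "summary.txt"), "w") as f:
+    f.write(f"accuracy: {accuracy}\n")
+
+# Structured results are just JSON, readable with standard tools; the exact
+# file Plz expects for `plz measures` may differ from this sketch.
+with open(os.path.join(output_directory, "measures.json"), "w") as f:
+    json.dump({"accuracy": accuracy}, f)
+```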
+
 ## Future work
 
 In the future, Plz is intended to: