149 changes: 149 additions & 0 deletions concepts/build-files.mdx
@@ -0,0 +1,149 @@
---
title: 'BUILD files'
---



The previous sections described packages, targets and labels, and the
build dependency graph abstractly. This section describes the concrete syntax
used to define a package.

By definition, every package contains a `BUILD` file, which is a short
program.

Note: The `BUILD` file can be named either `BUILD` or `BUILD.bazel`. If both
files exist, `BUILD.bazel` takes precedence over `BUILD`.
For simplicity's sake, the documentation refers to these files simply as `BUILD`
files.

`BUILD` files are evaluated using an imperative language,
[Starlark](https://github.com/bazelbuild/starlark/).

They are interpreted as a sequential list of statements.

In general, order does matter: variables must be defined before they are
used, for example. However, most `BUILD` files consist only of declarations of
build rules, and the relative order of these statements is immaterial; all
that matters is _which_ rules were declared, and with what values, by the
time package evaluation completes.

When a build rule function, such as `cc_library`, is executed, it creates a
new target in the graph. This target can later be referred to using a label.

In simple `BUILD` files, rule declarations can be re-ordered freely without
changing the behavior.

To encourage a clean separation between code and data, `BUILD` files cannot
contain function definitions, `for` statements or `if` statements (but list
comprehensions and `if` expressions are allowed). Functions can be declared in
`.bzl` files instead. Additionally, `*args` and `**kwargs` arguments are not
allowed in `BUILD` files; instead list all the arguments explicitly.
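
As a small illustration of what remains allowed, the following sketch builds a list with a comprehension and an `if` expression. It is written as Python, but it uses only syntax that Starlark shares, so the same expression should also be accepted at the top level of a `BUILD` file. The names `platforms` and `test_names` are hypothetical and do not come from any real project.

```python
# Hypothetical platform list, used only for illustration.
platforms = ["linux", "macos", "windows"]

# A list comprehension containing an `if` expression (a conditional
# expression). A `for` statement or an `if` statement in this position
# would be rejected in a BUILD file, but the expression forms are permitted.
test_names = [
    "io_test_win" if p == "windows" else "io_test_" + p
    for p in platforms
]

print(test_names)
```

In a real `BUILD` file, the body of such a comprehension is typically a rule call, so that one target is declared per list element.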

Crucially, programs in Starlark can't perform arbitrary I/O. This invariant
makes the interpretation of `BUILD` files hermetic — dependent only on a known
set of inputs, which is essential for ensuring that builds are reproducible.
For more details, see [Hermeticity](/basics/hermeticity).

Because `BUILD` files need to be updated whenever the dependencies of the
underlying code change, they are typically maintained by multiple people on a
team. `BUILD` file authors should comment liberally to document the role
of each build target, whether or not it is intended for public use, and to
document the role of the package itself.

## Loading an extension

Bazel extensions are files ending in `.bzl`. Use the `load` statement to import
a symbol from an extension.

```
load("//foo/bar:file.bzl", "some_library")
```

This code loads the file `foo/bar/file.bzl` and adds the `some_library` symbol
to the environment. This can be used to load new rules, functions, or constants
(for example, a string or a list). Multiple symbols can be imported by using
additional arguments to the call to `load`. Arguments must be string literals
(not variables), and `load` statements must appear at top level; they cannot be
in a function body.

The first argument of `load` is a [label](/concepts/labels) identifying a
`.bzl` file. If it's a relative label, it is resolved with respect to the
package (not directory) containing the current `.bzl` file. Relative labels in
`load` statements should use a leading `:`.

`load` also supports aliases, so you can assign different names to the
imported symbols.

```
load("//foo/bar:file.bzl", library_alias = "some_library")
```

You can define multiple aliases within one `load` statement. Moreover, the
argument list can contain both aliases and regular symbol names. The following
example is legal. Note that alias names are bare identifiers, while the names
of the loaded symbols are quoted strings.

```
load(":my_rules.bzl", "some_rule", nice_alias = "some_other_rule")
```

In a `.bzl` file, symbols starting with `_` are not exported and cannot be
loaded from another file.

You can use [load visibility](/concepts/visibility#load-visibility) to restrict
who may load a `.bzl` file.

## Types of build rules

The majority of build rules come in families, grouped together by
language. For example, `cc_binary`, `cc_library`
and `cc_test` are the build rules for C++ binaries,
libraries, and tests, respectively. Other languages use the same
naming scheme, with a different prefix, such as `java_*` for
Java. Some of these functions are documented in the
[Build Encyclopedia](/reference/be/overview), but it is possible
for anyone to create new rules.

* `*_binary` rules build executable programs in a given language. After a
build, the executable will reside in the build tool's binary
output tree at the corresponding name for the rule's label,
so `//my:program` would appear at (for example) `$(BINDIR)/my/program`.

  In some languages, such rules also create a runfiles directory
  containing all the files mentioned in a `data`
  attribute belonging to the rule, or any rule in its transitive
  closure of dependencies; this set of files is gathered together in
  one place for ease of deployment to production.

* `*_test` rules are a specialization of a `*_binary` rule, used for automated
testing. Tests are simply programs that return zero on success.

  Like binaries, tests also have runfiles trees, and the files
  beneath them are the only files that a test may legitimately open
  at runtime. For example, a test declared as
  `cc_test(name='x', data=['//foo:bar'])` may open and read
  `$TEST_SRCDIR/workspace/foo/bar` during execution.
  (Each programming language has its own utility function for
  accessing the value of `$TEST_SRCDIR`, but they are all
  equivalent to using the environment variable directly.)
  Failure to observe this rule will cause the test to fail when it is
  executed on a remote testing host.

* `*_library` rules specify separately-compiled modules in the given
programming language. Libraries can depend on other libraries,
and binaries and tests can depend on libraries, with the expected
separate-compilation behavior.
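
The runfiles lookup described for `*_test` rules can be sketched in Python by reading the environment variable directly, which the text above describes as equivalent to the language-specific utility functions. The helper name `runfile_path` is hypothetical, and `workspace` is the same placeholder directory name used in the example path above; a real project would substitute its own.

```python
import os


def runfile_path(relative_path, workspace="workspace"):
    """Build the absolute path of a data file under the runfiles tree.

    `workspace` defaults to the placeholder directory name used in the
    text; it is an assumption, not a fixed value.
    """
    # TEST_SRCDIR is expected to be set by the test runner. A KeyError
    # here means the program is not running as a test.
    test_srcdir = os.environ["TEST_SRCDIR"]
    return os.path.join(test_srcdir, workspace, *relative_path.split("/"))


if __name__ == "__main__":
    # Simulate the test environment so the sketch also runs standalone.
    os.environ.setdefault("TEST_SRCDIR", "/tmp/runfiles")
    print(runfile_path("foo/bar"))
```

A test that declares `data = ["//foo:bar"]` would then open the file returned by `runfile_path("foo/bar")`.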

<Columns cols={2}>
<Card title="← Labels" href="/concepts/labels" icon="arrow-left">
Learn about labels and target references
</Card>
<Card title="Dependencies →" href="/concepts/dependencies" icon="arrow-right">
Understand build dependencies
</Card>
</Columns>

## File encoding

`BUILD` and `.bzl` files should be encoded in UTF-8, of which ASCII is a valid
subset. Arbitrary byte sequences are currently allowed, but may stop being
supported in the future.

231 changes: 231 additions & 0 deletions concepts/dependencies.mdx
@@ -0,0 +1,231 @@
---
title: 'Dependencies'
---



A target `A` _depends upon_ a target `B` if `B` is needed by `A` at build or
execution time. The _depends upon_ relation induces a
[Directed Acyclic Graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph)
(DAG) over targets, and it is called a _dependency graph_.

A target's _direct_ dependencies are those other targets reachable by a path
of length 1 in the dependency graph. A target's _transitive_ dependencies are
those targets upon which it depends via a path of any length through the graph.
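
To make the two definitions concrete, here is a minimal Python sketch that models the dependency graph as an adjacency mapping. The function names and the three-target graph are hypothetical; this is not how Bazel represents the graph internally.

```python
def direct_deps(graph, target):
    """Targets reachable from `target` by a path of length 1."""
    return set(graph.get(target, ()))


def transitive_deps(graph, target):
    """Targets reachable from `target` by a path of length 1 or more."""
    seen = set()
    stack = list(graph.get(target, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


# Hypothetical graph: a depends on b, and b depends on c.
graph = {"a": ["b"], "b": ["c"], "c": []}

print(sorted(direct_deps(graph, "a")))
print(sorted(transitive_deps(graph, "a")))
```

Here `c` is a transitive dependency of `a` but not a direct one, while `b` is both.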

In fact, in the context of builds, there are two dependency graphs, the graph
of _actual dependencies_ and the graph of _declared dependencies_. Most of the
time, the two graphs are so similar that this distinction need not be made, but
it is useful for the discussion below.

## Actual and declared dependencies

A target `X` is _actually dependent_ on target `Y` if `Y` must be present,
built, and up-to-date in order for `X` to be built correctly. _Built_ could
mean generated, processed, compiled, linked, archived, compressed, executed, or
any of the other kinds of tasks that routinely occur during a build.

A target `X` has a _declared dependency_ on target `Y` if there is a dependency
edge from `X` to `Y` in the package of `X`.

For correct builds, the graph of actual dependencies _A_ must be a subgraph of
the graph of declared dependencies _D_. That is, every pair of
directly-connected nodes `x --> y` in _A_ must also be directly connected in
_D_. It can be said that _D_ is an _overapproximation_ of _A_.

Important: _D_ should not be too much of an overapproximation of _A_ because
redundant declared dependencies can make builds slower and binaries larger.

`BUILD` file writers must explicitly declare all of the actual direct
dependencies for every rule to the build system, and no more.
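
The subgraph condition and the "no more" requirement can both be expressed as set differences over edges. The following Python sketch assumes each graph is an adjacency mapping from a target to the set of targets it points to; the helper names and the example targets `x`, `y`, and `z` are hypothetical.

```python
def edges(graph):
    """Flatten an adjacency mapping into a set of (source, target) edges."""
    return {(src, dst) for src, dsts in graph.items() for dst in dsts}


def undeclared_edges(actual, declared):
    """Actual edges missing from the declared graph.

    This set must be empty for the actual graph to be a subgraph of the
    declared graph.
    """
    return edges(actual) - edges(declared)


def redundant_edges(actual, declared):
    """Declared edges that no actual dependency requires."""
    return edges(declared) - edges(actual)


# Hypothetical graphs: x really needs y, but also declares an unused z.
actual = {"x": {"y"}}
declared = {"x": {"y", "z"}}

print(undeclared_edges(actual, declared))
print(redundant_edges(actual, declared))
```

In this example the build is correct, because no actual edge is undeclared, but the declared edge from `x` to `z` is surplus.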

Failure to observe this principle causes undefined behavior: the build may fail,
but worse, the build may depend on some prior operations, or upon transitive
declared dependencies the target happens to have. Bazel checks for missing
dependencies and reports errors, but it's not possible for this checking to be
complete in all cases.

You need not (and should not) attempt to list everything indirectly imported,
even if it is _needed_ by `A` at execution time.

During a build of target `X`, the build tool inspects the entire transitive
closure of dependencies of `X` to ensure that any changes in those targets are
reflected in the final result, rebuilding intermediates as needed.

The transitive nature of dependencies leads to a common mistake. Sometimes,
code in one file may use code provided by an _indirect_ dependency — a
transitive but not direct edge in the declared dependency graph. Indirect
dependencies don't appear in the `BUILD` file. Because the rule doesn't
directly depend on the provider, there is no way to track changes, as shown in
the following example timeline:

### 1. Declared dependencies match actual dependencies

At first, everything works. The code in package `a` uses code in package `b`.
The code in package `b` uses code in package `c`, and thus `a` transitively
depends on `c`.


### Comparing `a/BUILD` and `b/BUILD`

![Declared dependency graph with arrows connecting a, b, and c](/docs/images/a_b_c.svg)

![Actual dependency graph that matches the declared dependency
graph with arrows connecting a, b, and c](/docs/images/a_b_c.svg)


The declared dependencies overapproximate the actual dependencies. All is well.

### 2. Adding an undeclared dependency

A latent hazard is introduced when someone adds code to `a` that creates a
direct _actual_ dependency on `c`, but forgets to declare it in the build file
`a/BUILD`.

![Declared dependency graph with arrows connecting a, b, and c](/docs/images/a_b_c.svg)

![Actual dependency graph with arrows connecting a, b, and c. An
arrow now connects A to C as well. This does not match the
declared dependency graph](/docs/images/a_b_c_ac.svg)


The declared dependencies no longer overapproximate the actual dependencies.
The build may still succeed, because the transitive closures of the two graphs
are equal, but this masks a problem: `a` has an actual but undeclared
dependency on `c`.

### 3. Divergence between declared and actual dependency graphs

The hazard is revealed when someone refactors `b` so that it no longer depends on
`c`, inadvertently breaking `a` through no
fault of their own.

![Declared dependency graph with arrows connecting a and b.
b no longer connects to c, which breaks a's connection to c](/docs/images/ab_c.svg)

![Actual dependency graph that shows a connecting to b and c,
but b no longer connects to c](/docs/images/a_b_a_c.svg)


The declared dependency graph is now an underapproximation of the actual
dependencies, even when transitively closed; the build is likely to fail.

The problem could have been averted by ensuring that the actual dependency from
`a` to `c` introduced in Step 2 was properly declared in the `BUILD` file.
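
The three steps can be replayed with a small Python model. This is a sketch under the assumption that each graph is an adjacency mapping of sets; the function names and status strings are hypothetical, and the check is a simplification rather than a description of Bazel's own dependency checking.

```python
def reachable(graph, start):
    """All targets reachable from `start` by a path of length 1 or more."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def check(actual, declared):
    """Classify each actual edge against the declared graph."""
    report = {}
    for src, dsts in actual.items():
        for dst in dsts:
            if dst in declared.get(src, ()):
                report[(src, dst)] = "declared"
            elif dst in reachable(declared, src):
                report[(src, dst)] = "undeclared, reachable transitively"
            else:
                report[(src, dst)] = "undeclared, unreachable"
    return report


# Step 1: declared and actual graphs match.
declared = {"a": {"b"}, "b": {"c"}, "c": set()}
actual = {"a": {"b"}, "b": {"c"}, "c": set()}
step1 = check(actual, declared)

# Step 2: code in a starts using c directly, but a/BUILD is not updated.
actual["a"].add("c")
step2 = check(actual, declared)

# Step 3: b is refactored and no longer depends on c in either graph.
declared["b"].discard("c")
actual["b"].discard("c")
step3 = check(actual, declared)

for name, report in (("step 1", step1), ("step 2", step2), ("step 3", step3)):
    print(name, report)
```

The edge from `a` to `c` moves from "undeclared, reachable transitively" in step 2, where the build can still succeed by accident, to "undeclared, unreachable" in step 3, where it is likely to fail.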

## Types of dependencies

Most build rules have three attributes for specifying different kinds of
generic dependencies: `srcs`, `deps` and `data`. These are explained below. For
more details, see
[Attributes common to all rules](/reference/be/common-definitions).

Many rules also have additional attributes for rule-specific kinds of
dependencies, for example, `compiler` or `resources`. These are detailed in the
[Build Encyclopedia](/reference/be/).

### `srcs` dependencies

Files consumed directly by the rule, or rules that output source files.

### `deps` dependencies

Rules pointing to separately-compiled modules that provide header files,
symbols, libraries, data, etc.

### `data` dependencies

A build target might need some data files to run correctly. These data files
aren't source code: they don't affect how the target is built. For example, a
unit test might compare a function's output to the contents of a file. When you
build the unit test you don't need the file, but you do need it when you run
the test. The same applies to tools that are launched during execution.

The build system runs tests in an isolated directory where only files listed as
`data` are available. Thus, if a binary/library/test needs some files to run,
specify them (or a build rule containing them) in `data`. For example:

```
# I need a config file from a directory named env:
java_binary(
    name = "setenv",
    ...
    data = [":env/default_env.txt"],
)

# I need test data from another directory
sh_test(
    name = "regtest",
    srcs = ["regtest.sh"],
    data = [
        "//data:file1.txt",
        "//data:file2.txt",
        ...
    ],
)
```

These files are available using the relative path `path/to/data/file`. In tests,
you can refer to these files by joining the paths of the test's source
directory and the workspace-relative path, for example,
`${TEST_SRCDIR}/workspace/path/to/data/file`.

## Using labels to reference directories

As you look over our `BUILD` files, you might notice that some `data` labels
refer to directories. These labels end with `/.` or `/` like these examples,
which you should not use:

**Not recommended** — `data = ["//data/regression:unittest/."]`

**Not recommended** — `data = ["testdata/."]`

**Not recommended** — `data = ["testdata/"]`

This seems convenient, particularly for tests, because it allows a test to
use all the data files in the directory.

But try not to do this. In order to ensure correct incremental rebuilds (and
re-execution of tests) after a change, the build system must be aware of the
complete set of files that are inputs to the build (or test). When you specify
a directory, the build system performs a rebuild only when the directory itself
changes (due to addition or deletion of files), but won't be able to detect
edits to individual files as those changes don't affect the enclosing directory.
Rather than specifying directories as inputs to the build system, you should
enumerate the set of files contained within them, either explicitly or using the
[`glob()`](/reference/be/functions#glob) function. (Use `**` to force the
`glob()` to be recursive.)

**Recommended** — `data = glob(["testdata/**"])`
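
The difference between tracking a directory and tracking its enumerated files can be illustrated with a toy Python model. This is only a sketch of the effect described above, not a description of how Bazel tracks inputs internally; both fingerprint functions are hypothetical, and the model handles a flat directory only.

```python
import hashlib
import os
import tempfile


def listing_fingerprint(directory):
    """Digest of the file names only, standing in for 'the directory itself'."""
    names = sorted(os.listdir(directory))
    return hashlib.sha256("\n".join(names).encode()).hexdigest()


def contents_fingerprint(directory):
    """Digest of every file name and its bytes, as if each file were listed."""
    digest = hashlib.sha256()
    for name in sorted(os.listdir(directory)):
        digest.update(name.encode())
        with open(os.path.join(directory, name), "rb") as f:
            digest.update(f.read())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as testdata:
    path = os.path.join(testdata, "expected.txt")
    with open(path, "w") as f:
        f.write("old output\n")
    listing_before = listing_fingerprint(testdata)
    contents_before = contents_fingerprint(testdata)

    # Edit an existing file without adding or deleting anything.
    with open(path, "w") as f:
        f.write("new output\n")

    listing_changed = listing_fingerprint(testdata) != listing_before
    contents_changed = contents_fingerprint(testdata) != contents_before

print(listing_changed, contents_changed)
```

Editing the file leaves the name-only fingerprint unchanged, which mirrors why a directory label can miss edits that an enumerated file list would catch.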

Unfortunately, there are some scenarios where directory labels must be used.
For example, if the `testdata` directory contains files whose names don't
conform to the [label syntax](/concepts/labels#labels-lexical-specification),
then explicit enumeration of files, or use of the
[`glob()`](/reference/be/functions#glob) function, produces an invalid label
error. You must use directory labels in this case, but beware of the
associated risk of incorrect rebuilds described above.

If you must use directory labels, keep in mind that you can't refer to the
parent package with a relative `../` path; instead, use an absolute path like
`//data/regression:unittest/.`.

Note: Directory labels are only valid for data dependencies. If you try to use
a directory as a label in an argument other than `data`, it will fail and you
will get a (probably cryptic) error message.

Any external rule, such as a test, that needs to use multiple files must
explicitly declare its dependence on all of them. You can use `filegroup()` to
group files together in the `BUILD` file:

```
filegroup(
    name = 'my_data',
    srcs = glob(['my_unittest_data/*'])
)
```

You can then reference the label `my_data` as the data dependency in your test.

**Previous:** [BUILD files](/concepts/build-files) | **Next:** [Visibility](/concepts/visibility)

