
Commit 5761577

Infrastructure related docs
1 parent e0438c3 commit 5761577

7 files changed (+327 / -18 lines)

deployment/byoc.mdx

Lines changed: 0 additions & 5 deletions
This file was deleted.

deployment/self-hosting.mdx

Lines changed: 0 additions & 5 deletions
This file was deleted.

docs.json

Lines changed: 4 additions & 8 deletions

```diff
@@ -49,18 +49,14 @@
       ]
     },
     {
-      "group": "Deploy E2B",
+      "group": "Infrastructure",
       "pages": [
-        "deployment/byoc",
-        "deployment/self-hosting"
+        "infrastructure/architecture",
+        "infrastructure/self-hosting",
+        "infrastructure/byoc"
       ]
     }
   ]
-  },
-  {
-    "anchor": "SDK Reference",
-    "icon": "square-terminal",
-    "href": "https://external-link.com/blog"
 }
 ]
},
```
(Binary image file, 77.7 KB, not shown)

infrastructure/architecture.mdx

Lines changed: 65 additions & 0 deletions

---
title: "Architecture"
description: "E2B infrastructure architecture overview"
icon: "sitemap"
---

## Sandbox architecture

E2B is built around the orchestration of microVMs using Firecracker and KVM virtualization.
Its multi-tenant architecture allows you to run multiple sandboxes on a single machine while ensuring strong isolation between them.

At the core is an orchestrator that receives requests from the E2B control plane and manages the sandbox lifecycle.
It is responsible for low-level operations such as memory mapping, snapshotting, and system configuration, and it uses Firecracker to run the microVMs.

E2B can run hundreds of nodes, with each node running an orchestrator that manages hundreds of sandboxes.
The API serves as the main entry point for customers, handling all permissions and the logic for building sandbox requests.
It is also responsible for fast and reliable scheduling of sandbox requests to orchestrators.

When someone wants to access a port open in a sandbox, Edge (client-proxy) routes the traffic from the load balancer to the correct node.
On the node level, the orchestrator proxy completes the routing directly to the sandbox network interface.
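
As an illustration, a sandbox port is typically reached through a hostname that encodes the port and sandbox ID; the exact hostname scheme shown below is an assumption, so check your deployment's routing configuration for the actual format:

```bash
# Illustrative only: reaching port 3000 of a sandbox through Edge.
# The "<port>-<sandbox-id>.<your-domain>" hostname scheme is an assumption,
# not a confirmed part of the E2B routing contract.
curl https://3000-<sandbox-id>.<your-domain>/
```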

## Template architecture

We use Ubuntu-based images for sandbox templates.
Currently, you can use a Docker image as the build source, or the template build V2, which supports a faster, code-declarative build configuration.

We extract the file system from the source we receive, install and configure the required packages, and then create a snapshot of the file system.
This snapshot is later used to create the microVM that runs the sandbox. We can create both file-system and memory snapshots for even faster sandbox creation.
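
As a rough sketch, building a template from a Docker image source with the `e2b` CLI might look like this; the `e2b.Dockerfile` contents below are illustrative, not a recommended base image:

```bash
# Hypothetical sketch: define a Docker image source and build a template from it.
cat > e2b.Dockerfile <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3
EOF

# Build the sandbox template from the Dockerfile above.
e2b template build
```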

## Components

### Services
- **API** - Handles consistency and logic for the whole E2B platform. Used for sandbox lifecycle and template management.
- **Orchestrator** - Manages the sandbox microVM lifecycle, proper system configuration, snapshotting, and more.
- **Template Manager** - Currently part of the orchestrator, but can be deployed separately. Responsible for building sandbox templates.
- **Envd** - Small service running in each sandbox that handles communication with the E2B control plane and command execution.
- **Edge (client-proxy)** - Routes traffic to sandboxes, exposes an API for cluster management, and acts as the gRPC proxy used by the E2B control plane to communicate with orchestrators.
- **Docker Reverse Proxy** - Allows us to receive template source images with our own authentication and authorization.
- **OpenTelemetry** - Collects logs, metrics, and traces from deployed services. Used for observability and monitoring.
- **ClickHouse** - Used for storing sandbox lifecycle metrics.
- **Loki** - Used for storing sandbox logs. Stored only in the cluster and not sent to Grafana or any other third-party service.

### Cloud Services
- **Redis** - Used for metadata and synchronization between components.
- **Container Registry** - Storage for customers' source files of sandbox templates.
- **Object Storage** - Storage for sandbox snapshots/templates. Needs to support byte-range read requests.
- **PostgreSQL Database (currently only Supabase is supported)** - Used as the Postgres database and an OAuth/user-management tool.
- **Machines with KVM virtualization support** - Google Cloud Platform VMs with native/nested virtualization support.
- **Grafana (optional, for monitoring)** - Used for monitoring logs/traces/metrics coming from OpenTelemetry and ClickHouse.

## Security

### Virtualization isolation

We use Firecracker and Linux KVM to provide strong isolation between sandboxes.
This allows us to run multiple sandboxes on a single machine while ensuring that they are isolated from each other.
Firecracker is a lightweight virtualization technology that provides a minimalistic virtual machine monitor (VMM) for running microVMs.
It is designed to be secure and efficient, making it a great choice for running sandboxes.

### Why virtualization over containerization?

Docker is a popular containerization technology, but it does not provide the same level of isolation as Firecracker.
Docker containers share the host kernel and resources, which can lead to security vulnerabilities and performance issues.

Firecracker, on the other hand, provides a lightweight virtual machine that runs its own kernel with its own resources, ensuring strong isolation between sandboxes.
This makes Firecracker a better choice for running sandboxes, especially in a multi-tenant environment where security and performance are critical.
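
Since all of the above depends on KVM being available, a quick host check with standard Linux tooling (not E2B-specific) looks like this:

```bash
# Verify that KVM virtualization is available on the host.
lsmod | grep -w kvm   # KVM kernel modules loaded?
ls -l /dev/kvm        # KVM device node present and accessible?
```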

infrastructure/byoc.mdx

Lines changed: 72 additions & 0 deletions

---
title: "BYOC (Bring Your Own Cloud)"
sidebarTitle: "Bring Your Own Cloud"
description: "Allows you to deploy E2B sandboxes to your own cloud VPC."
icon: "cloud"
---

BYOC is currently only available for AWS.
We are working on adding support for Google Cloud and Azure.

<Note>
BYOC is offered to enterprise customers only.
If you’re interested in the BYOC offering, please book a call with our team [here](https://e2b.dev/contact) or contact us at [enterprise@e2b.dev](mailto:enterprise@e2b.dev).
</Note>

## Architecture

Sandbox templates, snapshots, and runtime logs are stored within the customer's BYOC VPC.
Anonymized system metrics such as cluster memory and CPU usage are sent to the E2B Cloud for observability and cluster management purposes.

All potentially sensitive traffic, such as sandbox template build source files, sandbox traffic, and logs, is transmitted directly from the client to the customer's BYOC VPC without ever touching the E2B Cloud infrastructure.

### Glossary
- **BYOC VPC**: The customer's Virtual Private Cloud where the E2B sandboxes are deployed, for example in your AWS account.
- **E2B Cloud**: The managed service that provides the E2B platform, observability, and cluster management.
- **OAuth Provider**: Customer-managed service that provides users and the E2B Cloud with access to the cluster.

<Frame>
<img src="/images/byoc-architecture-diagram.png" alt="Graphics explaining key BYOC architecture parts" />
</Frame>

### BYOC Cluster Components
- **Orchestrator**: Represents a node that is responsible for managing sandboxes and their lifecycle. Optionally, it can also run the template builder component.
- **Edge Controller**: Routes traffic to sandboxes, exposes an API for cluster management, and acts as the gRPC proxy used by the E2B control plane to communicate with orchestrators.
- **Monitoring**: Collector that receives sandbox and build logs and system metrics from orchestrators and edge controllers. Only anonymized metrics are sent to the E2B Cloud for observability purposes.
- **Storage**: Persistent storage for sandbox templates, snapshots, and runtime logs, plus an image container repository for template images.

## Onboarding

Customers can initiate the onboarding process by reaching out to us.
Customers need to have a dedicated AWS account and know the region they will use.
After that, we will receive the IAM role needed for managing account resources.
Quota limits may need to be increased for the AWS account.

Terraform configuration and machine images are used to provision the BYOC cluster.
When provisioning is done and the cluster is running, we will create a new team under your E2B account that can be used from the SDK/CLI the same way as on the E2B Cloud.
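
For instance, once the BYOC team exists, pointing the CLI at your cluster might look like this; `<your-byoc-domain>` is a placeholder for whatever domain your deployment uses:

```bash
# Hypothetical usage: target your BYOC cluster instead of the E2B Cloud.
E2B_DOMAIN=<your-byoc-domain> e2b template list
```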

## FAQ

### How is the cluster monitored?

The cluster forwards anonymized metrics such as machine CPU/memory usage to the E2B control plane for advanced observability and alerting.
The whole observability stack is anonymized and does not contain any sensitive information.

### Can the cluster scale automatically?

A cluster can be scaled horizontally by adding more orchestrators and edge controllers.
The autoscaler, currently in V1, is not yet capable of automatically scaling the orchestrator nodes needed for sandbox spawning.
This feature is coming in the next versions.

### Are sandboxes accessible only from a customer’s private network?

Yes. The load balancer that handles all requests coming to sandboxes can be configured as internal, and VPC peering with an additional customer VPC can be configured so that sandbox traffic stays in the private network.

### How is secure control-plane communication ensured?

Data sent between the E2B Cloud and your BYOC VPC is encrypted using TLS.

VPC peering can be established to allow direct communication between the E2B Cloud and your BYOC VPC.
When using VPC peering, the load balancer can be configured as private, without a public IP address.

infrastructure/self-hosting.mdx

Lines changed: 186 additions & 0 deletions

---
title: "Self-Hosting"
description: "Deploy E2B to your own cloud infrastructure"
icon: "server"
---

Self-hosting E2B allows you to deploy and manage the whole E2B open-source stack on your own infrastructure.
This gives you full control over your sandboxes, data, and security policies.

We currently officially support self-hosting on Google Cloud Platform (GCP), with Amazon Web Services (AWS) and on-premise support coming soon.

<Note>
If you are looking for a managed solution, consider our [Bring Your Own Cloud](/infrastructure/byoc) offering, which gives you
the same security and control with the E2B team managing the infrastructure for you.
</Note>

## Google Cloud Platform

### Prerequisites

**Tools**
- [Packer](https://developer.hashicorp.com/packer/tutorials/docker-get-started/get-started-install-cli#installing-packer)
- [Golang](https://go.dev/doc/install)
- [Docker](https://docs.docker.com/engine/install/)
- [Terraform](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli) (v1.5.x)
  - [This version is still licensed under the Mozilla Public License](https://github.com/hashicorp/terraform/commit/b145fbcaadf0fa7d0e7040eac641d9aef2a26433)
  - The last version of Terraform under the Mozilla Public License is **v1.5.7**
  - You can install it with [tfenv](https://github.com/tfutils/tfenv) for easier version management (see the sketch after this list)
- [Google Cloud CLI](https://cloud.google.com/sdk/docs/install)
  - Used for managing GCP resources deployed by Terraform
  - Authenticate with `gcloud auth login && gcloud auth application-default login`
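
For example, pinning Terraform to the MPL-licensed release with tfenv looks like this:

```bash
# Pin Terraform to the last MPL-licensed release using tfenv.
tfenv install 1.5.7
tfenv use 1.5.7
terraform version   # should report Terraform v1.5.7
```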

**Accounts**
- Cloudflare account with a domain
- Google Cloud Platform account and project
- Supabase account with a PostgreSQL database
- **(Optional)** Grafana account for monitoring and logging
- **(Optional)** PostHog account for analytics

### Steps

1. Go to `console.cloud.google.com` and create a new GCP project.
   > Make sure your quota allows at least 2500 GB of `Persistent Disk SSD (GB)` and at least 24 `CPUs`.
2. Create `.env.prod`, `.env.staging`, or `.env.dev` from [`.env.template`](https://github.com/e2b-dev/infra/blob/main/.env.template). You can pick any of them. Make sure to fill in the values; all are required unless specified otherwise.
   > Get the Postgres database connection string from your database, e.g. [from Supabase](https://supabase.com/docs/guides/database/connecting-to-postgres#direct-connection): create a new project in Supabase and go to your project -> Settings -> Database -> Connection Strings -> Postgres -> Direct.

   > Your Postgres database needs to have IPv4 access enabled. You can do that in the Connect screen.
3. Run `make switch-env ENV={prod,staging,dev}` to start using your env.
4. Run `make login-gcloud` to log in to the `gcloud` CLI so Terraform and Packer can communicate with the GCP API.
5. Run `make init`.
   > If this errors, run it a second time. It's due to a race condition when Terraform enables API access for the various GCP services; this can take several seconds.

   > A full list of services that will be enabled for API access: [Secret Manager API](https://console.cloud.google.com/apis/library/secretmanager.googleapis.com), [Certificate Manager API](https://console.cloud.google.com/apis/library/certificatemanager.googleapis.com), [Compute Engine API](https://console.cloud.google.com/apis/library/compute.googleapis.com), [Artifact Registry API](https://console.cloud.google.com/apis/library/artifactregistry.googleapis.com), [OS Config API](https://console.cloud.google.com/apis/library/osconfig.googleapis.com), [Stackdriver Monitoring API](https://console.cloud.google.com/apis/library/monitoring.googleapis.com), [Stackdriver Logging API](https://console.cloud.google.com/apis/library/logging.googleapis.com)
6. Run `make build-and-upload`.
7. Run `make copy-public-builds`.
8. Run `make migrate`.
9. Secrets are created and stored in GCP Secret Manager. Once created, that is the source of truth; you will need to update values there to make changes. Create a secret value for the following secrets:
10. Update `e2b-cloudflare-api-token` in GCP Secret Manager with a value taken from Cloudflare.
    > Get the Cloudflare API token: go to the [Cloudflare dashboard](https://dash.cloudflare.com/) -> Manage Account -> Account API Tokens -> Create Token -> Edit Zone DNS -> in "Zone Resources" select your domain and generate the token.
11. Run `make plan-without-jobs` and then `make apply`.
12. Fill out the following secrets in GCP Secret Manager:
    - `e2b-supabase-jwt-secrets` (optional / required to self-host the [E2B dashboard](https://github.com/e2b-dev/dashboard))
      > Get the Supabase JWT secret: go to the [Supabase dashboard](https://supabase.com/dashboard) -> select your project -> Project Settings -> Data API -> JWT Settings
    - `e2b-postgres-connection-string`
      > This is the same value as the `POSTGRES_CONNECTION_STRING` env variable.
13. Run `make plan` and then `make apply`.
    > Note: This will work only after the TLS certificates are issued. It can take some time; you can check the status in the Google Cloud Console.
14. Set up data in the cluster by following one of the two options below (a condensed sketch of the full command flow follows this list).
    - Run `make prep-cluster` in `packages/shared` to create an initial user, etc. (you need to be logged in via the [`e2b` CLI](https://www.npmjs.com/package/@e2b/cli)). It will create a user with the same information (access token, API key, etc.) as you have in E2B.
    - You can also create a user in the database; it will automatically create a team, an API key, and an access token. You will then need to build template(s) for your cluster: use the [`e2b` CLI](https://www.npmjs.com/package/@e2b/cli) and run `E2B_DOMAIN=<your-domain> e2b template build`.
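
Condensed, the provisioning flow from steps 3-13 looks roughly like this (assuming the `prod` environment with `.env.prod` already filled in):

```bash
# Condensed sketch of the provisioning flow above.
make switch-env ENV=prod
make login-gcloud
make init                 # run twice if it hits the API-enablement race
make build-and-upload
make copy-public-builds
make migrate
# ...update e2b-cloudflare-api-token in GCP Secret Manager...
make plan-without-jobs && make apply
# ...fill out the remaining secrets in GCP Secret Manager...
make plan && make apply   # works once the TLS certificates are issued
```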

### Interacting with the cluster

#### SDK

When using the SDK, pass the domain when creating a new `Sandbox`. In the JS/TS SDK:
```javascript
import { Sandbox } from "@e2b/sdk";

const sandbox = new Sandbox({ domain: "<your-domain>" });
```

or in the Python SDK:

```python
from e2b import Sandbox

sandbox = Sandbox(domain="<your-domain>")
```

#### CLI

When using the CLI, you can pass the domain as well:
```sh
E2B_DOMAIN=<your-domain> e2b <command>
```

### Monitoring and logging jobs

To access the Nomad web UI, go to `https://nomad.<your-domain.com>`. Sign in, and when prompted for an API token, use the one stored in GCP Secret Manager.
From here, you can see the Nomad jobs and tasks for both clients and servers, including logging.

To update jobs running in the cluster, look inside `packages/nomad` for the config files. This can be useful for setting up your logging and monitoring agents.
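
The same information is available from the terminal via the standard Nomad CLI environment variables (the token placeholder below is the one from GCP Secret Manager):

```bash
# Inspect cluster jobs with the standard Nomad CLI.
export NOMAD_ADDR=https://nomad.<your-domain.com>
export NOMAD_TOKEN=<api-token-from-gcp-secret-manager>   # placeholder
nomad job status              # list running jobs
nomad alloc logs <alloc-id>   # stream logs from a specific allocation
```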

### Deployment Troubleshooting

If any problems arise, open a [GitHub issue on the repo](https://github.com/e2b-dev/infra/issues) and we'll look into it.

### Google Cloud Troubleshooting

**Quotas not available**

If you can't find the quota in `All Quotas` in GCP's Console, create and delete a dummy VM before proceeding to step 2 of the self-deploy guide. This will create additional quotas and policies in GCP.
```bash
gcloud compute instances create dummy-init --project=YOUR-PROJECT-ID --zone=YOUR-ZONE --machine-type=e2-medium --boot-disk-type=pd-ssd --no-address
```
Wait a minute and destroy the VM:
```bash
gcloud compute instances delete dummy-init --zone=YOUR-ZONE --quiet
```
Now you should see the right quota options in `All Quotas` and be able to request the correct size.

## Linux Machine

All E2B services are AMD64-compatible and ready to be deployed on Ubuntu 22.04 machines.
Tooling for on-premise clustering and load-balancing is **not yet officially supported**.

### Service images

To run the E2B core, you need to build and deploy the **API**, **Edge (client-proxy)**, and **Orchestrator** services.
This will work on any Linux machine with Docker installed. The Orchestrator is built with Docker but deployed as a static binary, because it needs precise control over the Firecracker microVMs on the host system.

Building and provisioning the services can be done similarly to our Google Cloud Platform builds and Nomad job setup.
Details about the architecture can be found in the [architecture](/infrastructure/architecture) section.

### Client machine setup

#### Configuration

The Orchestrator (client) machine requires a precise setup to spawn and control Firecracker-based sandboxes.
This includes having the correct OS version installed (Ubuntu 22.04) with KVM available. It's possible to run KVM with nested virtualization, but there are some performance drawbacks.

Most of the configuration can be taken from our client [machine setup script](https://github.com/e2b-dev/infra/blob/main/packages/cluster/scripts/start-client.sh).
Adjustments to the maximum number of inodes, socket connections, NBD, and huge-pages allocations are needed for the microVM processes to work properly.
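
To give a feel for the kind of tuning involved, here is an illustrative sketch; the knobs are standard Linux ones, but the values are placeholders, and the authoritative settings live in the linked setup script:

```bash
# Illustrative only; the authoritative values are in start-client.sh.
sudo modprobe nbd                                # NBD support for template FS mounts
echo 1024 | sudo tee /proc/sys/vm/nr_hugepages   # huge pages for microVM memory (placeholder value)
sudo sysctl -w fs.file-max=1048576               # raise open-file/socket limits (placeholder value)
```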

#### Static binaries

A few files and folders need to be present on the machine.
For sandbox spawning to work correctly, you need the Firecracker, Linux kernel, and Envd binaries.
We distribute pre-built ones in a public Google Cloud bucket.

```bash
# Access publicly available pre-built binaries
gsutil cp -r gs://e2b-prod-public-builds .
```

Below is an example of the static files and folder setup. Please replace the Linux kernel and Firecracker versions with the versions you want to use.
Ensure you use the same Linux kernel and Firecracker versions for both sandbox building and spawning.

```bash
sudo mkdir -p /orchestrator/sandbox
sudo mkdir -p /orchestrator/template
sudo mkdir -p /orchestrator/build

sudo mkdir /fc-envd
sudo mkdir /fc-envs
sudo mkdir /fc-vm

# Replace with the source where your envd binary is hosted.
# Currently, envd needs to be taken from your own source as we are not providing it.
sudo curl -fsSL -o /fc-envd/envd ${source_url}
sudo chmod +x /fc-envd/envd

SOURCE_URL="https://storage.googleapis.com/e2b-prod-public-builds"
KERNEL_VERSION="vmlinux-6.1.102"
FIRECRACKER_VERSION="v1.12.1_d990331"

# Download the kernel (KERNEL_VERSION already contains the "vmlinux-" prefix)
sudo mkdir -p /fc-kernels/${KERNEL_VERSION}
sudo curl -fsSL -o /fc-kernels/${KERNEL_VERSION}/vmlinux.bin ${SOURCE_URL}/kernels/${KERNEL_VERSION}/vmlinux.bin

# Download Firecracker
sudo mkdir -p /fc-versions/${FIRECRACKER_VERSION}
sudo curl -fsSL -o /fc-versions/${FIRECRACKER_VERSION}/firecracker ${SOURCE_URL}/firecrackers/${FIRECRACKER_VERSION}/firecracker
sudo chmod +x /fc-versions/${FIRECRACKER_VERSION}/firecracker
```
