|
1 | 1 | --- |
2 | | -title: 'BYOC (Bring Your Own Cloud) for AWS' |
3 | | -slug: /cloud/reference/byoc |
4 | | -sidebar_label: 'BYOC (Bring Your Own Cloud)' |
5 | | -keywords: ['BYOC', 'cloud', 'bring your own cloud'] |
| 2 | +title: 'BYOC Onboarding for AWS' |
| 3 | +slug: /cloud/reference/byoc/onboarding/aws |
| 4 | +sidebar_label: 'AWS' |
| 5 | +keywords: ['BYOC', 'cloud', 'bring your own cloud', 'AWS'] |
6 | 6 | description: 'Deploy ClickHouse on your own cloud infrastructure' |
7 | 7 | doc_type: 'reference' |
8 | 8 | --- |
9 | 9 |
|
10 | 10 | import Image from '@theme/IdealImage'; |
11 | | -import byoc1 from '@site/static/images/cloud/reference/byoc-1.png'; |
12 | | -import byoc4 from '@site/static/images/cloud/reference/byoc-4.png'; |
13 | | -import byoc3 from '@site/static/images/cloud/reference/byoc-3.png'; |
14 | 11 | import byoc_vpcpeering from '@site/static/images/cloud/reference/byoc-vpcpeering-1.png'; |
15 | 12 | import byoc_vpcpeering2 from '@site/static/images/cloud/reference/byoc-vpcpeering-2.png'; |
16 | 13 | import byoc_vpcpeering3 from '@site/static/images/cloud/reference/byoc-vpcpeering-3.png'; |
17 | 14 | import byoc_vpcpeering4 from '@site/static/images/cloud/reference/byoc-vpcpeering-4.png'; |
18 | | -import byoc_plb from '@site/static/images/cloud/reference/byoc-plb.png'; |
19 | | -import byoc_security from '@site/static/images/cloud/reference/byoc-securitygroup.png'; |
20 | | -import byoc_inbound from '@site/static/images/cloud/reference/byoc-inbound-rule.png'; |
21 | 15 | import byoc_subnet_1 from '@site/static/images/cloud/reference/byoc-subnet-1.png'; |
22 | 16 | import byoc_subnet_2 from '@site/static/images/cloud/reference/byoc-subnet-2.png'; |
23 | 17 | import byoc_s3_endpoint from '@site/static/images/cloud/reference/byoc-s3-endpoint.png' |
24 | 18 |
|
25 | | -## Overview {#overview} |
26 | | - |
27 | | -BYOC (Bring Your Own Cloud) allows you to deploy ClickHouse Cloud on your own cloud infrastructure. This is useful if you have specific requirements or constraints that prevent you from using the ClickHouse Cloud managed service. |
28 | | - |
29 | | -**If you would like access, please [contact us](https://clickhouse.com/cloud/bring-your-own-cloud).** Refer to our [Terms of Service](https://clickhouse.com/legal/agreements/terms-of-service) for additional information. |
30 | | - |
31 | | -BYOC is currently only supported for AWS. You can join the wait list for GCP and Azure [here](https://clickhouse.com/cloud/bring-your-own-cloud). |
32 | | - |
33 | | -:::note |
34 | | -BYOC is designed specifically for large-scale deployments, and requires customers to sign a committed contract. |
35 | | -::: |
36 | | - |
37 | | -## Glossary {#glossary} |
38 | | - |
39 | | -- **ClickHouse VPC:** The VPC owned by ClickHouse Cloud. |
40 | | -- **Customer BYOC VPC:** The VPC, owned by the customer's cloud account, is provisioned and managed by ClickHouse Cloud and dedicated to a ClickHouse Cloud BYOC deployment. |
41 | | -- **Customer VPC** Other VPCs owned by the customer cloud account used for applications that need to connect to the Customer BYOC VPC. |
42 | | - |
43 | | -## Architecture {#architecture} |
44 | | - |
45 | | -Metrics and logs are stored within the customer's BYOC VPC. Logs are currently stored in locally in EBS. In a future update, logs will be stored in LogHouse, which is a ClickHouse service in the customer's BYOC VPC. Metrics are implemented via a Prometheus and Thanos stack stored locally in the customer's BYOC VPC. |
46 | | - |
47 | | -<br /> |
48 | | - |
49 | | -<Image img={byoc1} size="lg" alt="BYOC Architecture" background='black'/> |
50 | | - |
51 | | -<br /> |
52 | | - |
53 | 19 | ## Onboarding process {#onboarding-process} |
54 | 20 |
|
55 | 21 | Customers can initiate the onboarding process by reaching out to [us](https://clickhouse.com/cloud/bring-your-own-cloud). Customers need to have a dedicated AWS account and know the region they will use. At this time, we are allowing users to launch BYOC services only in the regions that we support for ClickHouse Cloud. |
@@ -287,196 +253,3 @@ Metrics and logs are stored within the customer's BYOC VPC. Logs are currently s |
287 | 253 | *Outbound* |
288 | 254 |
|
289 | 255 | State Exporter sends ClickHouse service state information to an SQS owned by ClickHouse Cloud. |
290 | | - |
291 | | -## Features {#features} |
292 | | - |
293 | | -### Supported features {#supported-features} |
294 | | - |
295 | | -- **SharedMergeTree**: ClickHouse Cloud and BYOC use the same binary and configuration. Therefore all features from ClickHouse core are supported in BYOC such as SharedMergeTree. |
296 | | -- **Console access for managing service state**: |
297 | | - - Supports operations such as start, stop, and terminate. |
298 | | - - View services and status. |
299 | | -- **Backup and restore.** |
300 | | -- **Manual vertical and horizontal scaling.** |
301 | | -- **Idling.** |
302 | | -- **Warehouses**: Compute-Compute Separation |
303 | | -- **Zero Trust Network via Tailscale.** |
304 | | -- **Monitoring**: |
305 | | - - The Cloud console includes built-in health dashboards for monitoring service health. |
306 | | - - Prometheus scraping for centralized monitoring with Prometheus, Grafana, and Datadog. See the [Prometheus documentation](/integrations/prometheus) for setup instructions. |
307 | | -- **VPC Peering.** |
308 | | -- **Integrations**: See the full list on [this page](/integrations). |
309 | | -- **Secure S3.** |
310 | | -- **[AWS PrivateLink](https://aws.amazon.com/privatelink/).** |
311 | | - |
312 | | -### Planned features (currently unsupported) {#planned-features-currently-unsupported} |
313 | | - |
314 | | -- [AWS KMS](https://aws.amazon.com/kms/) aka CMEK (customer-managed encryption keys) |
315 | | -- ClickPipes for ingest |
316 | | -- Autoscaling |
317 | | -- MySQL interface |
318 | | - |
319 | | -## FAQ {#faq} |
320 | | - |
321 | | -### Compute {#compute} |
322 | | - |
323 | | -#### Can I create multiple services in this single EKS cluster? {#can-i-create-multiple-services-in-this-single-eks-cluster} |
324 | | - |
325 | | -Yes. The infrastructure only needs to be provisioned once for every AWS account and region combination. |
326 | | - |
327 | | -### Which regions do you support for BYOC? {#which-regions-do-you-support-for-byoc} |
328 | | - |
329 | | -BYOC supports the same set of [regions](/cloud/reference/supported-regions#aws-regions ) as ClickHouse Cloud. |
330 | | - |
331 | | -#### Will there be some resource overhead? What are the resources needed to run services other than ClickHouse instances? {#will-there-be-some-resource-overhead-what-are-the-resources-needed-to-run-services-other-than-clickhouse-instances} |
332 | | - |
333 | | -Besides Clickhouse instances (ClickHouse servers and ClickHouse Keeper), we run services such as `clickhouse-operator`, `aws-cluster-autoscaler`, Istio etc. and our monitoring stack. |
334 | | - |
335 | | -Currently we have 3 m5.xlarge nodes (one for each AZ) in a dedicated node group to run those workloads. |
336 | | - |
337 | | -### Network and security {#network-and-security} |
338 | | - |
339 | | -#### Can we revoke permissions set up during installation after setup is complete? {#can-we-revoke-permissions-set-up-during-installation-after-setup-is-complete} |
340 | | - |
341 | | -This is currently not possible. |
342 | | - |
343 | | -#### Have you considered some future security controls for ClickHouse engineers to access customer infra for troubleshooting? {#have-you-considered-some-future-security-controls-for-clickhouse-engineers-to-access-customer-infra-for-troubleshooting} |
344 | | - |
345 | | -Yes. Implementing a customer controlled mechanism where customers can approve engineers' access to the cluster is on our roadmap. At the moment, engineers must go through our internal escalation process to gain just-in-time access to the cluster. This is logged and audited by our security team. |
346 | | - |
347 | | -#### What is the size of the VPC IP range created? {#what-is-the-size-of-the-vpc-ip-range-created} |
348 | | - |
349 | | -By default we use `10.0.0.0/16` for BYOC VPC. We recommend reserving at least /22 for potential future scaling, |
350 | | -but if you prefer to limit the size, it is possible to use /23 if it is likely that you will be limited |
351 | | -to 30 server pods. |
352 | | - |
353 | | -#### Can I decide maintenance frequency {#can-i-decide-maintenance-frequency} |
354 | | - |
355 | | -Contact support to schedule maintenance windows. Please expect a minimum of a weekly update schedule. |
356 | | - |
357 | | -## Observability {#observability} |
358 | | - |
359 | | -### Built-in monitoring tools {#built-in-monitoring-tools} |
360 | | -ClickHouse BYOC provides several approaches for various use cases. |
361 | | - |
362 | | -#### Observability dashboard {#observability-dashboard} |
363 | | - |
364 | | -ClickHouse Cloud includes an advanced observability dashboard that displays metrics such as memory usage, query rates, and I/O. This can be accessed in the **Monitoring** section of ClickHouse Cloud web console interface. |
365 | | - |
366 | | -<br /> |
367 | | - |
368 | | -<Image img={byoc3} size="lg" alt="Observability dashboard" border /> |
369 | | - |
370 | | -<br /> |
371 | | - |
372 | | -#### Advanced dashboard {#advanced-dashboard} |
373 | | - |
374 | | -You can customize a dashboard using metrics from system tables like `system.metrics`, `system.events`, and `system.asynchronous_metrics` and more to monitor server performance and resource utilization in detail. |
375 | | - |
376 | | -<br /> |
377 | | - |
378 | | -<Image img={byoc4} size="lg" alt="Advanced dashboard" border /> |
379 | | - |
380 | | -<br /> |
381 | | - |
382 | | -#### Access the BYOC Prometheus stack {#prometheus-access} |
383 | | -ClickHouse BYOC deploys a Prometheus stack on your Kubernetes cluster. You may access and scrape the metrics from there and integrate them with your own monitoring stack. |
384 | | - |
385 | | -Contact ClickHouse support to enable the Private Load balancer and ask for the URL. Please note that this URL is only accessible via private network and does not support authentication |
386 | | - |
387 | | -**Sample URL** |
388 | | -```bash |
389 | | -https://prometheus-internal.<subdomain>.<region>.aws.clickhouse-byoc.com/query |
390 | | -``` |
391 | | - |
392 | | -#### Prometheus Integration {#prometheus-integration} |
393 | | - |
394 | | -**DEPRECATED: ** Please use the Prometheus stack integration in the above section instead. Besides the ClickHouse Server metrics, it provides more metrics including the K8S metrics and metrics from other services. |
395 | | - |
396 | | -ClickHouse Cloud provides a Prometheus endpoint that you can use to scrape metrics for monitoring. This allows for integration with tools like Grafana and Datadog for visualization. |
397 | | - |
398 | | -**Sample request via https endpoint /metrics_all** |
399 | | - |
400 | | -```bash |
401 | | -curl --user <username>:<password> https://i6ro4qarho.mhp0y4dmph.us-west-2.aws.byoc.clickhouse.cloud:8443/metrics_all |
402 | | -``` |
403 | | - |
404 | | -**Sample Response** |
405 | | - |
406 | | -```bash |
407 | | -# HELP ClickHouse_CustomMetric_StorageSystemTablesS3DiskBytes The amount of bytes stored on disk `s3disk` in system database |
408 | | -# TYPE ClickHouse_CustomMetric_StorageSystemTablesS3DiskBytes gauge |
409 | | -ClickHouse_CustomMetric_StorageSystemTablesS3DiskBytes{hostname="c-jet-ax-16-server-43d5baj-0"} 62660929 |
410 | | -# HELP ClickHouse_CustomMetric_NumberOfBrokenDetachedParts The number of broken detached parts |
411 | | -# TYPE ClickHouse_CustomMetric_NumberOfBrokenDetachedParts gauge |
412 | | -ClickHouse_CustomMetric_NumberOfBrokenDetachedParts{hostname="c-jet-ax-16-server-43d5baj-0"} 0 |
413 | | -# HELP ClickHouse_CustomMetric_LostPartCount The age of the oldest mutation (in seconds) |
414 | | -# TYPE ClickHouse_CustomMetric_LostPartCount gauge |
415 | | -ClickHouse_CustomMetric_LostPartCount{hostname="c-jet-ax-16-server-43d5baj-0"} 0 |
416 | | -# HELP ClickHouse_CustomMetric_NumberOfWarnings The number of warnings issued by the server. It usually indicates about possible misconfiguration |
417 | | -# TYPE ClickHouse_CustomMetric_NumberOfWarnings gauge |
418 | | -ClickHouse_CustomMetric_NumberOfWarnings{hostname="c-jet-ax-16-server-43d5baj-0"} 2 |
419 | | -# HELP ClickHouseErrorMetric_FILE_DOESNT_EXIST FILE_DOESNT_EXIST |
420 | | -# TYPE ClickHouseErrorMetric_FILE_DOESNT_EXIST counter |
421 | | -ClickHouseErrorMetric_FILE_DOESNT_EXIST{hostname="c-jet-ax-16-server-43d5baj-0",table="system.errors"} 1 |
422 | | -# HELP ClickHouseErrorMetric_UNKNOWN_ACCESS_TYPE UNKNOWN_ACCESS_TYPE |
423 | | -# TYPE ClickHouseErrorMetric_UNKNOWN_ACCESS_TYPE counter |
424 | | -ClickHouseErrorMetric_UNKNOWN_ACCESS_TYPE{hostname="c-jet-ax-16-server-43d5baj-0",table="system.errors"} 8 |
425 | | -# HELP ClickHouse_CustomMetric_TotalNumberOfErrors The total number of errors on server since the last restart |
426 | | -# TYPE ClickHouse_CustomMetric_TotalNumberOfErrors gauge |
427 | | -ClickHouse_CustomMetric_TotalNumberOfErrors{hostname="c-jet-ax-16-server-43d5baj-0"} 9 |
428 | | -``` |
429 | | - |
430 | | -**Authentication** |
431 | | - |
432 | | -A ClickHouse username and password pair can be used for authentication. We recommend creating a dedicated user with minimal permissions for scraping metrics. At minimum, a `READ` permission is required on the `system.custom_metrics` table across replicas. For example: |
433 | | - |
434 | | -```sql |
435 | | -GRANT REMOTE ON *.* TO scrapping_user; |
436 | | -GRANT SELECT ON system._custom_metrics_dictionary_custom_metrics_tables TO scrapping_user; |
437 | | -GRANT SELECT ON system._custom_metrics_dictionary_database_replicated_recovery_time TO scrapping_user; |
438 | | -GRANT SELECT ON system._custom_metrics_dictionary_failed_mutations TO scrapping_user; |
439 | | -GRANT SELECT ON system._custom_metrics_dictionary_group TO scrapping_user; |
440 | | -GRANT SELECT ON system._custom_metrics_dictionary_shared_catalog_recovery_time TO scrapping_user; |
441 | | -GRANT SELECT ON system._custom_metrics_dictionary_table_read_only_duration_seconds TO scrapping_user; |
442 | | -GRANT SELECT ON system._custom_metrics_view_error_metrics TO scrapping_user; |
443 | | -GRANT SELECT ON system._custom_metrics_view_histograms TO scrapping_user; |
444 | | -GRANT SELECT ON system._custom_metrics_view_metrics_and_events TO scrapping_user; |
445 | | -GRANT SELECT(description, metric, value) ON system.asynchronous_metrics TO scrapping_user; |
446 | | -GRANT SELECT ON system.custom_metrics TO scrapping_user; |
447 | | -GRANT SELECT(name, value) ON system.errors TO scrapping_user; |
448 | | -GRANT SELECT(description, event, value) ON system.events TO scrapping_user; |
449 | | -GRANT SELECT(description, labels, metric, value) ON system.histogram_metrics TO scrapping_user; |
450 | | -GRANT SELECT(description, metric, value) ON system.metrics TO scrapping_user; |
451 | | -``` |
452 | | - |
453 | | -**Configuring Prometheus** |
454 | | - |
455 | | -An example configuration is shown below. The `targets` endpoint is the same one used for accessing the ClickHouse service. |
456 | | - |
457 | | -```bash |
458 | | -global: |
459 | | - scrape_interval: 15s |
460 | | - |
461 | | -scrape_configs: |
462 | | - - job_name: "prometheus" |
463 | | - static_configs: |
464 | | - - targets: ["localhost:9090"] |
465 | | - - job_name: "clickhouse" |
466 | | - static_configs: |
467 | | - - targets: ["<subdomain1>.<subdomain2>.aws.byoc.clickhouse.cloud:8443"] |
468 | | - scheme: https |
469 | | - metrics_path: "/metrics_all" |
470 | | - basic_auth: |
471 | | - username: <KEY_ID> |
472 | | - password: <KEY_SECRET> |
473 | | - honor_labels: true |
474 | | -``` |
475 | | - |
476 | | -Please also see [this blog post](https://clickhouse.com/blog/clickhouse-cloud-now-supports-prometheus-monitoring) and the [Prometheus setup docs for ClickHouse](/integrations/prometheus). |
477 | | - |
478 | | -### Uptime SLAs {#uptime-sla} |
479 | | - |
480 | | -#### Does ClickHouse offer an uptime SLA for BYOC? {#uptime-sla-for-byoc} |
481 | | - |
482 | | -No, since the data plane is hosted in the customer's cloud environment, service availability depends on resources not in ClickHouse's control. Therefore, ClickHouse does not offer a formal uptime SLA for BYOC deployments. If you have additional questions, please contact support@clickhouse.com. |