
AWS CloudFormation (EKS)

The deployment consists of several Docker containers, a database for storing metadata, and persistent file storage.

This document contains instructions to deploy Datagrok using CloudFormation on AWS EKS with AWS RDS (PostgreSQL) and AWS S3. The template provisions the EKS cluster, managed node group, RDS instance, S3 bucket, IAM roles with IRSA, and installs the Datagrok Helm chart on the cluster automatically.

The CloudFormation template was developed with common security considerations in mind. As a result, the Datagrok infrastructure it creates in AWS complies with standard security policies.

More information about Datagrok design and components:

Migrating from ECS?

The EKS template is drop-in compatible with the legacy ECS template: it reuses the same logical IDs for RDS, S3, and their subnet/security groups. Replacing an ECS stack's template with the EKS template preserves your database and bucket. See Migrate from ECS.

Prerequisites

  1. Check that you have the required permissions on the AWS account to perform CloudFormation deployment to EKS.

  2. One-time per region: activate the AWS QuickStart Kubernetes Helm CloudFormation extension in your account. The template uses it to install the Helm chart declaratively.

    1. Open AWS CloudFormation → Registry → Public extensions in the AWS region where you plan to deploy Datagrok.
    2. Find AWSQS::Kubernetes::Helm and click Activate.
    3. Repeat for AWSQS::Kubernetes::Resource.
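The console steps above can also be scripted. The following is a minimal sketch, assuming the AWS CLI is configured for the target account; the region argument and the optional execution role ARN are placeholders you supply, not values taken from the template.

```shell
# Sketch: activate the two public extensions with the AWS CLI instead of the console.
# The region and the optional execution role ARN are placeholders supplied by the caller.

# Look up the public ARN of a third-party resource type by its exact name.
find_public_type_arn() {
  local type_name="$1" region="$2"
  aws cloudformation list-types \
    --region "$region" \
    --visibility PUBLIC \
    --type RESOURCE \
    --filters "Category=THIRD_PARTY,TypeNamePrefix=${type_name}" \
    --query "TypeSummaries[?TypeName=='${type_name}'].TypeArn" \
    --output text
}

# Activate both extensions in one region; the second argument (role ARN) is optional.
activate_datagrok_extensions() {
  local region="$1" role_arn="${2:-}" type_name type_arn
  for type_name in AWSQS::Kubernetes::Helm AWSQS::Kubernetes::Resource; do
    type_arn="$(find_public_type_arn "$type_name" "$region")"
    if [ -z "$type_arn" ] || [ "$type_arn" = "None" ]; then
      echo "Extension $type_name not found in $region" >&2
      return 1
    fi
    # Build the argument list, adding the execution role only when one was given.
    set -- --region "$region" --public-type-arn "$type_arn"
    if [ -n "$role_arn" ]; then
      set -- "$@" --execution-role-arn "$role_arn"
    fi
    aws cloudformation activate-type "$@" || return 1
  done
}
```

For example, `activate_datagrok_extensions us-east-1` activates both extensions in us-east-1 (an example region). If activation in your account requires an execution role, pass its ARN as the second argument.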

Deployment profiles

Profile | Description | Use when
Full stack (default) | Creates EKS cluster, node group, RDS, S3, IAM, and installs the Helm chart | Starting from scratch
Existing cluster | Creates RDS, S3, IAM/IRSA, and secrets only; skips EKS cluster and node group | You already manage an EKS cluster

To use an existing cluster, set UseExistingCluster=true and provide your cluster details (ExistingClusterName, ExistingClusterOIDCIssuerUrl, ExistingClusterSecurityGroupId) when launching the stack. Gather them with:

CLUSTER=my-cluster
# Print the OIDC issuer URL and the cluster security group ID of the existing cluster
aws eks describe-cluster --name "$CLUSTER" --query '{
  OIDCIssuerUrl: cluster.identity.oidc.issuer,
  SecurityGroupId: cluster.resourcesVpcConfig.clusterSecurityGroupId
}'

In existing-cluster mode, the stack creates all supporting infrastructure (database, storage, IAM roles) and outputs the values needed for the Helm chart install. Follow the post-deploy steps in the Helm chart guide.

In this mode, the stack does not create an EKS cluster, node group, or Fargate profile. Ensure your existing cluster has sufficient node capacity and that the AWS Load Balancer Controller is already installed.
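As a sketch of how the gathered cluster details map onto the stack parameters named above, the helper below prints them in the AWS CLI shorthand form accepted by --parameters. The second helper checks for the controller under the deployment name and namespace used by its standard Helm install (aws-load-balancer-controller in kube-system), which is an assumption about your cluster.

```shell
# Print CloudFormation --parameters arguments for existing-cluster mode, one per line.
existing_cluster_parameters() {
  local cluster="$1" issuer sg
  issuer="$(aws eks describe-cluster --name "$cluster" \
    --query 'cluster.identity.oidc.issuer' --output text)"
  sg="$(aws eks describe-cluster --name "$cluster" \
    --query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' --output text)"
  # printf repeats the format for each key/value pair.
  printf 'ParameterKey=%s,ParameterValue=%s\n' \
    UseExistingCluster true \
    ExistingClusterName "$cluster" \
    ExistingClusterOIDCIssuerUrl "$issuer" \
    ExistingClusterSecurityGroupId "$sg"
}

# Assumption: the controller was installed under its default name and namespace.
check_lb_controller() {
  kubectl get deployment aws-load-balancer-controller --namespace kube-system
}
```

The output of `existing_cluster_parameters my-cluster` can be passed to `aws cloudformation create-stack --parameters`, together with the template URL for your configuration and whichever IAM capability the template requires.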

Deploy Datagrok components

A separate template is available for each supported configuration. Answer the questions below to choose the right one.

Would you like to use an existing VPC in your AWS account?

The template will create a new VPC and all required network resources for you.

Do you use Route53 as your DNS provider? (new VPC)

Requirements
  1. Create a Route53 public hosted zone.
How to deploy
  1. Use the link to open the CloudFormation template and fill in all required parameters.

    1. Specify stack name.
  2. Wait until AWS completes the deployment. The stack status will be CREATE_COMPLETE.

  3. Log in to the platform at datagrok.<subdomain> as the admin user. Retrieve the password from DatagrokAdminPassword in the stack Outputs.

  4. Complete the initial setup in the platform and you are ready to use Datagrok.
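Steps 2 and 3 can also be done from a terminal. This is a minimal sketch, assuming the AWS CLI is configured for the account and region of the stack; the stack name passed to the helpers is whatever you entered when launching it.

```shell
# Block until the stack reaches CREATE_COMPLETE (returns non-zero if creation fails).
wait_for_stack() {
  aws cloudformation wait stack-create-complete --stack-name "$1"
}

# Print the value of a single stack output, e.g. DatagrokAdminPassword.
stack_output() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}
```

For a stack named datagrok (an example name), run `wait_for_stack datagrok && stack_output datagrok DatagrokAdminPassword`.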

Service versions

Each Datagrok service (the main app, grok_pipe, grok_spawner, grok_connect, the Jupyter Kernel Gateway) has a dedicated template parameter for picking the container image tag. The Helm chart version is picked separately — by convention it matches the Datagrok app version exactly.

Parameter | Default | What it controls
DatagrokVersion | bleeding-edge | datagrok/datagrok image tag (main app)
HelmChartVersion | same as DatagrokVersion | Chart tag base pulled from oci://registry-1.docker.io/datagrok/datagrok (template appends -helm, e.g. 1.26.5 → chart tag 1.26.5-helm)
GrokPipeVersion | same as DatagrokVersion | datagrok/grok_pipe image tag
GrokSpawnerVersion | same as DatagrokVersion | datagrok/grok_spawner image tag
GrokConnectVersion | same as DatagrokVersion | datagrok/grok_connect image tag
JupyterKernelGatewayVersion | same as DatagrokVersion | datagrok/jupyter_kernel_gateway image tag
RabbitMQVersion | 4.0.5-management | rabbitmq image tag (independent cadence)

For production, pin DatagrokVersion to a specific release (e.g. 1.27.0). Because the other Datagrok version parameters default to it, this also pins the chart and all other Datagrok services to the same release. Use bleeding-edge for dev/staging environments that track the latest master. See the release history for available versions.
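To illustrate the chart tag convention and version pinning, here is a small sketch. The helper names are hypothetical. The chart lookup assumes Helm 3.8 or later (for OCI registry support) and that a chart is published for the release you pass in.

```shell
# Chart tag convention described above: the version with "-helm" appended.
chart_tag() {
  printf '%s-helm\n' "$1"
}

# Show the chart metadata for a given release from the OCI registry.
show_datagrok_chart() {
  helm show chart oci://registry-1.docker.io/datagrok/datagrok \
    --version "$(chart_tag "$1")"
}

# CloudFormation --parameters argument that pins the release; the other
# Datagrok version parameters follow it while left at their defaults.
pin_release_parameter() {
  printf 'ParameterKey=DatagrokVersion,ParameterValue=%s\n' "$1"
}
```

For example, `chart_tag 1.26.5` prints 1.26.5-helm, matching the example in the table.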

The template pulls all images from Docker Hub by default. To use a private registry mirror, override the image repository fields in the chart values file.

Update Datagrok components

You can update your Datagrok deployment without re-creating infrastructure. Before updating, we recommend backing up the database and persistent storage. Refer to your internal backup procedures.
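If you have no established procedure, one possible approach is a manual RDS snapshot plus a copy of the bucket. This is only a sketch: the DB instance identifier, the source bucket, and the backup bucket are placeholders you must look up in your own stack, and the snapshot naming scheme is an illustrative choice.

```shell
# Generate a snapshot identifier such as datagrok-pre-update-20250101-120000 (UTC).
snapshot_id() {
  printf 'datagrok-pre-update-%s\n' "$(date -u +%Y%m%d-%H%M%S)"
}

# Snapshot the database, wait until the snapshot is available, then copy the bucket.
backup_before_update() {
  local db_instance="$1" bucket="$2" backup_bucket="$3" snap
  snap="$(snapshot_id)"
  aws rds create-db-snapshot \
    --db-instance-identifier "$db_instance" \
    --db-snapshot-identifier "$snap" || return 1
  aws rds wait db-snapshot-available \
    --db-snapshot-identifier "$snap" || return 1
  # Copy objects into a prefix named after the snapshot in the backup bucket.
  aws s3 sync "s3://$bucket" "s3://$backup_bucket/$snap/"
}
```

Call it as `backup_before_update <db-instance-id> <datagrok-bucket> <backup-bucket>`, substituting your own identifiers.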

How to update

Update the existing stack in the CloudFormation console, using the latest template for your deployment configuration:

  1. Click Update > Replace current template, and provide the new template URL corresponding to your deployment configuration.
  2. Specify new versions for image tags (see the release history). HelmChartVersion defaults to the same value as DatagrokVersion.
  3. Click Next, skip optional settings, and proceed to Review.
  4. If the stack enters a failed state (e.g., UPDATE_ROLLBACK_IN_PROGRESS), check events for details.
Note:
  • CloudFormation will not replace the database or file storage during update.
  • Your platform URL, admin credentials, and uploaded files will remain unchanged.
  • Database schema migrations run automatically on the first start of the new app version.
  • If you previously customized your deployment with environment variables, they will persist unless explicitly modified in parameters.
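The same update can be driven from the AWS CLI. The sketch below assumes the new template accepts the same parameter keys as the current stack. It builds a parameter list that overrides only DatagrokVersion and keeps every other current value, so any version parameter you set explicitly earlier stays as it was. The second helper lists failed events for troubleshooting.

```shell
# Print --parameters arguments: override DatagrokVersion, keep all other current values.
update_parameters() {
  local stack="$1" version="$2" key
  for key in $(aws cloudformation describe-stacks --stack-name "$stack" \
      --query 'Stacks[0].Parameters[].ParameterKey' --output text); do
    if [ "$key" = "DatagrokVersion" ]; then
      printf 'ParameterKey=%s,ParameterValue=%s\n' "$key" "$version"
    else
      printf 'ParameterKey=%s,UsePreviousValue=true\n' "$key"
    fi
  done
}

# List stack events whose status contains FAILED, with the reported reason.
failed_events() {
  aws cloudformation describe-stack-events --stack-name "$1" \
    --query "StackEvents[?contains(ResourceStatus, 'FAILED')].[LogicalResourceId, ResourceStatus, ResourceStatusReason]" \
    --output text
}
```

Pass the output of `update_parameters <stack-name> 1.27.0` to `aws cloudformation update-stack --parameters` along with the new template URL, then run `aws cloudformation wait stack-update-complete --stack-name <stack-name>`. If the wait fails, `failed_events <stack-name>` shows which resources reported errors.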

Migrate from ECS

If you already run Datagrok on the legacy ECS CloudFormation template, you can migrate to EKS without re-creating RDS or S3. The EKS template uses the same logical IDs (DatagrokDB, DatagrokS3, DatagrokDBSubnetGroup, DatagrokDBSecurityGroup) and retains the same DeletionPolicy settings (Snapshot for RDS, Retain for S3), so CloudFormation treats them as continuing resources on stack update.

  1. Back up the RDS database and the contents of the S3 bucket.
  2. In the existing stack, click Update > Replace current template and point at the matching EKS template variant (same VPC / DNS choice as the original ECS stack).
  3. Review the change set. Fargate / ECS resources will be flagged for deletion; RDS and S3 should show as "no change" or "modify" — not "replace".
  4. Apply the update.
  5. Once the stack reaches UPDATE_COMPLETE, it runs on EKS with the same database and bucket.

If the change set shows RDS or S3 flagged for replacement, stop and contact support@datagrok.ai — do not apply the update.
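The change set review can be partly automated. The following sketch assumes a change set has already been created for the stack (for example with `aws cloudformation create-change-set`) and inspects only the DatagrokDB and DatagrokS3 logical IDs named above. Treating a Conditional replacement or a Remove action as unsafe is a conservative choice made for this example, not a rule from the template.

```shell
# Print one row per changed resource: LogicalResourceId, Action, Replacement.
change_set_rows() {
  aws cloudformation describe-change-set \
    --stack-name "$1" --change-set-name "$2" \
    --query 'Changes[].ResourceChange.[LogicalResourceId, Action, Replacement]' \
    --output text
}

# Keep only rows where the database or bucket would be replaced or removed.
unsafe_changes() {
  awk '
    $1 ~ /^(DatagrokDB|DatagrokS3)$/ &&
    ($2 == "Remove" || $3 == "True" || $3 == "Conditional") { print }
  '
}

# Return non-zero if the change set must not be applied.
check_change_set() {
  local unsafe
  unsafe="$(change_set_rows "$1" "$2" | unsafe_changes)"
  if [ -n "$unsafe" ]; then
    echo "Do not apply; these resources would be replaced or removed:" >&2
    echo "$unsafe" >&2
    return 1
  fi
  echo "RDS and S3 are not flagged for replacement."
}
```

Run `check_change_set <stack-name> <change-set-name>` before executing the change set, and still review the full change set in the console.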
