AWS CloudFormation (EKS)
The deployment consists of several Docker containers, a database for metadata, and persistent file storage.
This document contains instructions to deploy Datagrok using CloudFormation on AWS EKS with AWS RDS (PostgreSQL) and AWS S3. The template provisions the EKS cluster, managed node group, RDS instance, S3 bucket, IAM roles with IRSA, and installs the Datagrok Helm chart on the cluster automatically.
We addressed many common security concerns while developing the CloudFormation template. As a result, the Datagrok infrastructure you create in AWS complies with standard security policies.
More information about Datagrok design and components:
The EKS template is drop-in compatible with the legacy ECS template: it reuses the same logical IDs for RDS, S3, and their subnet/security groups. Replacing an ECS stack's template with the EKS template preserves your database and bucket. See Migrate from ECS.
Prerequisites
- Check that you have the required permissions on the AWS account to perform CloudFormation deployment to EKS.
- One-time per region: activate the AWS QuickStart Kubernetes Helm CloudFormation extension in your account. The template uses it to install the Helm chart declaratively.
  - Open AWS CloudFormation → Registry → Public extensions in the AWS region where you plan to deploy Datagrok.
  - Find `AWSQS::Kubernetes::Helm` and click Activate.
  - Repeat for `AWSQS::Kubernetes::Resource`.
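The console steps above can also be scripted. The following is a minimal sketch, assuming AWS CLI v2 with credentials for the target account. The function name `activate_public_extension` is hypothetical and not part of the Datagrok tooling. It looks up the extension's public ARN in the registry instead of hard-coding a publisher ID.

```shell
# Sketch: activate a third-party public CloudFormation extension by type name.
# Assumes AWS CLI v2 and configured credentials; the function name is illustrative.
activate_public_extension() {
  type_name=$1   # e.g. AWSQS::Kubernetes::Helm
  region=$2      # region where Datagrok will be deployed

  # Look up the public ARN of the extension in the CloudFormation registry.
  type_arn=$(aws cloudformation list-types \
    --region "$region" \
    --visibility PUBLIC \
    --type RESOURCE \
    --filters "Category=THIRD_PARTY,TypeNamePrefix=$type_name" \
    --query 'TypeSummaries[0].TypeArn' \
    --output text) || return 1

  if [ -z "$type_arn" ] || [ "$type_arn" = "None" ]; then
    echo "extension $type_name not found in $region" >&2
    return 1
  fi

  # Activate the extension in this account and region.
  aws cloudformation activate-type \
    --region "$region" \
    --public-type-arn "$type_arn"
}
```

Call it once per extension, for example `activate_public_extension AWSQS::Kubernetes::Helm us-east-1` and then again with `AWSQS::Kubernetes::Resource`. Depending on your account setup, the extension may also need an execution role (`--execution-role-arn`), which this sketch does not configure.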
Deployment profiles
| Profile | Description | Use when |
|---|---|---|
| Full stack (default) | Creates EKS cluster, node group, RDS, S3, IAM, and installs the Helm chart | Starting from scratch |
| Existing cluster | Creates RDS, S3, IAM/IRSA, and secrets only — skips EKS cluster and node group | You already manage an EKS cluster |
To use an existing cluster, set UseExistingCluster=true and provide your cluster
details (ExistingClusterName, ExistingClusterOIDCIssuerUrl,
ExistingClusterSecurityGroupId) when launching the stack. Gather them with:
```shell
CLUSTER=my-cluster
aws eks describe-cluster --name "$CLUSTER" --query '{
  OIDCIssuerUrl: cluster.identity.oidc.issuer,
  SecurityGroupId: cluster.resourcesVpcConfig.clusterSecurityGroupId
}'
```
The stack creates all supporting infrastructure (database, storage, IAM roles) and outputs the values needed for the Helm chart install. Follow the post-deploy steps in the Helm chart guide.
The stack does not create an EKS cluster, node group, or Fargate profile. Ensure your existing cluster has sufficient node capacity and that the AWS Load Balancer Controller is already installed.
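Before launching the stack against an existing cluster, you can check both conditions from the command line. The following is a rough sketch, assuming `kubectl` already points at the cluster and that the AWS Load Balancer Controller was installed under its usual name (`aws-load-balancer-controller`) in the `kube-system` namespace. Both names are assumptions about your cluster, and the function name is hypothetical.

```shell
# Sketch: pre-flight check for the "existing cluster" profile.
# Assumes kubectl is configured for the target cluster. The deployment name and
# namespace are the common defaults, not values taken from the Datagrok template.
preflight_existing_cluster() {
  if ! kubectl get deployment aws-load-balancer-controller -n kube-system >/dev/null 2>&1; then
    echo "AWS Load Balancer Controller not found in kube-system" >&2
    return 1
  fi

  # Count nodes whose STATUS column is exactly "Ready".
  ready=$(kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }')
  echo "ready nodes: $ready"
  [ "$ready" -gt 0 ]
}
```

The check only confirms that at least one node is ready. Whether the capacity is sufficient depends on your workload and is not something this sketch can decide.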
Deploy Datagrok components
We provide a separate template for each common setup. Answer the questions below to pick the right one.
Would you like to use an existing VPC in your AWS account?
- Yes
- No
If you use an existing VPC, the Datagrok stand is deployed into the VPC that you select at stack creation.
Do you use Route53 as your DNS provider? (existing VPC)
- Yes
- No
Requirements
- Create a Route53 public hosted zone.
How to deploy
- Use the link to open the CloudFormation template and fill in all required parameters.
  - Specify stack name. To meet AWS naming requirements, the name must be shorter than 10 characters and follow S3 bucket naming rules. We use 'datagrok' by default, but you may prefer to include the environment in the stack name.
- Wait until AWS completes the deployment. The stack status will be `CREATE_COMPLETE`. The script created a Datagrok stand inside your existing VPC using the existing Route53 hosted zone. Your Datagrok instance is now ready to use.

  If you see one of the following statuses, something went wrong: `CREATE_FAILED`, `ROLLBACK_IN_PROGRESS`, `ROLLBACK_COMPLETE`, `ROLLBACK_FAILED`. Check the stack events for more information about the error.
- Enter the platform at `datagrok.<subdomain>` using the `admin` user. To get the password:
  - Go to stack Outputs. Find DatagrokAdminPassword and click the link to open AWS Secrets Manager.
  - Click Retrieve secret value and copy the password. It is a generated password for the first admin login.
  - To increase security, change the password for the admin user on first login. Datagrok will ignore the admin password from secrets on subsequent restarts.
- Complete the initial setup in the platform and you are ready to use Datagrok.
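Two of the steps above lend themselves to a quick command-line check: validating the stack name before launch and waiting for the stack to finish. The following is a minimal sketch, assuming AWS CLI v2. Both function names are hypothetical, and the name pattern is one reading of the stated constraints (shorter than 10 characters, S3-compatible, starting with a letter as CloudFormation requires), not a rule published by Datagrok.

```shell
# Sketch: check a candidate stack name against the constraints described above.
# Assumed reading: fewer than 10 characters; lowercase letters, digits and hyphens;
# starts with a letter and ends with a letter or digit.
is_valid_stack_name() {
  name=$1
  [ "${#name}" -lt 10 ] || return 1
  printf '%s\n' "$name" | grep -Eq '^[a-z]([a-z0-9-]*[a-z0-9])?$'
}

# Sketch: block until stack creation finishes; on failure, print failed events.
wait_for_stack() {
  stack=$1
  if aws cloudformation wait stack-create-complete --stack-name "$stack"; then
    echo "$stack: CREATE_COMPLETE"
  else
    aws cloudformation describe-stack-events --stack-name "$stack" \
      --query "StackEvents[?ends_with(ResourceStatus, 'FAILED')].[LogicalResourceId,ResourceStatusReason]" \
      --output text
    return 1
  fi
}
```

For example, `is_valid_stack_name datagrok && wait_for_stack datagrok` accepts the default name, while a longer name such as `datagrok-staging` is rejected before any AWS call is made.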
If you do not use Route53, our CloudFormation scripts also support external DNS providers, but you need to complete a few manual steps to configure the endpoint.
Requirements
- Datagrok requires an endpoint: `DATAGROK_DNS`. Users will use it to access Datagrok's Web UI.
- Create an RSA SSL certificate for `DATAGROK_DNS`.
  - If you use AWS ACM for SSL certificates:
    - Generate an ACM certificate valid for `DATAGROK_DNS`.
    - Copy the AWS ARN for the certificate. It should look like `arn:aws:acm:<region>:<account_id>:certificate/<certificate_id>`.
  - If you don't use AWS ACM: create a certificate for `DATAGROK_DNS` any way you are already using. A wildcard certificate also suffices.
    - Upload the certificate to AWS ACM.
    - Copy the AWS ARN for the imported certificate.
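For certificates issued outside ACM, the upload step can be done from the command line. The following is a minimal sketch, assuming AWS CLI v2 and PEM-encoded files. The function names and file arguments are hypothetical, and the ARN check only mirrors the shape shown above for the standard `aws` partition.

```shell
# Sketch: import an externally issued certificate into ACM and print its ARN.
# Arguments are paths to PEM files: certificate, private key, certificate chain.
import_certificate() {
  aws acm import-certificate \
    --certificate "fileb://$1" \
    --private-key "fileb://$2" \
    --certificate-chain "fileb://$3" \
    --query CertificateArn \
    --output text
}

# Sketch: check that a value has the ARN shape
# arn:aws:acm:<region>:<account_id>:certificate/<certificate_id>.
looks_like_acm_arn() {
  printf '%s\n' "$1" |
    grep -Eq '^arn:aws:acm:[a-z0-9-]+:[0-9]{12}:certificate/[0-9a-f-]+$'
}
```

Run the import in the same region as the stack, then pass the printed ARN through `looks_like_acm_arn` before pasting it into the `DatagrokArnSSLCertificate` parameter.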
How to deploy
- Use the link to open the CloudFormation template and fill in all required parameters.
  - Specify stack name.
  - `DatagrokArnSSLCertificate`: specify the AWS ACM ARN for `DATAGROK_DNS` from the requirements above.
- Wait until AWS completes the deployment. The stack status will be `CREATE_COMPLETE`.
- Because you chose the external DNS option, you need to create a CNAME record for Datagrok's Load Balancer. To get the Load Balancer endpoint for the DNS record:
  - Go to stack Outputs. Copy the value for DatagrokLoadBalancerDNSName.
  - Use the copied DNS name to create a CNAME DNS record. For example:
    - Host: `DATAGROK_DNS`, Target: DatagrokLoadBalancerDNSName.
- Enter the platform at `DATAGROK_DNS` using the `admin` user. Retrieve the password from DatagrokAdminPassword in stack Outputs as described in the Route53 tab above.
- Complete the initial setup in the platform and you are ready to use Datagrok.
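The Load Balancer name can also be read from the stack Outputs with the AWS CLI, and the CNAME record can be verified once it is created. The following is a minimal sketch, assuming AWS CLI v2 and `dig` are available. The function names are hypothetical. The output key `DatagrokLoadBalancerDNSName` is taken from the steps above.

```shell
# Sketch: print the value of a named stack output.
stack_output() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}

# Sketch: succeed if the CNAME record for a host points at the expected target.
# DNS names are case-insensitive, so both sides are lowercased, and the trailing
# dot that dig prints is removed before comparing.
cname_points_to() {
  host=$1
  expected=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  actual=$(dig +short "$host" CNAME | sed 's/\.$//' | tr '[:upper:]' '[:lower:]')
  [ "$actual" = "$expected" ]
}
```

For example, store the result of `stack_output datagrok DatagrokLoadBalancerDNSName` in a variable, create the CNAME record at your DNS provider, and then call `cname_points_to` with your `DATAGROK_DNS` host and that value. DNS propagation can take a while, so a failed check right after creating the record is not necessarily an error.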
If you do not use an existing VPC, the template creates a new VPC and all required network resources for you.
Do you use Route53 as your DNS provider? (new VPC)
- Yes
- No
Requirements
- Create a Route53 public hosted zone.
How to deploy
- Use the link to open the CloudFormation template and fill in all required parameters.
- Wait until AWS completes the deployment. The stack status will be `CREATE_COMPLETE`.
- Enter the platform at `datagrok.<subdomain>` using the `admin` user. Retrieve the password from DatagrokAdminPassword in stack Outputs.
- Complete the initial setup in the platform and you are ready to use Datagrok.
If you do not use Route53, our CloudFormation scripts also support external DNS providers, but you need to complete a few manual steps to configure the endpoint.
Requirements
- Choose an endpoint: `DATAGROK_DNS`. Users will use `DATAGROK_DNS` to access Datagrok's Web UI.
- Create an RSA SSL certificate for `DATAGROK_DNS` (see the Use existing VPC → External DNS tab above for details).
How to deploy
- Use the link to open the CloudFormation template and fill in all required parameters.
  - Specify stack name.
  - `DatagrokArnSSLCertificate`: specify the AWS ACM ARN for `DATAGROK_DNS`.
- Wait until AWS completes the deployment. The stack status will be `CREATE_COMPLETE`.
- Because you chose the external DNS option, you need to create a CNAME record for Datagrok's Load Balancer. To get the Load Balancer endpoint for the DNS record:
  - Go to stack Outputs. Copy the value for DatagrokLoadBalancerDNSName.
  - Use the copied DNS name to create a CNAME DNS record, for example:
    - Host: `DATAGROK_DNS`, Target: DatagrokLoadBalancerDNSName.
- Enter the platform at `DATAGROK_DNS` using the `admin` user. Retrieve the password from DatagrokAdminPassword in stack Outputs.
- Complete the initial setup in the platform and you are ready to use Datagrok.
Service versions
Each Datagrok service (the main app, grok_pipe, grok_spawner, grok_connect, the Jupyter Kernel Gateway)
has a dedicated template parameter for picking the container image tag. The Helm chart version is picked
separately — by convention it matches the Datagrok app version exactly.
| Parameter | Default | What it controls |
|---|---|---|
| `DatagrokVersion` | `bleeding-edge` | `datagrok/datagrok` image tag (main app) |
| `HelmChartVersion` | same as `DatagrokVersion` | Chart tag base pulled from `oci://registry-1.docker.io/datagrok/datagrok` (template appends `-helm`, e.g. `1.26.5` → chart tag `1.26.5-helm`) |
| `GrokPipeVersion` | same as `DatagrokVersion` | `datagrok/grok_pipe` image tag |
| `GrokSpawnerVersion` | same as `DatagrokVersion` | `datagrok/grok_spawner` image tag |
| `GrokConnectVersion` | same as `DatagrokVersion` | `datagrok/grok_connect` image tag |
| `JupyterKernelGatewayVersion` | same as `DatagrokVersion` | `datagrok/jupyter_kernel_gateway` image tag |
| `RabbitMQVersion` | `4.0.5-management` | `rabbitmq` image tag (independent cadence) |
For production, pin DatagrokVersion to a specific release (e.g. 1.27.0) — this also fixes the chart and all other
services to the same release. Use bleeding-edge for dev/staging stands that track the latest master. See the
release history for available versions.
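The version rules in the table can be expressed in a few lines of shell. The following is a minimal sketch of the naming convention only, not the template's actual logic. The function names are hypothetical, and `pull_chart` assumes Helm 3.8 or later, which supports OCI registries.

```shell
# Sketch: the chart tag is the chart version with "-helm" appended.
chart_tag() {
  printf '%s-helm\n' "$1"
}

# Sketch: a per-service version falls back to DatagrokVersion when not set.
# $1 = DatagrokVersion, $2 = optional per-service override.
resolve_version() {
  printf '%s\n' "${2:-$1}"
}

# Sketch: pull the chart that matches a given Datagrok release.
pull_chart() {
  helm pull oci://registry-1.docker.io/datagrok/datagrok \
    --version "$(chart_tag "$1")"
}
```

With `DatagrokVersion` set to `1.27.0` and no overrides, `resolve_version 1.27.0 ""` yields `1.27.0` for every service, and `chart_tag 1.27.0` yields `1.27.0-helm`. Pulling the chart locally is a quick way to confirm that a release exists in the registry before pinning it in the stack.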
The template pulls all images from Docker Hub by default. To use a private registry mirror, override the image repository fields in the chart values file.
Update Datagrok components
You can update your Datagrok deployment without re-creating infrastructure. Before updating, we recommend backing up the database and persistent storage. Refer to your internal backup procedures.
How to update
Use the same deployment script and an updated version of the deployment profile:
- Click Update > Replace current template, and provide the new template URL corresponding to your deployment configuration.
- Specify new versions for image tags (see the release history). `HelmChartVersion` defaults to the same value as `DatagrokVersion`.
- Click Next, skip optional settings, and proceed to Review.
- If the stack enters a failed state (e.g., `UPDATE_ROLLBACK_IN_PROGRESS`), check events for details.
- CloudFormation will not replace the database or file storage during update.
- Your platform URL, admin credentials, and uploaded files will remain unchanged.
- Database schema migrations run automatically on the first start of the new app version.
- If you previously customized your deployment with environment variables, they will persist unless explicitly modified in parameters.
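The same update can be driven from the AWS CLI. The following is a rough sketch, assuming AWS CLI v2. The function names are hypothetical, the template URL is a placeholder you supply, and the capability flags are an assumption based on the template creating IAM roles. Every parameter other than `DatagrokVersion` is passed with `UsePreviousValue=true`, which is how CloudFormation keeps existing values on update.

```shell
# Sketch: build --parameters arguments for an update. DatagrokVersion gets the
# new value; every other key keeps its current value.
build_update_params() {
  new_version=$1
  shift
  for key in "$@"; do
    if [ "$key" = "DatagrokVersion" ]; then
      printf 'ParameterKey=%s,ParameterValue=%s\n' "$key" "$new_version"
    else
      printf 'ParameterKey=%s,UsePreviousValue=true\n' "$key"
    fi
  done
}

# Sketch: update a stack to a new Datagrok release.
# $1 = stack name, $2 = template URL (placeholder), $3 = new DatagrokVersion.
update_datagrok_stack() {
  stack=$1
  template_url=$2
  version=$3

  # Read the parameter keys currently set on the stack.
  keys=$(aws cloudformation describe-stacks --stack-name "$stack" \
    --query 'Stacks[0].Parameters[].ParameterKey' \
    --output text) || return 1

  # Word splitting of $keys and of the builder output is intended here.
  # The capabilities are assumed; adjust them to what the template requires.
  # shellcheck disable=SC2046,SC2086
  aws cloudformation update-stack --stack-name "$stack" \
    --template-url "$template_url" \
    --parameters $(build_update_params "$version" $keys) \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM
}
```

If the new template adds or removes parameters, the key list read from the existing stack no longer matches and the call fails. In that case, pass the changed parameters explicitly. Service versions that were pinned separately stay pinned, because they are carried over as previous values.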
Migrate from ECS
If you already run Datagrok on the legacy ECS CloudFormation template, you can migrate to
EKS without re-creating RDS or S3. The EKS template uses the same logical IDs (DatagrokDB, DatagrokS3,
DatagrokDBSubnetGroup, DatagrokDBSecurityGroup) and retains the same DeletionPolicy settings
(Snapshot for RDS, Retain for S3), so CloudFormation treats them as continuing resources on stack update.
- Take a snapshot of the RDS database and back up the contents of the S3 bucket.
- In the existing stack, click Update > Replace current template and point at the matching EKS template variant (same VPC / DNS choice as the original ECS stack).
- Review the change set. Fargate / ECS resources will be flagged for deletion; RDS and S3 should show as "no change" or "modify" — not "replace".
- Apply the update.
- Once the stack status is `UPDATE_COMPLETE`, the stack runs on EKS with the same database and bucket.
If the change set shows RDS or S3 flagged for replacement, stop and contact support@datagrok.ai — do not apply the update.
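The change set review can be automated as a safety check. The following is a minimal sketch, assuming AWS CLI v2 and a change set that has already been created for the stack. The function names are hypothetical. The logical IDs `DatagrokDB` and `DatagrokS3` are the ones named above. The check is deliberately conservative and also treats a `Conditional` replacement as a reason to stop.

```shell
# Sketch: list logical IDs that a change set would (or might) replace.
# $1 = stack name, $2 = change set name.
replaced_resources() {
  aws cloudformation describe-change-set \
    --stack-name "$1" --change-set-name "$2" \
    --query "Changes[?ResourceChange.Replacement=='True' || ResourceChange.Replacement=='Conditional'].ResourceChange.LogicalResourceId" \
    --output text
}

# Sketch: fail if the database or the bucket is flagged for replacement.
safe_to_migrate() {
  replaced=$(replaced_resources "$1" "$2") || return 1
  for id in $replaced; do
    case "$id" in
      DatagrokDB|DatagrokS3)
        echo "STOP: $id is flagged for replacement" >&2
        return 1
        ;;
    esac
  done
  echo "RDS and S3 are not flagged for replacement"
}
```

Run `safe_to_migrate <stack> <change-set>` after creating the change set and before executing it. A non-zero exit status means the update should not be applied.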
See also
- Install Datagrok with Helm — manual Helm-only install for any Kubernetes cluster
- AWS CloudFormation (ECS) — legacy deployment, deprecated
- Terraform