Install Datagrok with Helm
The Datagrok Helm chart is published as an OCI artifact on Docker Hub. It supports in-cluster PostgreSQL with local PVCs (laptop / single-node clusters), as well as managed cloud databases (RDS / Cloud SQL) with object storage (S3 / GCS) on EKS or GKE.
For turnkey EKS deployments, use the AWS CloudFormation (EKS) template — it provisions the cluster, RDS, S3, and IAM, and installs this chart automatically. This page covers manual Helm installs for any Kubernetes cluster (on-prem, GKE, AKS, kind, k3s, MicroK8s).
Prerequisites
- Kubernetes 1.27+ with a default StorageClass (or an existing PVC for stateful data)
- Helm 3.8+ (for OCI registry support)
- For cloud installs: a managed Postgres instance, an object storage bucket, and an IAM role / service account that the cluster can use to reach them
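Before installing, it's worth confirming the Helm binary is new enough for OCI registries. A minimal sketch of that version check as a pure-shell function; in practice you would feed it the output of `helm version --template '{{.Version}}'`:

```shell
# Check whether a Helm version string is >= 3.8 (required for OCI chart pulls).
helm_supports_oci() {
  v=${1#v}            # strip a leading "v" if present
  major=${v%%.*}      # e.g. 3
  rest=${v#*.}
  minor=${rest%%.*}   # e.g. 12
  if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 8 ]; }; then
    echo ok
  else
    echo old
  fi
}

helm_supports_oci "v3.12.0"   # prints: ok
```

A default StorageClass can be checked with `kubectl get storageclass` (look for the `(default)` marker).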
Chart version
The chart is published as an OCI artifact in the shared datagrok/datagrok Docker Hub
repo under tags suffixed with -helm to keep the chart and image tag namespaces
disjoint. --version 1.27.3-helm pulls a chart that deploys Datagrok 1.27.3, and
every sub-service (grok-pipe, grok-spawner, grok-connect, Jupyter Kernel Gateway)
defaults to the same app tag. Override individual service image tags via
--set <service>.image.tag=... — see Service versions below.
Chart tags follow the same scheme as the Datagrok image tags with a -helm suffix; see
Images and versions for the full convention.
Quick install (in-cluster Postgres + local storage)
helm install datagrok oci://registry-1.docker.io/datagrok/datagrok \
--version 1.27.3-helm \
--namespace datagrok --create-namespace \
--set postgres.password=$(openssl rand -base64 24) \
--set postgres.adminPassword=$(openssl rand -base64 24) \
--set ingress.host=datagrok.example.com
This installs PostgreSQL, RabbitMQ, the datagrok app, grok-pipe, grok-spawner, grok-connect, and Jupyter Kernel Gateway with default resources. Suitable for evaluation, dev, or single-tenant production.
To track the latest unstable build (rebuilt after every merge to master), use
--version bleeding-edge-helm instead of a release version.
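The same quick install can be driven from a values file instead of `--set` flags. A sketch, assuming the value keys mirror the flags above (`postgres.password`, `postgres.adminPassword`, `ingress.host`); the passwords here are placeholders:

```yaml
# values-quick.yaml: file equivalent of the --set flags above
postgres:
  password: change-me        # e.g. output of: openssl rand -base64 24
  adminPassword: change-me-too
ingress:
  host: datagrok.example.com
```

Pass it with `-f values-quick.yaml` in place of the `--set` flags.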
Service versions
Every service image tag defaults to the chart version. The default set is documented on
Images and versions and matches the
AWS CloudFormation templates. Override individual tags when
you need to run a newer grok_connect against an older datagrok core, or to pin a
specific service during a rollout:
helm install datagrok oci://registry-1.docker.io/datagrok/datagrok \
--version 1.27.3-helm \
--set datagrok.image.tag=1.27.3 \
--set grokPipe.image.tag=1.19.0 \
--set grokConnect.image.tag=2.6.2 \
--set spawner.image.tag=2.16.0 \
--set jkg.image.tag=1.31.0 \
--set grokRegistryProxy.image.tag=1.27.1 \
-n datagrok
RabbitMQ follows its own upstream release cadence and is not pinned to the Datagrok version.
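For repeatable rollouts, the same overrides can live in a values file. A sketch using the keys implied by the flags above:

```yaml
# Pin individual service images; anything omitted defaults to the chart version.
datagrok:
  image:
    tag: 1.27.3
grokPipe:
  image:
    tag: 1.19.0
grokConnect:
  image:
    tag: 2.6.2
spawner:
  image:
    tag: 2.16.0
jkg:
  image:
    tag: 1.31.0
grokRegistryProxy:
  image:
    tag: 1.27.1
```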
Production install on AWS EKS
For new AWS stands use the AWS CloudFormation (EKS) template — it provisions EKS, RDS, S3, IAM with IRSA, and installs this chart for you. The steps below are for installing the chart directly into an EKS cluster you already manage.
- Configure kubectl:

  aws eks update-kubeconfig --name <cluster-name>

- Install the AWS Load Balancer Controller in the cluster; the chart's ingress.className: alb setting and ALB ingress annotations require it.

- Provision RDS, S3, and the IRSA role for the Datagrok ServiceAccount yourself (rds-db:connect, s3:Get/Put/List on the bucket, optionally Secrets Manager read).
- Save the EKS overlay below as values-prod.yaml, filling in the SET fields:

  postgres:
    internal: false
    external:
      host: your-db.xxxxx.us-east-1.rds.amazonaws.com
      port: 5432
      ssl: true
  storage:
    type: s3
    s3:
      bucket: your-datagrok-bucket
      region: us-east-1
  ingress:
    enabled: true
    className: alb
    host: datagrok.example.com
    annotations:
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/target-type: ip
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:...:certificate/...
    tls:
      enabled: false # ALB terminates TLS
  serviceAccount:
    create: true
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/datagrok-role
  registry:
    type: proxy
    proxy:
      backendUrl: https://ACCOUNT.dkr.ecr.REGION.amazonaws.com
    credentials:
      source: externalSecrets
  externalSecrets:
    enabled: true
    remoteKey: datagrok/prod # AWS Secrets Manager key
- Install:

  helm install datagrok oci://registry-1.docker.io/datagrok/datagrok \
    --version 1.27.3-helm \
    -f values-prod.yaml \
    -n datagrok --create-namespace
The chart repo also ships a ready-made values-eks.yaml skeleton you can copy.
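For reference, the S3 portion of the IRSA role policy typically looks like the sketch below. The bucket name is a placeholder, and the rds-db:connect and Secrets Manager statements follow the same pattern; adjust to your security requirements:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::your-datagrok-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-datagrok-bucket"
    }
  ]
}
```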
Production install on GCP GKE
Same flow as EKS, but use the values-gke.yaml overlay (Cloud SQL + GCS + GKE
Workload Identity).
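A sketch of the GKE-specific values, assuming the chart exposes GCS keys analogous to the EKS overlay's s3 block (storage.type: gcs and gcs.bucket are assumptions here); the Workload Identity annotation is the standard GKE one, and all names are placeholders:

```yaml
postgres:
  internal: false
  external:
    host: 10.0.0.5   # Cloud SQL private IP or Auth Proxy address (placeholder)
    port: 5432
    ssl: true
storage:
  type: gcs          # assumed key, by analogy with type: s3 above
  gcs:
    bucket: your-datagrok-bucket
serviceAccount:
  create: true
  annotations:
    iam.gke.io/gcp-service-account: datagrok@PROJECT.iam.gserviceaccount.com
```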
Upgrades
helm upgrade datagrok oci://registry-1.docker.io/datagrok/datagrok \
--version 1.27.4-helm \
-f values-prod.yaml \
-n datagrok
For production stands, always upgrade through consecutive minor versions. The chart's PostgreSQL StatefulSet and the datagrok-data and datagrok-cfg PVCs are preserved across upgrades; database schema migrations run automatically on the first start of each new app version.
Reusing existing PVCs
If you're migrating an existing Datagrok install to the chart and want to keep your data, point the chart at the existing claims:
postgres:
internal: true
existingClaim: my-existing-postgres-pvc
storage:
type: local
local:
existingDataClaim: my-existing-data-pvc
existingCfgClaim: my-existing-cfg-pvc
When postgres.existingClaim is set, the chart skips its own
volumeClaimTemplates and mounts the data volume at the PV root (no pgdata
subPath), matching the on-disk layout used by Datagrok installs prior to the
chart.
Uninstall
helm uninstall datagrok -n datagrok
PVCs are NOT deleted by helm uninstall. To remove all data permanently:
kubectl delete pvc -n datagrok -l app.kubernetes.io/instance=datagrok
kubectl delete namespace datagrok