Regular machine
Datagrok runs as a set of Docker containers on top of a PostgreSQL metadata database and persistent file storage. This page covers manual on-host installs — bare-metal servers, on-prem VMs, single EC2 / GCE instances, or any other host you manage directly with Docker Compose. The same Datagrok services run on every deployment — see Components for the canonical list.
For new AWS stands, prefer the CloudFormation (EKS) or CloudFormation (ECS) templates — they automate everything on this page and are the actively developed AWS path. For Kubernetes (on-prem, GKE, AKS, or existing clusters), use the Helm chart. This page is for hosts without an orchestrator.
Prerequisites
- A Linux host (or a Linux VM on Windows / macOS) with at least 4 CPUs, 8 GB RAM, and 60 GB of free disk space for the full stack including server-side scripting.
- Docker Engine and Docker Compose v2 installed, with the user that will run the stack added to the `docker` group (see the install sketch after this list).
- A PostgreSQL 17 database. The bundled compose stack runs Postgres in-cluster; for production prefer a managed instance (AWS RDS, GCP Cloud SQL, Azure Database for PostgreSQL, or an on-prem cluster).
- Object storage. The bundled compose stack uses a local volume; production stands typically use S3, GCS, Azure Blob, or an S3-compatible service.
- DNS or a load balancer pointing at the host on port `8080` (Datagrok) — direct port exposure works for evaluation, but production stands should sit behind a TLS-terminating reverse proxy.
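If Docker isn't installed yet, Docker's convenience script is one quick route. A minimal sketch, assuming a supported Linux distribution with `curl` available; review any script before piping it to a shell:

```sh
# Install Docker Engine together with the Compose v2 plugin.
curl -fsSL https://get.docker.com | sh

# Allow the current user to run docker without sudo (takes effect on next login).
sudo usermod -aG docker "$USER"

# Confirm both tools are available.
docker --version
docker compose version
```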
Install
- Clone the public repository on the host (it ships the canonical compose file):

  ```sh
  git clone https://github.com/datagrok-ai/public.git
  cd public/docker
  ```
- Open `localhost.docker-compose.yaml` and edit the `GROK_PARAMETERS` JSON on the `datagrok` service. Replace the values inline with your database and storage details (drop the `amazonStorage*` block if you're using local file storage):

  ```json
  {
    "dbServer": "<DATABASE_HOST>",
    "dbPort": "5432",
    "db": "datagrok",
    "dbLogin": "datagrok",
    "dbPassword": "<DB_PASSWORD>",
    "dbAdminLogin": "<POSTGRES_ADMIN_USER>",
    "dbAdminPassword": "<POSTGRES_ADMIN_PASSWORD>",
    "amazonStorageRegion": "us-east-2",
    "amazonStorageBucket": "<S3_BUCKET>"
  }
  ```

  See Server configuration for every supported key — including GCS, Azure, RDS IAM auth, and TLS options.
- Pull the images and start the full stack:

  ```sh
  docker compose -f localhost.docker-compose.yaml --project-name datagrok \
    --profile all up -d
  ```

  Use the `--profile` flags from Local machine: advanced to skip optional services (e.g., drop server-side scripting).
- After about a minute the server is ready at `http://<HOST>:8080`. Sign in as `admin` / `admin` and change the admin password on first login. To check connectivity and watch startup progress, see the sketch after these steps.
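Two useful checkpoints: confirming the endpoints in `GROK_PARAMETERS` are reachable before the stack starts, and watching it come up afterwards. A minimal sketch; the `psql` and AWS CLI probes assume an external Postgres and S3 bucket with the placeholder values from the configuration step, plus both clients installed on the host (skip them if you kept the bundled in-cluster defaults):

```sh
# Pre-flight: the metadata database accepts admin connections
# (placeholders from the configuration step; replace with real values).
PGPASSWORD='<POSTGRES_ADMIN_PASSWORD>' \
  psql "host=<DATABASE_HOST> port=5432 dbname=postgres user=<POSTGRES_ADMIN_USER>" -c 'SELECT 1;'

# Pre-flight: the S3 bucket is visible to the host's AWS credentials.
aws s3 ls "s3://<S3_BUCKET>" --region us-east-2

# After `up -d`: list service states and follow the datagrok service log.
docker compose -f localhost.docker-compose.yaml --project-name datagrok ps
docker compose -f localhost.docker-compose.yaml --project-name datagrok logs -f datagrok

# Once the UI responds, a plain HTTP probe succeeds too.
curl -fsS http://localhost:8080 >/dev/null && echo "Datagrok is up"
```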
Multi-host topologies
For multi-host installs (Datagrok services on one host, scripting / Jupyter Kernel Gateway on another, or larger), use the Helm chart on Kubernetes. A single-node K8s distribution like k3s or kind is enough if you don't already run a cluster.
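For illustration, a single-node cluster takes only a couple of commands to stand up. A sketch, assuming k3s; the Helm repository URL and chart name below are placeholders, to be taken from the Helm chart page:

```sh
# Stand up a single-node Kubernetes cluster with k3s.
curl -sfL https://get.k3s.io | sh -
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml

# Install the chart; <HELM_REPO_URL> and <CHART_NAME> are placeholders,
# use the values documented on the Helm chart page.
helm repo add datagrok <HELM_REPO_URL>
helm repo update
helm install datagrok datagrok/<CHART_NAME> --namespace datagrok --create-namespace
```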
On AWS EC2
For a single EC2 instance with RDS and S3 attached, follow this page and supply the RDS endpoint and S3 bucket details in `GROK_PARAMETERS` — see AWS EC2 specifics. For multi-AZ, autoscaling, or load-balanced production stands on AWS, use the CFN ECS or CFN EKS template instead — they provision the host fleet, RDS, S3, and ALBs end-to-end.
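If the RDS instance and bucket already exist, the AWS CLI can pull the exact values that go into `GROK_PARAMETERS`. A sketch, assuming configured credentials and a hypothetical instance identifier of `datagrok`:

```sh
# Fetch the RDS endpoint to use as dbServer (instance identifier is hypothetical).
aws rds describe-db-instances --db-instance-identifier datagrok \
  --query 'DBInstances[0].Endpoint.Address' --output text

# Create the storage bucket if it doesn't exist yet (name is a placeholder).
aws s3 mb s3://<S3_BUCKET> --region us-east-2
```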
See also
- Components — service list and roles
- Server configuration — full `GROK_PARAMETERS` reference
- Helm chart — single- or multi-node Kubernetes installs