Local machine: advanced

This article describes additional options for running Datagrok on your local machine with Docker Compose. By following these instructions, you can customize your local Datagrok stand or start a second one alongside it.

Hardware requirements

Minimum hardware requirements: 60 GB of free disk space, 4 CPUs, 8 GB RAM.
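The disk and CPU figures can be checked before installing. The following is a minimal Python sketch, not part of Datagrok: the function name check_host is hypothetical, it assumes Docker stores its data on the drive that holds the current directory, and it skips the RAM check because the Python standard library has no portable call for it. Docker Desktop may also apply its own resource limits, which this script does not see.

```python
import os
import shutil

# Thresholds taken from the requirements stated above.
MIN_FREE_DISK_GB = 60
MIN_CPUS = 4


def check_host(path=".", min_disk_gb=MIN_FREE_DISK_GB, min_cpus=MIN_CPUS):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []

    # Free space on the drive that holds `path` (assumed to be Docker's drive).
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    if free_gb < min_disk_gb:
        problems.append(f"free disk space: {free_gb:.1f} GB, need {min_disk_gb} GB")

    # Logical CPUs visible to the operating system; cpu_count() may return None.
    cpus = os.cpu_count() or 0
    if cpus < min_cpus:
        problems.append(f"CPUs: {cpus}, need {min_cpus}")

    return problems


if __name__ == "__main__":
    for problem in check_host():
        print("Below minimum:", problem)
```

The script prints nothing when both checks pass.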

Prerequisites

  1. Install and launch the latest Docker Desktop application for your operating system.

  2. Clone the public repository that contains the docker-compose file.

  3. Open the command-line interface and navigate to the directory where you cloned the repository.

Installing Datagrok

  1. Pull the latest Datagrok Docker images:

    docker compose -f docker\localhost.docker-compose.yaml --profile all pull
  2. Run the basic Datagrok stand

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all up -d

    Note: If you encounter an error related to a WriteFile function when running docker compose up on Windows, try running the command prompt (cmd) as Administrator. This is a known issue with Docker on certain computers.

  3. After the docker compose command completes, wait approximately 1 minute for the Datagrok server to start, then log in as described in Log in to Datagrok.

Log in to Datagrok

  1. Once the server is up and running, open your browser and go to http://localhost:8080.
  2. On the login page, use the following credentials to log in:
    • Login or Email: admin
    • Password: admin
Note: If you see the message Datagrok server is unavailable, wait approximately 1 minute for the server to start, and then reload the page.
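Instead of reloading the page by hand, the wait can be scripted. The sketch below is one possible approach in Python, not an official Datagrok tool: the function name wait_for_server and its timeout values are illustrative assumptions, and an HTTP 200 response only shows that the web server answers, so the login page may still report that the server is starting.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:8080", timeout_s=120, interval_s=5):
    """Poll `url` until it answers with HTTP 200 or `timeout_s` seconds pass.

    Returns True if a 200 response was received, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, reset, or timed out: the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling wait_for_server() with no arguments polls http://localhost:8080 every 5 seconds for up to two minutes.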

CVM features

If you do not need CVM features, you can run only Datagrok application containers to save space and resources:

docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
--profile datagrok --profile db up -d

If you need CVM features only, you can run only CVM application containers:

docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
--profile cvm up -d

To run Datagrok with specific CVM features only, select them on the command line with the --profile flag:

  • Cheminformatics

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile datagrok --profile db --profile chem up -d
  • Jupyter notebook

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile datagrok --profile db --profile jupyter_notebook up -d
  • Scripting

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile datagrok --profile db --profile scripting up -d
  • Modeling

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile datagrok --profile db --profile modeling up -d
  • Features can be enabled in any combination:

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile datagrok ^
    --profile db ^
    --profile chem ^
    --profile scripting ^
    --profile jupyter_notebook ^
    --profile modeling ^
    up -d
  • No CVM feature requires the Datagrok container to be running, so you can omit its profiles from the command:

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile chem ^
    --profile scripting ^
    --profile jupyter_notebook ^
    --profile modeling ^
    up -d
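Because every variant above differs only in its list of --profile flags, the command can be assembled programmatically. The following Python sketch assumes the compose file path and profile names exactly as they appear in this article; the helper names compose_command and run_compose are hypothetical, and the compose file may define profiles that are not listed here.

```python
import subprocess

# Compose file path as written in this article (Windows-style separator).
COMPOSE_FILE = r"docker\localhost.docker-compose.yaml"

# Profile names that appear in this article; the compose file may define others.
KNOWN_PROFILES = {
    "all", "datagrok", "db", "cvm", "demo",
    "chem", "jupyter_notebook", "scripting", "modeling",
}


def compose_command(profiles, action=("up", "-d"), project_name="datagrok",
                    compose_file=COMPOSE_FILE):
    """Build the docker compose argument list for the given profiles."""
    unknown = [p for p in profiles if p not in KNOWN_PROFILES]
    if unknown:
        raise ValueError(f"unknown profile(s): {', '.join(unknown)}")
    cmd = ["docker", "compose", "-f", compose_file, "--project-name", project_name]
    for profile in profiles:
        cmd += ["--profile", profile]
    return cmd + list(action)


def run_compose(profiles, **kwargs):
    """Run the command; raises CalledProcessError if docker exits with an error."""
    return subprocess.run(compose_command(profiles, **kwargs), check=True)
```

For example, run_compose(["datagrok", "db", "chem"]) would start Datagrok with cheminformatics only. Rejecting unknown names early catches typos such as "chemm" before Docker is invoked.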

Multiple stands

It is possible to run multiple stands of Datagrok on one host machine. To do so:

  1. Run the first stand as described in Installing Datagrok.

  2. Set the Docker image versions with environment variables. Each value can be any tag available on Docker Hub.

    Environment variable    Default value
    DATAGROK_VERSION        latest
    GROK_SPAWNER_VERSION    latest
    GROK_CONNECT_VERSION    latest
    GROK_COMPUTE_VERSION    latest

    set DATAGROK_VERSION=latest
    set GROK_SPAWNER_VERSION=latest
    set GROK_CONNECT_VERSION=latest
    set GROK_COMPUTE_VERSION=latest
  3. Set environment variables for mapped ports:

    Environment variable                     Default value
    DATAGROK_PORT                            8080
    DATAGROK_DB_PORT                         5432
    DATAGROK_CVM_PORT                        8090
    DATAGROK_H2O_PORT                        54321
    DATAGROK_H2O_HELPER_PORT                 5005
    GROK_SPAWNER_PORT                        8000
    GROK_CONNECT_PORT                        1234
    DATAGROK_DEMO_POSTGRES_NORTHWIND_PORT    5433
    DATAGROK_DEMO_POSTGRES_CHEMBL_PORT       5434
    DATAGROK_DEMO_POSTGRES_UNICHEM_PORT      5435
    DATAGROK_DEMO_POSTGRES_STARBUCKS_PORT    5436
    DATAGROK_DEMO_POSTGRES_WORLD_PORT        5437

    To start the second stand properly, each value must differ from the ports used by the existing stands. For example, you can increment every default port by 11.

    set DATAGROK_PORT=8091
    set DATAGROK_DB_PORT=5443
    set DATAGROK_CVM_PORT=8101
    set DATAGROK_H2O_PORT=54332
    set DATAGROK_H2O_HELPER_PORT=5016
    set GROK_SPAWNER_PORT=8011
    set GROK_CONNECT_PORT=1245
    set DATAGROK_DEMO_POSTGRES_NORTHWIND_PORT=5444
    set DATAGROK_DEMO_POSTGRES_CHEMBL_PORT=5445
    set DATAGROK_DEMO_POSTGRES_UNICHEM_PORT=5446
    set DATAGROK_DEMO_POSTGRES_STARBUCKS_PORT=5447
    set DATAGROK_DEMO_POSTGRES_WORLD_PORT=5448
  4. Run the second stand under a different project name. The first stand uses the project name datagrok, so choose another one, for example datagrok_2:

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok_2 ^
    --profile all up -d
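The port arithmetic in step 3 is easy to get wrong by hand, so it can help to derive the values from the defaults. The Python sketch below uses the default ports from the table above and assumes the first stand runs on those defaults; the function names are hypothetical, and the script does not check for ports used by other software on the host.

```python
import os
import subprocess

# Default host ports, copied from the table in step 3.
DEFAULT_PORTS = {
    "DATAGROK_PORT": 8080,
    "DATAGROK_DB_PORT": 5432,
    "DATAGROK_CVM_PORT": 8090,
    "DATAGROK_H2O_PORT": 54321,
    "DATAGROK_H2O_HELPER_PORT": 5005,
    "GROK_SPAWNER_PORT": 8000,
    "GROK_CONNECT_PORT": 1234,
    "DATAGROK_DEMO_POSTGRES_NORTHWIND_PORT": 5433,
    "DATAGROK_DEMO_POSTGRES_CHEMBL_PORT": 5434,
    "DATAGROK_DEMO_POSTGRES_UNICHEM_PORT": 5435,
    "DATAGROK_DEMO_POSTGRES_STARBUCKS_PORT": 5436,
    "DATAGROK_DEMO_POSTGRES_WORLD_PORT": 5437,
}


def shifted_ports(offset, defaults=DEFAULT_PORTS):
    """Add `offset` to every default port and reject clashes with the defaults."""
    shifted = {name: port + offset for name, port in defaults.items()}
    clashes = sorted(set(shifted.values()) & set(defaults.values()))
    if clashes:
        raise ValueError(f"offset {offset} reuses first-stand ports: {clashes}")
    if min(shifted.values()) < 1 or max(shifted.values()) > 65535:
        raise ValueError(f"offset {offset} produces a port outside 1-65535")
    return shifted


def second_stand_env(offset=11, base_env=None):
    """Return a copy of the environment with the shifted port variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({name: str(port) for name, port in shifted_ports(offset).items()})
    return env


def start_second_stand(offset=11, project_name="datagrok_2"):
    """Start the second stand with its own ports and project name."""
    cmd = ["docker", "compose", "-f", r"docker\localhost.docker-compose.yaml",
           "--project-name", project_name, "--profile", "all", "up", "-d"]
    return subprocess.run(cmd, env=second_stand_env(offset), check=True)
```

With an offset of 11, DATAGROK_PORT becomes 8091 and DATAGROK_DB_PORT becomes 5443. An offset of 10 is rejected because 8080 + 10 equals the default CVM port 8090, and an offset of 1 is rejected because 5432 + 1 equals the default Northwind port 5433.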

Demo Databases

Datagrok offers demo databases that include sample data. To install them locally, run the following command:

docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
--profile all --profile demo up -d
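Once the demo profile is running, you may want to confirm that the demo databases accept connections. The following Python sketch only tests whether the TCP ports are open; it assumes the default demo ports listed in this article (5433 to 5437) and uses lowercase labels derived from the environment variable names. It does not authenticate or run a query, so an open port is not proof that the data has finished loading.

```python
import socket

# Default host ports of the demo databases (labels are illustrative).
DEMO_DB_PORTS = {
    "northwind": 5433,
    "chembl": 5434,
    "unichem": 5435,
    "starbucks": 5436,
    "world": 5437,
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def demo_db_status(host="localhost", ports=DEMO_DB_PORTS):
    """Map each demo database label to True (port open) or False."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_open in demo_db_status().items():
        print(f"{name}: {'open' if is_open else 'closed'}")
```

If you changed the DATAGROK_DEMO_POSTGRES_* variables, pass your own mapping through the ports argument.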

Shutting down Datagrok

To shut down Datagrok, run the following command:

docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
--profile all --profile demo stop

All the data is saved in the Docker volumes.

To reset Datagrok to factory settings and remove all stored data, including all created users, projects, connections, etc., run the following command:

docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
--profile all --profile demo down --volumes

Troubleshooting

  1. If you run into issues, check the settings in Datagrok (Tools -> Settings...):

    • Connectors:
      • External Host: grok_connect
    • Scripting:
      • CVM Url: http://cvm:8090
      • CVM URL Client: http://localhost:8090
      • H2o Url: http://h2o:54321
      • API Url: http://datagrok:8080/api
      • Cvm Split: true
    • Dev:
      • CVM Url: http://localhost:8090
      • Cvm Split: true
      • API Url: http://datagrok:8080/api
  2. Check the container logs for errors and report any problems you find:

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo logs

    You can also watch the logs of the desired service in real-time.

    • Replace <service> with the name of the service you want to watch.
    • Replace <number> with the number of log lines to show, or remove --tail <number> entirely to see the full log.
    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo logs -f --tail <number> <service>
  3. Restart the Docker Compose stand:

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo down
    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo up -d
  4. For advanced service troubleshooting, you can open a shell inside a container.

    • Replace <service> with one of the services: db, datagrok, grok_connect, grok_compute, grok_spawner, jupyter_notebook, jupyter_kernel_gateway, h2o, etc.
    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo exec <service> /bin/sh
  5. Docker logs might take up all your free disk space. If such a situation has already taken place:

  6. As a last resort, recreate the stand completely. Note that down --volumes removes all stored data:

    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo down --volumes
    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo pull
    docker compose -f docker\localhost.docker-compose.yaml --project-name datagrok ^
    --profile all --profile demo up -d
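When the combined log from step 2 is long, filtering it for likely problems can save time. The Python sketch below is one way to do that and is not a Datagrok tool: the keyword list is an assumption rather than a Datagrok logging convention, the function names are hypothetical, and fetch_logs assumes that docker compose writes the log text to standard output.

```python
import subprocess

# Compose file path as written in this article (Windows-style separator).
COMPOSE_FILE = r"docker\localhost.docker-compose.yaml"

# Substrings treated as signs of trouble (an assumption; adjust as needed).
KEYWORDS = ("error", "exception", "fatal")


def suspicious_lines(log_text, keywords=KEYWORDS):
    """Return the log lines that contain any keyword, ignoring case."""
    return [line for line in log_text.splitlines()
            if any(keyword in line.lower() for keyword in keywords)]


def fetch_logs(service=None, tail=500, project_name="datagrok"):
    """Return the last `tail` log lines per service as plain text."""
    cmd = ["docker", "compose", "-f", COMPOSE_FILE,
           "--project-name", project_name,
           "--profile", "all", "--profile", "demo",
           "logs", "--no-color", "--tail", str(tail)]
    if service:
        cmd.append(service)
    result = subprocess.run(cmd, capture_output=True, text=True,
                            encoding="utf-8", errors="replace", check=True)
    return result.stdout
```

For example, suspicious_lines(fetch_logs("datagrok")) would list the matching lines from the datagrok service only. A keyword match is a hint, not a diagnosis, so read the surrounding log lines before reporting a problem.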