Creating Python Docker Apps

Overview

This guide outlines how to build and deploy your own Python-based Docker applications that integrate seamlessly with the Datagrok platform. Your Python functions will be executed as Celery tasks within isolated Docker containers. The platform takes care of container orchestration, task queuing, and result handling.

Key Benefits

  • Write only the logic: you focus on plain Python functions.
  • Zero infrastructure setup: The platform handles Celery, RabbitMQ, Docker, and scaling.
  • Full control: Optional configuration for resource limits and dependencies.

Folder Structure

Create a python/ directory inside your plugin root. Inside it:

  • One or more folders, each representing a separate application

  • Each folder can contain:

    • One or more .py files with annotated functions
    • An optional requirements.in or environment.yaml file for Python dependencies
    • An optional container.json file for resource configuration

Example layout:

plugin-root/
└── python/
    └── my_app/
        ├── logic.py
        ├── requirements.in
        └── container.json

Writing Your Python Code

You can define any number of functions inside your Python files. Functions must include metadata as comments:

# file: logic.py

#name: add
#tags: task
#input: int x
#input: int y
#output: int z
def add(x, y):
    return x + y

This metadata allows the platform to expose the function in the UI and JS API.
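
Any number of annotated functions can live in one file. As a purely illustrative sketch, a function using other common header types (double, string) and an optional #description line might look like this:

#name: scale
#description: Multiplies a value by a factor
#tags: task
#input: double value
#input: double factor
#output: double result
def scale(value, factor):
    return value * factor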


Optional Files

requirements.in

List of pip dependencies for your app:

numpy
scikit-learn
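
Pinned versions are also valid pip syntax and make builds reproducible (the versions below are illustrative):

numpy==1.26.4
scikit-learn==1.5.0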

environment.yaml

A valid Conda environment file, used if you prefer Miniconda as your package manager:

name: my-simple-app
channels:
  - defaults
dependencies:
  - python=3.11
  - pip
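
If some packages are only available from PyPI, a nested pip section can be added, as Conda environment files allow (the extra package below is illustrative):

name: my-simple-app
channels:
  - defaults
dependencies:
  - python=3.11
  - pip
  - pip:
      - scikit-learn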

container.json

Optional resource configuration:

{
  "cpu": 1,
  "memory": 1024
}

By default, the Celery prefork pool is used, and the container runs cpu * 2 worker processes.

Note: the default and minimum value of the cpu parameter for Celery-based containers is 1.
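
For example, assuming memory is specified in megabytes (as in the example above), the following illustrative configuration requests two CPUs, which yields 2 * 2 = 4 prefork worker processes:

{
  "cpu": 2,
  "memory": 2048
}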


Platform Responsibilities

When you deploy a plugin:

  1. Function Parsing: The platform parses your annotated Python functions.
  2. Task Wrapping: A Celery main file that registers your functions as tasks is generated automatically.
  3. Docker Build: A Dockerfile is generated from boilerplate.
  4. Container Deployment: Containers are started using grok_spawner.
  5. Celery Worker Setup: Each container runs a Celery worker bound to your task queue.
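
Conceptually, the generated Celery main file registers each annotated function as a task, roughly like the sketch below (the broker URL, queue name, and task name are illustrative; the actual generated code is platform-internal):

# celery_main.py: illustrative sketch, not the real generated file
from celery import Celery

from logic import add  # your annotated function

app = Celery('my_app', broker='amqp://guest@rabbitmq//')
# Route this app's tasks to a dedicated queue (name is hypothetical).
app.conf.task_routes = {'Plugin:add': {'queue': 'my_app_queue'}}

@app.task(name='Plugin:add')
def add_task(x, y):
    # Delegates to the plain Python function you wrote.
    return add(x, y)

The worker inside the container is then started against that queue, e.g. celery -A celery_main worker -Q my_app_queue.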

Executing Tasks

Once your app is deployed, functions can be called via the JS API:

const sum = await grok.functions.call('Plugin:add', { x: 1, y: 2 }); // sum === 3

The platform will:

  • Publish the task to RabbitMQ
  • Route it to the correct worker container
  • Execute your function
  • Collect the result and return it to you

Best Practices

  • Keep your apps stateless to ensure parallel execution safety
  • Use metadata comments to describe inputs/outputs clearly
  • Declare dependencies explicitly in requirements.in
  • Test locally before deploying; annotated functions are plain Python, so they can be unit-tested directly (see the sketch below)
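
A minimal sketch of such a test for the add function from logic.py, assuming pytest as the runner (any test framework works):

# test_logic.py: run with pytest before deploying
from logic import add

def test_add():
    assert add(1, 2) == 3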

Troubleshooting

  • Task not showing up? Check function annotations for correct formatting
  • Dependency error? Validate your requirements.in by running pip install -r requirements.in locally
  • Performance issues? Tune cpu and memory in container.json

Conclusion

This approach allows you to define lightweight, scalable, and easily deployable Docker-based Python apps with minimal effort. By leveraging Celery and RabbitMQ, the platform ensures high performance, reliability, and scalability out of the box.
