
Free-world data exploration

Traditional "walled garden" systems lock users into fixed views and predefined workflows, creating silos and fragmented decisions.

Datagrok takes a fundamentally different approach. It provides a fluid, responsive environment that adapts to how people naturally work with data, an approach we call "free-world data exploration". As users interact, Datagrok surfaces relevant tools, computations, models, and linked records. There's no need to switch systems, ask for data, or even know the data exists. The data appears in context exactly when needed.

While most of this behavior works out of the box, power users can customize and extend the platform to match the specific needs of their users.

Semantic types

Semantic types define what data represents (e.g., molecules, coordinates) and form the basis for automatic behavior across the platform. When Datagrok detects a semantic type, it connects the data with functions that accept that type as input.

For example, when a column contains SMILES strings, Datagrok automatically assigns it the semantic type molecule, renders structures in 2D, enables the Chem menu, and activates tools like sketchers and chemical info panes as users interact with data.

Datagrok supports built-in detection for many common types. You can write custom functions for the existing ones or define new semantic types as needed.
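The detect-then-dispatch idea described above can be sketched in a few lines of Python. This is a conceptual illustration only, not Datagrok's detection code or API: the detector functions, the 0.9 threshold, the type names, and the function registry are all hypothetical, and the SMILES check is a crude character filter rather than real structure validation.

```python
import re
from typing import Callable, Dict, List, Optional

# Crude character filter for SMILES-like strings (an assumption for illustration,
# not real chemistry validation).
SMILES_CHARS = re.compile(r"^[A-Za-z0-9@+\-\[\]\(\)=#$%/\\.]+$")


def looks_like_smiles(value: str) -> bool:
    # Require allowed characters plus at least one common atom symbol.
    return bool(SMILES_CHARS.match(value)) and any(c in value for c in "CNOcno")


def looks_like_latitude(value: str) -> bool:
    try:
        return -90.0 <= float(value) <= 90.0
    except ValueError:
        return False


# Hypothetical detectors: semantic type name -> predicate on a single cell value.
DETECTORS: Dict[str, Callable[[str], bool]] = {
    "Molecule": looks_like_smiles,
    "Latitude": looks_like_latitude,
}

# Hypothetical registry: functions declare the semantic type they accept as input.
FUNCTIONS_BY_INPUT_TYPE: Dict[str, List[str]] = {
    "Molecule": ["render_2d_structure", "open_sketcher", "show_chem_info_pane"],
    "Latitude": ["show_on_map"],
}


def detect_semantic_type(values: List[str], threshold: float = 0.9) -> Optional[str]:
    """Return the first type whose detector accepts at least `threshold` of the non-empty values."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return None
    for sem_type, detector in DETECTORS.items():
        hits = sum(1 for v in non_empty if detector(v))
        if hits / len(non_empty) >= threshold:
            return sem_type
    return None


def functions_for_column(values: List[str]) -> List[str]:
    """Look up the functions whose declared input type matches the detected type."""
    sem_type = detect_semantic_type(values)
    return FUNCTIONS_BY_INPUT_TYPE.get(sem_type, []) if sem_type else []
```

Calling `functions_for_column(["CCO", "c1ccccc1", "CC(=O)O"])` returns the molecule-related entries, while a column of free text matches no detector and yields an empty list. Defining a new semantic type in this sketch amounts to adding one detector and one registry entry.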

Text as semantic value

Datagrok can treat plain text (e.g., compound IDs) as semantic values and dynamically surface related content. For example, registering a pattern like CHEMBL\d+ allows Datagrok to:

  • Detect and highlight matching values across the platform
  • Make identifiers clickable and searchable
  • Show related data in tooltips, search cards, or info panes

This is especially useful when working with multiple identifier types, such as those found in GDB exports or compound registries. When a user interacts with a matching value (by hovering, clicking, or searching for it), your handler can query the data source and show the result in context.

This capability is delivered through Datagrok packages. Once the package is published, patterns work globally for all users who have the package installed. To implement, you'll need basic familiarity with regular expressions, JavaScript, and the Datagrok plugin framework. See documentation.
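As a rough illustration of the pattern-plus-handler idea, the sketch below registers the `CHEMBL\d+` pattern from the example above and resolves matching values. It is written in plain Python and does not use the Datagrok plugin framework: `register_pattern`, `find_identifiers`, `resolve`, and the stub handler are hypothetical names, and a real handler would query an actual data source.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical registry of (compiled pattern, handler) pairs.
HANDLERS: List[Tuple[re.Pattern, Callable[[str], Dict[str, str]]]] = []


def register_pattern(pattern: str, handler: Callable[[str], Dict[str, str]]) -> None:
    """Associate an identifier pattern with a handler that fetches related content."""
    HANDLERS.append((re.compile(pattern), handler))


def chembl_handler(identifier: str) -> Dict[str, str]:
    # Stub: a real handler would query a data source here.
    return {"id": identifier, "source": "ChEMBL (stub)"}


register_pattern(r"CHEMBL\d+", chembl_handler)


def find_identifiers(text: str) -> List[Tuple[str, int, int]]:
    """Return (identifier, start, end) for every registered pattern match, in text order."""
    found = []
    for pattern, _ in HANDLERS:
        for match in pattern.finditer(text):
            found.append((match.group(0), match.start(), match.end()))
    return sorted(found, key=lambda item: item[1])


def resolve(value: str) -> Optional[Dict[str, str]]:
    """On hover, click, or search: run the first handler whose pattern matches the whole value."""
    for pattern, handler in HANDLERS:
        if pattern.fullmatch(value):
            return handler(value)
    return None
```

`find_identifiers` covers the detect-and-highlight step, and `resolve` stands in for the handler call that supplies tooltip or info pane content. Supporting another identifier type means registering one more pattern and handler.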

Custom identifier patterns

Functions

Scripts and queries are entities governed by the permission framework. This means you can expose data securely, with full control over access, provenance, and auditability.

Function annotations

Function annotations let you control how functions (queries, scripts) interact with Datagrok's UI and data context.
You can define the input (e.g., molecule), the output presentation (e.g., tooltip, info pane), and link functions to user-friendly search patterns (e.g., "activity for Shigella").

This allows Datagrok to:

  • Match functions to relevant data
  • Provide data-specific tools and information without cluttering the UI
  • "Push" data automatically in response to user actions
  • Democratize data access through global search

Examples:

By specifying the function's #input, #output, and #tags, you can configure it to run automatically without user input. For example, a script can execute and display results in an info pane when a user clicks a molecule.
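To make the annotation idea concrete, here is a minimal sketch of what an annotated script header might look like and how a platform could read it to decide whether a script applies to a clicked value. The header keys follow the `#input`, `#output`, and `#tags` naming used above. The specific values (the `semType` option, the `panel` tag, the `graphics` output type) and the matching rule are assumptions for illustration and should be checked against the Datagrok documentation. The parser is not Datagrok's own.

```python
from typing import Dict, List

# Illustrative header only; the values are assumptions, not a verified Datagrok script.
SCRIPT_HEADER = """\
#name: Gasteiger partial charges
#language: python
#tags: panel, chem
#input: string mol {semType: Molecule}
#output: graphics charges
"""


def parse_annotations(header: str) -> Dict[str, List[str]]:
    """Collect '#key: value' lines into a dict of lists (keys such as 'input' may repeat)."""
    annotations: Dict[str, List[str]] = {}
    for line in header.splitlines():
        if not line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons themselves.
        key, value = line[1:].split(":", 1)
        annotations.setdefault(key.strip(), []).append(value.strip())
    return annotations


def runs_automatically_for(annotations: Dict[str, List[str]], sem_type: str) -> bool:
    """Hypothetical rule: a 'panel' tag plus exactly one input of the given semantic type."""
    tags = [t.strip() for entry in annotations.get("tags", []) for t in entry.split(",")]
    inputs = annotations.get("input", [])
    return "panel" in tags and len(inputs) == 1 and f"semType: {sem_type}" in inputs[0]
```

With this header, `runs_automatically_for(parse_annotations(SCRIPT_HEADER), "Molecule")` evaluates to `True`, which corresponds to the case where clicking a molecule triggers the script and its output is shown in an info pane.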

Gasteiger partial charges

Custom metadata

Using configurable schemas, you can attach structured, persistent metadata to entities or custom classes of objects (e.g., molecules). Datagrok then automatically shows the metadata wherever that entity or object appears.

For example, SAR comments added to a compound in one project are visible when the same structure appears in a screening dataset or the Hit Design app. Users can edit metadata directly in the Context Panel, add it as table columns, use it in filters, and so on.

No coding is needed to set this up. Define the target objects using a matching expression (e.g., semtype=molecule) and create a schema with parameters. Learn how.