Datagrok: Swiss Army Knife for Data

Why Datagrok?

Datagrok helps you understand data and take action.

It's fast and powerful: you can load the entire ChEMBL database (2.7 million molecules) in your browser, run substructure searches, apply filters, visualize, and interactively explore the chemical space.

Datagrok goes beyond standard data analytics. You can access data from any source, catalog it, analyze and visualize it, run scientific computations, train and apply models, and do more. Need a specific tool or functionality? Easily integrate or add your own code. Datagrok's plugin architecture makes it easy to deliver cohesive, fit-for-purpose solutions.

Access

Get your data from anywhere: databases, web services, file shares, pipelines. If it's machine-readable, we can work with it!

Learn more about data access.

Govern

Use catalogs, data lineage tools, audit, and usage analysis to take control. Your data is FAIR and secure.

Transform

Automatically generate macros from data transformations and use them on new datasets.

Learn more about functions.

Explore

Slice, dice, and visualize your data. Render millions of data points interactively and find patterns. Build dynamic dashboards in seconds. Leverage metadata for automated data enrichment and contextual suggestions.

Compute

Write in any language, annotate, publish, and apply scientific models, methods, and apps. Solve differential equations and run simulations for complex processes.

Learn more about Compute.

Learn

No-code modeling. State-of-the-art cheminformatics engines and ML toolkit included.

Collaborate

Share anything with anyone. Collaborate on decision-making. Use an open source ecosystem to save costs and innovate.

Extend

Customize anything, from context actions to UI elements. Develop and deploy quickly, with seamless integration.

Who is it for?

Data: Datagrok is optimized for structured, tabular data. It automatically detects the semantics, like zip codes or molecules, and has built-in support for areas like cheminformatics, bioinformatics, data science, and others. Need more? Create your own plugin.

Skillset: Datagrok is for anyone who works with data:

  • Chemists analyzing SAR tables? Perfect fit.
  • Data analysts? Drag and drop your local files to start analyzing.
  • Data scientists mapping new store locations? Excellent for strategic planning.
  • Research scientists running complex simulations? Absolutely.
  • Data engineers? Automatically convert queries to dynamic dashboards, no coding needed.
  • Developers? Quickly develop and test data-driven applications.

Team size: Datagrok is for individuals and teams of all sizes, from startups to large enterprises. The platform is enterprise-ready, scalable, and ideal for sharing and collaboration.

What makes it so flexible?

Our mission is to help anyone understand their data, even in complex scenarios:

  • Data that's scattered across various data sources
  • Data that needs specialized, domain-specific tools
  • Teams that have different data needs and expertise.

Here's how we do it.

JS API: With the JS API, you aren't confined to pre-built features or interfaces. Add new data formats, connectors, transformations, augmentations, dynamic calculations, UI elements, full-scale applications, workflows, and more. The API also integrates with data sources and other tools, which is crucial for large enterprises combatting data silos and complex data ecosystems.

Functions: In Datagrok, every task is a function that can be annotated. Annotations make functions versatile, allowing them to work on their own or within larger scripts, no matter the function's language or role. This means you can use functions as building blocks that draw on your team's collective expertise while fully leveraging Datagrok's capabilities. (See the cheminformatics example below.)
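
To make the idea of annotated functions as building blocks more concrete, here is a minimal Python sketch. It is not Datagrok's API: the `REGISTRY` dictionary, the `annotate` decorator, the `describe` helper, and the annotation strings are all hypothetical names chosen for illustration. The sketch only shows how metadata attached to a function lets it be looked up by name, described, and reused inside another function.

```python
from typing import Callable, Dict, List

# Hypothetical registry (not a Datagrok object): maps a function name
# to its callable and the metadata that was attached to it.
REGISTRY: Dict[str, dict] = {}


def annotate(name: str, inputs: List[str], output: str) -> Callable:
    """Attach descriptive metadata to a function and register it by name."""
    def wrap(fn: Callable) -> Callable:
        REGISTRY[name] = {"fn": fn, "inputs": inputs, "output": output}
        return fn
    return wrap


@annotate(name="CelsiusToKelvin", inputs=["double celsius"], output="double kelvin")
def celsius_to_kelvin(celsius: float) -> float:
    return celsius + 273.15


@annotate(name="MeanKelvin", inputs=["list celsius_values"], output="double mean_kelvin")
def mean_kelvin(celsius_values: List[float]) -> float:
    # Reuse another registered function as a building block, found by name.
    convert = REGISTRY["CelsiusToKelvin"]["fn"]
    return sum(convert(c) for c in celsius_values) / len(celsius_values)


def describe(name: str) -> str:
    """Render a registered function's signature from its annotations."""
    meta = REGISTRY[name]
    return f"{name}({', '.join(meta['inputs'])}) -> {meta['output']}"
```

Because the metadata lives next to the function, a caller can discover what `MeanKelvin` expects through `describe("MeanKelvin")` without reading its body, which is the property the paragraph above relies on.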

Semantic types: Semantic data types provide domain-specific customization:

  • Automatic detection of domain-specific data types
  • Domain-specific menus and context actions
  • Custom data rendering in spreadsheets and visualizations
  • Specialized data editing and filtering interfaces
  • Domain-specific calculation and data processing functions
  • Fit-for-purpose apps built on top of Datagrok.

See this example for cheminformatics.
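
The first two items in the list above, automatic detection and type-specific context actions, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Datagrok implements detection: the `DETECTORS` table, the 90% match threshold, the `detect_semantic_type` and `actions_for` helpers, and the action names are all hypothetical. Molecule detection is deliberately left out, since it needs a real cheminformatics toolkit rather than a regular expression.

```python
import re
from typing import Callable, Dict, List, Optional

# Illustrative patterns only: US ZIP codes (5 digits, optional +4) and a loose email shape.
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical detector table: semantic type name -> per-value check.
DETECTORS: Dict[str, Callable[[str], bool]] = {
    "zip_code": lambda v: bool(ZIP_RE.match(v)),
    "email": lambda v: bool(EMAIL_RE.match(v)),
}

# Hypothetical context actions offered for each detected type.
CONTEXT_ACTIONS: Dict[str, List[str]] = {
    "zip_code": ["Show on map"],
    "email": ["Compose message"],
}


def detect_semantic_type(values: List[str], threshold: float = 0.9) -> Optional[str]:
    """Return the first type whose detector matches at least `threshold`
    of the non-empty values, or None if no detector qualifies."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return None
    for sem_type, matches in DETECTORS.items():
        hits = sum(1 for v in non_empty if matches(v))
        if hits / len(non_empty) >= threshold:
            return sem_type
    return None


def actions_for(values: List[str]) -> List[str]:
    """Look up the context actions for a column based on its detected type."""
    sem_type = detect_semantic_type(values)
    return CONTEXT_ACTIONS.get(sem_type, []) if sem_type else []
```

Working on a fraction of matching values rather than requiring every value to match keeps the detector tolerant of a few dirty cells, at the cost of occasional false positives on small columns.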

What makes it so fast?

Our goal is to let you explore at the speed of thought. To achieve this, we designed Datagrok from scratch:

  • Data engine: In-memory columnar database that runs on both server and web browser. Fast random access, efficient data storage, aggregation, compression, filtering, transformation, and caching.

  • Native viewers: Access the data engine directly for maximum performance. They share statistics, cached calculations, and cooperate on tasks like filtering or selection.

  • App server: Uses the data engine to exchange binary-optimized datasets with the client. Custom ORM to efficiently work with metadata in Postgres.

  • Compute engine: Supports multiple languages working with binary-optimized datasets. Scales well. GPU acceleration of ML routines. Supports custom Docker containers.
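
As a rough illustration of the first two points, columnar storage and viewers that share filter state and cached statistics, here is a small Python sketch. It is a toy model, not Datagrok's data engine: the `ColumnTable` class, its `apply_filter` and `mean` methods, and the byte-per-row filter mask are assumptions made for this example.

```python
from array import array
from typing import Callable, Dict, List


class ColumnTable:
    """Toy columnar table: each column is a contiguous typed array, and all
    consumers share one filter mask and one cache of computed statistics."""

    def __init__(self, columns: Dict[str, List[float]]):
        lengths = {len(v) for v in columns.values()}
        if len(lengths) != 1:
            raise ValueError("all columns must have the same length")
        self.row_count = lengths.pop()
        # Store every column as a packed array of doubles.
        self.columns = {name: array("d", vals) for name, vals in columns.items()}
        # One byte per row: 1 means the row passes the current filter.
        self.filter = bytearray([1]) * self.row_count
        self._stats_cache: Dict[str, float] = {}

    def apply_filter(self, column: str, predicate: Callable[[float], bool]) -> None:
        """Narrow the shared filter to rows where predicate(value) holds."""
        col = self.columns[column]
        for i in range(self.row_count):
            if self.filter[i] and not predicate(col[i]):
                self.filter[i] = 0
        # Cached statistics depend on the filter, so drop them.
        self._stats_cache.clear()

    def mean(self, column: str) -> float:
        """Mean of the filtered rows, computed once and then served from cache."""
        if column not in self._stats_cache:
            col = self.columns[column]
            kept = [col[i] for i in range(self.row_count) if self.filter[i]]
            self._stats_cache[column] = sum(kept) / len(kept) if kept else float("nan")
        return self._stats_cache[column]


table = ColumnTable({"age": [20, 30, 40, 50], "weight": [60, 70, 80, 90]})
table.apply_filter("age", lambda a: a >= 30)
print(table.mean("weight"))  # mean weight over the rows that pass the age filter
```

Any number of consumers holding a reference to `table` would see the same filter mask and reuse the same cached mean, which is the kind of cooperation between viewers the list describes. A production engine would use a packed bitset and vectorized scans rather than a Python loop.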

Learn more about Datagrok's architecture and performance optimization.

Solutions