Core Datagrok concepts
These concepts are foundational in Datagrok:
Platform
- Dataframe
- Functions
- Entities
A Dataframe (or table) is a fundamental data structure in Datagrok. Dataframes are optimized for exploratory data analysis and support common data operations like filtering, sorting, or aggregations.
Dataframes can be manipulated directly through the UI, Console, or scripts that automate data transformations. Every change made to a dataframe is automatically recorded, which means it can be audited and reproduced.
Dataframes are visualized using the grid viewer.
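To make the idea of recorded changes concrete, here is a minimal Python sketch of a table that logs each operation applied to it, so the result can be audited and reproduced from the original data. It is a toy model and not Datagrok's API: `RecordedTable`, `apply`, and `replay` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, object]
Operation = Callable[[list[Row]], list[Row]]


@dataclass
class RecordedTable:
    """Toy table (hypothetical) that logs every change applied to it."""
    rows: list[Row]
    log: list[tuple[str, Operation]] = field(default_factory=list)

    def apply(self, description: str, op: Operation) -> "RecordedTable":
        # Run the operation and remember it together with its description.
        self.rows = op(self.rows)
        self.log.append((description, op))
        return self


def replay(original: list[Row], log: list[tuple[str, Operation]]) -> list[Row]:
    """Reproduce a result by re-running the recorded operations in order."""
    rows = list(original)
    for _, op in log:
        rows = op(rows)
    return rows


source = [
    {"compound": "A", "weight": 310.4},
    {"compound": "B", "weight": 152.1},
    {"compound": "C", "weight": 498.7},
]

table = RecordedTable(list(source))
table.apply("filter: weight < 400", lambda rows: [r for r in rows if r["weight"] < 400])
table.apply("sort: by weight", lambda rows: sorted(rows, key=lambda r: r["weight"]))

audit_trail = [description for description, _ in table.log]  # what was done, in order
reproduced = replay(source, table.log)                       # same result from the log
```

In this sketch, `audit_trail` lists the two operations in the order they were applied, and `reproduced` equals `table.rows` because the log is replayed against the untouched source rows.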
In Datagrok, everything is a function, from simple data manipulation, like deleting a column, to complex operations, like running queries or scripts.
Functions can be annotated, audited, and linked. They can be written in any language and executed both on the server and in the browser. You can incorporate functions into larger scripts regardless of each function's language, which lets you automate complex operations, autogenerate UI, and augment data.
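The "everything is a function" idea can be sketched as a small registry in which each operation is stored with its annotations, every call is counted for auditing, and input fields are derived from the function signature. This is an illustrative Python model under stated assumptions, not how Datagrok implements functions: `REGISTRY`, `register`, `call`, `input_fields`, and the function names `DeleteColumn` and `CountRows` are all hypothetical.

```python
import inspect

# Hypothetical registry: function name -> callable plus its annotations.
REGISTRY: dict[str, dict] = {}


def register(name: str, description: str, language: str = "python"):
    """Decorator that stores a function together with its metadata."""
    def decorator(fn):
        REGISTRY[name] = {
            "fn": fn,
            "description": description,
            "language": language,
            "calls": 0,
        }
        return fn
    return decorator


def call(name: str, *args):
    """Look a function up by name, count the call, and run it."""
    entry = REGISTRY[name]
    entry["calls"] += 1
    return entry["fn"](*args)


def input_fields(name: str) -> list[str]:
    """Derive the input names a generated form would need from the signature."""
    return list(inspect.signature(REGISTRY[name]["fn"]).parameters)


@register("DeleteColumn", "Remove one column from every row")
def delete_column(rows, column):
    return [{k: v for k, v in row.items() if k != column} for row in rows]


@register("CountRows", "Return the number of rows")
def count_rows(rows):
    return len(rows)


def pipeline(rows):
    """Chain two registered functions into a larger operation."""
    trimmed = call("DeleteColumn", rows, "notes")
    return trimmed, call("CountRows", trimmed)
```

Because both the simple operation and the composed pipeline go through `call`, the same lookup, auditing, and form-generation logic applies to each of them.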
Datagrok treats many different objects as entities. For example, data connections, users, functions, and layouts are all entities. Information about these entities, including their metadata, is stored in a centralized database. All entities share a set of common operations, including sharing, assigning privileges, and retrieving their URL.
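One way to picture a shared set of entity operations is a common base type that every kind of object reuses. The Python sketch below is a simplified model only: the `Entity` class, its fields, the privilege names, and the URL scheme with the `example.invalid` host are assumptions made for illustration and do not reflect Datagrok's actual data model or URLs.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Entity:
    """Hypothetical base type: every kind of object gets the same operations."""
    kind: str
    name: str
    id: str = field(default_factory=lambda: uuid4().hex)
    privileges: dict[str, set[str]] = field(default_factory=dict)

    def share(self, user: str, privilege: str = "view") -> None:
        # Grant one privilege to one user.
        self.privileges.setdefault(user, set()).add(privilege)

    def can(self, user: str, privilege: str) -> bool:
        return privilege in self.privileges.get(user, set())

    def url(self, base: str = "https://example.invalid") -> str:
        # Illustrative URL scheme, not the platform's real one.
        return f"{base}/{self.kind}/{self.id}"


connection = Entity(kind="connection", name="warehouse")
layout = Entity(kind="layout", name="compact grid")

# The same operations work regardless of what the entity represents.
connection.share("alice", "edit")
layout.share("alice")
```

Here a connection and a layout are different kinds of objects, yet sharing, privilege checks, and URL retrieval behave identically for both.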
User interface
- Views
- Layouts
A view is a window designed for specific tasks. For example:
- Double-clicking a dataframe opens a Table View resembling Excel
- Double-clicking a query opens the Query Editor
- Clicking the Browse icon on the Sidebar opens Browse
You can open the same object in multiple views and work on each independently.
In Datagrok, the visual representation of tabular data (a layout) is separated from the data (a dataframe). This separation lets you do the following:
- Apply multiple layouts to the same table. This means you can customize views according to your needs without duplicating the underlying data.
- Save layouts and apply them to different projects and tables.
- Share layouts independently of the underlying data.
- Work on visual presentation separately from data manipulation.
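The separation of layout from data can be illustrated with a small Python sketch in which a layout is just a description of how to present rows, and rendering never modifies the rows themselves. The `Layout` class, its `visible_columns` and `sort_by` fields, and the `render` function are hypothetical and far simpler than a real Datagrok layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical layout: which columns to show and how to order the rows."""
    visible_columns: tuple[str, ...]
    sort_by: str


def render(rows: list[dict], layout: Layout) -> list[list]:
    """Produce a presentation of the rows without changing the rows."""
    ordered = sorted(rows, key=lambda row: row[layout.sort_by])
    return [[row[column] for column in layout.visible_columns] for row in ordered]


assays = [
    {"id": 2, "name": "B", "score": 0.4},
    {"id": 1, "name": "A", "score": 0.9},
]
controls = [
    {"id": 7, "name": "Z", "score": 1.0},
    {"id": 8, "name": "Y", "score": 2.0},
]

compact = Layout(visible_columns=("name",), sort_by="name")
ranked = Layout(visible_columns=("name", "score"), sort_by="score")

# Two layouts applied to the same table, and one layout reused on another table.
compact_assays = render(assays, compact)
ranked_assays = render(assays, ranked)
compact_controls = render(controls, compact)
```

Since a `Layout` holds no data, it can be saved, reused on another table with matching columns, or shared without exposing the underlying rows.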
Data management
- Projects
Projects act like folders containing various entities such as dataframes, queries, or scripts. For example, dashboards are projects that include the underlying data (a dataframe) and the visualizations applied to it (a layout).
Projects are essential for organizing, managing, and sharing data assets. The Browse view organizes projects in a tree that governs entity privileges.