Core Datagrok concepts
These concepts are foundational in Datagrok:
Platform
- Dataframe
- Functions
- Entities
A dataframe (or table) is a fundamental data structure in Datagrok. Dataframes are optimized for exploratory data analysis and support common data operations such as filtering, sorting, and aggregation.
Dataframes can be manipulated directly through the UI, Console, or scripts that automate data transformations. Every change made to a dataframe is automatically recorded, which means it can be audited and reproduced.
Dataframes are visualized using the grid viewer.
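The idea that recorded changes can be audited and reproduced can be sketched in a few lines. The following is a toy model only, not the Datagrok API: `RecordedTable`, `Change`, and `replay` are hypothetical names invented for this illustration, and the way Datagrok actually stores its change history is not described here.

```typescript
// Toy model, not the Datagrok API: a table whose every change is logged
// so the same steps can be audited and replayed on a fresh copy.
type Row = Record<string, number | string>;

interface Change {
  name: string;                   // human-readable description for the audit log
  apply: (rows: Row[]) => Row[];  // pure transformation of the rows
}

class RecordedTable {
  readonly history: Change[] = [];
  private rows: Row[];

  constructor(rows: Row[]) {
    this.rows = rows;
  }

  // Apply a change and append it to the history.
  change(name: string, apply: (rows: Row[]) => Row[]): this {
    this.rows = apply(this.rows);
    this.history.push({ name, apply });
    return this;
  }

  // Return a copy of the current rows.
  snapshot(): Row[] {
    return this.rows.map((r) => ({ ...r }));
  }

  // Reproduce the recorded steps on another set of rows.
  static replay(rows: Row[], history: Change[]): Row[] {
    return history.reduce((acc, step) => step.apply(acc), rows);
  }
}

const source: Row[] = [
  { name: "a", value: 3 },
  { name: "b", value: 1 },
  { name: "c", value: 2 },
];

const table = new RecordedTable(source)
  .change("filter value > 1", (rows) => rows.filter((r) => (r.value as number) > 1))
  .change("sort by value", (rows) =>
    [...rows].sort((x, y) => (x.value as number) - (y.value as number)));

// The audit log lists what was done; replaying it reproduces the result.
const auditLog = table.history.map((c) => c.name);
const replayed = RecordedTable.replay(source, table.history);
```

Because each step is a pure function of the rows, replaying the history on the original data yields the same result as the interactive session, which is what makes the log useful for both auditing and reproduction.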
In Datagrok, everything is a function, from simple data manipulation, like deleting a column, to complex operations, like running queries or scripts.
Functions can be annotated, audited, and linked. They can be written in any language and executed both on the server and in the browser. You can incorporate functions into larger scripts regardless of each function's language, which lets you automate complex operations, autogenerate UI, and augment data.
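One way to picture "everything is a function" is a registry in which each function carries metadata and is invoked by name. The sketch below is a toy model under that assumption, not the Datagrok API: `FunctionRegistry`, `FuncMeta`, `DeleteColumn`, and `CountRows` are hypothetical names, and the `language` field is only a descriptive tag here.

```typescript
// Toy registry, not the Datagrok API: functions are registered with
// metadata and invoked by name, so callers never depend on how a
// function is implemented.
interface FuncMeta {
  name: string;
  description: string;
  language: string; // illustrative tag only, e.g. "javascript" or "python"
}

type Impl = (args: Record<string, unknown>) => unknown;

class FunctionRegistry {
  private funcs = new Map<string, { meta: FuncMeta; impl: Impl }>();
  readonly callLog: string[] = []; // names of executed functions, for auditing

  register(meta: FuncMeta, impl: Impl): void {
    this.funcs.set(meta.name, { meta, impl });
  }

  call(name: string, args: Record<string, unknown> = {}): unknown {
    const entry = this.funcs.get(name);
    if (!entry) throw new Error(`Unknown function: ${name}`);
    this.callLog.push(name);
    return entry.impl(args);
  }

  describe(name: string): FuncMeta | undefined {
    return this.funcs.get(name)?.meta;
  }
}

const registry = new FunctionRegistry();

registry.register(
  { name: "DeleteColumn", description: "Removes a column from each row", language: "javascript" },
  ({ rows, column }) =>
    (rows as Record<string, unknown>[]).map((row) => {
      const copy = { ...row };
      delete copy[column as string];
      return copy;
    }),
);

registry.register(
  { name: "CountRows", description: "Counts rows", language: "javascript" },
  ({ rows }) => (rows as unknown[]).length,
);

// Compose registered functions into a larger operation.
const trimmed = registry.call("DeleteColumn", {
  rows: [{ id: 1, tmp: "x" }, { id: 2, tmp: "y" }],
  column: "tmp",
});
const rowCount = registry.call("CountRows", { rows: trimmed });
```

Calling by name is the design choice that matters in this sketch: the caller sees only a name and arguments, so a simple column deletion and a complex query can be chained, logged, and described in the same way.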
Datagrok treats many different objects as entities. For example, data connections, users, functions, and layouts are all entities. Information about these entities, including their metadata, is stored in a centralized database. All entities share a set of common operations, including sharing, assigning privileges, and retrieving their URL.
User interface
- Views
- Layouts
Each view is designed for a specific task. For example, when you open a dataframe, it opens in a Table View resembling Excel, while the Browse view, used for navigation and data management, resembles Windows File Explorer. Additionally, Datagrok plugins and apps can introduce custom views.
Each view opens in its own window or tab. This means you can open the same table or query in multiple views and work on them independently.
In Datagrok, the visual representation of tabular data (a layout) is separated from the data (the dataframe). This separation lets you do the following:
- Apply multiple layouts to the same table. This means you can customize views according to your needs without duplicating the underlying data.
- Save layouts and apply them to different projects and tables.
- Share layouts independently of the underlying data.
- Work on visual presentation separately from data manipulation.
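The separation described above can be illustrated with a small sketch in which a layout holds only presentation settings and refers to columns by name. This is a toy model, not the Datagrok API or its layout format: `Layout`, `ViewerSpec`, and `applyLayout` are hypothetical names, and the rule that a viewer is skipped when its columns are missing is an assumption of this example rather than documented Datagrok behavior.

```typescript
// Toy model, not the Datagrok API: a layout stores only presentation
// settings and refers to columns by name, so it can be reused across tables.
interface Table {
  name: string;
  columns: string[];
}

interface ViewerSpec {
  type: string;      // hypothetical viewer kind, e.g. "scatter" or "histogram"
  columns: string[]; // column names the viewer needs
}

interface Layout {
  name: string;
  viewers: ViewerSpec[];
}

// Return the viewers a layout can show for a table, without touching the data.
// Assumption of this sketch: viewers whose columns are missing are skipped.
function applyLayout(layout: Layout, table: Table): ViewerSpec[] {
  const available = new Set(table.columns);
  return layout.viewers.filter((v) => v.columns.every((c) => available.has(c)));
}

const overview: Layout = {
  name: "overview",
  viewers: [
    { type: "scatter", columns: ["height", "weight"] },
    { type: "histogram", columns: ["age"] },
  ],
};

const patients: Table = { name: "patients", columns: ["height", "weight", "age"] };
const athletes: Table = { name: "athletes", columns: ["height", "weight"] };

// The same layout is applied to two different tables.
const patientViewers = applyLayout(overview, patients).map((v) => v.type);
const athleteViewers = applyLayout(overview, athletes).map((v) => v.type);

// A layout is plain data, so it can be saved or shared on its own.
const savedLayout = JSON.stringify(overview);
```

Since the layout never contains rows, saving or sharing it exposes no underlying data, and applying it to a second table requires no copy of the first.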
Data management
- Projects
Projects act like folders containing various entities such as dataframes, queries, or scripts. For example, dashboards are projects that include two entities: the underlying data (a dataframe) and the visualizations applied to it (a layout).
Projects are essential for organizing, managing, and sharing data assets. The Browse view organizes projects in a tree that governs entity privileges.