Detect semantic types

Datagrok semantic types are a powerful concept that lets you define the meaning of your data. For example, you can specify that a particular string contains a chemical molecule in SMILES format, an email address, or a URL.

To see how this works, let's explore the Gasteiger partial charges script, which takes a molecule in SMILES format as input and calculates the distribution of Gasteiger charges. The script is provided with the Chem package.

The script takes a string variable mol as input, annotated with the semantic type Molecule. When you run the script, you will see the following:

Datagrok recognizes the Molecule semantic type and creates a custom UI that displays the molecule's formula. Click it to open the chemical sketcher and draw your own molecule. Datagrok has extensive chemoinformatics support, so almost all UI elements offer special viewing and editing options for chemical structures.
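
As an illustration, an annotated string input might look like the sketch below. This is not the Gasteiger script itself: the script name SmilesLength, the default value "CCO", and the output length are hypothetical and chosen only to show the {semType: Molecule} annotation on a string input. On the platform, Datagrok supplies mol before the body runs; the fallback line is an assumption added so the snippet also runs outside Datagrok.

```python
#name: SmilesLength
#language: python
#input: string mol = "CCO" {semType: Molecule} [Molecule, in SMILES format]
#output: int length

# On the platform, Datagrok injects `mol` before the body runs.
# The fallback below exists only so this sketch also runs standalone.
mol = globals().get("mol", "CCO")

# Placeholder computation (hypothetical): number of characters in the SMILES string.
length = len(mol)
```

Because the input carries the Molecule semantic type, the platform can offer the molecule editor for it instead of a plain text box.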

You can assign a semantic type to output variables in the same way. A semantic type annotation brings no benefit for a simple scalar output, but it is extremely helpful when you integrate your script with the Datagrok platform.
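
An annotated output could be sketched as follows. The script name FirstFragment, the default input, and the output name fragment are hypothetical. The body is a deliberately naive placeholder: it relies on the SMILES convention that "." separates disconnected fragments and ignores edge cases such as ring closures that span fragments. As before, the fallback line is an assumption that lets the snippet run outside Datagrok.

```python
#name: FirstFragment
#language: python
#input: string mol = "CCO.Cl" {semType: Molecule} [Molecule, in SMILES format]
#output: string fragment {semType: Molecule} [First fragment of the input]

# On the platform, Datagrok injects `mol`; the fallback is for standalone runs only.
mol = globals().get("mol", "CCO.Cl")

# Naive split: in SMILES, "." separates disconnected fragments; keep the first one.
fragment = mol.split(".")[0]
```

With the Molecule annotation on the output, the platform can treat the returned string as a chemical structure rather than as plain text.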

Semantic types for columns

Similarly, you can specify the semantic type for dataframe columns. For example, you can make column or column_list selectors accept only columns that contain chemical molecules.

#input: dataframe df {caption: Dataframe}
#input: column mol {semType:Molecule; caption: Molecules} [Molecules to analyze]