FAQ: Peptides
For FAQs on other topics, see FAQ.
Data format and interoperability
How does Datagrok access sequence data?
Datagrok can load sequences from CSV, XLSX, PDB, and FASTA files, or retrieve them from connected databases and other data sources.
What sequence formats does Datagrok support?
- BILN
- FASTA
- HELM
- Separator-based
- Connection-rules notations and other formats
Datagrok also natively supports SMILES and can convert sequences into molecular form for further analysis.
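To make the relationship between these notations concrete, here is a minimal sketch in plain Python that rewrites one FASTA record as a separator-based string and as a HELM string. It is an illustration only and does not use or reproduce Datagrok's converters. The function names are hypothetical, the HELM output follows the version 1 layout (four trailing `$` and no version suffix), and the sketch assumes a single linear peptide built from the 20 natural amino acids.

```python
# Illustrative only: not Datagrok's converter. Function names are hypothetical.
NATURAL = set("ACDEFGHIKLMNPQRSTVWY")

def fasta_to_monomers(record: str) -> list:
    """Parse one FASTA record into a list of single-letter monomers."""
    lines = [ln.strip() for ln in record.strip().splitlines()]
    # Drop the header line(s) starting with '>' and join wrapped sequence lines.
    body = "".join(ln for ln in lines if ln and not ln.startswith(">"))
    unknown = set(body) - NATURAL
    if unknown:
        raise ValueError("unsupported symbols: " + ", ".join(sorted(unknown)))
    return list(body)

def to_separator(monomers, sep="-"):
    """Join monomers with an explicit separator, e.g. A-C-D."""
    return sep.join(monomers)

def to_helm(monomers, polymer_id="PEPTIDE1"):
    """Write a linear peptide as a HELM v1 string with no connections."""
    # HELM wraps multi-character monomer symbols in square brackets.
    parts = [m if len(m) == 1 else "[" + m + "]" for m in monomers]
    return polymer_id + "{" + ".".join(parts) + "}$$$$"

if __name__ == "__main__":
    monomers = fasta_to_monomers(">pep1\nACDE\nFG")
    print(to_separator(monomers))
    print(to_helm(monomers))
```

Because the separator-based and HELM forms both carry an explicit monomer list, they can also hold multi-character symbols (for example a hypothetical `meA`) that single-letter FASTA cannot express.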
Does Datagrok support natural and unnatural amino acids, custom monomers, and peptide linkers?
Yes. Datagrok supports sequences composed of any building blocks defined in the monomer library, including:
- Natural and non-natural amino acids
- Custom residues and linkers (represented via HELM components such as CHEM/BLOB)
Monomer definitions can be extended or modified through the built-in monomer library management system, and monomers can be imported in bulk from Pistoia HELM JSON. Once defined, monomers are recognized consistently across the platform for rendering, SAR, enumeration, and other operations.
Does Datagrok support different peptide topologies, including cyclic or modified structures?
Yes. Datagrok supports a wide range of peptide topologies through HELM, BILN, and connection-rules notations, including:
- Linear, cyclic, and bicyclic peptides
- Branched and dendritic structures
- Lipidated peptides
- Custom cyclization patterns and noncanonical linkages
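As an illustration of how HELM encodes a cyclic topology, the sketch below builds a HELM string with one intramolecular connection. It is plain Python, not a Datagrok API, and the function name is hypothetical. It assumes the usual HELM attachment-point convention for amino acids (R1 at the N-terminus, R2 at the C-terminus, R3 on the side chain) and emits the version 1 layout without a version suffix.

```python
# Illustrative only: builds HELM v1 strings by hand. The function name is hypothetical.

def helm_with_connection(monomers, src_pos, src_r, dst_pos, dst_r,
                         polymer_id="PEPTIDE1"):
    """Return a HELM string for one peptide chain with one intra-chain bond.

    Positions are 1-based, as in HELM. src_r and dst_r are attachment points
    such as "R1", "R2" or "R3".
    """
    n = len(monomers)
    for pos in (src_pos, dst_pos):
        if not 1 <= pos <= n:
            raise ValueError("position %d is outside 1..%d" % (pos, n))
    parts = [m if len(m) == 1 else "[" + m + "]" for m in monomers]
    polymer = polymer_id + "{" + ".".join(parts) + "}"
    connection = "%s,%s,%d:%s-%d:%s" % (
        polymer_id, polymer_id, src_pos, src_r, dst_pos, dst_r)
    # Sections: polymers $ connections $ (two empty sections) $ and a final $.
    return polymer + "$" + connection + "$$$"

if __name__ == "__main__":
    # Disulfide bridge between the side chains of Cys1 and Cys6.
    print(helm_with_connection(list("CYIQNCPLG"), 1, "R3", 6, "R3"))
    # Head-to-tail cyclization: C-terminus of the last residue to N-terminus of the first.
    print(helm_with_connection(list("AGFL"), 4, "R2", 1, "R1"))
```

Bicyclic and branched structures follow the same pattern with additional comma-separated connection entries or additional polymer chains, which this sketch does not cover.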
Does Datagrok support peptide conjugates?
Yes. Conjugates can either be represented entirely in HELM or split across separate columns for analysis.
Can Datagrok convert between molecular and sequence representations?
Yes. Datagrok supports sequence-to-sequence and sequence-to-molecule conversion between all supported formats. When viewing a sequence, hovering over a monomer highlights the corresponding fragment in the generated molecule.
Molecule-to-sequence conversion is supported from SMILES to HELM.
For more details, see Format conversion or watch the RDKit UGM presentation.
Can Datagrok integrate with registration systems and ingest associated assay results?
Yes. Datagrok integrates with:
Developers: Read about integration options and JS development.
Can Datagrok integrate with a proprietary monomer library and support periodic updates?
Yes. Datagrok can connect to proprietary monomer libraries through custom library providers that work with file-based, database, or API-based sources. Libraries can be accessed directly for immediate read/write access or synchronized via ETL workflows.
Admin-controlled editing is supported through Datagrok's permissions management system.
Visualization and UI
Can Datagrok show sequences, molecules, properties, and assay data in a single table?
Yes. Tabular views are powered by the grid viewer, which displays sequences, molecular structures, calculated properties, and assay data side-by-side in a single interactive table. You can add, rearrange, or compute columns as needed.
Peptide properties and descriptors can be calculated from the Top Menu or from dedicated info panes. Custom calculations can be defined through user-defined functions (JavaScript, Python, Julia, R, or MATLAB) or by integrating external services.
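As a rough idea of what the body of a custom Python calculation might contain, the sketch below computes two simple per-sequence properties. It is plain Python and deliberately omits how the function would be registered with Datagrok. The function names, the choice of hydrophobic residues, and the charge model (Lys and Arg count as +1, Asp and Glu as -1, termini and His ignored) are simplifying assumptions made for illustration.

```python
# Illustrative only: simple per-sequence properties for single-letter peptides.
# The residue sets below are simplifying assumptions, not Datagrok definitions.
POSITIVE = set("KR")
NEGATIVE = set("DE")
HYDROPHOBIC = set("AVILMFW")

def nominal_charge(seq: str) -> int:
    """Count of basic residues minus count of acidic residues."""
    return sum(m in POSITIVE for m in seq) - sum(m in NEGATIVE for m in seq)

def hydrophobic_fraction(seq: str) -> float:
    """Fraction of residues that belong to the assumed hydrophobic set."""
    if not seq:
        return 0.0
    return sum(m in HYDROPHOBIC for m in seq) / len(seq)

def describe(sequences):
    """Return one row of computed properties per input sequence."""
    return [
        {"sequence": s,
         "length": len(s),
         "charge": nominal_charge(s),
         "hydrophobic_fraction": hydrophobic_fraction(s)}
        for s in sequences
    ]

if __name__ == "__main__":
    for row in describe(["AVKK", "KRDE"]):
        print(row)
```

The list-of-rows output maps naturally onto new table columns, one per property.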
To learn about visualization and analytics capabilities for peptides, see Bioinformatics. For chemical utilities, see Cheminformatics. For assay data integration options, see Curves.
Sequence analysis
Can I filter or search peptides by architecture or sequence similarity?
- Text-based filtering: available for FASTA sequences through the text filter
- Sequence similarity and diversity searches: use Bio > Search > Similarity and Diversity tools
- Substructure search: convert sequences to molecular form and apply the substructure filter
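To show what a sequence similarity search does in principle, here is a minimal sketch that ranks a small library against a query by normalized edit distance. This is a stand-in technique chosen for brevity. It is not a description of the metric or algorithm used by the Bio > Search tools, and the function names are hypothetical.

```python
# Illustrative only: edit-distance ranking, not Datagrok's similarity search.

def levenshtein(a, b) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]

def similarity(a, b) -> float:
    """1.0 for identical sequences, 0.0 when every position must change."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

def most_similar(query, library, k=3):
    """Return the k library sequences closest to the query, best first."""
    return sorted(library, key=lambda s: similarity(query, s), reverse=True)[:k]

if __name__ == "__main__":
    print(most_similar("ACDE", ["WWWW", "ACDF", "ACDE", "ACE"], k=2))
```

The functions accept lists of monomer symbols as well as strings, so multi-character monomers can be compared the same way.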
Does Datagrok support sequence alignment?
Yes. Datagrok supports multiple sequence alignment (MSA) using Kalign for canonical sequences and PepSeA for non-canonical sequences (learn more). For visual summaries of aligned sequences, use the WebLogo viewer.
Does Datagrok support peptide SAR analysis?
Yes. Datagrok provides an interactive environment for exploring sequence-activity relationships. It combines visualization, clustering, and statistical tools to help you identify key mutation sites, visualize sequence variability, and compare peptide properties with assay results. Learn more.
Does Datagrok support mutation-cliff analysis for peptides?
Yes. The Mutation Cliffs viewer identifies sequence positions where single-monomer substitutions cause significant activity changes. You can configure the minimum activity difference threshold to focus on the most impactful mutations and filter by specific positions.
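The idea behind a mutation cliff can be sketched in a few lines of plain Python: find pairs of equal-length sequences that differ at exactly one position, then keep the pairs whose activity difference meets a threshold. This is an illustration of the concept under simple assumptions (aligned single-letter sequences, one activity value each). It is not the viewer's implementation, and the function names are hypothetical.

```python
# Illustrative only: brute-force mutation-cliff detection on aligned sequences.
from itertools import combinations

def single_substitution(a, b):
    """Return (position, monomer_a, monomer_b) if a and b differ at exactly
    one position, otherwise None. Positions are 1-based."""
    if len(a) != len(b):
        return None
    diffs = [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]
    return diffs[0] if len(diffs) == 1 else None

def mutation_cliffs(peptides, min_delta, positions=None):
    """peptides: list of (sequence, activity) tuples.

    Returns (position, monomer_a, monomer_b, delta) for each single-substitution
    pair with |delta| >= min_delta, optionally restricted to given positions.
    """
    cliffs = []
    for (seq_a, act_a), (seq_b, act_b) in combinations(peptides, 2):
        sub = single_substitution(seq_a, seq_b)
        if sub is None:
            continue
        position, monomer_a, monomer_b = sub
        if positions is not None and position not in positions:
            continue
        delta = act_b - act_a
        if abs(delta) >= min_delta:
            cliffs.append((position, monomer_a, monomer_b, delta))
    return cliffs

if __name__ == "__main__":
    data = [("ACDE", 5.0), ("ACDF", 7.5), ("ACDW", 6.0), ("ACKE", 5.5)]
    for cliff in mutation_cliffs(data, min_delta=1.0):
        print(cliff)
```

The pairwise loop is quadratic in the number of sequences, which is acceptable for a sketch but would need indexing for large datasets.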
Does Datagrok support matched pairs analysis for peptides?
Yes. The Mutation Cliffs mode of the Sequence Variability Map systematically compares peptides that differ by a single monomer substitution. For each position-monomer pair, it displays the number of matched pairs (circle size), mean activity change (color), and detailed statistics in the Context Panel.
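The per-cell summary described above can be approximated as follows: for every matched pair, credit each of the two monomers at the differing position with the activity change seen when that monomer is present, then report the pair count and the mean change per position-monomer cell. This is one plausible reading of the statistics, written as an assumption for illustration. The exact definitions used by the Sequence Variability Map may differ, and the function name is hypothetical.

```python
# Illustrative only: aggregates matched pairs per (position, monomer) cell.
# The sign convention and the choice of statistics are assumptions.
from collections import defaultdict
from itertools import combinations

def matched_pair_summary(peptides):
    """peptides: list of (sequence, activity) tuples with aligned sequences.

    Returns {(position, monomer): (pair_count, mean_delta)}, where delta is the
    activity of the sequence carrying the monomer minus that of its partner.
    """
    deltas = defaultdict(list)
    for (seq_a, act_a), (seq_b, act_b) in combinations(peptides, 2):
        if len(seq_a) != len(seq_b):
            continue
        diffs = [i for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]
        if len(diffs) != 1:
            continue  # not a matched pair
        i = diffs[0]
        deltas[(i + 1, seq_a[i])].append(act_a - act_b)
        deltas[(i + 1, seq_b[i])].append(act_b - act_a)
    return {cell: (len(v), sum(v) / len(v)) for cell, v in deltas.items()}

if __name__ == "__main__":
    data = [("ACDE", 5.0), ("ACDF", 7.5), ("ACDW", 6.0), ("ACKE", 5.5)]
    for cell, (count, mean_delta) in sorted(matched_pair_summary(data).items()):
        print(cell, count, mean_delta)
```

In a plot, the count would drive the circle size and the mean change would drive the color of each cell.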
Does Datagrok support invariance mapping and positional filtering for peptide sequences?
Yes. Datagrok includes tools for visualizing and filtering sequence variability across positions:
- Invariance mapping: use WebLogo to show monomer frequencies at each position, or Sequence Variability Map in Invariant Map mode to show position-specific frequencies and related statistics
- Positional filtering: click any cell in the Sequence Variability Map to select all sequences containing that monomer-position pair. Selected sequences are displayed in the Selection table and Context Panel. You can also use the monomer search filter to find specific residues across positions.
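Both operations reduce to simple counting over aligned sequences, as the sketch below shows. It is plain Python for illustration, with hypothetical function names, and it assumes all sequences have the same length.

```python
# Illustrative only: per-position monomer counts and monomer-position selection.
from collections import Counter

def position_frequencies(sequences):
    """Return one Counter per position with monomer counts at that position."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [Counter(s[i] for s in sequences) for i in range(length)]

def select_by_monomer(sequences, position, monomer):
    """Return sequences carrying `monomer` at the 1-based `position`."""
    return [s for s in sequences if s[position - 1] == monomer]

if __name__ == "__main__":
    seqs = ["ACDE", "ACDF", "WCKE"]
    for pos, counts in enumerate(position_frequencies(seqs), start=1):
        print(pos, dict(counts))
    print(select_by_monomer(seqs, 4, "E"))
```

A position whose Counter holds a single monomer, such as position 2 in the example, is invariant across the set.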
Does Datagrok support statistical analysis for peptides?
Yes. The Sequence Position Statistics viewer displays box or violin plots of selected properties across different motifs or positions, helping correlate sequence patterns with assay results or molecular properties. You can also use hypothesis testing tools, including multivariate analysis (MVA) and ANOVA.
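As a worked illustration of the ANOVA idea applied to one sequence position, the sketch below groups activities by the monomer found at that position and computes the one-way ANOVA F statistic by hand. It uses only the standard library, so it stops at the F statistic and does not compute a p-value. The function name is hypothetical, and this is not how Datagrok's hypothesis testing tools are implemented.

```python
# Illustrative only: one-way ANOVA F statistic for activity grouped by the
# monomer at a given position. No p-value is computed.
from collections import defaultdict

def anova_f_by_position(peptides, position):
    """peptides: list of (sequence, activity); position is 1-based.

    F = (SS_between / (k - 1)) / (SS_within / (N - k)) for k monomer groups
    and N observations.
    """
    groups = defaultdict(list)
    for seq, activity in peptides:
        groups[seq[position - 1]].append(activity)

    k = len(groups)
    n = sum(len(v) for v in groups.values())
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = sum(sum(v) for v in groups.values()) / n
    ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                     for v in groups.values())
    ss_within = sum((x - sum(v) / len(v)) ** 2
                    for v in groups.values() for x in v)
    if ss_within == 0:
        return float("inf")  # groups differ but have no internal spread
    return (ss_between / (k - 1)) / (ss_within / (n - k))

if __name__ == "__main__":
    data = [("AC", 1), ("AC", 2), ("AC", 3),
            ("GC", 2), ("GC", 3), ("GC", 4),
            ("WC", 5), ("WC", 6), ("WC", 7)]
    print(anova_f_by_position(data, position=1))
```

In the example the group means are 2, 3, and 6, giving a between-group sum of squares of 26 and a within-group sum of squares of 6, so F is approximately 13 with (2, 6) degrees of freedom.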