BIGNASim database structure and analysis portal for nucleic acids simulation data


A strong requirement for organizing any field of knowledge is agreement on the terminology used. This has been a concern in bioinformatics and has led to the development of a number of ontologies describing several aspects of the discipline (6, 7, 8, 9). Here we have developed a partial ontology to describe nucleic acids simulations. Its contents are used to qualify simulations and to power the search facility. Simulation browser tables and simulation details (Figure S1B) include the set of keywords derived from the ontology. In its present state the ontology is merely a set of normalized terms; it has been developed in a separate project and will be documented elsewhere.
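
As an illustration of how ontology keywords could drive a search, the sketch below tags a few simulations with hierarchic Ids taken from the table of ontology terms and filters them by term. It is a minimal sketch and not the portal's actual implementation: the simulation names, the `simulations` dictionary, and the `matches` and `search` helpers are hypothetical. It assumes that a term's Id is a prefix of the Ids of all terms below it, which is what the Ids in the table suggest.

```python
# Hypothetical records: simulation name -> set of ontology term Ids (strings).
simulations = {
    "sim_A": {"10101", "102020101", "201010103"},  # DNA, Linear duplex, ParmBSC1
    "sim_B": {"10102", "1020102", "201010102"},    # RNA, Hairpin, ParmBSC0
    "sim_C": {"10101", "10204", "201010103"},      # DNA, Quadruplex, ParmBSC1
}


def matches(term_ids, query_id):
    """True if any term equals query_id or sits below it in the hierarchy.

    Assumption: descendants share the Id of their ancestor as a prefix.
    """
    return any(term.startswith(query_id) for term in term_ids)


def search(sims, *query_ids):
    """Names of simulations that satisfy every queried term (logical AND)."""
    return sorted(
        name
        for name, terms in sims.items()
        if all(matches(terms, query) for query in query_ids)
    )


duplexes = search(simulations, "10202")                   # any kind of Duplex
dna_bsc1 = search(simulations, "10101", "201010103")      # DNA and ParmBSC1
```

Querying a high-level term such as `102` (Structure_Type) returns every simulation tagged with any structure type, while a deeper Id narrows the result.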

Table of ontology terms (view in an ontology viewer)

Hierarchic Id Label Description
1 System Composition of simulated system
101   NA_Type Type of Nucleic Acid
10101     DNA DNA
10102     RNA RNA
10103     DNA-RNA_Hybrid DNA-RNA Hybrid
10104     PNA PNA
10199     OtherNAType Other NA Types
102   Structure_Type Structure of the nucleic acid
10201     SingleStrand Single Strand
1020101       Unpaired Single Strand with no WC pairing
1020102       Hairpin Hairpin, WC pairing
10202     Duplex Duplex
1020201       Canonical Canonical WC pairing
102020101         Linear Linear duplex
10202010101           Unpaired Linear duplex with unpaired ends
10202010102           BulgedLin Linear duplex with unpaired fragments
102020102         Circular Circular duplex
1020202       Hogsteen Hoogsteen pairing
10203     Triplex Triplex
1020301       ParallelTrip Parallel Triplex
1020302       AntiParallelTrip Antiparallel Triplex
10204     Quadruplex Quadruplex
1020401       Gloop Gloop
1020402       ParallelQuad Parallel Quadruplex
1020403       AntiparallelQuad Antiparallel Quadruplex
1020404       IDNA I-DNA
10205     HollidayJnt Holliday junction
10206     3WayJnt 3-Way junction
10207     Loop Loop
10299     OtherStructureType Other structure types
103   System_Type Type of complex involving NA
10301     Naked Naked, uncomplexed
10302     Complex Complexed Nuc. Acid
1030201       Protein-nuc Complex Protein Nucl. Acid
103020101         Enzymes Complexed with Enzyme
103020102         Binding Binding Proteins
10302010201           Regulatory Regulatory Proteins (Trans Factors, etc)
10302010202           SstrandBind Single Strand Binders
10302010203           Nucleosome Nucleosome proteins
10302010299           OtherBindProt Other binding proteins
1030202       Ligand-nuc Ligand Nucleic Acid complexes
103020201         Intercalator Intercalator
103020202         MinGBinder Minor groove binder
103020203         MajGBinder Major groove binder
103020204         HybridBinder Hybrid binders
104   OriginalHelicalConformation Original helical conformation of the Nucleic Acids
10401     A A
10402     B B
10403     Z Z
10404     Hogsteen Hoogsteen
10405     MixedHConf Mixed conformations
10499     OtherHConf Other Conformations
105   SequenceModifications Modifications of Nucleic Acids Sequence
10501     ModifiedNucleotides Modified Nucleotides
10502     CrossLinked CrossLinked
10503     EpigeneticVariants EpigeneticVariants
10504     SequenceMismatches Sequence Mismatches
10599     OtherSeqMod Other modifications
106   SequenceFeatures Relevant features related to sequence
10601     PolyA Poly A Tract
1060101       BrokenPolyA Broken Poly A Tract
10602     PolyG Poly G Tract
10603     DrewDickersonD Drew Dickerson Dodecamer
10604     SeqMismatch Sequence Mismatches
2 Simulation Simulation Data
201   SimConditions Simulation settings
20101     ForceField ForceField
2010101       Amber Cornell ForceField family
201010101         Parm99 Parm99
201010102         ParmBSC0 ParmBSC0
201010103         ParmBSC1 ParmBSC1
201010104         ParmBSC0-OL1 ParmBSC0-OL1
201010105         ParmBSC0-OL4 ParmBSC0-OL4
201010106         ParmBSC0-OL1-OL4 ParmBSC0-OL1-OL4
201010107         ParmBSC0-CG ParmBSC0-Cheng/Garcia
2010102       Charmm Charmm ForceField family
201010201         Charmm36 Charmm36
2010199       OtherFF Other forcefields
20102     Length Length of simulations
2010201       NanoSecondRange Between 1 ns and 1 us
2010202       MicroSecondRange Over 1 us
20103     Temperature Simulation temperature
2010301       Physiological Physiological (around 298-300 K)
2010302       NonPhysiological NonPhysiological
20104     Solvent Solvent used in the simulation
2010401       Water Water only
2010402       Mixed Mixture water and other solvent
201040201         Wat-Ethanol Water Ethanol mixture
20105     Charge Charge model
2010501       Electroneutral Counter ions to compensate NA charge
2010502       AddedSalt Added counterions over charge compensation
201050201         Physiological Physiological (0.15M)
201050202         NonPhysiological NonPhysiological
20106     IonParam Parameter used for ion description
2010601       Dang Dang
2010602       Cheatham Cheatham
202   TrajectoryType Type of trajectory related to conformation changes
20201     Equilibrium Equilibrium (thermal fluctuations without major conformational changes)
20202     Folding Folding or Unfolding
20203     Transition Transition between known conformations
20299     OtherTrajType Other type of trajectory
3 Analysis Analysis: any data derived from trajectories (simulated or experimental ensembles)
301   TimeScope Time scope of the analysis
30101     Snapshot Analysis made on a single snapshot
30102     TimeAvg Time averaged analysis
302   FragmentScope Fragment scope of the analysis
30201     SingleBase Analysis done on a single residue (base)
30202     GroupAvg Analysis done on a group of residues
3020201       BP Analysis done on a base pair (considering the main NA pairing)
3020202       BPStep Analysis done on a base pair step (2 consecutive base pairs, considering the main pairing)
3020203       SeqFragment Analysis done on other sequence fragments
30203     FullSystem Analysis done on the complete system
30204     Metatrajectory Analysis done on a group of trajectories
30205     ExpStructure Analysis done on an experimental structure
303   AnalysisType Type of analysis
30301     BackboneTorsions Backbone Torsions
3030101       BI/BIIPopulation Proportion of BI/BII population
3030102       SugarPuckering Sugar puckering populations (N,E,S,W)
3030103       AGCanonical Proportion of canonical alpha/gamma torsions
30302     HelicalParam Helical parameters
3030201       AxisBP Base Pair Axis parameters
303020101         AxisBPTras Translational Base Pair Axis parameters
30302010101           Xdisp X-Displacement
30302010102           Ydisp Y-Displacement
303020102         AxisBPRot Rotational Base Pair Axis parameters
30302010201           Inclination Inclination
30302010202           Tip Tip
3030202       HelicalBP Base Pair Helical parameters
303020201         HelicalBPTrans Translational Base Pair Helical parameters
30302020101           Shear Shear
30302020102           Stretch Stretch
30302020103           Stagger Stagger
303020202         HelicaBRRot Rotational Base Pair Helical parameters
30302020201           Buckle Buckle
30302020202           Opening Opening
30302020203           Propeller Propeller
3030203       HelicalBPStep Base Pair Step Helical parameters
303020301         HelicaBPStepTrans Translational Base Pair Step Helical parameters
30302030101           Rise Rise
30302030102           Slide Slide
30302030103           Shift Shift
303020302         HelicaBPStepRot Rotational Base Pair Step Helical parameters
30302030201           Roll Roll
30302030202           Tilt Tilt
30302030203           Twist Twist
30303     GrooveAnalysis Groove analysis
3030301       MajorGroove Major Groove
303030101         MajGDepth Depth of the Major Groove
303030102         MajGWidth Width of the Major Groove
3030302       MinorGroove Minor Groove
303030201         MinGDepth Depth of the Minor Groove
303030202         MinGWidth Width of the Minor Groove
30304     Interactions Analysis of interactions
3030401       Hbonds Hydrogen bonds (distances)
303040101         WC Watson-Crick Hydrogen Bonds
303040199         Other Other Hydrogen Bonds
3030402       Stacking Stacking interactions
303040201         Wstrand Stacking on the Watson Strand
303040202         Cstrand Stacking on the Crick Strand
303040203         Crossed Stacking between strands
30305     NMR NMR observables
3030501       NOE NOE
3030502       JC J-Couplings
30306     Stiffness Stiffness analysis
3030601       ForceConstant Force Constants
303060101         Fctwist Fctwist
303060102         Fcroll Fcroll
303060103         Fctilt Fctilt
303060104         Fcrise Fcrise
303060105         Fcshift Fcshift
303060106         Fcslide Fcslide
3030602       ForceMatrix Matrix of stiffness constants (twist, roll, tilt, rise, shift, slide)
3030603       ForceProduct DiagonalProduct
30307     TrajectoryVideo Video of Trajectory in standard formats
30308     TrajectoryData Trajectory data
3030801       PCAZip Trajectory in PCZ format
30309     Cartesian Cartesian analysis
3030901       RMSd Root Mean Square Deviation (RMSd)
3030902       RMSf Root Mean Square Fluctuation (RMSf)
3030903       RadGyration Radius of Gyration
3030904       Bfactor B-factor
3030905       AvgStruct Average structure
30310     PCAnalysis Principal Component analysis
3031001       EigenValues PCA EigenValues
3031002       NumberEV Number of PCA EigenValues for a given variance
3031003       EigenVal Vector of eigenValues
3031002       EigenVector PCA EigenVectors
3031003       TrajectoryProj Projections of trajectory
3031004       Animated trajectory Trajectory animated following given eigenvectors
3031005       Entropy Entropy prediction
303100501         Schlitter Schlitter
303100502         Androcioaei Entropy prediction using the Andricioaei protocol
3031006       Variance Variance measured in the trajectory
30311     ContactMaps Contact Maps
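
The hierarchic Ids in the table appear to encode the position of each term: one digit for the top-level branch (1 System, 2 Simulation, 3 Analysis), followed by two digits per additional level. Under that assumption, the parent chain of any term can be recovered from its Id alone. The following is a minimal sketch; the helper names (`depth`, `ancestors`, `parse_row`, `label_path`) are hypothetical, and the row parser assumes that labels contain no whitespace, which holds for most but not all rows of the table.

```python
def depth(term_id):
    """Hierarchy level of a term: '1' -> 1, '102' -> 2, '10201' -> 3, ..."""
    return (len(term_id) + 1) // 2


def ancestors(term_id):
    """Ids of all ancestors, root first, obtained by truncating the Id."""
    return [term_id[:n] for n in range(1, len(term_id), 2)]


def parse_row(line):
    """Split a table row into (Id, Label, Description).

    Assumption: the label is a single whitespace-free token.
    """
    term_id, label, description = line.split(None, 2)
    return term_id, label, description


# A few rows copied from the table above.
ROWS = """\
1 System Composition of simulated system
102   Structure_Type Structure of the nucleic acid
10201     SingleStrand Single Strand
1020101       Unpaired Single Strand with no WC pairing
"""

# Map each Id to its label.
labels = {}
for row in ROWS.splitlines():
    term_id, label, _description = parse_row(row)
    labels[term_id] = label


def label_path(term_id):
    """Slash-separated labels from the root down to the given term."""
    return "/".join(labels[i] for i in ancestors(term_id) + [term_id])


path = label_path("1020101")
```

Because the hierarchy is carried by the Id itself, a flat list of rows is enough to rebuild the tree, and the indentation of the table is only a visual aid.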