BIGNASim database structure and analysis portal for nucleic acids simulation data

Metadata for deposited trajectories

All trajectory data submitted to BIGNASim must be accompanied by its metadata. The metadata can be generated through the web site forms, following the standard deposition procedure. A more automatic way to produce this information is also available through metadata files. These are CSV files composed of:

  • BIGNASim ontology terms: a single identifier per line is enough, though the label can also be included in the second column for the sake of clarity
  • Specific tags
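
The two kinds of rows described above can be separated with a few lines of Python. This is only a minimal sketch under stated assumptions: the file is taken to be comma-delimited, a tag row is assumed to hold the tag name in the first column and its value in the second, and values that contain commas (for example an author list) are assumed to be quoted. The function name read_metadata and the identifiers ONT_0001 and ONT_0002 are hypothetical placeholders, not real BIGNASim ontology terms.

```python
import csv
import io


def read_metadata(text, known_tags):
    """Split metadata CSV rows into ontology terms and tag values.

    Assumption: a row whose first column is a known tag name is a tag row;
    any other non-empty row is treated as an ontology term.
    """
    terms = {}  # ontology identifier -> optional label (may be empty)
    tags = {}   # tag name -> value
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        if not cells or not cells[0]:
            continue  # skip blank lines
        key = cells[0]
        value = cells[1] if len(cells) > 1 else ""
        if key in known_tags:
            tags[key] = value
        else:
            terms[key] = value
    return terms, tags


# Hypothetical file content: two ontology terms followed by two tags.
example = """ONT_0001,hypothetical ontology label
ONT_0002
datasetName,demo01
trajTemperature,300
"""
terms, tags = read_metadata(example, {"datasetName", "trajTemperature"})
```

The set of tag names is passed in explicitly so that the same function keeps working if the list of accepted tags changes.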

The accepted tags are the following:

datasetName Acronym for a dataset which groups one or several trajectory files. Format: alphanumeric string, between 4 and 7 characters long
datasetDescription Short title or description for the dataset
publDateSys Method to define the dataset publication date. Options are "release": submitted data will be made public immediately after validation; "hold": data will be on hold until [publDate] is reached
publDate Date on which submitted data will be released when [publDateSys]="hold". Format: yyyy/mm/dd
pubSys Method to define how the dataset reference publication is given. Options are "document": the dataset is referenced by uploaded files; "reference": the dataset is referenced by a public study, in which case [pubTitle], [pubAuth], [pubJourn], [pubYear], [pubVol] and [pubDOI] must be specified
pubTitle Title of the dataset reference publication. Mandatory if [pubSys]="reference"
pubAuth List of authors (comma-separated) of the dataset reference publication. Mandatory if [pubSys]="reference"
pubJourn Journal where the dataset reference publication is published. Mandatory if [pubSys]="reference"
pubYear Year of the dataset reference publication. Mandatory if [pubSys]="reference"
pubVol Volume of the journal where the dataset reference publication is published. Mandatory if [pubSys]="reference"
pubDOI DOI identifier for the dataset reference publication. Mandatory if [pubSys]="reference"
trajFormat Format of the uploaded trajectory file. Options are "traj_crd": AMBER CRD, "traj_dcd": CHARMM/NAMD DCD, "traj_cdf": AMBER NetCDF, "traj_gro": AMBER BINPOS, "traj_xtc": GROMACS XTC
topFormat Format of the uploaded topology file. Options are "top_pdb": PDB, "top_prmtop": AMBER PRMTOP, "top_psf": NAMD PSF, "top_top": GROMACS TOP, "top_itp": GROMACS ITP, "top_rtp": GROMACS RTP
PDB Protein Data Bank identifier for the reference experimental structure of the trajectory
NDB Nucleic Acid Database identifier for the reference experimental structure of the trajectory
ligandNames Comma-separated list of ligands included in the trajectory
additionalSolvent Comma-separated list of additional solvent molecules in the trajectory
counterions Comma-separated list of counterions and their concentrations
trajLength Total trajectory length in nanoseconds. If the trajectory is split into multiple files, the sum of their lengths
frames Number of frames
frameStep Time step taken between frames in nanoseconds
trajTemperature Trajectory temperature in kelvin
comments Any comment relevant to the submission that may help classify the data correctly
rmsd All-heavy-atom mass-weighted RMSD (%)
rmsd_bp All-heavy-atom mass-weighted RMSD per base pair (%/bp)
Rgyr Radius of gyration (%/bp)
lostWC Percentage of lost Watson and Crick bonds
lostContacts Percentage of lost 3D contacts
fraying Presence of fraying. Options are "yes", "no"
avgTwist Global average Twist (degrees)
avgRoll Global average Roll (degrees)
minorGrooveSize Minor groove dimensions. Format: width x depth
majorGrooveSize Major groove dimensions. Format: width x depth
grooveSizeMethod Method used to compute the groove dimensions
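
Several of the tags above carry format rules or are only mandatory under certain conditions, so it can be useful to check a metadata file before submitting it. The following Python function is a rough sketch of such a check and not the validation performed by BIGNASim itself. The names validate_tags and check_option are hypothetical, the tags are assumed to be already loaded into a dictionary, the date is assumed to use a four-digit year, and the inclusive 4 to 7 length bounds for datasetName are an assumption exposed as a parameter.

```python
import re
from datetime import datetime

TRAJ_FORMATS = {"traj_crd", "traj_dcd", "traj_cdf", "traj_gro", "traj_xtc"}
TOP_FORMATS = {"top_pdb", "top_prmtop", "top_psf", "top_top", "top_itp", "top_rtp"}
PUB_FIELDS = ["pubTitle", "pubAuth", "pubJourn", "pubYear", "pubVol", "pubDOI"]


def check_option(tags, key, allowed, errors):
    """Record an error if the tag is present but not one of the listed options."""
    if key in tags and tags[key] not in allowed:
        errors.append(f"{key}: expected one of {sorted(allowed)}")


def validate_tags(tags, name_len=(4, 7)):
    """Return a list of problems found in a {tag: value} dictionary."""
    errors = []

    # datasetName: alphanumeric; the inclusive length bounds are an assumption.
    if "datasetName" in tags:
        name = tags["datasetName"]
        shortest, longest = name_len
        if not re.fullmatch(r"[A-Za-z0-9]+", name) or not shortest <= len(name) <= longest:
            errors.append("datasetName: expected an alphanumeric string of accepted length")

    # Tags restricted to a fixed set of options (checked only when present).
    check_option(tags, "publDateSys", {"release", "hold"}, errors)
    check_option(tags, "pubSys", {"document", "reference"}, errors)
    check_option(tags, "trajFormat", TRAJ_FORMATS, errors)
    check_option(tags, "topFormat", TOP_FORMATS, errors)
    check_option(tags, "fraying", {"yes", "no"}, errors)

    # A held dataset needs a release date in yyyy/mm/dd form.
    if tags.get("publDateSys") == "hold":
        try:
            datetime.strptime(tags.get("publDate", ""), "%Y/%m/%d")
        except ValueError:
            errors.append("publDate: expected yyyy/mm/dd when publDateSys is hold")

    # A dataset referenced by a public study needs the full citation.
    if tags.get("pubSys") == "reference":
        missing = [field for field in PUB_FIELDS if not tags.get(field)]
        if missing:
            errors.append("missing publication tags: " + ", ".join(missing))

    return errors
```

An empty list means that none of these particular checks failed; it does not guarantee that the submission will be accepted.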