Data Information - What is this?

Most studies generate data, use existing data, or both, and these data should be acknowledged. Many journals have an explicit section in which authors describe where the data generated for the manuscript can be found. Data that the authors used to conduct the study can also be described in this section, or in the broader methods section. When submitting to SciScore, we recommend that authors include the data / code availability section if it is not already part of the methods.

SciScore looks for statements about data deposition or data use and returns the sentence that discusses them. SciScore also looks for identifiers associated with about 30 well-known data repositories, such as Gene Expression Omnibus (GEO), and with generalist repositories, such as Dryad, that use DOIs. SciScore then attempts to resolve each identifier and marks the ID red (link does not resolve) or blue (link resolves).
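As a rough illustration of what identifier detection and red/blue marking could look like, here is a minimal Python sketch. It is not SciScore's implementation. The regular expressions are simplified assumptions about a few common accession formats, the function names `find_identifiers` and `mark` are hypothetical, and the resolution check is passed in as a function so that no particular resolver service is assumed. The accession numbers and the DOI in the example are made-up placeholders.

```python
import re
from typing import Callable, List, Tuple

# Simplified, illustrative patterns (assumptions, not SciScore's rules).
# Real accession formats have more variants than are covered here.
PATTERNS = {
    "GEO": re.compile(r"\bGSE\d+\b"),
    "Sequence Read Archive": re.compile(r"\b[SED]RR\d{6,}\b"),
    "BioProject": re.compile(r"\bPRJ(?:NA|EB|DB)\d+\b"),
    "ProteomeXchange": re.compile(r"\bPXD\d{6}\b"),
    "dbGaP": re.compile(r"\bphs\d{6}\b"),
    "DOI": re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+"),
}


def find_identifiers(text: str) -> List[Tuple[str, str]]:
    """Return (repository, identifier) pairs in order of appearance."""
    hits = []
    for repo, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Drop trailing sentence punctuation that the DOI pattern may swallow.
            identifier = match.group().rstrip(".,;)")
            hits.append((match.start(), repo, identifier))
    return [(repo, identifier) for _, repo, identifier in sorted(hits)]


def mark(identifier: str, resolves: Callable[[str], bool]) -> str:
    """Map a resolution check onto the colors used in the report."""
    return "blue" if resolves(identifier) else "red"


# Example with placeholder identifiers.
sentence = (
    "RNA-seq data were deposited in GEO under accession GSE12345 "
    "(BioProject PRJNA123456); code is at doi:10.5061/dryad.example."
)
for repo, identifier in find_identifiers(sentence):
    # A stand-in check that always succeeds; a real check would query
    # the repository or a resolver service.
    print(repo, identifier, mark(identifier, lambda _id: True))
```

Passing the check in as a function keeps the sketch testable without network access; in practice the check would be replaced by a lookup against the repository in question.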

  • Data Information: Data Availability

    • Checklist that requires this: MDAR

    • If we find a statement about data, then SciScore expects to find the following items (randomization, blinding, power);

  • Data Identifiers

    • Checklist that requires this: MDAR

    • If we find data identifiers, then SciScore expects to find the following items (randomization, blinding, power);

    • Recognized identifiers come from: dbSNP, dbVar, Sequence Read Archive, BioProject, Protein Circular Dichroism Data Bank, ArrayExpress, GEO, European Genome-phenome Archive, Japanese Genotype-phenotype Archive, MassIVE, MetaboLights, PeptideAtlas, ProteomeXchange, FlowRepository, Image Data Resource, European Nucleotide Archive, UniProt, dbGaP, BioStudies, and ClinVar

Additional notes and resources for data

When you have a choice, so-called specialist repositories (e.g., GEO, dbVar, Protein Data Bank, SPARC.science) are usually better than generalist repositories (e.g., Dataverse, Figshare, or your institutional repository) because data deposited there is more likely to be reused by the community. For example, depositing your variant into ClinVar means that your data will be available to all pipelines using ClinVar, whereas putting the same data into Dryad will be less useful because it isn’t aligned to a data standard. However, depositing data into Dryad is far better than your “desk drawer” because the data is available in the long term and therefore recoverable, whether or not you get a new desk.

Some helpful lists of data repositories and databases

If you know of a good list of repositories, or of a repository not covered by the RRID portal, that is country specific or funder specific (e.g., DOD or NSF), please send us a note. We will be happy to incorporate good lists of repositories here, and individual repositories can be submitted to the RRID portal.