How to Name Your Studies, Animals, and Samples
Recommended Sample Naming Best Practices
To make upload easy, we recommend including this key information in your sample names:
- Animal Name
- Tissue
- Time Collected
You might also consider including your initials or an abbreviation of your study name, to ensure uniqueness.
Here's why:
When naming your samples for the mass spectrometer run, making them unique for TraceBase can smooth out your submission experience.
The sample names in the RAW files that come off the mass spectrometer should each map to a unique biological sample in TraceBase. Mass spec runs are often configured for samples from a single animal, so the sample names entered often embed several pieces of data: a specific tissue, sometimes the animal name, or even a researcher's initials.
When multiple mzXML files are generated from one sample's raw file, to obtain the positive or negative scan data, or a particular m/z (mass/charge) range, the resultant files adopt the same name, and when those files are collated together, sometimes the polarity or scan range (e.g. "_neg", "_pos", or "_scan2"), or some combination, is added so that the files can exist in the same directory.
All of this information gets incorporated into the sample headers of the peak annotation files. TraceBase is designed to extract the biological sample names from the LCMS sample name found in a peak annotation file (e.g. AccuCor), thus it attempts to identify and remove the scan labels so that the samples listed in each row of the Samples sheet of the Study Doc represent unique biological samples.
Thus, most importantly, since samples from multiple animals are collected into a single study, the sample names (and Animal names for that matter) should be unique across that entire study. If the Animal name or Sample name is not unique, upload is still possible, but the names will need to be modified to be unique.
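As a rough illustration of the scan-label removal described above, the Python sketch below strips trailing polarity and scan-range labels from LCMS sample headers to recover biological sample names. It is not TraceBase's actual implementation: the suffix patterns are assumed from the examples in the text, and the function name and the sample headers are hypothetical.

```python
import re

# Assumed scan-label suffixes, based only on the examples "_neg", "_pos",
# and "_scan2"; the labels TraceBase actually recognizes may differ.
SCAN_LABEL = re.compile(r"(_(pos|neg|scan\d+))+$", re.IGNORECASE)


def strip_scan_labels(header: str) -> str:
    """Return the sample header with trailing scan labels removed."""
    return SCAN_LABEL.sub("", header)


# Hypothetical sample headers from a peak annotation file.
headers = [
    "study001_m01_quad_pos",
    "study001_m01_quad_neg",
    "study001_m01_quad_neg_scan2",
    "study001_m01_liver",
]

# Several LCMS headers collapse to one biological sample name.
samples = sorted({strip_scan_labels(h) for h in headers})
print(samples)
```

Here the three "quad" headers all reduce to the same biological sample, which is why the name that remains after the labels are removed needs to be unique on its own.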
And since TraceBase is designed for cross-study analyses, it enforces that all animal and sample names be globally unique, thus a simple sample name like "M1_spleen" is likely going to need to be modified to avoid colliding with existing TraceBase samples.
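Before submitting, it can help to check your own list of names for repeats. The sketch below is one way to do that in Python, assuming you have your new sample names in a list and, optionally, a set of names you know are already in use. The function name and all example names are hypothetical, and this is not how TraceBase itself performs the check.

```python
from collections import Counter


def find_collisions(new_names, existing_names):
    """Return (duplicates within new_names, names already in existing_names)."""
    counts = Counter(new_names)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    collisions = sorted(set(new_names) & set(existing_names))
    return duplicates, collisions


# Hypothetical names that are already taken.
existing = {"M1_spleen", "M2_spleen"}
# Hypothetical names for a new submission.
new = ["M1_spleen", "study001_m01_spleen", "study001_m01_spleen"]

duplicates, collisions = find_collisions(new, existing)
print("repeated within the submission:", duplicates)
print("already in use:", collisions)
```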
Suggested Scheme for Unique Study, Animal, and Sample Names
For a new study/experiment, create a new identifier counting up from "study001", optionally followed by a short name for the experiment (e.g. "study001_hyperthyroid_glucose_infusion"). You can then reference this experiment ID ("study001") elsewhere.
- A different experiment may have a new identifier (e.g. "study002_hypothyroid glucose infusion").
- A related follow up experiment may have an extension to the identifier, e.g. "study001b_hyperthyroid lactate infusion".
- Any text could be used in place of "study".
Within each experiment, animal names count up from 01 (e.g. "study001_m01", "study001_m02", ...).
- A unique sample name can then be created by adding to this animal name:
- Provide the full animal identifier, tissue, and sometimes time collected.
- E.g. "study001_m01_quad", "study001_m01_tailserum_000", "study001_m01_tailserum_120".
- Some researchers keep a single list of Animal IDs for all experiments (e.g. 0001, 0002, etc.). This works well for TraceBase, because each Animal name is unique.
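The scheme above can be sketched as a few small Python helpers that assemble study, animal, and sample names from their parts. This is only an illustration: the function names are hypothetical, the "m" prefix and zero-padding widths are taken from the examples, and treating the time collected as a whole number padded to three digits is an assumption.

```python
from typing import Optional


def study_id(number: int, suffix: str = "") -> str:
    """Build a study identifier such as "study001" or "study001b"."""
    return f"study{number:03d}{suffix}"


def animal_name(study: str, number: int, prefix: str = "m") -> str:
    """Build an animal name such as "study001_m01"."""
    return f"{study}_{prefix}{number:02d}"


def sample_name(animal: str, tissue: str, time_collected: Optional[int] = None) -> str:
    """Build a sample name from animal, tissue, and optional time collected."""
    name = f"{animal}_{tissue}"
    if time_collected is not None:
        # Assumed format: whole number, zero-padded to three digits.
        name += f"_{time_collected:03d}"
    return name


study = study_id(1)
animal = animal_name(study, 1)
names = [
    sample_name(animal, "quad"),
    sample_name(animal, "tailserum", 0),
    sample_name(animal, "tailserum", 120),
]
print(names)
```

Generating names this way, instead of typing them by hand, keeps padding and separators consistent across a study.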
Apply this scheme when performing LCMS
When loading samples to run LCMS, apply this labeling scheme:
- Create a folder for each sequence that includes the date of LCMS and experiment identifier.
- Label samples according to biological entity (e.g. "study001_plasma_0").
- If running the same samples multiple times (e.g. positive mode / negative mode), save each result in a different destination folder.
- When analyzed in Maven/El-Maven and processed for isotope correction, the resulting sample names will be easy to label for upload to TraceBase.
LCMS data for a single biological entity can be generated from multiple LCMS experiments (e.g. "positive" vs "negative" mass spectrometry, alternative extraction methods, etc). There are two general methods for handling this:
- Option 1: include method information in the sample name (e.g. append "hilicPos")
- Option 2: leave sample name the same for every biological entity, and generate separate Accucor / Isocorr files for each type of method.
- This requires saving results into separate folders on the LCMS.
- This option is easiest to implement for upload to TraceBase because it keeps your Sample Information Sheet simple.
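For Option 2, one possible folder convention is sketched below in Python: a sequence folder named with the LCMS date and experiment identifier, with one subfolder per method so the sample names can stay identical across runs. The date format, the method labels other than "hilicPos", the example date, and the function name are all assumptions for illustration, not a TraceBase requirement.

```python
from datetime import date
from pathlib import PurePosixPath


def sequence_folder(run_date: date, study: str, method: str) -> PurePosixPath:
    """Return a folder path like "20240115_study001/hilicPos"."""
    # Assumed layout: <YYYYMMDD>_<study identifier>/<method>
    return PurePosixPath(f"{run_date:%Y%m%d}_{study}") / method


# Hypothetical run date and method labels.
run = date(2024, 1, 15)
pos_dir = sequence_folder(run, "study001", "hilicPos")
neg_dir = sequence_folder(run, "study001", "hilicNeg")

# The same sample name can be saved in both folders without a name clash.
print(pos_dir / "study001_m01_quad.mzXML")
print(neg_dir / "study001_m01_quad.mzXML")
```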
Apply this scheme everywhere
These labeling schemes can be applied to your general organization of data outside of TraceBase.
Data and any other information related to your experiments can be organized into one folder per study, labeled with the study identifier (e.g. "study001_my first infusion", "study001b_fixing my first infusion"). Put everything related to the experiment in this folder, for example:
- mouse information sheets
- Study Doc
- LCMS data (mzxml)
- accucor / isocorr files
- Maven project files
- R / python scripts for analyzing this data
In your lab notebook, create page(s) for each experiment name.
When working with samples in the lab, it is not feasible to label every tube with the full identifier, but shorthand can be used for intermediate tubes if everything is from the same study:
- E.g. when extracting tissue from only study001, label working tubes with minimal info "Q1, Q2".
- Label final tubes with as much information as possible (e.g. "001_M1_Q").
- Label boxes stored in freezers with the study identifier, a relevant date, and your initials (e.g. "study001_glucose infusion", "study001b_lactate infusion"...).
If you are unsure of how to label something, write whatever you think fits and submit anyway. Indicate what you are unsure about in the submission form or on the information sheet. The developer team will check the data and help you label it properly before adding to the live data on TraceBase.
The key steps to remember for this sheet are to match sample names and animal IDs: