3.4. Entering and Organizing Data

Registered, unrestricted SMD users may enter and edit their own experimental data. This section briefly covers the main points; extensive help documentation is available within SMD. Note that a curator must enter the design (layout) of a microarray before experimental data from the corresponding arrays can be entered.

3.4.1. Entering Experimental Data

Navigation (unrestricted account and login required): Data menu: "Enter my data" option.

SMD currently accepts two-color data from the Agilent Feature Extraction, GenePix, ScanAlyze, and SpotReader feature extraction software packages. Single-channel data are accepted from the Affymetrix MAS 5 and GCOS software, and from DNA Chip Analyzer (dChip). Original TIFF images, data files, and grid files are required for two-color data; image (.DAT), primary (.CEL), and summary (.txt) data files are required for Affymetrix/dChip data.
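The Affymetrix/dChip file requirement above can be checked before submission with a short script. The following is a minimal sketch in Python and is not part of SMD: the function name `missing_affy_files` and the file names in the usage note are hypothetical, and it assumes the three file types can be recognized by the extensions named above, compared case-insensitively.

```python
from pathlib import Path

# File kinds required for Affymetrix/dChip data, keyed by the
# extensions named in the text (compared case-insensitively).
REQUIRED_AFFY_EXTENSIONS = {
    "image": ".dat",
    "primary": ".cel",
    "summary": ".txt",
}


def missing_affy_files(filenames):
    """Return the required file kinds not represented in `filenames`."""
    present = {Path(name).suffix.lower() for name in filenames}
    return [
        kind
        for kind, extension in REQUIRED_AFFY_EXTENSIONS.items()
        if extension not in present
    ]
```

For example, `missing_affy_files(["chip1.DAT", "chip1.CEL"])` returns `["summary"]`, indicating that the summary (.txt) file still has to be supplied.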

The primary data may be entered via a Web form, or in batch mode using a text file prepared by the user. Procedural information and annotations are entered as a separate step, again either interactively (by editing the hybridization record), or in batch mode.

The primary data from GenePix, ScanAlyze, and SpotReader are automatically normalized on data entry, using a simple total-intensity normalization calculation. At the experimenter's option, the data may be renormalized at any time, using either the same simple calculation or the more sophisticated loess normalization options provided by the marray package (13) for Bioconductor (14). There are currently no facilities for renormalizing Agilent or Affymetrix/dChip data, although Agilent's Feature Extraction software provides a number of normalization options that may be applied before entering data into SMD. SMD does provide some simple options for normalizing and transforming data during retrieval, prior to clustering (see Subheading 3.2.4.).
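To illustrate the idea behind total-intensity normalization, the sketch below rescales one channel so that the two channels have equal summed intensity, then reports log2 ratios. This is one common form of the calculation and is an assumption here; SMD's exact calculation (for example, which spots it includes) is not specified in this section, and the function and variable names are hypothetical.

```python
import math


def total_intensity_normalize(red, green):
    """Scale the red channel so both channels have the same total intensity.

    Returns the normalization factor and the per-spot log2(red/green)
    ratios after scaling. Spots with a non-positive intensity in either
    channel get None, since their log ratio is undefined.
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of spots")
    total_red = sum(red)
    total_green = sum(green)
    if total_red <= 0 or total_green <= 0:
        raise ValueError("each channel needs a positive total intensity")

    # Assumed form of the calculation: one global factor per array.
    factor = total_green / total_red
    log_ratios = [
        math.log2((r * factor) / g) if r > 0 and g > 0 else None
        for r, g in zip(red, green)
    ]
    return factor, log_ratios
```

For a hybridization in which every red intensity is exactly twice the corresponding green intensity, the factor is 0.5 and every normalized log ratio is 0. A single global factor of this kind cannot correct intensity-dependent dye bias, which is the motivation for the loess options mentioned above.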

3.4.2. Array Lists

Registered users may create "array lists," which are lists of arrays/hybridizations, optionally with data filter specifications for each one (see Subheading 3.2.4.). These lists may be created by hand or with online tools within SMD, and are stored within SMD. They are used for a variety of purposes, most commonly to specify a group of hybridizations that the user will analyze frequently. Array lists are typically used by a single researcher in the course of active analysis. When the analysis has matured, or must be shared with collaborators or published, an "experiment set" (see Subheadings 3.1.3. and 3.4.3.) is generally a better tool.

3.4.3. Experiment Sets

Navigation (unrestricted account and login required): Advanced Search tool: "Data Retrieval and Analysis" button: "Create Experiment Set" button.

Registered users may create "experiment sets," which are the primary organizational tool for analysis, publication, and collaboration in SMD. An experiment set is an ordered list of hybridizations, together with experimental factor values for each and an overall description of the experiment. Experiment sets are created using interactive tools within SMD. The owner/creator of the set may assign access to other users, making it a simple matter to share annotations and descriptions along with the data.
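The structure of an experiment set described above can be pictured with a small data model. The sketch below is purely illustrative and does not reflect SMD's actual schema or interface: the class names, field names, and methods are all hypothetical, chosen only to mirror the elements named in the text (an ordered list of hybridizations, factor values for each, an overall description, and owner-assigned access).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Hybridization:
    """One hybridization with its experimental factor values."""
    array_id: str
    factors: Dict[str, str] = field(default_factory=dict)


@dataclass
class ExperimentSet:
    """Hypothetical model of an experiment set (not SMD's schema)."""
    name: str
    description: str
    owner: str
    hybridizations: List[Hybridization] = field(default_factory=list)
    shared_with: Set[str] = field(default_factory=set)

    def add_hybridization(self, array_id: str, **factors: str) -> None:
        # Appending preserves the order in which hybridizations are added.
        self.hybridizations.append(Hybridization(array_id, dict(factors)))

    def grant_access(self, user: str) -> None:
        # The owner shares the set, annotations included, with another user.
        self.shared_with.add(user)

    def can_view(self, user: str) -> bool:
        return user == self.owner or user in self.shared_with

    def factor_values(self, factor: str) -> List[Tuple[str, Optional[str]]]:
        # One (array_id, value) pair per hybridization, in set order;
        # the value is None where the factor was not recorded.
        return [(h.array_id, h.factors.get(factor)) for h in self.hybridizations]
```

Keeping the factor values and the description on the same object as the hybridization list is what lets a single access grant share the annotations along with the data.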

3.4.4. Publication Records

Navigation (curator account and login required): Lists menu: "All Programs" option: "Create Publication" link.

Database curators may create publication records (see Subheading 3.1.2.). This requires creating one or more experiment sets to associate with the publication (see Subheading 3.4.3.) and granting public access to all data to be included in the publication. Given a PubMed ID number, the publication creation tool will download all relevant information (authors, citation, abstract, and so on); alternatively, the information may be entered by hand. The curator may also enter URLs for Web sites containing the full text of the article, supplemental information, and so forth.

