Purple Sea Urchin
Strongylocentrotus purpuratus
(Photo courtesy of Steven Murray, CSU Fullerton)

The April 2005 release of the Purple Sea Urchin genome (Strongylocentrotus purpuratus) was produced by Baylor College of Medicine's Human Genome Sequencing Center (BCM HGSC) and corresponds to their Spur_0.5 whole genome shotgun assembly.

Sample position queries

A genome position can be specified by a scaffold coordinate range, the accession number of an mRNA or EST, a gene name, or keywords from the GenBank description of an mRNA. The following list shows examples of valid position queries for the S. purpuratus genome. Note that some position queries (e.g., "brachyury") may return matches to the mRNA records of other species. In these cases, the mRNAs are mapped to their homologs in S. purpuratus. See the User's Guide for more information.

Request:                  Genome Browser Response:

Scaffold5728              Displays all 24,891 bases of Scaffold5728
Scaffold49105:1-10,000    Displays first 10,000 bases of Scaffold49105
AY044637                  Searches for regions of genome aligning to mRNA with GenBank accession AY044637
cdk4                      Searches for regions of genome aligning to cdk4 gene
cyclin dependent          Lists cyclin-dependent protein mRNAs
davidson                  Lists accessions deposited by authors named Davidson
Davidson,E.H.             Lists mRNAs deposited by co-author E.H. Davidson

Use this last format for author queries. Although GenBank requires the search format Davidson EH, GenBank uses the format Davidson,E.H. internally.
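
As an illustration of the query formats listed above, here is a minimal Python sketch that recognizes the two scaffold forms (whole scaffold, and scaffold with a coordinate range) and converts an author name from the GenBank search format to the Davidson,E.H. style. The function names `parse_position` and `to_internal_author` are hypothetical and are not part of the Genome Browser. The sketch assumes scaffold names always take the form "Scaffold" followed by digits, as in the examples, and that ranges are 1-based and inclusive.

```python
import re
from typing import NamedTuple, Optional


class Position(NamedTuple):
    scaffold: str
    start: Optional[int]  # None means "the whole scaffold"
    end: Optional[int]


# Assumption: scaffold names look like "Scaffold5728"; coordinates may
# contain thousands separators, as in "Scaffold49105:1-10,000".
_POSITION_RE = re.compile(
    r"^(?P<scaffold>Scaffold\d+)"
    r"(?::(?P<start>[\d,]+)-(?P<end>[\d,]+))?$"
)


def parse_position(query: str) -> Optional[Position]:
    """Return a Position for scaffold queries, or None for other query types
    (accessions, gene names, keywords, author names)."""
    match = _POSITION_RE.match(query.strip())
    if match is None:
        return None
    if match.group("start") is None:
        return Position(match.group("scaffold"), None, None)
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinate range in {query!r}")
    return Position(match.group("scaffold"), start, end)


def to_internal_author(name: str) -> str:
    """Convert 'Davidson EH' (search format) to 'Davidson,E.H.'."""
    surname, initials = name.rsplit(None, 1)
    return surname + "," + "".join(letter + "." for letter in initials)


print(parse_position("Scaffold49105:1-10,000"))
print(to_internal_author("Davidson EH"))
```

Queries that are not scaffold positions return `None` here so that a caller could fall through to an accession, gene name, or keyword search.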


Assembly Details

This release was produced by assembling whole genome shotgun reads using the Atlas genome assembly system at the BCM HGSC. Several whole genome shotgun libraries, with inserts of 2-6 kb, were used to produce the data. About 7 million reads were assembled, representing about 800 Mb of sequence and about 6x coverage of the (clonable) sea urchin genome. Highly repeated sequences were assembled separately into reptigs and merged into the genome assembly. Sequences from BAC clones were omitted from this assembly and will be placed in a subsequent version of the draft sequence.

The total length of all contigs greater than 1 kb is 768 Mb (668 Mb of unique contigs and 100 Mb of reptigs). After 109,344 BAC end sequences were mapped and used for scaffolding, the total span of the assembly, including the gaps between contigs in scaffolds, is 1.13 Gb. This span is 240 Mb larger than the estimated genome size because the scaffolding is preliminary.
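
The figures in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers quoted above; the function names are hypothetical, and the genome size it prints is derived from the stated span and excess rather than quoted from the assembly documentation. The last value is only a rough indication of gap content, since contigs of 1 kb or less are not counted in the 768 Mb total.

```python
# All inputs are taken from the paragraph above, in megabases (Mb).
UNIQUE_CONTIGS_MB = 668
REPTIGS_MB = 100
ASSEMBLY_SPAN_MB = 1130          # 1.13 Gb
EXCESS_OVER_ESTIMATE_MB = 240


def total_contig_length_mb(unique_mb: int, reptig_mb: int) -> int:
    """Sum of unique contigs and reptigs (contigs greater than 1 kb)."""
    return unique_mb + reptig_mb


def implied_genome_size_mb(span_mb: int, excess_mb: int) -> int:
    """Genome size implied by the span and its stated excess."""
    return span_mb - excess_mb


def non_contig_span_mb(span_mb: int, contig_total_mb: int) -> int:
    """Span not accounted for by contigs greater than 1 kb."""
    return span_mb - contig_total_mb


contigs = total_contig_length_mb(UNIQUE_CONTIGS_MB, REPTIGS_MB)
print(contigs)                                                     # 768
print(implied_genome_size_mb(ASSEMBLY_SPAN_MB, EXCESS_OVER_ESTIMATE_MB))  # 890
print(non_contig_span_mb(ASSEMBLY_SPAN_MB, contigs))               # 362
```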

This is a draft sequence and may contain errors; therefore, users should exercise caution. Typical errors in draft genome sequences include misassemblies of repeated sequences, collapses of repeated regions, and unmerged overlaps (e.g., due to polymorphisms) that create artificial duplications. However, base accuracy in contigs (contiguous blocks of sequence) is usually very high, with most errors occurring near the ends of contigs.

More assembly details are in the Spur_0.5 README file and on the BCM HGSC Sea Urchin Genome Project web page.

Bulk downloads of the sequence and annotation data are available via the Genome Browser FTP server or the Downloads page. These data have specific conditions for use. The strPur1 annotation tracks were generated by UCSC and collaborators worldwide. See the Credits page for a detailed list of the organizations and individuals who contributed to this release.


GenBank Pipeline Details

For the purposes of the GenBank alignment pipeline, this assembly is considered to be low-coverage.