cytoBandIdeo Chromosome Band (Ideogram) Chromosome Bands Based On ISCN Lengths (for Ideogram) Mapping and Sequencing cpgIslandExt CpG Islands CpG Islands (Islands < 300 Bases are Light Green) Expression and Regulation Description CpG islands are associated with genes, particularly housekeeping genes, in vertebrates. CpG islands are typically common near transcription start sites and may be associated with promoter regions. Normally a C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the Cs in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over evolutionary time, methylated Cs tend to turn into Ts because of spontaneous deamination. The result is that CpGs are relatively rare unless there is selective pressure to keep them or a region is not methylated for some other reason, perhaps having to do with the regulation of gene expression. CpG islands are regions where CpGs are present at significantly higher levels than is typical for the genome as a whole. The unmasked version of the track displays potential CpG islands that exist in repeat regions and would otherwise not be visible in the repeat masked version. By default, only the masked version of the track is displayed. To view the unmasked version, change the visibility settings in the track controls at the top of this page. Methods CpG islands were predicted by searching the sequence one base at a time, scoring each dinucleotide (+17 for CG and -1 for others) and identifying maximally scoring segments. Each segment was then evaluated for the following criteria: GC content of 50% or greater length greater than 200 bp ratio greater than 0.6 of observed number of CG dinucleotides to the expected number on the basis of the number of Gs and Cs in the segment The entire genome sequence, masking areas included, was used for the construction of the track Unmasked CpG. The track CpG Islands is constructed on the sequence after all masked sequence is removed. The CpG count is the number of CG dinucleotides in the island. The Percentage CpG is the ratio of CpG nucleotide bases (twice the CpG count) to the length. The ratio of observed to expected CpG is calculated according to the formula (cited in Gardiner-Garden et al. (1987)): Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G) where N = length of sequence. The calculation of the track data is performed by the following command sequence: twoBitToFa assembly.2bit stdout | maskOutFa stdin hard stdout \ | cpg_lh /dev/stdin 2> cpg_lh.err \ | awk '{$2 = $2 - 1; width = $3 - $2; printf("%s\t%d\t%s\t%s %s\t%s\t%s\t%0.0f\t%0.1f\t%s\t%s\n", $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \ | sort -k1,1 -k2,2n > cpgIsland.bed The unmasked track data is constructed from twoBitToFa -noMask output for the twoBitToFa command. Data access CpG islands and its associated tables can be explored interactively using the REST API, the Table Browser or the Data Integrator. All the tables can also be queried directly from our public MySQL servers, with more information available on our help page as well as on our blog. The source for the cpg_lh program can be obtained from src/utils/cpgIslandExt/. The cpg_lh program binary can be obtained from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/cpg_lh (choose "save file") Credits This track was generated using a modification of a program developed by G. Miklem and L. Hillier (unpublished). References Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987 Jul 20;196(2):261-82. PMID: 3656447 cpgIslandSuper CpG Islands CpG Islands (Islands < 300 Bases are Light Green) Expression and Regulation Description CpG islands are associated with genes, particularly housekeeping genes, in vertebrates. CpG islands are typically common near transcription start sites and may be associated with promoter regions. Normally a C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the Cs in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over evolutionary time, methylated Cs tend to turn into Ts because of spontaneous deamination. The result is that CpGs are relatively rare unless there is selective pressure to keep them or a region is not methylated for some other reason, perhaps having to do with the regulation of gene expression. CpG islands are regions where CpGs are present at significantly higher levels than is typical for the genome as a whole. The unmasked version of the track displays potential CpG islands that exist in repeat regions and would otherwise not be visible in the repeat masked version. By default, only the masked version of the track is displayed. To view the unmasked version, change the visibility settings in the track controls at the top of this page. Methods CpG islands were predicted by searching the sequence one base at a time, scoring each dinucleotide (+17 for CG and -1 for others) and identifying maximally scoring segments. Each segment was then evaluated for the following criteria: GC content of 50% or greater length greater than 200 bp ratio greater than 0.6 of observed number of CG dinucleotides to the expected number on the basis of the number of Gs and Cs in the segment The entire genome sequence, masking areas included, was used for the construction of the track Unmasked CpG. The track CpG Islands is constructed on the sequence after all masked sequence is removed. The CpG count is the number of CG dinucleotides in the island. The Percentage CpG is the ratio of CpG nucleotide bases (twice the CpG count) to the length. The ratio of observed to expected CpG is calculated according to the formula (cited in Gardiner-Garden et al. (1987)): Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G) where N = length of sequence. The calculation of the track data is performed by the following command sequence: twoBitToFa assembly.2bit stdout | maskOutFa stdin hard stdout \ | cpg_lh /dev/stdin 2> cpg_lh.err \ | awk '{$2 = $2 - 1; width = $3 - $2; printf("%s\t%d\t%s\t%s %s\t%s\t%s\t%0.0f\t%0.1f\t%s\t%s\n", $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \ | sort -k1,1 -k2,2n > cpgIsland.bed The unmasked track data is constructed from twoBitToFa -noMask output for the twoBitToFa command. Data access CpG islands and its associated tables can be explored interactively using the REST API, the Table Browser or the Data Integrator. All the tables can also be queried directly from our public MySQL servers, with more information available on our help page as well as on our blog. The source for the cpg_lh program can be obtained from src/utils/cpgIslandExt/. The cpg_lh program binary can be obtained from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/cpg_lh (choose "save file") Credits This track was generated using a modification of a program developed by G. Miklem and L. Hillier (unpublished). References Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987 Jul 20;196(2):261-82. PMID: 3656447 gap Gap Gap Locations Mapping and Sequencing Description This track depicts gaps in the assembly. These gaps - with the exception of intractable heterochromatic gaps - will be closed during the finishing process. Gaps are represented as black boxes in this track. If the relative order and orientation of the contigs on either side of the gap is known, it is a bridged gap and a white line is drawn through the black box representing the gap. This assembly contains the following principal types of gaps: Fragment - gaps between the contigs of a draft clone. (In this context, a contig is a set of overlapping sequence reads.) These are represented by varying numbers of Ns in the assembly. Clone - gaps between bactigs. These are represented by 50,000 Ns in the assembly. nscanGene N-SCAN N-SCAN Gene Predictions Genes and Gene Predictions Description This track shows gene predictions using the N-SCAN gene structure prediction software provided by the Computational Genomics Lab at Washington University in St. Louis, MO, USA. Methods N-SCAN combines biological-signal modeling in the target genome sequence along with information from a multiple-genome alignment to generate de novo gene predictions. It extends the TWINSCAN target-informant genome pair to allow for an arbitrary number of informant sequences as well as richer models of sequence evolution. N-SCAN models the phylogenetic relationships between the aligned genome sequences, context-dependent substitution rates, insertions, and deletions. Rat N-SCAN uses human (hg18) as the informant and iterative pseudogene masking. Credits Thanks to Michael Brent's Computational Genomics Group at Washington University St. Louis for providing this data. Special thanks for this implementation of N-SCAN to Aaron Tenney in the Brent lab, and Robert Zimmermann, currently at Max F. Perutz Laboratories in Vienna, Austria. References Gross SS, Brent MR. Using multiple alignments to improve gene prediction. J Comput Biol. 2006 Mar;13(2):379-93. PMID: 16597247 Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003 Oct 1;31(19):5654-66. PMID: 14500829; PMC: PMC206470 Korf I, Flicek P, Duan D, Brent MR. Integrating genomic homology into gene structure prediction. Bioinformatics. 2001;17 Suppl 1:S140-8. PMID: 11473003 van Baren MJ, Brent MR. Iterative gene prediction and pseudogene removal improves genome annotation. Genome Res. 2006 May;16(5):678-85. PMID: 16651666; PMC: PMC1457044 pubs Publications Publications: Sequences in Scientific Articles Literature Description This track is based on text-mining of full-text biomedical articles and includes two types of subtracks: Sequences found in publications, grouped by article and searched in genomes with BLAT Identifiers in publications that directly relate to chromosome locations (e.g., gene symbols, SNP identifiers, etc) Both sources of information are linked to the respective articles. Background information on how permission to full-text data was obtained can be found on the project website. Display Convention and Configuration The sequence subtrack indicates the location of sequences in publications mapped back to the genome, annotated with the first author and the year of the publication. All matches of one article are grouped ("chained") together. Article titles are shown when you move the mouse cursor over the features. Thicker parts of the features (exons) represent matching sequences, connected by thin lines to matches from the same article within 30 kbp. The subtrack "individual sequence matches" activates automatically when the user clicks a sequence match and follows the link "Show sequence matches individually" from the details page. Mouse-overs show flanking text around the sequence, and clicking features links to BLAT alignments. All other subtracks (i.e. bands, genes, SNPs) show the number of matching articles as the feature description. Clicking on them shows the sentences and sections in articles where the identifiers were found. The track configuration includes a keyword and year filter. Keywords are space-separated and are searched in the article's title, author list, and abstract. Data The track is based on text from biomedical research articles, obtained as part of the UCSC Genocoding Project. The current dataset consists of about 600,000 files (main text and supplementary files) from PubMed Central (Open-Access set) and around 6 million text files (main text) from Elsevier (as part of the Sciverse Apps program). Methods All file types (including XML, raw ASCII, PDFs and various Microsoft Office formats (Excel, Word, PowerPoint)) were converted to text. The results were processed to find groups of words that look like DNA/RNA sequences or words that look like protein sequences. These were then mapped with BLAT to the human genome and these model organisms: mouse (mm9), rat (rn4), zebrafish (danRer6), Drosophila melanogaster (dm3), X. tropicalis (xenTro2), Medaka (oryLat2), C. intestinalis (ci2), C. elegans (ce6) and yeast (sacCer2). The pipeline roughly proceeds through these steps: For sequences, the best match across all genomes is used, if it is longer than 17 bp and matches at 90% identity. Two sets of BLAT parameters are tried, the default ones for sequences longer than 25 bp, very sensitive ones (stepSize=5) for shorter sequences. Sequences are mapped to genomic DNA. Those that do not match are mapped to RefSeq cDNAs. Hits from the same article that are closer than 30 kbp are joined into one feature (shown as exon-blocks on the browser). All parts of a joined feature have to match at least 25 bp. Non-unique hits are kept in the joined feature with the most members. Joined features with identical members in two different genomes are kept in both genomes. Note that due to the 90% identity filter, some sequences do not match anywhere in the genome. Examples include primers with added restriction sites, mutation primers, or any other sequence that joins or mixes two pieces of genomic DNA not part of RefSeq. Also note that some gene symbols correspond to English words which can sometimes lead to many false positives. Credits Software and processing by Maximilian Haeussler. UCSC Track visualisation by Larry Meyer and Hiram Clawson. Elsevier support by Max Berenstein, Raphael Sidi, Judd Dunham, Scott Robbins and colleagues. Original version written at the Bergman Lab, University of Manchester, UK. Testing by Mary Mangan, OpenHelix Inc, and Greg Roe, UCSC. Feedback Please send ideas, comments or feedback on this track to max@soe.ucsc.edu. We are very interested in getting access to more articles from publishers for this dataset; see the project website. References Aerts S, Haeussler M, van Vooren S, Griffith OL, Hulpiau P, Jones SJ, Montgomery SB, Bergman CM, Open Regulatory Annotation Consortium. Text-mining assisted regulatory annotation. Genome Biol. 2008;9(2):R31. PMID: 18271954; PMC: PMC2374703 Haeussler M, Gerner M, Bergman CM. Annotating genes and genomes with DNA sequences extracted from biomedical articles. Bioinformatics. 2011 Apr 1;27(7):980-6. PMID: 21325301; PMC: PMC3065681 Van Noorden R. Trouble at the text mine. Nature. 2012 Mar 7;483(7388):134-5. pubsBlat Sequences Sequences in Articles: PubmedCentral and Elsevier Literature pubsBlatPsl Indiv. Seq. Matches Individual Sequence Matches of One Selected Article from Sequences Track Literature mrna Rat mRNAs Rat mRNAs from GenBank mRNA and EST Description The mRNA track shows alignments between rat mRNAs in GenBank and the genome. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. In dense display mode, the items that are more darkly shaded indicate matches of better quality. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: Type a term in one or more of the text boxes to filter the mRNA display. For example, to apply the filter to all mRNAs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Multiple terms may be entered at once, separated by a space. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only mRNAs that match all filter criteria will be highlighted. If "or" is selected, mRNAs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display mRNAs that match the filter criteria. If "include" is selected, the browser will display only those mRNAs that match the filter criteria. This track may also be configured to display codon coloring, a feature that allows the user to quickly compare mRNAs against the genomic sequence. For more information about this option, go to the Codon and Base Coloring for Alignment Tracks page. Several types of alignment gap may also be colored; for more information, go to the Alignment Insertion/Deletion Display Options page. Methods GenBank rat mRNAs were aligned against the genome using the blat program. When a single mRNA aligned in multiple places, the alignment having the highest base identity was found. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence were kept. Credits The mRNA track was produced at UCSC from mRNA sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2013 Jan;41(Database issue):D36-42. PMID: 23193287; PMC: PMC3531190 Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 rmsk RepeatMasker Repeating Elements by RepeatMasker Variation and Repeats Description This track was created by using Arian Smit's RepeatMasker program, which screens DNA sequences for interspersed repeats and low complexity DNA sequences. The program outputs a detailed annotation of the repeats that are present in the query sequence (represented by this track), as well as a modified version of the query sequence in which all the annotated repeats have been masked (generally available on the Downloads page). RepeatMasker uses the Repbase Update library of repeats from the Genetic Information Research Institute (GIRI). Repbase Update is described in Jurka (2000) in the References section below. Some newer assemblies have been made with Dfam, not Repbase. You can find the details for how we make our database data here in our "makeDb/doc/" directory. Display Conventions and Configuration In full display mode, this track displays up to ten different classes of repeats: Short interspersed nuclear elements (SINE), which include ALUs Long interspersed nuclear elements (LINE) Long terminal repeat elements (LTR), which include retroposons DNA repeat elements (DNA) Simple repeats (micro-satellites) Low complexity repeats Satellite repeats RNA repeats (including RNA, tRNA, rRNA, snRNA, scRNA, srpRNA) Other repeats, which includes class RC (Rolling Circle) Unknown The level of color shading in the graphical display reflects the amount of base mismatch, base deletion, and base insertion associated with a repeat element. The higher the combined number of these, the lighter the shading. A "?" at the end of the "Family" or "Class" (for example, DNA?) signifies that the curator was unsure of the classification. At some point in the future, either the "?" will be removed or the classification will be changed. Methods Data are generated using the RepeatMasker -s flag. Additional flags may be used for certain organisms. Repeats are soft-masked. Alignments may extend through repeats, but are not permitted to initiate in them. See the FAQ for more information. Credits Thanks to Arian Smit, Robert Hubley and GIRI for providing the tools and repeat libraries used to generate this track. References Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. http://www.repeatmasker.org. 1996-2010. Repbase Update is described in: Jurka J. Repbase Update: a database and an electronic journal of repetitive elements. Trends Genet. 2000 Sep;16(9):418-420. PMID: 10973072 For a discussion of repeats in mammalian genomes, see: Smit AF. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr Opin Genet Dev. 1999 Dec;9(6):657-63. PMID: 10607616 Smit AF. The origin of interspersed repeats in the human genome. Curr Opin Genet Dev. 1996 Dec;6(6):743-8. PMID: 8994846 xenoRefGene Other RefSeq Non-Rat RefSeq Genes Genes and Gene Predictions Description This track shows known protein-coding and non-protein-coding genes for organisms other than rat, taken from the NCBI RNA reference sequences collection (RefSeq). The data underlying this track are updated weekly. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), reviewed (dark). The item labels and display colors of features within this track can be configured through the controls at the top of the track description page. Label: By default, items are labeled by gene name. Click the appropriate Label option to display the accession name instead of the gene name, show both the gene and accession names, or turn off the label completely. Codon coloring: This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. For more information about this feature, go to the Coloring Gene Predictions and Annotations by Codon page. Hide non-coding genes: By default, both the protein-coding and non-protein-coding genes are displayed. If you wish to see only the coding genes, click this box. Methods The RNAs were aligned against the rat genome using blat; those with an alignment of less than 15% were discarded. When a single RNA aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 25% base identity with the genomic sequence were kept. Credits This track was produced at UCSC from RNA sequence data generated by scientists worldwide and curated by the NCBI RefSeq project. References Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63. PMID: 24259432; PMC: PMC3965018 Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. PMID: 15608248; PMC: PMC539979 snp125 SNPs Simple Nucleotide Polymorphisms (dbSNP build 125) Variation and Repeats Description This track contains dbSNP build 125, available from ftp.ncbi.nih.gov/snp. Interpreting and Configuring the Graphical Display Variants are shown as single tick marks at most zoom levels. When viewing the track at or near base-level resolution, the displayed width of the SNP corresponds to the width of the variant in the reference sequence. Insertions are indicated by a single tick mark displayed between two nucleotides, single nucleotide polymorphisms are displayed as the width of a single base, and multiple nucleotide variants are represented by a block that spans two or more bases. The configuration categories reflect the following definitions (not all categories apply to this assembly): Location Type: Describes the alignment of the flanking sequence Range - the flank alignments leave a gap of 2 or more bases in the reference assembly Exact - the flank alignments leave exactly one base between them Between - the flank alignments are contiguous; the variation is an insertion RangeInsertion - the flank alignments surround a distinct polymorphism between the submitted sequence and reference assembly; the submitted sequence is shorter RangeSubstitution - the flank alignments surround a distinct polymorphism between the submitted sequence and reference assembly; the submitted sequence and the reference assembly sequence are of equal length RangeDeletion - the flank alignments surround a distinct polymorphism between the submitted sequence and reference assembly; the submitted sequence is longer Class: Describes the observed alleles Single - single nucleotide variation: all observed alleles are single nucleotides (can have 2, 3 or 4 alleles) In-del - insertion/deletion (applies to RangeInsertion, RangeSubstitution, RangeDeletion) Heterozygous - heterozygous (undetermined) variation: allele contains string '(heterozygous)' Microsatellite - the observed allele from dbSNP is variation in counts of short tandem repeats Named - the observed allele from dbSNP is given as a text name No Variation - no variation asserted for sequence Mixed - the cluster contains submissions from multiple classes Multiple Nucleotide Polymorphism - alleles of the same length, length > 1, and from set of {A,T,C,G} Insertion - the polymorphism is an insertion relative to the reference assembly Deletion - the polymorphism is a deletion relative to the reference assembly Unknown - no classification provided by data contributor Validation: Method used to validate the variant (each variant may be validated by more than one method) By Frequency - at least one submitted SNP in cluster has frequency data submitted By Cluster - cluster has at least 2 submissions, with at least one submission assayed with a non-computational method By Submitter - at least one submitter SNP in cluster was validated by independent assay By 2 Hit/2 Allele - all alleles have been observed in at least 2 chromosomes By HapMap - validated by HapMap project Unknown - no validation has been reported for this variant Function: dbSNP's predicted functional effect of variant on RefSeq transcripts, both curated (NM_* and NR_*) as in the RefSeq Genes track and predicted (XM_* and XR_*), not shown in UCSC Genome Browser. A variant may have more than one functional role if it overlaps multiple transcripts. Locus Region - variation is 3' to and within 500 bases of a transcript, or is 5' to and within 2000 bases of a transcript (dbSNP term: locus; Sequence Ontology term: feature_variant) Coding - Synonymous - no change in peptide for allele with respect to the reference assembly (dbSNP term: coding-synon; Sequence Ontology term: synonymous_variant) Coding - Non-Synonymous - change in peptide for allele with respect to the reference assembly (dbSNP term: coding-nonsynon; Sequence Ontology term: protein_altering_variant) Untranslated - variation is in a transcript, but not in a coding region interval (dbSNP term: untranslated; Sequence Ontology term: UTR_variant) Intron - variation is in an intron, but not in the first two or last two bases of the intron (dbSNP term: intron; Sequence Ontology term: intron_variant) Splice Site - variation is in the first two or last two bases of an intron (dbSNP term: splice-site; Sequence Ontology term: splice_site_variant) Reference (coding) - one of the observed alleles of a SNP in a coding region matches the reference assembly (cds-reference) Sequence Ontology term: coding_sequence_variant) Unknown - no known functional classification Molecule Type: Sample used to find this variant Genomic - variant discovered using a genomic template cDNA - variant discovered using a cDNA template Unknown - sample type not known Average heterozygosity: Calculated by dbSNP as described here Average heterozygosity should not exceed 0.5 for bi-allelic single-base substitutions. Weight: Alignment quality assigned by dbSNP Weight can be 1, 2, 3 or 10. Weight = 1 are the highest quality alignments. Weight = 10 is excluded from the data set. A filter on maximum weight value is supported, which defaults to 3. You can configure this track such that the details page displays the function and coding differences relative to particular gene sets. Choose the gene sets from the list on the SNP configuration page displayed beneath this heading: On details page, show function and coding differences relative to. When one or more gene tracks are selected, the SNP details page lists all genes that the SNP hits (or is close to), with the same keywords used in the function category. The function usually agrees with NCBI's function, but can sometimes give a bit more detail (e.g. more detail about how close a near-gene SNP is to a nearby gene). Insertions/Deletions dbSNP uses a class called 'in-del'. This has been split into 'insertion' and 'deletion', based on location type. Location types 'range' and 'exact' are deletions relative to the reference assembly. Location type 'between' are insertions relative to the reference assembly. For the new location types, the class 'in-del' is preserved. UCSC Annotations In addition to presenting the dbSNP data, the following annotations are provided: The size of the dbSNP reference allele is checked to see if it matches the coordinate span; exceptions are noted. The dbSNP reference allele is compared to the UCSC reference allele, and a note is made if the dbSNP reference allele is the reverse complement of the UCSC reference allele. Single-base substitutions where the alignments of the flanking sequences are adjacent or have a gap of more than one base are noted. A note is made if the observed alleles are not available from the rs_fasta files. Observed alleles with an unexpected format are noted. The length of the observed alleles is checked for consistency with location types; exceptions are noted. Single-base substitutions are checked to see that one of the observed alleles matches the reference allele; excpetions are noted. Simple deletions are checked to see that the observed allele matches the reference allele; exceptions are noted. Tri-allelic and quad-allelic single-base substitutions are noted. Variants that have multiple mappings are noted. Data Sources b125_SNPContigLoc.bcp.gz was used for coordinates, orientation, location type and the dbSNP reference allele. b125_SNPMapInfo.bcp.gz was used for alignment weights, filtering out alignments with weight = 0 or weight = 10. Class and observed polymorphism were obtained from the shared UniVariation.bcp.gz, using the univar_id from SNP.bcp.gz as an index. Functional classification was obtained from b125_SNPContigLocusId.bcp.gz. The internal database representation uses dbSNP's function terms, but for display in SNP details pages, these are translated into Sequence Ontology terms. Validation status and heterozygosity were obtained from SNP.bcp.gz. The header lines in the rs_fasta files were used for molecule type. intronEst Spliced ESTs Rat ESTs That Have Been Spliced mRNA and EST Description This track shows alignments between rat expressed sequence tags (ESTs) in GenBank and the genome that show signs of splicing when aligned against the genome. ESTs are single-read sequences, typically about 500 bases in length, that usually represent fragments of transcribed genes. To be considered spliced, an EST must show evidence of at least one canonical intron (i.e., the genomic sequence between EST alignment blocks must be at least 32 bases in length and have GT/AG ends). By requiring splicing, the level of contamination in the EST databases is drastically reduced at the expense of eliminating many genuine 3' ESTs. For a display of all ESTs (including unspliced), see the rat EST track. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. In dense display mode, darker shading indicates a larger number of aligned ESTs. The strand information (+/-) indicates the direction of the match between the EST and the matching genomic sequence. It bears no relationship to the direction of transcription of the RNA with which it might be associated. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: Type a term in one or more of the text boxes to filter the EST display. For example, to apply the filter to all ESTs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Multiple terms may be entered at once, separated by a space. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only ESTs that match all filter criteria will be highlighted. If "or" is selected, ESTs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display ESTs that match the filter criteria. If "include" is selected, the browser will display only those ESTs that match the filter criteria. This track may also be configured to display base labeling, a feature that allows the user to display all bases in the aligning sequence or only those that differ from the genomic sequence. For more information about this option, go to the Base Coloring for Alignment Tracks page. Several types of alignment gap may also be colored; for more information, go to the Alignment Insertion/Deletion Display Options page. Methods To make an EST, RNA is isolated from cells and reverse transcribed into cDNA. Typically, the cDNA is cloned into a plasmid vector and a read is taken from the 5' and/or 3' primer. For most — but not all — ESTs, the reverse transcription is primed by an oligo-dT, which hybridizes with the poly-A tail of mature mRNA. The reverse transcriptase may or may not make it to the 5' end of the mRNA, which may or may not be degraded. In general, the 3' ESTs mark the end of transcription reasonably well, but the 5' ESTs may end at any point within the transcript. Some of the newer cap-selected libraries cover transcription start reasonably well. Before the cap-selection techniques emerged, some projects used random rather than poly-A priming in an attempt to retrieve sequence distant from the 3' end. These projects were successful at this, but as a side effect also deposited sequences from unprocessed mRNA and perhaps even genomic sequences into the EST databases. Even outside of the random-primed projects, there is a degree of non-mRNA contamination. Because of this, a single unspliced EST should be viewed with considerable skepticism. To generate this track, rat ESTs from GenBank were aligned against the genome using blat. Note that the maximum intron length allowed by blat is 750,000 bases, which may eliminate some ESTs with very long introns that might otherwise align. When a single EST aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence are displayed in this track. Credits This track was produced at UCSC from EST sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2013 Jan;41(Database issue):D36-42. PMID: 23193287; PMC: PMC3531190 Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 cpgIslandExtUnmasked Unmasked CpG CpG Islands on All Sequence (Islands < 300 Bases are Light Green) Expression and Regulation Description CpG islands are associated with genes, particularly housekeeping genes, in vertebrates. CpG islands are typically common near transcription start sites and may be associated with promoter regions. Normally a C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the Cs in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over evolutionary time, methylated Cs tend to turn into Ts because of spontaneous deamination. The result is that CpGs are relatively rare unless there is selective pressure to keep them or a region is not methylated for some other reason, perhaps having to do with the regulation of gene expression. CpG islands are regions where CpGs are present at significantly higher levels than is typical for the genome as a whole. The unmasked version of the track displays potential CpG islands that exist in repeat regions and would otherwise not be visible in the repeat masked version. By default, only the masked version of the track is displayed. To view the unmasked version, change the visibility settings in the track controls at the top of this page. Methods CpG islands were predicted by searching the sequence one base at a time, scoring each dinucleotide (+17 for CG and -1 for others) and identifying maximally scoring segments. Each segment was then evaluated for the following criteria: GC content of 50% or greater length greater than 200 bp ratio greater than 0.6 of observed number of CG dinucleotides to the expected number on the basis of the number of Gs and Cs in the segment The entire genome sequence, masking areas included, was used for the construction of the track Unmasked CpG. The track CpG Islands is constructed on the sequence after all masked sequence is removed. The CpG count is the number of CG dinucleotides in the island. The Percentage CpG is the ratio of CpG nucleotide bases (twice the CpG count) to the length. The ratio of observed to expected CpG is calculated according to the formula (cited in Gardiner-Garden et al. (1987)): Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G) where N = length of sequence. The calculation of the track data is performed by the following command sequence: twoBitToFa assembly.2bit stdout | maskOutFa stdin hard stdout \ | cpg_lh /dev/stdin 2> cpg_lh.err \ | awk '{$2 = $2 - 1; width = $3 - $2; printf("%s\t%d\t%s\t%s %s\t%s\t%s\t%0.0f\t%0.1f\t%s\t%s\n", $1, $2, $3, $5, $6, width, $6, width*$7*0.01, 100.0*2*$6/width, $7, $9);}' \ | sort -k1,1 -k2,2n > cpgIsland.bed The unmasked track data is constructed from twoBitToFa -noMask output for the twoBitToFa command. Data access CpG islands and its associated tables can be explored interactively using the REST API, the Table Browser or the Data Integrator. All the tables can also be queried directly from our public MySQL servers, with more information available on our help page as well as on our blog. The source for the cpg_lh program can be obtained from src/utils/cpgIslandExt/. The cpg_lh program binary can be obtained from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/cpg_lh (choose "save file") Credits This track was generated using a modification of a program developed by G. Miklem and L. Hillier (unpublished). References Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987 Jul 20;196(2):261-82. PMID: 3656447 refGene RefSeq Genes RefSeq Genes Genes and Gene Predictions Description The RefSeq Genes track shows known rat protein-coding and non-protein-coding genes taken from the NCBI RNA reference sequences collection (RefSeq). The data underlying this track are updated weekly. Please visit the Feedback for Gene and Reference Sequences (RefSeq) page to make suggestions, submit additions and corrections, or ask for help concerning RefSeq records. For more information on the different gene tracks, see our Genes FAQ. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), reviewed (dark). The item labels and display colors of features within this track can be configured through the controls at the top of the track description page. Label: By default, items are labeled by gene name. Click the appropriate Label option to display the accession name instead of the gene name, show both the gene and accession names, or turn off the label completely. Codon coloring: This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. For more information about this feature, go to the Coloring Gene Predictions and Annotations by Codon page. Hide non-coding genes: By default, both the protein-coding and non-protein-coding genes are displayed. If you wish to see only the coding genes, click this box. Methods RefSeq RNAs were aligned against the rat genome using BLAT. Those with an alignment of less than 15% were discarded. When a single RNA aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.1% of the best and at least 96% base identity with the genomic sequence were kept. Credits This track was produced at UCSC from RNA sequence data generated by scientists worldwide and curated by the NCBI RefSeq project. References Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63. PMID: 24259432; PMC: PMC3965018 Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. PMID: 15608248; PMC: PMC539979 rgdGene2 RGD Genes RGD Genes Genes and Gene Predictions Description This track shows protein-coding genes based on annotation provided by RGD. Method Annotations on RGD Genes were downloaded from: ftp://rgd.mcw.edu/pub/data_release/GFF/Corrected_GFF3/ ftp://rgd.mcw.edu/pub/data_release/GENES_RAT The GFF files were combined into a temporary MySQL database table and those records of "gene", "exon", and "CDS" were selected and loaded into separate MySQL tables. Records of inconsistent strand information were cleaned out. The resulting collection was loaded into a gene prediction format table using the ldHgGene utility program. The data were further processed by two programs, getRgdGeneCds and doRgdGene2 to create the genePred format table, rgdGene2, as the base table for RGD Genes. A program, doRgdGene2Xref, was used to create the rgdGene2Xref table using the Dbxref field. rgdGene2ToDescription table was built using the gene_desc field from GENES_RAT file. rgdGene2ToUniProt and rgdGene2Pep tables were built using data from GENES_RAT and UniProt database. A total of 753 genes were found to have inconsistent annotations, which caused display problems in the Genome Browser. These 753 entries were removed. All the programs mentioned above in this build pipeline could be found in the source code package which may be downloaded here. Credits Thanks to RGD for providing the base annotation of RGD Genes. The RGD Genes track was produced at UCSC by Fan Hsu, Mary Goldman and Hiram Clawson. It is based on data from RGD, NCBI RefSeq, UniProt, and GenBank. Our thanks to the people running these databases and to the scientists worldwide who have made contributions to them. References Twigger SN, Shimoyama M, Bromberg S, Kwitek AE, Jacob HJ, RGD Team. The Rat Genome Database, update 2007--easing the path from disease to data and back again. Nucleic Acids Res. 2007 Jan;35(Database issue):D658-62. PMID: 17151068; PMC: PMC1761441 sgpGene SGP Genes SGP Gene Predictions Using Rat/Human Homology Genes and Gene Predictions Description This track shows gene predictions from the SGP program, developed at the Genome Bioinformatics Laboratory (GBL), which is part of the Grup de Recerca en Informàtica Biomèdica (GRIB) at Institut Municipal d'Investigació Mèdica (IMIM) / Centre de Regulació Genòmica (CRG) in Barcelona. To predict genes in a genomic query, SGP combines geneid predictions with tblastx comparisons of the genomic query against other genomic sequences. Credits Thanks to GRIB for providing these gene predictions. stsMapRat STS Markers STS Markers on Genetic and Radiation Hybrid Maps Mapping and Sequencing Description This track shows locations of Sequence Tagged Sites (STS) along the rat draft assembly. These markers have been mapped using either genetic (Rat FHH x ACI F2 Intercross Genetic Map, Rat SHRSP x BN F2 Intercross Genetic Map) or radiation hybridization (RH Map 2.2) mapping techniques. Additional data on the individual maps can be found at the following links: NCBI UniSTS (now part of NCBI Probe) Rat Genome Database By default all genetic map markers are shown as blue; only radiation hybrid markers and markers that are neither genetic nor radiation hybrid are shown as black; markers that map to more than one position are shown in lighter colors. Users can choose a color to highlight a subset of markers of interest from the Filter options in STS Markers Track Setting page. Using the Filter The track filter can be used to change the color or include/exclude a set of map data within the track. This is helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: In the pulldown menu, select the map whose data you would like to highlight or exclude in the display. By default, the "All Genetic" option is selected. Choose the color or display characteristic that will be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display data from the map selected in the pulldown list. If "include" is selected, the browser will display only data from the selected map. When you have finished configuring the filter, click the Submit button. affyAllExonProbes Affy Exon Probes Affymetrix Exon Array 1.0: Probesets Expression and Regulation Description The Exon GeneChip contains over one million probe sets designed to interrogate individual exons rather than the 3' ends of transcripts as in traditional GeneChips. Exons were derived from a variety of annotations that have been divided into the classes Core, Extended and Full. Core: RefSeq transcripts, full-length GenBank mRNAs Extended: dbEst alignments, Ensembl annotations, syntenic mRNA from human, rat and mouse, microRNA annotations, MITOMAP annotations, Vega genes, Vega pseudogenes Full: Geneid genes, Genscan genes, Genscan Subopt, Exoniphy, RNA genes, SGP genes, Twinscan genes Probe sets are colored by class with the Core probe sets being the darkest and the Full being the lightest color. Additionally, probe sets that do not overlap the exons of a transcript cluster, but fall inside of its introns, are considered bounded by that transcript cluster and are colored slightly lighter. Probe sets that overlap the coding portion of the Core class are colored slightly darker. The microarray track using this probe set can be displayed by turning on the Affy Exon Tissue track. Credits and References The exons interrogated by the probe sets displayed in this track are from the Affymetrix Exon 1.0 GeneChip and were derived from a number of sources. In addition to the millions of cDNA sequences contributed to the GenBank, dbEst and RefSeq databases by individual labs and scientists, the following annotations were used: Ensembl: Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T et al. The Ensembl genome database project. Nucleic Acids Res. 2002 Jan 1;30(1):38-41. PMID: 11752248; PMC: PMC99161 Exoniphy: Siepel, A., Haussler, D. Computational identification of evolutionarily conserved exons. Proc. 8th Int'l Conf. on Research in Computational Molecular Biology, 177-186 (2004). Geneid Genes: Parra G, Blanco E, Guigó R. GeneID in Drosophila. Genome Res. 2000 Apr;10(4):511-5. PMID: 10779490; PMC: PMC310871 Genscan Genes: Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997 Apr 25;268(1):78-94. PMID: 9149143 microRNA: Griffiths-Jones S. The microRNA Registry. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D109-11. PMID: 14681370; PMC: PMC308757 MITOMAP: Kogelnik AM, Lott MT, Brown MD, Navathe SB, Wallace DC. MITOMAP: a human mitochondrial genome database. Nucleic Acids Res. 1996 Jan 1;24(1):177-9. PMID: 8594574; PMC: PMC145607 RNA Genes: Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997 Mar 1;25(5):955-64. PMID: 9023104; PMC: PMC146525 SGP Genes: Wiehe T, Gebauer-Jung S, Mitchell-Olds T, Guigó R. SGP-1: prediction and validation of homologous genes based on sequence alignments. Genome Res. 2001 Sep;11(9):1574-83. PMID: 11544202; PMC: PMC311140 Twinscan Genes: Korf I, Flicek P, Duan D, Brent MR. Integrating genomic homology into gene structure prediction. Bioinformatics. 2001;17 Suppl 1:S140-8. PMID: 11473003 Vega Genes and Pseudogenes: The HAVANA group, Wellcome Trust Sanger Institute. affyAllExonSuper Affy Exon Affymetrix All Exon Microarrays Expression and Regulation Overview This super-track combines related tracks of the Affymetrix All Exon chip data. There are two member tracks: Affymetrix Exon Array 1.0: Normal Tissues: This track displays data from 11 different sample tissues across three replicate arrays each. Affymetrix Exon Array 1.0: Probesets: This track displays probeset loci for the array. affyExonTissues Affy Exon Tissues Affymetrix Exon Array 1.0: Normal Tissues Expression and Regulation Methods RNA (from a commercial source) from 11 tissues were hybridized to Affymetrix Rat Exon 1.0 ST arrays. For each tissue, 3 replicate experiments were performed for a total of 33 arrays. The raw intensity signal from the arrays was normalized with a quantile normalization method, then run through the PLIER algorithm. The normalized data were then converted to median-centered log-ratios, which are displayed as green for negative log-ratios (below-median expression), and red for positive (above-median expression). The probe sets for this microarray track are shown in the the Affy Exon Probes track. Credits This track was produced by Andy Pohl, Kayla Smith, and Pauline Fujita of the genome browser group at UCSC, Melissa Cline of the Ares lab at UCSC, and Chuck Sugnet at Affymetrix, based on sample exon array data available from Affymetrix, produced by Tyson Clark. References Pohl AA, Sugnet CW, Clark TA, Smith K, Fujita PA, Cline MS. Affy exon tissues: exon levels in normal tissues in human, mouse and rat. Bioinformatics. 2009 Sep 15;25(18):2442-3. PMID: 19570805; PMC: PMC2735668 Links Affymetrix Rat Exon 1.0 ST array web page Affymetrix Rat Exon data PLIER algorithm documentation (PDF). affyRAE230 Affy RAE230 Alignments of Affymetrix Consensus Sequences from RAE230 Expression and Regulation Description This track shows the location of the consensus sequences used for the selection of probes on the Affymetrix RAE230A and RAE230B chips. Methods Consensus sequences were downloaded from the Affymetrix Product Support and mapped to the genome with blat followed by pslReps using the parameters -minCover=0.3, -minAli=0.95 and -nearTop=0.005. Credits Thanks to Affymetrix for the data underlying this track. affyU34A Affy RG-U34A Alignments of Affymetrix Exemplar Sequences from RG-U34A Expression and Regulation Description This track shows the location of the exemplar sequences used for the selection of probes on the Affymetrix RG-U34A chip. Methods Exemplar sequences were downloaded from the Affymetrix Product Support and mapped to the genome with blat followed by pslReps using the parameters -minCover=0.3, -minAli=0.95 and -nearTop=0.005. Credits Thanks to Affymetrix for the data underlying this track. genotypeArrays Agilent CGH Agilent CGH Microarray probesets Variation and Repeats Description This track displays the probes from the Agilent Catalog Oligonucleotide Microarrays. Agilent's oligonucleotide CGH (Comparative Genomic Hybridization) platform enables the study of genome-wide DNA copy number changes at a high resolution. The probes on Agilent aCGH microarrays are 60-mer oligonucleotides synthesized in situ using Agilent's inkjet SurePrint technology. The probes represented on the Agilent CGH microarrays have been selected using algorithms developed specifically for the CGH application, assuring optimal performance of these probes in detecting DNA copy number changes. The Agilent catalog CGH microarrays are printed on 1 in. x 3 in. glass slides and are available in several formats. The rat catalog SurePrint G3 microarrays formats are the 1x1M and 4x180K. The legacy rat catalog SurePrint HD microarrays are the 1x244K and 2x105K. Subtracks were colored in alternating shades of green and orange to highlight track boundaries. The track consists of the following subtracks: Description AMADID Samples per Slide Biological Features Median Probe Spacing Gene-biased SurePrint G3 Rat CGH Microarray 1x1M 027065 1 966,967 1.7 kb Yes SurePrint G3 Rat CGH Microarray 4x180K 027064 4 174,012 11 kb Yes SurePrint HD Rat CGH Microarray 1x244K 015223 1 235,974 4.9 kb Yes SurePrint HD Rat CGH Microarray 2x105K 015235 2 97,973 19 kb Yes References More information on the Agilent Oligonucleotide Microarrays is available here. Barrett MT, Scheffer A, Ben-Dor A, Sampas N, Lipson D, Kincaid R, Tsang P, Curry B, Baird K, Meltzer PS et al. Comparative genomic hybridization using oligonucleotide microarrays and total genomic DNA. Proc Natl Acad Sci U S A. 2004 Dec 21;101(51):17765-70. PMID: 15591353; PMC: PMC535426 Credits Thanks to Shane Giles, Peter Webb, and Anniek De Witte from Agilent Technologies for providing these data. agilentCgh105a Ag CGH 2x105K Agilent SurePrint HD Rat CGH Microarray 2x105K AMADID 015235 Variation and Repeats agilentCgh244a Ag CGH 1x244K Agilent SurePrint HD Rat CGH Microarray 1x244K AMADID 015223 Variation and Repeats agilentCgh4x180k Ag CGH 4x180K Agilent SurePrint G3 Rat CGH Microarray 4x180K AMADID 027064 Variation and Repeats agilentCgh1x1m Ag CGH 1x1M Agilent SurePrint G3 Rat CGH Microarray 1x1M AMADID 027065 Variation and Repeats gold Assembly Assembly from Fragments Mapping and Sequencing Description This track shows the draft assembly of the rat genome. Instead of a clone-by-clone assembly, the Rat genome assembly is pieced together from Bactig assemblies, which are reassemblies of the reads from sets of overlapping skimmed BAC clones, including mapped whole genome shotgun reads. The assembly process reworks the contigs (splitting some) and selects them under more stringent criteria than are used for the BAC submissions. In dense display mode, this track depicts the path through the draft and finished contigs used to create the assembled sequence. Where gaps exist in the path, spaces are shown between the blocks. Clone Type Key: W - Whole Genome Shotgun contig F - Finished D - Draft augustusGene AUGUSTUS AUGUSTUS ab initio gene predictions v3.1 Genes and Gene Predictions Description This track shows ab initio predictions from the program AUGUSTUS (version 3.1). The predictions are based on the genome sequence alone. For more information on the different gene tracks, see our Genes FAQ. Methods Statistical signal models were built for splice sites, branch-point patterns, translation start sites, and the poly-A signal. Furthermore, models were built for the sequence content of protein-coding and non-coding regions as well as for the length distributions of different exon and intron types. Detailed descriptions of most of these different models can be found in Mario Stanke's dissertation. This track shows the most likely gene structure according to a Semi-Markov Conditional Random Field model. Alternative splicing transcripts were obtained with a sampling algorithm (--alternatives-from-sampling=true --sample=100 --minexonintronprob=0.2 --minmeanexonintronprob=0.5 --maxtracks=3 --temperature=2). The different models used by Augustus were trained on a number of different species-specific gene sets, which included 1000-2000 training gene structures. The --species option allows one to choose the species used for training the models. Different training species were used for the --species option when generating these predictions for different groups of assemblies. Assembly Group Training Species Fish zebrafish Birds chicken Human and all other vertebrates human Nematodes caenorhabditis Drosophila fly A. mellifera honeybee1 A. gambiae culex S. cerevisiae saccharomyces This table describes which training species was used for a particular group of assemblies. When available, the closest related training species was used. Credits Thanks to the Stanke lab for providing the AUGUSTUS program. The training for the chicken version was done by Stefanie König and the training for the human and zebrafish versions was done by Mario Stanke. References Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008 Mar 1;24(5):637-44. PMID: 18218656 Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003 Oct;19 Suppl 2:ii215-25. PMID: 14534192 bacEndPairs BAC End Pairs BAC End Pairs Mapping and Sequencing Description Bacterial artificial chromosomes (BACs) are a key part of many large-scale sequencing projects. A BAC typically consists of 50 - 300 kb of DNA. During the early phase of a sequencing project, it is common to sequence a single read (approximately 500 bases) off each end of a large number of BACs. Later on in the project, these BAC end reads can be mapped to the genome sequence. This track shows these mappings in cases where both ends could be mapped. These BAC end pairs can be useful for validating the assembly over relatively long ranges. In some cases, the BACs are useful biological reagents. This track can also be used for determining which BAC contains a given gene, useful information for certain wet lab experiments. A valid pair of BAC end sequences must be at least 50 kb but no more than 600 kb away from each other. The orientation of the first BAC end sequence must be "+" and the orientation of the second BAC end sequence must be "-". The scoring scheme used for this annotation assigns 1000 to an alignment when the BAC end pair aligns to only one location in the genome (after filtering). When a BAC end pair or clone aligns to multiple locations, the score is calculated as 1500/(number of alignments). Methods BAC end sequences are placed on the assembled sequence using Jim Kent's blat program. Credits Additional information about the clone, including how it can be obtained, may be found at the NCBI Clone Registry. To view the registry entry for a specific clone, open the details page for the clone and click on its name at the top of the page. cytoBand Chromosome Band Chromosome Bands Based On ISCN Lengths Mapping and Sequencing Description The chromosome band track represents the approximate location of bands seen on Giemsa-stained chromosomes under conditions where 400 bands are visible across the entire genome. Methods Data are derived from the ideogram file downloaded from the NCBI ftp site ftp://ftp.ncbi.nih.gov/genomes/R_norvegicus/maps/mapview/. Band lengths are estimated based on relative sizes as defined by the International System for Cytogenetic Nomenclature (ISCN). Credits We would like to thank NCBI for providing this information. contrastGene CONTRAST CONTRAST Gene Predictions Genes and Gene Predictions Description This track shows protein-coding gene predictions generated by CONTRAST. Each predicted exon is colored according to confidence level: green (high confidence), orange (medium confidence), or red (low confidence). Methods CONTRAST predicts protein-coding genes from a multiple genomic alignment using a combination of discriminative machine learning techniques. A two-stage approach is used, in which output from local classifiers is combined with a global model of gene structure. CONTRAST is trained using a novel procedure designed to maximize expected coding region boundary detection accuracy. Please see the CONTRAST web site for details on how these predictions were generated and an estimate of accuracy. Credits Thanks to Samuel Gross of the Batzoglou lab at Stanford University for providing these predictions. References Gross SS, Do CB, Sirota M, Batzoglou S. CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction. Genome Biol. 2007;8(12):R269. PMID: 18096039; PMC: PMC2246271 cpgIslandGgfAndyMasked CpG Islands (AL) CpG Islands - Andy Law, Masked Sequence (Islands < 300 Bases are Light Green) Expression and Regulation Description CpG islands are associated with genes, particularly housekeeping genes, in vertebrates. CpG islands are typically common near transcription start sites, and may be associated with promoter regions. Normally a C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the Cs in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over evolutionary time methylated Cs tend to turn into Ts because of spontaneous deamination. The result is that CpGs are relatively rare unless there is selective pressure to keep them or a region is not methylated for some reason, perhaps having to do with the regulation of gene expression. CpG islands are regions where CpGs are present at significantly higher levels than is typical for the genome as a whole. The CpG count is the number of CG dinucleotides in the island. The Percentage CpG is the ratio of CpG nucleotide bases (twice the CpG count) to the length. Methods The genome sequence was masked using the output of RepeatMasker and the Tandem Repeats Finder (period ≤ 12). A sliding-window search was performed on the set of CpG locations in the masked genome sequence to find the longest spans that met the criteria given in Gardiner-Garden, M. and Frommer, M. (1987) in the References section below: length of 200 bp or more GC content of 50% or greater ratio of observed to expected CpGs of 0.6 or greater The ratio of observed to expect CpGs is calculated as follows: Obs/Exp CpG = Number of CpG * N / (Number of C * Number of G) Credits This track was generated using a program written by Andy Law (Roslin Institute) with minor modifications by Angie Hinrichs (UCSC). References Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J Mol Biol. 1987 Jul 20;196(2):261-82. PMID: 3656447 ensGene Ensembl Genes Ensembl Genes Genes and Gene Predictions Description These gene predictions were generated by Ensembl. For more information on the different gene tracks, see our Genes FAQ. Methods For a description of the methods used in Ensembl gene predictions, please refer to Hubbard et al. (2002), also listed in the References section below. Data access Ensembl Gene data can be explored interactively using the Table Browser or the Data Integrator. For local downloads, the genePred format files for rn4 are available in our downloads directory as ensGene.txt.gz or in our genes download directory in GTF format. For programmatic access, the data can be queried from the REST API or directly from our public MySQL servers. Instructions on this method are available on our MySQL help page and on our blog. Previous versions of this track can be found on our archive download server. Credits We would like to thank Ensembl for providing these gene annotations. For more information, please see Ensembl's genome annotation page. References Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T et al. The Ensembl genome database project. Nucleic Acids Res. 2002 Jan 1;30(1):38-41. PMID: 11752248; PMC: PMC99161 gc5Base GC Percent GC Percent in 5-Base Windows Mapping and Sequencing Description The GC percent track shows the percentage of G (guanine) and C (cytosine) bases in 5-base windows. High GC content is typically associated with gene-rich areas. This track may be configured in a variety of ways to highlight different apsects of the displayed information. Click the "Graph configuration help" link for an explanation of the configuration options. Credits The data and presentation of this graph were prepared by Hiram Clawson. geneid Geneid Genes Geneid Gene Predictions Genes and Gene Predictions Description This track shows gene predictions from the geneid program developed by Roderic Guigó's Computational Biology of RNA Processing group which is part of the Centre de Regulació Genòmica (CRG) in Barcelona, Catalunya, Spain. Methods Geneid is a program to predict genes in anonymous genomic sequences designed with a hierarchical structure. In the first step, splice sites, start and stop codons are predicted and scored along the sequence using Position Weight Arrays (PWAs). Next, exons are built from the sites. Exons are scored as the sum of the scores of the defining sites, plus the the log-likelihood ratio of a Markov Model for coding DNA. Finally, from the set of predicted exons, the gene structure is assembled, maximizing the sum of the scores of the assembled exons. Credits Thanks to Computational Biology of RNA Processing for providing these data. References Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinformatics. 2007 Jun;Chapter 4:Unit 4.3. PMID: 18428791 Parra G, Blanco E, Guigó R. GeneID in Drosophila. Genome Res. 2000 Apr;10(4):511-5. PMID: 10779490; PMC: PMC310871 genscan Genscan Genes Genscan Gene Predictions Genes and Gene Predictions Description This track shows predictions from the Genscan program written by Chris Burge. The predictions are based on transcriptional, translational and donor/acceptor splicing signals as well as the length and compositional distributions of exons, introns and intergenic regions. For more information on the different gene tracks, see our Genes FAQ. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. The track description page offers the following filter and configuration options: Color track by codons: Select the genomic codons option to color and label each codon in a zoomed-in display to facilitate validation and comparison of gene predictions. Go to the Coloring Gene Predictions and Annotations by Codon page for more information about this feature. Methods For a description of the Genscan program and the model that underlies it, refer to Burge and Karlin (1997) in the References section below. The splice site models used are described in more detail in Burge (1998) below. Credits Thanks to Chris Burge for providing the Genscan program. References Burge C. Modeling Dependencies in Pre-mRNA Splicing Signals. In: Salzberg S, Searls D, Kasif S, editors. Computational Methods in Molecular Biology. Amsterdam: Elsevier Science; 1998. p. 127-163. Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 1997 Apr 25;268(1):78-94. PMID: 9149143 gnfAtlas2 GNF Atlas 2 GNF Rat U34A Expression Atlas 2 Expression and Regulation Description This track shows expression data from the GNF Gene Expression Atlas 2. This contains two replicates each of 44 rat tissues run over Affymetrix microarrays. By default, averages of related tissues are shown. Display all tissues by selecting "All Arrays" from the "Combine arrays" menu on the track settings page. As is standard with microarray data red indicates overexpression in the tissue, and green indicates underexpression. You may want to view gene expression with the Gene Sorter as well as the Genome Browser. Credits Thanks to the Genomics Institute of the Novartis Research Foundation (GNF) for the data underlying this track. References Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci U S A. 2004 Apr 20;101(16):6062-7. PMID: 15075390; PMC: PMC395923 mgcFullMrna MGC Genes Mammalian Gene Collection Full ORF mRNAs Genes and Gene Predictions Description This track shows alignments of rat mRNAs from the Mammalian Gene Collection (MGC) having full-length open reading frames (ORFs) to the genome. The goal of the Mammalian Gene Collection is to provide researchers with unrestricted access to sequence-validated full-length protein-coding cDNA clones for human, mouse, and rat genes. Display Conventions and Configuration The track follows the display conventions for gene prediction tracks. An optional codon coloring feature is available for quick validation and comparison of gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. For more information about this feature, go to the Coloring Gene Predictions and Annotations by Codon page. Methods GenBank rat MGC mRNAs identified as having full-length ORFs were aligned against the genome using blat. When a single mRNA aligned in multiple places, the alignment having the highest base identity was found. Only alignments having a base identity level within 1% of the best and at least 95% base identity with the genomic sequence were kept. Credits The rat MGC full-length mRNA track was produced at UCSC from mRNA sequence data submitted to GenBank by the Mammalian Gene Collection project. References Mammalian Gene Collection project references. Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 microsat Microsatellite Microsatellites - Di-nucleotide and Tri-nucleotide Repeats Variation and Repeats Description This track displays regions that are likely to be useful as microsatellite markers. These are sequences of at least 15 perfect di-nucleotide and tri-nucleotide repeats and tend to be highly polymorphic in the population. Methods The data shown in this track are a subset of the Simple Repeats track, selecting only those repeats of period 2 and 3, with 100% identity and no indels and with at least 15 copies of the repeat. The Simple Repeats track is created using the Tandem Repeats Finder. For more information about this program, see Benson (1999). Credits Tandem Repeats Finder was written by Gary Benson. References Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999 Jan 15;27(2):573-80. PMID: 9862982; PMC: PMC148217 miRNA miRNA MicroRNAs from miRBase Genes and Gene Predictions Description The miRNA track shows microRNAs from miRBase. Display Conventions and Configuration Mature miRNAs (miRs) are represented by thick blocks. The predicted stem-loop portions of the primary transcripts are indicated by thinner blocks. miRNAs in the sense orientation are shown in black; those in the reverse orientation are colored grey. When a single precursor produces two mature miRs from its 5' and 3' parts, it is displayed twice with the two different positions of the mature miR. To display only those items that exceed a specific unnormalized score, enter a minimum score between 0 and 1000 in the text box at the top of the track description page. Methods Mature and precursor miRNAs from the miRNA Registry were aligned against the genome using blat. The extents of the precursor sequences were not generally known, and were predicted based on base-paired hairpin structure. miRBase is described in Griffiths-Jones, S. et al. (2006). The miRNA Registry is described in Griffiths-Jones, S. (2004) and Weber, M.J. (2005) in the References section below. Credits This track was created by Michel Weber of Laboratoire de Biologie Moléculaire Eucaryote, CNRS Université Paul Sabatier (Toulouse, France), Yves Quentin of Laboratoire de Microbiologie et Génétique Moléculaires (Toulouse, France) and Sam Griffiths-Jones of The Wellcome Trust Sanger Institute (Cambridge, UK). References When making use of these data, please cite: Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D140-4. PMID: 16381832; PMC: PMC1347474 Griffiths-Jones S. The microRNA Registry. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D109-11. PMID: 14681370; PMC: PMC308757 Weber MJ. New human and mouse microRNA genes found by homology search. FEBS J. 2005 Jan;272(1):59-73. PMID: 15634332 The following publication provides guidelines on miRNA annotation: Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, Chen X, Dreyfuss G, Eddy SR, Griffiths-Jones S, Marshall M et al. A uniform system for microRNA annotation. RNA. 2003 Mar;9(3):277-9. PMID: 12592000; PMC: PMC1370393 For more information on blat, see Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 wgEncodeNhgriBip NHGRI BiP ENCODE NHGRI Elnitski Bidirectional Promoters Expression and Regulation Description Bidirectional promoters are the regulatory regions that fall between pairs of genes, where the 5' ends of the genes within a pair are positioned in close proximity to one another. This spacing facilitates the initiation of transcription of both genes, creating two transcription forks that advance in opposite directions. The formal definition of a bidirectional promoter requires that the transcription initiation sites are separated by no more than 1,000 bp from one another. Using these criteria we have comprehensively annotated the human and mouse genomes for the presence of bidirectional promoters, using in silico approaches. The identification of these promoters is contingent upon the presence of adjacent, oppositely oriented pairs of genes, because few distinguishing features are available to uniquely identify bidirectional promoters de novo. Genomic annotations used for our identification phase include: A) UCSC known genes annotations (items with score=800). B) GenBank mRNA annotations (score=600). C) spliced ESTs (score=400). The annotations for protein coding genes (A) are strongly supported and therefore provide a high quality dataset for mapping bidirectional promoters. In contrast, bidirectional promoters supported by spliced ESTs (C) alone have varying levels of evidence, ranging from one characterized transcript to hundreds of them. For this reason, the mRNA annotation (B) from GenBank provides a stringent level of validation for the start sites of the EST transcripts. As a large class of regulatory sequences, bidirectional promoters exemplify a rich source of unexplored biological information in the human genome. When compared to the mouse genome, these promoters are identifiable as truly orthologous locations, being maintained in regions of conserved synteny (including both genes and the intervening promoter region) that have undergone no rearrangements since the last common ancestor of mammals, and in some cases fish. We use this approach to annotate orthologous bidirectional promoters in nonhuman species until genomic annotations become available. Methods Assigning Orthologous Regions A multi-stage approach to mapping orthology at bidirectional promoters was developed. Orthology assignments are strongest in coding regions. Therefore we began by mapping single human genes regulated by bidirectional promoters from the Known Genes annotations onto the mouse genome. Orthology assignments were determined using the "chains and nets" data from the UCSC Human Genome Browser mysql tables. Chains in the Genome Browser represent sequences of gapless aligned blocks. Nets provide a hierarchical ordering of those chains. Level 1 chains contain the longest, best-scoring sequence chains that span any selected region. Subsequent levels in the net represent the results of rearrangements, duplications, insertions and deletions that may have disrupted the presence of conserved synteny derived from an ancestral sequence. Confirming Orthologous Genes After determining the orthology assignments using the UCSC chains and nets data, we used the Known Gene annotations or spliced ESTs to search the identity of genes within the corresponding region. Known Genes represent protein-coding genes and therefore can be verified by chains and nets alignments, followed by confirmation of protein identity in both species. Spliced ESTs carry less descriptive information than protein coding genes and therefore were validated in the second species by their presence in an orthologous region, showing conserved synteny of the two genes within a pair, and meeting the criteria of less than 1,000 bp of intergenic distance between those transcripts. Our method for mapping bidirectional promoters in spliced EST datasets is described in more detail in a previous publication. If the program verified evidence for orthology and conserved-syntenic gene arrangement, then the orthologous bidirectional promoter was confirmed. After orthologous assignments were confirmed for pairs of human genes, the reciprocal assignments were analyzed from mouse to human. Currently orthologous bidirectional promoter regions (that have been identified using UCSC known genes) have been mapped in human, chimp, macaque, mouse, rat, dog and cow genomes). Credits These data were produced by Mary Q. Yang in the Elnitski lab at NHGRI, NIH. (contact: elnitski@mail.nih.gov) References Piontkivska H, Yang MQ, Larkin DM, Lewin HA, Reecy J, Elnitski L. Cross-species mapping of bidirectional promoters enables prediction of unannotated 5' UTRs and identification of species-specific transcripts. BMC Genomics. 2009 Apr 24;10:189. PMID: 19393065; PMC: PMC2688522 Yang MQ, Elnitski LL. A computational study of bidirectional promoters in the human genome . Springer Lecture Series: Notes in Bioinformatics 2007. Yang MQ, Elnitski L. Orthology of Bidirectional Promoters Enables Use of a Multiple Class Predictor for Discriminating Functional Elements in the Human Genome. Proceedings of the 2007 International Conference on Bioinformatics & Computational Biology. Proceedings of the 2007 International Conference on Bioinformatics & Computational Biology . --> pp. 218-228. 2007. ISBN: 1-60132-042-6. Yang MQ, Koehly LM, Elnitski LL. Comprehensive annotation of bidirectional promoters identifies co-regulation among breast and ovarian cancer genes. PLoS Comput Biol. 2007 Apr 20;3(4):e72. PMID: 17447839; PMC: PMC1853124 Yang MQ, Taylor J, Elnitski L. Comparative analyses of bidirectional promoters in vertebrates. BMC Bioinformatics. 2008 May 28;9 Suppl 6:S9. PMID: 18541062; PMC: PMC2423431 Data Release Policy Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the tablemetadata as dateUnrestricted and on the download page. The full data release policy for ENCODE is available here. knownGenePrevious Old Known Genes Previous Version of Known Genes Genes and Gene Predictions Description The Old Known Genes track shows genes from the 2006 Known Genes build. RGD Genes has now replaced Known Genes as the main gene track for rat. The Old Known Genes track shows known protein-coding genes based on protein data from UniProt (SWISS-PROT and TrEMBL) and mRNA data from the NCBI reference sequences collection (RefSeq) and GenBank. Each Known Gene is represented by an mRNA and a protein. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. Click here for more information about this feature. Methods This release of UCSC Known Genes was built by a new process, KG II, as described below. UniProt protein sequences (including alternative splicing isoforms) and mRNA sequences from RefSeq and GenBank were aligned against the base genome using BLAT. RefSeq alignments having a base identity level within 0.1% of the best and at least 96% base identity with the genomic sequence were kept. GenBank mRNA alignments having a base identity level within 0.2% of the best and at least 97% base identity with the genomic sequence were kept. Protein alignments having a base identity level within 0.2% of the best and at least 80% base identity with the genomic sequence were kept. Then the genomic mRNA and protein alignments were compared, and protein-mRNA pairings were determined from their overlaps. mRNA CDS data were obtained from RefSeq and GenBank data and supplemented by CDS structures derived from UCSC protein-mRNA BLAT alignments. The initial set of UCSC Known Genes candidates consists of all protein-mRNA pairs with valid mRNA CDS structures. A gene-check program (similar to the one used for the Consensus CDS (CCDS) project) is used to remove questionable candidates, such as those with in-frame stop codons, missing start or stop codons, etc. From each group of gene candidates that share the same CDS structure, the protein-mRNA pair having the best ranking and protein-mRNA alignment score is selected as a UCSC Known Gene. The ranking of a gene candidate depends on its gene-check quality measures. When all else is equal, a preference is given to RefSeq mRNAs and next to MGC mRNAs. Similarly a preference is given to gene candidates represented by Swiss-Prot proteins. The protein-mRNA alignment score is calculated based on protein to mRNA alignment using TBLASTN, plus weighted sub-scores according to the date and length of the mRNA. Credits The UCSC Known Genes track was produced using protein data from UniProt and mRNA data from NCBI RefSeq and GenBank. Data Use Restrictions The UniProt data have the following terms of use, UniProt copyright(c) 2002 - 2004 UniProt consortium: For non-commercial use, all databases and documents in the UniProt FTP directory may be copied and redistributed freely, without advance permission, provided that this copyright statement is reproduced with each copy. For commercial use, all databases and documents in the UniProt FTP directory except the files ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/uniprot_sprot.dat.gz ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/uniprot_sprot.xml.gz may be copied and redistributed freely, without advance permission, provided that this copyright statement is reproduced with each copy. More information for commercial users can be found here. From January 1, 2005, all databases and documents in the UniProt FTP directory may be copied and redistributed freely by all entities, without advance permission, provided that this copyright statement is reproduced with each copy. References Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Hsu F, Kent WJ, Clawson H, Kuhn RM, Diekhans M, Haussler D. The UCSC Known Genes. Bioinformatics. 2006 May 1;22(9):1036-46. PMID: 16500937 Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 xenoMrna Other mRNAs Non-Rat mRNAs from GenBank mRNA and EST Description This track displays translated blat alignments of vertebrate and invertebrate mRNA in GenBank from organisms other than rat. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. In dense display mode, the items that are more darkly shaded indicate matches of better quality. The strand information (+/-) for this track is in two parts. The first + indicates the orientation of the query sequence whose translated protein produced the match (here always 5' to 3', hence +). The second + or - indicates the orientation of the matching translated genomic sequence. Because the two orientations of a DNA sequence give different predicted protein sequences, there are four combinations. ++ is not the same as --, nor is +- the same as -+. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: Type a term in one or more of the text boxes to filter the mRNA display. For example, to apply the filter to all mRNAs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Multiple terms may be entered at once, separated by a space. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only mRNAs that match all filter criteria will be highlighted. If "or" is selected, mRNAs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display mRNAs that match the filter criteria. If "include" is selected, the browser will display only those mRNAs that match the filter criteria. This track may also be configured to display codon coloring, a feature that allows the user to quickly compare mRNAs against the genomic sequence. For more information about this option, go to the Codon and Base Coloring for Alignment Tracks page. Several types of alignment gap may also be colored; for more information, go to the Alignment Insertion/Deletion Display Options page. Methods The mRNAs were aligned against the rat genome using translated blat. When a single mRNA aligned in multiple places, the alignment having the highest base identity was found. Only those alignments having a base identity level within 1% of the best and at least 25% base identity with the genomic sequence were kept. Credits The mRNA track was produced at UCSC from mRNA sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2013 Jan;41(Database issue):D36-42. PMID: 23193287; PMC: PMC3531190 Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 polyASeqSites PolyA-Seq Poly(A)-sequencing from Merck Research Laboratories mRNA and EST Description This track displays the location of RNA polyadenylation (polyA) sites based on high-throughput RNA sequencing using the PolyA-seq protocol. PolyA-Seq data is strand-specific, therefore two tracks are provided for each tissue. PolyA site positions correspond to a single base, namely the ends of read alignments immediately upstream of the polyadenylation site. The data provided in this track consists of filtered polyA sites (see Methods). When multiple sites occurred within a 30-bp window on the same strand, the sum of the reads was attributed to the site with the most reads. Units are in reads per million (RPM) aligned. To obtain read counts, multiply RPM values by the total number of filtered reads for the corresponding experiment: Sample Filtered reads Brain 5549424 Testis 7466688 Display Conventions and Configuration These tracks may be configured in a variety of ways to highlight different aspects of the displayed data. The graphical configuration options are shown at the top of the track description page. For more information, see Configuring graph-based tracks. In the full and pack display modes, forward-strand tracks are shown in red and reverse-strand tracks are shown in black. In the squish and dense display modes, intensity is represented in grayscale (the darker the shading, the higher the intensity). To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide. Methods A detailed explanation of the experimental methods is provided at NCBI's Gene Expression Omnibus under accession GSE30198. Briefly, PolyA+ RNA was reverse-transcribed using a T(10)VN primer and strand-specific universal adapters, amplified, and sequenced on an Illumina GAIIx sequencer. Reads were reverse-complemented, aligned to the corresponding reference genome and splice junctions, and retained only if they aligned uniquely. 3' ends of alignments were considered polyA sites. Sites were then filtered using downstream base frequency matrices for true- and false-positive sites determined from a modified experiment based on a T(10) primer (i.e., excluding the 3' VN). When multiple filtered sites occurred within a 30-nt window on the same strand, read counts were summed and attributed to the most abundant peak. For each tissue, read counts were then divided by the total number of reads, in millions, from all filtered sites. Credits These data were generated at Merck Research Laboratories and submitted by Adnan Derti and Tomas Babak. Data Release Policy No restrictions. polyASeqSitesSignalView Signal Poly(A)-sequencing from Merck Research Laboratories mRNA and EST polyASeqSitesTestisRev PolyA-Seq Testis Poly(A)-tail sequencing of Testis from Merck (Rev strand) mRNA and EST polyASeqSitesTestisFwd PolyA-Seq Testis Poly(A)-tail sequencing of Testis from Merck (Fwd strand) mRNA and EST polyASeqSitesBrainRev PolyA-Seq Brain Poly(A)-tail sequencing of Brain from Merck (Rev strand) mRNA and EST polyASeqSitesBrainFwd PolyA-Seq Brain Poly(A)-tail sequencing of Brain from Merck (Fwd strand) mRNA and EST est Rat ESTs Rat ESTs Including Unspliced mRNA and EST Description This track shows alignments between rat expressed sequence tags (ESTs) in GenBank and the genome. ESTs are single-read sequences, typically about 500 bases in length, that usually represent fragments of transcribed genes. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. In dense display mode, the items that are more darkly shaded indicate matches of better quality. The strand information (+/-) indicates the direction of the match between the EST and the matching genomic sequence. It bears no relationship to the direction of transcription of the RNA with which it might be associated. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: Type a term in one or more of the text boxes to filter the EST display. For example, to apply the filter to all ESTs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Multiple terms may be entered at once, separated by a space. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only ESTs that match all filter criteria will be highlighted. If "or" is selected, ESTs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display ESTs that match the filter criteria. If "include" is selected, the browser will display only those ESTs that match the filter criteria. This track may also be configured to display base labeling, a feature that allows the user to display all bases in the aligning sequence or only those that differ from the genomic sequence. For more information about this option, go to the Base Coloring for Alignment Tracks page. Several types of alignment gap may also be colored; for more information, go to the Alignment Insertion/Deletion Display Options page. Methods To make an EST, RNA is isolated from cells and reverse transcribed into cDNA. Typically, the cDNA is cloned into a plasmid vector and a read is taken from the 5' and/or 3' primer. For most — but not all — ESTs, the reverse transcription is primed by an oligo-dT, which hybridizes with the poly-A tail of mature mRNA. The reverse transcriptase may or may not make it to the 5' end of the mRNA, which may or may not be degraded. In general, the 3' ESTs mark the end of transcription reasonably well, but the 5' ESTs may end at any point within the transcript. Some of the newer cap-selected libraries cover transcription start reasonably well. Before the cap-selection techniques emerged, some projects used random rather than poly-A priming in an attempt to retrieve sequence distant from the 3' end. These projects were successful at this, but as a side effect also deposited sequences from unprocessed mRNA and perhaps even genomic sequences into the EST databases. Even outside of the random-primed projects, there is a degree of non-mRNA contamination. Because of this, a single unspliced EST should be viewed with considerable skepticism. To generate this track, rat ESTs from GenBank were aligned against the genome using blat. Note that the maximum intron length allowed by blat is 750,000 bases, which may eliminate some ESTs with very long introns that might otherwise align. When a single EST aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence were kept. Credits This track was produced at UCSC from EST sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2013 Jan;41(Database issue):D36-42. PMID: 23193287; PMC: PMC3531190 Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 rgdQtl RGD QTL Rat Quantitative Trait Locus from RGD Phenotype and Disease Associations Description A quantitative trait locus (QTL) is a polymorphic locus that contains alleles which differentially affect the expression of a continuously distributed phenotypic trait. Usually a QTL is a marker described by statistical association to quantitative variation in the particular phenotypic trait that is thought to be controlled by the cumulative action of alleles at multiple loci. For a comprehensive review of QTL mapping techniques in the rat, see Rapp, J.P. (2000) in the References section below. Credits Thanks to the RGD for providing this annotation. RGD is funded by grant HL64541 entitled "Rat Genome Database", awarded to Dr. Howard J Jacob, Medical College of Wisconsin, from the National Heart Lung and Blood Institute (NHLBI) of the National Institutes of Health (NIH). References Rapp, J.P. Genetic Analysis of Inherited Hypertension in the Rat. Physiol Rev. 2000 Jan;80(1):135-72. PMID: 10617767 simpleRepeat Simple Repeats Simple Tandem Repeats by TRF Variation and Repeats Description This track displays simple tandem repeats (possibly imperfect repeats) located by Tandem Repeats Finder (TRF) which is specialized for this purpose. These repeats can occur within coding regions of genes and may be quite polymorphic. Repeat expansions are sometimes associated with specific diseases. Methods For more information about the TRF program, see Benson (1999). Credits TRF was written by Gary Benson. References Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999 Jan 15;27(2):573-80. PMID: 9862982; PMC: PMC148217 tRNAs tRNA Genes Transfer RNA Genes Identified with tRNAscan-SE Genes and Gene Predictions Description This track displays tRNA genes predicted by using tRNAscan-SE v.1.23. tRNAscan-SE is an integrated program that uses tRNAscan (Fichant) and an A/B box motif detection algorithm (Pavesi) as pre-filters to obtain an initial list of tRNA candidates. The program then filters these candidates with a covariance model-based search program COVE (Eddy) to obtain a highly specific set of primary sequence and secondary structure predictions that represent 99-100% of true tRNAs with a false positive rate of fewer than 1 per 15 gigabases. Detailed tRNA annotations for eukaryotes, bacteria, and archaea are available at Genomic tRNA Database (GtRNAdb). What does the tRNAscan-SE score mean? Anything with a score above 20 bits is likely to be derived from a tRNA, although this does not indicate whether the tRNA gene still encodes a functional tRNA molecule (i.e. tRNA-derived SINES probably do not function in the ribosome in translation). Vertebrate tRNAs with scores of >60.0 (bits) are likely to encode functional tRNA genes, and those with scores below ~45 have sequence or structural features that indicate they probably are no longer involved in translation. tRNAs with scores between 45-60 bits are in the "grey" zone, and may or may not have all the required features to be functional. In these cases, tRNAs should be inspected carefully for loss of specific primary or secondary structure features (usually in alignments with other genes of the same isotype), in order to make a better educated guess. These rough score range guides are not exact, nor are they based on specific biochemical studies of atypical tRNA features, so please treat them accordingly. Please note that tRNA genes marked as "Pseudo" are low scoring predictions that are mostly pseudogenes or tRNA-derived elements. These genes do not usually fold into a typical cloverleaf tRNA secondary structure and the provided images of the predicted secondary structures may appear rotated. Credits Both tRNAscan-SE and GtRNAdb are maintained by the Lowe Lab at UCSC. Cove-predicted tRNA secondary structures were rendered by NAVIEW (c) 1988 Robert E. Bruccoleri. References When making use of these data, please cite the following articles: Chan PP, Lowe TM. GtRNAdb: a database of transfer RNA genes detected in genomic sequence. Nucleic Acids Res. 2009 Jan;37(Database issue):D93-7. PMID: 18984615; PMC: PMC2686519 Eddy SR, Durbin R. RNA sequence analysis using covariance models. Nucleic Acids Res. 1994 Jun 11;22(11):2079-88. PMID: 8029015; PMC: PMC308124 Fichant GA, Burks C. Identifying potential tRNA genes in genomic DNA sequences. J Mol Biol. 1991 Aug 5;220(3):659-71. PMID: 1870126 Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997 Mar 1;25(5):955-64. PMID: 9023104; PMC: PMC146525 Pavesi A, Conterio F, Bolchi A, Dieci G, Ottonello S. Identification of new eukaryotic tRNA genes in genomic DNA databases by a multistep weight matrix analysis of transcriptional control regions. Nucleic Acids Res. 1994 Apr 11;22(7):1247-56. PMID: 8165140; PMC: PMC523650 pubsBingBlat Web Sequences DNA Sequences in Web Pages Indexed by Bing.com / Microsoft Research Literature Description This track is powered by Bing! and Microsoft Research. UCSC collaborators at Microsoft Research (Bob Davidson, David Heckerman) implemented a DNA sequence detector and processed thirty days of web crawler updates, which covers roughly 40 billion webpages. The results were mapped with BLAT to the genome. Display Convention and Configuration The track indicates the location of sequences on web pages mapped to the genome, labelled with the web page URL. If the web page includes invisible meta data, then the first author and a year of publication is shown instead of the URL. All matches of one web page are grouped ("chained") together. Web page titles are shown when you move the mouse cursor over the features. Thicker parts of the features (exons) represent matching sequences, connected by thin lines to matches from the same web page within 30 kbp. The subtrack "individual sequence matches" activates automatically when the user clicks a sequence match and follows the link "Show sequence matches individually" from the details page. Mouse-overs show flanking text around the sequence, and clicking features links to BLAT alignments. - --> Methods All file types (PDFs and various Microsoft Office formats) were converted to text. The results were processed to find groups of words that look like DNA/RNA sequences. These were then mapped with BLAT to the human genome using the same software as used in the Publication track. Credits DNA sequence detection by Bob Davidson at Microsoft Research. HTML parsing and sequence mapping by Maximilian Haeussler at UCSC. References Aerts S, Haeussler M, van Vooren S, Griffith OL, Hulpiau P, Jones SJ, Montgomery SB, Bergman CM, Open Regulatory Annotation Consortium. Text-mining assisted regulatory annotation. Genome Biol. 2008;9(2):R31. PMID: 18271954; PMC: PMC2374703 Haeussler M, Gerner M, Bergman CM. Annotating genes and genomes with DNA sequences extracted from biomedical articles. Bioinformatics. 2011 Apr 1;27(7):980-6. PMID: 21325301; PMC: PMC3065681 Van Noorden R. Trouble at the text mine. Nature. 2012 Mar 7;483(7388):134-5. cons9way Conservation 9-Way Multiz Alignment & Conservation Comparative Genomics Description This track shows a measure of evolutionary conservation in 9 vertebrates, including mammalian, amphibian, bird, and fish species, based on a phylogenetic hidden Markov model, phastCons (Siepel et al., 2005). Multiz alignments of the following assemblies were used to generate this track: rat (Nov. 2004 (Baylor 3.4/rn4), rn4) mouse (Feb 2006, mm8) human (Mar 2006, hg18) dog (May 2005, canFam2) cow (Mar 2005, bosTau2) opossum (Jan 2006, monDom4) chicken (Feb 2004, galGal2) frog (Oct 2004, xenTro1) zebrafish (May 2005, danRer3) Display Conventions and Configuration In full and pack display modes, conservation scores are displayed as a "wiggle" (histogram), where the height reflects the size of the score. Pairwise alignments of each species to the rat genome are displayed below as a grayscale density plot (in pack mode) or as a "wiggle" (in full mode) that indicates alignment quality. In dense display mode, conservation is shown in grayscale using darker values to indicate higher levels of overall conservation as scored by phastCons. The conservation wiggle can be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options. Checkboxes in the track configuration section allow excluding species from the pairwise display; however, this does not remove them from the conservation score display. To view detailed information about the alignments at a specific position, zoom in the display to 30,000 or fewer bases, then click on the alignment. Gap Annotation The "Display chains between alignments" configuration option enables display of gaps between alignment blocks in the pairwise alignments in a manner similar to the Chain track display. The following conventions are used: Single line: No bases in the aligned species. Possibly due to a lineage-specific insertion between the aligned blocks in the rat genome or a lineage-specific deletion between the aligned blocks in the aligning species. Double line: Aligning species has one or more unalignable bases in the gap region. Possibly due to excessive evolutionary distance between species or independent indels in the region between the aligned blocks in both species. Pale yellow coloring: Aligning species has Ns in the gap region. Reflects uncertainty in the relationship between the DNA of both species, due to lack of sequence in relevant portions of the aligning species. Genomic Breaks Discontinuities in the genomic context (chromosome, scaffold or region) of the aligned DNA in the aligning species are shown as follows: Vertical blue bar: Represents a discontinuity that persists indefinitely on either side, e.g. a large region of DNA on either side of the bar comes from a different chromosome in the aligned species due to a large scale rearrangement. Green square brackets: Enclose shorter alignments consisting of DNA from one genomic context in the aligned species nested inside a larger chain of alignments from a different genomic context. The alignment within the brackets may represent a short misalignment, a lineage-specific insertion of a transposon in the rat genome that aligns to a paralogous copy somewhere else in the aligned species, or other similar occurrence. Base Level When zoomed-in to the base-level display, the track shows the base composition of each alignment. The numbers and symbols on the Gaps line indicate the lengths of gaps in the rat sequence at those alignment positions relative to the longest non-rat sequence. If there is sufficient space in the display, the size of the gap is shown; if not, and if the gap size is a multiple of 3, a "*" is displayed, otherwise "+" is shown. Codon translation is available in base-level display mode if the displayed region is identified as a coding segment. To display this annotation, select the species for translation from the pull-down menu in the Codon Translation configuration section at the top of the page. Then, select one of the following modes: No codon translation: The gene annotation is not used; the bases are displayed without translation. Use default species reading frames for translation: The annotations from the genome displayed in the Default species to establish reading frame pull-down menu are used to translate all the aligned species present in the alignment. Use reading frames for species if available, otherwise no translation: Codon translation is performed only for those species where the region is annotated as protein coding. Use reading frames for species if available, otherwise use default species: Codon translation is done on those species that are annotated as being protein coding over the aligned region using species-specific annotation; the remaining species are translated using the default species annotation. Codon translation uses the following gene tracks as the basis for translation, depending on the species chosen: Gene TrackSpecies Known Genesrat, mouse, human RefSeq Geneschicken MGC GenesX. tropicalis mRNAsdog, cow, zebrafish not translatedopossum Methods Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. The pairwise alignments were then multiply aligned using multiz, following the ordering of the species tree diagrammed above. The resulting multiple alignments were then assigned conservation scores by phastCons, using a tree model with branch lengths derived from the ENCODE project Multi-Species Sequence Analysis group, September 2005 tree model. This tree was generated from TBA alignments over 23 vertebrate species and is based on 4D sites. The phastCons program computes conservation scores based on a phylo-HMM, a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for conserved regions and a state for non-conserved regions. The value plotted at each site is the posterior probability that the corresponding alignment column was "generated" by the conserved state of the phylo-HMM. These scores reflect the phylogeny (including branch lengths) of the species in question, a continuous-time Markov model of the nucleotide substitution process, and a tendency for conservation levels to be autocorrelated along the genome (i.e., to be similar at adjacent sites). The general reversible (REV) substitution model was used. Note that, unlike many conservation-scoring programs, phastCons does not rely on a sliding window of fixed size, so short highly-conserved regions and long moderately conserved regions can both obtain high scores. More information about phastCons can be found in Siepel et al. (2005). PhastCons currently treats alignment gaps as missing data, which sometimes has the effect of producing undesirably high conservation scores in gappy regions of the alignment. We are looking at several possible ways of improving the handling of alignment gaps. Credits This track was created using the following programs: Alignment tools: blastz and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group Chaining and Netting: axtChain, chainNet by Jim Kent at UCSC Conservation scoring: PhastCons, phyloFit, tree_doctor, msa_view by Adam Siepel while at UCSC, now at Cold Spring Harbor Laboratory MAF Annotation tools: mafAddIRows by Brian Raney, UCSC; genePredToMafFrames by Mark Diekhans, UCSC Tree image generator: phyloPng by Galt Barber, UCSC Conservation track display: Kate Rosenbloom, Hiram Clawson (wiggle display), and Brian Raney (gap annotation and codon framing) at UCSC The phylogenetic tree is based on Murphy et al. (2001) and general consensus in the vertebrate phylogeny community. References Phylo-HMMs and phastCons: Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996 Jan;13(1):93-104. PMID: 8583911 Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216 Siepel A, Haussler D. Phylogenetic Hidden Markov Models. In: Nielsen R, editor. Statistical Methods in Molecular Evolution. New York: Springer; 2005. pp. 325-351. Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995 Feb;139(2):993-1005. PMID: 7713447; PMC: PMC1206396 Chain/Net: Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Multiz: Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317 Blastz: Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 Phylogenetic Tree: Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ, de Jong WW, Springer MS. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science. 2001 Dec 14;294(5550):2348-51. PMID: 11743200 cons9wayViewalign Multiz Alignments 9-Way Multiz Alignment & Conservation Comparative Genomics multiz9way Multiz Align Multiz Alignments of 9 Species Comparative Genomics Description This track shows a measure of evolutionary conservation in 9 vertebrates, including mammalian, amphibian, bird, and fish species, based on a phylogenetic hidden Markov model, phastCons (Siepel et al., 2005). Multiz alignments of the following assemblies were used to generate this track: rat (Nov. 2004 (Baylor 3.4/rn4), rn4) mouse (Feb 2006, mm8) human (Mar 2006, hg18) dog (May 2005, canFam2) cow (Mar 2005, bosTau2) opossum (Jan 2006, monDom4) chicken (Feb 2004, galGal2) frog (Oct 2004, xenTro1) zebrafish (May 2005, danRer3) Display Conventions and Configuration In full and pack display modes, conservation scores are displayed as a "wiggle" (histogram), where the height reflects the size of the score. Pairwise alignments of each species to the rat genome are displayed below as a grayscale density plot (in pack mode) or as a "wiggle" (in full mode) that indicates alignment quality. In dense display mode, conservation is shown in grayscale using darker values to indicate higher levels of overall conservation as scored by phastCons. The conservation wiggle can be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options. Checkboxes in the track configuration section allow excluding species from the pairwise display; however, this does not remove them from the conservation score display. To view detailed information about the alignments at a specific position, zoom in the display to 30,000 or fewer bases, then click on the alignment. Gap Annotation The "Display chains between alignments" configuration option enables display of gaps between alignment blocks in the pairwise alignments in a manner similar to the Chain track display. The following conventions are used: Single line: No bases in the aligned species. Possibly due to a lineage-specific insertion between the aligned blocks in the rat genome or a lineage-specific deletion between the aligned blocks in the aligning species. Double line: Aligning species has one or more unalignable bases in the gap region. Possibly due to excessive evolutionary distance between species or independent indels in the region between the aligned blocks in both species. Pale yellow coloring: Aligning species has Ns in the gap region. Reflects uncertainty in the relationship between the DNA of both species, due to lack of sequence in relevant portions of the aligning species. Genomic Breaks Discontinuities in the genomic context (chromosome, scaffold or region) of the aligned DNA in the aligning species are shown as follows: Vertical blue bar: Represents a discontinuity that persists indefinitely on either side, e.g. a large region of DNA on either side of the bar comes from a different chromosome in the aligned species due to a large scale rearrangement. Green square brackets: Enclose shorter alignments consisting of DNA from one genomic context in the aligned species nested inside a larger chain of alignments from a different genomic context. The alignment within the brackets may represent a short misalignment, a lineage-specific insertion of a transposon in the rat genome that aligns to a paralogous copy somewhere else in the aligned species, or other similar occurrence. Base Level When zoomed-in to the base-level display, the track shows the base composition of each alignment. The numbers and symbols on the Gaps line indicate the lengths of gaps in the rat sequence at those alignment positions relative to the longest non-rat sequence. If there is sufficient space in the display, the size of the gap is shown; if not, and if the gap size is a multiple of 3, a "*" is displayed, otherwise "+" is shown. Codon translation is available in base-level display mode if the displayed region is identified as a coding segment. To display this annotation, select the species for translation from the pull-down menu in the Codon Translation configuration section at the top of the page. Then, select one of the following modes: No codon translation: The gene annotation is not used; the bases are displayed without translation. Use default species reading frames for translation: The annotations from the genome displayed in the Default species to establish reading frame pull-down menu are used to translate all the aligned species present in the alignment. Use reading frames for species if available, otherwise no translation: Codon translation is performed only for those species where the region is annotated as protein coding. Use reading frames for species if available, otherwise use default species: Codon translation is done on those species that are annotated as being protein coding over the aligned region using species-specific annotation; the remaining species are translated using the default species annotation. Codon translation uses the following gene tracks as the basis for translation, depending on the species chosen: Gene TrackSpecies Known Genesrat, mouse, human RefSeq Geneschicken MGC GenesX. tropicalis mRNAsdog, cow, zebrafish not translatedopossum Methods Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. The pairwise alignments were then multiply aligned using multiz, following the ordering of the species tree diagrammed above. The resulting multiple alignments were then assigned conservation scores by phastCons, using a tree model with branch lengths derived from the ENCODE project Multi-Species Sequence Analysis group, September 2005 tree model. This tree was generated from TBA alignments over 23 vertebrate species and is based on 4D sites. The phastCons program computes conservation scores based on a phylo-HMM, a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for conserved regions and a state for non-conserved regions. The value plotted at each site is the posterior probability that the corresponding alignment column was "generated" by the conserved state of the phylo-HMM. These scores reflect the phylogeny (including branch lengths) of the species in question, a continuous-time Markov model of the nucleotide substitution process, and a tendency for conservation levels to be autocorrelated along the genome (i.e., to be similar at adjacent sites). The general reversible (REV) substitution model was used. Note that, unlike many conservation-scoring programs, phastCons does not rely on a sliding window of fixed size, so short highly-conserved regions and long moderately conserved regions can both obtain high scores. More information about phastCons can be found in Siepel et al. (2005). PhastCons currently treats alignment gaps as missing data, which sometimes has the effect of producing undesirably high conservation scores in gappy regions of the alignment. We are looking at several possible ways of improving the handling of alignment gaps. Credits This track was created using the following programs: Alignment tools: blastz and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group Chaining and Netting: axtChain, chainNet by Jim Kent at UCSC Conservation scoring: PhastCons, phyloFit, tree_doctor, msa_view by Adam Siepel while at UCSC, now at Cold Spring Harbor Laboratory MAF Annotation tools: mafAddIRows by Brian Raney, UCSC; genePredToMafFrames by Mark Diekhans, UCSC Tree image generator: phyloPng by Galt Barber, UCSC Conservation track display: Kate Rosenbloom, Hiram Clawson (wiggle display), and Brian Raney (gap annotation and codon framing) at UCSC The phylogenetic tree is based on Murphy et al. (2001) and general consensus in the vertebrate phylogeny community. References Phylo-HMMs and phastCons: Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996 Jan;13(1):93-104. PMID: 8583911 Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216 Siepel A, Haussler D. Phylogenetic Hidden Markov Models. In: Nielsen R, editor. Statistical Methods in Molecular Evolution. New York: Springer; 2005. pp. 325-351. Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995 Feb;139(2):993-1005. PMID: 7713447; PMC: PMC1206396 Chain/Net: Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Multiz: Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317 Blastz: Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 Phylogenetic Tree: Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ, de Jong WW, Springer MS. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science. 2001 Dec 14;294(5550):2348-51. PMID: 11743200 cons9wayViewphastcons Element Conservation (phastCons) 9-Way Multiz Alignment & Conservation Comparative Genomics phastCons9way 9 Species Cons 9 Species Conservation by PhastCons Comparative Genomics cons9wayViewelements Conserved Elements 9-Way Multiz Alignment & Conservation Comparative Genomics phastConsElements9way 9 Species El 9 Species Conserved Elements Comparative Genomics Description This track shows predictions of conserved elements produced by the phastCons program. PhastCons is part of the PHAST (PHylogenetic Analysis with Space/Time models) package. The predictions are based on a phylogenetic hidden Markov model (phylo-HMM), a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next. Methods Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. A multiple alignment was then constructed from these pairwise alignments using multiz. Predictions of conserved elements were then obtained by running phastCons on the multiple alignments with the --most-conserved option. PhastCons constructs a two-state phylo-HMM with a state for conserved regions and a state for non-conserved regions. The two states share a single phylogenetic model, except that the branch lengths of the tree associated with the conserved state are multiplied by a constant scaling factor rho (0 <= rho <= 1). The free parameters of the phylo-HMM, including the scaling factor rho, are estimated from the data by maximum likelihood using an EM algorithm. This procedure is subject to certain constraints on the "coverage" of the genome by conserved elements and the "smoothness" of the conservation scores. Details can be found in Siepel et al. (2005). The predicted conserved elements are segments of the alignment that are likely to have been "generated" by the conserved state of the phylo-HMM. Each element is assigned a log-odds score equal to its log probability under the conserved model minus its log probability under the non-conserved model. The "score" field associated with this track contains transformed log-odds scores, taking values between 0 and 1000. (The scores are transformed using a monotonic function of the form a * log(x) + b.) The raw log odds scores are retained in the "name" field and can be seen on the details page or in the browser when the track's display mode is set to "pack" or "full". Credits This track was created at UCSC using the following programs: Blastz and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group. AxtBest, axtChain, chainNet, netSyntenic, and netClass by Jim Kent at UCSC. PhastCons by Adam Siepel at Cornell University. References PhastCons: Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216 Chain/Net: Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Multiz: Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317 Blastz: Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMm9 Mouse Chain/Net Mouse (July 2007 (NCBI37/mm9)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of mouse (July 2007 (NCBI37/mm9)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both mouse and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the mouse assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best mouse/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The mouse sequence used in this annotation is from the July 2007 (NCBI37/mm9) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the mouse/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single mouse chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A56-109-45-137 C-109100-103-45 G-45-103100-109 T-137-45-10956 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMm9Viewnet Net Mouse (July 2007 (NCBI37/mm9)), Chain and Net Alignments Comparative Genomics netMm9 Mouse Net Mouse (July 2007 (NCBI37/mm9)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of mouse (July 2007 (NCBI37/mm9)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both mouse and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the mouse assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best mouse/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The mouse sequence used in this annotation is from the July 2007 (NCBI37/mm9) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the mouse/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single mouse chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A56-109-45-137 C-109100-103-45 G-45-103100-109 T-137-45-10956 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMm9Viewchain Chain Mouse (July 2007 (NCBI37/mm9)), Chain and Net Alignments Comparative Genomics chainMm9 Mouse Chain Mouse (July 2007 (NCBI37/mm9)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of mouse (July 2007 (NCBI37/mm9)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both mouse and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the mouse assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best mouse/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The mouse sequence used in this annotation is from the July 2007 (NCBI37/mm9) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the mouse/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single mouse chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A56-109-45-137 C-109100-103-45 G-45-103100-109 T-137-45-10956 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetCavPor3 Guinea pig Chain/Net Guinea pig (Feb. 2008 (Broad/cavPor3)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of guinea pig (Feb. 2008 (Broad/cavPor3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both guinea pig and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the guinea pig assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best guinea pig/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The guinea pig sequence used in this annotation is from the Feb. 2008 (Broad/cavPor3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the guinea pig/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single guinea pig chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetCavPor3Viewnet Net Guinea pig (Feb. 2008 (Broad/cavPor3)), Chain and Net Alignments Comparative Genomics netCavPor3 Guinea pig Net Guinea pig (Feb. 2008 (Broad/cavPor3)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of guinea pig (Feb. 2008 (Broad/cavPor3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both guinea pig and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the guinea pig assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best guinea pig/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The guinea pig sequence used in this annotation is from the Feb. 2008 (Broad/cavPor3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the guinea pig/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single guinea pig chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetCavPor3Viewchain Chain Guinea pig (Feb. 2008 (Broad/cavPor3)), Chain and Net Alignments Comparative Genomics chainCavPor3 Guinea pig Chain Guinea pig (Feb. 2008 (Broad/cavPor3)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of guinea pig (Feb. 2008 (Broad/cavPor3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both guinea pig and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the guinea pig assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best guinea pig/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The guinea pig sequence used in this annotation is from the Feb. 2008 (Broad/cavPor3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the guinea pig/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single guinea pig chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetRheMac2 Rhesus Chain/Net Rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both rhesus and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the rhesus assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best rhesus/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The rhesus sequence used in this annotation is from the Jan. 2006 (MGSC Merged 1.0/rheMac2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the rhesus/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single rhesus chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetRheMac2Viewnet Net Rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)), Chain and Net Alignments Comparative Genomics netRheMac2 Rhesus Net Rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)) Alignment net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both rhesus and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the rhesus assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best rhesus/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The rhesus sequence used in this annotation is from the Jan. 2006 (MGSC Merged 1.0/rheMac2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the rhesus/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single rhesus chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetRheMac2Viewchain Chain Rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)), Chain and Net Alignments Comparative Genomics chainRheMac2 Rhesus Chain Rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of rhesus (Jan. 2006 (MGSC Merged 1.0/rheMac2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both rhesus and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the rhesus assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best rhesus/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The rhesus sequence used in this annotation is from the Jan. 2006 (MGSC Merged 1.0/rheMac2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the rhesus/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single rhesus chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetPanTro2 panTro2 Chain/Net Chimp (Mar. 2006 (CGSC 2.1/panTro2)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of chimp (Mar. 2006 (CGSC 2.1/panTro2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both chimp and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the chimp assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best chimp/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The chimp sequence used in this annotation is from the Mar. 2006 (CGSC 2.1/panTro2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the chimp/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single chimp chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetPanTro2Viewnet Net Chimp (Mar. 2006 (CGSC 2.1/panTro2)), Chain and Net Alignments Comparative Genomics netPanTro2 Chimp Net Chimp (Mar. 2006 (CGSC 2.1/panTro2)) Alignment net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of chimp (Mar. 2006 (CGSC 2.1/panTro2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both chimp and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the chimp assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best chimp/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The chimp sequence used in this annotation is from the Mar. 2006 (CGSC 2.1/panTro2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the chimp/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single chimp chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetPanTro2Viewchain Chain Chimp (Mar. 2006 (CGSC 2.1/panTro2)), Chain and Net Alignments Comparative Genomics chainPanTro2 Chimp Chain Chimp (Mar. 2006 (CGSC 2.1/panTro2)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of chimp (Mar. 2006 (CGSC 2.1/panTro2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both chimp and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the chimp assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best chimp/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The chimp sequence used in this annotation is from the Mar. 2006 (CGSC 2.1/panTro2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the chimp/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single chimp chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetHg19 Human Chain/Net Human (Feb. 2009 (GRCh37/hg19)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of human (Feb. 2009 (GRCh37/hg19)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both human and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the human assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best human/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The human sequence used in this annotation is from the Feb. 2009 (GRCh37/hg19) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the human/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single human chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetHg19Viewnet Net Human (Feb. 2009 (GRCh37/hg19)), Chain and Net Alignments Comparative Genomics netHg19 Human Net Human (Feb. 2009 (GRCh37/hg19)) Alignment net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of human (Feb. 2009 (GRCh37/hg19)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both human and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the human assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best human/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The human sequence used in this annotation is from the Feb. 2009 (GRCh37/hg19) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the human/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single human chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetHg19Viewchain Chain Human (Feb. 2009 (GRCh37/hg19)), Chain and Net Alignments Comparative Genomics chainHg19 Human Chain Human (Feb. 2009 (GRCh37/hg19)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of human (Feb. 2009 (GRCh37/hg19)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both human and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the human assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best human/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The human sequence used in this annotation is from the Feb. 2009 (GRCh37/hg19) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the human/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single human chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetAilMel1 Panda Chain/Net Panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both panda and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the panda assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best panda/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The panda sequence used in this annotation is from the Dec. 2009 (BGI-Shenzhen 1.0/ailMel1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the panda/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single panda chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetAilMel1Viewnet Net Panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)), Chain and Net Alignments Comparative Genomics netAilMel1 Panda Net Panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both panda and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the panda assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best panda/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The panda sequence used in this annotation is from the Dec. 2009 (BGI-Shenzhen 1.0/ailMel1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the panda/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single panda chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetAilMel1Viewchain Chain Panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)), Chain and Net Alignments Comparative Genomics chainAilMel1 Panda Chain Panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of panda (Dec. 2009 (BGI-Shenzhen 1.0/ailMel1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both panda and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the panda assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best panda/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The panda sequence used in this annotation is from the Dec. 2009 (BGI-Shenzhen 1.0/ailMel1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the panda/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single panda chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetCanFam2 Dog Chain/Net Dog (May 2005 (Broad/canFam2)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of dog (May 2005 (Broad/canFam2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both dog and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the dog assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best dog/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The dog sequence used in this annotation is from the May 2005 (Broad/canFam2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the dog/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single dog chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "1000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetCanFam2Viewnet Net Dog (May 2005 (Broad/canFam2)), Chain and Net Alignments Comparative Genomics netCanFam2 Dog Net Dog (May 2005 (Broad/canFam2)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of dog (May 2005 (Broad/canFam2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both dog and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the dog assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best dog/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The dog sequence used in this annotation is from the May 2005 (Broad/canFam2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the dog/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single dog chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "1000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetCanFam2Viewchain Chain Dog (May 2005 (Broad/canFam2)), Chain and Net Alignments Comparative Genomics chainCanFam2 Dog Chain Dog (May 2005 (Broad/canFam2)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of dog (May 2005 (Broad/canFam2)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both dog and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the dog assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best dog/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The dog sequence used in this annotation is from the May 2005 (Broad/canFam2) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the dog/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single dog chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "1000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetEquCab1 Horse Chain/Net Horse (Jan. 2007 (Broad/equCab1)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of horse (Jan. 2007 (Broad/equCab1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both horse and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the horse assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best horse/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The horse sequence used in this annotation is from the Jan. 2007 (Broad/equCab1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the horse/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single horse chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetEquCab1Viewnet Net Horse (Jan. 2007 (Broad/equCab1)), Chain and Net Alignments Comparative Genomics netEquCab1 Horse Net Horse (Jan. 2007 (Broad/equCab1)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of horse (Jan. 2007 (Broad/equCab1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both horse and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the horse assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best horse/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The horse sequence used in this annotation is from the Jan. 2007 (Broad/equCab1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the horse/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single horse chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetEquCab1Viewchain Chain Horse (Jan. 2007 (Broad/equCab1)), Chain and Net Alignments Comparative Genomics chainEquCab1 Horse Chain Horse (Jan. 2007 (Broad/equCab1)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of horse (Jan. 2007 (Broad/equCab1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both horse and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the horse assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best horse/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The horse sequence used in this annotation is from the Jan. 2007 (Broad/equCab1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the horse/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single horse chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetBosTau7 Cow Chain/Net Cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both cow and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the cow assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best cow/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The cow sequence used in this annotation is from the Oct. 2011 (Baylor Btau_4.6.1/bosTau7) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the cow/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single cow chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetBosTau7Viewnet Net Cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)), Chain and Net Alignments Comparative Genomics netBosTau7 Cow Net Cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both cow and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the cow assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best cow/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The cow sequence used in this annotation is from the Oct. 2011 (Baylor Btau_4.6.1/bosTau7) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the cow/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single cow chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetBosTau7Viewchain Chain Cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)), Chain and Net Alignments Comparative Genomics chainBosTau7 Cow Chain Cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of cow (Oct. 2011 (Baylor Btau_4.6.1/bosTau7)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both cow and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the cow assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best cow/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The cow sequence used in this annotation is from the Oct. 2011 (Baylor Btau_4.6.1/bosTau7) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the cow/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single cow chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "3000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=medium tableSize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 tGap 350 425 450 600 900 2900 22900 57900 117900 217900 317900 bothGap 750 825 850 1000 1300 3300 23300 58300 118300 218300 318300 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMonDom4 Opossum Chain/Net Opossum (Jan. 2006 (Broad/monDom4)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of opossum (Jan. 2006 (Broad/monDom4)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both opossum and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the opossum assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best opossum/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The opossum sequence used in this annotation is from the Jan. 2006 (Broad/monDom4) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the opossum/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single opossum chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMonDom4Viewnet Net Opossum (Jan. 2006 (Broad/monDom4)), Chain and Net Alignments Comparative Genomics netMonDom4 Opossum Net Opossum (Jan. 2006 (Broad/monDom4)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of opossum (Jan. 2006 (Broad/monDom4)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both opossum and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the opossum assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best opossum/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The opossum sequence used in this annotation is from the Jan. 2006 (Broad/monDom4) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the opossum/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single opossum chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMonDom4Viewchain Chain Opossum (Jan. 2006 (Broad/monDom4)), Chain and Net Alignments Comparative Genomics chainMonDom4 Opossum Chain Opossum (Jan. 2006 (Broad/monDom4)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of opossum (Jan. 2006 (Broad/monDom4)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both opossum and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the opossum assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best opossum/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The opossum sequence used in this annotation is from the Jan. 2006 (Broad/monDom4) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the opossum/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single opossum chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetGalGal3 Chicken Chain/Net Chicken (May 2006 (WUGSC 2.1/galGal3)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of chicken (May 2006 (WUGSC 2.1/galGal3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both chicken and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the chicken assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best chicken/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The chicken sequence used in this annotation is from the May 2006 (WUGSC 2.1/galGal3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the chicken/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single chicken chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetGalGal3Viewnet Net Chicken (May 2006 (WUGSC 2.1/galGal3)), Chain and Net Alignments Comparative Genomics netGalGal3 Chicken Net Chicken (May 2006 (WUGSC 2.1/galGal3)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of chicken (May 2006 (WUGSC 2.1/galGal3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both chicken and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the chicken assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best chicken/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The chicken sequence used in this annotation is from the May 2006 (WUGSC 2.1/galGal3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the chicken/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single chicken chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetGalGal3Viewchain Chain Chicken (May 2006 (WUGSC 2.1/galGal3)), Chain and Net Alignments Comparative Genomics chainGalGal3 Chicken Chain Chicken (May 2006 (WUGSC 2.1/galGal3)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of chicken (May 2006 (WUGSC 2.1/galGal3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both chicken and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the chicken assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best chicken/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The chicken sequence used in this annotation is from the May 2006 (WUGSC 2.1/galGal3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the chicken/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single chicken chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMelGal1 Turkey Chain/Net Turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both turkey and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the turkey assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best turkey/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The turkey sequence used in this annotation is from the Dec. 2009 (TGC Turkey_2.01/melGal1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the turkey/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single turkey chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMelGal1Viewnet Net Turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)), Chain and Net Alignments Comparative Genomics netMelGal1 Turkey Net Turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both turkey and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the turkey assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best turkey/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The turkey sequence used in this annotation is from the Dec. 2009 (TGC Turkey_2.01/melGal1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the turkey/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single turkey chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetMelGal1Viewchain Chain Turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)), Chain and Net Alignments Comparative Genomics chainMelGal1 Turkey Chain Turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of turkey (Dec. 2009 (TGC Turkey_2.01/melGal1)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both turkey and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the turkey assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best turkey/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The turkey sequence used in this annotation is from the Dec. 2009 (TGC Turkey_2.01/melGal1) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the turkey/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single turkey chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-114-31-123 C-114100-125-31 G-31-125100-114 T-123-31-11491 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetXenTro3 X. tropicalis Chain/Net X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both X. tropicalis and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the X. tropicalis assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best X. tropicalis/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The X. tropicalis sequence used in this annotation is from the Nov. 2009 (JGI 4.2/xenTro3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the X. tropicalis/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single X. tropicalis chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetXenTro3Viewnet Net X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)), Chain and Net Alignments Comparative Genomics netXenTro3 X. tropicalis Net X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both X. tropicalis and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the X. tropicalis assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best X. tropicalis/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The X. tropicalis sequence used in this annotation is from the Nov. 2009 (JGI 4.2/xenTro3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the X. tropicalis/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single X. tropicalis chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetXenTro3Viewchain Chain X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)), Chain and Net Alignments Comparative Genomics chainXenTro3 X. tropicalis Chain X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of X. tropicalis (Nov. 2009 (JGI 4.2/xenTro3)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both X. tropicalis and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the X. tropicalis assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best X. tropicalis/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The X. tropicalis sequence used in this annotation is from the Nov. 2009 (JGI 4.2/xenTro3) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the X. tropicalis/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single X. tropicalis chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetDanRer4 Zebrafish Chain/Net Zebrafish (Mar. 2006 (Zv6/danRer4)), Chain and Net Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of zebrafish (Mar. 2006 (Zv6/danRer4)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both zebrafish and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the zebrafish assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best zebrafish/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The zebrafish sequence used in this annotation is from the Mar. 2006 (Zv6/danRer4) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the zebrafish/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single zebrafish chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetDanRer4Viewnet Net Zebrafish (Mar. 2006 (Zv6/danRer4)), Chain and Net Alignments Comparative Genomics netDanRer4 Zebrafish Net Zebrafish (Mar. 2006 (Zv6/danRer4)) Alignment Net Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of zebrafish (Mar. 2006 (Zv6/danRer4)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both zebrafish and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the zebrafish assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best zebrafish/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The zebrafish sequence used in this annotation is from the Mar. 2006 (Zv6/danRer4) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the zebrafish/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single zebrafish chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainNetDanRer4Viewchain Chain Zebrafish (Mar. 2006 (Zv6/danRer4)), Chain and Net Alignments Comparative Genomics chainDanRer4 Zebrafish Chain Zebrafish (Mar. 2006 (Zv6/danRer4)) Chained Alignments Comparative Genomics Description This track shows regions of the genome that are alignable to other genomes ("chain" subtracks) or in synteny ("net" subtracks). The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown like introns. Chain Track The chain track shows alignments of zebrafish (Mar. 2006 (Zv6/danRer4)) to the rat genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both zebrafish and rat simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the zebrafish assembly or an insertion in the rat assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the rat genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Net Track The net track shows the best zebrafish/rat chain for every part of the rat genome. It is useful for finding syntenic regions, possibly orthologs, and for studying genome rearrangement. The zebrafish sequence used in this annotation is from the Mar. 2006 (Zv6/danRer4) assembly. Display Conventions and Configuration Chain Track By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in box next to: Filter by chromosome. Net Track In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chain track Transposons that have been inserted since the zebrafish/rat split were removed from the assemblies. The abbreviated genomes were aligned with lastz, and the transposons were added back in. The resulting alignments were converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single zebrafish chromosome and a single rat chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:  ACGT A91-90-25-100 C-90100-100-25 G-25-100100-90 T-100-25-9091 Chains scoring below a minimum score of "5000" were discarded; the remaining chains are displayed in this track. The linear gap matrix used with axtChain: -linearGap=loose tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 Net track Chains were derived from lastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits Lastz (previously known as blastz) was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains and nets were created by Robert Baertsch and Jim Kent. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. References Harris, R.S. (2007) Improved pairwise alignment of genomic DNA Ph.D. Thesis, The Pennsylvania State University Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961