cytoBand Chromosome Band Chromosome Bands Mapping and Sequencing Description This track shows chromosome bands annotated by FlyBase (D. melanogaster version 4.3). Credits Thanks to FlyBase for providing these annotations. cytoBandIdeo Chromosome Band (Low-res) Chromosome Bands (Low-resolution for Chromosome Ideogram) Mapping and Sequencing refGene RefSeq Genes RefSeq Genes Genes and Gene Predictions Description The RefSeq Genes track shows known D. melanogaster protein-coding and non-protein-coding genes taken from the NCBI RNA reference sequences collection (RefSeq). The data underlying this track are updated weekly. Please visit the Feedback for Gene and Reference Sequences (RefSeq) page to make suggestions, submit additions and corrections, or ask for help concerning RefSeq records. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), reviewed (dark). The item labels and display colors of features within this track can be configured through the controls at the top of the track description page. This page is accessed via the small button to the left of the track's graphical display or through the link on the track's control menu. Label: By default, items are labeled by gene name. Click the appropriate Label option to display the accession name instead of the gene name, show both the gene and accession names, or turn off the label completely. Codon coloring: This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. Click here for more information about this feature. Hide non-coding genes: By default, both the protein-coding and non-protein-coding genes are displayed. 
If you wish to see only the coding genes, click this box. Methods RefSeq RNAs were aligned against the D. melanogaster genome using blat; those with an alignment of less than 15% were discarded. When a single RNA aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.1% of the best and at least 96% base identity with the genomic sequence were kept. Credits This track was produced at UCSC from RNA sequence data generated by scientists worldwide and curated by the NCBI RefSeq project. References Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. PMID: 15608248; PMC: PMC539979 bacEndPairs BAC End Pairs BAC End Pairs Mapping and Sequencing Description Bacterial artificial chromosomes (BACs) are a key part of many large-scale sequencing projects. A BAC typically consists of 50 - 300 kb of DNA. During the early phase of a sequencing project, it is common to sequence a single read (approximately 500 bases) off each end of a large number of BACs. Later on in the project, these BAC end reads can be mapped to the genome sequence. This track shows these mappings in cases where both ends could be mapped. These BAC end pairs can be useful for validating the assembly over relatively long ranges. In some cases, the BACs are useful biological reagents. This track can also be used for determining which BAC contains a given gene, useful information for certain wet lab experiments. The RPCI-98 and DrosBAC libraries, individual clones, and hybridization filters are available from the BACPAC Resources Center (BPRC) at Children's Hospital Oakland Research Institute (CHORI). 
Individual clones from the RPCI-98 library are named BACR01A01 - BACR48H12 (96-well format; R stands for EcoRI). Individual clones from the DrosBAC library are named BACN01A01 - BACN47H12 and BACH48A01 - BACH61H12 (N stands for NdeII; H stands for HindIII). In order to be included in this track, a valid pair of BAC end sequence alignments must be at least 25 kb but no more than 500 kb away from each other. The orientation of the first BAC end sequence must be "+" and the orientation of the second BAC end sequence must be "-". The scoring scheme used for this annotation assigns 1000 to an alignment when the BAC end pair aligns to only one location in the genome (after filtering). When a BAC end pair or clone aligns to multiple locations, the score is calculated as 1500/(number of alignments). Methods BAC end sequences were downloaded from Genoscope (http://www.cea.fr/drf/ig/english/Pages/Genoscope/Genoscope_s-bioinformatics-resources.aspx) and then placed on the assembled sequence using Jim Kent's blat program. Terry Furey's pslPairs program was used to identify paired end alignments. Credits The RPCI-98 BAC library was produced by BACPAC Resources, then at Roswell Park Cancer Institute and now at CHORI, in collaboration with the Berkeley Drosophila Genome Project. The DrosBAC library was made by Alain Billaud at CEPH (Centre d'Etude du Polymorphisme Humain) in a collaboration with the European Drosophila Genome Project co-ordinated by D. Glover. Thanks to Genoscope for providing the BAC end sequence files. flyBaseGene FlyBase Genes FlyBase Protein-Coding Genes Genes and Gene Predictions Description This track shows protein-coding genes annotated by FlyBase (D. melanogaster version 4.3). Credits Thanks to FlyBase for providing these annotations. flyBaseNoncoding FB Noncoding FlyBase Noncoding Genes Genes and Gene Predictions Description This track shows non-coding genes annotated by FlyBase (D. melanogaster version 4.3). 
Credits Thanks to FlyBase for providing these annotations. intronEst Spliced ESTs D. melanogaster ESTs That Have Been Spliced mRNA and EST Description This track shows alignments between D. melanogaster expressed sequence tags (ESTs) in GenBank and the genome that show signs of splicing when aligned against the genome. ESTs are single-read sequences, typically about 500 bases in length, that usually represent fragments of transcribed genes. To be considered spliced, an EST must show evidence of at least one canonical intron, i.e. one that is at least 32 bases in length and has GT/AG ends. By requiring splicing, the level of contamination in the EST databases is drastically reduced at the expense of eliminating many genuine 3' ESTs. For a display of all ESTs (including unspliced), see the D. melanogaster EST track. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. In dense display mode, darker shading indicates a larger number of aligned ESTs. The strand information (+/-) indicates the direction of the match between the EST and the matching genomic sequence. It bears no relationship to the direction of transcription of the RNA with which it might be associated. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: Type a term in one or more of the text boxes to filter the EST display. For example, to apply the filter to all ESTs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. 
For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only ESTs that match all filter criteria will be highlighted. If "or" is selected, ESTs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display ESTs that match the filter criteria. If "include" is selected, the browser will display only those ESTs that match the filter criteria. This track may also be configured to display base labeling, a feature that allows the user to display all bases in the aligning sequence or only those that differ from the genomic sequence. For more information about this option, click here. Methods To make an EST, RNA is isolated from cells and reverse transcribed into cDNA. Typically, the cDNA is cloned into a plasmid vector and a read is taken from the 5' and/or 3' primer. For most — but not all — ESTs, the reverse transcription is primed by an oligo-dT, which hybridizes with the poly-A tail of mature mRNA. The reverse transcriptase may or may not make it to the 5' end of the mRNA, which may or may not be degraded. In general, the 3' ESTs mark the end of transcription reasonably well, but the 5' ESTs may end at any point within the transcript. Some of the newer cap-selected libraries cover transcription start reasonably well. Before the cap-selection techniques emerged, some projects used random rather than poly-A priming in an attempt to retrieve sequence distant from the 3' end. These projects were successful at this, but as a side effect also deposited sequences from unprocessed mRNA and perhaps even genomic sequences into the EST databases. 
Even outside of the random-primed projects, there is a degree of non-mRNA contamination. Because of this, a single unspliced EST should be viewed with considerable skepticism. To generate this track, D. melanogaster ESTs from GenBank were aligned against the genome using blat. Note that the maximum intron length allowed by blat is 750,000 bases, which may eliminate some ESTs with very long introns that might otherwise align. When a single EST aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence are displayed in this track. Credits This track was produced at UCSC from EST sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 evofold EvoFold EvoFold predictions of RNA secondary structure (id_strand_score) Genes and Gene Predictions Description This track shows RNA secondary structure predictions made with the EvoFold program, a comparative method that exploits the evolutionary signal of genomic multiple-sequence alignments for identifying conserved functional RNA structures. Display Conventions and Configuration Track elements are labeled using the convention ID_strand_score. When zoomed out beyond the base level, secondary structure prediction regions are indicated by blocks, with the stem-pairing regions shown in a darker shade than unpaired regions. Arrows indicate the predicted strand. When zoomed in to the base level, the specific secondary structure predictions are shown in parenthesis format. 
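As a small illustration of the parenthesis format mentioned above, the following Python sketch pairs up matching parentheses in a structure string. It assumes the common convention that "(" and ")" mark the two sides of a base pair and that any other character marks an unpaired position; the function name paired_positions and the example string are hypothetical and are not taken from EvoFold or the browser.

```python
def paired_positions(structure):
    """Return sorted (i, j) index pairs for matching parentheses.

    Assumption: "(" opens a base pair, ")" closes the most recently
    opened one, and every other character is an unpaired position.
    """
    open_positions = []  # stack of indices of "(" not yet closed
    pairs = []
    for idx, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(idx)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {idx}")
            pairs.append((open_positions.pop(), idx))
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# Hypothetical example: two nested stems.
example = "((..((...))..))"
print(paired_positions(example))
```

The stack-based approach works because parenthesis notation of this kind only describes nested pairs, so each closing symbol always belongs to the most recent unclosed opening symbol.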
The confidence score for each position is indicated in grayscale, with darker shades corresponding to higher scores. The details page for each track element shows the predicted secondary structure (labeled SS anno), together with details of the multiple species alignments at that location. Substitutions relative to the Drosophila melanogaster sequence are color-coded according to their compatibility with the predicted secondary structure (see the color legend on the details page). Each prediction is assigned an overall score and a sequence of position-specific scores. The overall score measures evidence for any functional RNA structures in the given region, while the position-specific scores (0 - 9) measure the confidence of the base-specific annotations. Base-pairing positions are annotated with the same pair symbol. The offsets are provided to ease visual navigation of the alignment in terms of the D. melanogaster sequence. The offset is calculated (in units of ten) from the start position of the element on the positive strand or from the end position when on the negative strand. The graphical display may be filtered to show only those track elements with scores that meet or exceed a certain threshold. To set a threshold, type the minimum score into the text box at the top of the description page. Methods EvoFold makes use of phylogenetic stochastic context-free grammars (phylo-SCFGs), which are combined probabilistic models of RNA secondary structure and primary sequence evolution. The predictions consist of both a specific RNA secondary structure and an overall score. The overall score is essentially a log-odds score between a phylo-SCFG modeling the constrained evolution of stem-pairing regions and one which only models unpaired regions. The predictions for this track were based on the conserved elements of a 12-way Drosophila alignment of the D. melanogaster (dm2), D. simulans (droSim1), D. sechellia (droSec1), D. yakuba (droYak2), D. erecta (droEre2), D. 
ananassae (droAna3), D. pseudoobscura (dp4), D. persimilis (droPer1), D. willistoni (droWil1), D. virilis (droVir3), D. mojavensis (droMoj3), and D. grimshawi (droGri2) assemblies. The 12-way Drosophila alignment was extracted from a 15-way insect alignment, which is the one displayed in the Conservation track. Credits The EvoFold program and browser track were developed by Jakob Skou Pedersen. References Prediction analysis Stark A, Lin MF, Kheradpour P, Pedersen JS, Parts L, Carlson JW, Crosby MA, Rasmussen MD, Roy S, Deoras AN et al. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature. 2007 Nov 8;450(7167):219-32. PMID: 17994088; PMC: PMC2474711 EvoFold Pedersen JS, Bejerano G, Siepel A, Rosenbloom K, Lindblad-Toh K, Lander ES, Kent J, Miller W, Haussler D. Identification and classification of conserved RNA secondary structures in the human genome. PLoS Comput Biol. 2006 Apr;2(4):e33. PMID: 16628248; PMC: PMC1440920 Phylo-SCFGs Knudsen B, Hein J. RNA secondary structure prediction using stochastic context-free grammars and evolutionary history. Bioinformatics. 1999 Jun;15(6):446-54. PMID: 10383470 Pedersen JS, Meyer IM, Forsberg R, Simmonds P, Hein J. A comparative method for finding and folding RNA secondary structures within protein-coding regions. Nucleic Acids Res. 2004;32(16):4925-36. PMID: 15448187; PMC: PMC519121 PhastCons Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216 affyDrosDevSignal Affy Signal Affymetrix Drosophila Development Signal Expression and Regulation Description This track shows an estimate of RNA abundance (transcription) over the first 24 hours of D. melanogaster development in two hour increments, measured by a tiling array as described in Manak et al. (2006) (see References). 
Composite signals are shown in separate subtracks for each of the twelve timepoints. Display Conventions and Configuration The subtracks within this composite annotation track may be configured in a variety of ways to highlight different aspects of the displayed data. The graphical configuration options for the subtracks are shown at the top of the track description page, followed by a list of subtracks. To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide. For more information about the graphical configuration options, click the Graph configuration help link. Color differences among the subtracks are arbitrary. They provide a visual cue for distinguishing between the different timepoints. Methods The data were processed into signal and transfrags as described in Cheng et al. (2005) and Kampa et al. (2004). The data from replicate arrays were quantile-normalized and all arrays were scaled to a median array intensity of 25. Within a sliding 101 bp window centered on each probe, an estimate of RNA abundance (signal) was found by calculating the median of all pairwise average PM-MM values, where PM is a perfect match and MM is a mismatch. Verification Samples were hybridized to duplicate arrays (three technical replicates). Transcribed regions were generated from the composite signal track by merging genomic positions to which probes are mapped. This merging was based on a 5% false positive rate cutoff in negative bacterial controls, a maximum gap (MaxGap) of 50 base-pairs and minimum run (MinRun) of 90 base-pairs (see the Affy TransFrags track for the merged regions). Credits These data were generated and analyzed by the Tom Gingeras group at Affymetrix. References Please see the Affymetrix Transcriptome site for a project overview and additional references to Affymetrix tiling array publications. Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S, Long J, Stern D, Tammana H, Helt G et al. 
Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005 May 20;308(5725):1149-54. PMID: 15790807 Kampa D, Cheng J, Kapranov P, Yamanaka M, Brubaker S, Cawley S, Drenkow J, Piccolboni A, Bekiranov S, Helt G et al. Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res. 2004 Mar;14(3):331-42. PMID: 14993201; PMC: PMC353210 Manak JR, Dike S, Sementchenko V, Kapranov P, Biemar F, Long J, Cheng J, Bell I, Ghosh S, Piccolboni A et al. Biological function of unannotated transcription during the early development of Drosophila melanogaster. Nat Genet. 2006 Oct;38(10):1151-8. PMID: 16951679 affyDrosDevSignal12 Affy Devel 22-24h Affymetrix Drosophila Development Signal, 22-24 hours Expression and Regulation affyDrosDevSignal11 Affy Devel 20-22h Affymetrix Drosophila Development Signal, 20-22 hours Expression and Regulation affyDrosDevSignal10 Affy Devel 18-20h Affymetrix Drosophila Development Signal, 18-20 hours Expression and Regulation affyDrosDevSignal9 Affy Devel 16-18h Affymetrix Drosophila Development Signal, 16-18 hours Expression and Regulation affyDrosDevSignal8 Affy Devel 14-16h Affymetrix Drosophila Development Signal, 14-16 hours Expression and Regulation affyDrosDevSignal7 Affy Devel 12-14h Affymetrix Drosophila Development Signal, 12-14 hours Expression and Regulation affyDrosDevSignal6 Affy Devel 10-12h Affymetrix Drosophila Development Signal, 10-12 hours Expression and Regulation affyDrosDevSignal5 Affy Devel 8-10h Affymetrix Drosophila Development Signal, 8-10 hours Expression and Regulation affyDrosDevSignal4 Affy Devel 6-8h Affymetrix Drosophila Development Signal, 6-8 hours Expression and Regulation affyDrosDevSignal3 Affy Devel 4-6h Affymetrix Drosophila Development Signal, 4-6 hours Expression and Regulation affyDrosDevSignal2 Affy Devel 2-4h Affymetrix Drosophila Development Signal, 2-4 hours Expression and Regulation affyDrosDevSignal1 Affy Devel 0-2h Affymetrix 
Drosophila Development Signal, 0-2 hours Expression and Regulation affyDrosDevTransfrags Affy Transfrags Affymetrix Drosophila Development Transfrags Expression and Regulation Description This track shows the location of sites showing transcription over the first 24 hours of D. melanogaster development in two hour increments, measured by a tiling array as described in Manak et al. (2006) (see References). Clustered sites are shown in separate subtracks for each of the twelve timepoints. Display Conventions and Configuration To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide. Color differences among the subtracks are arbitrary. They provide a visual cue for distinguishing between the different timepoints. Methods The data were processed into signal and transfrags as described in Cheng et al. (2005) and Kampa et al. (2004). The data from replicate arrays were quantile-normalized and all arrays were scaled to a median array intensity of 25. Within a sliding 101 bp window centered on each probe, an estimate of RNA abundance (signal) was found by calculating the median of all pairwise average PM-MM values, where PM is a perfect match and MM is a mismatch. Verification Samples were hybridized to duplicate arrays (three technical replicates). Transcribed regions (see the Affy Signal track) were generated from the composite signal track by merging genomic positions to which probes are mapped. This merging was based on a 5% false positive rate cutoff in negative bacterial controls, a maximum gap (MaxGap) of 50 base-pairs and minimum run (MinRun) of 90 base-pairs. Credits These data were generated and analyzed by the Tom Gingeras group at Affymetrix. References Please see the Affymetrix Transcriptome site for a project overview and additional references to Affymetrix tiling array publications. Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S, Long J, Stern D, Tammana H, Helt G et al. 
Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005 May 20;308(5725):1149-54. PMID: 15790807 Kampa D, Cheng J, Kapranov P, Yamanaka M, Brubaker S, Cawley S, Drenkow J, Piccolboni A, Bekiranov S, Helt G et al. Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res. 2004 Mar;14(3):331-42. PMID: 14993201; PMC: PMC353210 Manak JR, Dike S, Sementchenko V, Kapranov P, Biemar F, Long J, Cheng J, Bell I, Ghosh S, Piccolboni A et al. Biological function of unannotated transcription during the early development of Drosophila melanogaster. Nat Genet. 2006 Oct;38(10):1151-8. PMID: 16951679 affyDrosDevTransfrags12 Affy Devel 22-24h Affymetrix Drosophila Development Transfrags, 22-24 hours Expression and Regulation affyDrosDevTransfrags11 Affy Devel 20-22h Affymetrix Drosophila Development Transfrags, 20-22 hours Expression and Regulation affyDrosDevTransfrags10 Affy Devel 18-20h Affymetrix Drosophila Development Transfrags, 18-20 hours Expression and Regulation affyDrosDevTransfrags9 Affy Devel 16-18h Affymetrix Drosophila Development Transfrags, 16-18 hours Expression and Regulation affyDrosDevTransfrags8 Affy Devel 14-16h Affymetrix Drosophila Development Transfrags, 14-16 hours Expression and Regulation affyDrosDevTransfrags7 Affy Devel 12-14h Affymetrix Drosophila Development Transfrags, 12-14 hours Expression and Regulation affyDrosDevTransfrags6 Affy Devel 10-12h Affymetrix Drosophila Development Transfrags, 10-12 hours Expression and Regulation affyDrosDevTransfrags5 Affy Devel 8-10h Affymetrix Drosophila Development Transfrags, 8-10 hours Expression and Regulation affyDrosDevTransfrags4 Affy Devel 6-8h Affymetrix Drosophila Development Transfrags, 6-8 hours Expression and Regulation affyDrosDevTransfrags3 Affy Devel 4-6h Affymetrix Drosophila Development Transfrags, 4-6 hours Expression and Regulation affyDrosDevTransfrags2 Affy Devel 2-4h Affymetrix Drosophila Development 
Transfrags, 2-4 hours Expression and Regulation affyDrosDevTransfrags1 Affy Devel 0-2h Affymetrix Drosophila Development Transfrags, 0-2 hours Expression and Regulation bdtnpDnase BDTNP DNase Accs Berkeley Drosophila Transcription Network Project Chromatin Accessibility (DNase) Expression and Regulation Description This track shows the accessibility of genomic DNA to DNase I digestion in the D. melanogaster embryo for five stages of development: 5, 9, 10, 11 and 14 (Thomas et al.). These data have been used to show the dynamics of chromatin accessibility during embryogenesis (Thomas et al.) and that chromatin accessibility plays a dominant role in determining the widespread, overlapping patterns of binding by functionally distinct transcription factors in Drosophila embryos (Li et al.; Kaplan et al.). Display Conventions and Configuration Subtracks are provided showing either the density of DNA sequence tags in 75 bp windows across the genome or the locations of 5% FDR accessible regions. DNA sequence tag density data for independent replicates of each stage are provided, though by default only one replicate is shown. The DNA tag density subtracks are by default shown in "full" and the locations of 5% FDR regions that are concordant in both replicates are shown in "squish". Data for each stage are shown in a different color: green for stage 5, orange for stage 9, red for stage 10, light blue for stage 11 and purple for stage 14. The graphical configuration options for the subtracks are shown at the top of the track controls page, followed by a list of subtracks. To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide. For more information about the graphical configuration options, click the Graph configuration help link. Methods One-hour collections of wild-type embryos were aged to the appropriate developmental stage and then nuclei were isolated and briefly digested with DNase I. 
The DNA released by digestion was size fractionated through a sucrose gradient to capture 100 - 400 bp fragments, which were then sequenced on an Illumina GA2 to generate an average of ~14 million sequence tags per sample mapped to the Drosophila genome. The data were analyzed to determine short genomic regions that are accessible to digestion at a 5% false discovery rate; see Thomas et al. for further details. References Kaplan T, Li XY, Sabo PJ, Thomas S, Stamatoyannopoulos JA, Biggin MD, Eisen MB. Quantitative models of the mechanisms that control genome-wide patterns of transcription factor binding during early Drosophila development. PLoS Genet. 2011 Feb 3;7(2):e1001290. PMID: 21304941; PMC: PMC3033374 Li XY, Thomas S, Sabo PJ, Eisen MB, Stamatoyannopoulos JA, Biggin MD. The role of chromatin accessibility in directing the widespread, overlapping patterns of Drosophila transcription factor binding. Genome Biol. 2011;12(4):R34. PMID: 21473766; PMC: PMC3218860 Thomas S, Li XY, Sabo PJ, Sandstrom R, Thurman RE, Canfield TK, Giste E, Fisher W, Hammonds A, Celniker SE et al. Dynamic reprogramming of chromatin accessibility during Drosophila embryo development. Genome Biol. 2011;12(5):R43. 
PMID: 21569360; PMC: PMC3219966 bdtnpDnaseViewRegions Regions (FDR=5%) Berkeley Drosophila Transcription Network Project Chromatin Accessibility (DNase) Expression and Regulation bdtnpDnaseAccS14 S14 Regions BDTNP Chromatin Accessibility (DNase) Stage 14, FDR 5% euchromatic accessible regions Expression and Regulation bdtnpDnaseAccS11 S11 Regions BDTNP Chromatin Accessibility (DNase) Stage 11, FDR 5% euchromatic accessible regions Expression and Regulation bdtnpDnaseAccS10 S10 Regions BDTNP Chromatin Accessibility (DNase) Stage 10, FDR 5% euchromatic accessible regions Expression and Regulation bdtnpDnaseAccS9 S9 Regions BDTNP Chromatin Accessibility (DNase) Stage 9, FDR 5% euchromatic accessible regions Expression and Regulation bdtnpDnaseAccS5 S5 Regions BDTNP Chromatin Accessibility (DNase) Stage 5, FDR 5% euchromatic accessible regions Expression and Regulation bdtnpDnaseViewAcc Accessibility Berkeley Drosophila Transcription Network Project Chromatin Accessibility (DNase) Expression and Regulation bdtnpDnaseS14R9478 S14 repl. 2 BDTNP Chromatin Accessibility (DNase) Stage 14, Replicate 2 Expression and Regulation bdtnpDnaseS14R9477 S14 repl. 1 BDTNP Chromatin Accessibility (DNase) Stage 14, Replicate 1 Expression and Regulation bdtnpDnaseS11R9486 S11 repl. 2 BDTNP Chromatin Accessibility (DNase) Stage 11, Replicate 2 Expression and Regulation bdtnpDnaseS11R9485 S11 repl. 1 BDTNP Chromatin Accessibility (DNase) Stage 11, Replicate 1 Expression and Regulation bdtnpDnaseS10R8820 S10 repl. 2 BDTNP Chromatin Accessibility (DNase) Stage 10, Replicate 2 Expression and Regulation bdtnpDnaseS10R8816 S10 repl. 1 BDTNP Chromatin Accessibility (DNase) Stage 10, Replicate 1 Expression and Regulation bdtnpDnaseS9R9128 S9 repl. 2 BDTNP Chromatin Accessibility (DNase) Stage 9, Replicate 2 Expression and Regulation bdtnpDnaseS9R9127 S9 repl. 1 BDTNP Chromatin Accessibility (DNase) Stage 9, Replicate 1 Expression and Regulation bdtnpDnaseS5R9482 S5 repl. 
2 BDTNP Chromatin Accessibility (DNase) Stage 5, Replicate 2 Expression and Regulation bdtnpDnaseS5R9481 S5 repl. 1 BDTNP Chromatin Accessibility (DNase) Stage 5, Replicate 1 Expression and Regulation bdtnpChipper BDTNP ChIP/chip Berkeley Drosophila Transcription Network Project Transcription Factor ChIP/chip Expression and Regulation Description This track shows an estimate of the binding activity of 24 transcription factors in the D. melanogaster embryo. Chromatin immunoprecipitation and whole-genome tiling arrays (ChIP/chip) were used (see Li, MacArthur et al.) to map the genomic regions bound by 22 sequence specific transcription factors and two general transcription factors: TFIIB and the transcriptionally active phosphorylated form of RNA polymerase II. The sequence specific factors (except for Zeste), described in the table below, fall into three regulatory classes: anterior-posterior (A-P) early, A-P pair rule, and dorsal-ventral (D-V). Data for all proteins except Zeste are for stage 4-5 blastoderm embryos. Data for Zeste are for stage 11 embryos. Enrichment factors (1 = no enrichment) are shown in separate subtracks for 36 antibodies at false discovery rates (FDR) of 1% and 25%.

Seq. Specific Factor   Symbol   DNA binding domain             Regulatory Class
Bicoid                 bcd      homeodomain                    A-P early maternal
Caudal                 cad      homeodomain                    A-P early maternal
Giant                  gt       b-zip domain                   A-P early gap
Hunchback              hb       C2H2 Zinc finger               A-P early gap
Knirps                 kni      receptor Zinc finger           A-P early gap
Krüppel                Kr       C2H2 Zinc finger               A-P early gap
Huckebein              hkb      C2H2 Zinc finger               A-P early terminal
Tailless               tll      receptor Zinc finger           A-P early terminal
Dichaete               D        HMG/SOX class                  A-P early gap-like
Fushi tarazu           ftz      homeodomain                    A-P pair rule
Hairy                  h        bHLH                           A-P pair rule
Paired                 prd      homeodomain / paired domain    A-P pair rule
Runt                   run      runt domain                    A-P pair rule
Sloppy paired 1        slp1     forkhead domain                A-P pair rule
Daughterless           da       bHLH                           D-V maternal
Dorsal                 dl       NFkB/rel                       D-V maternal
Mothers against dpp    mad      SMAD-MH1                       D-V zygotic
Medea                  med      SMAD-MH1                       D-V zygotic
Schnurri               shn      C2H2 Zinc finger               D-V zygotic
Snail                  sna      C2H2 Zinc finger               D-V zygotic
Twist                  twi      bHLH                           D-V zygotic
Zeste                  z        unique                         ubiquitous

Display Conventions and Configuration By default, values are displayed in grayscale ("dense" mode) instead of graphing ("full" mode), and only 24 of the 72 subtracks are shown: only those with FDR of 1% and only one antibody per factor (the antibody with the most bound regions at FDR of 1%). To change the configuration, click on the blue or gray button to the left of the track or click on the track title in the controls below the image. The subtracks within this composite annotation track may be configured in a variety of ways to highlight different aspects of the displayed data. The graphical configuration options for the subtracks are shown at the top of the track controls page, followed by a list of subtracks. To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide. For more information about the graphical configuration options, click the Graph configuration help link. 
Subtracks are colored according to regulatory class: green for A-P early, orange for A-P pair rule, blue for D-V, brown for stage 11 zeste, and red for general transcription factors. Methods Where practicable, two antibody preparations that were independently purified against nonoverlapping epitopes were used. For each purified antibody, two independent replicates of three different sample types were analyzed on separate arrays, for a total of six arrays per antibody: "Factor immunoprecipitates (IPs)", obtained by immunoprecipitation using a factor-specific antibody; "immunoglobulin G (IgG) control IPs", obtained by immunoprecipitation using a normal IgG antibody; and "input DNA", obtained from the chromatin prior to immunoprecipitation. Mean hybridization intensities for transcription factor IP replicates and IgG control IP replicates were divided by the mean probe intensity in the input DNA samples to produce oligonucleotide ratio values. The logarithms of the oligonucleotide ratios were averaged in windows of 675 bp centered around each probe (after discarding the highest and lowest values, to produce a "trimmed mean") to produce window scores. Bound regions were identified by comparing window scores to expected score distributions computed from a symmetric null distribution. The symmetric null method assumes that the background window score distribution is symmetric about its mean, and estimates the distribution from values less than the observed mode. This estimated null distribution was used to assign p-values to each window score, and these were corrected for multiple testing to control the FDR. A separate FDR estimation method that uses the IgG control data to estimate the null distribution defines a similar number of bound regions (not shown here, see Li et al., 2008). Credits Thanks to the Berkeley Drosophila Transcription Network Project's In Vivo DNA Binding collaboration, and Stewart MacArthur and Mark Biggin in particular, for these data. 
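The window-score step described in the ChIP/chip Methods above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the BDTNP pipeline: the function name window_scores, the use of base-2 logarithms (the text says only "logarithms"), the treatment of a probe as a single coordinate, and the rule of leaving windows with fewer than three probes untrimmed are all hypothetical choices, and the probe positions and ratios are made-up numbers.

```python
import math


def window_scores(positions, ratios, window=675):
    """Trimmed mean of log ratios in a window centered on each probe.

    positions: probe coordinates in bp (hypothetical single-point probes)
    ratios:    oligonucleotide ratio for each probe (IP / input)
    window:    window width in bp (675 bp in the track description)
    """
    half = window / 2
    # Assumption: base-2 logarithm; the source does not name the base.
    logs = [math.log2(r) for r in ratios]
    scores = []
    for center in positions:
        in_window = sorted(
            value
            for pos, value in zip(positions, logs)
            if abs(pos - center) <= half
        )
        # Discard the single highest and lowest values when possible.
        # Assumption: windows with fewer than 3 probes are not trimmed.
        if len(in_window) > 2:
            in_window = in_window[1:-1]
        scores.append(sum(in_window) / len(in_window))
    return scores


# Made-up example: four nearby probes and one isolated probe.
probe_positions = [0, 100, 200, 300, 2000]
probe_ratios = [1, 2, 4, 8, 16]
print(window_scores(probe_positions, probe_ratios))
```

Trimming the extremes before averaging makes each window score less sensitive to a single outlying probe, which is the stated purpose of the "trimmed mean" in the description.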
References Li XY, MacArthur S, Bourgon R, Nix D, Pollard DA, Iyer VN, Hechmer A, Simirenko L, Stapleton M, Luengo Hendriks CL et al. Transcription factors bind thousands of active and inactive regions in the Drosophila blastoderm. PLoS Biol. 2008 Feb;6(2):e27. PMID: 18271625; PMC: PMC2235902 MacArthur S, Li XY, Li J, Brown JB, Chu HC, Zeng L, Grondona BP, Hechmer A, Simirenko L, Keränen SV et al. Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions. Genome Biol. 2009;10(7):R80. PMID: 19627575; PMC: PMC2728534 Moses AM, Pollard DA, Nix DA, Iyer VN, Li XY, Biggin MD, Eisen MB. Large-scale turnover of functional transcription factor binding sites in Drosophila. PLoS Comput Biol. 2006 Oct;2(10):e130. PMID: 17040121; PMC: PMC1599766 Thomas S, Li XY, Sabo PJ, Sandstrom R, Thurman RE, Canfield TK, Giste E, Fisher W, Hammonds A, Celniker SE et al. Dynamic reprogramming of chromatin accessibility during Drosophila embryo development. Genome Biol. 2011;12(5):R43. 
PMID: 21569360; PMC: PMC3219966 bdtnpChipperViewother25 Other Antibodies (FDR=25%) Berkeley Drosophila Transcription Network Project Transcription Factor ChIP/chip Expression and Regulation bdtnpTwi1Fdr25 twi AB 1 FDR 25% BDTNP ChIP/chip: twist (twi) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpSna1Fdr25 sna AB 1 FDR 25% BDTNP ChIP/chip: snail (sna) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpShn3Fdr25 shn AB 3 FDR 25% BDTNP ChIP/chip: schnurri (shn) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpRun2Fdr25 run AB 2 FDR 25% BDTNP ChIP/chip: runt (run) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpPrd2Fdr25 prd AB 2 FDR 25% BDTNP ChIP/chip: paired (prd) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpH1Fdr25 h AB 1 FDR 25% BDTNP ChIP/chip: hairy (h) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpHkb3Fdr25 hkb AB 3 FDR 25% BDTNP ChIP/chip: huckebein (hkb) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpHkb2Fdr25 hkb AB 2 FDR 25% BDTNP ChIP/chip: huckebein (hkb) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpKr1Fdr25 Kr AB 1 FDR 25% BDTNP ChIP/chip: Kruppel (Kr) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpKni1Fdr25 kni AB 1 FDR 25% BDTNP ChIP/chip: knirps (kni) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpHb2S9Fdr25 hb AB2 S9 FDR25% BDTNP ChIP/chip: hunchback (hb) antibody 2, stage 9 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpHb2Fdr25 hb AB 2 FDR 25% BDTNP ChIP/chip: hunchback (hb) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation 
bdtnpBcd1Fdr25 bcd AB 1 FDR 25% BDTNP ChIP/chip: bicoid (bcd) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpChipperViewother1 Other Antibodies (FDR=1%) Berkeley Drosophila Transcription Network Project Transcription Factor ChIP/chip Expression and Regulation bdtnpTwi1Fdr1 twi AB 1 FDR 1% BDTNP ChIP/chip: twist (twi) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpSna1Fdr1 sna AB 1 FDR 1% BDTNP ChIP/chip: snail (sna) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpShn3Fdr1 shn AB 3 FDR 1% BDTNP ChIP/chip: schnurri (shn) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpRun2Fdr1 run AB 2 FDR 1% BDTNP ChIP/chip: runt (run) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpPrd2Fdr1 prd AB 2 FDR 1% BDTNP ChIP/chip: paired (prd) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpH1Fdr1 h AB 1 FDR 1% BDTNP ChIP/chip: hairy (h) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpHkb3Fdr1 hkb AB 3 FDR 1% BDTNP ChIP/chip: huckebein (hkb) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpHkb2Fdr1 hkb AB 2 FDR 1% BDTNP ChIP/chip: huckebein (hkb) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpKr1Fdr1 Kr AB 1 FDR 1% BDTNP ChIP/chip: Kruppel (Kr) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpKni1Fdr1 kni AB 1 FDR 1% BDTNP ChIP/chip: knirps (kni) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpHb2S9Fdr1 hb AB2 S9 FDR 1% BDTNP ChIP/chip: hunchback (hb) antibody 2, stage 9 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpHb2Fdr1 hb AB 2 FDR 1% BDTNP ChIP/chip: hunchback (hb) antibody 2, stage 
4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpBcd1Fdr1 bcd AB 1 FDR 1% BDTNP ChIP/chip: bicoid (bcd) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpChipperViewbest25 Best Antibody (FDR=25%) Berkeley Drosophila Transcription Network Project Transcription Factor ChIP/chip Expression and Regulation bdtnpTFIIB1Fdr25 TFIIB AB 1 FDR25% BDTNP ChIP/chip: Transc. factor IIB (TFIIB) antibody 1, stage 4-5 embryos, False Disc. Rate (FDR) 25% Expression and Regulation bdtnpPolIIFdr25 PolII AB FDR 25% BDTNP ChIP/chip: RNA Polymerase II (PolII) antibody, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpZ2Fdr25 z AB 2 FDR 25% BDTNP ChIP/chip: zeste (z) antibody 2, stage 11 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpTwi2Fdr25 twi AB 2 FDR 25% BDTNP ChIP/chip: twist (twi) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpSna2Fdr25 sna AB 2 FDR 25% BDTNP ChIP/chip: snail (sna) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpShn2Fdr25 shn AB 2 FDR 25% BDTNP ChIP/chip: schnurri (shn) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpMed2S14Fdr25 med AB2 S14 FDR25% BDTNP ChIP/chip: Medea (med) antibody 2, stage 14 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpMed2S10Fdr25 med AB2 S10 FDR25% BDTNP ChIP/chip: Medea (med) antibody 2, stage 10 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpMed2Fdr25 med AB 2 FDR 25% BDTNP ChIP/chip: Medea (med) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpMad2Fdr25 mad AB 2 FDR 25% BDTNP ChIP/chip: Mothers against dpp (mad) antibody 2, stage 4-5 embryos, False Disc. 
Rate (FDR) 25% Expression and Regulation bdtnpDl3Fdr25 dl AB 3 FDR 25% BDTNP ChIP/chip: dorsal (dl) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpDa2Fdr25 da AB 2 FDR 25% BDTNP ChIP/chip: daughterless (da) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpSlp11Fdr25 slp1 AB 1 FDR 25% BDTNP ChIP/chip: sloppy paired 1 (slp1) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpRun1Fdr25 run AB 1 FDR 25% BDTNP ChIP/chip: runt (run) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpPrd1Fdr25 prd AB 1 FDR 25% BDTNP ChIP/chip: paired (prd) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpH2Fdr25 h AB 2 FDR 25% BDTNP ChIP/chip: hairy (h) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpFtz3Fdr25 ftz AB 3 FDR 25% BDTNP ChIP/chip: fushi tarazu (ftz) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpD1Fdr25 D AB 1 FDR 25% BDTNP ChIP/chip: Dichaete (D) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpTll1Fdr25 tll AB 1 FDR 25% BDTNP ChIP/chip: tailless (tll) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpHkb1Fdr25 hkb AB 1 FDR 25% BDTNP ChIP/chip: huckebein (hkb) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpKr2Fdr25 Kr AB 2 FDR 25% BDTNP ChIP/chip: Kruppel (Kr) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpKni2Fdr25 kni AB 2 FDR 25% BDTNP ChIP/chip: knirps (kni) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpHb1S9Fdr25 hb AB1 S9 FDR25% BDTNP ChIP/chip: hunchback (hb) antibody 1, stage 9 embryos, False Discovery Rate (FDR) 25% Expression and Regulation 
bdtnpHb1Fdr25 hb AB 1 FDR 25% BDTNP ChIP/chip: hunchback (hb) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpGt2Fdr25 gt AB 2 FDR 25% BDTNP ChIP/chip: giant (gt) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpCad1Fdr25 cad AB 1 FDR 25% BDTNP ChIP/chip: caudal (cad) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpBcd2Fdr25 bcd AB 2 FDR 25% BDTNP ChIP/chip: bicoid (bcd) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 25% Expression and Regulation bdtnpChipperViewbest1 Best Antibody (FDR=1%) Berkeley Drosophila Transcription Network Project Transcription Factor ChIP/chip Expression and Regulation bdtnpTFIIB1Fdr1 TFIIB AB 1 FDR 1% BDTNP ChIP/chip: Transc. factor IIB (TFIIB) antibody 1, stage 4-5 embryos, False Disc. Rate (FDR) 1% Expression and Regulation bdtnpPolIIFdr1 PolII AB FDR 1% BDTNP ChIP/chip: RNA Polymerase II (PolII) antibody, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpZ2Fdr1 z AB 2 FDR 1% BDTNP ChIP/chip: zeste (z) antibody 2, stage 11 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpTwi2Fdr1 twi AB 2 FDR 1% BDTNP ChIP/chip: twist (twi) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpSna2Fdr1 sna AB 2 FDR 1% BDTNP ChIP/chip: snail (sna) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpShn2Fdr1 shn AB 2 FDR 1% BDTNP ChIP/chip: schnurri (shn) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpMed2S14Fdr1 med AB2 S14 FDR1% BDTNP ChIP/chip: Medea (med) antibody 2, stage 14 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpMed2S10Fdr1 med AB2 S10 FDR1% BDTNP ChIP/chip: Medea (med) antibody 2, stage 10 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpMed2Fdr1 med AB 2 FDR 1% BDTNP ChIP/chip: 
Medea (med) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpMad2Fdr1 mad AB 2 FDR 1% BDTNP ChIP/chip: Mothers against dpp (mad) antibody 2, stage 4-5 embryos, False Disc. Rate (FDR) 1% Expression and Regulation bdtnpDl3Fdr1 dl AB 3 FDR 1% BDTNP ChIP/chip: dorsal (dl) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpDa2Fdr1 da AB 2 FDR 1% BDTNP ChIP/chip: daughterless (da) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpSlp11Fdr1 slp1 AB 1 FDR 1% BDTNP ChIP/chip: sloppy paired 1 (slp1) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpRun1Fdr1 run AB 1 FDR 1% BDTNP ChIP/chip: runt (run) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpPrd1Fdr1 prd AB 1 FDR 1% BDTNP ChIP/chip: paired (prd) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpH2Fdr1 h AB 2 FDR 1% BDTNP ChIP/chip: hairy (h) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpFtz3Fdr1 ftz AB 3 FDR 1% BDTNP ChIP/chip: fushi tarazu (ftz) antibody 3, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpD1Fdr1 D AB 1 FDR 1% BDTNP ChIP/chip: Dichaete (D) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpTll1Fdr1 tll AB 1 FDR 1% BDTNP ChIP/chip: tailless (tll) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpHkb1Fdr1 hkb AB 1 FDR 1% BDTNP ChIP/chip: huckebein (hkb) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpKr2Fdr1 Kr AB 2 FDR 1% BDTNP ChIP/chip: Kruppel (Kr) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpKni2Fdr1 kni AB 2 FDR 1% BDTNP ChIP/chip: knirps (kni) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 
1% Expression and Regulation bdtnpHb1S9Fdr1 hb AB1 S9 FDR 1% BDTNP ChIP/chip: hunchback (hb) antibody 1, stage 9 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpHb1Fdr1 hb AB 1 FDR 1% BDTNP ChIP/chip: hunchback (hb) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpGt2Fdr1 gt AB 2 FDR 1% BDTNP ChIP/chip: giant (gt) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpCad1Fdr1 cad AB 1 FDR 1% BDTNP ChIP/chip: caudal (cad) antibody 1, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation bdtnpBcd2Fdr1 bcd AB 2 FDR 1% BDTNP ChIP/chip: bicoid (bcd) antibody 2, stage 4-5 embryos, False Discovery Rate (FDR) 1% Expression and Regulation flyreg2 FlyReg FlyReg: Drosophila DNase I Footprint Database Expression and Regulation Description This track shows DNase I Footprint data from FlyReg version 2.0. FlyReg provides access to results of the systematic curation and genome annotation of 1,350 DNase I footprints for the fruitfly D. melanogaster reported in Bergman, C.M. et al. (see below). When available, a footprint motif is also displayed, based on a MEME matrix computed by Dan Pollard on the set of footprints for this factor. Credits Thanks to Casey Bergman for providing the FlyReg data. If used in published work, please cite Bergman CM, Carlson JW, Celniker SE. Drosophila DNase I footprint database: a systematic genome annotation of transcription factor binding sites in the fruitfly, Drosophila melanogaster. Bioinformatics. 2005 Apr 15;21(8):1747-9. PMID: 15572468 Thanks to Dan Pollard for providing the footprint motif matrices. augustus Augustus Genes Augustus Gene Predictions Genes and Gene Predictions Description This track shows predictions of the AUGUSTUS program, which predicts the coding parts of protein-coding genes. 
This program, which was written by Mario Stanke at the Department of Bioinformatics, University of Göttingen, Germany, is available through the GOBICS web server. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. Click the Help on codon coloring link for more information about this feature. Methods Augustus uses a generalized hidden Markov model (GHMM) that models coding and non-coding sequence, splice sites, the branch point region, translation start and end, and lengths of exons and introns. This version has been trained on a set of 400 Drosophila genes. These ab initio predictions were made using only the Drosophila genomic sequence; no homology information or transcribed sequences were used. Credits Thanks to Mario Stanke for providing these data. References Stanke, M. Gene prediction with a hidden Markov model. Ph.D. thesis, Universität Göttingen, Germany (2004). Stanke M, Steinkamp R, Waack S, Morgenstern B. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W309-12. PMID: 15215400; PMC: PMC441517 Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003 Oct;19 Suppl 2:ii215-25. PMID: 14534192 est D. melanogaster ESTs D. melanogaster ESTs Including Unspliced mRNA and EST Description This track shows alignments between D. melanogaster expressed sequence tags (ESTs) in GenBank and the genome. ESTs are single-read sequences, typically about 500 bases in length, that usually represent fragments of transcribed genes. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. 
In dense display mode, the items that are more darkly shaded indicate matches of better quality. The strand information (+/-) indicates the direction of the match between the EST and the matching genomic sequence. It bears no relationship to the direction of transcription of the RNA with which it might be associated. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: Type a term in one or more of the text boxes to filter the EST display. For example, to apply the filter to all ESTs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only ESTs that match all filter criteria will be highlighted. If "or" is selected, ESTs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display ESTs that match the filter criteria. If "include" is selected, the browser will display only those ESTs that match the filter criteria. This track may also be configured to display base labeling, a feature that allows the user to display all bases in the aligning sequence or only those that differ from the genomic sequence. For more information about this option, click here. 
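The filter behavior described above (wildcard terms, "and"/"or" combination, include/exclude) can be sketched in a few lines. This is only an illustration of the logic, not the browser's implementation: the record layout (dictionaries with fields such as `tissue` and `library`), the function names, and the use of shell-style wildcards via `fnmatch` are all assumptions.

```python
from fnmatch import fnmatchcase

def matches(est, criteria, logic="and"):
    """True if an EST record satisfies the filter terms.

    est:      dict of field name -> string value (hypothetical layout).
    criteria: dict of field name -> wildcard pattern, e.g. {"tissue": "embry*"}.
    logic:    "and" requires every term to match; "or" requires any one.
    """
    hits = [
        fnmatchcase(est.get(field, "").lower(), pattern.lower())
        for field, pattern in criteria.items()
    ]
    return all(hits) if logic == "and" else any(hits)

def apply_filter(ests, criteria, logic="and", mode="include"):
    """Keep only matching ESTs ("include") or drop them ("exclude")."""
    if mode == "include":
        return [e for e in ests if matches(e, criteria, logic)]
    if mode == "exclude":
        return [e for e in ests if not matches(e, criteria, logic)]
    raise ValueError("mode must be 'include' or 'exclude'")
```

For example, with the terms `{"tissue": "embry*", "library": "LD"}`, "and" logic keeps only ESTs that match both terms, while "or" logic keeps any EST that matches at least one.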
Methods To make an EST, RNA is isolated from cells and reverse transcribed into cDNA. Typically, the cDNA is cloned into a plasmid vector and a read is taken from the 5' and/or 3' primer. For most — but not all — ESTs, the reverse transcription is primed by an oligo-dT, which hybridizes with the poly-A tail of mature mRNA. The reverse transcriptase may or may not make it to the 5' end of the mRNA, which may or may not be degraded. In general, the 3' ESTs mark the end of transcription reasonably well, but the 5' ESTs may end at any point within the transcript. Some of the newer cap-selected libraries cover transcription start reasonably well. Before the cap-selection techniques emerged, some projects used random rather than poly-A priming in an attempt to retrieve sequence distant from the 3' end. These projects were successful at this, but as a side effect also deposited sequences from unprocessed mRNA and perhaps even genomic sequences into the EST databases. Even outside of the random-primed projects, there is a degree of non-mRNA contamination. Because of this, a single unspliced EST should be viewed with considerable skepticism. To generate this track, D. melanogaster ESTs from GenBank were aligned against the genome using blat. Note that the maximum intron length allowed by blat is 750,000 bases, which may eliminate some ESTs with very long introns that might otherwise align. When a single EST aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence are displayed in this track. Credits This track was produced at UCSC from EST sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. 
BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 mrna D. melanogaster mRNAs D. melanogaster mRNAs from GenBank mRNA and EST Description The mRNA track shows alignments between D. melanogaster mRNAs in GenBank and the genome. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. In dense display mode, the items that are more darkly shaded indicate matches of better quality. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. To use the filter: Type a term in one or more of the text boxes to filter the mRNA display. For example, to apply the filter to all mRNAs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only mRNAs that match all filter criteria will be highlighted. If "or" is selected, mRNAs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display mRNAs that match the filter criteria. If "include" is selected, the browser will display only those mRNAs that match the filter criteria. 
This track may also be configured to display codon coloring, a feature that allows the user to quickly compare mRNAs against the genomic sequence. For more information about this option, click here. Methods GenBank D. melanogaster mRNAs were aligned against the genome using the blat program. When a single mRNA aligned in multiple places, the alignment having the highest base identity was found. Only alignments having a base identity level within 0.5% of the best and at least 96% base identity with the genomic sequence were kept. Credits The mRNA track was produced at UCSC from mRNA sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 gap Gap Gap Locations Mapping and Sequencing Description This track depicts gaps — represented by black boxes — in the D. melanogaster genome sequence. An assembly region is designated as a gap if the sequence contains a series of Ns. The minimum number of Ns that constitute a gap varies among assemblies. gcPercent GC Percent Percentage GC in 20,000-Base Windows Mapping and Sequencing Description The GC percent track shows the percentage of G (guanine) and C (cytosine) bases in a 20,000 base window. Windows with high GC content are drawn more darkly than windows with low GC content. High GC content is typically associated with gene-rich areas. Credits This track was generated at UCSC. geneid Geneid Genes Geneid Gene Predictions Genes and Gene Predictions Description This track shows gene predictions from the geneid program developed by Roderic Guigó's Computational Biology of RNA Processing group which is part of the Centre de Regulació Genòmica (CRG) in Barcelona, Catalunya, Spain. 
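As a brief aside on the GC Percent track described earlier in this section, the per-window calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the UCSC implementation: the function name `gc_percent_windows` is hypothetical, windows are assumed to be consecutive and non-overlapping, and N bases are assumed to be excluded from the denominator. The source states only that the percentage of G and C bases is reported per 20,000-base window.

```python
def gc_percent_windows(seq, window=20000):
    """Yield (start, end, percent_gc) for consecutive windows of seq.

    Assumptions: windows do not overlap, and N bases are left out of
    the denominator so that gaps do not dilute the percentage.
    """
    seq = seq.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        called = len(chunk) - chunk.count("N")
        pct = 100.0 * gc / called if called else 0.0
        yield start, start + len(chunk), pct
```

Coordinates in the sketch are zero-based and half-open, which matches common BED-style usage but is a convention chosen here rather than one taken from the track description.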
Methods Geneid is a gene prediction program for anonymous genomic sequences, designed with a hierarchical structure. In the first step, splice sites, start and stop codons are predicted and scored along the sequence using Position Weight Arrays (PWAs). Next, exons are built from the sites. Exons are scored as the sum of the scores of the defining sites, plus the log-likelihood ratio of a Markov Model for coding DNA. Finally, from the set of predicted exons, the gene structure is assembled, maximizing the sum of the scores of the assembled exons. Credits Thanks to Computational Biology of RNA Processing for providing these data. References Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinformatics. 2007 Jun;Chapter 4:Unit 4.3. PMID: 18428791 Parra G, Blanco E, Guigó R. GeneID in Drosophila. Genome Res. 2000 Apr;10(4):511-5. PMID: 10779490; PMC: PMC310871 genscan Genscan Genes Genscan Gene Predictions Genes and Gene Predictions Description This track shows predictions from the Genscan program written by Chris Burge. The predictions are based on transcriptional, translational and donor/acceptor splicing signals as well as the length and compositional distributions of exons, introns and intergenic regions. For more information on the different gene tracks, see our Genes FAQ. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. The track description page offers the following filter and configuration options: Color track by codons: Select the genomic codons option to color and label each codon in a zoomed-in display to facilitate validation and comparison of gene predictions. Go to the Coloring Gene Predictions and Annotations by Codon page for more information about this feature. Methods For a description of the Genscan program and the model that underlies it, refer to Burge and Karlin (1997) in the References section below. 
The splice site models used are described in more detail in Burge (1998) below. Credits Thanks to Chris Burge for providing the Genscan program. References Burge C. Modeling Dependencies in Pre-mRNA Splicing Signals. In: Salzberg S, Searls D, Kasif S, editors. Computational Methods in Molecular Biology. Amsterdam: Elsevier Science; 1998. p. 127-163. Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 1997 Apr 25;268(1):78-94. PMID: 9149143 blastHg18KG Human Proteins Human Proteins Mapped by Chained tBLASTn Genes and Gene Predictions Description This track contains tBLASTn alignments of the peptides from the predicted and known genes identified in the hg18 UCSC Genes track. Methods First, the predicted proteins from the human UCSC Genes track were aligned with the human genome using the Blat program to discover exon boundaries. Next, the amino acid sequences that make up each exon were aligned with the D. melanogaster sequence using the tBLASTn program. Finally, the putative D. melanogaster exons were chained together using an organism-specific maximum gap size but no gap penalty. The single best exon chains extending over more than 60% of the query protein were included. Exon chains that extended over 60% of the query and matched at least 60% of the protein's amino acids were also included. Credits tBLASTn is part of the NCBI BLAST tool set. For more information on BLAST, see Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990 Oct 5;215(3):403-410. Blat was written by Jim Kent. The remaining utilities used to produce this track were written by Jim Kent or Brian Raney. microsat Microsatellite Microsatellites - Di-nucleotide and Tri-nucleotide Repeats Variation and Repeats Description This track displays regions that are likely to be useful as microsatellite markers. 
These are sequences of at least 15 perfect di-nucleotide and tri-nucleotide repeats and tend to be highly polymorphic in the population. Methods The data shown in this track are a subset of the Simple Repeats track, selecting only those repeats of period 2 and 3, with 100% identity and no indels and with at least 15 copies of the repeat. The Simple Repeats track is created using the Tandem Repeats Finder. For more information about this program, see Benson (1999). Credits Tandem Repeats Finder was written by Gary Benson. References Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999 Jan 15;27(2):573-80. PMID: 9862982; PMC: PMC148217 nscanGene N-SCAN N-SCAN Gene Predictions Genes and Gene Predictions Description This track shows gene predictions using the N-SCAN gene structure prediction software provided by the Computational Genomics Lab at Washington University in St. Louis, MO, USA. Methods N-SCAN combines biological-signal modeling in the target genome sequence along with information from a multiple-genome alignment to generate de novo gene predictions. It extends the TWINSCAN target-informant genome pair to allow for an arbitrary number of informant sequences as well as richer models of sequence evolution. N-SCAN models the phylogenetic relationships between the aligned genome sequences, context-dependent substitution rates, insertions, and deletions. Drosophila melanogaster N-SCAN uses Drosophila yakuba (droYak1), Drosophila pseudoobscura (dp2), and Anopheles gambiae (anoGam1) as informants. Credits Thanks to Michael Brent's Computational Genomics Group at Washington University St. Louis for providing this data. Special thanks for this implementation of N-SCAN to Aaron Tenney in the Brent lab, and Robert Zimmermann, currently at Max F. Perutz Laboratories in Vienna, Austria. References Gross SS, Brent MR. Using multiple alignments to improve gene prediction. J Comput Biol. 2006 Mar;13(2):379-93. 
PMID: 16597247 Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003 Oct 1;31(19):5654-66. PMID: 14500829; PMC: PMC206470 Korf I, Flicek P, Duan D, Brent MR. Integrating genomic homology into gene structure prediction. Bioinformatics. 2001;17 Suppl 1:S140-8. PMID: 11473003 van Baren MJ, Brent MR. Iterative gene prediction and pseudogene removal improves genome annotation. Genome Res. 2006 May;16(5):678-85. PMID: 16651666; PMC: PMC1457044 oreganno ORegAnno Regulatory elements from ORegAnno Expression and Regulation Description This track displays literature-curated regulatory regions, transcription factor binding sites, and regulatory polymorphisms from ORegAnno (Open Regulatory Annotation). For more detailed information on a particular regulatory element, follow the link to ORegAnno from the details page. Display Conventions and Configuration The display may be filtered to show only selected region types, such as: regulatory regions (shown in light blue), regulatory polymorphisms (shown in dark blue), transcription factor binding sites (shown in orange), regulatory haplotypes (shown in red), and miRNA binding sites (shown in blue-green). To exclude a region type, uncheck the appropriate box in the list at the top of the Track Settings page. Methods An ORegAnno record describes an experimentally proven and published regulatory region (promoter, enhancer, etc.), transcription factor binding site, or regulatory polymorphism. Each annotation must have the following attributes: A stable ORegAnno identifier. A valid taxonomy ID from the NCBI taxonomy database. A valid PubMed reference. A target gene that is either user-defined, in Entrez Gene or in EnsEMBL. 
A sequence with at least 40 flanking bases (preferably more) to allow the site to be mapped to any release of an associated genome. At least one piece of specific experimental evidence, including the biological technique used to discover the regulatory sequence. (Currently only the evidence subtypes are supplied with the UCSC track.) A positive, neutral or negative outcome based on the experimental results from the primary reference. (Only records with a positive outcome are currently included in the UCSC track.) The following attributes are optionally included: A transcription factor that is either user-defined, in Entrez Gene or in EnsEMBL. A specific cell type for each piece of experimental evidence, using the eVOC cell type ontology. A specific dataset identifier (e.g. the REDfly dataset) that allows external curators to manage particular annotation sets using ORegAnno's curation tools. A "search space" sequence that specifies the region that was assayed, not just the regulatory sequence. A dbSNP identifier and type of variant (germline, somatic or artificial) for regulatory polymorphisms. Records are periodically mapped to the coordinates of current genome builds by BLAST sequence alignment. The information provided in this track represents an abbreviated summary of the details for each ORegAnno record. Please visit the official ORegAnno entry (by clicking on the ORegAnno link on the details page of a specific regulatory element) for complete details such as evidence descriptions, comments, validation score history, etc. Credits ORegAnno core team and principal contacts: Stephen Montgomery, Obi Griffith, and Steven Jones from Canada's Michael Smith Genome Sciences Centre, Vancouver, British Columbia, Canada. The ORegAnno community (please see individual citations for various features): ORegAnno Citation. References Lesurf R, Cotto KC, Wang G, Griffith M, Kasaian K, Jones SJ, Montgomery SB, Griffith OL, Open Regulatory Annotation Consortium. 
ORegAnno 3.0: a community-driven resource for curated regulatory annotation. Nucleic Acids Res. 2016 Jan 4;44(D1):D126-32. PMID: 26578589; PMC: PMC4702855 Griffith OL, Montgomery SB, Bernier B, Chu B, Kasaian K, Aerts S, Mahony S, Sleumer MC, Bilenky M, Haeussler M et al. ORegAnno: an open-access community-driven resource for regulatory annotation. Nucleic Acids Res. 2008 Jan;36(Database issue):D107-13. PMID: 18006570; PMC: PMC2239002 Montgomery SB, Griffith OL, Sleumer MC, Bergman CM, Bilenky M, Pleasance ED, Prychyna Y, Zhang X, Jones SJ. ORegAnno: an open access database and curation system for literature-derived promoters, transcription factor binding sites and regulatory variation. Bioinformatics. 2006 Mar 1;22(5):637-40. PMID: 16397004 xenoMrna Other mRNAs Non-D. melanogaster mRNAs from GenBank mRNA and EST Description This track displays translated blat alignments of vertebrate and invertebrate mRNA in GenBank from organisms other than D. melanogaster. Display Conventions and Configuration This track follows the display conventions for PSL alignment tracks. In dense display mode, the items that are more darkly shaded indicate matches of better quality. The strand information (+/-) for this track is in two parts. The first + indicates the orientation of the query sequence whose translated protein produced the match (here always 5' to 3', hence +). The second + or - indicates the orientation of the matching translated genomic sequence. Because the two orientations of a DNA sequence give different predicted protein sequences, there are four combinations. ++ is not the same as --, nor is +- the same as -+. The description page for this track has a filter that can be used to change the display mode, alter the color, and include/exclude a subset of items within the track. This may be helpful when many items are shown in the track display, especially when only some are relevant to the current task. 
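The two-part strand notation described above can be made concrete with a small sketch. The function names here are hypothetical helpers written for illustration; they are not part of blat or of the browser software.

```python
def describe_strand(strand):
    """Split a two-character strand string such as '+-' into its parts.

    The first character is the orientation of the query sequence and the
    second is the orientation of the matching translated genomic sequence.
    """
    if len(strand) != 2 or any(c not in "+-" for c in strand):
        raise ValueError("expected two characters drawn from '+' and '-'")
    query, genome = strand
    return {"query": query, "genome": genome}


def all_strand_combinations():
    """Enumerate the four distinct query/genome orientation pairs."""
    return [q + g for q in "+-" for g in "+-"]
```

Because each pair is kept as an ordered (query, genome) tuple, ++ and -- remain distinct values, as do +- and -+.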
To use the filter: Type a term in one or more of the text boxes to filter the mRNA display. For example, to apply the filter to all mRNAs expressed in a specific organ, type the name of the organ in the tissue box. To view the list of valid terms for each text box, consult the table in the Table Browser that corresponds to the factor on which you wish to filter. For example, the "tissue" table contains all the types of tissues that can be entered into the tissue text box. Wildcards may also be used in the filter. If filtering on more than one value, choose the desired combination logic. If "and" is selected, only mRNAs that match all filter criteria will be highlighted. If "or" is selected, mRNAs that match any one of the filter criteria will be highlighted. Choose the color or display characteristic that should be used to highlight or include/exclude the filtered items. If "exclude" is chosen, the browser will not display mRNAs that match the filter criteria. If "include" is selected, the browser will display only those mRNAs that match the filter criteria. This track may also be configured to display codon coloring, a feature that allows the user to quickly compare mRNAs against the genomic sequence. For more information about this option, click here. Methods The mRNAs were aligned against the D. melanogaster genome using translated blat. When a single mRNA aligned in multiple places, the alignment having the highest base identity was found. Only those alignments having a base identity level within 1% of the best and at least 25% base identity with the genomic sequence were kept. Credits The mRNA track was produced at UCSC from mRNA sequence data submitted to the international public sequence databases by scientists worldwide. References Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779 Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 
2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 xenoRefGene Other RefSeq Non-D. melanogaster RefSeq Genes Genes and Gene Predictions Description This track shows known protein-coding and non-protein-coding genes for organisms other than D. melanogaster, taken from the NCBI RNA reference sequences collection (RefSeq). The data underlying this track are updated weekly. Display Conventions and Configuration This track follows the display conventions for gene prediction tracks. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), reviewed (dark). The item labels and display colors of features within this track can be configured through the controls at the top of the track description page. Label: By default, items are labeled by gene name. Click the appropriate Label option to display the accession name instead of the gene name, show both the gene and accession names, or turn off the label completely. Codon coloring: This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. Click here for more information about this feature. Hide non-coding genes: By default, both the protein-coding and non-protein-coding genes are displayed. If you wish to see only the coding genes, click this box. Methods The RNAs were aligned against the D. melanogaster genome using blat; those with an alignment of less than 15% were discarded. When a single RNA aligned in multiple places, the alignment having the highest base identity was identified. Only alignments having a base identity level within 0.5% of the best and at least 25% base identity with the genomic sequence were kept. Credits This track was produced at UCSC from RNA sequence data generated by scientists worldwide and curated by the NCBI RefSeq project. References Kent WJ. 
BLAT--the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518 picTar PicTar miRNA MicroRNA target sites in 3' UTRs as predicted by PicTar Expression and Regulation Description This track shows microRNA target sites in 3' UTRs as predicted by PicTar, based on the full-length cDNAs provided by FlyBase. Methods The original PicTar algorithm was published in Krek et al., 2005. The annotations displayed in this track are updated predictions as published in Lall et al., 2006. PicTar is a hidden Markov model that assigns probabilities to 3' UTR subsequences as binding sites for a microRNA, considers all possible ways the 3' UTR could be bound by microRNAs, and then uses a maximum likelihood method to compute the optimal likelihood under which the 3' UTR could be explained by microRNAs and background. The score is this likelihood divided by background, i.e., the local base composition of each 3' UTR is taken into account. To fit the track conventions of the UCSC browser (integers), all scores were scaled by the maximum score of all microRNA 3'-UTR scores observed. Note that the PicTar algorithm scores any 3' UTR that has at least one aligned conserved predicted binding site for a microRNA, but then incorporates all possible binding sites into the score, even if they appear to be non-conserved. Because the score for a 3' UTR is a "phylo" average over all orthologous 3' UTRs used, "scattered" sites that appear in many species may boost the score, and individual sites shown in the display may not be aligned and conserved in all species under consideration. Two levels of conservation can be chosen: -- conservation among four Drosophila species: melanogaster, yakuba, ananassae, and pseudoobscura (high sensitivity settings) -- conservation among six Drosophila species: melanogaster, yakuba, ananassae, pseudoobscura, mojavensis, and virilis The latter settings have improved quality, but lower sensitivity. 
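The scaling of raw scores to integers mentioned in the Methods can be sketched roughly as follows. The function name and the 0 to 1000 target range are assumptions made for illustration; the description only states that scores were scaled by the maximum observed score.

```python
def scale_scores(raw_scores, top=1000):
    """Scale raw likelihood-ratio scores to integers relative to the maximum.

    `top` is an assumed upper bound for the integer score field; it is not
    taken from the PicTar publications.
    """
    max_score = max(raw_scores)
    if max_score <= 0:
        raise ValueError("maximum score must be positive")
    return [round(top * s / max_score) for s in raw_scores]
```

For example, `scale_scores([2.0, 1.0, 0.5])` maps the best-scoring site to 1000 and the others proportionally.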
For a detailed analysis of signal-to-noise ratios and sensitivity, please refer to Grün et al. and Lall et al. Credits Thanks to Dominic Grün, Yi-Lu Wang, and Nikolaus Rajewsky for providing this annotation. More detailed information about individual predictions, including links to other databases, can be found on the PicTar website, a project of the Rajewsky lab while at the New York University Center for Comparative Functional Genomics. References Grün D, Wang YL, Langenberger D, Gunsalus KC, Rajewsky N. microRNA target predictions across seven Drosophila species and comparison to mammalian targets. PLoS Comput Biol. 2005 Jun;1(1):e13. PMID: 16103902; PMC: PMC1183519 Krek A, Grün D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, da Piedade I, Gunsalus KC, Stoffel M et al. Combinatorial microRNA target predictions. Nat Genet. 2005 May;37(5):495-500. PMID: 15806104 Lall S, Grün D, Krek A, Chen K, Wang YL, Dewey CN, Sood P, Colombo T, Bray N, Macmenamin P et al. A genome-wide map of conserved microRNA targets in C. elegans. Curr Biol. 2006 Mar 7;16(5):460-71. PMID: 16458514 picTarMiRNAS3 PicTar microRNA MicroRNA target sites as predicted by PicTar, high specificity Expression and Regulation picTarMiRNAS1 PicTar microRNA MicroRNA target sites as predicted by PicTar, high sensitivity Expression and Regulation simpleRepeat Simple Repeats Simple Tandem Repeats by TRF Variation and Repeats Description This track displays simple tandem repeats (possibly imperfect repeats) located by Tandem Repeats Finder (TRF), which is specialized for this purpose. These repeats can occur within coding regions of genes and may be quite polymorphic. Repeat expansions are sometimes associated with specific diseases. Methods For more information about the TRF program, see Benson (1999). Credits TRF was written by Gary Benson. References Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999 Jan 15;27(2):573-80. 
PMID: 9862982; PMC: PMC148217 cons15way Conservation (15) 12 Flies, Mosquito, Honeybee, Beetle Multiz Alignments & phastCons Scores Comparative Genomics Description This track shows a measure of evolutionary conservation in twelve Drosophila species, mosquito, honeybee and red flour beetle, based on a phylogenetic hidden Markov model (phastCons). Multiz alignments of the following assemblies were used to generate this annotation: D. melanogaster Apr. 2004 (BDGP R4/dm2) (dm2) D. simulans Apr. 2005 (droSim1) D. sechellia Oct. 2005 (droSec1) D. yakuba Nov. 2005 (droYak2) D. erecta Feb. 2006 (droEre2) D. ananassae Feb. 2006 (droAna3) D. pseudoobscura Feb. 2006 (dp4) D. persimilis Oct. 2005 (droPer1) D. willistoni Feb. 2006 (droWil1) D. virilis Feb. 2006 (droVir3) D. mojavensis Feb. 2006 (droMoj3) D. grimshawi Feb. 2006 (droGri2) A. gambiae Feb. 2003 (anoGam1) A. mellifera Jan. 2005 (apiMel2) T. castaneum Sep. 2005 (triCas2) Display Conventions and Configuration In full and pack display modes, conservation scores are displayed as a "wiggle" (histogram), where the height reflects the size of the score. Pairwise alignments of each species to the D. melanogaster genome are displayed below as a grayscale density plot (in pack mode) or as a "wiggle" (in full mode) that indicates alignment quality. In dense display mode, conservation is shown in grayscale using darker values to indicate higher levels of overall conservation as scored by phastCons. The conservation wiggle can be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options. Checkboxes in the track configuration section allow excluding species from the pairwise display; however, this does not remove them from the conservation score display. To view detailed information about the alignments at a specific position, zoom in the display to 30,000 or fewer bases, then click on the alignment. 
Gap Annotation The "Display chains between alignments" configuration option enables display of gaps between alignment blocks in the pairwise alignments in a manner similar to the Chain track display. The following conventions are used: Single line: No bases in the aligned species. Possibly due to a lineage-specific insertion between the aligned blocks in the D. melanogaster genome or a lineage-specific deletion between the aligned blocks in the aligning species. Double line: Aligning species has one or more unalignable bases in the gap region. Possibly due to excessive evolutionary distance between species or independent indels in the region between the aligned blocks in both species. Pale yellow coloring: Aligning species has Ns in the gap region. Reflects uncertainty in the relationship between the DNA of both species, due to lack of sequence in relevant portions of the aligning species. Genomic Breaks Discontinuities in the genomic context (chromosome, scaffold or region) of the aligned DNA in the aligning species are shown as follows: Vertical blue bar: Represents a discontinuity that persists indefinitely on either side, e.g. a large region of DNA on either side of the bar comes from a different chromosome in the aligned species due to a large scale rearrangement. Green square brackets: Enclose shorter alignments consisting of DNA from one genomic context in the aligned species nested inside a larger chain of alignments from a different genomic context. The alignment within the brackets may represent a short misalignment, a lineage-specific insertion of a transposon in the D. melanogaster genome that aligns to a paralogous copy somewhere else in the aligned species, or other similar occurrence. Base Level When zoomed-in to the base-level display, the track shows the base composition of each alignment. The numbers and symbols on the Gaps line indicate the lengths of gaps in the D. melanogaster sequence at those alignment positions relative to the longest non-D. 
melanogaster sequence. If there is sufficient space in the display, the size of the gap is shown; if not, and if the gap size is a multiple of 3, a "*" is displayed, otherwise "+" is shown. Codon translation is available in base-level display mode if the displayed region is identified as a coding segment. To display this annotation, select the species for translation from the pull-down menu in the Codon Translation configuration section at the top of the page. Then, select one of the following modes: No codon translation: The gene annotation is not used; the bases are displayed without translation. Use default species reading frames for translation: The annotations from the genome displayed in the Default species to establish reading frame pull-down menu are used to translate all the aligned species present in the alignment. Use reading frames for species if available, otherwise no translation: Codon translation is performed only for those species where the region is annotated as protein coding. Use reading frames for species if available, otherwise use default species: Codon translation is done on those species that are annotated as being protein coding over the aligned region using species-specific annotation; the remaining species are translated using the default species annotation. Codon translation uses the following gene tracks as the basis for translation, depending on the species chosen:
Gene Track | Species
FlyBase Genes | D. melanogaster
mRNAs | D. simulans, D. yakuba, A. gambiae, A. mellifera
not translated | All other species
Methods Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. 
The pairwise alignments were then multiply aligned using the multiz program, according to this topology: (((((((((dm2 (droSim1 droSec1)) (droYak2 droEre2)) droAna3) (dp4 droPer1)) droWil1) ((droVir3 droMoj3) droGri2)) anoGam1) apiMel2) triCas2) The resulting multiple alignments were then assigned conservation scores by phastCons. The phastCons program computes conservation scores based on a phylo-HMM, a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for conserved regions and a state for non-conserved regions. The value plotted at each site is the posterior probability that the corresponding alignment column was "generated" by the conserved state of the phylo-HMM. These scores reflect the phylogeny (including branch lengths) of the species in question, a continuous-time Markov model of the nucleotide substitution process, and a tendency for conservation levels to be autocorrelated along the genome (i.e., to be similar at adjacent sites). The general reversible (REV) substitution model was used. Note that, unlike many conservation-scoring programs, phastCons does not rely on a sliding window of fixed size, so short highly-conserved regions and long moderately conserved regions can both obtain high scores. More information about phastCons can be found in Siepel et al. (2005). PhastCons currently treats alignment gaps as missing data, which sometimes has the effect of producing undesirably high conservation scores in gappy regions of the alignment. We are looking at several possible ways of improving the handling of alignment gaps. Credits This track was created at UCSC using the following programs: Blastz and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group. 
AxtBest, axtChain, chainNet, netSyntenic, and netClass by Jim Kent at UCSC. PhastCons by Adam Siepel at Cornell University. Conservation track display by Hiram Clawson ("wiggle" display), Brian Raney (gap annotation and codon framing) and Kate Rosenbloom, codon frame software by Mark Diekhans at UCSC. The phylogenetic tree is based on Adam Siepel's phyloFit program using a topology from Teri Markow. References Phylo-HMMs and phastCons: Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996 Jan;13(1):93-104. PMID: 8583911 Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216 Siepel A, Haussler D. Phylogenetic Hidden Markov Models. In: Nielsen R, editor. Statistical Methods in Molecular Evolution. New York: Springer; 2005. pp. 325-351. Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995 Feb;139(2):993-1005. PMID: 7713447; PMC: PMC1206396 Chain/Net: Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Multiz: Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317 Blastz: Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. 
PMID: 12529312; PMC: PMC430961 cons15wayViewalign Multiz Alignments 12 Flies, Mosquito, Honeybee, Beetle Multiz Alignments & phastCons Scores Comparative Genomics multiz15way Multiz Align 12 Flies, Mosquito, Honeybee, Beetle Multiz Alignments & phastCons Scores Comparative Genomics Description This track shows a measure of evolutionary conservation in twelve Drosophila species, mosquito, honeybee and red flour beetle, based on a phylogenetic hidden Markov model (phastCons). Multiz alignments of the following assemblies were used to generate this annotation: D. melanogaster Apr. 2004 (BDGP R4/dm2) (dm2) D. simulans Apr. 2005 (droSim1) D. sechellia Oct. 2005 (droSec1) D. yakuba Nov. 2005 (droYak2) D. erecta Feb. 2006 (droEre2) D. ananassae Feb. 2006 (droAna3) D. pseudoobscura Feb. 2006 (dp4) D. persimilis Oct. 2005 (droPer1) D. willistoni Feb. 2006 (droWil1) D. virilis Feb. 2006 (droVir3) D. mojavensis Feb. 2006 (droMoj3) D. grimshawi Feb. 2006 (droGri2) A. gambiae Feb. 2003 (anoGam1) A. mellifera Jan. 2005 (apiMel2) T. castaneum Sep. 2005 (triCas2) Display Conventions and Configuration In full and pack display modes, conservation scores are displayed as a "wiggle" (histogram), where the height reflects the size of the score. Pairwise alignments of each species to the D. melanogaster genome are displayed below as a grayscale density plot (in pack mode) or as a "wiggle" (in full mode) that indicates alignment quality. In dense display mode, conservation is shown in grayscale using darker values to indicate higher levels of overall conservation as scored by phastCons. The conservation wiggle can be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options. Checkboxes in the track configuration section allow excluding species from the pairwise display; however, this does not remove them from the conservation score display. 
To view detailed information about the alignments at a specific position, zoom in the display to 30,000 or fewer bases, then click on the alignment. Gap Annotation The "Display chains between alignments" configuration option enables display of gaps between alignment blocks in the pairwise alignments in a manner similar to the Chain track display. The following conventions are used: Single line: No bases in the aligned species. Possibly due to a lineage-specific insertion between the aligned blocks in the D. melanogaster genome or a lineage-specific deletion between the aligned blocks in the aligning species. Double line: Aligning species has one or more unalignable bases in the gap region. Possibly due to excessive evolutionary distance between species or independent indels in the region between the aligned blocks in both species. Pale yellow coloring: Aligning species has Ns in the gap region. Reflects uncertainty in the relationship between the DNA of both species, due to lack of sequence in relevant portions of the aligning species. Genomic Breaks Discontinuities in the genomic context (chromosome, scaffold or region) of the aligned DNA in the aligning species are shown as follows: Vertical blue bar: Represents a discontinuity that persists indefinitely on either side, e.g. a large region of DNA on either side of the bar comes from a different chromosome in the aligned species due to a large scale rearrangement. Green square brackets: Enclose shorter alignments consisting of DNA from one genomic context in the aligned species nested inside a larger chain of alignments from a different genomic context. The alignment within the brackets may represent a short misalignment, a lineage-specific insertion of a transposon in the D. melanogaster genome that aligns to a paralogous copy somewhere else in the aligned species, or other similar occurrence. Base Level When zoomed-in to the base-level display, the track shows the base composition of each alignment. 
The numbers and symbols on the Gaps line indicate the lengths of gaps in the D. melanogaster sequence at those alignment positions relative to the longest non-D. melanogaster sequence. If there is sufficient space in the display, the size of the gap is shown; if not, and if the gap size is a multiple of 3, a "*" is displayed, otherwise "+" is shown. Codon translation is available in base-level display mode if the displayed region is identified as a coding segment. To display this annotation, select the species for translation from the pull-down menu in the Codon Translation configuration section at the top of the page. Then, select one of the following modes: No codon translation: The gene annotation is not used; the bases are displayed without translation. Use default species reading frames for translation: The annotations from the genome displayed in the Default species to establish reading frame pull-down menu are used to translate all the aligned species present in the alignment. Use reading frames for species if available, otherwise no translation: Codon translation is performed only for those species where the region is annotated as protein coding. Use reading frames for species if available, otherwise use default species: Codon translation is done on those species that are annotated as being protein coding over the aligned region using species-specific annotation; the remaining species are translated using the default species annotation. Codon translation uses the following gene tracks as the basis for translation, depending on the species chosen:
Gene Track | Species
FlyBase Genes | D. melanogaster
mRNAs | D. simulans, D. yakuba, A. gambiae, A. mellifera
not translated | All other species
Methods Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. 
The pairwise alignments were then multiply aligned using the multiz program, according to this topology: (((((((((dm2 (droSim1 droSec1)) (droYak2 droEre2)) droAna3) (dp4 droPer1)) droWil1) ((droVir3 droMoj3) droGri2)) anoGam1) apiMel2) triCas2) The resulting multiple alignments were then assigned conservation scores by phastCons. The phastCons program computes conservation scores based on a phylo-HMM, a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next (Felsenstein and Churchill 1996, Yang 1995, Siepel and Haussler 2005). PhastCons uses a two-state phylo-HMM, with a state for conserved regions and a state for non-conserved regions. The value plotted at each site is the posterior probability that the corresponding alignment column was "generated" by the conserved state of the phylo-HMM. These scores reflect the phylogeny (including branch lengths) of the species in question, a continuous-time Markov model of the nucleotide substitution process, and a tendency for conservation levels to be autocorrelated along the genome (i.e., to be similar at adjacent sites). The general reversible (REV) substitution model was used. Note that, unlike many conservation-scoring programs, phastCons does not rely on a sliding window of fixed size, so short highly-conserved regions and long moderately conserved regions can both obtain high scores. More information about phastCons can be found in Siepel et al. (2005). PhastCons currently treats alignment gaps as missing data, which sometimes has the effect of producing undesirably high conservation scores in gappy regions of the alignment. We are looking at several possible ways of improving the handling of alignment gaps. Credits This track was created at UCSC using the following programs: Blastz and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group. 
AxtBest, axtChain, chainNet, netSyntenic, and netClass by Jim Kent at UCSC. PhastCons by Adam Siepel at Cornell University. Conservation track display by Hiram Clawson ("wiggle" display), Brian Raney (gap annotation and codon framing) and Kate Rosenbloom, codon frame software by Mark Diekhans at UCSC. The phylogenetic tree is based on Adam Siepel's phyloFit program using a topology from Teri Markow. References Phylo-HMMs and phastCons: Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996 Jan;13(1):93-104. PMID: 8583911 Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216 Siepel A, Haussler D. Phylogenetic Hidden Markov Models. In: Nielsen R, editor. Statistical Methods in Molecular Evolution. New York: Springer; 2005. pp. 325-351. Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995 Feb;139(2):993-1005. PMID: 7713447; PMC: PMC1206396 Chain/Net: Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Multiz: Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317 Blastz: Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. 
PMID: 12529312; PMC: PMC430961 cons15wayViewphastcons Element Conservation (phastCons) 12 Flies, Mosquito, Honeybee, Beetle Multiz Alignments & phastCons Scores Comparative Genomics phastCons15way 15 Insect Cons 15 Insect Conservation by PhastCons Comparative Genomics cons15wayViewelements Conserved Elements 12 Flies, Mosquito, Honeybee, Beetle Multiz Alignments & phastCons Scores Comparative Genomics phastConsElements15way 15 Insect El PhastCons Conserved Elements (12 Flies, Mosquito, Honeybee, Beetle) Comparative Genomics Description This track shows predictions of conserved elements produced by the phastCons program. PhastCons is part of the PHAST (PHylogenetic Analysis with Space/Time models) package. The predictions are based on a phylogenetic hidden Markov model (phylo-HMM), a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next. Methods Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. A multiple alignment was then constructed from these pairwise alignments using multiz. Predictions of conserved elements were then obtained by running phastCons on the multiple alignments with the --most-conserved option. PhastCons constructs a two-state phylo-HMM with a state for conserved regions and a state for non-conserved regions. The two states share a single phylogenetic model, except that the branch lengths of the tree associated with the conserved state are multiplied by a constant scaling factor rho (0 <= rho <= 1). The free parameters of the phylo-HMM, including the scaling factor rho, are estimated from the data by maximum likelihood using an EM algorithm. This procedure is subject to certain constraints on the "coverage" of the genome by conserved elements and the "smoothness" of the conservation scores. Details can be found in Siepel et al. (2005). 
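To illustrate what a two-state model with a conserved and a non-conserved state computes, the following is a minimal forward-backward sketch. It is not the phastCons implementation: the per-column likelihoods are assumed to be supplied by the caller (phastCons derives them from phylogenetic models), and the function name, transition probabilities, and prior are hypothetical values chosen for illustration.

```python
def conserved_posteriors(lik_cons, lik_non,
                         p_stay_cons=0.9, p_stay_non=0.9, prior_cons=0.5):
    """Posterior probability of the conserved state at each alignment column.

    lik_cons[i] and lik_non[i] are the (assumed, precomputed) likelihoods of
    column i under the conserved and non-conserved models.
    """
    n = len(lik_cons)
    if n == 0 or n != len(lik_non):
        raise ValueError("likelihood lists must be non-empty and equal length")
    emit = list(zip(lik_cons, lik_non))
    # trans[j][k]: probability of moving from state j to state k
    # (state 0 = conserved, state 1 = non-conserved).
    trans = [[p_stay_cons, 1.0 - p_stay_cons],
             [1.0 - p_stay_non, p_stay_non]]

    # Forward pass, normalized at every column to avoid underflow.
    prev = [prior_cons * emit[0][0], (1.0 - prior_cons) * emit[0][1]]
    total = sum(prev)
    prev = [x / total for x in prev]
    fwd = [prev]
    for i in range(1, n):
        cur = [emit[i][k] * sum(prev[j] * trans[j][k] for j in range(2))
               for k in range(2)]
        total = sum(cur)
        prev = [x / total for x in cur]
        fwd.append(prev)

    # Backward pass, also normalized per column.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        nxt = [sum(trans[j][k] * emit[i + 1][k] * bwd[i + 1][k]
                   for k in range(2)) for j in range(2)]
        total = sum(nxt)
        bwd[i] = [x / total for x in nxt]

    # Combine and renormalize to get the conserved-state posterior.
    posteriors = []
    for f, b in zip(fwd, bwd):
        cons = f[0] * b[0]
        non = f[1] * b[1]
        posteriors.append(cons / (cons + non))
    return posteriors
```

Because neighboring columns influence each other through the transition probabilities, a column's posterior depends on its context rather than on a fixed-size window.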
The predicted conserved elements are segments of the alignment that are likely to have been "generated" by the conserved state of the phylo-HMM. Each element is assigned a log-odds score equal to its log probability under the conserved model minus its log probability under the non-conserved model. The "score" field associated with this track contains transformed log-odds scores, taking values between 0 and 1000. (The scores are transformed using a monotonic function of the form a * log(x) + b.) The raw log odds scores are retained in the "name" field and can be seen on the details page or in the browser when the track's display mode is set to "pack" or "full". Credits This track was created at UCSC using the following programs: Blastz and multiz by Minmei Hou, Scott Schwartz, and Webb Miller of the Penn State Bioinformatics Group. AxtBest, axtChain, chainNet, netSyntenic, and netClass by Jim Kent at UCSC. PhastCons by Adam Siepel at Cornell University. References PhastCons: Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50. PMID: 16024819; PMC: PMC1182216 Chain/Net: Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Multiz: Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15. PMID: 15060014; PMC: PMC383317 Blastz: Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. 
Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainApiMel2 A. mellifera Chain A. mellifera (Jan. 2005 (Baylor 2.0/apiMel2)) Chained Alignments Comparative Genomics Description This track shows alignments of A. mellifera (apiMel2, Jan. 2005 (Baylor 2.0/apiMel2)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both A. mellifera and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the A. mellifera assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The A. mellifera/D.
melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single A. mellifera chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:

       A     C     G     T
A     91   -90   -25  -100
C    -90   100  -100   -25
G    -25  -100   100   -90
T   -100   -25   -90    91

Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netApiMel2 A. mellifera Net A. mellifera (Jan. 2005 (Baylor 2.0/apiMel2)) Alignment Net Comparative Genomics Description This track shows the best A. mellifera/D. melanogaster chain for every part of the D. melanogaster genome.
It is useful for finding orthologous regions and for studying genome rearrangement. The A. mellifera sequence used in this annotation is from the Jan. 2005 (Baylor 2.0/apiMel2) (apiMel2) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. 
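The four item types listed under Display Conventions and Configuration amount to a small decision rule, sketched below. The function name, its arguments, and the idea of comparing each item with the item one level above it are assumptions made for illustration; they do not describe netSyntenic's actual data structures.

```python
def classify_fill(level, chrom, strand, parent_chrom=None, parent_strand=None):
    """Label a net item as Top, Syn, Inv, or NonSyn.

    level         -- 1 for top-level chains, 2 and up for chains filling gaps
    chrom, strand -- aligning-organism chromosome and strand of this item
    parent_*      -- the same fields for the item one level above (hypothetical)
    """
    if level == 1:
        return "Top"      # best, longest match; displayed on level 1
    if chrom != parent_chrom:
        return "NonSyn"   # match to a different chromosome
    if strand != parent_strand:
        return "Inv"      # same chromosome, opposite orientation
    return "Syn"          # same chromosome, same orientation


print(classify_fill(1, "chr2L", "+"))
print(classify_fill(2, "chr2L", "-", "chr2L", "+"))
print(classify_fill(2, "chrX", "+", "chr2L", "+"))
```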
The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainAnoGam1 A. gambiae Chain A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Chained Alignments Comparative Genomics Description This track shows alignments of A. gambiae (anoGam1, Feb. 2003 (IAGEC MOZ2/anoGam1)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both A. gambiae and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The A. gambiae sequence is from the MOZ2 assembly. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the A. gambiae assembly or an insertion in the D. melanogaster assembly. 
Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The A. gambiae/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single A. gambiae chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:

       A     C     G     T
A     91   -90   -25  -100
C    -90   100  -100   -25
G    -25  -100   100   -90
T   -100   -25   -90    91

Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program.
The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netAnoGam1 A. gambiae Net A. gambiae (Feb. 2003 (IAGEC MOZ2/anoGam1)) Alignment Net Comparative Genomics Description This track shows the best A. gambiae/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The A. gambiae sequence used in this annotation is from the Feb. 2003 (IAGEC MOZ2/anoGam1) (anoGam1) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. 
Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. 
PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainDroVir1 D. virilis Chain D. virilis (July 2004 (Agencourt prelim/droVir1)) Chained Alignments Comparative Genomics Description This track shows alignments of D. virilis (droVir1, July 2004 (Agencourt prelim/droVir1)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. virilis and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the D. virilis assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. 
To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The D. virilis/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single D. virilis chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:

       A     C     G     T
A     91   -90   -25  -100
C    -90   100  -100   -25
G    -25  -100   100   -90
T   -100   -25   -90    91

Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDroVir1 D. virilis Net D.
virilis (July 2004 (Agencourt prelim/droVir1)) Alignment Net Comparative Genomics Description This track shows the best D. virilis/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. virilis sequence used in this annotation is from the July 2004 (Agencourt prelim/droVir1) (droVir1) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. 
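The placement step described above can be sketched as a greedy pass over chains sorted by score. This is a simplified, one-dimensional illustration and not the chainNet implementation: chains are reduced to a score plus a list of gapless blocks on the D. melanogaster coordinate axis, level bookkeeping is omitted, and all function names and example values are hypothetical.

```python
def uncovered_parts(start, end, covered):
    """Return the pieces of [start, end) that overlap no interval in covered.

    covered must hold non-overlapping (start, end) intervals.
    """
    parts = []
    pos = start
    for c_start, c_end in sorted(covered):
        if c_end <= pos:
            continue              # interval lies entirely to the left
        if c_start >= end:
            break                 # interval lies entirely to the right
        if c_start > pos:
            parts.append((pos, c_start))
        pos = max(pos, c_end)
        if pos >= end:
            break
    if pos < end:
        parts.append((pos, end))
    return parts


def place_chains(chains):
    """Place chains best-first, trimming each to still-uncovered sequence.

    chains -- list of (score, blocks), blocks being (start, end) intervals
    Returns the placed chains as (score, trimmed_blocks) in placement order.
    """
    covered = []
    placed = []
    for score, blocks in sorted(chains, key=lambda c: c[0], reverse=True):
        kept = []
        for start, end in blocks:
            kept.extend(uncovered_parts(start, end, covered))
        if kept:                  # chains trimmed away entirely are dropped
            placed.append((score, kept))
            covered.extend(kept)
    return placed


# Hypothetical chains: the second fills the gap left by the first.
chains = [
    (9000, [(0, 100), (300, 400)]),
    (4000, [(80, 250)]),
    (1000, [(350, 380)]),
]
for score, blocks in place_chains(chains):
    print(score, blocks)
```

In this toy input the 4000-point chain is trimmed to the part that falls in the gap of the 9000-point chain, which is the situation in which a chain is placed underneath a higher-scoring one.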
The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainDroMoj1 D. mojavensis Chain D. mojavensis (Aug. 2004 (Agencourt prelim/droMoj1)) Chained Alignments Comparative Genomics Description This track shows alignments of D. mojavensis (droMoj1, Aug. 2004 (Agencourt prelim/droMoj1)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. mojavensis and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. 
Single lines indicate gaps that are largely due to a deletion in the D. mojavensis assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The D. mojavensis/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single D. mojavensis chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:

       A     C     G     T
A     91   -90   -25  -100
C    -90   100  -100   -25
G    -25  -100   100   -90
T   -100   -25   -90    91

Chains scoring below a threshold were discarded; the remaining chains are displayed in this track.
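To show how the substitution matrix given in Methods is applied, the following sketch scores one gapless block by summing the matrix entry for each aligned base pair. Only the matrix values come from this page; the function name and the example sequences are hypothetical, and gap penalties and ambiguous bases such as N are deliberately left out.

```python
BASES = "ACGT"

# Substitution scores from the Methods section, rows and columns in A, C, G, T order.
MATRIX_ROWS = [
    [91, -90, -25, -100],
    [-90, 100, -100, -25],
    [-25, -100, 100, -90],
    [-100, -25, -90, 91],
]

SCORE = {
    (a, b): MATRIX_ROWS[i][j]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}


def score_block(target, query):
    """Sum the substitution scores over one gapless block of aligned bases."""
    if len(target) != len(query):
        raise ValueError("a gapless block needs sequences of equal length")
    return sum(SCORE[(t, q)] for t, q in zip(target.upper(), query.upper()))


print(score_block("ACGT", "ACGT"))  # 382: four matches (91 + 100 + 100 + 91)
print(score_block("ACGT", "AGGT"))  # 182: the C/G mismatch contributes -100
```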
Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDroMoj1 D. mojavensis Net D. mojavensis (Aug. 2004 (Agencourt prelim/droMoj1)) Alignment Net Comparative Genomics Description This track shows the best D. mojavensis/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. mojavensis sequence used in this annotation is from the Aug. 2004 (Agencourt prelim/droMoj1) (droMoj1) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. 
Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. 
The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainDp3 D. pseudo Chain D. pseudoobscura (Nov. 2004 (FlyBase 1.03/dp3)) Chained Alignments Comparative Genomics Description This track shows alignments of D. pseudoobscura (dp3, Nov. 2004 (FlyBase 1.03/dp3)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. pseudoobscura and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the D. pseudoobscura assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. 
Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The D. pseudoobscura/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single D. pseudoobscura chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:

       A     C     G     T
A     91   -90   -25  -100
C    -90   100  -100   -25
G    -25  -100   100   -90
T   -100   -25   -90    91

Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.
PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDp3 D. pseudo. Net D. pseudoobscura (Nov. 2004 (FlyBase 1.03/dp3)) Alignment Net Comparative Genomics Description This track shows the best D. pseudoobscura/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. pseudoobscura sequence used in this annotation is from the Nov. 2004 (FlyBase 1.03/dp3) (dp3) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. 
The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The dp3 data were obtained from the FlyBase Release 1.0 assembly. The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainDroPer1 D. persimilis Chain D. persimilis (Oct. 2005 (Broad/droPer1)) Chained Alignments Comparative Genomics Description This track shows alignments of D. persimilis (droPer1, Oct. 2005 (Broad/droPer1)) to the D. 
melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. persimilis and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the D. persimilis assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The D. persimilis/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single D. persimilis chromosome and a single D.
melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDroPer1 D. persimilis Net D. persimilis (Oct. 2005 (Broad/droPer1)) Alignment Net Comparative Genomics Description This track shows the best D. persimilis/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. persimilis sequence used in this annotation is from the Oct. 2005 (Broad/droPer1) (droPer1) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. 
When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. 
Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainDroAna1 D. ananassae Chain D. ananassae (July 2004 (TIGR/droAna1)) Chained Alignments Comparative Genomics Description This track shows alignments of D. ananassae (droAna1, July 2004 (TIGR/droAna1)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. ananassae and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the D. ananassae assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. 
melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The D. ananassae/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single D. ananassae chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26.
PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDroAna1 D. ananassae Net D. ananassae (July 2004 (TIGR/droAna1)) Alignment Net Comparative Genomics Description This track shows the best D. ananassae/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. ananassae sequence used in this annotation is from the July 2004 (TIGR/droAna1) (droAna1) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. 
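The four item types listed above can be expressed as a small decision rule. The Python sketch below is a simplification for illustration only: the Fill class and classify_fill function are hypothetical names, and the sketch compares only chromosome and strand with the enclosing item one level up, whereas the actual net-building programs may take further information into account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fill:
    chrom: str   # chromosome of the match in the aligning species
    strand: str  # '+' or '-' orientation of the match


def classify_fill(fill: Fill, parent: Optional[Fill]) -> str:
    """Return 'Top', 'Syn', 'Inv' or 'NonSyn' for a net item.

    `parent` is the item whose gap this fill sits in (one level up),
    or None for a level 1 chain.
    """
    if parent is None:
        return "Top"      # level 1: the best, longest match
    if fill.chrom != parent.chrom:
        return "NonSyn"   # different chromosome from the gap above
    if fill.strand != parent.strand:
        return "Inv"      # same chromosome, opposite orientation
    return "Syn"          # same chromosome, same orientation


# Hypothetical items: a level 1 chain and three fills inside its gaps.
top = Fill("chr2L", "+")
print(classify_fill(top, None))
print(classify_fill(Fill("chr2L", "+"), top))
print(classify_fill(Fill("chr2L", "-"), top))
print(classify_fill(Fill("chr3R", "+"), top))
```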
Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainDroYak1 D. yakuba Chain D. yakuba (Apr. 2004 (WUGSC 1.0/droYak1)) Chained Alignments Comparative Genomics Description This track shows alignments of D. 
yakuba (droYak1, Apr. 2004 (WUGSC 1.0/droYak1)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. yakuba and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the D. yakuba assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The D. yakuba/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single D. yakuba chromosome and a single D.
melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. The following matrix was used:

        A      C      G      T
A      91    -90    -25   -100
C     -90    100   -100    -25
G     -25   -100    100    -90
T    -100    -25    -90     91

Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDroYak1 D. yakuba Net D. yakuba (Apr. 2004 (WUGSC 1.0/droYak1)) Alignment Net Comparative Genomics Description This track shows the best D. yakuba/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. yakuba sequence used in this annotation is from the Apr. 2004 (WUGSC 1.0/droYak1) (droYak1) assembly.
Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. 
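The placement step described above, in which chains are taken in order of decreasing score and trimmed to fit regions not yet covered, can be sketched in Python as follows. This is a one-dimensional simplification under stated assumptions, not the chainNet implementation: it tracks coverage on the D. melanogaster sequence only, it does not record the level hierarchy, and the names subtract and place_chains as well as the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]            # half-open [start, end)
Chain = Tuple[int, List[Interval]]    # (score, ungapped blocks)


def subtract(piece: Interval, covered: List[Interval]) -> List[Interval]:
    """Return the parts of `piece` not overlapped by `covered`.

    `covered` is assumed to hold non-overlapping intervals.
    """
    start, end = piece
    out = []
    for c_start, c_end in sorted(covered):
        if c_end <= start or c_start >= end:
            continue                  # no overlap with this interval
        if c_start > start:
            out.append((start, c_start))
        start = max(start, c_end)
        if start >= end:
            break
    if start < end:
        out.append((start, end))
    return out


def place_chains(chains: List[Chain]) -> List[Chain]:
    """Place chains highest score first, trimming to uncovered sequence."""
    covered: List[Interval] = []
    placed: List[Chain] = []
    for score, blocks in sorted(chains, key=lambda c: c[0], reverse=True):
        kept: List[Interval] = []
        for block in blocks:
            kept.extend(subtract(block, covered))
        if kept:                      # drop chains that are fully covered
            placed.append((score, kept))
            covered.extend(kept)
    return placed


# Hypothetical chains: the second fits into the gap of the first after
# trimming; the third is entirely covered and is discarded.
chains = [(200, [(90, 160)]), (500, [(0, 100), (150, 300)]), (50, [(20, 60)])]
for score, blocks in place_chains(chains):
    print(score, blocks)
```

In this example the chain scoring 200 is trimmed to the interval 100 to 150, which is exactly the gap left by the chain scoring 500, mirroring how a lower-scoring chain ends up underneath the chain whose gap it fills.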
The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 chainDroSec1 D. sechellia Chain D. sechellia (Oct. 2005 (Broad/droSec1)) Chained Alignments Comparative Genomics Description This track shows alignments of D. sechellia (droSec1, Oct. 2005 (Broad/droSec1)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. sechellia and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the D. sechellia assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. 
This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The D. sechellia/D. melanogaster genomes were aligned with blastz and converted into axt format using the lavToAxt program. The axt alignments were fed into axtChain, which organizes all alignments between a single D. sechellia chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler.
The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDroSec1 D. sechellia Net D. sechellia (Oct. 2005 (Broad/droSec1)) Alignment Net Comparative Genomics Description This track shows the best D. sechellia/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. sechellia sequence used in this annotation is from the Oct. 2005 (Broad/droSec1) (droSec1) assembly. Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. 
Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his program RepeatMasker. The browser display and database storage of the nets were made by Robert Baertsch and Jim Kent. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. 
PMID: 12529312; PMC: PMC430961 chainDroSim1 D. simulans Chain D. simulans (Apr. 2005 (WUGSC mosaic 1.0/droSim1)) Chained Alignments Comparative Genomics Description This track shows alignments of D. simulans (droSim1, Apr. 2005 (WUGSC mosaic 1.0/droSim1)) to the D. melanogaster genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both D. simulans and D. melanogaster simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both species. The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the D. simulans assembly or an insertion in the D. melanogaster assembly. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where multiple chains align over a particular region of the D. melanogaster genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment. Display Conventions and Configuration By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome. To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome. Methods The blastz alignments were converted into axt format using the lavToAxt program.
The axt alignments were fed into axtChain, which organizes all alignments between a single D. simulans chromosome and a single D. melanogaster chromosome into a group and creates a kd-tree out of the gapless subsections (blocks) of the alignments. A dynamic program was then run over the kd-trees to find the maximally scoring chains of these blocks. Chains scoring below a threshold were discarded; the remaining chains are displayed in this track. Credits Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. Lineage-specific repeats were identified by Arian Smit and his RepeatMasker program. The axtChain program was developed at the University of California at Santa Cruz by Jim Kent with advice from Webb Miller and David Haussler. The browser display and database storage of the chains were generated by Robert Baertsch and Jim Kent. References Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26. PMID: 11928468 Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 netDroSim1 D. simulans Net D. simulans (Apr. 2005 (WUGSC mosaic 1.0/droSim1)) Alignment Net Comparative Genomics Description This track shows the best D. simulans/D. melanogaster chain for every part of the D. melanogaster genome. It is useful for finding orthologous regions and for studying genome rearrangement. The D. simulans sequence used in this annotation is from the Apr. 2005 (WUGSC mosaic 1.0/droSim1) assembly.
Display Conventions and Configuration In full display mode, the top-level (level 1) chains are the largest, highest-scoring chains that span this region. In many cases gaps exist in the top-level chain. When possible, these are filled in by other chains that are displayed at level 2. The gaps in level 2 chains may be filled by level 3 chains and so forth. In the graphical display, the boxes represent ungapped alignments; the lines represent gaps. Click on a box to view detailed information about the chain as a whole; click on a line to display information about the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap): Top - the best, longest match. Displayed on level 1. Syn - line-ups on the same chromosome as the gap in the level above it. Inv - a line-up on the same chromosome as the gap above it, but in the opposite orientation. NonSyn - a match to a chromosome different from the gap in the level above. Methods Chains were derived from blastz alignments, using the methods described on the chain tracks description pages, and sorted with the highest-scoring chains in the genome ranked first. The program chainNet was then used to place the chains one at a time, trimming them as necessary to fit into sections not already covered by a higher-scoring chain. During this process, a natural hierarchy emerged in which a chain that filled a gap in a higher-scoring chain was placed underneath that chain. The program netSyntenic was used to fill in information about the relationship between higher- and lower-level chains, such as whether a lower-level chain was syntenic or inverted relative to the higher-level chain. 
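The placement step described above (chains taken in descending score order, each trimmed to the parts of the genome not yet covered) can be illustrated with a minimal sketch. This is not the chainNet source code: the function names, the tuple layout, and the example chains are all hypothetical, and the sketch only tracks coverage on the reference genome. It does not record net levels or the Syn/Inv/NonSyn classes.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the reference genome


def subtract(span: Interval, covered: List[Interval]) -> List[Interval]:
    """Return the parts of span not overlapped by covered (sorted, disjoint)."""
    start, end = span
    pieces = []
    for c_start, c_end in covered:
        if c_end <= start:
            continue          # covered interval lies entirely before the span
        if c_start >= end:
            break             # covered interval lies entirely after the span
        if c_start > start:
            pieces.append((start, c_start))
        start = max(start, c_end)
        if start >= end:
            break
    if start < end:
        pieces.append((start, end))
    return pieces


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for s, e in sorted(intervals):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def place_chains(chains) -> Dict[str, List[Interval]]:
    """Greedily place chains, highest score first, trimming to uncovered bases.

    chains: iterable of (name, score, blocks), where blocks are the aligned,
    ungapped intervals of the chain (hypothetical layout for this sketch).
    """
    covered: List[Interval] = []
    placed: Dict[str, List[Interval]] = {}
    for name, _score, blocks in sorted(chains, key=lambda c: -c[1]):
        kept: List[Interval] = []
        for block in blocks:
            kept.extend(subtract(block, covered))
        if kept:
            placed[name] = kept
            covered = merge(covered + kept)
    return placed


example = [
    ("A", 1000, [(0, 100), (200, 300)]),  # top chain with a gap at 100-200
    ("B", 500, [(90, 210)]),              # overlaps A; trimmed to A's gap
    ("C", 100, [(250, 280)]),             # fully covered by A; dropped
]
result = place_chains(example)
print(result)
```

In this toy input, chain B survives only where it fills the gap in chain A, which mirrors how a lower-scoring chain ends up underneath the chain whose gap it fills.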
The program netClass was then used to fill in how much of the gaps and chains contained Ns (sequencing gaps) in one or both species and how much was filled with transposons inserted before and after the two organisms diverged. Credits The chainNet, netSyntenic, and netClass programs were developed at the University of California Santa Cruz by Jim Kent. Blastz was developed at Pennsylvania State University by Minmei Hou, Scott Schwartz, Zheng Zhang, and Webb Miller with advice from Ross Hardison. References Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9. PMID: 14500911; PMC: PMC208784 Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7. PMID: 12529312; PMC: PMC430961 rmsk RepeatMasker Repeating Elements by RepeatMasker Variation and Repeats Description This track was created by using Arian Smit's RepeatMasker program, which screens DNA sequences for interspersed repeats and low complexity DNA sequences. The program outputs a detailed annotation of the repeats that are present in the query sequence (represented by this track), as well as a modified version of the query sequence in which all the annotated repeats have been masked (generally available on the Downloads page). RepeatMasker uses the Repbase Update library of repeats from the Genetic Information Research Institute (GIRI). Repbase Update is described in Jurka, J. (2000) in the References section below. 
Display Conventions and Configuration In full display mode, this track displays up to ten different classes of repeats: Short interspersed nuclear elements (SINE), which include ALUs; Long interspersed nuclear elements (LINE); Long terminal repeat elements (LTR), which include retroposons; DNA repeat elements (DNA); Simple repeats (micro-satellites); Low complexity repeats; Satellite repeats; RNA repeats (including RNA, tRNA, rRNA, snRNA, scRNA); Other repeats, which include class RC (Rolling Circle); Unknown. The level of color shading in the graphical display reflects the amount of base mismatch, base deletion, and base insertion associated with a repeat element. The higher the combined number of these, the lighter the shading. Methods UCSC has used the most current versions of the RepeatMasker software and repeat libraries available to generate these data. Note that these versions may be newer than those that are publicly available on the Internet. Data are generated using the RepeatMasker -s flag. Additional flags may be used for certain organisms. Repeats are soft-masked. Alignments may extend through repeats, but are not permitted to initiate in them. See the FAQ for more information. Credits Thanks to Arian Smit and GIRI for providing the tools and repeat libraries used to generate this track. References Jurka J. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet. 2000 Sep;16(9):418-20. PMID: 10973072
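As an illustration of the soft-masking mentioned in the RepeatMasker Methods above, here is a minimal sketch. By the usual convention, soft-masked repeat bases are written in lowercase while the rest of the sequence stays uppercase. The function names, the interval layout, and the seed length are assumptions made for this example and are not part of RepeatMasker or any UCSC tool. The second helper shows one way an aligner could refuse to start an alignment inside a masked region while still being free to extend through it.

```python
from typing import Iterable, Tuple


def soft_mask(sequence: str, repeats: Iterable[Tuple[int, int]]) -> str:
    """Lowercase the bases inside each half-open [start, end) repeat interval."""
    bases = list(sequence.upper())
    for start, end in repeats:
        # Clamp to the sequence bounds so out-of-range intervals are harmless.
        for i in range(max(start, 0), min(end, len(bases))):
            bases[i] = bases[i].lower()
    return "".join(bases)


def seed_allowed(masked: str, pos: int, k: int) -> bool:
    """Allow a k-base seed only if it lies entirely in unmasked (uppercase) sequence."""
    seed = masked[pos:pos + k]
    return len(seed) == k and seed.isupper()


masked = soft_mask("ACGTACGTAC", [(2, 5)])
print(masked)                      # ACgtaCGTAC
print(seed_allowed(masked, 5, 3))  # True: "CGT" is unmasked
print(seed_allowed(masked, 1, 3))  # False: "Cgt" overlaps the repeat
```

Because the repeat bases are only lowercased rather than replaced with N, the original sequence remains recoverable with a simple uppercase conversion.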