This directory contains the datasets used in the paper:
S. Mahony, D.L. Corcoran, E. Feingold, P.V. Benos, "Regulatory conservation of protein coding and miRNA genes in vertebrates: lessons from the opossum genome", Genome Biol (2007) 8:R84.
A copy of the paper can be found here.
The datasets included here are the upstream sequences of the vertebrate protein coding genes, the intergenic microRNA genes and the tRNA genes, plus a list of all developmental genes (a subset of the protein coding genes). The sequences are in the Multiple Alignment Format (MAF); see the UCSC Genome Browser for a description of the format.
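For readers who want to load the alignments programmatically, the following is a minimal sketch of a MAF reader in Python. It relies only on the general MAF layout (an "a" line opens an alignment block, and each "s" line carries source, start, size, strand, source length and aligned text). The function name read_maf_blocks, the MafSeq record and the demo sequences are illustrative assumptions, not part of the datasets or of the paper's pipeline.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class MafSeq(NamedTuple):
    src: str       # source name, typically "assembly.chromosome"
    start: int     # 0-based start in the source sequence
    size: int      # number of non-gap bases in this row
    strand: str    # "+" or "-"
    src_size: int  # total length of the source sequence
    text: str      # aligned bases, with "-" marking gaps


def read_maf_blocks(handle: TextIO) -> Iterator[List[MafSeq]]:
    """Yield each alignment block as a list of its "s" rows."""
    block: List[MafSeq] = []
    for raw in handle:
        fields = raw.split()
        if not fields or fields[0] == "a":
            # A blank line ends a block; an "a" line starts a new one.
            if block:
                yield block
            block = []
        elif fields[0] == "s" and len(fields) == 7:
            _, src, start, size, strand, src_size, text = fields
            block.append(
                MafSeq(src, int(start), int(size), strand, int(src_size), text)
            )
        # Header ("#"), "i", "e" and "q" lines are ignored in this sketch.
    if block:
        yield block


# Made-up demo input; source names and coordinates are hypothetical.
demo = io.StringIO(
    "##maf version=1\n"
    "a score=10\n"
    "s human.chr1   100 8 + 1000 ACGT-ACGT\n"
    "s opossum.chr2 200 7 - 2000 ACG-TAC-G\n"
    "\n"
    "a score=5\n"
    "s human.chr1   300 4 + 1000 TTGA\n"
)
for block in read_maf_blocks(demo):
    print([(row.src, row.start, row.size) for row in block])
```

Within a block, every row has the same number of alignment columns, and the size field counts only the non-gap characters of that row.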
- Protein coding gene dataset
5 kb upstream sequences of all vertebrate protein coding genes.
- Intergenic microRNA gene dataset
5 kb upstream sequences of all vertebrate intergenic microRNA genes. The beginning of each gene is taken to be the annotated start of the pre-miRNA. Some miRNA genes are clustered. See the paper for more details.
- tRNA gene dataset
5 kb upstream sequences of all vertebrate tRNA genes.
- List of developmental genes
A list of all vertebrate protein coding genes that are involved in developmental processes.
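
As an illustration of what "5 kb upstream" means on each strand, here is a small Python sketch that computes the upstream window for a gene. It assumes 0-based, half-open coordinates on the forward strand and a gene given as (start, end, strand). The function name upstream_window and its parameters are hypothetical, and the exact coordinate conventions used to build these datasets are described in the paper, not here.

```python
from typing import Optional, Tuple


def upstream_window(
    gene_start: int,
    gene_end: int,
    strand: str,
    length: int = 5000,
    chrom_size: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the [start, end) window of `length` bases upstream of a gene.

    Assumes 0-based, half-open forward-strand coordinates (an assumption
    of this sketch, not a statement about how the datasets were built).
    """
    if strand == "+":
        # Upstream lies before the gene start; clip at the chromosome start.
        return max(0, gene_start - length), gene_start
    if strand == "-":
        # On the reverse strand, upstream lies after the gene end.
        end = gene_end + length
        if chrom_size is not None:
            end = min(end, chrom_size)
        return gene_end, end
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


# Hypothetical genes used only to show the two cases.
print(upstream_window(10000, 12000, "+"))
print(upstream_window(10000, 12000, "-"))
```

For a miRNA gene, gene_start (or gene_end on the reverse strand) would be the annotated start of the pre-miRNA, as noted above.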