RRE: Retrieval of non-coding Regulative
Elements from annotated Genomic databases
RRE is a tool for extracting the non-coding regions
associated with annotated genes.

RRE tools:
- RRE is a sequence parser written in Java. The RRE parser uses the
gene/mRNA/CDS features in GBS/GBK files to extract the annotated regions
from the corresponding FA file. RRE saves the extracted regions in various
formats (e.g. FASTA, XML) and can automatically populate a MySQL
database (www.mysql.com). To run it, RRE and Java 1.4 (java.sun.com/j2se/)
have to be installed on the system.
- To keep the non-coding information up to date, an automatic download
robot based on cURL has been integrated with RRE (the download robot has
been implemented only for PCs running Linux). The robot downloads new
genomic data releases (GBS/GBK and FA/MFA files) when they become
available at NCBI.
Download:
rre.tgz (Windows/Unix); sample data and the Windows/Unix script file to
run the RRE parser: data.tgz.
To run RRE, first download from NCBI the *.gbs and *.fa files, or the
*.gbk files, for the organism of interest.
- RRE help menu
- Usage
- Installation
- Rules used in RRE for the extraction of regulative regions from
gbs/gbk files:
- If .gbs and .fa files are available, the .gbs annotations are used to
extract sequence data from the .fa file.
- If only .gbk files are available, RRE extracts a .fa file from the
data present in the .gbk file.
- 5'UTRs are extracted from the mRNA and CDS annotations using the
following rule: given the gbs annotations mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My) and CDS =
(C1..C2,.........,Cx..Cy), where the CDS is included in the mRNA, the
5'UTR starts at m1, contains the joining of all the exons upstream of the
CDS, and ends at C1.
- 3'UTRs are extracted from the mRNA and CDS annotations using the
following rule: given the gbs annotations mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My) and CDS =
(C1..C2,.........,Cx..Cy), where the CDS is included in the mRNA, the
3'UTR starts at Cy, contains the joining of all the mRNA exons downstream
of the CDS, and ends at My.
- Upstream regions are extracted using the following rule: mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), Upstream region = N bp
upstream of m1.
- Downstream regions are extracted using the following rule: mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), Downstream region = N bp
downstream of My.
- Exons related to the CDS are extracted using the following rule:
mRNA = (m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), CDS =
(C1..C2,.........,Cx..Cy), Exon = range Mi..Mj, where Mi>C1 and Mj<Cy,
belonging to the mRNA sequence.
- Exons related to the non-coding part of the mRNA (upstream) are
extracted using the following rule: mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), CDS =
(C1..C2,.........,Cx..Cy), Exon = range Mi..Mj, where Mi>m1 and Mj<C1,
belonging to the mRNA sequence.
- Exons related to the non-coding part of the mRNA (downstream) are
extracted using the following rule: mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), CDS =
(C1..C2,.........,Cx..Cy), Exon = range Mi..Mj, where Mi>Cy and Mj<My,
belonging to the mRNA sequence.
- Introns related to the CDS are extracted using the following rule:
mRNA = (m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), CDS =
(C1..C2,.........,Cx..Cy), Intron = range Mi..Mj, where Mi>C1 and Mj<Cy,
not belonging to the mRNA sequence.
- Introns related to the non-coding part of the mRNA (upstream) are
extracted using the following rule: mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), CDS =
(C1..C2,.........,Cx..Cy), Intron = range Mi..Mj, where Mi>m2 and Mj<C1,
not belonging to the mRNA sequence.
- Introns related to the non-coding part of the mRNA (downstream) are
extracted using the following rule: mRNA =
(m1..m2,m3..m4,m5..m6,m7..m8,...,Mx..My), CDS =
(C1..C2,.........,Cx..Cy), Intron = range Mi..Mj, where Mi>Cy and Mj<My,
not belonging to the mRNA sequence.
- Overlaps between up/downstream regions and gene annotations are
handled in the following way. Size definition rule: Gene > Upstream >
Downstream. Example: Gene_b (0..500), mRNA_b (1..400),
Gene_a (10000..80000), mRNA_a (40000..50000,60000..70000), Upstream and
Downstream size = 10000 -> Up Gene_b not extracted,
Down Gene_b (401..9999), Up Gene_a (30000..39999),
Down Gene_a (70001..80000).
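The UTR, intron and up/downstream rules listed above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not the RRE parser itself (which is written in Java). The function names are hypothetical, and the sketch assumes forward-strand genes, 1-based inclusive coordinates, and UTRs that stop one base short of the CDS (the rules above only say "ends at C1" and "starts at Cy").

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive (assumption)


def clip(exons: List[Interval], lo: int, hi: int) -> List[Interval]:
    """Keep only the parts of each exon that fall inside lo..hi."""
    parts = []
    for start, end in exons:
        s, e = max(start, lo), min(end, hi)
        if s <= e:
            parts.append((s, e))
    return parts


def five_prime_utr(mrna: List[Interval], cds: List[Interval]) -> List[Interval]:
    """mRNA exon segments from m1 up to the base before C1."""
    return clip(mrna, mrna[0][0], cds[0][0] - 1)


def three_prime_utr(mrna: List[Interval], cds: List[Interval]) -> List[Interval]:
    """mRNA exon segments from the base after Cy up to My."""
    return clip(mrna, cds[-1][1] + 1, mrna[-1][1])


def introns(mrna: List[Interval]) -> List[Interval]:
    """Gaps between consecutive mRNA exons (ranges not belonging to the mRNA)."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(mrna, mrna[1:]):
        if prev_end + 1 <= next_start - 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


def flanks(mrna: List[Interval], n: int) -> Tuple[Optional[Interval], Interval]:
    """N bp upstream of m1 and N bp downstream of My, ignoring overlaps."""
    m1, my = mrna[0][0], mrna[-1][1]
    up = (max(1, m1 - n), m1 - 1)
    return (up if up[0] <= up[1] else None), (my + 1, my + n)


# Made-up annotation, for illustration only
mrna = [(100, 200), (300, 400), (500, 600)]
cds = [(350, 400), (500, 550)]
print(five_prime_utr(mrna, cds))
print(three_prime_utr(mrna, cds))
print(introns(mrna))
print(flanks(mrna, 50))
```

Splitting the introns into CDS, upstream and downstream groups, as the rules do, is then a matter of comparing each gap with C1 and Cy.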
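The overlap rule (Gene > Upstream > Downstream) can be illustrated with the Gene_a/Gene_b example given in the rules. The sketch below is one possible reading of that rule, not the actual RRE code: a flanking window is truncated where it runs into the annotation of another gene, and a downstream window is further truncated where it runs into another gene's upstream window. The function names are hypothetical, and `seq_start=1` (the first extractable position) is an assumption.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive


def trim_up(win: Interval, blockers: List[Interval]) -> Optional[Interval]:
    """Shorten an upstream window so it starts after any overlapping blocker."""
    s, e = win
    for bs, be in blockers:
        if be < s or bs > e:
            continue  # no overlap with this blocker
        s = max(s, be + 1)
    return (s, e) if s <= e else None


def trim_down(win: Interval, blockers: List[Interval]) -> Optional[Interval]:
    """Shorten a downstream window so it ends before any overlapping blocker."""
    s, e = win
    for bs, be in blockers:
        if be < s or bs > e:
            continue  # no overlap with this blocker
        e = min(e, bs - 1)
    return (s, e) if s <= e else None


def flanking_regions(genes: Dict[str, Tuple[Interval, Interval]], n: int, seq_start: int = 1):
    """genes maps a name to (gene interval, (m1, My)); n is the window size."""
    ups: Dict[str, Optional[Interval]] = {}
    downs: Dict[str, Optional[Interval]] = {}
    for name, (_, (m1, my)) in genes.items():
        # Gene > Upstream and Gene > Downstream: other genes truncate windows
        others = [g for other, (g, _) in genes.items() if other != name]
        ups[name] = trim_up((max(seq_start, m1 - n), m1 - 1), others)
        downs[name] = trim_down((my + 1, my + n), others)
    for name in genes:
        # Upstream > Downstream: other genes' upstream windows truncate this one
        if downs[name] is not None:
            rivals = [u for other, u in ups.items() if other != name and u is not None]
            downs[name] = trim_down(downs[name], rivals)
    return ups, downs


# The example from the rules, Upstream/Downstream size = 10000
genes = {
    "Gene_b": ((0, 500), (1, 400)),
    "Gene_a": ((10000, 80000), (40000, 70000)),
}
ups, downs = flanking_regions(genes, 10000)
print(ups)
print(downs)
```

With these inputs the sketch gives the same regions as the example: no upstream region for Gene_b, Down Gene_b (401..9999), Up Gene_a (30000..39999) and Down Gene_a (70001..80000).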
--------------------------------------------------------------------------------------------------------------------------------
Note to the users:
3/mar/2004: It is now possible to extract sequence features related to
genes containing putative oestrogen-responsive elements (ERE). The
sequence features can be selected using transcription information on the
specific expression of genes in ER+ breast cancers and in transcriptional
profiling experiments performed on MCF7 and ZR75 cell lines.
--------------------------------------------------------------------------------------------------------------------------------
Users can access the RRE database by requesting an authorization
certificate by email from raffaele.calogero@unito.it. Please put in the
subject of the email: ACCESSING TO RRE DB and in the main body of the
message: SURNAME, NAME, AFFILIATION, EMAIL ADDRESS.
Users will receive a certificate to be installed in Internet Explorer
(5.0 or higher) or Netscape (7.0 or higher).