Input Format: |
|
The program accepts the input in the
following ways: |
|
Sequences should preferably be supplied in
FASTA format. The accepted
letters in a nucleotide sequence are A, T, G and C. The nucleotide
'U' of an mRNA sequence is represented by the letter 'T'. Any
letter other than these four nucleotide letters is
translated to 'X'. Sequences in lower case are also
accepted. |
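
The input rules above can be illustrated with a short Python sketch. It is not the program's own code: the function name `normalize_sequence` is hypothetical, and it assumes that whitespace is ignored, that 'U' is rewritten as 'T' before the check, and that every remaining character outside A, T, G and C becomes 'X'.

```python
def normalize_sequence(raw: str) -> str:
    """Upper-case the input, write U as T, and replace anything else with X."""
    cleaned = []
    for ch in raw:
        if ch.isspace():
            continue  # assumption: line breaks and blanks are ignored
        base = ch.upper()  # lower-case input is accepted
        if base == "U":
            base = "T"  # assumption: RNA 'U' is rewritten as 'T'
        cleaned.append(base if base in "ATGC" else "X")
    return "".join(cleaned)


print(normalize_sequence("ctgaug nrc"))  # prints CTGATGXXC
```

Here the ambiguity codes 'n' and 'r' are both turned into 'X', while the lower-case and RNA-style letters are mapped onto A, T, G and C.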
|
Output Format: |
Output Description: |
The output first shows the input sequence submitted for prediction of
the translation initiation site. The prediction results are
tabulated below the sequence. |
The first column of the table gives the position of each
AUG triplet along the mRNA (written as ATG, since 'U' is
represented by 'T'). The second column shows the
context surrounding the AUG triplet, which is used to predict the
translation initiation site. The AUG triplet and the nucleotides
at positions -3 and +4 are highlighted. The third column gives the
prediction score for each AUG triplet: a score equal or close to 1
indicates that the AUG is likely to be the
translation initiation site, while AUG triplets whose score does not
come close to 1 are rejected as
translation initiation sites. The final prediction is
listed in the last column. |
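
As a rough illustration of the first two columns, the Python sketch below lists every ATG triplet with its 1-based position, its surrounding context, and the nucleotides at positions -3 and +4. The function name `find_atg_sites` is hypothetical, and the context width of five nucleotides on each side is an assumption read off the 13-letter contexts in the example output. The scoring model is not described here, so no score is computed, and contexts near the ends of the sequence are simply truncated.

```python
def find_atg_sites(seq: str, flank: int = 5):
    """Return (position, context, minus3, plus4) for every ATG in seq.

    position is 1-based and points at the A of the triplet.
    """
    sites = []
    start = seq.find("ATG")
    while start != -1:
        left = max(0, start - flank)  # truncate at the 5' end if needed
        context = seq[left:start + 3 + flank]
        # The A of ATG is +1, so -3 lies three bases upstream of it
        # and +4 is the base directly after the G.
        minus3 = seq[start - 3] if start >= 3 else None
        plus4 = seq[start + 3] if start + 3 < len(seq) else None
        sites.append((start + 1, context, minus3, plus4))
        # Advance by one base so closely spaced triplets are all found.
        start = seq.find("ATG", start + 1)
    return sites


for site in find_atg_sites("CCCCGATGATGCTGGC"):
    print(site)
# (6, 'CCCCGATGATGCT', 'C', 'A')
# (9, 'CGATGATGCTGGC', 'A', 'C')
```

The short test sequence contains two AUG triplets three nucleotides apart, and each is reported with its own context window.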
|
Example Output: |
|
>gi|32415092|ref|XM_328025.1| Neurospora crassa strain OR74A
CTGATGTCCCGTCAGCTCAGGTACATCATCTACAAGCTTTCGGACGACTTCAAGGAGATTGTCATTGAGA
GCACCAGCGAGGGCGCCACCGAGAACTACGACGAGTTCCGCGAGAAGCTCGTCAACGCCCAGACTAAGAG
CGCTTCTGGCGCCATCAGCAAGGGTCCCCGATATGCCGTCTACGATTTCGAGTACAAGCTTGCGTCTGGC
GAGGGTTCCCGCAACAAGGTGACCTTTATCGCCTGGTCCCCCGATGATGCTGGCATCAAGTCCAAGATGG
TCTACGCCTCTTCCAAGGAGGCCCTCAAGCGCTCTCTCAGCGGTATCGCTGTCGAGCTCCAGGCCAACGA
GCAGGACGACATCGAGTACGAACAGATTATCAAGACCGTGAGCAAGGGTACTGCCGCATAG
Position | Context | Prediction Score | Predicted
4 | AGCTGATGTCCCG | 0.113203164772 | ---
173 | CCGATATGCCGTC | 0.932545306262 | ---
254 | CCCCGATGATGCT | 0.999251468193 | Yes
257 | CGATGATGCTGGC | 0.884324661111 | ---
277 | CCAAGATGGTCTA | 0.769546971439 | ---
|
|
FASTA Format: |
Example: |
A sequence in FASTA format is represented by the following
pattern. |
A record begins with a single-line
description, followed by the sequence starting on the next
line. The description line starts with the symbol '>'
in the first column, followed by the description of the
sequence. The sequence may continue over several lines, each of
which contains fewer than 80 characters. |
|
|
>gi|32408300|ref|XM_324631.1| Neurospora crassa strain OR74A
CTGGTGTATAGCTTCGTCAAGACCCTCACGGGCAAGACCATCACGCTCGAGGTTGAGTCCTCCGACACGA
TTGACAATGTCAAGCAGAAGATTCAGGACAAGGAGGGTATCCCGCCCGACCAGCAGCGCCTGATTTTTGC
TGGCAAGCAGCTCGAGGATGGCCGCACCCTCTCCGACTACAACATCCAGAAGGAGTCAACCCTCCACTTG
GTCCTCCGTCTGCGCGGTGGTATCATCGAGCCCTCGCTCAAGGCGCTTGCCTCCAAGTTCAACTGCGACA
AGATGATTTGCCGCAAGTGCTACGCTCGTCTGCCTCCCCGTGCCACCAACTGCCGTAAGCGCAAGTGCGG
ACACACCAACCAGCTCCGCCCCAAGAAGAAGCTCAAATAG
|
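
The FASTA layout described above can be read with a few lines of Python. This is a minimal sketch rather than the program's own parser: the function name `parse_fasta` is hypothetical, blank lines are skipped, surrounding whitespace is tolerated, and the line-length limit is not enforced.

```python
def parse_fasta(text: str):
    """Return a list of (description, sequence) pairs from FASTA text."""
    records = []
    description = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            chunks.append(line)  # sequence lines are joined in order
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


example = ">seq1 demo\nACGT\nTTGA\n>seq2\nGGCC\n"
print(parse_fasta(example))
# [('seq1 demo', 'ACGTTTGA'), ('seq2', 'GGCC')]
```

Sequence lines that belong to one record are concatenated, so a sequence split over several lines is returned as a single string.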