weakAUG:
A Server to predict translation initiation sites in human mRNA sequences with AUG start codon in weak Kozak context
 
Input Format:
 

The program accepts input in either of the following ways:

  • Copy and paste the sequence(s) into the text box provided on the server page.

  • Upload a sequence file by browsing the local file system.

Sequences should preferably be in FASTA format. The accepted letters in a nucleotide sequence are A, T, G and C. The nucleotide 'U' of an mRNA sequence is represented by the letter 'T'. Any letter other than those specifying the four nucleotides is translated to 'X'. Sequences in lower case are also accepted.
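The letter rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the server's own code; the function name `normalize_sequence` is hypothetical, and the handling of a literal 'U' (mapped to 'X' here, following the rule that only A, T, G and C are accepted) is an assumption about how the rule is applied.

```python
def normalize_sequence(raw):
    """Upper-case a nucleotide string and replace unsupported letters with 'X'.

    Assumption: whitespace and line breaks are dropped, and any letter other
    than A, T, G or C (including 'U') becomes 'X'.
    """
    allowed = set("ATGC")
    letters = (ch.upper() for ch in raw if not ch.isspace())
    return "".join(ch if ch in allowed else "X" for ch in letters)


if __name__ == "__main__":
    # Lower-case input is accepted; 'n' is not one of the four nucleotides.
    print(normalize_sequence("ctgatgtcn"))
```

Under this reading, a sequence written with 'U' should be converted to 'T' before submission.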

 
Output Format:
Output Description:

The output first shows the sequence that was submitted for translation initiation site prediction. The prediction results are tabulated and listed below the sequence.

The first column of the table gives the position of each AUG triplet along the mRNA. The second column shows the sequence context surrounding the AUG triplet, which is used to predict the translation initiation site; the AUG triplet and the nucleotides at positions -3 and +4 are highlighted. The third column gives the prediction score for each AUG triplet. A score equal to or close to 1 indicates that the AUG is a probable translation initiation site. AUG triplets whose scores do not come close to 1 are rejected as translation initiation sites. The prediction result is listed in the last column.

 
Example Output:
 
>gi|32415092|ref|XM_328025.1| Neurospora crassa strain OR74A
CTGATGTCCCGTCAGCTCAGGTACATCATCTACAAGCTTTCGGACGACTTCAAGGAGATTGTCATTGAGA
GCACCAGCGAGGGCGCCACCGAGAACTACGACGAGTTCCGCGAGAAGCTCGTCAACGCCCAGACTAAGAG
CGCTTCTGGCGCCATCAGCAAGGGTCCCCGATATGCCGTCTACGATTTCGAGTACAAGCTTGCGTCTGGC
GAGGGTTCCCGCAACAAGGTGACCTTTATCGCCTGGTCCCCCGATGATGCTGGCATCAAGTCCAAGATGG
TCTACGCCTCTTCCAAGGAGGCCCTCAAGCGCTCTCTCAGCGGTATCGCTGTCGAGCTCCAGGCCAACGA
GCAGGACGACATCGAGTACGAACAGATTATCAAGACCGTGAGCAAGGGTACTGCCGCATAG
Position  Context        Prediction Score  Predicted
4         AGCTGATGTCCCG  0.113203164772    ---
173       CCGATATGCCGTC  0.932545306262    ---
254       CCCCGATGATGCT  0.999251468193    Yes
257       CGATGATGCTGGC  0.884324661111    ---
277       CCAAGATGGTCTA  0.769546971439    ---
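
The Position and Context columns can be reproduced with a short scan over the sequence. The Python sketch below is an illustration only and does not compute the prediction score. It assumes, based on the example rows, that the position is the 1-based index of the 'A' of the triplet and that the context covers five nucleotides on each side of it; the function name `find_atg_contexts`, the `flank` parameter and the truncation of the context at the sequence ends are hypothetical choices, not documented server behaviour.

```python
def find_atg_contexts(seq, flank=5):
    """List every ATG in seq with its 1-based position and surrounding context.

    Assumptions: 'A' of the triplet is position +1 (there is no position 0),
    so -3 lies three nucleotides upstream of 'A' and +4 directly follows 'G'.
    Contexts near the ends of the sequence are simply truncated.
    """
    seq = seq.upper()
    hits = []
    start = seq.find("ATG")
    while start != -1:
        lo = max(0, start - flank)
        hi = min(len(seq), start + 3 + flank)
        hits.append({
            "position": start + 1,  # 1-based index of 'A'
            "context": seq[lo:hi],
            "minus3": seq[start - 3] if start >= 3 else None,
            "plus4": seq[start + 3] if start + 3 < len(seq) else None,
        })
        # Step by one so that overlapping-window triplets are all reported.
        start = seq.find("ATG", start + 1)
    return hits


if __name__ == "__main__":
    # Fragment taken from the example sequence around positions 254 and 257.
    for hit in find_atg_contexts("CCCCGATGATGCTGGC"):
        print(hit)
```

For the fragment used here, the two contexts match those listed for positions 254 and 257 in the example table, although the positions are counted relative to the fragment.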
 
FASTA Format:
Example:
A sequence in FASTA format follows the pattern described here.

A record begins with a single-line description, followed by the sequence starting on the very next line. The description line starts with the symbol '>' in the first column, and the description of the sequence follows this symbol. The sequence may continue over several lines. Each line contains fewer than 80 characters.

 
 
>gi|32408300|ref|XM_324631.1| Neurospora crassa strain OR74A
CTGGTGTATAGCTTCGTCAAGACCCTCACGGGCAAGACCATCACGCTCGAGGTTGAGTCCTCCGACACGA
TTGACAATGTCAAGCAGAAGATTCAGGACAAGGAGGGTATCCCGCCCGACCAGCAGCGCCTGATTTTTGC
TGGCAAGCAGCTCGAGGATGGCCGCACCCTCTCCGACTACAACATCCAGAAGGAGTCAACCCTCCACTTG
GTCCTCCGTCTGCGCGGTGGTATCATCGAGCCCTCGCTCAAGGCGCTTGCCTCCAAGTTCAACTGCGACA
AGATGATTTGCCGCAAGTGCTACGCTCGTCTGCCTCCCCGTGCCACCAACTGCCGTAAGCGCAAGTGCGG
ACACACCAACCAGCTCCGCCCCAAGAAGAAGCTCAAATAG
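
As an illustration of the format, the following Python sketch reads FASTA text into (description, sequence) pairs. It is a minimal reader written for this page, not the parser used by the server; the function name `parse_fasta` is hypothetical, and ignoring any sequence lines that appear before the first '>' line is an assumption.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (description, sequence) tuples.

    Each record starts with a '>' line; the sequence lines that follow are
    joined into one string. Blank lines are skipped.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = ">seq1 demo record\nCTGGTGTATAGC\nTTCGTCAAGACC\n"
    for description, sequence in parse_fasta(example):
        print(description, len(sequence))
```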

Contact Us:

Developed by: Suhas Tikole

Principal Investigator: Dr. R. Sankararamakrishnan


Copyright (c) 2009 All rights reserved, IIT Kanpur.

 

Laboratory of Computational Biology
Department of Biological Sciences and Bioengineering

Indian Institute of Technology Kanpur - 208016
INDIA