TIS Prediction

Input Format:

The program accepts input in either of two ways:

Copy and paste the sequence(s) into the text box provided on the server page.
Upload a sequence file by browsing the hard drive of the local system.

Sequences should preferably be in FASTA format. The accepted letters in a nucleotide sequence are A, T, G and C. The nucleotide U in an mRNA sequence is represented by the letter T. Any letter other than these four is translated to X. Sequences in lower case are also accepted.
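The rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the server's code: the function name `normalize_sequence` is hypothetical, and removing whitespace from pasted text is an assumption. The sketch does not convert U to T, so a U would be treated like any other non-ATGC letter.

```python
def normalize_sequence(raw: str) -> str:
    """Upper-case a nucleotide string and replace non-ATGC letters with 'X'."""
    allowed = set("ATGC")
    # Assumption: whitespace in pasted, multi-line input is simply dropped.
    compact = "".join(raw.split())
    # Lower case is accepted, so compare against the upper-cased letters.
    return "".join(ch if ch in allowed else "X" for ch in compact.upper())


print(normalize_sequence("ctgatgNcc"))  # CTGATGXCC
```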

Output Format:

Output Description:

The output first shows the input sequence submitted for prediction of the translation initiation site. The prediction results are tabulated and listed below it.

The first column of the table gives the position of each AUG triplet along the mRNA (written as ATG, since U is represented by T). The second column shows the sequence context surrounding the AUG triplet, which is used to predict the translation initiation site. The AUG triplet and the nucleotides at positions -3 and +4 are highlighted. The third column gives the prediction score for each AUG triplet: a score equal to or close to 1 indicates that the AUG is the probable translation initiation site, and AUG triplets whose scores do not come close to 1 are rejected. The last column lists the result of the prediction (Yes for the predicted site, --- otherwise, as in the example).
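As a rough illustration of the first two columns, the sketch below lists every ATG in a sequence with its 1-based position and surrounding context. The function name `find_aug_contexts` is hypothetical. The window of five nucleotides on either side of the triplet is inferred from the 13-letter contexts in the example output, not stated by the server, and clipping the window at the ends of the sequence is an assumption. Scoring is not reproduced here.

```python
def find_aug_contexts(seq: str, upstream: int = 5, downstream: int = 5):
    """Return (position, context) for every ATG in seq; positions are 1-based."""
    seq = seq.upper()
    hits = []
    start = seq.find("ATG")
    while start != -1:
        # Assumption: windows that run past either end are clipped.
        left = max(0, start - upstream)
        right = min(len(seq), start + 3 + downstream)
        hits.append((start + 1, seq[left:right]))
        # Resume one letter later so overlapping neighbours are not skipped.
        start = seq.find("ATG", start + 1)
    return hits


# A short fragment of the example sequence containing two nearby ATGs.
for position, context in find_aug_contexts("GGTCCCCCGATGATGCTGGCATC"):
    print(position, context)
# 10 CCCCGATGATGCT
# 13 CGATGATGCTGGC
```

With the A of the triplet counted as +1, position -3 is the third letter of a full 13-letter context and position +4 is the ninth.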

Example Output:

>gi|32415092|ref|XM_328025.1| Neurospora crassa strain OR74A
CTGATGTCCCGTCAGCTCAGGTACATCATCTACAAGCTTTCGGACGACTTCAAGGAGATTGTCATTGAGA
GCACCAGCGAGGGCGCCACCGAGAACTACGACGAGTTCCGCGAGAAGCTCGTCAACGCCCAGACTAAGAG
CGCTTCTGGCGCCATCAGCAAGGGTCCCCGATATGCCGTCTACGATTTCGAGTACAAGCTTGCGTCTGGC
GAGGGTTCCCGCAACAAGGTGACCTTTATCGCCTGGTCCCCCGATGATGCTGGCATCAAGTCCAAGATGG
TCTACGCCTCTTCCAAGGAGGCCCTCAAGCGCTCTCTCAGCGGTATCGCTGTCGAGCTCCAGGCCAACGA
GCAGGACGACATCGAGTACGAACAGATTATCAAGACCGTGAGCAAGGGTACTGCCGCATAG

Position	Context	Prediction Score	Predicted
4	AGCTGATGTCCCG	0.113203164772	---
173	CCGATATGCCGTC	0.932545306262	---
254	CCCCGATGATGCT	0.999251468193	Yes
257	CGATGATGCTGGC	0.884324661111	---
277	CCAAGATGGTCTA	0.769546971439	---
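If the result table is saved as tab-separated text, the predicted sites can be picked out with a short script. This is a minimal sketch: it assumes the four column headers shown above and that predicted sites are marked with Yes. The name `predicted_sites` is hypothetical and the two data rows are copied from the example table.

```python
import csv
import io

# Header and two rows copied from the example table, tab-separated.
RESULT_TEXT = (
    "Position\tContext\tPrediction Score\tPredicted\n"
    "173\tCCGATATGCCGTC\t0.932545306262\t---\n"
    "254\tCCCCGATGATGCT\t0.999251468193\tYes\n"
)


def predicted_sites(text: str):
    """Return (position, score) for rows whose last column is 'Yes'."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        (int(row["Position"]), float(row["Prediction Score"]))
        for row in reader
        if row["Predicted"].strip() == "Yes"
    ]


print(predicted_sites(RESULT_TEXT))  # [(254, 0.999251468193)]
```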

FASTA Format:

A sequence in FASTA format is represented by the following pattern.

A record begins with a single-line description, followed by the sequence starting on the next line. The description line starts with the symbol '>' in the first column, and the description of the sequence follows it. The sequence may continue over several lines, each containing fewer than 80 characters.

Example:

>gi|32408300|ref|XM_324631.1| Neurospora crassa strain OR74A
CTGGTGTATAGCTTCGTCAAGACCCTCACGGGCAAGACCATCACGCTCGAGGTTGAGTCCTCCGACACGA
TTGACAATGTCAAGCAGAAGATTCAGGACAAGGAGGGTATCCCGCCCGACCAGCAGCGCCTGATTTTTGC
TGGCAAGCAGCTCGAGGATGGCCGCACCCTCTCCGACTACAACATCCAGAAGGAGTCAACCCTCCACTTG
GTCCTCCGTCTGCGCGGTGGTATCATCGAGCCCTCGCTCAAGGCGCTTGCCTCCAAGTTCAACTGCGACA
AGATGATTTGCCGCAAGTGCTACGCTCGTCTGCCTCCCCGTGCCACCAACTGCCGTAAGCGCAAGTGCGG
ACACACCAACCAGCTCCGCCCCAAGAAGAAGCTCAAATAG
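For readers preparing input files, the pattern described above can be read with a few lines of Python. This is a minimal sketch using only the standard library: the name `parse_fasta` and the two records in `example` are hypothetical, and the code does not check line length or the sequence alphabet.

```python
def parse_fasta(text: str):
    """Return a list of (description, sequence) pairs from FASTA text."""
    records = []
    description = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            # Sequence lines are joined into one continuous string.
            chunks.append(line)
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


example = """>seq1 hypothetical record
CTGGTGTATAGC
TTCGTCAAG
>seq2 another hypothetical record
ACACACC
"""
for description, sequence in parse_fasta(example):
    print(description, len(sequence))
# seq1 hypothetical record 21
# seq2 another hypothetical record 7
```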

Contact Us:

Developed by: Suhas Tikole

Principal Investigator: Dr. R. Sankararamakrishnan

Laboratory of Computational Biology
Department of Biological Sciences and Bioengineering
Indian Institute of Technology Kanpur - 208016
INDIA