Help [fasta]
Introduction
------------
This is the main help text for using the European Bioinformatics
Institute fasta3 email server. The fasta3 (1) program provides rapid
and sensitive scanning of single protein or nucleic acid sequences
against protein or nucleic acid sequence databases.
Databases available
-------------------
The following databases are available. These are the most recent and
up-to-date databases produced at the EBI.
Name        Description                                    Sequence Input
-------------------------------------------------------------------------
SWALL       SWALL NON-REDUNDANT Protein sequence database  protein
SWISSPROT   SWISS-PROT Protein Database                    -"-
SWNEW       Updates to SWISS-PROT                          -"-
TREMBL      TREMBL (Translated EMBL)                       -"-
TREMBLNEW   TREMBLNEW                                      -"-
EMBL        The EMBL Database                              nucleic
EFUN        EMBL Fungi                                     -"-
EINV        EMBL Invertebrates                             -"-
EHUM        EMBL Human                                     -"-
EMAM        EMBL Mammalian                                 -"-
EORG        EMBL Organelles                                -"-
EPHG        EMBL Phages                                    -"-
EPLN        EMBL Plants                                    -"-
EPRO        EMBL Prokaryote                                -"-
EROD        EMBL Rodents                                   -"-
ESTS        EMBL STSs                                      -"-
ESYN        EMBL Synthetic                                 -"-
EUNA        EMBL Unclassified                              -"-
EVRL        EMBL Viral                                     -"-
EVRT        EMBL Vertebrates                               -"-
EEST        EMBL ESTs                                      -"-
EGSS        EMBL Genome Survey Sequences                   -"-
EHTG        EMBL High Throughput Genome Sequences          -"-
EMNEW       EMBL New (Updates)                             -"-
EMALL       EMBL + EMBL New (Updates)                      -"-
Using the fasta email server
----------------------------
Using fasta through email is simple. Send a properly formatted normal
mail message to FASTA@EBI.AC.UK and wait for the results to drop into
your mailbox. Please don't send interactive messages; the software
can't handle them!
The Input Format
----------------
Since fasta through email is an automatic process without any human
intervention, it only understands a limited set of commands. You
therefore have to adhere to a well-defined syntax, which is easy to
learn and should not cause any problems. Some general rules are:
- Your mail message must contain only one command per line.
- There is only one mandatory command, SEQ. All the other commands are
  optional, and default values will be used whenever they are not
  specified.
- You can use both uppercase and lowercase characters, or mix them.
- The order of the commands is not important, but make sure that SEQ
  is the last one, since everything following this line will be
  treated as a sequence (see below).
- Blank lines or space characters are accepted.
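The rules above can be sketched as a small Python parser. This is only
an illustration of the syntax as this help text describes it, not the
server's actual code. The function name parse_submission and the
returned (commands, sequence) pair are assumptions made for the
example; the END terminator is the one described under the END command.

```python
def parse_submission(text):
    """Split a mail body into (commands, sequence) following the rules above.

    Illustrative only: how the real server parses mail is not documented here.
    """
    commands = {}
    sequence_lines = []
    in_sequence = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are accepted
        if in_sequence:
            if line.upper() == "END":
                break  # END marks where the sequence stops
            sequence_lines.append(line)
            continue
        # One command per line: the first word is the command name,
        # matched case-insensitively; the rest of the line is its value.
        name, _, value = line.partition(" ")
        name = name.upper()
        if name == "SEQ":
            in_sequence = True  # everything after this line is sequence
        else:
            commands[name] = value.strip()
    if not in_sequence:
        raise ValueError("SEQ is the only mandatory command and is missing")
    return commands, "".join(sequence_lines)


commands, sequence = parse_submission(
    "title demo\n\nLIB swall\nseq\nMPNIPTISLN\nDGRPF\nEND\n"
)
print(commands)
print(sequence)
```

Commands are stored under their uppercase names, so "lib", "Lib" and
"LIB" are treated alike.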
Here is a list of valid commands that are accepted by the email
server:
HELP
You know what it's for, don't you?
PATH
This will normally not be required, but if you want the email server
to send the results somewhere else, type that email address here.
Example:
PATH joe@somewhere.there
TITLE
If you want to identify your search with a title please
type that description here:
Example:
TITLE gpr-ii-rpt
LIB
LIB can be any one of the available databases (the default is EMALL
for DNA sequences or SWALL for protein sequences). Please refer to the
list of available databases above for the names and descriptions of
the databases you may use.
Example:
LIB tremblnew
SEQ
Your sequence itself.
Example:
SEQ
MPNIPTISLNDGRPFAEPGLGTYNLRGDEGVAAMVAAIDSGYRLLDTAVNYENESEVGRA
VRASSVDRDELIVASKIPGRQHGRAEAVDSIRGSLDRLGLDVIDLQLIHWPNPSVGRWLD
TWRGMIDAREAGLVRSIGVSNFTEPMLKTLIDETGVTPAVNQVELHPYFPQAA
END
END
This is required in order to tell the server program where the
sequence ends. Please see the SEQ command for an example.
EXAMPLE OF A SIMPLE SUBMISSION
------------------------------
PATH joe@somewhere.there
TITLE My Sequence
LIB swall
SEQ
MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFC
TDLAVSSANFIPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPA
PSAGAYSRAGVVKTMTGGRAQSIGRRGKVEQLSPEEEEKRRIRRERNKMA
AAKCRNRRRELTDTLQAETDQLEDEKSALQTEIANLLKEKEKLEFILAAH
RPACKIPDDLGFPEEMSVASLDLTGGLPEVATPESEEAFTLPLLNDPEPK
PSVEPVKSISSMELKTEPFDDFLFPASSRPSGSETARSVPDMDLSGSFYA
ADWEPLHSGSLGMGPMATELEPLCTPVVTCTPSCTAYTSSFVFTYPEADS
FPSCAAAHRKGSSSNEPSSDSLSSPTLLAL
END
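A submission like the one above could be assembled programmatically.
The following Python sketch is only an illustration: the function names
build_submission and build_mail and the line width of 60 are
assumptions, and nothing is actually sent. The sequence used is just
the first 30 residues of the example, to keep the listing short.

```python
from email.message import EmailMessage

SERVER_ADDRESS = "FASTA@EBI.AC.UK"  # address given in this help text


def build_submission(sequence, lib=None, title=None, path=None, width=60):
    """Return a mail body: optional commands, then SEQ, the sequence, END."""
    lines = []
    if path:
        lines.append(f"PATH {path}")
    if title:
        lines.append(f"TITLE {title}")
    if lib:
        lines.append(f"LIB {lib}")
    lines.append("SEQ")  # SEQ must be the last command
    cleaned = "".join(sequence.split())  # drop whitespace and line breaks
    for start in range(0, len(cleaned), width):
        lines.append(cleaned[start:start + width])
    lines.append("END")  # tells the server where the sequence ends
    return "\n".join(lines) + "\n"


def build_mail(sender, body):
    """Wrap the body in a plain-text mail message (nothing is sent here)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    msg.set_content(body)
    return msg


body = build_submission(
    "MMFSGFNADYEASSSRCSSASPAGDSLSYY",
    lib="swall",
    title="My Sequence",
    path="joe@somewhere.there",
)
print(body)
```

Handing the resulting message to a mail transport (for example
Python's smtplib) is left out on purpose, since it depends on your
local mail setup.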
The following commands are not compulsory. The mail server supplies
default values, so you do not have to include these commands in your
mail if you do not wish to.
MATRIX
You may decide to use a matrix other than the default, blosum62.
Specify the name of the matrix here.
Example:
MATRIX pam250
WORD
Word (also known as ktup) size. This is the size of the words that
will be used during the comparison. For protein and nucleic acid
searches the defaults are 2 and 6 respectively. These values may be
set smaller, which will increase the sensitivity of the search but
dramatically increase the search time. We recommend you use the
default values.
Example:
WORD 4
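To give a feel for why a smaller word size costs more time, the
following Python sketch counts how many identical words two sequences
share for different word sizes. It is a deliberately simplified
illustration of word matching, not the algorithm as implemented in
fasta3; the function names and the two short DNA strings are made up
for the example.

```python
from collections import defaultdict


def word_positions(sequence, ktup):
    """Map each overlapping word of length ktup to its start positions."""
    index = defaultdict(list)
    for i in range(len(sequence) - ktup + 1):
        index[sequence[i:i + ktup]].append(i)
    return index


def count_word_hits(query, subject, ktup):
    """Count (query position, subject position) pairs with identical words."""
    index = word_positions(query, ktup)
    hits = 0
    for j in range(len(subject) - ktup + 1):
        hits += len(index.get(subject[j:j + ktup], ()))
    return hits


query = "ACGTACGTTAGC"
subject = "TTACGTAAGCGT"
for ktup in (6, 4, 2):
    print(ktup, count_word_hits(query, subject, ktup))
```

Shorter words match in more places, so more candidate regions have to
be examined for every database sequence. That is the trade-off between
sensitivity and search time mentioned above.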
ALIGN
The number of alignments that will be returned in the
output file. The default is set to 25.
Example:
ALIGN 50
LIST
Setting this option to one of the supported numbers sets the maximum
number of reported scores in the output file. The default is 50.
Example:
LIST 25
STRAND
This is a nucleic-acid-only option which tells the program to search
the reverse complement of your DNA query sequence.
Example:
STRAND bottom
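As a reminder of what the reverse complement is, here is a minimal
Python sketch. It only illustrates the operation on a short made-up
sequence and says nothing about how the server performs it; ambiguity
codes other than A, C, G and T are left unchanged in this sketch.

```python
# Complement of each unambiguous base; other IUPAC codes are not handled here.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGGCC"))  # prints GGCCAT
```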
HISTOGRAM
This will make the program include a histogram in its output. Fasta3
calculates two scores called Init1 and Initn for each comparison
between the query sequence and sequences in the database. The
histogram shows how many comparisons were observed with a certain
score.
Example:
HISTOGRAM yes
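The optional commands and their defaults can be summarised in a short
Python sketch that emits only the lines that differ from the defaults
stated in this help text. The helper name optional_commands and its
seq_type argument are assumptions made for the example, and the server
itself may validate values differently.

```python
KNOWN = {"MATRIX", "WORD", "ALIGN", "LIST", "STRAND", "HISTOGRAM"}

# Defaults as stated in this help text.
DEFAULTS = {"MATRIX": "blosum62", "ALIGN": "25", "LIST": "50"}
WORD_DEFAULTS = {"protein": "2", "nucleic": "6"}


def optional_commands(seq_type, **options):
    """Return command lines for options that differ from the stated defaults."""
    if seq_type not in WORD_DEFAULTS:
        raise ValueError("seq_type must be 'protein' or 'nucleic'")
    defaults = dict(DEFAULTS, WORD=WORD_DEFAULTS[seq_type])
    lines = []
    for name, value in options.items():
        key = name.upper()
        if key not in KNOWN:
            raise ValueError(f"unknown command: {key}")
        if key == "STRAND" and seq_type != "nucleic":
            raise ValueError("STRAND applies to nucleic acid queries only")
        if str(value).lower() == defaults.get(key):
            continue  # same as the default, no need to send it
        lines.append(f"{key} {value}")
    return lines


print("\n".join(optional_commands("nucleic", word=4, strand="bottom",
                                  histogram="yes")))
```

The returned lines would go before the SEQ command, since everything
after SEQ is treated as sequence.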
References
----------
(1) W. R. Pearson and D. J. Lipman (1988),
    "Improved Tools for Biological Sequence Analysis",
    PNAS 85:2444-2448.
    W. R. Pearson (1990),
    "Rapid and Sensitive Sequence Comparison with FASTP and FASTA",
    Methods in Enzymology 183:63-98.
    "Fasta3 and Blast services at the EBI",
    Rodrigo Lopez, EMBL Outstation,
    The European Bioinformatics Institute,
    embnet.news Vol. 7.1 (1997).
Contacts
--------
European Bioinformatics Institute
Services Programme (Support)
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD, UK
Tel: +44 (0) 1223-494444
Fax: +44 (0) 1223-494468
support@ebi.ac.uk
http://www2.ebi.ac.uk/fasta3
William R. Pearson
Department of Biochemistry
Box 440, Jordan Hall
U. of Virginia
Charlottesville, VA