GCG Programs



 

Comparison

Pairwise Comparison  
Gap Uses the algorithm of Needleman and Wunsch to find the alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps.
BestFit Creates an optimal alignment of the best segment of similarity between two sequences. Optimal alignments are found by inserting gaps to maximize the number of matches using the local homology algorithm of Smith and Waterman.
FrameAlign Creates an optimal alignment of the best segment of similarity (local alignment) between a protein sequence and the codons in all possible reading frames of a nucleotide sequence. Optimal alignments may include reading frame shifts.
Compare Compares two protein or nucleic acid sequences and creates a file of the points of similarity between them for plotting with DotPlot. Compare finds the points using either a window/stringency or a word match criterion. The word match comparison is 1,000 times faster than the window/stringency comparison but somewhat less sensitive.
DotPlot+ Creates a dot-plot with the output file from Compare, FoldRNA, or StemLoop.
GapShow+ Displays an alignment by making a graph that shows the distribution of similarities and gaps. Align the two input sequences with either Gap or BestFit before you display them with GapShow.
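
The difference between the global alignment that Gap computes and the local alignment that BestFit computes can be sketched with one small dynamic-programming routine. This is a minimal illustration, not the GCG implementation: the function name, the linear gap penalty, and the match, mismatch, and gap values are all assumptions chosen for the example.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Score the best global (Needleman-Wunsch) or local (Smith-Waterman) alignment.

    Uses a linear gap penalty. The scoring values are illustrative
    assumptions, not GCG defaults.
    """
    cols = len(b) + 1
    # Row 0: a global alignment pays for leading gaps, a local one starts at 0.
    prev = [0 if local else j * gap for j in range(cols)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                # A local alignment may restart anywhere, so scores never go below 0.
                score = max(score, 0)
                best = max(best, score)
            curr.append(score)
        prev = curr
    # Global: score of aligning both complete sequences.
    # Local: score of the best-scoring pair of segments.
    return best if local else prev[-1]
```

With these values, alignment_score("ACGT", "ACT") scores the complete sequences and must pay for one gap, while passing local=True scores only the best matching segment.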
Multiple Comparison  
PileUp+ Creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments. It also can plot a tree showing the clustering relationships used to create the alignment.
SeqLab An interactive multiple sequence analysis tool that lets you view and refine alignments both by hand and by calling Wisconsin Package applications like PileUp. You can do any analysis available in the Wisconsin Package from SeqLab.
PlotSimilarity+ Plots the running average of the similarity among the sequences in a multiple sequence alignment.
Pretty Displays multiple sequence alignments and calculates a consensus sequence. It does not create the alignment; it simply displays it.
ProfileMake Creates a position-specific scoring table, called a profile, that quantitatively represents the information from a group of aligned sequences. You can use the profile for database searching (ProfileSearch) or sequence alignment (ProfileGap).
ProfileGap Creates an optimal alignment between a profile and a sequence.
Overlap Compares two sets of DNA sequences to each other in both orientations using a WordSearch-style comparison.
NoOverlap Identifies the places where a group of nucleotide sequences do not share any common words.
OldDistances Makes a table of the pairwise similarities within a group of aligned sequences.
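
A table of pairwise similarities like the one OldDistances produces can be sketched as follows. The measure used here (percent identity over columns where neither sequence has a gap) is an assumption made for illustration; the listing above does not say how OldDistances defines similarity, and the function names are hypothetical.

```python
from itertools import combinations

def percent_identity(x, y, gap_chars="-."):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(p, q) for p, q in zip(x.upper(), y.upper())
             if p not in gap_chars and q not in gap_chars]
    if not pairs:
        return 0.0
    return 100.0 * sum(p == q for p, q in pairs) / len(pairs)

def similarity_table(alignment):
    """Map each pair of sequence names to its percent identity.

    `alignment` is a dict of name -> aligned sequence (hypothetical input shape).
    """
    return {(m, n): percent_identity(alignment[m], alignment[n])
            for m, n in combinations(alignment, 2)}
```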

Database Searching

Reference Searching
LookUp Identifies sequences by name, accession number, author, organism, keyword, title, reference, feature, definition, length, or date. The output is a list of sequences.
StringSearch Identifies sequences by searching sequence documentation for character patterns you specify, for example "globin" or "human."
Names Identifies GCG data files and sequence entries by name. It can show you what set of sequences is implied by any sequence specification.
Sequence Searching
BLAST Searches for sequences similar to a query sequence. The query and the database searched can be either peptide or nucleic acid in any combination. BLAST can search databases on your own computer or databases maintained at the National Center for Biotechnology Information (NCBI) in Bethesda, Maryland, USA.
Fetch Copies GCG sequences or data files from the GCG database into your directory or displays them on your terminal screen.
FastA Performs a Pearson and Lipman search for similarity between a query sequence and any group of sequences. FastA answers the question "What sequences in the database are similar to my sequence?"
TFastA Performs a Pearson and Lipman search for similarity between a query peptide sequence and any group of nucleotide sequences. TFastA translates the nucleotide sequences in all six frames before performing the comparison. It answers the question "What implied peptide sequences in a nucleotide sequence database are similar to my peptide sequence?"
FrameSearch+ Searches for similarities between a group of nucleotide sequences and a group of peptide sequences. For each sequence comparison, the program finds an optimal alignment between the protein sequence and all possible codons on each strand of the nucleotide sequence. Because the alignments are not restricted to a single frame of the nucleotide sequence, FrameSearch can identify similarities that include reading frame shifts.
ProfileSearch Uses a profile (representing a group of aligned sequences) as a probe to search the database for new sequences with similarity to the group. The profile is created with the program ProfileMake.
ProfileSegments Creates optimal alignments showing the segments of similarity found by ProfileSearch.
FindPatterns Identifies sequences containing short patterns like GAATTC or YRYRYRYR. You can define the patterns ambiguously and allow mismatches. In addition, you can provide the patterns in a file or simply type them from the terminal.
Motifs Looks for sequence motifs by searching through proteins for the patterns defined in the PROSITE Dictionary of Protein Sites and Patterns. Motifs can display an abstract of the current literature on each of the motifs it finds.
WordSearch+ Identifies sequences in the database that share large numbers of common words in the same register of comparison with your query sequence. The output of WordSearch can be displayed with Segments.
Segments Aligns and displays the segments of similarity found by WordSearch.
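
An ambiguous pattern search of the kind FindPatterns performs can be approximated by expanding IUPAC nucleotide codes into a regular expression. This is only a sketch: the function names are hypothetical, and it does not implement the mismatch tolerance or pattern files that FindPatterns supports.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def pattern_to_regex(pattern):
    """Expand each ambiguity code into a character class."""
    parts = []
    for symbol in pattern.upper():
        bases = IUPAC[symbol]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))

def find_pattern(pattern, sequence):
    """Return the 1-based start of every match, including overlapping ones."""
    regex = pattern_to_regex(pattern)
    seq = sequence.upper()
    hits = []
    pos = 0
    while True:
        match = regex.search(seq, pos)
        if match is None:
            break
        hits.append(match.start() + 1)
        # Resume one base after this hit so overlapping matches are found.
        pos = match.start() + 1
    return hits
```

For example, find_pattern("YRYR", "CACACA") reports both overlapping occurrences, at positions 1 and 3.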

Editing and Publication

SeqEd Lets you enter and modify sequences and assemble parts of existing sequences into new genetic constructs. You can enter sequences from the keyboard or from a digitizer.
LineUp Lets you edit multiple sequence alignments. You can edit up to 30 sequences simultaneously. Type new sequences by hand or add them from existing sequence files. A consensus sequence identifies places where the sequences are in conflict.
SeqLab An interactive multiple sequence analysis tool that lets you view and refine alignments both by hand and by calling Wisconsin Package applications like PileUp. You can do any analysis available in the Wisconsin Package from SeqLab.
Assemble Assembles new sequences from pieces of existing sequences. It concatenates the fragments you specify and writes them out as a new sequence file. (SeqEd is more powerful than Assemble for most applications.)
Pretty Displays multiple sequence alignments and calculates a consensus sequence. It does not create the alignment; it simply displays it.
Publish Arranges sequences for publication. It creates a text file that you can modify with a text editor.
PlasmidMap+ Draws a circular plot of a plasmid construct. It can display restriction patterns, inserts, and known genetic elements. The plot is suitable for publication, record keeping, or analysis. It is drawn from one or more labeling files such as those written by MapSort.

Evolution

Distances Creates a table of the pairwise corrected distances within a group of aligned sequences, expressed as a number of nucleotide or amino acid substitutions per 100 residues.
GrowTree+ Constructs a phylogenetic tree based on a table of pairwise distances created by Distances.
Diverge Estimates the number of synonymous and nonsynonymous substitutions per site between two nucleic acid sequences that code for proteins. It uses a variant of the method published by Li et al.
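
To show what a "corrected" distance means, the sketch below applies the Jukes-Cantor correction to two aligned nucleotide sequences and reports substitutions per 100 residues. Jukes-Cantor is chosen here as an assumption for illustration; the listing above does not say which correction methods Distances offers, and the function name is hypothetical.

```python
import math

def jukes_cantor_per_100(x, y, gap_chars="-."):
    """Jukes-Cantor distance between two aligned nucleotide sequences,
    expressed as substitutions per 100 residues."""
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(p, q) for p, q in zip(x.upper(), y.upper())
             if p not in gap_chars and q not in gap_chars]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    # Observed fraction of differing sites.
    p_obs = sum(p != q for p, q in pairs) / len(pairs)
    if p_obs == 0:
        return 0.0
    if p_obs >= 0.75:
        raise ValueError("distance is undefined when 75% or more of the sites differ")
    # d = -(3/4) * ln(1 - 4p/3), scaled to 100 residues.
    return -75.0 * math.log(1.0 - 4.0 * p_obs / 3.0)
```

The corrected value is always at least as large as the observed percent difference, because the correction accounts for multiple substitutions at the same site.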

Fragment Assembly

GelStart Begins a fragment assembly session by creating a new fragment assembly project or by identifying an existing project.
GelEnter Adds fragment sequences to a fragment assembly project. It accepts sequence data from your terminal keyboard, a digitizer, or existing sequence files.
GelMerge Aligns the sequences in a fragment assembly project into assemblies called contigs. You can view and edit these assemblies in GelAssemble.
GelAssemble Lets you view and edit contig assemblies created by GelMerge.
GelView Displays the structure of the contigs in a fragment assembly project.
GelDisassemble Breaks up the contigs in a fragment assembly project into single fragments.

Gene Finding and Pattern Recognition

TestCode+ Helps you identify protein coding sequences by plotting a measure of the non-randomness of the composition at every third base.
CodonPreference+ Recognizes protein coding sequences by the similarity of their codon usage to a codon frequency table or by the bias of their composition (usually GC) in the third position of each codon.
Frames+ Shows open reading frames for the six translation frames of a DNA sequence. Frames can superimpose the pattern of rare codon choices if you provide it with a codon frequency table.
Terminator Searches for prokaryotic factor-independent RNA polymerase terminators according to the method of Brendel and Trifonov.
 
Motifs Looks for sequence motifs by searching through proteins for the patterns defined in the PROSITE Dictionary of Protein Sites and Patterns. Motifs can display an abstract of the current literature on each of the motifs it finds.
Repeat Finds direct repeats in sequences. You must set the size, stringency, and range within which the repeat must occur; all the repeats of that size or greater are displayed as short alignments.
FindPatterns Identifies sequences containing short patterns like GAATTC or YRYRYRYR. You can define the patterns ambiguously and allow mismatches. In addition, you can provide the patterns in a file or simply type them from the terminal.
Composition Determines the composition of sequence(s). For nucleotide sequence(s), Composition also determines dinucleotide and trinucleotide content.
CodonFrequency Tabulates codon usage from sequences and/or existing codon usage tables. The output file is correctly formatted for input to the CodonPreference, Correspond, and Frames programs.
Correspond Looks for similar patterns of codon usage by comparing codon frequency tables.
Window Creates a table of the frequencies of different sequence patterns within a window as it is moved along a sequence. A pattern is any short sequence like GC or R or ATG. You can plot the output with the program StatPlot.
StatPlot+ Plots a set of parallel curves from a table of numbers like the table written by the Window program. The statistics in each column of the table are associated with a position in the analyzed sequence.
FitConsensus Uses a table written by Consensus as a probe to find the best examples of the consensus in a DNA sequence. You can specify the number of fits you want to see, and FitConsensus tabulates them with their position, frame, and statistical measure of their quality.
Consensus Calculates a consensus sequence for a set of prealigned short nucleic acid sequences by tabulating the percent of G, A, T, and C for each position in the set.
Xnu Replaces statistically significant tandem repeats in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.
Seg Replaces low complexity regions in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.
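
The tabulation that Consensus is described as doing (the percent of G, A, T, and C at each position of a set of prealigned sequences) can be sketched as follows. The function names and the tie-breaking rule are assumptions made for the example, not details of the GCG program.

```python
def base_percentages(sequences):
    """Percent of G, A, T, and C at each position of prealigned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be prealigned to the same length")
    table = []
    for column in zip(*sequences):
        column = [base.upper() for base in column]
        table.append({base: 100.0 * column.count(base) / len(column)
                      for base in "GATC"})
    return table

def consensus(table):
    """Most frequent base at each position (ties go to the first of G, A, T, C)."""
    return "".join(max("GATC", key=lambda base: row[base]) for row in table)
```

Given four short promoter-like sequences such as TATAAT, TATGAT, TACAAT, and GATAAT, the first row of the table shows 75% T and 25% G.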

Importing / Exporting

Reformat Rewrites sequence files, symbol comparison tables, or enzyme data files so GCG programs can read them.
BreakUp Reads a GCG-format sequence file containing more than 350,000 sequence characters and writes it as a set of separate, shorter, overlapping sequence files that can be analyzed by GCG programs.
ChopUp Converts a non-GCG sequence file containing lines as long as 32,000 characters into a new file containing lines no longer than 50 characters. The new file can be read by Reformat to create a GCG-format sequence file.
FromStaden Reformats a sequence from Staden format into a file in GCG format. If the file contains a nucleotide sequence, the ambiguity codes are converted as shown in Appendix III of the Program Manual.
FromEMBL Reformats sequences from the distribution (flat file) format of the EMBL database into individual sequence files in GCG format.
FromGenBank Reformats one or more sequences in the flat file format of the GenBank database into individual sequence files in GCG format.
FromPIR Reformats sequences from the protein database of the Protein Identification Resource (PIR) into individual files in GCG format.
FromIG Reformats sequences from IntelliGenetics format into individual files in GCG format.
FromFastA Reformats sequences from FastA format into individual files in GCG format.
ToStaden Writes GCG sequences into a single file in Staden format. If the file contains a nucleotide sequence, the ambiguity codes are converted as shown in Appendix III of the Program Manual.
ToPIR Writes GCG sequences into a single file in PIR format.
ToIG Converts GCG sequence files into a single file in IntelliGenetics format.
ToFastA Converts GCG sequences into a single file in FastA format.
GetSeq Reads a sequence from another computer acting as a terminal and creates the same sequence in GCG format on your system.
Spew Sends a GCG sequence from the minicomputer to a personal computer acting as a terminal.
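
The line-splitting step that ChopUp performs before Reformat can be sketched in a few lines. The function name is hypothetical, and the sketch only illustrates cutting long lines into pieces of at most 50 characters; it does not reproduce ChopUp's file handling.

```python
def chop_lines(lines, width=50):
    """Yield the input lines cut into pieces no longer than `width` characters."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # Preserve empty lines as they are.
            yield ""
            continue
        for start in range(0, len(line), width):
            yield line[start:start + width]
```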

Mapping

Map Maps a DNA sequence and displays both strands of the mapped sequence with restriction enzyme cut points above the sequence and protein translations below. Map can also create a peptide map of an amino acid sequence.
MapPlot+ Displays restriction sites graphically. If you don't have a printer or plotter, MapPlot can write a text file that approximates the graph.
MapSort Finds the coordinates of the restriction enzyme cuts in a DNA sequence and sorts the fragments of the resulting digest by size. MapSort can sort the fragments from single or multiple enzyme digests.
Fingerprint Identifies the products of T1 ribonuclease digestion.
PeptideMap Creates a peptide map of an amino acid sequence.
PlasmidMap+ Draws a circular plot of a plasmid construct. It can display restriction patterns, inserts, and known genetic elements. The plot is suitable for publication, record keeping, or analysis. It is drawn from one or more labeling files such as those written by MapSort.
PeptideSort Shows the peptide fragments from a digest of an amino acid sequence. It sorts the peptides by weight, position, and HPLC retention at pH 2.1 and shows the composition of each peptide. It also prints a summary of the composition of the whole protein.
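
The core of what MapSort is described as doing (finding restriction cut coordinates and sorting the resulting fragments by size) can be sketched as below. The sketch assumes a linear molecule and looks only at the top strand; the function names and the enzyme dictionary format are hypothetical. The two enzymes in the usage note, EcoRI (G^AATTC) and BamHI (G^GATCC), both cut after the first base of their site.

```python
def cut_positions(sequence, site, offset):
    """0-based positions after which the top strand is cut.

    `offset` is the number of bases of the recognition site that lie
    before the cut (1 for G^AATTC).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    return cuts

def digest(sequence, enzymes):
    """Return (length, first_base, last_base) for each fragment, largest first.

    `enzymes` maps a name to (recognition_site, offset). Coordinates are
    1-based. A linear molecule is assumed.
    """
    cuts = sorted({cut
                   for site, offset in enzymes.values()
                   for cut in cut_positions(sequence, site, offset)})
    bounds = [0] + cuts + [len(sequence)]
    fragments = [(end - start, start + 1, end)
                 for start, end in zip(bounds, bounds[1:])
                 if end > start]
    return sorted(fragments, reverse=True)
```

Calling digest(seq, {"EcoRI": ("GAATTC", 1), "BamHI": ("GGATCC", 1)}) gives the fragments of a double digest; passing a single enzyme gives a single digest.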

Primer Selection

Prime+ Selects oligonucleotide primers for polymerase chain reaction (PCR) experiments and for DNA sequencing. You can allow the program to choose any appropriate primers from the input DNA sequence or optionally limit the choices to primers specified in a file.

Protein Analysis

Motifs Looks for sequence motifs by searching through proteins for the patterns defined in the PROSITE Dictionary of Protein Sites and Patterns. Motifs can display an abstract of the current literature on each of the motifs it finds.
ProfileScan Uses a database of profiles to find structural motifs in protein sequences.
PeptideSort Shows the peptide fragments from a digest of an amino acid sequence. It sorts the peptides by weight, position, and HPLC retention at pH 2.1 and shows the composition of each peptide. It also prints a summary of the composition of the whole protein.
Isoelectric+ Plots the total positive and negative charges and the net charge of a protein as a function of pH.
PeptideMap Creates a peptide map of an amino acid sequence.
PepPlot+ Plots measures of protein secondary structure and hydrophobicity in parallel panels of the same plot.
PeptideStructure Makes secondary structure predictions for a peptide sequence. The predictions include measures for antigenicity, flexibility, hydrophobicity, and surface probability. PlotStructure displays the predictions graphically.
PlotStructure+ Plots the measures of protein secondary structure in the output file from PeptideStructure. The measures can be shown on parallel panels of a graph or with a two-dimensional squiggly representation.
Moment+ Creates a contour plot of the helical hydrophobic moment of a peptide sequence.
HelicalWheel+ Plots a peptide sequence as a helical wheel to help you recognize amphiphilic regions.
Xnu Replaces statistically significant tandem repeats in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.
Seg Replaces low complexity regions in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.
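
The geometry behind a helical wheel such as the one HelicalWheel plots is simple: in an ideal alpha helix there are 3.6 residues per turn, so each residue is rotated 100 degrees from the previous one. The sketch below computes those angles only; the function name is hypothetical and no plotting is attempted.

```python
def helical_wheel_angles(peptide, degrees_per_residue=100.0):
    """Return (residue, angle in degrees) pairs for an ideal alpha helix.

    100 degrees per residue corresponds to 3.6 residues per turn.
    """
    return [(residue, (index * degrees_per_residue) % 360.0)
            for index, residue in enumerate(peptide)]
```

Residues whose angles fall on the same side of the wheel lie on the same face of the helix, which is what makes amphiphilic regions visible.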

RNA Secondary Structure

 
MFold Predicts optimal and suboptimal secondary structures for an RNA molecule using the energy minimization method of Zuker.
PlotFold+ Displays the optimal and suboptimal secondary structures for an RNA molecule predicted by MFold.
FoldRNA Predicts a single optimal secondary structure for an RNA molecule by the older method of Zuker.
Squiggles+ Uses an output file from FoldRNA to make a plot of an RNA secondary structure.
Circles+ Uses an output file from FoldRNA to make a circular Nussinov plot of an RNA secondary structure.
Domes+ Uses an output file from FoldRNA to make a linear plot of a folded RNA molecule.
Mountains+ Uses an output file from FoldRNA to make a plot of an RNA secondary structure.
StemLoop Finds stems, or inverted repeats, within a sequence. You specify the minimum stem length, minimum and maximum loop sizes, and the minimum number of bonds per stem. You can display all loops or only the best loops on your screen or write them into a file.
DotPlot+ Creates a dot-plot with the output file from Compare, FoldRNA, or StemLoop.
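
A search for stems (inverted repeats) with a minimum stem length and a loop-size range, in the spirit of StemLoop, might be sketched like this. Several simplifications are assumptions of the sketch rather than properties of the GCG program: only perfect Watson-Crick pairs on DNA are allowed (no G-T pairs), there is no "bonds per stem" criterion, and the function name and defaults are hypothetical.

```python
# Watson-Crick pairs only (an assumption of this sketch).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def find_stem_loops(sequence, min_stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, stem_length, loop_length) tuples.

    stem_start is the 1-based position of the first base of the 5' arm.
    Each stem is grown outward from its innermost base pair, so a long
    stem may also be reported as a shorter stem with a larger loop.
    """
    seq = sequence.upper()
    n = len(seq)
    hits = []
    for left_end in range(n):  # last base of the 5' arm
        for loop in range(min_loop, max_loop + 1):
            i, j = left_end, left_end + loop + 1  # innermost pair
            length = 0
            while i >= 0 and j < n and COMPLEMENT.get(seq[i]) == seq[j]:
                length += 1
                i -= 1
                j += 1
            if length >= min_stem:
                # i stopped one base before the stem; convert to 1-based.
                hits.append((i + 2, length, loop))
    return hits
```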

Translation

Translate Translates nucleotide sequences into peptide sequences.
BackTranslate Translates an amino acid sequence into a nucleotide sequence. The output display helps you recognize minimally ambiguous regions that may be good for constructing synthetic probes.
Map Maps a DNA sequence and displays both strands of the mapped sequence with restriction enzyme cut points above the sequence and protein translations below. Map can also create a peptide map of an amino acid sequence.
ExtractPeptide Writes a peptide sequence from one or more of the translation frames displayed in the output from Map. (Translate supersedes ExtractPeptide for most applications.)
PepData Translates DNA sequence(s) in all six frames.
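
Translation in one frame (as Translate does) and in all six frames (as PepData does) can be sketched with the standard genetic code. The function names are hypothetical, incomplete trailing codons are dropped, and codons containing ambiguous bases are written as X; these are choices made for the example, not documented GCG behavior.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, frame=0):
    """Translate one frame (0, 1, or 2); '*' marks a stop codon."""
    seq = dna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def reverse_complement(dna):
    """Reverse complement of an unambiguous DNA sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def six_frames(dna):
    """Three forward frames followed by three frames of the reverse strand."""
    reverse = reverse_complement(dna)
    return ([translate(dna, frame) for frame in range(3)]
            + [translate(reverse, frame) for frame in range(3)])
```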

Utilities

Sequence Utilities  
Reverse Reverses and/or complements a sequence.
Shuffle Randomizes the order of the symbols in a sequence, keeping the composition constant.
Simplify Simplifies peptide or nucleic acid sequences into broad categories.
CompTable Creates a scoring matrix using equivalences defined in a simplification scheme such as the one used for Simplify.
Corrupt Randomly introduces small numbers of substitutions, insertions, and deletions into nucleotide sequence(s).
Xnu Replaces statistically significant tandem repeats in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.
Seg Replaces low complexity regions in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.
Sample Extracts sequence fragments randomly from sequence(s). You can set a sampling rate to determine how many fragments Sample extracts.
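
Two of the simplest utilities above, Reverse and Shuffle, can be sketched as follows. The function names and the optional seed parameter are hypothetical, and the complement table covers only unambiguous DNA bases.

```python
import random

# Complement table for unambiguous DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse(sequence, do_reverse=True, do_complement=True):
    """Reverse and/or complement a DNA sequence."""
    if do_complement:
        sequence = sequence.translate(_COMPLEMENT)
    if do_reverse:
        sequence = sequence[::-1]
    return sequence

def shuffle_sequence(sequence, seed=None):
    """Randomize symbol order; the composition stays exactly the same."""
    rng = random.Random(seed)
    symbols = list(sequence)
    rng.shuffle(symbols)
    return "".join(symbols)
```

Because shuffling only reorders the existing symbols, the shuffled sequence has the same length and the same count of each symbol as the input.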
Database Utilities  
DataSet Creates a GCG personal database from any set of sequences in GCG format.
GCGToBLAST Combines any set of GCG sequences into a database that you can search with BLAST.
Sample Extracts sequence fragments randomly from sequence(s). You can set a sampling rate to determine how many fragments Sample extracts.
Printing / Plotting Utilities  
LPrint Prints text files on a PostScript printer connected to LPrintPort.
ListFile Prints a file on a printer attached to your terminal's passthrough printer port.
SetPlot Lets you choose a graphics configuration from a menu of available graphics devices at your site. For more information, see Chapter 5, Using Graphics, in the User's Guide.
Figure+ Creates figures by drawing graphics and text together. You can include one or more GCG graphics files within a Figure file.
PlotTest+ Plots a pattern to test if your plotter is configured properly. The test pattern uses every GCG graphics feature.
File Utilities  
ChopUp Converts a non-GCG sequence file containing lines as long as 32,000 characters into a new file containing lines no longer than 50 characters. The new file can be read by Reformat to create a GCG-format sequence file.
Replace Makes character string replacements in text files. You provide a table of replacements in a file showing each existing string and its replacement.
CompressText Removes any or all of the following from files: 1) blank lines; 2) trailing space; 3) extra space between words; or 4) all space.
OneCase Changes all of the alphabetic characters in a file to lower- or uppercase. It also can capitalize every word.
ShiftOver Moves the contents of a file to the right or to the left as many columns as you specify.
Detab Replaces the tab characters in one or more files with spaces. Detab also can limit line length to 80 characters, truncating all characters beyond that length.
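
The four clean-up options listed for CompressText can be sketched as independent switches on one function. The function and parameter names are hypothetical, and the exact rules (for example, treating whitespace-only lines as blank and keeping leading indentation) are assumptions of this sketch.

```python
import re

def compress_text(text, blank_lines=False, trailing=False,
                  extra=False, all_space=False):
    """Apply any combination of the four clean-ups and return the new text.

    Lines are joined with newlines; a final trailing newline is not kept.
    """
    result = []
    for line in text.splitlines():
        if all_space:
            # 4) remove every space and tab
            line = re.sub(r"[ \t]+", "", line)
        else:
            if trailing:
                # 2) remove trailing space
                line = line.rstrip(" \t")
            if extra:
                # 3) collapse runs of spaces between words, keep indentation
                line = re.sub(r"(?<=\S) {2,}(?=\S)", " ", line)
        if blank_lines and not line.strip():
            # 1) drop blank (or whitespace-only) lines
            continue
        result.append(line)
    return "\n".join(result)
```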
Miscellaneous Utilities  
SetKeys Writes a file in your directory that redefines your keyboard's keys for sequence entry with the programs SeqEd, LineUp, GelEnter, and GelAssemble. You can edit the output file, called set.keys, if you want to use keys other than those you initially defined.
Reformat Rewrites sequence files, symbol comparison tables, or enzyme data files so GCG programs can read them.
Red Lets you format text to create publication-quality documents on a PostScript printer.
Name (UNIX) Creates, changes, deletes, or displays GCG logical name(s) from the GCG logical names table.
Symbol Creates, changes, deletes, or displays GCG symbol(s) from the GCG symbol table.

