This page contains instructions for using the programs phred, phrap, consed, and polyphred for SNP and polymorphism determination from ABI DNA sequencing chromat files.
Additional information is provided for the basic Unix commands needed and for setting up of needed programs on Macintosh computers, namely, use of OS X Unix programs (telnet, rlogin, ssh, ftp, and X11) on Mac OS X computers, and use of NCSA Telnet, Fetch for ftp file transfer, and MacX for X-windows emulation on Mac OS 9 or earlier computers.
Material relevant to Mac OS X computers was added in May, 2003, in a major revision of this document.
0. Caveats
a. Parameters given for phred, phrap, and polyphred appear to work well on the limited number of test data sets that we have used. However, there are many 'variations on a theme' here and further optimization may be needed for some data sets.
b. The programs run well and fast on elcapitan, a Sun Solaris Ultra10 computer. However, disk space may become an issue. If you have disk space problems, send email to Doug Smith.
c. The Appendix contains tips for setting up telnet, rlogin, ssh, ftp, and X-windows on Macintosh computers, to run phred, phrap, consed, and polyphred from Mac computers. I would be pleased to assist users using PCs, but to date have limited experience with comparable programs for PCs.
d. The following documentation may appear intimidating and long. However, running these programs is actually neither difficult nor time-consuming. I tend to write documentation that is rather verbose and detailed, perhaps overly so. However, in my experience, most documentation never answers key questions that I (or other users) have, and I try to avoid this problem as much as possible. I also find that 'a picture is worth a thousand words'.
A. Programs and Documentation
1. The programs:
phred, phrap, consed, and polyphred are Unix programs from the University of Washington that work as a group for analysis of new DNA sequences. They do the following:
2. Program Documentation:
Program developers have documentation available for those who
wish more information.
We have used the following documents available on the Web:
a. Document visualization directly on the Web:
b. Sites for downloading program documentation:
3. Primary References for the programs:
B. Basics and Summary of Program Use
1. Files and Program Execution
Input files: ABI chromat or SCF files
Order of Program execution: phred, then phrap, then consed
or: phred, then phrap, then polyphred, then consed
or: phred, then phrap, then consed, then polyphred, then consed
2. Summary of Program Usage, with links to details below
C. Use of Programs on the Unix Sun Ultra10 computer 'elcapitan'
elcapitan is a Unix machine in the Doug Smith group.
It is a server for the UCSD Hypertension Web site and houses this suite of sequence analysis programs, as well as other programs.
NOTE: Unix machines are case sensitive, so be careful when typing commands!
In the usage commands below, Courier New font is used for the actual Unix commands and for programs.
BE SURE to use the case indicated; otherwise you will get a cryptic error message.
Programs are usually named using lower case only, and this convention is used here.
However, some of these programs and commands contain one or more CAPITAL letters.
1. Use of Local Computers to access and use the phred, phrap, etc programs
Macs and PCs can be used to access these programs. This requires
programs on your local computer.
The following are some suggestions, based on my usage. Several other similar programs are available.
a. Macintosh OS X or later
The required programs are either already present on the Mac (telnet or ssh, ftp) or can be downloaded from the Web (X windows capability); see the Appendix below.
b. Macintosh OS 9 or earlier:
1) File Transfer: Fetch
Fetch 3.0.3: use for ftp transfer of chromat files to elcapitan, and for downloading output files
Fetch 3.0.3 is available free from Dartmouth.
Any ftp program should work here, including the ftp capability of Netscape or IE.
2) Command line usage of Unix and other computers: NCSA Telnet
use for phred, phrap, and polyphred, and for moving around in your elcapitan account
NCSA Mac Telnet 2.6 or later:
NCSA Telnet for Macs is available free for downloading.
3) X-Window usage of Unix computers: MacX
use for consed graphics
MacX 2.0: available from Apple for $150.00, or as an upgrade from MacX 1.5 for $90, by calling 1-800-293-6617 ... or search for "MacX 2.0"
c. PCs
1) File Transfer
Many ftp transfer programs exist for PCs. Some of these, available as shareware on zdnet, are as follows:
and many others, found via a search on 'ftp AND windows'
2) Command line usage of Unix and other computers: NCSA Telnet
use for phred, phrap, and polyphred, and for moving around in your elcapitan account
NCSA PC Telnet 2.3 or later:
NCSA Telnet for PCs is available free for downloading.
3) X-Window usage of Unix computers: Hummingbird Exceed
This seems to be the PC X-windows emulation package of choice.
Exceed also handles ftp transfer and the standard Unix text sessions that NCSA Telnet provides.
I could not easily find a price at the Hummingbird Web site.
D. Unix commands you will need to use or which are highly useful:
You currently need only a few Unix commands to run these programs:
For additional Unix information, there are many 'Unix help' Web sites as well as many books.
One of these sites is the Basic Unix site at the University of Washington Genome Center.
For Mac OS X, the O'Reilly book "Learning Unix for Mac OS X", by Taylor and Jepson, is useful.
E. Use of phred, phrap, consed, and polyphred
1. Get an account on elcapitan and login:
a. Get an Account on elcapitan
If you do not have an account on elcapitan, send email to Doug Smith.
An account will be set up for you together with a Password.
b. Changing your Assigned Password
If you wish to change your password, proceed as follows:
1) Login to elcapitan using either ssh or telnet; ssh is preferable
for security purposes.
See the Appendix for tips on doing this using
Mac OS X or using NCSA Telnet for Mac OS 9 or for PCs
X-windows solutions can also be used here.
2) Enter your account name at the prompt:
Enter your password at the prompt:
You should get a brief message and then the elcapitan prompt:
To change your password, do at the 'elcapitan%' prompt:
and respond to the prompts that follow.
For passwords, use a 6-8 character password that is easy for you to remember BUT not easy for others to guess.
Do NOT use your account name, your name, the name of your significant other or ex, etc for a password.
General password rules: include in your password one or more lower case AND upper case letters, plus numbers, plus other non-alphanumeric characters, eg ?^()$#
If you change your password, please do not forget it!
2. Directory structure in your account on elcapitan for these four Programs
All four of these programs either require or are easier to run with an appropriate directory structure and Naming Convention.
Subdirectories are required for each new:
1) Candidate Locus
2) ABI DNA sequence data run for each locus
3) Type of file for each sequence data run (4 subdirectories for 4 types of files)
It is important logistically to be consistent in the Naming Conventions for Files and for Subdirectories.
The following Subdirectory structure and Naming Conventions are one possibility that has been used at UCSD by Bruce Hamilton and coworkers:
a. Create the Main Subdirectory or "1st Layer" with name: <LocusName>
where <LocusName> is your favorite name for the Candidate Locus under analysis, eg ADRA1B
Do (type in and press 'return') at the prompt:
This subdirectory need be created only once ...
b. Create Subdirectories within the <LocusName> subdirectory for each DNA sequence data run:
These subdirectories need to be created for EACH ABI DNA sequence data run.
Dates are a convenient choice of name for such Sequence Data Run Subdirectories.
EXAMPLE for data from a 96-lane run on 19 Nov 1999; do the following:
1) Enter the <LocusName> subdirectory; do:
2) Create the subdirectory; do:
Thus the primary Subdirectory for the data run done on November 19, 1999, is:
c. Create Four Subdirectories within each Run Subdirectory
The four Subdirectories chromat_dir, edit_dir, phd_dir, poly_dir must be created for each DNA sequence data run.
EXAMPLE for the 19nov99 run:
1) Enter the 11.19.99 subdirectory; do:
2) Create the four subdirectories; do four operations at the 'elcapitan %:' prompt:
EXAMPLE directory structure for two runs on 19 and 23 nov99 for gene ADRA1B:
You can move directly from your Login Directory to any other directory using the cd command.
For example, to go to the edit_dir of the ADRA1B 23nov99 run from your Home Directory, do:
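The layout described in this section can be sketched as follows. This is only an illustration under stated assumptions: it builds everything in a scratch directory rather than in a real login directory, and the run name 11.23.99 is assumed by analogy with the 11.19.99 name used above.

```shell
#!/bin/sh
# Sketch only: builds the layout in a scratch directory, not a real account.
base=$(mktemp -d)          # stand-in for your login (home) directory
locus=ADRA1B               # <LocusName> from the example above

# 11.23.99 is an assumed name for the 23 Nov 1999 run
for run in 11.19.99 11.23.99; do
    for sub in chromat_dir edit_dir phd_dir poly_dir; do
        # -p also creates the locus and run directories if they are missing
        mkdir -p "$base/$locus/$run/$sub"
    done
done

# Go straight to the edit_dir of the 23 Nov 1999 run and confirm the location
cd "$base/$locus/11.23.99/edit_dir" && pwd
```

Creating the directories one at a time with separate mkdir and cd commands gives the same result; the loop is just shorter when there are several runs.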
3. Upload ABI chromat files to the chromat_dir subdirectory for the particular ABI run
Use Mac OS X ftp or Fetch or similar ... select the appropriate
chromat_dir subdirectory ... transfer the files
See the Appendix below for tips on using ftp and Fetch
...
3a. Construct a *.phd file for the "Wildtype" Sequence of the Candidate Locus
When polyphred is used here for SNP and polymorphism determination, it is most useful to have the "wildtype" sequence available for comparison and to provide a "backbone" for the multiple sequence alignment. What is needed is a *.phd file for this sequence. To generate and use such a file, do the following:
a. Retrieve the "wildtype" DNA sequence as a FASTA-formatted Sequence File from an appropriate DNA database, e.g. using Entrez at GenBank.
This sequence should extend somewhat beyond the regions that are being re-sequenced.
However, limit the size of this sequence to no more than about 30,000 bp.
Retrieve this sequence as a FASTA-formatted file, or convert what you have retrieved to a FASTA-formatted file.
FASTA-format is described in Appendix C.
Name the file containing the FASTA-formatted sequence as: <LocusName>.fasta
where <LocusName> is the name for the Candidate Locus you used in setting up your Directory Structure.
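Before converting the file, it can help to confirm that it really is FASTA-formatted and within the size guideline. The following is a minimal sketch, not part of the phred/phrap tools: the file name ADRA1B.fasta and its short made-up sequence are placeholders so that the check can be run anywhere.

```shell
#!/bin/sh
# Sketch only: writes a tiny made-up FASTA file so the check is self-contained.
work=$(mktemp -d)
fasta="$work/ADRA1B.fasta"     # hypothetical <LocusName>.fasta
cat > "$fasta" <<'EOF'
>ADRA1B example record (made-up sequence, for illustration only)
GTGGTCGGTATGTTCATCT
TCTACCGCTTGGTAAGTTG
EOF

# A FASTA file starts with a '>' header line
first_char=$(head -c 1 "$fasta")

# Count sequence characters: drop header lines, then strip all whitespace
length=$(grep -v '^>' "$fasta" | tr -d ' \t\r\n' | wc -c | tr -d ' ')

if [ "$first_char" = ">" ] && [ "$length" -le 30000 ]; then
    echo "OK: $length bp"
else
    echo "Check the file: header missing or sequence longer than 30,000 bp"
fi
```

For a real file, set the fasta variable to the path of your own <LocusName>.fasta and omit the part that writes the sample.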
b. Move this FASTA-formatted Sequence to the Locus Main Subdirectory
A convenient Subdirectory for this <LocusName>.fasta file is the "1st Layer" or Main Subdirectory for analysis of this Candidate Locus, namely subdirectory <LocusName>.
If you retrieved the GenBank FASTA-formatted file to a computer other than elcapitan, use Fetch or other ftp program to transfer the file to subdirectory <LocusName>.
c. Convert the FASTA-formatted Sequence to a PHD-formatted Sequence File
Use the Perl script fasta2Phd.perl to convert the <LocusName>.fasta file to a <LocusName>.phd.1 file.
Do at the prompt:
This will automatically create an Output File of name <LocusName>.phd.1 containing the candidate locus sequence in phd-format.
d. Copy this PHD-formatted File to each of the phd_dir Subdirectories for the Locus
This <LocusName>.phd.1 file must be used in the phred, phrap, consed, polyphred analysis of each Run.
The file thus must be present in each and every phd_dir used.
1) Move to the Locus Main Subdirectory
Go to the Main Subdirectory for this Candidate Locus, subdirectory <LocusName>, if you are not there.
Do at the prompt:
The .. in Unix refers to the subdirectory immediately above your current subdirectory in the Unix directory hierarchy, so the command moves you up first and then on from there.
2) Copy (cp command) the <LocusName>.phd.1 file to a new phd_dir subdirectory.
For example, to copy <LocusName>.phd.1 into the phd_dir for the 19nov99 run of this Candidate Locus, do:
This "backbone" or Control Wildtype Sequence will then be included in the subsequent contig analysis.
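When a locus has several runs, the copy into every phd_dir can be done in one loop. This is a sketch under assumptions: it works in a scratch directory, uses an empty placeholder in place of the real fasta2Phd.perl output, and assumes the run directory names 11.19.99 and 11.23.99.

```shell
#!/bin/sh
# Sketch only: a scratch copy of the layout with an empty placeholder .phd.1 file.
base=$(mktemp -d)
locus=ADRA1B                                   # <LocusName>
mkdir -p "$base/$locus/11.19.99/phd_dir" "$base/$locus/11.23.99/phd_dir"
: > "$base/$locus/$locus.phd.1"                # placeholder for the fasta2Phd.perl output

cd "$base/$locus" || exit 1
for dir in */phd_dir; do
    # copy the wildtype backbone into this run's phd_dir
    cp "$locus.phd.1" "$dir/"
done
ls */phd_dir
```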
4. Execution of phred and phrap: phredPhrap Script
a. Move to the edit_dir for the appropriate run
You can do this in many ways, depending on which subdirectory you are in currently.
A general way to do this is to return to your Login directory, and then cd down to the edit_dir:
1) to return to your Login directory, do at the prompt:
2) to go to the edit_dir, eg for the 19nov99 run for candidate locus <LocusName>, do:
These two commands can be combined in the single command:
3) at any time, you can see what subdirectory you are in by doing:
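The navigation steps just described can be sketched with the standard cd and pwd commands. The locus name ADRA1B and run name 11.19.99 are taken from the examples above; HOME is pointed at a scratch directory here only so that the sketch can be run without touching a real account.

```shell
#!/bin/sh
# Sketch only: HOME is redirected to a scratch directory for the demonstration.
HOME=$(mktemp -d)
export HOME
mkdir -p "$HOME/ADRA1B/11.19.99/edit_dir"

cd                                    # with no argument, cd returns to the login directory
cd ADRA1B/11.19.99/edit_dir           # then descend to the run's edit_dir
pwd                                   # show the current directory

cd "$HOME/ADRA1B/11.19.99/edit_dir"   # the same move as a single command
here=$(pwd)
echo "$here"
```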
b. Run the Perl script phredPhrap
This executes RepeatMasker, phred, and phrap, one after the other.
1) do at the prompt IN THE appropriate edit_dir subdirectory for your Candidate Locus:
You will see a lot of messages come up to the screen as the programs execute ...
Once finished, examine your new files; do:
You should see something like:
elcapitan% cd edit_dir
elcapitan% ls
03.28.01.contigs 03.28.01.fasta.screen.problems.qual
03.28.01.fasta 03.28.01.fasta.screen.qual
03.28.01.fasta.log 03.28.01.fasta.screen.singlets
03.28.01.fasta.screen 03.28.01.fasta.screen.view
03.28.01.fasta.screen.ace.1 03.28.01.newtags
03.28.01.fasta.screen.contigs 03.28.01.phrap.out
03.28.01.fasta.screen.contigs.qual 03.28.01.screen.out
03.28.01.fasta.screen.log 03.28.01NewChromats.fof
03.28.01.fasta.screen.polyphred.out 03.28.01_to_alu.cross
03.28.01.fasta.screen.problems
To examine the contents of any file of name <fileName>, do:
more brings up the contents of the file to the screen, one page at a time.
Scroll to the next page by pressing the space bar on your keyboard.
To stop the file examination, press the q key on your keyboard.
5. Execution of consed
Note: If you are doing SNP / Polymorphism work, you can skip Step 5 here and go immediately to Step 6, the running of PolyPhred. Step 7 is then the subsequent use of Consed and is largely a repeat of this Step 5.
To visually examine the contigs formed by phrap, and the quality of alignments, run consed.
This must be done from a program that can emulate X-windows, eg X windows for Mac OS X, MacX for Mac OS 9 or earlier, or Exceed for a PC.
a. Start X-Windows from your Computer
If configured correctly, you should log on directly to elcapitan and get a text-based window similar to the one you get from Telnet.
Instructions for setting up an X windows Xterm window from a Mac OS X computer are provided in Appendix A3, and for MacX on Mac OS 9 or earlier in Appendix B3.
b. Move to the edit_dir for the appropriate run if you are not already there
You can do this in many ways, depending on which subdirectory you are in currently.
A general way to do this is to return to your Login directory, and then cd down to the edit_dir:
1) to return to your Login directory, do at the prompt:
2) to go to the edit_dir, eg for the 19nov99 run for candidate locus <LocusName>, do:
These two commands can be combined in the single command:
3) at any time, you can see what subdirectory you are in by doing:
c. Run consed
At the elcapitan prompt in the X windows text window, do:
Note: For Mac OS X, follow the instructions in Appendix A3 regarding the xhost command.
If all is well, you will get some graphics windows appearing on your computer screen.
You can now proceed to look at your data using consed.
d. Learning how to use consed
consed is well described in the consed README documentation, which is available on elcapitan.
To examine this documentation one page at a time, type the following command at the 'elcapitan%' prompt in your elcapitan connection window (X windows, terminal, telnet window):
You can also get a copy of this documentation into your own account by typing from your account directory:
NOTE the period . in the above command ... most important!
You can of course also download this file via ftp and print it out ... or whatever.
e. Run the consed Tutorial from the consed README documentation
It is most worthwhile to do some of the "Quick Tour of consed" tutorial that is found in the consed README documentation. Spending an hour or two on this tutorial will save you time later.
The easiest way to do this is to have TWO connections to elcapitan at the same time, and then do:
1) execute the commands in the tutorial from an X windows connection ... and ...
2) simultaneously read the Tutorial information from the README documentation using a second connection to elcapitan. This can be done using any connection (X windows, Terminal, NCSA Telnet, etc).
Although you can run the tutorial on your own data, it is more convenient to use the standard test data provided with consed since some of the tutorial examples apply directly to these standard test data.
To then do this using the consed standard test data, do the following:
1) Turn on X windows and get an X windows text window (xterm, MacX, eXceed, etc)
2) Type the following two commands at the 'elcapitan%' prompt in your X windows window:
The first command takes you to the edit_dir containing the *.ace file for the polyphred test data.
The second command executes consed, and brings up the standard test data used in the tutorial.
You are now ready to execute the commands in the tutorial using this X windows window.
Note: skip past the first part of the tutorial, which deals with consed distribution, installation, and turning on consed. Distribution and installation have been done by us, and you have already turned on consed.
Begin where the tutorial says: "Two windows will appear. ..."
3) To simultaneously read the information on what to do to run the tutorial from the consed README documentation, open a SECOND connection to elcapitan. This can be a second X windows window, or a Terminal window on a Mac OS X computer, or a NCSA Telnet window.
4) Log on to elcapitan from this second window, and bring up the consed README documentation by doing at the 'elcapitan%' prompt:
Press the space bar or the <return> key to move down to the 'Quick Tour of consed' tutorial part of the documentation (the space bar moves one page at a time, the <return> key moves one line at a time, through the document displayed via the more command).
5) Read the tutorial in this second window and execute the tutorial commands in your first, X windows window.
f. To Sort the Chromatogram "Reads" before running consed:
If you wish to sort the reads before running consed, do the following:
1) Turn on consed, but BEFORE opening the .ace file, go to the consed Main Menu.
2) Under options, choose general preferences.
3) The eleventh selection says Display reads sorted alphabetically or by strand/left read end; click on alpha, and then click on apply and dismiss.
4) Now open the *.ace file and proceed as usual.
The chromatograms will now be sorted alphabetically (or numerically).
Experience at UCSD has shown this to be a useful option.
g. Shutting down consed
Use the Quit command from the consed Main Menu to shut down consed.
After using consed and quitting, return to your Home directory on elcapitan by typing in the X windows text window:
cd
6. Execution of polyphred
Now find polymorphisms in your contigs and add appropriate tags to the data files via polyphred.
polyphred is most conveniently run from the edit_dir directory for the run of interest.
a. Enter the edit_dir for the appropriate run
You can do this in many ways, depending on which subdirectory you are currently in.
A general way to do this is to return to your Login directory, and then cd down to the edit_dir:
1) to return to your Login directory, do at the prompt:
2) to go to the edit_dir, eg for the 19nov99 run, do:
3) at any time, you can see what subdirectory you are in by doing:
b. Use Telnet and text commands to run polyphred
The text window in MacX can also be used to run polyphred.
c. Run polyphred
Example for the 19nov99 data run; do at the elcapitan prompt in Telnet, in the edit_dir directory:
where *.ace.1 is the .ace file present in the edit_dir subdirectory.
Example of such a *.fasta.screen.ace.1 filename:
Note: this command will give polymorphisms of quality 'ranks' 1 through 3, where 1 is the highest quality and 6 is the lowest.
The qualifier -tag p is used to list the tagged polymorphisms in the polyphred output file *.polyphred.out.
To see ALL polymorphisms (qualities 1-6), add the -rank 6 option in the above command:
Polymorphisms of ranks 4, 5, or 6 are very seldom real, as is true also of many of rank 2 or 3 ...
d. Results:
1. Lots of messages will come up to the screen as polyphred does its thing ...
2. polyphred writes output to your output file; for the above, file:
3. polyphred also MODIFIES the *.fasta.screen.ace file, the *.phd files in the phd_dir directory for sequences with polymorphisms, and the *.poly files in the poly_dir directory for sequences with polymorphisms.
e. Examine Polymorphisms found by polyphred
To see via text display the polymorphisms found by polyphred:
1) Examine the contents of the polyphred output file, eg:
Do:
Continue examination until you come to the section beginning with:
Examine these data to the end of this section:
These data show position of polymorphism in a contig, 5' and 3' sequences, the SNP, and quality of the polymorphism.
EXAMPLE of such data:
Posn 5'seq     SNP 3'seq     Quality
BEGIN_POLY
93   GTGGTCGGT A G TGTTCATCT 6
137  TCTACCGCT T C GGTAAGTTG 6
138  CTACCGCTT G T GTAAGTTGG 6
139  TACCGCTTG G T TAAGTTGGG 6
145  TTGGTAAGT A T GGGGACTAG 2
147  GGTAAGTTG G A GGACTAGCA 6
148  GTAAGTTGG G A GACTAGCAG 6
160  CTAGCAGCA G C GGGGACTGG 6
170  GGGGACTGG G A CATTTTTGG 6
186  TGGACCTTG G A GTTTACTGA 6
189  ACCTTGGGT T G TACTGATGA 6
193  TGGGTTTAC T A GATGAGCTT 6
209  CTTACTCTA A C AGTTTTTTG 6
216  TAAAGTTTT T G TGTGGGTTT 6
225  TTGTGGGTT T G TGTTTCTTA 6
239  TCTTATGCA G A TCTGTGCGT 6
253  TGCGTGTTC G A GAGATTGAA 6
259  TTCGGAGAT T A GAATAATAT 6
272  TAATATTGT A T TGTTCTGCA 2
273  AATATTGTT T C GTTCTGCAA 6
281  TTGTTCTGC A T AAGGGTTTG 6
282  TGTTCTGCA A C AGGGTTTGC 6
283  GTTCTGCAA C A GGGTTTGCA 1
293  GGGTTTGCA G T ATTGGGGAG 2
298  TGCAGATTG G A GGAGCTGGC 6
306  GGGGAGCTG G A CTAAAAACC 6
311  GCTGGCTAA A C AACCAACTC 6
332  GTGTTAGTA G A AACACGCTA 6
350  AAGGCACTA G T CTTCTGGAA 6
363  CTGGAAATA G C AACCAGGGA 6
389  TCTGGTATG A G GGAATGACT 2
390  CTGGTATGA G T GAATGACTC 6
427  AATAATTAA A G AAGGATATT 6
438  AGGATATTC A G CTGGGCTTG 6
END_POLY
The above data include Quality Ranks of ALL levels (1-6). Normally you will have only qualities of 1-3, and it is the experience of workers that Quality 3 is seldom real, Quality 2 is sometimes real, and Quality 1 is usually but not always real.
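When the output lists all ranks, the better-ranked candidates can be pulled out of the BEGIN_POLY ... END_POLY section with a short awk filter. This is a sketch, not a polyphred feature: it assumes each row has six whitespace-separated fields in the order shown in the example (position, 5' sequence, two bases, 3' sequence, rank), and it uses a few rows copied from the example in place of a real *.polyphred.out file.

```shell
#!/bin/sh
# Sketch only: a shortened copy of the example section stands in for *.polyphred.out.
out=$(mktemp)
cat > "$out" <<'EOF'
BEGIN_POLY
93 GTGGTCGGT A G TGTTCATCT 6
145 TTGGTAAGT A T GGGGACTAG 2
283 GTTCTGCAA C A GGGTTTGCA 1
438 AGGATATTC A G CTGGGCTTG 6
END_POLY
EOF

# Print position, the two bases, and rank for rows inside the section with rank <= 3.
# Assumes six whitespace-separated fields per row, as in the example.
max_rank=3
kept=$(awk -v max="$max_rank" '
    /^END_POLY/   { inside = 0 }
    inside && NF == 6 && $6 <= max { print $1, $3 "/" $4, "rank " $6 }
    /^BEGIN_POLY/ { inside = 1 }
' "$out")
echo "$kept"
```

With the sample rows above, only positions 145 and 283 are printed. The filter only narrows the list; each remaining candidate still needs visual inspection in consed.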
Consed must be used to examine visually the ABI chromatogram characteristics for each potential Polymorphism to decide if a given polymorphism is "real"; this should be done by at least two different personnel.
This is described in more detail below.
Annotations or comments can be made to the above file to designate decisions made concerning potential polymorphisms. Alternatively, COPY-PASTE can be used to move the above data to a Word file for subsequent annotation.
f. Modifications in the phrap Files made by polyphred:
To see via text display modifications in the phrap files made by polyphred:
1) Examine the END of the contents of, eg, the *.fasta.screen.ace file,
EXAMPLE of such a file:
Do:
or do:
and examine modifications such as the following at the end of the file:
CT{
Contig2 polymorphism polyPhred 106 106 1000106:205257
}
CT{
Contig2 polymorphism polyPhred 150 150 1000106:205257
}
CT{
Contig2 polymorphism polyPhred 151 151 1000106:205257
}
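The tags that polyphred adds can also be listed without paging through the whole file. The following sketch assumes the field order seen in the lines above (contig name, tag type, program, start, end, and a final stamp that looks like a date and time) and uses two of those tag entries in place of a real *.ace file.

```shell
#!/bin/sh
# Sketch only: the tag lines shown above stand in for the end of a real .ace file.
ace=$(mktemp)
cat > "$ace" <<'EOF'
CT{
Contig2 polymorphism polyPhred 106 106 1000106:205257
}
CT{
Contig2 polymorphism polyPhred 150 150 1000106:205257
}
EOF

# For each polyPhred polymorphism tag, print the contig name and tagged position.
# Assumes the field order: contig, tag type, program, start, end, stamp.
tags=$(awk '$2 == "polymorphism" && $3 == "polyPhred" { print $1, $4 }' "$ace")
echo "$tags"
echo "tag count: $(printf '%s\n' "$tags" | wc -l | tr -d ' ')"
```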
7. Reexecution of consed
If all went well with polyphred, you will now be able to visualize the polymorphisms using consed.
This must be done from an X windows text window (Xterm, MacX, eXceed, etc).
a. Start X-Windows from your Computer
If configured correctly, you should log on directly to elcapitan and get a text-based window similar to the one you get from Telnet.
Instructions for setting up an X windows Xterm window from a Mac OS X computer are provided in Appendix A3, and for MacX on Mac OS 9 or earlier in Appendix B3.
b. Move to the edit_dir for the appropriate run if you are not already there
You can do this in many ways, depending on which subdirectory you are in currently.
A general way to do this is to return to your Login directory, and then cd down to the edit_dir:
1) to return to your Login directory, do at the prompt:
2) to go to the edit_dir, eg for the 19nov99 run for candidate locus <LocusName>, do:
These two commands can be combined in the single command:
3) at any time, you can see what subdirectory you are in by doing:
c. Run consed
At the elcapitan prompt in the X windows text window, do:
Note: For Mac OS X, follow the instructions in Appendix A3 regarding the xhost command.
If all is well, you will get some graphics windows appearing on your computer screen.
You can now proceed to look at your data using consed.
d. Learning how to use consed
consed is well described in the consed README documentation, which is available on elcapitan.
To examine this documentation one page at a time, type the following command at the 'elcapitan%' prompt in your elcapitan connection window (X windows, terminal, telnet window):
You can also get a copy of this documentation into your own account by typing from your account directory:
NOTE the period . in the above command ... most important!
You can of course also download this file via ftp and print it out ... or whatever.
e. Run the consed Tutorial from the consed README documentation
It is most worthwhile to do some of the "Quick Tour of consed" tutorial that is found in the consed README documentation. Spending an hour or two on this tutorial will save you time later.
The easiest way to do this is to have TWO connections to elcapitan at the same time, and then do:
1) execute the commands in the tutorial from an X windows connection ... and ...
2) simultaneously read the Tutorial information from the README documentation using a second connection to elcapitan. This can be done using any connection (X windows, Terminal, NCSA Telnet, etc).
Although you can run the tutorial on your own data, it is more convenient to use the standard test data provided with consed since some of the tutorial examples apply directly to these standard test data.
To then do this using the consed standard test data, do the following:
1) Turn on X windows and get an X windows text window (xterm, MacX, eXceed, etc)
2) Type the following two commands at the 'elcapitan%' prompt in your X windows window:
The first command takes you to the edit_dir containing the *.ace file for the polyphred test data.
The second command executes consed, and brings up the standard test data used in the tutorial.
You are now ready to execute the commands in the tutorial using this X windows window.
Note: skip past the first part of the tutorial, which deals with consed distribution, installation, and turning on consed. Distribution and installation have been done by us, and you have already turned on consed.
Begin where the tutorial says: "Two windows will appear. ..."
3) To simultaneously read the information on what to do to run the tutorial from the consed README documentation, open a SECOND connection to elcapitan. This can be a second X windows window, or a Terminal window on a Mac OS X computer, or a NCSA Telnet window.
4) Log on to elcapitan from this second window, and bring up the consed README documentation by doing at the 'elcapitan%' prompt:
Press the space bar or the <return> key to move down to the 'Quick Tour of consed' tutorial part of the documentation (the space bar moves one page at a time, the <return> key moves one line at a time, through the document displayed via the more command).
5) Read the tutorial in this second window and execute the tutorial commands in your first, X windows window.
f. consed Output related to Polymorphisms
To examine the consed presentation related to polymorphisms, do the following,
as taken from the consed documentation README.consed:
The following is similar, but based on suggestions and procedures used by Sarah Shaw:
1) in consed, call up the appropriate *.ace file and the contig.
2) under the Navigate pulldown menu, select:
This will turn the feature on.
3) under the Navigate pulldown menu, select:
and then select:
This creates a new window called polymorphism tags. This window lists all individuals who have a putative SNP, and gives the consensus location for each SNP.
4) double-click on the consensus location for a SNP in the polymorphism tags window.
This will bring up all ABI traces for each individual at this location.
Scroll through the traces and visualize each lane to determine if the SNP is real or not.
The lanes thought to have a SNP by consed are tagged in blue.
Note: when one is evaluating the worthiness of a putative SNP via examination of ABI trace data, it is worthwhile to also open the *.polyphred.out file and make notes next to (annotate) the listed tagged polymorphisms regarding this decision process.
5) if a given SNP is determined to be "real", then genotype information for the SNP is copied from the *.polyphred.out file into a Master file for the candidate locus in Excel.
These annotation operations are also described below.
g. If you wish to SORT THE READS before running consed, do the following:
1) BEFORE opening the *.ace file, go to the consed Main Menu.
2) Under options, choose general preferences.
3) The eleventh selection says Display reads sorted alphabetically or by strand/left read end; click on alpha, and then click on apply and dismiss.
4) Now open the *.ace file and proceed as usual.
The chromatograms will now be sorted alphabetically (or numerically).
h. consed tutorial using polyphred analysed data:
You can learn how to use consed on the test data mentioned in the above documentation as follows:
1) Turn on X windows and get an X windows text window (xterm, MacX, eXceed, etc)
2) Type the following two commands at the 'elcapitan%' prompt in the MacX text window:
The first command takes you to the edit_dir containing the *.ace file for the polyphred test data.
The second command executes consed.
You can now work with these data in consed using X windows as suggested in the consed tutorial, as reproduced above.
After using consed and quitting, return to your Home directory on elcapitan by typing in the MacX text window:
cd
To learn how to optimally use consed in general, I highly recommend following the tutorial in the consed README documentation. You can do this as described above.
F. Subsequent Candidate Locus SNP Analysis:
The following constitutes the initial types of subsequent analysis, as described by Sarah Shaw.
1. Determine which potential polymorphisms are real:
This is described above and comprises the following:
1) Use of consed and visual inspection by two or more personnel of ABI chromatograms to decide if a given polymorphism is "real".
2) Annotate or add comments to the Output File *.polyphred.out or to a Word File containing the polymorphism data using COPY-PASTE operations.
2. Annotate further the Candidate Locus information:
Construct an Excel Master Spreadsheet for the Candidate Locus.
This file will contain at least the following:
1) Annotation on the Candidate Locus as obtained from GenBank, ExPASy, and other sources.
2) Genotypic data for each real SNP determined from the above DNA Sequencing analyses.
3) Phenotypic data from a variety of sources.
4) Further human genetic and statistical analyses.
Appendix:
A. For Use of Mac OS X (10.2 or higher)
The Macintosh Operating System OS X is based on the Darwin version of Unix. Thus, Unix functions needed to communicate with another Unix computer (telnet etc), to transfer files via ftp (file transfer protocol), and to run X-Windows directly as a graphics protocol on the Macintosh are either provided directly with OS X or can be obtained online, with no need for third party programs. The following briefly describes how to perform these tasks with a Mac OS X computer (OS X 10.2 or higher).
1. telnet, rlogin, and ssh under OS X
The Unix functions telnet, rlogin, and ssh all can be used directly in Mac OS X to connect to, and communicate with, a second computer that supports these functions.
On a Mac computer running under OS X, these can all be used from the terminal window.
The terminal window on a Mac OS X computer is used to execute Unix commands on the Mac; the Mac largely becomes a Unix computer.
1. Open a terminal window by executing the terminal program, found in:
/Applications/Utilities
2. To use telnet to connect to a second computer, eg elcapitan, do at the terminal window prompt:
and then login with Username and Password.
3. To use rlogin to connect to a second computer, eg elcapitan, do at the terminal window prompt:
and then login with Username and Password.
4. To use ssh to connect to a second computer, eg elcapitan, do at the terminal window prompt:
ssh -l <UserName> elcapitan.ucsd.edu
where <UserName> is your Username on the second computer, eg elcapitan.
Then respond with your Password.
5. With each of these connections, the terminal window is then used as a Unix text window for execution of Unix line commands.
NOTE: Any of these three commands permits connection to the second computer, but the ssh (Secure SHell) command is the best to use for security purposes. ssh encrypts your password (and the rest of the session) during transit between the computers, thereby making it more difficult for hackers to obtain your password. Because of the lack of security with the telnet and rlogin commands, some computer systems, eg those at the San Diego Supercomputer Center, prohibit use of these commands and require use of ssh.
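The three connection commands can be sketched as follows. This is a minimal illustration only: the helper name connect_cmd is hypothetical, the host name is taken from the ssh example above, the user doug is used as a placeholder, and the telnet and rlogin forms shown are the usual ones and are assumed here. The function only prints the command, so you can inspect it before typing it at the terminal window prompt.

```shell
# Sketch: print (do not run) the command for each connection method.
# REMOTE_HOST comes from the ssh example above; connect_cmd is a hypothetical helper.
REMOTE_HOST="elcapitan.ucsd.edu"

connect_cmd() {
    method="$1"   # telnet, rlogin, or ssh
    user="$2"     # your Username on the second computer
    case "$method" in
        telnet) echo "telnet $REMOTE_HOST" ;;
        rlogin) echo "rlogin -l $user $REMOTE_HOST" ;;
        ssh)    echo "ssh -l $user $REMOTE_HOST" ;;
        *)      echo "unknown method: $method" >&2; return 1 ;;
    esac
}

# Show the recommended (ssh) form for the placeholder user.
connect_cmd ssh doug
```

Typing the printed line at the prompt starts the connection; telnet asks for the Username after connecting, which is why it takes no user argument here.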
2. ftp under OS X
The Unix ftp function can also be used directly in Mac OS X, to transfer files (File Transfer Protocol) to or from a second computer that supports the ftp function.
On a Mac computer running under OS X, ftp can also be used from the terminal window.
1. Open a terminal window by executing the terminal program, found in:
/ Applications / Utilities
2. To use ftp for file transfer to or from a second computer, eg elcapitan, do at the terminal window prompt:
ftp elcapitan.ucsd.edu
and login with your UserName and Password. You should get the ftp prompt: ftp>
3. Execute appropriate ftp commands to transfer ascii or binary files to or from your Mac OS X system. These commands are explained in any standard Unix book, or you can use the "man" facility on any Unix machine, eg from the terminal window on the Mac.
To see the commands available (with no explanation of what they do), at the ftp> prompt do:
?
4. The most important ftp commands, executed at the ftp> prompt, for file transfer are:
Notes:
Final Note: Because of the Unix text-only "look and feel" of ftp as executed in the terminal window (line commands, no menus, etc), versus the Mac interactive mouse "look and feel" of Fetch (menus, shortcuts, drag and drop, etc), you may wish to use Fetch for file transfer even if you are using a Mac OS X computer!!
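A typical transfer session might use a sequence of ftp commands like the one below. This is a hedged sketch, not a list from the author: binary, cd, put, get, and bye are standard ftp commands, but the directory and file names are hypothetical placeholders. The script only writes the sequence to a text file for reference; you would type the same lines at the ftp> prompt.

```shell
# Sketch: write a typical sequence of ftp commands to a file for reference.
# The directory name and file names are hypothetical placeholders.
#   binary  - transfer files byte-for-byte (needed for ABI chromat files)
#   cd      - change directory on the second computer
#   put     - upload a file from the Mac to the second computer
#   get     - download a file from the second computer to the Mac
#   bye     - close the connection and leave ftp
cmdfile="${TMPDIR:-/tmp}/ftp_commands.txt"

cat > "$cmdfile" <<'EOF'
binary
cd chromat_dir
put sample01.ab1
get sample01.polyphred.out
bye
EOF

# Display the sequence.
cat "$cmdfile"
```

Use ascii instead of binary for plain text files if line endings need to be converted between systems.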
3. X-Windows under OS X
X-Windows is the standard graphics interface used by most Unix computers. It is used here by Consed to display sequence alignments, sequence features, sequencing traces, etc. Although the Mac OS X operating system supports X-windows, X-windows software does not come as a standard part of OS X. However, this software is free and available on the Web. The following describes in four Steps how to do this, how to customize your Xterm windows, and how to connect up and use Consed and other Unix graphics programs (Xapps).
The current standard version of X-windows is the X11 version, and can be downloaded for free from several Web sites. The version available at Apple is recommended (stable, clean, robust).
1) Go to Apple X11 MacOSX Web site at:
2) Retrieve FAQ pages as learning material, store in convenient folder under Documents
3) Download X11 package (X11 Public Beta 2-12-03, based on XFree86 4.2.1) from:
4) The file to download is:
5) Open this StuffIt bin file, get folder:
6) Move (drag and drop) this folder to the Mac folder for OS X applications:
Applications
7) Install X11 via Basic Installation from Installer:
thereby yielding X11 as application X11:
b. Run X11 on the Mac as Xterm window and Customize your Xterm windows
To run X11, open the X11 file (double-click on it). This opens an xterm window, a Unix window similar to a terminal or term window except that Xapps are supported. This xterm Unix window works like all other Unix line command windows: you type in commands at the Unix xterm prompt.
1) To run standard X11 applications, just type their names at the xterm window prompt.
Examples are:
xclock & ... xlogo & ... xcalc & ... xload &
Note: The & runs the Unix application in the background; the application opens in its own window and you keep control of your xterm window.
2) Most of these applications are found in folder:
/ usr / X11R6 / bin
And there is more stuff in folder:
/ etc / X11
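The effect of the trailing & can be seen without any X application at all. The following minimal sketch uses sleep as a stand-in for an Xapp such as xclock (an assumption made only so the example runs anywhere); the variable names are illustrative.

```shell
# Sketch: '&' starts a command as a background job and returns the prompt at once.
# 'sleep 1' stands in for an X application such as xclock.
sleep 1 &
bg_pid=$!                 # process ID of the background job just started
echo "started background job"

wait "$bg_pid"            # block here until the background job has finished
bg_status=$?
echo "background job exited with status $bg_status"
```

In normal use you would not wait for an Xapp; the point is only that the shell accepts further commands while the job is still running.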
To customize your Xterm windows:
You can easily customize your prompt, the window size and type, lines saved for scrollback, etc.
This is done by modifying the xterm command in an "init" file called .xinitrc
Init files are Unix files that are executed upon login, to set parameters for your use of the computer.
The .xinitrc file is the appropriate init file for X-window applications, including the xterm window.
3) To see what can be done, type at the xterm window prompt:
xterm -help
4) Using a Unix terminal window, copy a sample xinitrc file to your home directory as file .xinitrc; do:
i) Go to your home directory ... this is typically on a Mac OS X computer:
/ Macintosh HD / Users / <YourName>
ii) Execute the following copy (cp) command:
cp /etc/X11/xinit/xinitrc .xinitrc
5) Modify the xterm command in this file as desired, using vi or another Mac Unix editor.
My command looks like:
xterm -sb -sl 5000 -fs 9 -bc -geometry 100x55 &
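The options in the command above can be read as follows; this is a sketch based on my reading of the xterm man page, and the variable names are hypothetical. It assembles the same command from named settings so that each value is easy to change, and prints the result for pasting into .xinitrc.

```shell
# Sketch: assemble the xterm command from named settings (variable names are hypothetical).
SCROLLBACK_LINES=5000     # -sl: number of lines saved for scrollback
FONT_SIZE=9               # -fs: font size
GEOMETRY="100x55"         # -geometry: window width x height, in characters

# -sb turns on the scrollbar; -bc turns on the blinking text cursor.
XTERM_CMD="xterm -sb -sl $SCROLLBACK_LINES -fs $FONT_SIZE -bc -geometry $GEOMETRY &"

# Print the command line to place in .xinitrc.
echo "$XTERM_CMD"
```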
c. Run ssh from the Xterm Window to connect to elcapitan or other Computer
This is done as described above for connecting with ssh from a terminal window, but with the -X option added.
1) To do this to connect to a second computer, eg elcapitan, do at the xterm window prompt:
ssh -X -l <UserName> elcapitan.ucsd.edu
where <UserName> is your Username on the second computer, eg elcapitan.
Then respond with your Password.
Note: You can omit this -l <UserName> option if your Username is the same on your Mac and on the second computer.
You are now logged in to the second computer just as via a terminal window, but you now have the potential to run X-windows graphics applications with display of the graphics on your Mac display.
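One quick check after logging in is whether the DISPLAY environment variable is set, since X applications use it to find the screen on which to draw. The following is a minimal sketch; check_display is a hypothetical helper name, and the exact value you see (often something like localhost:10.0 when ssh forwards X11) depends on the setup of the second computer.

```shell
# Sketch: report whether an X display is available in the current shell.
# check_display is a hypothetical helper name.
check_display() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to $DISPLAY"
    else
        echo "DISPLAY is not set"
        return 1
    fi
}

check_display || echo "X applications will not be able to open windows from this shell"
```

If DISPLAY is not set after connecting with ssh -X, X forwarding is probably not enabled on the second computer.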
d. Execution of X-windows Applications on Second Computer, eg elcapitan
In connecting to a computer such as elcapitan for the purpose of running X-window applications, one must ensure that the computer sends the X-window graphics back to your Mac and displays them on your display screen. The only way we have found to do this successfully uses the xhost command. This creates a security hole while on, but xhost need only be on while the X-windows application is being started. An X application started with xhost on will continue to work with xhost off.
Note: Xforwarding must be set up on the second computer on which you wish to run the X applications. If, after doing the following, you cannot get a display of an Xapp, eg xclock, on your Mac from the second computer, contact the system administrator of the second computer and ask if Xforwarding is set up.
Do the following to execute Xapps from the Second Computer:
1) Connect to the second computer, eg elcapitan, from an xterm window on the Mac using ssh as described above.
2) Open a terminal window as a second Unix window on the Mac; do NOT connect up to the second computer.
To do this, execute the terminal program, found on the Mac in:
/ Applications / Utilities
3) When ready to turn on an Xapp in the xterm window from the second computer, eg elcapitan, do the following:
i) In the terminal window on the Mac (NOT connected to the second computer !!), type at the prompt:
xhost +
You should see a response like:
access control disabled, clients can connect from any host
ii) In the xterm window on the second computer, execute one or more X applications, for example:
xclock & ... xload & ... xcalc & ... consed &
Note: The & runs the Unix application in the background; the application opens in its own window and you keep control of your xterm window.
iii) Finally, in the terminal window on the Mac, type at the prompt:
xhost -
You should see a response like:
access control enabled, only authorized clients can connect
This removes the security leak and the X applications in the xterm windows should still be running ...
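The on/off sequence can be summarized as below. The sketch also shows a narrower variant in which xhost is given a host name, so that only that host is added and later removed; xhost accepts this form, but it is offered here as an assumption and has not been tested in this setup. The helper name xhost_bracket is hypothetical, and the function only prints the commands to type in the Mac terminal window.

```shell
# Sketch: print the xhost commands that bracket the launch of an X application.
# With no argument, access is opened to all hosts (as in the steps above);
# with a host name, only that host is added and later removed.
xhost_bracket() {
    host="${1:-}"
    echo "xhost +$host"
    echo "# ...start the X application in the xterm window here..."
    echo "xhost -$host"
}

# Narrower variant, naming only the second computer.
xhost_bracket elcapitan.ucsd.edu
```

Calling xhost_bracket with no argument prints the plain xhost + and xhost - pair used in the steps above.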
e. Tests of xterm etc
I find xclock to be a good test of your connectivity and functionality.
In particular, you might try the following:
1) Open an xterm window and turn on xclock on the Mac with a light blue background; do:
xclock -bg lightblue &
Note: to learn about options for an Xapp, eg xclock, do:
man xclock ... or ... xclock usage
2) Now connect up to a second computer, eg elcapitan, and use the xhost + / xhost - game in a Mac terminal window as described above while turning on xclock with a yellow background on elcapitan:
xclock -bg yellow &
You should now have two xclocks on your screen, the lightblue one showing the Macintosh time and the yellow one showing the time on the second computer ...
If this all works, you are pretty well set up !!!
B. For Use of Mac OS 9 or lower
The Macintosh Operating System for the PowerPC prior to OS X (OS 8-9) is not Unix based, and hence third party programs for tasks including Telnet (TCP/IP communication to a second computer, eg elcapitan), ftp (file transfer protocol; interchange of files between computers), and X-Windows (the default graphics standard used on Unix computers) emulation must be used. Typical programs for these three tasks are NCSA Telnet, Dartmouth Fetch, and Apple MacX. Set up and use of these programs is briefly described here.
1. NCSA Telnet
NCSA Telnet can be downloaded from NCSA, in versions for Macs and for PCs.
a. Preferences
Once installed, you need to set the Preference settings. These can be found in the Edit pulldown Mac menu. There are five menus: Global, Terminals, Sessions, FTP Server, FTP Users.

The following are the settings I use:
1) Global settings:

2) Terminal settings:
You can have more than one set of Terminal settings. I use only one, as follows, accessible by clicking on the change button in the Terminal settings window:

3) Sessions settings:
You can have more than one set of Sessions settings. I use only one, as follows, accessible by clicking on the change button in the Sessions settings window:

4) FTP Server settings:
NCSA Telnet can be used for ftp file transfer. I prefer the drag-and-drop capabilities of Fetch, and so do not use ftp with Telnet. For reference, the settings that I have for the FTP Server are as follows:

5) FTP Users settings:
I don't use the FTP Users capabilities of NCSA Telnet.
2. Dartmouth Fetch
Fetch can be downloaded from Dartmouth. The version as of 7 Jan 2000 is Fetch 3.0.3. Fetch comes with a good Help facility.
a. Preferences:
Once installed, you need to set the Preference settings, which can be found in the Customize dropdown menu:

The Preferences popup menu has several parts:
Settings I use for each part are as follows:
1) General settings:

2) Download settings:

3) Upload settings:

4) Formats settings:

5) Firewall settings:
These are not used by me.
6) Mirrors settings:
These are not used by me.
7) Misc settings:

b. Shortcuts:
Shortcuts are used for rapid login to your favorite computer and directory.
Examples of Shortcuts that I use are as follows, from the Fetch File pulldown menu:

A given Shortcut can be selected from the Open Shortcut menu shown, or from the following popup menu obtained when you execute a New Connection... operation also shown in the menu above:

An example of a Shortcut is that for elcapitan among the choices above. Selecting elcapitan yields:

Clicking on ok automatically logs me into my account doug on elcapitan, and puts me in my home directory /export/home/doug, shown as follows:

A window comes up that shows the connection being made and retrieval of the list of files in my home directory. You can now move down into lower subdirectories by double-clicking on any directory name in the window.
Files may be uploaded to elcapitan by simply 'dragging' their names from the appropriate Mac folder to the appropriate directory displayed in the Fetch directory window. Transfer rates and state of completion are displayed at the right in the above window. The following is an example of transfer of 96 ABI chromat files:

Files may be downloaded (get operation) by 'dragging' them from the Fetch directory window to the appropriate Mac folder. Transfer rates and state of completion are again displayed at the right in the above window.
Other operations are possible, eg uploading or downloading all files in a folder or directory, deleting files on the computer to which you are connected, editing Shortcuts you have created, etc. See the Fetch Help facility and other Fetch pulldown menus for details.
3. Apple MacX
MacX 2.0 may be purchased from the UCSD bookstore or from UCSD Academic Computing Services (ACS).
Once installed, you need to set the Preference settings. MacX can be used in a variety of ways; see the MacX User Guide that comes with MacX.
The following rather simple setup has worked for me so far with little trouble (MacX does bomb sometimes ...)
a. Setting up an elcapitan Command
MacX accesses remote computers by executing a 'command', and you must create a new 'command' to access elcapitan.
Do the following:
1) Turn on MacX on your Mac
2) Click on the New Command ... option in the MacX Remote pulldown menu:

3) In the popup window that appears, type in the following information and check the appropriate boxes:

4) This should create for you a new Mac file in the MacX folder with an Icon that looks like:
Subsequently, just double-clicking on this icon should bring up MacX and directly connect you to elcapitan using the account information you provided in the Remote Command window above. You should see a Unix text window similar to that seen in NCSA Telnet that looks like the following:

5) You can now test the X-windows capabilities by typing at the 'elcapitan%' prompt:
This should bring up the Main Window of the ACeDB database DictyDB, which you can peruse as desired.
6) As another test, try consed with the standard test data that is the basis of the consed tutorial.
Type the following two commands at the 'elcapitan%' prompt:
The first command takes you to the edit_dir containing the *.ace file for the standard test data.
The second command executes consed.
After using consed and quitting, return to your Home directory on elcapitan by typing in the MacX text window:
cd
7) As a final test, try consed with the polyphred test data.
Type the following two commands at the 'elcapitan%' prompt:
The first command takes you to the edit_dir containing the *.ace file for the polyphred test data.
The second command executes consed.
You can now work with these data in consed using MacX, as suggested in the consed tutorial reproduced above.