PPG Home Page
Home page for the PPG program

 

PPG SNP Data
Excel Workbooks, SNP ID Names, Public Domain and Proprietary Data

A. Locus-Specific Workbooks
Initial attempts to design workbook containing complete Locus info

B. Timeline Workbooks
Workbooks based on progress over time for Locus analysis

C. UCSD SNP ID Names
Locus and SNP data leading to SNP ID Names used at UCSD

D. UCSD DNA Data Files
Primary Genotypic Analysis of UCSD DNA data Files

E. Sequenom DNA Data Files
Primary Genotypic Analysis of Sequenom DNA data Files

F. TCGA DNA Data Files
Primary Genotypic Analysis of TCGA DNA data Files

Z. Powerpoint Presentation
Presentation "Primary Analysis of new SNP Genotype Data"

 

DNASYSTEM
Web pages with Links
to other Bioinformatic Sites

 

UCSD NHLBI Program

Sympathetic Neuroeffector Junctions and Blood Pressure

Human Essential Hypertension

 

PPG SNP Data, Public Domain and Proprietary

 




This is the Home Page for SNP data, both public domain data and proprietary data generated by the PPG Hypertension program.
The SNP data are displayed primarily in Excel workbooks.
A few Excel workbooks can be viewed as Web pages, and all can be downloaded and manipulated as Excel xls files.
Downloading the files and working on your own copies is the best way to go.

 

The spreadsheets and data are formatted in two ways:
A. Locus-specific Sequence and Workbook files, and
B. Timeline data for all Loci whose DNA is analysed at UCSD.

Development of these spreadsheets, data, and data analysis has required determination of genome-position based SNP ID names:
C. UCSD "Standardized" SNP ID Names

Analysis is performed on all data, whether obtained from external sources (e.g. Sequenom or TCGA) or generated in house at UCSD:
D. Analysed UCSD DNA data Files and Loci
E. Analysed Sequenom DNA data Files and Loci
F. Analysed TCGA test DNA data Files and Locus

 

A. Locus-specific Sequence and Workbook Files

Loci completed to date are the ones in the Table below which have links to additional completed information.

Additional information and relevant Web links are available for each of these loci at the Hypertension Candidate Loci web site. You may also click on "Locus" below to go to this site, or on any one of the specific "Locus" entries to go to the Candidate Locus page for that specific locus.

A History of Creation, Modification, and Updates of these Locus-specific files is available. You may also click on "Last Update" below to go to this History site, or on each date below to go to the History for a specific locus.

Some of the Loci are "Unusual" in some sense, e.g. the locus has multiple Isoforms, the AUG translation start site is distant from one (or more) of the CAP transcription start site(s), etc. Such loci receive a "yes" to the question "Unusual ??" in the "Last Update" column, and information on the nature of the unusual properties of the locus is provided with the History information; click on the date below to go to this information.

 

| Locus | Last Update / Unusual ?? | GB-type Sequence, AUG-site at 5001 bp | GB-type Sequence, CAP-site at 5001 bp | FASTA Sequence, CAP-site at 5001 bp | Web Page / Workbook |
| ABCB1 | 01.06.2005 / yes | ABCB1.gbAUG.old, ABCB1.gbAUG.doc | ABCB1.gbCAP.old, ABCB1.gbCAP.doc | ABCB1.faNeg.old, ABCB1.faNeg.doc | ABCB1.htm, ABCB1.xls |
| ACE | 01.06.2005 / yes | ACE.gbAUG.old, ACE.gbAUG.doc | ACE.gbCAP.old, ACE.gbCAP.doc | ACE.fasta.old, ACE.fasta.doc | ACE.htm, ACE.xls |
| ACHE | 01.06.2005 / yes | ACHE.gbAUG.old, ACHE.gbAUG.doc | ACHE.gbCAP.old, ACHE.gbCAP.doc | ACHE.faNeg.old, ACHE.faNeg.doc | ACHE.htm, ACHE.xls |
| ADD1 | 01.07.2005 / yes | ADD1.gbAUG.old, ADD1.gbAUG.doc | ADD1.gbCAP.old, ADD1.gbCAP.doc | ADD1.fasta.old, ADD1.fasta.doc | ADD1.htm, ADD1.xls |
| ADRA1B | 01.07.2005 / no | ADRA1B.gbAUG.old, ADRA1B.gbAUG.doc | ADRA1B.gbCAP.old, ADRA1B.gbCAP.doc | ADRA1B.fasta.old, ADRA1B.fasta.doc | ADRA1B.htm, ADRA1B.xls |
| ADRA1D | 01.07.2005 / no | ADRA1D.gbAUG.old, ADRA1D.gbAUG.doc | ADRA1D.gbCAP.old, ADRA1D.gbCAP.doc | ADRA1D.faNeg.old, ADRA1D.faNeg.doc | ADRA1D.htm, ADRA1D.xls |
| ADRA2A | 01.07.2005 / no | ADRA2A.gbAUG.old, ADRA2A.gbAUG.doc | ADRA2A.gbCAP.old, ADRA2A.gbCAP.doc | ADRA2A.fasta.old, ADRA2A.fasta.doc | ADRA2A.htm, ADRA2A.xls |
| ADRA2B | 01.07.2005 / no | ADRA2B.gbAUG.old, ADRA2B.gbAUG.doc | ADRA2B.gbCAP.old, ADRA2B.gbCAP.doc | ADRA2B.faNeg.old, ADRA2B.faNeg.doc | ADRA2B.htm, ADRA2B.xls |
| ADRA2C | 01.07.2005 / no | ADRA2C.gbAUG.old, ADRA2C.gbAUG.doc | ADRA2C.gbCAP.old, ADRA2C.gbCAP.doc | ADRA2C.fasta.old, ADRA2C.fasta.doc | ADRA2C.htm, ADRA2C.xls |
| ADRB1 | 01.07.2005 / no | ADRB1.gbAUG.old, ADRB1.gbAUG.doc | ADRB1.gbCAP.old, ADRB1.gbCAP.doc | ADRB1.fasta.old, ADRB1.fasta.doc | ADRB1.htm, ADRB1.xls |
| ADRB2 | 01.07.2005 / no | ADRB2.gbAUG.old, ADRB2.gbAUG.doc | ADRB2.gbCAP.old, ADRB2.gbCAP.doc | ADRB2.fasta.old, ADRB2.fasta.doc | ADRB2.htm, ADRB2.xls |
| ADRB3 | 01.07.2005 / no | ADRB3.gbAUG.old, ADRB3.gbAUG.doc | ADRB3.gbCAP.old, ADRB3.gbCAP.doc | ADRB3.faNeg.old, ADRB3.faNeg.doc | ADRB3.htm, ADRB3.xls |
| AGT | 01.07.2005 / yes | AGT.gbAUG.old, AGT.gbAUG.doc | AGT.gbCAP.old, AGT.gbCAP.doc | AGT.faNeg.old, AGT.faNeg.doc | AGT.htm, AGT.xls |
| AGTR1 | 01.07.2005 / yes | AGTR1.gbAUG.old, AGTR1.gbAUG.doc | AGTR1.gbCAP.old, AGTR1.gbCAP.doc | AGTR1.fasta.old, AGTR1.fasta.doc | AGTR1.htm, AGTR1.xls |
| ANGPT1 | 01.07.2005 / yes | ANGPT1.gbAUG.old, ANGPT1.gbAUG.doc | ANGPT1.gbCAP.old, ANGPT1.gbCAP.doc | ANGPT1.fasta.old, ANGPT1.faNeg.doc | ANGPT1.htm, ANGPT1.xls |
| BCHE | 01.07.2005 / yes | BCHE.gbAUG.old, BCHE.gbAUG.doc | BCHE.gbCAP.old, BCHE.gbCAP.doc | BCHE.faNeg.old, BCHE.faNeg.doc | BCHE.htm, BCHE.xls |
| CACNA1S | 01.07.2005 / no | CACNA1S.gbAUG.old, CACNA1S.gbAUG.doc | CACNA1S.gbCAP.old, CACNA1S.gbCAP.doc | CACNA1S.faNeg.old, CACNA1S.faNeg.doc | CACNA1S.htm, CACNA1S.xls |
| CBS | 01.07.2005 / yes | CBS.gbAUG.old, CBS.gbAUG.doc | CBS.gbCAP.old, CBS.gbCAP.doc | CBS.faNeg.old, CBS.faNeg.doc | CBS.htm, CBS.xls |
| CHAT | 01.07.2005 / yes | CHAT.gbAUG.old, CHAT.gbAUG.doc | CHAT.gbCAP.old, CHAT.gbCAP.doc | CHAT.fasta.old, CHAT.fasta.doc | CHAT.htm, CHAT.xls |
| CHGA | 01.07.2005 / no | CHGA.gbAUG.old, CHGA.gbAUG.doc | CHGA.gbCAP.old, CHGA.gbCAP.doc | CHGA.fasta.old, CHGA.fasta.doc | CHGA.htm, CHGA.xls |
| CHGB | 01.07.2005 / no | CHGB.gbAUG.old, CHGB.gbAUG.doc | CHGB.gbCAP.old, CHGB.gbCAP.doc | CHGB.fasta.old, CHGB.fasta.doc | CHGB.htm, CHGB.xls |
| CHRM2 | 01.07.2005 / yes !!! | CHRM2.gbAUG.old, CHRM2.gbAUG.doc | CHRM2.gbCAP.old, CHRM2.gbCAP.doc | CHRM2.fasta.old, CHRM2.fasta.doc | CHRM2.htm, CHRM2.xls |
| CHRM3 | 01.07.2005 / yes !!! | CHRM3.gbAUG.old, CHRM3.gbAUG.doc | CHRM3.gbCAP.old, CHRM3.gbCAP.doc | CHRM3.fasta.old, CHRM3.fasta.doc | CHRM3.htm, CHRM3.xls |
| CHRNA3 | 01.07.2005 / no | CHRNA3.gbAUG.old, CHRNA3.gbAUG.doc | CHRNA3.gbCAP.old, CHRNA3.gbCAP.doc | CHRNA3.faNeg.old, CHRNA3.faNeg.doc | CHRNA3.htm, CHRNA3.xls |
| CHRNA5 | 01.07.2005 / no | CHRNA5.gbAUG.old, CHRNA5.gbAUG.doc | CHRNA5.gbCAP.old, CHRNA5.gbCAP.doc | CHRNA5.fasta.old, CHRNA5.fasta.doc | CHRNA5.htm, CHRNA5.xls |
| CHRNA7 | 01.07.2005 / no | CHRNA7.gbAUG.old, CHRNA7.gbAUG.doc | CHRNA7.gbCAP.old, CHRNA7.gbCAP.doc | CHRNA7.fasta.old, CHRNA7.fasta.doc | CHRNA7.htm, CHRNA7.xls |
| CHRNB4 | 01.07.2005 / no | CHRNB4.gbAUG.old, CHRNB4.gbAUG.doc | CHRNB4.gbCAP.old, CHRNB4.gbCAP.doc | CHRNB4.faNeg.old, CHRNB4.faNeg.doc | CHRNB4.htm, CHRNB4.xls |
| COMT | 01.07.2005 / yes | COMT.gbAUG.old, COMT.gbAUG.doc | COMT.gbCAP.old, COMT.gbCAP.doc | COMT.fasta.old, COMT.fasta.doc | COMT.htm, COMT.xls |
| CTSL | 01.07.2005 / yes | CTSL.gbAUG.old, CTSL.gbAUG.doc | CTSL.gbCAP.old, CTSL.gbCAP.doc | CTSL.fasta.old, CTSL.fasta.doc | CTSL.htm, CTSL.xls |
| CYB561 | 01.07.2005 / yes | CYB561.gbAUG.old, CYB561.gbAUG.doc | CYB561.gbCAP.old, CYB561.gbCAP.doc | CYB561.faNeg.old, CYB561.faNeg.doc | CYB561.htm, CYB561.xls |
| CYBA | 01.07.2005 / no | CYBA.gbAUG.old, CYBA.gbAUG.doc | CYBA.gbCAP.old, CYBA.gbCAP.doc | CYBA.faNeg.old, CYBA.faNeg.doc | CYBA.htm, CYBA.xls |
| CYP11B2 | 01.07.2005 / no | CYP11B2.gbAUG.old, CYP11B2.gbAUG.doc | CYP11B2.gbCAP.old, CYP11B2.gbCAP.doc | CYP11B2.faNeg.old, CYP11B2.faNeg.doc | CYP11B2.htm, CYP11B2.xls |
| CYP3A4 | 01.07.2005 / no | CYP3A4.gbAUG.old, CYP3A4.gbAUG.doc | CYP3A4.gbCAP.old, CYP3A4.gbCAP.doc | CYP3A4.faNeg.old, CYP3A4.faNeg.doc | CYP3A4.htm, CYP3A4.xls |
| DBH | 01.08.2005 / no | DBH.gbAUG.old, DBH.gbAUG.doc | DBH.gbCAP.old, DBH.gbCAP.doc | DBH.fasta.old, DBH.fasta.doc | DBH.htm, DBH.xls |
| DRD1 | 01.08.2005 / no | DRD1.gbAUG.old, DRD1.gbAUG.doc | DRD1.gbCAP.old, DRD1.gbCAP.doc | DRD1.faNeg.old, DRD1.faNeg.doc | DRD1.htm, DRD1.xls |
| DRD1IP | 01.08.2005 / yes | DRD1IP.gbAUG.old, DRD1IP.gbAUG.doc | DRD1IP.gbCAP.old, DRD1IP.gbCAP.doc | DRD1IP.faNeg.old, DRD1IP.faNeg.doc | DRD1IP.htm, DRD1IP.xls |
| FMO2 | 01.08.2005 / no | FMO2.gbAUG.old, FMO2.gbAUG.doc | FMO2.gbCAP.old, FMO2.gbCAP.doc | FMO2.fasta.old, FMO2.fasta.doc | FMO2.htm, FMO2.xls |
| FMO3 | 01.08.2005 / no | FMO3.gbAUG.old, FMO3.gbAUG.doc | FMO3.gbCAP.old, FMO3.gbCAP.doc | FMO3.fasta.old, FMO3.fasta.doc | FMO3.htm, FMO3.xls |
| GNAS | 01.08.2005 / yes | GNAS.gbAUG.old, GNAS.gbAUG.doc | GNAS.gbCAP.old, GNAS.gbCAP.doc | GNAS.fasta.old, GNAS.fasta.doc | GNAS.htm, GNAS.xls |
| GNB3 | 01.08.2005 / no | GNB3.gbAUG.old, GNB3.gbAUG.doc | GNB3.gbCAP.old, GNB3.gbCAP.doc | GNB3.fasta.old, GNB3.fasta.doc | GNB3.htm, GNB3.xls |
| GPRK2L =GRK4 | 01.08.2005 / yes | GRK4.gbAUG.old, GRK4.gbAUG.doc | GRK4.gbCAP.old, GRK4.gbCAP.doc | GRK4.fasta.old, GRK4.fasta.doc | GRK4.htm, GRK4.xls |
| GSTT1 | 01.08.2005 / no | GSTT1.gbAUG.old, GSTT1.gbAUG.doc | GSTT1.gbCAP.old, GSTT1.gbCAP.doc | GSTT1.faNeg.old, GSTT1.faNeg.doc | GSTT1.htm, GSTT1.xls |
| HSD11B1 | 01.08.2005 / yes | HSD11B1.gbAUG.old, HSD11B1.gbAUG.doc | HSD11B1.gbCAP.old, HSD11B1.gbCAP.doc | HSD11B1.fasta.old, HSD11B1.fasta.doc | HSD11B1.htm, HSD11B1.xls |
| HSD11B2 | 01.08.2005 / no | HSD11B2.gbAUG.old, HSD11B2.gbAUG.doc | HSD11B2.gbCAP.old, HSD11B2.gbCAP.doc | HSD11B2.fasta.old, HSD11B2.fasta.doc | HSD11B2.htm, HSD11B2.xls |
| ITGAL | 01.08.2005 / no | ITGAL.gbAUG.old, ITGAL.gbAUG.doc | ITGAL.gbCAP.old, ITGAL.gbCAP.doc | ITGAL.fasta.old, ITGAL.fasta.doc | ITGAL.htm, ITGAL.xls |
| KCNA5 | 01.08.2005 / no | KCNA5.gbAUG.old, KCNA5.gbAUG.doc | KCNA5.gbCAP.old, KCNA5.gbCAP.doc | KCNA5.fasta.old, KCNA5.fasta.doc | KCNA5.htm, KCNA5.xls |
| KCNB1 | 01.08.2005 / no | KCNB1.gbAUG.old, KCNB1.gbAUG.doc | KCNB1.gbCAP.old, KCNB1.gbCAP.doc | KCNB1.faNeg.old, KCNB1.faNeg.doc | KCNB1.htm, KCNB1.xls |
| KCNMB1 | 01.08.2005 / no | KCNMB1.gbAUG.old, KCNMB1.gbAUG.doc | KCNMB1.gbCAP.old, KCNMB1.gbCAP.doc | KCNMB1.faNeg.old, KCNMB1.faNeg.doc | KCNMB1.htm, KCNMB1.xls |
| KLK1 | 01.08.2005 / no | KLK1.gbAUG.old, KLK1.gbAUG.doc | KLK1.gbCAP.old, KLK1.gbCAP.doc | KLK1.faNeg.old, KLK1.faNeg.doc | KLK1.htm, KLK1.xls |
| MAOA | 01.10.2005 / no | MAOA.gbAUG.old, MAOA.gbAUG.doc | MAOA.gbCAP.old, MAOA.gbCAP.doc | MAOA.fasta.old, MAOA.fasta.doc | MAOA.htm, MAOA.xls |
| MAOB | 01.10.2005 / no | MAOB.gbAUG.old, MAOB.gbAUG.doc | MAOB.gbCAP.old, MAOB.gbCAP.doc | MAOB.faNeg.old, MAOB.faNeg.doc | MAOB.htm, MAOB.xls |
| MTHFR | 01.10.2005 / no | MTHFR.gbAUG.old, MTHFR.gbAUG.doc | MTHFR.gbCAP.old, MTHFR.gbCAP.doc | MTHFR.faNeg.old, MTHFR.faNeg.doc | MTHFR.htm, MTHFR.xls |
| MTR | 01.10.2005 / no | MTR.gbAUG.old, MTR.gbAUG.doc | MTR.gbCAP.old, MTR.gbCAP.doc | MTR.fasta.old, MTR.fasta.doc | MTR.htm, MTR.xls |
| NET1 | 01.10.2005 / no | NET1.gbAUG.old, NET1.gbAUG.doc | NET1.gbCAP.old, NET1.gbCAP.doc | NET1.fasta.old, NET1.fasta.doc | NET1.htm, NET1.xls |
| NOS3 | 01.10.2005 / no | NOS3.gbAUG.old, NOS3.gbAUG.doc | NOS3.gbCAP.old, NOS3.gbCAP.doc | NOS3.fasta.old, NOS3.fasta.doc | NOS3.htm, NOS3.xls |
| NPY | 01.10.2005 / no | NPY.gbAUG.old, NPY.gbAUG.doc | NPY.gbCAP.old, NPY.gbCAP.doc | NPY.fasta.old, NPY.fasta.doc | NPY.htm, NPY.xls |
| NPY1R | 01.10.2005 / yes | NPY1R.gbAUG.old, NPY1R.gbAUG.doc | NPY1R.gbCAP.old, NPY1R.gbCAP.doc | NPY1R.faNeg.old, NPY1R.faNeg.doc | NPY1R.htm, NPY1R.xls |
| NPY2R | 01.10.2005 / yes | NPY2R.gbAUG.old, NPY2R.gbAUG.doc | NPY2R.gbCAP.old, NPY2R.gbCAP.doc | NPY2R.fasta.old, NPY2R.fasta.doc | NPY2R.htm, NPY2R.xls |
| NR3C2 | 01.10.2005 / yes | NR3C2.gbAUG.old, NR3C2.gbAUG.doc | NR3C2.gbCAP.old, NR3C2.gbCAP.doc | NR3C2.faNeg.old, NR3C2.faNeg.doc | NR3C2.htm, NR3C2.xls |
| PEMT | 01.10.2005 / yes | PEMT.gbAUG.old, PEMT.gbAUG.doc | PEMT.gbCAP.old, PEMT.gbCAP.doc | PEMT.faNeg.old, PEMT.faNeg.doc | PEMT.htm, PEMT.xls |
| PNMT | 01.10.2005 / no | PNMT.gbAUG.old, PNMT.gbAUG.doc | PNMT.gbCAP.old, PNMT.gbCAP.doc | PNMT.fasta.old, PNMT.fasta.doc | PNMT.htm, PNMT.xls |
| PYY | 01.10.2005 / no | PYY.gbAUG.old, PYY.gbAUG.doc | PYY.gbCAP.old, PYY.gbCAP.doc | PYY.faNeg.old, PYY.faNeg.doc | PYY.htm, PYY.xls |
| REN | 01.10.2005 / no | REN.gbAUG.old, REN.gbAUG.doc | REN.gbCAP.old, REN.gbCAP.doc | REN.faNeg.old, REN.faNeg.doc | REN.htm, REN.xls |
| RGS1 | 01.10.2005 / no | RGS1.gbAUG.old, RGS1.gbAUG.doc | RGS1.gbCAP.old, RGS1.gbCAP.doc | RGS1.fasta.old, RGS1.fasta.doc | RGS1.htm, RGS1.xls |
| SCG2 | 01.10.2005 / yes | SCG2.gbAUG.old, SCG2.gbAUG.doc | SCG2.gbCAP.old, SCG2.gbCAP.doc | SCG2.faNeg.old, SCG2.faNeg.doc | SCG2.htm, SCG2.xls |
| SLC18A1 | 01.10.2005 / no | SLC18A1.gbAUG.old, SLC18A1.gbAUG.doc | SLC18A1.gbCAP.old, SLC18A1.gbCAP.doc | SLC18A1.faNeg.old, SLC18A1.faNeg.doc | SLC18A1.htm, SLC18A1.xls |
| SLC18A2 | 01.10.2005 / no | SLC18A2.gbAUG.old, SLC18A2.gbAUG.doc | SLC18A2.gbCAP.old, SLC18A2.gbCAP.doc | SLC18A2.fasta.old, SLC18A2.fasta.doc | SLC18A2.htm, SLC18A2.xls |
| SLC9A3 | 01.10.2005 / no | SLC9A3.gbAUG.old, SLC9A3.gbAUG.doc | SLC9A3.gbCAP.old, SLC9A3.gbCAP.doc | SLC9A3.faNeg.old, SLC9A3.faNeg.doc | SLC9A3.htm, SLC9A3.xls |
| SLC9A3R1 | 01.10.2005 / no | SLC9A3R1.gbAUG.old, SLC9A3R1.gbAUG.doc | SLC9A3R1.gbCAP.old, SLC9A3R1.gbCAP.doc | SLC9A3R1.fasta.old, SLC9A3R1.fasta.doc | SLC9A3R1.htm, SLC9A3R1.xls |
| RGSPX1 =SNX13 | 01.10.2005 / no | SNX13.gbAUG.old, SNX13.gbAUG.doc | SNX13.gbCAP.old, SNX13.gbCAP.doc | SNX13.faNeg.old, SNX13.faNeg.doc | SNX13.htm, SNX13.xls |
| TH | 01.10.2005 / yes | TH.gbAUG.old, TH.gbAUG.doc | TH.gbCAP.old, TH.gbCAP.doc | TH.faNeg.old, TH.faNeg.doc | TH.htm, TH.xls |
| XDH | 01.10.2005 / no | XDH.gbAUG.old, XDH.gbAUG.doc | XDH.gbCAP.old, XDH.gbCAP.doc | XDH.faNeg.old, XDH.faNeg.doc | XDH.htm, XDH.xls |

 

Files for each Locus of interest include: 1) six Locus-specific Sequence Files, and 2) a Locus-specific Excel spreadsheet.

 

1. Locus-specific Sequence files

01.06.2005: Locus-specific sequence files now include six text files.
Unless there are "unusual" Locus properties, these include two copies, an old (*.old files) and a new (*.doc files), of each of three files:

  1. LOCUS.gbAUG.doc: GenBank-formatted file, with the AUG translation protein start site at position 5001
  2. LOCUS.gbCAP.doc: GenBank-formatted file, with the CAP transcription mRNA start site at position 5001
  3. LOCUS.fasta.doc: FASTA-formatted file, with the CAP transcription mRNA start site at position 5001

1) LOCUS.gbAUG.doc: This text file is a GenBank-annotated sequence file for the locus DNA sequence plus varying amounts of sequence 5' (upstream) and 3' (downstream) of the locus. In the usual case, the AUG translation protein start site is at position 5001. The annotation is typical GenBank description information, including dbSNP SNP information and all Exon-Intron junction position information.
If the locus is transcribed off the DNA strand complementary to that of the NT sequence, the LOCUS.gbAUG.doc file shows the annotation and sequence of the complementary (or NEGative) strand, and the corresponding FASTA-formatted file is called LOCUS.faNeg.doc.

2) LOCUS.gbCAP.doc: This text file is a GenBank-annotated sequence file for the locus DNA sequence plus varying amounts of sequence 5' (upstream) and 3' (downstream) of the locus. In the usual case, the CAP transcription mRNA start site is at position 5001. The sequence and annotation in this file are thus identical to those found in the LOCUS.gbAUG.doc file except for a shift of the coordinate system. The annotation is typical GenBank description information, including dbSNP SNP information and all Exon-Intron junction position information.
If the locus is transcribed off the DNA strand complementary to that of the NT sequence, the LOCUS.gbCAP.doc file shows the annotation and sequence of the complementary (or NEGative) strand, and the corresponding FASTA-formatted file is called LOCUS.faNeg.doc.

3) LOCUS.fasta.doc: This text file is a FASTA-formatted sequence file containing the same sequence as the LOCUS.gbCAP.doc file, the GenBank-annotated sequence file in which position 5001 is located at the CAP transcription start site.
If the locus is transcribed off the DNA strand complementary to that of the NT sequence, the FASTA file shows the sequence of the complementary (or NEGative) strand and is called LOCUS.faNeg.doc.
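As an illustration of what "complementary (or NEGative) strand" means for the sequence itself, the following Python sketch computes a reverse complement. It assumes that the faNeg files give the complementary strand written 5' to 3' (the text above says only "complementary strand"), and the function name `reverse_complement` is hypothetical, not part of the PPG file set.

```python
# Minimal sketch: reverse complement of a DNA sequence.
# Assumption: a *.faNeg.* file holds the complementary strand read 5' to 3'.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]

print(reverse_complement("ATGC"))  # GCAT
```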

These sequence files provide a coordinate system that completely covers the gene, with an additional ~5000 bp of sequence 5' of the gene and an additional ~2000 bp 3' of the gene. The annotation in the LOCUS.gb.old file includes all dbSNP SNPs with alleles and positions. Thus, one can easily use these files to place any given SNP relative to other structural features of the Locus and relative to dbSNP SNPs. The LOCUS.fasta.old file provides a FASTA-formatted version of the gb.old sequence, for convenient use in BLAST2SEQ and other analysis programs.

Old (*.old) files vs New (*.doc) Sequence Files: Two versions of each of the three types of Sequence Files are available. The first is the current file (current at time of the latest update) and is the *.doc file. The second is the immediately previous file (file that was updated at time of the latest update) and is the *.old file.
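The naming scheme described above can be summarized in a short Python sketch. The helper name `sequence_file_names` and its `negative_strand` flag are hypothetical; the sketch only assembles names from the pattern given in the text and does not check that the files exist.

```python
def sequence_file_names(locus, negative_strand=False):
    """Return the six sequence file names for one locus.

    For each of the three file types the current (*.doc) name is listed
    before the previous (*.old) name. Loci transcribed off the
    complementary strand use "faNeg" instead of "fasta".
    """
    fasta_tag = "faNeg" if negative_strand else "fasta"
    kinds = ["gbAUG", "gbCAP", fasta_tag]
    return [f"{locus}.{kind}.{ext}" for kind in kinds for ext in ("doc", "old")]

print(sequence_file_names("CHGA"))
print(sequence_file_names("TH", negative_strand=True))
```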

NCBI continues to update the Human Genome coordinate system via new Genome Assemblies; this update occurs roughly every three months! The result is a new version of the current RefSeq NT genomic DNA sequence, or, on occasion, generation of a new NT sequence for a given Locus. Most of these updates result in changes in the NT coordinates for the Locus. However, relatively few result in changes in SNP or Intron/Exon positional coordinates RELATIVE TO a start position within the Locus, e.g. position 5001 at the CAP site. This is because most of the Exon sequences are now complete and no longer changing, and relatively few of the Introns (mainly only large ones) are still changing in sequence with new assemblies.

However, when such changes DO occur relative to the CAP site or AUG site, the SNP ID Name may change, since the SNP ID Name contains position information.

Thus the purpose of having BOTH old *.old files and new *.doc files is to permit the User to compare whether changes have taken place for their Locus of interest, and then to take appropriate action.

The History information (click on the "Last Update" date link) compares the lengths of the LOCUS.gbCAP.old and LOCUS.gbCAP.doc sequences. If these are the same, then there have been no changes. If they are different, then use of BLAST2SEQ to compare the two FASTA-formatted sequences (LOCUS.fasta.old and LOCUS.fasta.doc), with subsequent comparison to the GB-annotation, provides information on where within the gene the sequence changes took place.
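The length comparison described above can also be done locally on downloaded files. The following Python sketch is one possible way, assuming each FASTA file holds a single record (one header line starting with ">" followed by sequence lines); the function names are hypothetical and the short sequences are made-up placeholders, not PPG data.

```python
def read_fasta_sequence(text):
    """Return the sequence from single-record FASTA text (header and whitespace dropped)."""
    return "".join(
        line.strip() for line in text.splitlines() if not line.startswith(">")
    )

def compare_lengths(old_text, new_text):
    """Return (changed, old_length, new_length) for two FASTA texts."""
    old_len = len(read_fasta_sequence(old_text))
    new_len = len(read_fasta_sequence(new_text))
    return old_len != new_len, old_len, new_len

# Placeholder records standing in for LOCUS.fasta.old and LOCUS.fasta.doc.
old_record = ">LOCUS old\nACGTAC\nGT\n"
new_record = ">LOCUS new\nACGTACGTA\n"
print(compare_lengths(old_record, new_record))  # (True, 8, 9)
```

With real files, the text would come from something like `open("LOCUS.fasta.old").read()`. A difference in length only signals that something changed; locating the change still requires the BLAST2SEQ comparison.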

NOTE: If you find that any SNP coordinates in the current SNP ID Names Excel file are out of date, please inform Doug Smith.

"Unusual" Properties of a Locus: As indicated above, for "usual" loci, the position of the AUG protein translation start site is set to be 5001 in the LOCUS.gbAUG.doc file, and the position of the CAP mRNA transcription start site is set to be 5001 in the LOCUS.gbCAP.doc and LOCUS.fasta.doc files. This however does not work well for some loci. Here are two examples:

ACE has 3 isoforms, with CAP sites for the mRNA species at 5001, 12744, and 12744, and AUG sites for the coding sequence at 5023, 12796, and 12796. In this case, the ACE.gbCAP.doc sequence contains 5000 bp upstream of the CAP site for mRNA 1 and 2000 bp downstream of the polyA site for mRNA 2, the longest of the three mRNA species.

ABCB1 has only one isoform, with the CAP site for the mRNA species at 5001 in the ABCB1.gbCAP.doc file. However, the AUG site for the coding sequence in this file is at position 118065! Thus, if one wishes to include roughly the same nucleotides in both the ABCB1.gbCAP.doc and ABCB1.gbAUG.doc files, the AUG site position in the ABCB1.gbAUG.doc file cannot be at position 5001; it must be at least at position 118000 or so.

When a given Locus has such "unusual" properties, the word "yes" appears in the "Last Update" column of the Table above, under the Update Date; otherwise the word "no" appears. If "yes" appears, the nature of the "unusual" Locus properties is summarized with the History information; just click on the locus "Last Update" link to go to this information.
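The CAP-based and AUG-based files differ only by a constant shift of coordinates, which can be sketched in Python as follows. This is an illustration, not the procedure used to build the files: the function name `cap_to_aug_position` and its parameters are hypothetical, and the example assumes, purely for illustration, that the AUG found at position 5023 of the ACE CAP-based file (mRNA 1) is placed at position 5001 in the AUG-based file. For "unusual" loci the actual AUG reference position should be taken from the History information.

```python
def cap_to_aug_position(pos_in_cap_file, aug_pos_in_cap_file, aug_ref=5001):
    """Shift a position from the CAP-based file into the AUG-based file.

    aug_pos_in_cap_file: where the AUG lies in the CAP-based file.
    aug_ref: where the AUG is placed in the AUG-based file (assumed 5001).
    """
    shift = aug_pos_in_cap_file - aug_ref
    return pos_in_cap_file - shift

# ACE mRNA 1 values quoted above: CAP at 5001, AUG at 5023 in the CAP-based file.
print(cap_to_aug_position(5023, 5023))  # 5001: the AUG itself
print(cap_to_aug_position(5001, 5023))  # 4979: the CAP site, 22 bp upstream of the AUG
```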

 

Uses of the Locus-specific Sequence Files:
These Sequence Files are very useful for a variety of tasks.

Two types of Sequence Files: The Locus-specific Sequence Files are of two types:

  • 1) The GenBank-formatted files LOCUS.gbAUG.doc and LOCUS.gbCAP.doc. These contain nearly all NCBI annotation for the given locus, including Intron-Exon junction info, dbSNP SNP info, STS info, and PubMed info. These files thus are very useful for any locus structure and feature information.
  • 2) The FASTA-formatted file LOCUS.fasta.doc (LOCUS.faNeg.doc if the locus is encoded by the strand complementary to the primary strand of the RefSeq NT genomic DNA reference sequence) contains a sequence identical to that found in the LOCUS.gbCAP.doc file. Except for the header line, this file is just the locus sequence, plus 5000 bp upstream and 2000 bp downstream. It is thus ideal for locus analyses, such as use of BLAST or BLAST2SEQ to identify the position of a new SNP (simply subtract 5000; if the result is zero or negative, as for a promoter position, make it more negative by one, e.g. -100 becomes -101, because there is no position zero: the coordinates go directly from +1 to -1).

The two types of files thus complement each other vis a vis types of use.

Sequence Reference Position is at Position 5001 "usually", for either the CAP site or for the AUG site:
The LOCUS.gbCAP.doc and LOCUS.fasta.doc files contain the complete gene plus 5000 bp upstream and 2000 bp downstream. Thus, promoter sequences and sequences past the polyA site are present, permitting examination and analysis of SNPs and other features in these regions. Further, there is no problem with negative position coordinates in the promoter region.

The LOCUS.gbAUG.doc file is identical in annotation to the LOCUS.gbCAP.doc file, and contains similar sequences. However, the positions have been moved (mathematically "translated") such that the AUG site is at position 5001 for "usual" loci, and at position 10001 or another convenient position for some of the "unusual" loci (click on "Last Update" links for details for a given Locus). This readily permits identification of SNP position relative to the protein AUG translation-start position rather than to the mRNA CAP transcription-start position (simply subtract 5000; if the result is zero or negative, as for an upstream position, make it more negative by one, e.g. -100 becomes -101, because there is no position zero: the coordinates go directly from +1 to -1).
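The "subtract 5000, and skip position zero" rule can be written out as a short Python sketch. The function names `locus_relative_position` and `file_position` are hypothetical; `ref_pos` is the file position of the CAP or AUG site (5001 for "usual" loci, another value such as 10001 for some "unusual" ones).

```python
def locus_relative_position(file_pos, ref_pos=5001):
    """Convert a position in a sequence file to a coordinate relative to the
    reference site (+1 at ref_pos, -1 immediately upstream, no position zero)."""
    offset = file_pos - (ref_pos - 1)  # "subtract 5000" when ref_pos is 5001
    return offset if offset > 0 else offset - 1

def file_position(relative_pos, ref_pos=5001):
    """Inverse conversion: relative coordinate back to a position in the file."""
    if relative_pos == 0:
        raise ValueError("position zero is not used in this coordinate system")
    if relative_pos > 0:
        return relative_pos + ref_pos - 1
    return relative_pos + ref_pos

print(locus_relative_position(5001))  # 1
print(locus_relative_position(5000))  # -1
print(locus_relative_position(4900))  # -101
```

Making the reference position a parameter keeps the same arithmetic usable for both the CAP-based and the AUG-based files.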

 

 

2. Locus-specific Spreadsheets

The Locus-specific Spreadsheets contain considerable locus-specific information absent in the timeline spreadsheets. Their precise format is still under development as of June, 2003.

01.06.2005 - Note: Construction of these Locus-specific Spreadsheets has been difficult to automate. Instead, a combination of the Locus-specific Sequence Files described above and the UCSD "Standardized" SNP ID Names Excel file provides nearly all of this Locus-specific information. The manually-constructed Locus-specific Spreadsheets for the ADRB2, CHGA, and CHGB loci are still available here.

a. Recommendations for Use of the Locus-specific Spreadsheets:

Display vs Download:

The *.htm Web Page files are Web page versions of the Locus-specific spreadsheets. Their display is still rather finicky: it requires Internet Explorer 5.2 and sometimes a few screen refreshes.
Also, IE 5.2 on Mac OS X tends to hang.

Use of the *.doc Sequence files and the *.xls Excel files requires that they be downloaded to disk.
The files then of course are "yours" and can be modified, sorted, etc. as desired.

Such downloading is usually the best way to go.

Recommendations for Web Display of the *.htm Files:

  • Visualization of the *.htm files is BEST done using Internet Explorer 5.2 or higher.
    Do NOT use Netscape as your Browser: Netscape gets very confused on both horizontal and vertical spacing of columns.
    Safari, the Apple browser for Macs, is also good: it is very fast and does not hang, but it displays horizontal and vertical separator lines poorly.

  • You may prefer to Open the *.htm file in a NEW WINDOW ... do this in one of the following ways:
    • With a 2-button mouse, RIGHT click on the link and choose "Open Link in new Window"
    • With a 1-button mouse, click on the link and do NOT release the click until a pop-up Menu appears.
      You may need to "nudge" the mouse to get the Menu. Then choose "Open Link in new Window"
    • With a 1-button mouse, do "Control-Click" on the link. Choose "Open Link in new Window" from the pop-up Menu. If "Control" does not work with your Browser and Computer, try "Option" o