4. Data Source Version

4.1. GRCh38 (hg38)

4.1.1. Variant annotation

  • Star (*): a star (*) after a source name means that this version is used in the CGAP project.

Source name         Version      Version date  Source link
VEP *               v99          11/04/2019
VEP                 v100         11/04/2019
dbSNP *             20190722     07/22/2019
gnomAD *            3.0
UK10K *             20160215     02/15/2016
TOPMED *            freeze 5     08/28/2017
CLINVAR *           20200106     01/06/2020
COSMIC *            v90          08/06/2019
SPLICEAI *          20191004     01/06/2019
PRIMATEAI *         v0.2_hg38    12/18/2019
CADD                1.6          03/26/2020
CADD *              1.5          02/22/2019
GERP *              100_mammals  01/01/2000
PHASTCONS *         100way       07/17/2017
PHASTCONS *         30way        07/17/2017
PHASTCONS *         20way        07/17/2017
PHYLOP *            100way       04/16/2015
PHYLOP *            30way        11/05/2017
PHYLOP *            20way        05/07/2015
SIPHY *             20way        01/01/2000
SUPER_DUPLICATES *  20way        01/01/2000
SIMPLE_REPEAT *     20way        01/01/2000
RMSK *              20way        01/01/2000
NESTED_REPEATS *    20way        01/01/2000
MICROSATELLITE *    20way        08/23/2015
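
The star convention used in the table above can be handled programmatically when the table is read into a script. The following is a minimal sketch, assuming the "Source name" cell carries a trailing `*` for versions used in CGAP; the names `SourceVersion` and `parse_source` are hypothetical and not part of any CGAP tool.

```python
from typing import NamedTuple


class SourceVersion(NamedTuple):
    name: str           # source name without the star marker
    version: str        # version string exactly as listed
    used_in_cgap: bool  # True when the name cell ended with "*"


def parse_source(name_cell: str, version: str) -> SourceVersion:
    """Split a 'Source name' cell into the bare name and the star flag."""
    cell = name_cell.strip()
    starred = cell.endswith("*")
    name = cell.rstrip("*").strip() if starred else cell
    return SourceVersion(name, version.strip(), starred)


# A few rows copied from the variant annotation table.
rows = [
    ("VEP *", "v99"),
    ("VEP", "v100"),
    ("CADD", "1.6"),
    ("CADD *", "1.5"),
]

parsed = [parse_source(name, version) for name, version in rows]

# Keep only the versions marked as used in CGAP.
cgap_versions = [(s.name, s.version) for s in parsed if s.used_in_cgap]
print(cgap_versions)  # [('VEP', 'v99'), ('CADD', '1.5')]
```

A list of pairs is used rather than a dictionary keyed by name because some sources (for example PHASTCONS) have several starred versions.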

4.1.2. Gene annotation

Source name                Version          Version date  Source link
ENSEMBLgene *              v99              11/21/2019    ftp://ftp.ensembl.org/pub/release-99/gtf/homo_sapiens/Homo_sapiens.GRCh38.99.gtf.gz
ENSEMBLgene                v100             03/06/2020    ftp://ftp.ensembl.org/pub/release-100/gtf/homo_sapiens/Homo_sapiens.GRCh38.100.gtf.gz
ENSEMBLgene                v101             07/11/2020    ftp://ftp.ensembl.org/pub/release-101/gtf/homo_sapiens/Homo_sapiens.GRCh38.101.gtf.gz
ENSEMBLgeneGRCh37 *        v75(GRCh37.p13)  09/01/2013    ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz
CYTOBAND *                 2017-07-17       07/17/2017    not available
CYTOBAND                   2019-03-11       03/11/2019    http://hgdownload.cse.ucsc.edu/goldenpath/hg38/database/cytoBand.txt.gz
RefSeq *                   2020-03-20       03/20/2020    not available
RefSeq                     2020-09-08       09/08/2020    ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene/refseqgene.1.genomic.gbff.gz
HGNC *                     2020-02-24       02/24/2020    not available
HGNC                       2020-09-11       09/11/2020    ftp://ftp.ebi.ac.uk/pub/databases/genenames/new/tsv/hgnc_complete_set.txt
ClinGen                    20200403         04/03/2020    not available
ClinGen *                  20200911         09/11/2020    https://search.clinicalgenome.org/kb/curations/
ClinGenDisease *           20200403         04/03/2020    not available
ClinGenDisease             20200911         09/11/2020    https://search.clinicalgenome.org/kb/gene-validity.csv
ENSEMBLIDxrefTrscriptID *  2017-11-22       11/22/2017    not available
ENSEMBLIDxrefTrscriptID    2020-08-12       08/12/2020    ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/by_organism/HUMAN_9606_idmapping_selected.tab.gz
ENSEMBLIDxref *            2017-11-22       11/22/2017    not available
ENSEMBLIDxref              2020-08-12       08/12/2020    ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/by_organism/HUMAN_9606_idmapping_selected.tab.gz
dbNSFP *                   4.0c             06/05/2017    ftp://dbnsfp:dbnsfp@dbnsfp.softgenetics.com/dbNSFP4.0c.zip
gnomADmetrics *            v2.1.1           11/22/2017    https://storage.googleapis.com/gnomad-public/release/2.1.1/constraint/gnomad.v2.1.1.lof_metrics.by_gene.txt.bgz
Marrvel *                  v1.2             06/01/2017    http://marrvel.org/doc
CassaNatGenet2017 *        04/03/2017       04/03/2017    https://www.biorxiv.org/highwire/filestream/20869/field_highwire_adjunct_files/0/075523-1.xlsx
GTEx *                     v8               06/05/2017    https://storage.googleapis.com/gtex_analysis_v8/rna_seq_data/GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm.gct.gz
BrainSpan *                v2               06/05/2017    not available
BrainSpan                  v10              06/05/2017    https://www.brainspan.org/api/v2/well_known_file_download/267666525
BrainAtlas *               v2               03/01/2013    https://human.brain-map.org/api/v2/well_known_file_download/178238359
GenCode *                  v33              12/13/2019    ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_33/gencode.v33.annotation.gff3.gz
GenCode                    v35              03/01/2020    ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_35/gencode.v35.annotation.gff3.gz
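
The Ensembl and GenCode links listed above follow a regular pattern in which only the release number (and, for Ensembl, the assembly name) changes. The sketch below rebuilds those links from a release number. The pattern is inferred solely from the entries in this table and is not guaranteed to hold for other releases; the function names `ensembl_gtf_url` and `gencode_gff3_url` are hypothetical helpers, not part of any existing tool.

```python
def ensembl_gtf_url(release: int, assembly: str = "GRCh38") -> str:
    """Build an Ensembl GTF link using the pattern seen in the table."""
    return (
        f"ftp://ftp.ensembl.org/pub/release-{release}/gtf/homo_sapiens/"
        f"Homo_sapiens.{assembly}.{release}.gtf.gz"
    )


def gencode_gff3_url(release: int) -> str:
    """Build a GenCode GFF3 link using the pattern seen in the table."""
    return (
        "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/"
        f"release_{release}/gencode.v{release}.annotation.gff3.gz"
    )


# Reproduce three of the links listed in the gene annotation table.
print(ensembl_gtf_url(99))
print(ensembl_gtf_url(75, assembly="GRCh37"))
print(gencode_gff3_url(33))
```

Before relying on a generated link for a release not listed here, it is worth confirming that the file exists on the server.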

4.2. GRCh37 (hg19)

4.2.1. Variant annotation

Source name  Version  Version date  Source link
CADD         1.6      03/26/2020

4.2.2. Gene annotation