Date: May 20th 2014
Sarah Djebali, CRG, Barcelona, sarah.djebali@crg.eu

Gencode v19 TSSs from all genes:
http://genome.crg.es/~sdjebali/Gencode/version19/gencode.v19.annotation_capped_sites_nr_with_confidence.gff.gz

Fantom5 cage peaks:
http://fantom.gsc.riken.jp/5/tet/#!/?f=hg19.cage_peak_ann.txt.gz&c=0

Here we use:
- Fantom5 permissive cage peaks (~1,000,000 for human),
- Fantom5 strict cage peaks (~217,000 for human)
and produce two gencode-cage tss cluster files:
http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE/

The one using ~1 million F5 permissive cage peaks is here:
http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE/TSS_human_with_gencodetss_notlow_ext50eachside_merged_withgenctsscoord_andgnlist.gff.gz

The one using ~217,000 F5 strict cage peaks is here:
http://genome.crg.es/~sdjebali/Gencode/version19/Fantom5_CAGE/TSS_human_strict_with_gencodetss_notlow_ext50eachside_merged_withgenctsscoord_andgnlist.gff.gz

The proportion of gencode tss clusters seen by cage is 44% for the permissive set, and 32% for the strict set.

The partition of the gencode-cage tss clusters is:
- 942452 CAGEOnly / 51435 GencCAGE / 65454 GencOnly for the permissive set,
- 130776 CAGEOnly / 36711 GencCAGE / 80193 GencOnly for the strict set
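
The proportions quoted above can be rechecked from the partition counts. The sketch below assumes (this is not stated in the text) that "gencode tss clusters seen by cage" means GencCAGE / (GencCAGE + GencOnly), i.e. the share of clusters containing a gencode tss that also contain a cage peak. The names PARTITION and gencode_seen_by_cage are illustrative only.

```python
# Partition counts copied from the lists above.
PARTITION = {
    "permissive": {"CAGEOnly": 942452, "GencCAGE": 51435, "GencOnly": 65454},
    "strict": {"CAGEOnly": 130776, "GencCAGE": 36711, "GencOnly": 80193},
}


def gencode_seen_by_cage(counts):
    """Fraction of gencode-containing clusters that also contain a cage peak.

    Assumption: the denominator is GencCAGE + GencOnly (CAGEOnly excluded).
    """
    with_gencode = counts["GencCAGE"] + counts["GencOnly"]
    return counts["GencCAGE"] / with_gencode


for name, counts in PARTITION.items():
    total = sum(counts.values())
    pct = 100 * gencode_seen_by_cage(counts)
    print(f"{name}: {total} clusters in total, {pct:.1f}% of gencode tss clusters seen by cage")
```

Under this assumption the permissive counts give 44.0%, which matches the quoted figure, while the strict counts give 31.4%, slightly below the quoted 32%. The quoted strict figure may therefore rely on a different rounding or denominator.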