This directory contains the genome coordinates of PacBio reads (ROIs) produced within the GENCODE Capture Long-Seq project, in GTF format.

Only reads with the following characteristics are reported:
- mapped uniquely in the genome
- if spliced, their entire chain of introns is canonical (GT|GC / AG)
- if unspliced, they must have a detectable polyA tail

In other words, the difference between this set and the HCGMs present in the parent directory is that this set may contain non-double-bounded reads.

All files correspond to genome assemblies hg38 (human) and mm10 (mouse).

# File naming scheme:

All_Cap1__allCanonicalExons.gff.gz

where:
species:
  "mm": mouse
  "hs": human
tissue:
  self-explanatory. "Undeter": non-demultiplexed reads (i.e., of unknown tissue origin)

# File format:

Standard GTF, with exon records only. The originating PacBio read identifier is reported in both the "gene_id" and "transcript_id" attributes.
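
Because the files contain exon records only, with the read identifier stored in "transcript_id", the exon structure of each read can be rebuilt by grouping records on that attribute. The following is a minimal Python sketch of that idea, not a tool shipped with this directory. It assumes the usual nine tab-separated GTF columns and attributes written as key "value"; pairs. The function names (parse_attributes, exons_by_read, intron_chain, load) and the example records (read IDs, source column, coordinates) are hypothetical and made up for illustration.

```python
import gzip
import re
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(field):
    """Return the GTF attribute column (column 9) as a dict."""
    return dict(ATTR_RE.findall(field))


def exons_by_read(lines):
    """Group exon records by read ID (the transcript_id attribute).

    Returns {read_id: [(chrom, start, end, strand), ...]} with each
    read's exons sorted by start coordinate.
    """
    reads = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue  # the files should hold exon records only
        attrs = parse_attributes(cols[8])
        reads[attrs["transcript_id"]].append(
            (cols[0], int(cols[3]), int(cols[4]), cols[6])
        )
    for exons in reads.values():
        exons.sort(key=lambda exon: exon[1])
    return dict(reads)


def intron_chain(exons):
    """Return the introns implied by a read's sorted exons.

    GTF coordinates are 1-based and inclusive, so an intron spans
    previous_exon_end + 1 .. next_exon_start - 1.
    An unspliced (single-exon) read yields an empty list.
    """
    return [
        (left[2] + 1, right[1] - 1)
        for left, right in zip(exons, exons[1:])
    ]


def load(path):
    """Read one of the gzip-compressed files in this directory."""
    with gzip.open(path, "rt") as handle:
        return exons_by_read(handle)


# Made-up records for illustration only; not taken from the real files.
EXAMPLE = [
    'chr1\tpacbio\texon\t1500\t1700\t.\t+\t.\tgene_id "read_A"; transcript_id "read_A";\n',
    'chr1\tpacbio\texon\t1000\t1200\t.\t+\t.\tgene_id "read_A"; transcript_id "read_A";\n',
    'chr2\tpacbio\texon\t500\t900\t.\t-\t.\tgene_id "read_B"; transcript_id "read_B";\n',
]

if __name__ == "__main__":
    for read_id, exons in exons_by_read(EXAMPLE).items():
        print(read_id, exons, intron_chain(exons))
```

To use it on real data, pass the path of one of the .gff.gz files to load(). Spliced reads then have a non-empty intron chain, while unspliced reads (those retained because of a detectable polyA tail) have a single exon and an empty chain.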