This directory contains the genome coordinates of PacBio introns observed in the GENCODE Capture Long-Seq project, in GTF format.

Only canonical (GT|GC / AG) introns are reported. Only reads mapping uniquely in the genome, and whose entire chain of introns is canonical, were used.

There is one record per intron and read. This means a given intron (as identified by its chromosome, start, end, strand genome coordinates) can be present multiple times.

All files correspond to genome assemblies hg38 and mm10.

# File naming scheme:

All_Cap1__allCanonicalIntrons.gff.gz

where:
  species:
    "mm": mouse
    "hs": human
  tissue: self-explanatory

# File format:

Standard GTF, with intron records only. The originating PacBio read identifier is reported in both the "gene_id" and "transcript_id" attributes. The gene biotype of the originating read is specified in the "overlapping_gene_types" attribute.
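
Because there is one record per intron and read, the read support of each distinct intron can be recovered by collapsing records on (chromosome, start, end, strand). The following is a minimal Python sketch of that idea, not part of the distributed data. It assumes the usual nine-column, tab-separated GTF layout with attributes written as key "value"; pairs. The function names and the file path in the usage note are hypothetical.

```python
import gzip
import re
from collections import defaultdict

# Assumed attribute syntax: key "value"; (standard GTF convention)
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Return the GTF attribute column as a dict of key -> value."""
    return dict(ATTR_RE.findall(field))


def reads_per_intron(lines):
    """Group read identifiers by intron.

    `lines` is any iterable of GTF text lines. An intron is keyed by
    (chromosome, start, end, strand); the read identifier is taken
    from the "gene_id" attribute.
    """
    support = defaultdict(set)
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        key = (cols[0], int(cols[3]), int(cols[4]), cols[6])
        read_id = parse_attributes(cols[8]).get("gene_id")
        if read_id is not None:
            support[key].add(read_id)
    return support


def reads_per_intron_in_file(path):
    """Same as reads_per_intron, reading a gzip-compressed file."""
    with gzip.open(path, "rt") as handle:
        return reads_per_intron(handle)
```

For example, calling reads_per_intron_in_file("introns.gff.gz") (a placeholder path) returns a mapping from each intron to the set of reads containing it, so len() of each set gives the number of supporting reads. The same parse_attributes helper can be used to read the "overlapping_gene_types" attribute of a record.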