This directory contains lists of merged transcript model identifiers (GTF "transcript_id" values) produced by the "non anchored" merging procedure applied to PacBio HCGMs (High-Confidence Genome Mappings) within the GENCODE Capture Long-Seq project. Only transcript models whose entire set of introns, if any, is supported by captured HiSeq data are listed (i.e., each intron is matched by at least one spliced HiSeq read with exactly the same coordinates and strand as the PacBio intron).

The corresponding transcript coordinates can be retrieved from the unfiltered GTFs in the ../gtf/ directory, e.g. with the following command (after gunzipping both files):

$ grep -F -w -f All_Cap1__noAnchor.HiSS.list.txt All_Cap1__noAnchor.compmerge..gtf

# File naming scheme:

All_Cap1__noAnchor.HiSS.list.txt.gz

where:
  species:
    "mm": mouse
    "hs": human
  tissue: self-explanatory, except:
    "all": transcript models merged across all available tissues

# File format:

Gzipped ASCII text. One transcript_id per line, corresponding to either:
 - a spliced transcript model whose full set of splice junctions is supported by HiSeq data, or
 - a mono-exonic transcript
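
# Example: filtering a GTF by transcript_id in Python

The grep command above matches each identifier as a whole word anywhere on a GTF line, so it could in principle also select lines where the same string appears in another attribute. As a hedged alternative, the minimal sketch below compares identifiers against the transcript_id attribute only, and reads gzipped files directly. It is not part of the project's tooling: the function names (open_text, load_ids, filter_gtf) are hypothetical, and it assumes standard tab-separated GTF lines with attributes of the form transcript_id "...";.

```python
import gzip
import re

# Matches the transcript_id attribute (and not e.g. "orig_transcript_id")
# in the 9th GTF column. Assumes the usual GTF style: key "value";
TRANSCRIPT_ID_RE = re.compile(r'(?:^|;)\s*transcript_id "([^"]+)"')


def open_text(path):
    """Open a plain or gzipped text file for reading (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def load_ids(list_path):
    """Read one transcript_id per line into a set, skipping blank lines."""
    with open_text(list_path) as handle:
        return {line.strip() for line in handle if line.strip()}


def filter_gtf(gtf_path, ids):
    """Yield the GTF lines whose transcript_id attribute is in `ids`."""
    with open_text(gtf_path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header/comment lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue  # not a well-formed GTF record
            match = TRANSCRIPT_ID_RE.search(fields[8])
            if match and match.group(1) in ids:
                yield line
```

To use it, pass the path of a list file from this directory to load_ids, then pass the path of the matching GTF from ../gtf/ together with the returned set to filter_gtf, and write the yielded lines to an output file. Loading the identifiers into a set keeps each lookup constant-time, so the GTF is streamed once regardless of list size.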