5_Constrained_Genes directory

Contains 4 files:
- 1to1orth_dnrok_allinfo.txt.gz
- mouse_human_constrained.BP.tsv
- mouse_human_constrained.MF.tsv
- mouse_human_constrained.CC.tsv

The txt.gz file includes the 14,363 mouse/human 1 to 1 orthologs with a dynamic range (DNR) across mouse and human experiments.

The head of this file looks like this:

hs_gn mm_gn log10min_rpkm log10max_rpkm dynrange_both constr_class constr_6species gene_name top1000 matched_expr_nozero matched_expr_nozero_hs matched_expr_nozero_mm
ENSG00000000003 ENSMUSG00000067377 -0.701381 1.61573 2.31711 unconstrained 0 TSPAN6 0 1 1 1
ENSG00000000419 ENSMUSG00000078919 0.168199 1.81092 1.64272 constrained 1 DPM1 0 1 1 1
ENSG00000000457 ENSMUSG00000026584 0.107353 1.14142 1.03407 constrained 0 SCYL3 1 1 1 1
ENSG00000000460 ENSMUSG00000041406 -0.822544 1.67499 2.49753 unconstrained 0 C1orf112 0 0 1 0
ENSG00000000938 ENSMUSG00000028874 -1.35343 2.88838 4.24181 unconstrained 0 FGR 1 1 1 1
ENSG00000001036 ENSMUSG00000019810 0.115743 1.79366 1.67792 constrained 1 FUCA2 0 1 0 1
ENSG00000001084 ENSMUSG00000032350 0.298555 1.84959 1.55104 constrained 0 GCLC 0 1 1 1
ENSG00000001167 ENSMUSG00000023994 0.528324 1.63117 1.10284 constrained 0 NFYA 0 1 1 1
ENSG00000001460 ENSMUSG00000028801 -1.29194 1.74495 3.03689 unconstrained 0 C1orf201 0 0 1 1

Each row is a mouse/human 1 to 1 ortholog gene pair, and the 12 columns correspond to the following information:
- hs_gn: human GENCODE v10 gene id (the part before the dot, i.e. without the version suffix)
- mm_gn: mouse Ensembl v65 gene id
- log10min_rpkm: log10 of the minimum of the non-null RPKM values in all mouse and human experiments
- log10max_rpkm: log10 of the maximum of the non-null RPKM values in all mouse and human experiments
- dynrange_both: dynamic range across all mouse and human experiments with a non-null RPKM (log10max_rpkm minus log10min_rpkm)
- constr_class: constrained class (constrained if dnr<=2, unconstrained otherwise)
- constr_6species: a boolean saying whether the gene is also constrained in the 6 vertebrate species of Merkin et al Science 2013
- gene_name: human gene name
- top1000: a boolean saying whether the gene is in the top 1000 constrained or top 1000 unconstrained genes (1000 lowest or 1000 highest dnr, respectively)
- matched_expr_nozero: a boolean saying whether the gene is in the matched expression set computed according to avg (with no zero) in all mouse and human experiments
- matched_expr_nozero_hs: a boolean saying whether the gene is in the matched expression set computed according to avg (with no zero) in all human experiments
- matched_expr_nozero_mm: a boolean saying whether the gene is in the matched expression set computed according to avg (with no zero) in all mouse experiments

The 3 tsv files contain the results of the gene ontology term enrichment in the 6,636 constrained genes with respect to 15,736 1 to 1 orthologs (done with the GOstat R package), one file for each of the 3 gene ontologies: biological process (BP), molecular function (MF) and cellular component (CC).
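
As an illustration of the column definitions above, here is a minimal Python sketch that reads the ortholog table and checks that dynrange_both equals log10max_rpkm minus log10min_rpkm and that constr_class follows the dnr<=2 rule. It is not part of the original pipeline. It assumes the file is whitespace-delimited (tabs or spaces) with a header row, as the head shown above suggests. The function names (read_pairs, check_row, load) and the rounding tolerance are hypothetical choices made for this example.

```python
import gzip
import io

# Assumption: columns are separated by whitespace (tabs or spaces) and the
# first line is the header, as in the head shown in this README.
NUMERIC = ("log10min_rpkm", "log10max_rpkm", "dynrange_both")


def read_pairs(handle):
    """Yield one dict per ortholog pair from an open text handle."""
    header = next(handle).split()
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        for key in NUMERIC:
            row[key] = float(row[key])
        yield row


def check_row(row, tol=1e-4):
    """Return True if the row agrees with the documented column definitions.

    tol absorbs the rounding of the printed values (hypothetical choice).
    """
    dnr = row["log10max_rpkm"] - row["log10min_rpkm"]
    expected = "constrained" if row["dynrange_both"] <= 2 else "unconstrained"
    return abs(dnr - row["dynrange_both"]) <= tol and row["constr_class"] == expected


def load(path="1to1orth_dnrok_allinfo.txt.gz"):
    """Read the whole gzipped table (path is relative to this directory)."""
    with gzip.open(path, "rt") as handle:
        return list(read_pairs(handle))


# Demo on the header and the first three rows copied from the head above.
SAMPLE = """\
hs_gn mm_gn log10min_rpkm log10max_rpkm dynrange_both constr_class constr_6species gene_name top1000 matched_expr_nozero matched_expr_nozero_hs matched_expr_nozero_mm
ENSG00000000003 ENSMUSG00000067377 -0.701381 1.61573 2.31711 unconstrained 0 TSPAN6 0 1 1 1
ENSG00000000419 ENSMUSG00000078919 0.168199 1.81092 1.64272 constrained 1 DPM1 0 1 1 1
ENSG00000000457 ENSMUSG00000026584 0.107353 1.14142 1.03407 constrained 0 SCYL3 1 1 1 1
"""

rows = list(read_pairs(io.StringIO(SAMPLE)))
all_consistent = all(check_row(r) for r in rows)
constrained = [r["gene_name"] for r in rows if r["constr_class"] == "constrained"]
print(all_consistent, constrained)
```

To run the same check on the full table, call load() from inside this directory and apply check_row to every returned row. The boolean columns are left as the strings "0" and "1" in this sketch.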