• mRatBN7.2 is a rat reference genome with improved contiguity and accuracy
• Gene annotations, from both RefSeq and Ensembl, are improved with mRatBN7.2
• Our analysis of 120 strains/substrains of rats found 20 million sequence variations
• A refined phylogenetic tree reveals the relationships between laboratory rats
Summary
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomic datasets. Using mRatBN7.2, we jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains. We identified ∼20.0 million sequence variations, of which 18,700 are predicted to impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. Together, the mRatBN7.2 assembly and the extensive analysis of genomic variation among rat strains enhance our understanding of the rat genome and provide researchers with an expanded resource for studies involving rats.