10.6078/D1CC74 Srouji, John Xu, Anting Park, Annsea Kirsch, Jack Brenner, Steven The Evolution of Function within the Nudix Superfamily UC Berkeley 2015 hydrolase homoplasy Nudix sequence alignment structural alignment 10.6078/d17p4v 10.6078/d13w2x 10.6078/d1059d 10.6078/d1vc7g 10.6078/d1qp46 10.6078/d1kw28 10.6078/d1g59r 10.6078/d1bc7t 10.6078/d16p4j 10.6078/d1301k 10.6078/d1z593 10.6078/d1tg6t 10.6078/d1pp4w 10.6078/D1K01X 10.6078/D1F59F 10.6078/D19G65 10.6078/D15P47 10.6078/D12018 10.6078/D1X59S 10.6078/D1SG6H 10.6078/D1NP4K 10.6078/D1J01M 10.6078/D1D594 10.6078/D18G6V 10.6078/D14S3K 10.6078/D1101Z 10.6078/D1W884 Data and software 292902042 Creative Commons Attribution 4.0 International (CC-BY 4.0)
The Nudix superfamily encompasses over 80,000 protein domains from all three domains of life. These proteins fall into four general functional classes: isopentenyl diphosphate isomerases (IDIs), adenine/guanine mismatch-specific adenine glycosylases (A/G-specific adenine glycosylases), pyrophosphohydrolases, and proteins with non-enzymatic activities such as protein/protein interaction and transcriptional regulation. The largest group, pyrophosphohydrolases, encompasses more than 100 distinct hydrolase specificities. To understand the evolution of this vast number of activities, we assembled and analyzed experimental and structural data for 205 Nudix proteins collected from the literature. We corrected erroneous functions or provided more appropriate descriptions for 53 annotations in this family described in the Gene Ontology Annotation database, and we propose 275 new experimentally based annotations. We manually constructed a structure-guided sequence alignment of 78 Nudix proteins. Using the structural alignment as a seed, we then made an alignment of 347 “select” Nudix domains, curated from structurally determined, functionally characterized, or phylogenetically important Nudix domains.
Based on our review of Nudix pyrophosphohydrolase structures and specificities, we further analyzed a loop region downstream of the Nudix hydrolase motif previously shown to contact the substrate molecule and possess known functional motifs. This loop region provides a potential structural basis for the functional radiation and evolution of substrate specificity within the hydrolase family. Finally, phylogenetic analyses of the 347 select protein domains and of the complete Nudix clan revealed general monophyly with regard to function and a few instances of probable homoplasy.