Publications & Products

Research products from students and faculty in the Center for Biological Data Science. Student co-authors are indicated with (§) for undergraduates and (‡) for graduates; postdocs are indicated by (*).

Rosenberg, M.S. (2018) New record and range extension of the fiddler crab Uca princeps (Smith, 1870) (Brachyura, Ocypodidae) from California, USA. Journal of Crustacean Biology


The fiddler crab Uca princeps (Smith, 1870) has previously been recorded along the Pacific coast of the Americas from Peru to Mexico. Here we extend its range into the United States, based on photographs posted on the iNaturalist website.

Rodionova, I.A., N. Goodacre‡, J. Do, A. Hosseinnia, M. Babu, P. Uetz, and M.H. Saier, Jr. (2018) The uridylyltransferase, GlnD, and tRNA modification GTPase, MnmE allosterically control Escherichia coli folylpoly-γ-glutamate synthase, FolC. Journal of Biological Chemistry


Folate derivatives are important cofactors for enzymes in several metabolic processes. Folate-related inhibition and resistance mechanisms in bacteria are potential targets for antimicrobial therapies and therefore a significant focus of current research. Here, we report that the activity of Escherichia coli poly-γ-glutamyl tetrahydrofolate/dihydrofolate synthase (FolC) is regulated by the glutamate/glutamine-sensing uridylyltransferase (GlnD), the THF-dependent tRNA modification enzyme (MnmE) and the UDP-glucose dehydrogenase (Ugd) as shown by direct in vitro protein-protein interactions.

Using kinetics analyses, we observed that GlnD, Ugd, and MnmE activate FolC many fold by decreasing the Khalf of FolC for its substrate, L-glutamate. Moreover, FolC inhibited the GTPase activity of MnmE at low GTP concentrations. The growth phenotypes associated with these proteins are discussed. These results, obtained using direct in vitro enzyme assays, reveal unanticipated networks of allosteric regulatory interactions in the folate pathway in E. coli and indicate regulation of polyglutamylated tetrahydrofolate biosynthesis by the availability of nitrogen sources, signaled by the glutamine-sensing GlnD protein.

Goodacre, N.‡, P. Devkota, E. Bae, S. Wuchty, and P. Uetz (2018) Protein-protein interactions of human viruses. Seminars in Cell & Developmental Biology


Viruses infect their human hosts by a series of interactions between viral and host proteins, indicating that detailed knowledge of such virus-host interaction interfaces is critical for our understanding of viral infection mechanisms, disease etiology and the development of new drugs. In this review, we primarily survey human host-virus interaction data that are available from public databases following the standardized PSI-MS format. Notably, available host-virus protein interaction information is strongly biased toward a small number of virus families including herpesviridae, papillomaviridae, orthomyxoviridae and retroviridae. While we explore the reliability and relevance of these protein interactions, we also survey the current knowledge about viruses' functional and topological targets. Furthermore, we assess emerging frontiers of host-virus protein interaction research, focusing on protein interaction interfaces of hosts that are infected by different viruses and viruses that infect multiple hosts. Finally, we cover the current status of research that investigates the relationships of virus-targeted host proteins to other comorbidities as well as the influence of host-virus protein interactions on human metabolism.

Elhai, J., and I. Khudyakov (2018) Ancient association of cyanobacterial multicellularity with the regulator HetR and an RGSGR pentapeptide‐containing protein (PatX). Molecular Microbiology


One simple model to explain biological pattern postulates the existence of a stationary regulator of differentiation that positively affects its own expression, coupled with a diffusible suppressor of differentiation that inhibits the regulator's expression. The first has been identified in the filamentous, heterocyst-forming cyanobacterium, Anabaena PCC 7120 as the transcriptional regulator, HetR, and the second as the small protein, PatS, which contains a critical RGSGR motif that binds to HetR. HetR is present in almost all filamentous cyanobacteria, but only a subset of heterocyst-forming strains carry proteins similar to PatS. We identified a third protein, PatX, that also carries the RGSGR motif and is coextensive with HetR. Amino acid sequences of PatX contain two conserved regions: the RGSGR motif and a hydrophobic N‐terminus. Within 69 nt upstream from all instances of the gene is a DIF1 motif correlated in Anabaena with promoter induction in developing heterocysts, preceded in heterocyst-forming strains by an apparent NtcA-binding site, associated with regulation by nitrogen-status. Consistent with a role in the simple model, PatX is expressed dependent on HetR and acts to inhibit differentiation. The acquisition of the PatX/HetR pair preceded the appearance of both PatS and heterocysts, dating back to the beginnings of multicellularity.

Uetz, P., J. Mehla, K. Sinner, F. Oesterheld, K. Richter, D. Schubert, and C. Urbanke (2018) Protein-protein interactions. Bioanalytics, F. Lottspeich, ed. Pp. 381-417.



Wuchty, S., S.A. Mueller, J.H. Caufield‡, R. Häuser, P. Aloy, S. Kalkhof, and P. Uetz (2018) Proteome data improves protein function prediction in the interactome of H. pylori. Molecular & Cellular Proteomics 17:961-973.


Helicobacter pylori is a common pathogen that is estimated to infect half of the human population, causing several diseases such as duodenal ulcer. Despite being one of the first pathogens to be sequenced, its proteome remains poorly characterised as about one third of its proteins have no functional annotation. Here, we integrate and analyze known protein interactions with proteomic and genomic data from different sources. We find that proteins with similar abundances tend to interact. Such an observation is accompanied by a trend of interactions to appear between proteins of similar functions, although some show marked cross-talk to others. Protein function prediction with protein interactions is significantly improved when interactions from other bacteria are included in our network, allowing us to obtain putative functions of more than 300 poorly or previously uncharacterized proteins. Proteins that are critical for the topological controllability of the underlying network are significantly enriched with genes that are up-regulated in the spiral compared to the coccoid form of H. pylori. Determining their evolutionary conservation, we present evidence for 80 protein complexes to be identical in composition with their counterparts in E. coli while 85 are partially conserved but 120 complexes are completely absent. Furthermore, we determine network clusters that coincide with related functions, gene essentiality, genetic context, cellular localization, and gene expression in different cellular states.

Uetz, P., and A. Stylianou‡ (2018) The original descriptions of reptiles and their subspecies. Zootaxa 4375:257-264.


By August 2017 an estimated 13,047 species and subspecies of extant reptiles have been described by a total of 6,454 papers and books which are listed in a supplementary file. For 1,052 species a total of 2,452 subspecies (excluding nominate subspecies) had been described by 2017, down from 1,295 species and 4,411 subspecies in 2009, due to the elevation of many subspecies to species. Here we summarize the history of these taxon descriptions beginning with Linnaeus in 1758. While it took 80 years to reach the first 1,000 species in 1838, new species and subspecies descriptions since then have been added at a roughly constant rate of 1,000 new taxa every 12-17 years. The only exceptions were the decades during World Wars I and II and the beginning of this millennium when the rate of descriptions increased to now about 7 years for the last 1,000 taxa. The top 101 most productive herpetologists (in terms of “taxon output”) have described more than 8,000 species and subspecies, amounting to over 60% of all currently valid taxa. More than 90% of all species were described in either English (68.2%), German (12.7%) or French (9.3%).

Mehla, J.*, J.H. Caufield‡, and P. Uetz (2018) Making the right choice: Critical parameters of the Y2H systems. Two-Hybrid Systems. Methods in Molecular Biology, Oñate-Sánchez, L., ed. 1794:17-28.


Two-hybrid methods remain among the most preferred choices for detecting protein–protein interactions (PPIs) and much of the PPI data in databases have been produced using yeast two-hybrid (Y2H) screens. The Y2H methods are extensively used to detect PPIs because of their scalability and accessibility. Several variants of Y2H methods have been developed and used by different research groups, increasing the accessibility of these methods and their applications in detecting different types of PPIs. However, the availability of variations on the same core methodology emphasizes the need to have a systematic comparison of available Y2H methods in the context of their applicability, coverage and efficiency. In this chapter, we discuss the key parameters of Y2H methods, namely proteins of interest, vectors, libraries, screening strategies, data analysis, and provide a flowchart that should help to decide which Y2H strategy is most appropriate for a protein interaction screen.