Shasha Chong

Postdoctoral Researcher – Assistant Professor at Caltech

Transcriptional Regulation by Low-Complexity Domains


Sequence-specific DNA-binding transcription factors (TFs) are seminal players in eukaryotic gene regulation. From the earliest studies of human TFs, it was recognized that regulatory proteins like Sp1 contain well-structured DNA-binding domains (DBDs) and functionally critical transactivation domains (TADs) that participate in specific TF-TF interactions to direct gene transcription. Numerous atomic structures of DBDs have provided a concrete understanding of TF-DNA interactions. In contrast, many TADs contain low-complexity sequence domains (LCDs) that persist in an intrinsically disordered conformation not amenable to conventional structure determination. Mutations in TF-LCDs not only disrupt transcription but have also been implicated in cancer and neurodegenerative disorders. However, how TF-LCDs execute specific transactivation functions has remained a long-standing enigma.

360° view of endogenously labeled EWS/FLI1 in an A673 cell nucleus, image taken on a lattice light sheet microscope  
Individual molecules of endogenous EWS/FLI1 (red) binding inside or outside its hubs (green) formed at GGAA microsatellites in an A673 cell nucleus.

A wealth of in vitro studies suggests that purified LCDs from the FET protein family (FUS/EWS/TAF15) can undergo reversible hydrogel formation or liquid-liquid phase separation at high concentrations and low temperatures. Moreover, the C-terminal domain of RNA Polymerase II (Pol II), itself an LCD, can be incorporated into FET LCD hydrogels in a phosphorylation-regulated manner. FET LCDs were also reported to undergo phase separation in live cells upon overexpression. However, physiological conditions in vivo differ starkly from those in vitro or in overexpression systems, and it remains unclear how LCDs behave or function under native, physiologically relevant conditions in vivo. Recently, combining CRISPR/Cas9-mediated genome editing with a variety of high-resolution imaging strategies, including fluorescence recovery after photobleaching, lattice light sheet microscopy, 3D DNA fluorescence in situ hybridization, and live-cell single-particle tracking, we investigated the behavior of TF-LCDs at synthetic and endogenous genomic loci in live cells. We found that TF-LCDs form local high-concentration interaction hubs at targeted genomic loci. TF-LCD hubs stabilize DNA binding, recruit Pol II, and activate transcription. LCD-LCD interactions within hubs are highly dynamic, selective towards binding partners (Q-rich Sp1-LCD versus QGYS-rich FET-LCDs), differentially sensitive to disruption by hexanediols, and essential for the oncogenic transcription factor EWS/FLI1 to drive development of Ewing’s sarcoma. Together, our findings suggest that under physiological conditions, rapid, reversible, and multivalent LCD-LCD interactions occurring between TFs and the Pol II machinery underpin a central mechanism for transactivation and can play a key role in gene expression and disease. Our findings offer a powerful complement to pioneering in vitro studies that provided the first clues about LCD interactions.
Importantly, many aspects of LCD-driven interactions uncovered in vitro are borne out when probed under physiological settings in live cells. Most striking are their highly dynamic and sequence-specific nature and their functional importance. The new insights into TF-LCD interactions may also enhance our ability to develop novel strategies to modulate gene expression in certain disease settings. The fundamental principles that we have uncovered about the dynamics and mechanisms driving LCD-LCD interactions may be applicable to regulatory proteins besides TFs and to biomolecular interactions occurring in a variety of cell types.

A preprint of these studies is available on bioRxiv.