WIREs RNA

Volume 14, Issue 2 e1752
Advanced Review
Open Access

Integrating transcription and splicing into cell fate: Transcription factors on the block

Panagiotis Boumpas

Institut de Génomique Fonctionnelle de Lyon, UMR5242, Ecole Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard-Lyon 1, Lyon, France

Contribution: Visualization (equal), Writing - review & editing (supporting)

Samir Merabet

Institut de Génomique Fonctionnelle de Lyon, UMR5242, Ecole Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard-Lyon 1, Lyon, France

Contribution: Writing - original draft (supporting)

Julie Carnesecchi

Corresponding Author

Institut de Génomique Fonctionnelle de Lyon, UMR5242, Ecole Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard-Lyon 1, Lyon, France

Correspondence

Julie Carnesecchi, Institut de Génomique Fonctionnelle de Lyon, UMR5242, Ecole Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard-Lyon 1, Lyon, France.

Contribution: Conceptualization (lead), Funding acquisition (lead), Project administration (lead), Supervision (lead), Visualization (equal), Writing - original draft (lead), Writing - review & editing (lead)

First published: 28 July 2022
Edited by: Alexandra Moreira, Associate Editor and Jeff Wilusz, Editor-in-Chief

Funding information: H2020 Marie Skłodowska-Curie Actions, Grant/Award Number: 101024467

Abstract

Transcription factors (TFs) are present in all life forms and conserved across great evolutionary distances in eukaryotes. From yeast to complex multicellular organisms, they are pivotal players in cell fate decisions, orchestrating gene expression at diverse molecular layers. Notably, TFs fine-tune gene expression by coordinating RNA fate at both the expression and splicing levels. They regulate alternative splicing, an essential mechanism for cell plasticity that allows the production of many mRNA and protein isoforms in precise cell and tissue contexts. Despite this apparent role in splicing, how TFs integrate transcription and splicing to ultimately orchestrate diverse cell functions and cell fate decisions remains puzzling. We describe key studies in various model organisms that underline the role of TFs in alternative splicing for promoting tissue-specific functions and cell fate. Furthermore, we emphasize recent advances describing the molecular link between the transcriptional and splicing activities of TFs. As TFs can bind DNA and/or RNA to regulate transcription and splicing, we further discuss their flexibility and compatibility for DNA and RNA substrates. Finally, we propose several models integrating the transcription and splicing activities of TFs in the coordination and diversification of cell and tissue identities.

This article is categorized under:

  • RNA Processing > Splicing Regulation/Alternative Splicing
  • RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications
  • RNA Processing > Splicing Mechanisms

Graphical Abstract

Transcription factors (TFs) are key players of gene expression integrating transcription and splicing. Their activity is orchestrated by multiple features such as DNA/RNA-binding affinity and co-regulatory proteins, altogether fine-tuned by the local nuclear environment.

Abbreviations

  • AR: androgen receptor
  • Bcd: Bicoid
  • CRM: cis-regulatory module
  • CryoEM: cryo-electron microscopy
  • CTCF: CCCTC-binding factor
  • CTD: carboxy-terminal domain
  • DBD: DNA-binding domain
  • EJC: exon junction complex
  • ESC: embryonic stem cell
  • ESE/ISE, ESS/ISS: exonic or intronic enhancer and silencer
  • HD: homeodomain
  • HMG: high mobility group
  • hnRNP: heterogeneous nuclear RNP protein
  • iPSC: induced pluripotent stem cell
  • Nacc1: nucleus accumbens associated 1
  • NR: nuclear receptor
  • Pol II: RNA polymerase II
  • PTM: posttranslational modification
  • RBM: RNA-binding motif
  • RBP: RNA-binding protein
  • RBR: RNA-binding region
  • SF: splicing factor
  • snRNA: small nuclear RNA
  • Sox: SRY-related high-mobility-group box
  • SR-protein: serine arginine protein
  • SS: splice site
  • TF: transcription factor
  • TSS: transcription start site
  • U snRNP: U small nuclear ribonucleoprotein
  • Ubx: Ultrabithorax
  • ZnF: zinc finger

1 INTRODUCTION

    The acquisition of cellular identity is a remarkable process. From the same genetic material, a variety of cell fates is realized to build multicellular organisms and respond to environmental changes with great plasticity. This diversity relies on inter-connected processes which fine-tune gene expression, from gene selection to protein translation (Maniatis & Reed, 2002; Moore & Proudfoot, 2009). Transcription factors (TFs) are among the pivotal players of gene expression. TFs regulate gene expression by recognizing and binding cis-regulatory modules (CRMs, including enhancers), influencing the chromatin landscape through histone mark deposition and chromatin accessibility, or by connecting CRMs with the basal promoter (Carnesecchi et al., 2018; Sartorelli & Puri, 2018). Their molecular activity is refined by cooperative or competitive interactions with cofactors, which coordinate their function temporally and spatially (Braun & Gingras, 2012; Carnesecchi et al., 2020; Junion et al., 2012). Beyond their molecular function in transcription, the role of TFs is much more comprehensive. They regulate RNA fate to various degrees by contributing to mRNA processing (Rambout et al., 2018), transport (Basyuk et al., 2021), and translation (Rivera-Pomar et al., 1996; Xu et al., 2021). These moonlighting functions are not restricted to a specific class of TFs, as various TFs possess an extended molecular repertoire, such as the nuclear receptors (NR) (Auboeuf et al., 2002; Xu et al., 2021), the SRY-related high-mobility-group (HMG) box (Sox) TFs (Y. Zhang & Hou, 2021), or the homeodomain (HD) family such as Hox TFs (Carnesecchi et al., 2018). Importantly, pioneering work by the Kornblihtt lab uncovered the interplay between transcription and splicing, and the role of TFs in integrating both processes (Cramer et al., 1997). Many studies further determined the function of TFs in alternative splicing for coordinating various cell functions and cell fate decisions in complex organisms (Auboeuf et al., 2002; Carnesecchi et al., 2022; Han et al., 2017; Rambout et al., 2018; Thompson et al., 2019). By affecting the quantitative and qualitative aspects of RNA regulation, TFs orchestrate the production of specific isoforms in distinct cell contexts. Despite their pivotal position in the gene regulatory network, how TFs integrate transcription and splicing to shape cell fate decisions is still puzzling.

    We explore this issue by first introducing the general connection between transcription and splicing, and the main players. Next, we portray various studies in different organisms to illustrate the key role of TFs in splicing for promoting tissue-specific function and cell fate determination. We describe recent advances showing the molecular connection between the transcriptional and splicing activities of TFs. Finally, we discuss the role of TFs at the DNA and RNA layers and propose updated molecular models on how TFs integrate transcription and splicing to ultimately coordinate cellular diversity.

    2 TRANSCRIPTION AND SPLICING ARE LINKED THROUGH TIME AND SPACE

    2.1 Overview of splicing mechanisms

    The processing of nascent pre-mRNA into mature mRNAs is one of the key layers providing a high degree of plasticity to the cell (Moore & Proudfoot, 2009). Among these events, splicing is a prominent strategy to produce many mRNA isoforms and diversify proteins and their functions (see also section Further Reading). This is well illustrated by the Drosophila Down syndrome cell adhesion molecule (Dscam1) gene, which may give rise to more than 38,000 isoforms (Schmucker et al., 2000). More broadly, 95% of human and 60% of Drosophila genes undergo alternative splicing in various cell and tissue types (Baralle & Giudice, 2017) and at different stages of development (Graveley et al., 2011). Beyond proteome diversity, alternative splicing affects core molecular functions such as genome integrity (Auboeuf, 2018). At the molecular level, splicing relies on the sequential assembly of small nuclear ribonucleoprotein complexes (U snRNPs) that compose the core unit of the spliceosome, which catalyzes intron excision and exon ligation (Wahl et al., 2009). Splicing is regulated by numerous cis- and trans-regulatory features that shape the retention or exclusion of alternative exons in cell- and tissue-specific contexts (Dvinge, 2018). Notably, it is regulated by numerous accessory proteins, namely SR-proteins and heterogeneous nuclear RNP proteins (hnRNPs), which act as splicing activators and repressors, respectively (Figure 1a; Bourgeois et al., 2004; Bradley & Blanchette, 2015; Brooks et al., 2015; Howard & Sanford, 2015). Moreover, the recognition of consensus or cryptic RNA splice sites, their strength, and the presence of cognate elements for RNA-binding proteins (RBPs), such as exonic or intronic splicing enhancers and silencers (ESE/ISE, ESS/ISS), influence the splicing outcome (Figure 1a; Bourgeois et al., 2004; Wahl et al., 2009). Overall, alternative splicing relies on a wide range of mechanisms, such as the use of alternative transcription start sites (A-TSS), alternative 5′- and 3′-splice sites (A-SS), alternative polyadenylation sites (A-pA), exon skipping (ES), mutually exclusive exons (MXE), and intron retention (IR; Figure 1b; Chen & Manley, 2009; Nilsen & Graveley, 2010; Pinto et al., 2011).
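
    The Dscam1 figure quoted above can be recovered with simple combinatorics. The short Python sketch below is an illustration only: it assumes the cluster sizes commonly reported for Dscam1 (12, 48, 33, and 2 alternative exons in the exon 4, 6, 9, and 17 clusters), which come from the Dscam1 literature rather than from this review, and it assumes that exactly one variant per cluster is included independently of the others.

```python
from math import prod

# Assumed cluster sizes (alternative exons per mutually exclusive cluster),
# taken from the Dscam1 literature and not stated in this review.
DSCAM1_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(clusters):
    """Count isoforms when exactly one variant is chosen per cluster.

    Choices in different clusters are treated as independent, so the
    number of combinations is the product of the cluster sizes.
    """
    return prod(clusters.values())


if __name__ == "__main__":
    print(count_isoforms(DSCAM1_CLUSTERS))  # 12 * 48 * 33 * 2 = 38016
```

    Under these assumptions, mutually exclusive exon choice alone is enough to exceed 38,000 potential isoforms from a single gene.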

    Figure 1
    Splice site definition during pre-mRNA constitutive and alternative splicing. (a) The 5′ splice site (SS) is defined through binding of U1 snRNP on the pre-mRNA. U2 snRNP binds the branch point and interacts with U2AF to determine the 3′SS. SR-proteins bind exonic splicing enhancers (ESE) to promote the assembly of the splicing machinery through interaction with U1 and U2 snRNPs and subsequent exon inclusion. On the other hand, heterogeneous nuclear ribonucleoprotein particles (hnRNPs) bind exonic splicing silencers (ESS) and hinder the assembly of the spliceosome to promote exon exclusion. Finally, various splicing factors (SF) associate with intronic splicing enhancers (ISE) or intronic splicing silencers (ISS) to either promote or inhibit the assembly of the spliceosome, respectively. (b) Mechanisms of alternative splicing. Diverse mRNAs can be produced through the use of alternative transcription start sites (A-TSS) as well as alternative 5′-, 3′-splice sites (A-SS), and alternative polyadenylation (A-pA) represented by the poly(A) signal consensus sequence AAUAAA. Exons can be spliced out by exon skipping (ES), while mutually exclusive exons (MXE) cannot co-exist in the same mature mRNA. Introns can be retained (IR) in transcripts

    Although classically described at the post-transcriptional level, numerous studies showed that a large fraction of splicing events occur co-transcriptionally (Bentley, 2014; Beyer & Osheim, 1988; Carrocci & Neugebauer, 2019; Cramer et al., 1997; Kornblihtt et al., 2004). Over the years, co-transcriptional splicing has been uncovered using imaging methods (Coulon et al., 2014; Schmidt et al., 2011), biochemical assays (Cramer et al., 1997; Das, 2006; Roberts, 1998), and genome-wide profiling of the nascent transcriptome and RNA Polymerase II (Pol II) occupancy (Churchman & Weissman, 2012; Nojima et al., 2015; Sousa-Luís et al., 2021) in diverse species (Ameur et al., 2011; Graveley et al., 2011; Lacadie et al., 2006). The molecular mechanism of co-transcriptional splicing has been addressed in significant reviews (Bentley, 2014; Carrocci & Neugebauer, 2019; Giono & Kornblihtt, 2020; Perales & Bentley, 2009). Here, we present the key features that position TFs as prime candidates for connecting transcription and splicing.

    2.2 Coupling transcription and splicing with kinetics

    Pivotal studies connecting transcription and splicing employed Pol II mutants that alter the transcription elongation rate (de la Mata et al., 2003; Fong et al., 2014; Saldi et al., 2021). It was first described that a slow elongation rate increases exon inclusion, in a model termed window of opportunity (Aebi et al., 1986; Figure 2a). Yet, impaired Pol II kinetics do not only promote exon retention (Fong et al., 2014; Saldi et al., 2016). Fong et al. demonstrated that each gene is characterized by an optimal elongation rate which fine-tunes splicing and the production of specific isoforms. This goldilocks model proposes that the optimal elongation rate is gene- and exon-specific (Figure 2b). Within one gene, the transcription rate can vary from 0.5 to 5 kb/min, depending on the local chromatin environment (Brown et al., 2012; Cramer et al., 1997; Jonkers et al., 2014). Accordingly, Pol II can pause, accelerate or slow down (Jonkers et al., 2014; Neugebauer, 2019) and thus modulate the use of alternative splice sites and the recruitment of RBPs, and influence the RNA shape (Saldi et al., 2021). Conversely, splicing can be decoupled from active elongation. Using live cell imaging, Brody et al. showed that pre-mRNA can be processed post-transcriptionally while still anchored on the chromatin (Brody et al., 2011). This offers an alternative model of co-localized transcriptional splicing (Figure 2c).
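
    To give a sense of scale for the window of opportunity, the Python sketch below computes how long an upstream splice site remains without competition before Pol II has transcribed a downstream competing site. It is a deliberately simplified illustration: the 2 kb distance is a hypothetical value, the elongation rate is treated as constant over that distance, and pausing, splice site strength, and RBP recruitment are ignored. Only the 0.5 to 5 kb/min range comes from the text above.

```python
def window_minutes(distance_kb, rate_kb_per_min):
    """Time (in minutes) for Pol II to transcribe `distance_kb` of DNA.

    In the window-of-opportunity picture, this is the time during which an
    upstream splice site exists on the nascent RNA before a downstream
    competing site has been synthesized.
    """
    if rate_kb_per_min <= 0:
        raise ValueError("elongation rate must be positive")
    return distance_kb / rate_kb_per_min


if __name__ == "__main__":
    DISTANCE_KB = 2.0  # hypothetical distance between two competing splice sites
    for rate in (0.5, 5.0):  # slow and fast ends of the range quoted in the text
        minutes = window_minutes(DISTANCE_KB, rate)
        print(f"{rate} kb/min -> {minutes:.1f} min")
```

    With these assumed values, the window spans a tenfold range between the slow and fast ends of the quoted elongation rates, which is the intuition behind kinetic coupling; the goldilocks model adds that neither extreme is optimal for every exon.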

    Figure 2
    Models illustrating the coupling of transcription and splicing in time and space. (a,b) KINETIC: Pol II kinetics play a pivotal role in co-transcriptional splicing. According to the window of opportunity model, slow Pol II favors exon retention. In contrast, the goldilocks model proposes that an optimal Pol II elongation rate (slow or fast) is exon-specific. (c) Post-transcriptional splicing can take place while the produced mRNA is still anchored on the chromatin, thereby providing a co-localization of transcription and splicing. (d) The largest subunit of Pol II contains the carboxy-terminal domain (CTD), which is dynamically post-translationally modified during transcription. The CTD can act as a platform for the recruitment of RBPs, thereby linking transcription and splicing. (e) Nuclear hubs: The interfacial model argues for the post-transcriptional splicing of exons localized in nuclear speckles containing SR-proteins, while hnRNPs are predominantly present in the nucleoplasm where introns are excised. However, the close proximity observed for transcription and splicing hubs supports a functional link between transcription and splicing (right panel).

    All in all, innovative methods are continuously developed to capture Pol II kinetics by genome-wide profiling (Mahat et al., 2016; Sousa-Luís et al., 2021) and imaging (Cisse et al., 2013; Shibuta et al., 2021), revealing a connected or independent interplay between transcription kinetics and splicing. This is illustrated in significant reviews on transcription rate (Muniz et al., 2021) and alternative splicing (Giono & Kornblihtt, 2020).

    2.3 Importance of the Pol II CTD

    Early on, McCracken et al. showed that deletion of the carboxy-terminal domain (CTD) of Pol II affects both transcription and mRNA processing (McCracken et al., 1997). The largest subunit of the Pol II complex contains an evolutionarily conserved CTD composed of heptad repeats “YSPTSPS” with some degree of variation (Custódio & Carmo-Fonseca, 2016; Eick & Geyer, 2013; Lu et al., 2019). The CTD is subjected to dynamic post-translational modifications (PTMs) during transcription, which distinguish paused from processive Pol II (Buratowski, 2003). Prior to any experimental evidence, Arno Greenleaf proposed that the negative phospho-charge of the CTD associates with the positive charge of SR-proteins (Greenleaf, 1993). These striking hints were followed by extensive groundwork linking transcription and mRNA processing via the interaction between the CTD (the so-called landing pad) and various machineries of mRNA processing (Bentley, 2005, 2014; Custódio & Carmo-Fonseca, 2016). The association between the CTD and RBPs is referred to as the recruitment model (Figure 2d). This was shown for several snRNPs (U1, U2) and accessory proteins (SR, p54; Bentley, 2005; Das et al., 2007; de la Mata & Kornblihtt, 2006; Hsin & Manley, 2012; Nojima et al., 2018).

    There is no consensus on whether the CTD is an important feature for co-transcriptional splicing and Pol II-RBP association. Recently, Zhang et al. solved the structure of the Pol II and U1 snRNP assembly by cryo-electron microscopy (CryoEM; Zhang, Aibara, et al., 2021). They showed that the complexes directly interact, yet the CTD is not involved. The model supports the co-transcriptional assembly of the spliceosome, while subsequent catalytic steps might occur independently of Pol II. Interestingly, interaction between RBPs and Pol II is not always linked to splicing activity. The U1A and U1 snRNA subunits of U1 snRNP both travel with Pol II during elongation on intron-containing and intronless genes (Brody et al., 2011). This could relate to the additional function of U1 snRNP in telescripting, in which U1 snRNP inhibits the recognition of cryptic polyadenylation sites located in introns and represses aberrant premature polyadenylation (So et al., 2019). All in all, the modes of recruitment diverge (CTD, other subunits or RNA recognition), most likely reflecting a gene-by-gene instruction.

    2.4 Transcription and splicing are a matter of proximity

    Beyond direct interplay with Pol II, the availability of RBPs (Hochberg-Laufer et al., 2019) and their proximity to the transcription machinery can favor their loading on nascent RNA. It is now clear that local concentration and condensates are essential parameters for linking transcription and splicing (Li & Jiang, 2022). The nuclear environment is heterogeneous and contains nuclear speckles that act as a storage for RBPs and a hub for active transcription sites (Galganski et al., 2017; Spector & Lamond, 2011). Studies on nuclear heterogeneity have multiplied, with many groups describing the role of phase separation or biomolecular condensates in chromatin organization (Mir et al., 2019), transcription (Hnisz et al., 2017; McSwiggen et al., 2019), and mRNA processing (Ishov et al., 2020; Liao & Regev, 2021). Notably, it has been shown that the CTD, RBPs (including splicing factors), and TFs can drive the assembly of distinct molecular condensates (Boija et al., 2018; Guo et al., 2019; Maita & Nakagawa, 2020).

    One model proposes that distinct biochemical compositions favor exon retention in nuclear speckles and intron excision in the nucleoplasm. This interfacial model is orchestrated by the combination of (1) a phase separation mechanism, (2) the differential concentration of SR-proteins in speckles compared to hnRNPs in the nucleoplasm, (3) RNA cis-regulatory sequences, and (4) the RNA concentration at the speckle/nucleoplasm interface (Liao & Regev, 2021). Although the model primarily applies to post-transcriptional splicing, the close proximity of transcription sites and nuclear speckles could argue for a functional link between transcription and splicing (Figure 2e). This is supported by studies on heat shock genes, which suggested that proximity of nuclear speckles and active transcription sites enhances the level of gene expression (Hasenson & Shav-Tal, 2020; Kim et al., 2020). In agreement, the increased abundance of RNA associated with nuclear speckles correlates with a decrease of RNA degradation and larger active Pol II foci (Kim et al., 2020). Reciprocally, blocking transcription affects the shape and dynamics of nuclear speckles (Kim et al., 2019). Yet, the importance of this interplay remains to be determined on a larger class of genes and in vivo at the functional level. Interestingly, Bertero et al. showed a tissue-specific function of transcription and splicing condensates (Bertero et al., 2019). They demonstrated that co-regulated cardiac genes are part of a muscle-specific RBM20-dependent chromatin domain that controls their alternative splicing. Thus, the dynamic interplay between chromatin architecture and splicing is a key feature driving cell fate decisions. All in all, local concentration and spatial organization are important parameters for linking transcription and splicing (Figure 2f).

    3 TFS COORDINATE CELL FATE DECISION VIA TRANSCRIPTION AND SPLICING

    The role of TFs in gene expression is well established. By employing their DNA-binding domain (DBD), they recognize and bind DNA regulatory sequences to activate or repress transcription initiation by the Pol II complex. Beyond transcription initiation, TFs act as moonlighting proteins by affecting RNA fate at multiple layers of gene regulation (Carnesecchi et al., 2018; Zanzoni et al., 2019; Y. Zhang & Hou, 2021). Notably, many TFs are able to shape the transcriptome by coordinating the differential expression and splicing of mRNAs throughout development and differentiation processes.

    Early on, it was shown that Sox TFs impact cell fate by regulating alternative splicing in spermatogenesis (Ohe et al., 2002). For example, the function of SOX9 in transcription and splicing is essential for the determination of testis somatic Sertoli cells (Rahmoun et al., 2017). Sox2 orchestrates stemness and differentiation of embryonic stem cells (ESCs) by regulating transcription and splicing programs. In detail, deletion of the RNA-binding motif (RBM) of Sox2 impairs the reprogramming of fibroblasts into induced pluripotent stem cells (Hou et al., 2020). Similarly, the pioneer TFs Forkhead Foxa1 and Foxa2 regulate the determination of lymphocytes in the hematopoietic lineage by coordinating splicing (Lau et al., 2021). The role of TFs in splicing has also been highlighted in oncogenic contexts of the hematopoietic system, in which Runx1/Runx1T1 shapes the transcriptome at the expression and splicing levels in leukemia (Grinev et al., 2021). The ETS-containing TFs ERG (ERG, FLI1, FEV) also impact alternative splicing in myeloid leukemia. For instance, the EWS-FLI1 fusion protein alters transcription and RBFOX2-dependent splicing programs, hence promoting cancer cell phenotypes in Ewing sarcoma (Saulnier et al., 2021).

    Beyond cell determination and reprogramming, TFs regulate splicing in vivo to coordinate tissue development in multicellular organisms. This is exemplified in the nematode Caenorhabditis elegans for establishing the neuronal lineage (Thompson et al., 2019) as well as in Drosophila for regulating embryonic muscle development (Carnesecchi et al., 2020, 2022). In the latter, transcriptome profiling revealed that the Hox TF Ultrabithorax (Ubx) regulates alternative splicing of genes involved in muscle-specific features. These functions are distinct from the ones enriched for genes regulated at the RNA expression level (Carnesecchi et al., 2022). Similarly, Girardot et al. showed that SOX9 regulates transcription and splicing of distinct gene sets associated with different functions in colon tumor cells (Girardot et al., 2018). In addition, developmental pathologies associated with deregulation of both transcription and splicing functions of TFs have been described. This is the case for the cardiac T-box containing TF TBX5. Mutation of the TBX5 gene is associated with Holt–Oram syndrome and leads to various degrees of heart defects and limb abnormalities due to transcription and splicing defects (Fan et al., 2009). Similarly, mutation of TBX3 perturbs transcription and splicing, leading to congenital malformations and ulnar-mammary syndrome (genital, mammary, dental, and limb abnormalities; Kumar et al., 2014).

    An extensive transcriptomic study has been conducted in mice haploinsufficient (Ctcf+/−) for the CCCTC-binding factor CTCF, a zinc-finger (ZnF) TF with DNA- and RNA-interacting domains. This dose-reduction experiment demonstrated that CTCF coordinates transcription and splicing programs in various tissues including brain, kidney, liver, and spleen (Alharbi et al., 2021). Combined with other expression studies (Carnesecchi et al., 2022), it also highlighted that the dose of TFs is essential for their splicing activity, as previously shown for transcriptional regulation (Auer et al., 2020; Paul et al., 2021). In order to identify proteins that impact splicing and cell fate, Han et al. developed a high-throughput method called Systematic Parallel Analysis of RNA regulation coupled to barcode sequencing (SPAR-seq) in mouse ESCs and neuroblastoma cells (Han et al., 2017). They identified diverse players, one-third of which are TFs that impact stemness or differentiation. These examples illustrate the key role of TFs in coordinating cell fate decisions through transcription and splicing. Nonetheless, many of these studies await molecular dissection at the gene level to determine a direct role of TFs in splicing.

    4 TFS COORDINATE SPLICING VIA DIVERSE MOLECULAR MECHANISMS

    4.1 TFs regulate splicing via DNA-binding ability

    TFs employ an extended DNA-binding toolbox of indirect and direct mechanisms for orchestrating gene-specific alternative splicing. Notably, various TFs regulate RBP expression. For example, RUNX1/RUNX1T1 regulates splicing via the control of RBP expression levels, leading to the production of alternative transcripts with differential junction usage (Grinev et al., 2021). Depletion of the ZnF Zfp871 induces a decrease of Srrm4 expression in neural N2A cells (Han et al., 2017). The TF nucleus accumbens associated 1 (Nacc1) controls ESC differentiation by modulating the expression of several RBPs including Mbnl1 (Han et al., 2017). This effect seems direct, as the genome-wide binding profile of Nacc1 revealed an enrichment near the transcription start site (TSS) of its target genes. Interestingly, Zfp871 and Nacc1 also bind the RNA of their target genes, indicating distinct modes of action (Han et al., 2017). Similarly, our study demonstrated that the Drosophila Hox TF Ubx regulates the expression and splicing of RBPs (Carnesecchi et al., 2022). Yet, like Zfp871 and Nacc1, Ubx uses different molecular strategies, as it binds the RNA of differentially spliced exons in vitro and in vivo (Section 4.4).

    TFs also directly modulate splicing via chromatin binding. Great efforts have been made to unveil the role of NRs and their cofactors in splicing (among them: Auboeuf et al., 2002; Bhat-Nakshatri et al., 2013; Monsalve et al., 2000; Shah et al., 2020). These studies underlined the importance of promoter identity for TF splicing activity (Figure 3a). Similarly, a recent study revealed a so far unidentified role of enhancer identity in alternative splicing, for the VEGF enhancer located 157 kb downstream of the promoter (Dahan et al., 2021; Figure 3b). Whether this depends on specific TFs remains unknown. Like NRs, the Sox TFs have been largely studied for their role in transcription and splicing (Hou et al., 2020; Ohe et al., 2002; Y. Zhang & Hou, 2021). Girardot et al. demonstrated that SOX9 does not affect the level of RBPs. Instead, SOX9 regulates splicing via TSS-specific binding of its target genes (Girardot et al., 2018). As SOX9 also binds RNA, the authors proposed a promoter proximal alternative splicing model in which SOX9 brings together promoter and splice sites for driving cell-specific alternative splicing events (Figure 3c). This is partly mediated by the interaction with Y14, a subunit of the exon junction complex (EJC). The authors anticipated that the model does not depend on Pol II kinetics. This remains to be demonstrated, as does the physical proximity of exon and promoter. Similarly, the oncogenic fusion protein RUNX1/RUNX1T1 regulates splicing via the usage of alternative TSSs (Figure 3d; Grinev et al., 2021). This leads to the production of isoforms containing alternative 5′UTRs in Kasumi-1 human cells. Other TFs like CTCF mediate intron retention by recognizing and binding sequences located upstream or downstream of the targeted intron in mouse tissues (Figure 3e; Alharbi et al., 2021). Importantly, TF binding to CRMs, TSSs, and gene bodies could also impact splicing via the regulation of the chromatin landscape and the recruitment of histone modifiers (Figure 3f; Agirre et al., 2021; Dušková et al., 2014; Rambout et al., 2018). All in all, several mechanisms can be employed by TFs to regulate splicing at the DNA layer. Further examination of the chromatin environment and of the proximity of the regulatory sequences could help determine the mechanistic cues more precisely.

    Figure 3
    Models illustrating the regulatory mechanisms of TFs in splicing. TFs regulate splicing via their DNA-binding ability. This can be driven (a) by specific promoter binding, (b) from an enhancer via chromatin looping, (c) from the promoter via a promoter-exon loop, (d) by alternative promoter or TSS selection and impact on alternative 5′UTRs, (e) via the binding of downstream regions or (f) by impacting chromatin marks and influencing exon/intron retention. Moreover, TFs coordinate splicing via interaction with Pol II and elongation machineries by (g) traveling along the gene body with Pol II or (h) impacting RBP recruitment or the chromatin landscape to regulate splicing. Alternatively, (i) TFs associate with RBPs which are spatially organized in the (j) nucleus at the chromatin, nucleoplasm or speckle levels, thereby fine-tuning transcription and splicing. Additionally, TFs can regulate splicing via (k) direct or indirect binding of the nascent transcript, using different protein domains (TF with DBD, TFb with other interaction domain). The TF is schematized as a core domain containing the DBD with an additional flexible interaction domain.

    4.2 TFs associate with Pol II and elongation complexes

    The interplay between the elongation machinery and TFs represents an important regulatory layer of transcription and splicing. One prominent example is the regulation of rem1 transcription and splicing by the Forkhead TF Mei4 in fission yeast (Moldón et al., 2008). Mei4 binds the promoter and gene body of rem1. The authors proposed that Mei4 does not directly bind regulatory DNA sequences in the gene body. Instead, Mei4 associates with elongating Pol II, suggesting a function of Mei4 in coupling elongation and splicing. In addition, the loading of Mei4 is required for recruiting the spliceosome onto rem1 RNA. While splicing depends on the promoter, these data suggest that Mei4 could travel with Pol II and promote the recruitment of the splicing machinery at specific sites.

    Interestingly, we demonstrated a similar mechanism for Ubx in co-transcriptional splicing. Ubx interacts with RBPs and Pol II in the Drosophila embryo as well as directly with the Pol II-CTD in vitro (Carnesecchi et al., 2020, 2022). We showed that Ubx binds along the gene body of its spliced target genes in a transcription-dependent manner. Moreover, a mutant impairing Ubx DNA-binding ability reduces its interaction with processive Pol II (Carnesecchi et al., 2022). Altogether, these data support a model where Ubx binding on chromatin is essential for its splicing function. In addition, Hox TFs can generally interact with the paused Pol II complex and promote the release of processive Pol II (Zouaz et al., 2017). As for Mei4, we proposed a traveling model in which Ubx interacts with Pol II on the promoter and travels with the active elongating complex to regulate splicing by recruiting the spliceosome on its target exons (Figure 3g). These studies illustrate the importance of the interplay between chromatin, TFs and the Pol II machinery for efficient coupling between transcription and splicing. Like Ubx, the TF/RBP p54 interacts both with the Pol II-CTD and RBPs (Kameoka et al., 2004). Another TF, c-Myc, interacts with and recruits the elongation factor Spt5 to promoters, thereby enhancing Pol II processivity (Baluapuri et al., 2019). Interestingly, c-Myc binds the promoter of the RBP Sam68 and regulates its expression and splicing. Caggiano et al. further demonstrated that c-Myc regulates Sam68 splicing through the variation of the Pol II elongation rate (Caggiano et al., 2019). Conversely, a subunit of the negative elongation complex (NELF-B), termed cofactor of BRCA1 (COBRA1), interacts with diverse NRs with different affinities (Aiyar et al., 2004). COBRA1 interacts with the androgen receptor (AR) and promotes exon inclusion in transcripts regulated by AR-specific promoters (Sun et al., 2007). This could be due to a reduced elongation rate or a modulation of RBP recruitment, which could be distinguished experimentally by manipulating Pol II speed (Figure 3h).

    4.3 TFs coordinate splicing via their association with RBPs

    Many TF-interactomes contain RBPs. Most often ignored or even considered as noise, these interactions are nowadays studied to better understand their role in gene expression in the context of specific cellular functions such as cardiac differentiation (Bertero et al., 2019), apoptosis (Bielli et al., 2014), pathologies (Saulnier et al., 2021) and for orchestrating cell identity (Box 1). RBPs and TFs can act cooperatively or antagonistically (Figure 3i). An elegant study showed that the TF FBI-1 regulates splicing by modulating Sam68 recruitment (Bielli et al., 2014). On the one hand, the interaction between FBI-1 and Sam68 decreases Sam68 binding to BCL-X RNA. This promotes the selection of an alternative 5′SS and the production of a long isoform which inhibits apoptosis. On the other hand, depletion of FBI-1 is associated with binding of Sam68 to BCL-X RNA, thus inducing the production of a short isoform leading to apoptosis. Conversely, SOX9 interacts with the EJC subunit Y14 to cooperatively regulate part of its splicing program (Girardot et al., 2018). The Sox TF SRY employs a comparable mechanism and interacts with the spliceosome for regulating alternative splicing, most likely by a cooperative mode of action (Ohe et al., 2002). Importantly, SRY and SOX6 co-localize in nuclear speckles (SC35 marker) with RBPs (SC35, snRNPU1-70K, U2AF65). Blocking splicing with U6 antisense delocalizes SC35 as well as SOX6 (Ohe et al., 2002). Other TFs such as WT1 can localize in nuclear speckles (Rambout et al., 2018). As mentioned in Section 2.4, the spatial localization of transcription sites in close proximity to nuclear speckles is a substantial parameter for transcription and splicing regulation (Zhang, Zhang, et al., 2021). Thus, the localization of TFs in the nuclear landscape and relative to nuclear speckles seems a significant feature to assess for unraveling their molecular function (Figure 3j). Using immunoprecipitation coupled with quantitative mass spectrometry, Samudyata et al. showed that Sox2 interacts with RBPs in both the chromatin (hnRNPs, SRSF1, prp19, prp8, Rbm38) and nucleoplasm (hnRNPs, Dxds) fractions (Samudyata et al., 2019). Similarly, a proximity labelling method coupled with mass spectrometry revealed that Ubx interacts with RBPs both on chromatin and in the nucleoplasm (Carnesecchi et al., 2020). Interestingly, Ubx interacts with mRNA processing proteins, yet with a distinct set of partners in each embryonic tissue analyzed. We envisioned that Ubx regulates splicing in various tissue types, yet via different molecular mechanisms, for promoting cell fate decision (Box 1).

    BOX 1. Combinatorial code of TFs and RBPs for cellular identity

    A series of recent studies highlighted the existence of combinatorial codes of TFs and RBPs, at both the interaction and expression levels, that could orchestrate cellular identity. We uncovered tissue-specific protein networks of the Hox TF Ultrabithorax (Ubx) in Drosophila, revealing distinct interactomes with few common proteins (Carnesecchi et al., 2020). Moreover, Ubx interacts with various players of gene expression, thereby extending its function to mRNA processing including splicing. This variety of partners was noticed in each tissue studied, suggesting common molecular functions of Ubx which are coordinated with different partners depending on the tissue context (Carnesecchi et al., 2022). Another study in nematodes employed fluorescent reporters to demonstrate that combinatorial expression of 3 TFs and 2 RBPs is necessary for alternative splicing in mechanosensory neurons (Thompson et al., 2019). The authors also highlighted the existence of RBP networks driving similar splicing outcomes in different neurons, hence promoting phenotypic convergence. Similarly, comparison of the splicing transcriptomes from diverse naïve mouse Th-cells revealed distinct roles of alternative splicing in early activation and differentiation (Mir et al., 2021). Comparison of transcriptome and genome-wide binding profiles reinforced a model in which lineage-specific TFs regulate RBP expression, thereby orchestrating the cell-type specific rewiring of splicing throughout differentiation. Furthermore, the TF-RBP combinatorial control of splicing has been proposed in pathological contexts (He & Hu, 2021). All in all, these studies underscore that integrative transcription and mRNA processing networks could be the key collaboration for orchestrating cell fate determination.

    4.4 TFs regulate splicing via RNA-binding ability

    The ability to bind both DNA and RNA has been described for numerous TFs, including general TFs (TFIIIA, TAF7; Cheng et al., 2021), the Sox family (Sox2, Sox9; Holmes et al., 2020; Hou et al., 2020; Samudyata et al., 2019), the homeobox HD group (knotted1 kn1, Bcd, Ubx; Lucas et al., 1995; Rivera-Pomar et al., 1996; Carnesecchi et al., 2022), the ZnF (TRA-1, WT-1, CTCF; Han et al., 2017; Hansen et al., 2019), the T-box family (TBX3, TBX5; Fan et al., 2009; Kumar et al., 2014), the NR group (ERα; Xu et al., 2021), the ETS group (Spi/PU.1; Hallier et al., 1996), and the Y-box class and singular TFs like p53 or NFKB (Cassiday, 2002). This nonexhaustive list (see also Hudson & Ortlund, 2014) highlights that RNA-binding ability may represent a general and widely distributed molecular property among the different TFs. Although RNA binding is not restricted to splicing-related functions, this ability considerably increases the variety of mechanisms employed by TFs to regulate splicing. Some TFs like TBX3 and Ubx can regulate splicing most likely via their RNA-binding ability (Carnesecchi et al., 2022; Kumar et al., 2014). This is also the case for Sox2, which binds various RNA types like mRNAs (Hou et al., 2020), the long noncoding RNAs lncRNA-ES1/2 (Holmes et al., 2020), the noncoding RNA 7SK and the small nucleolar RNA Snord34 (Samudyata et al., 2019). However, the impact of these interactions on RNA fate is still unclear in many cases. For example, SOX9 binds RNA but a mutant impairing RNA binding still drives differential splicing (Girardot et al., 2018). The association of Sox2 with RNA reveals interesting divergences. The DBD of Sox2, the HMG, has been shown to contact RNA (Holmes et al., 2020). Yet, another study showed that HMG deletion does not impair Sox2 RNA binding. Instead, Sox2 contains an RNA-binding motif (RBM) which is essential for promoting cell pluripotency via splicing regulation (Hou et al., 2020). Thus, Sox2 may impact RNA fate differently by employing distinct RNA-binding interfaces (Figure 3k).

    All in all, TFs use various indirect and direct mechanisms to regulate splicing. Notably, a significant review described the mechanistic cues of TFs in splicing (Rambout et al., 2018). Collectively, the data further suggest that there is no consensus that defines the mode of action of each TF, nor of a specific TF family (Box 2).

    5 PROSPECTS ON TF DNA/RNA DUAL BINDING ABILITY

    5.1 Specific or not specific, that is one of the questions

    TFs recognize and bind putative DNA-binding sites through their DBD, which is highly conserved in terms of structure and binding interface (Banerjee-Basu, 2001; Y. Zhang & Hou, 2021). Yet, a certain degree of flexibility provides specificity via the influence of DNA shape (Pal et al., 2019; Sielemann et al., 2021), the interaction interface with cofactors (Merabet et al., 2007) and the variation of enhancer grammar (Barolo, 2016; Jindal & Farley, 2021). The binding of TFs to RNA could follow the same logic, but this does not seem to be the case. Some TFs such as the Drosophila Tra2 and Bcd have specificity toward defined RNAs and binding sites (dsx for Tra2, caudal for Bcd); however, this is still a matter of debate (Rödel et al., 2013). In contrast to Tra2, Tra binds various RNA molecules in vitro, yet associates specifically in vivo with dsx RNA to regulate sex determination (Tian & Maniatis, 1992). While we detected a certain binding plasticity in vitro toward RNA probes, we observed that Ubx binds RNAs with different affinities in vitro and with less specificity than in vivo (Carnesecchi et al., 2022). It has been suggested that TF-DNA binding relies mainly on nucleotide sequence recognition while TF-RNA association depends on 3D structure such as stems and loops (Christiansen et al., 1987; Stefl et al., 2005). In other words, TF-DNA association is primarily based on sequence (and then on shape), whereas RNA binding could be largely driven by shape. We further examined whether this could emerge as a general feature across TFs. It has been shown that TFIIIA interaction with 5S RNA does not rely on site-specific contacts but on the RNA structure (Darsillo & Huber, 1991). This view is supported by a recent study of Sox2 binding to lncRNA-ES1. In vitro, Sox2 binds double-stranded RNA sequences (hairpins) via its HMG with high affinity, whereas the nature of the sequence itself does not affect the association (Holmes et al., 2020). In contrast, another study showed that Sox2 binds RNA via an RBM domain, thereby conferring a degree of specificity toward GC-rich sequences (Hou et al., 2020). In their review, Rambout et al. assessed in silico the structure of the RNA bound by TBX3, for which loss of interaction has been shown upon mutation of the T-box putative RNA sequence (Kumar et al., 2014). They found that the mutation also affects the RNA structure in silico by disrupting hairpin formation (Rambout et al., 2018). In this context, the respective influence of RNA conformation and sequence remains to be determined unambiguously, and crystallography strategies seem pivotal to unravel the preferential assembly of TFs with DNA or RNA molecules.

    5.2 Compatibility and domains for DNA/RNA interface

    Apart from Sox2 and CTCF, the TF-RNA interactions revealed so far primarily involve the DBD. This raised several questions that we aimed to examine: how can TFs bind both DNA and RNA with the same domain? Is the binding compatible or mutually exclusive? A TFIIIA mutant affects DNA and RNA binding, revealing the same amino acid interface for contacting both nucleotide sequences (Rawlings et al., 1996). In contrast, it has been shown for several TFs that, while the DBD is pivotal for DNA and RNA binding, these interactions involve different amino acids (Cassiday, 2002; Holmes et al., 2020). For example, a point mutation of the Ubx HD impairs DNA binding but not its RNA-binding ability in vitro (Carnesecchi et al., 2022). This opens the prospect that the DBD could accommodate DNA and RNA binding concurrently through a specific structural conformation (Figure 4a). In the case of Sox2, Holmes et al. claimed that DNA and RNA binding is mutually exclusive, yet the HMG amino acid interface is probably different (Holmes et al., 2020). In contrast, Hou et al. showed that Sox2 could contact DNA and RNA simultaneously as RNA recognition is mediated by the RBM domain, allowing the HMG to bind DNA (Figure 4b; Hou et al., 2020). Importantly, PTMs of TFs and RBPs modulate their affinity for DNA and RNA, respectively (Arenas et al., 2020; Babic et al., 2004; Filtz et al., 2014). Thus, we speculate that specific PTMs in the DBD favor binding toward DNA or RNA molecules. Alternatively, PTMs in other regulatory regions may affect the TF conformation, thereby impacting the DNA/RNA affinity of the DBD (Figure 4c). Moreover, local concentration could ensure that an excess of TF molecules is present to bind both DNA and RNA in a mutually exclusive manner (Figure 4d,e). We envision this mechanism in the case of TBX3, as a variant lacking the DBD has no effect on splicing. However, bringing TBX3 artificially to the pre-mRNA restores its splicing activity (Kumar et al., 2014).

    FIGURE 4. Models of TF association with DNA and RNA molecules. (a) TFs employ the same domain, the DBD, or (b) different domains to contact DNA and RNA, possibly via specific conformational changes. These changes could be driven by (c) post-translational modifications (PTMs). Alternatively, TFs could bind DNA and RNA in a mutually exclusive manner, which could be possible thanks to (d) local concentration (excess of TF molecules) or (e) a dynamic bouncing behavior between DNA and RNA molecules. Moreover, TFs could contact DNA and RNA via (f) homodimerization or (g) heterodimerization. Finally, (h) the formation of an R-loop, or DNA–RNA hybrid, could be another possible mechanism by which TFs associate with DNA and RNA. Notably, R-loops are formed during the active elongation process.

    Another conceivable way to interact with both DNA and RNA is via TF–TF dimerization (Figure 4f) and by indirect interaction with Pol II/elongation machineries or RBPs (Figure 4g; Rambout et al., 2018). Some TFs like WT1 (Rambout et al., 2018), Hox (Merabet & Hudry, 2011), and SOX (Girardot et al., 2018) dimerize, whereas others (such as TBX3) do not. Notably, a SOX9 mutant affecting its dimerization does not impair its splicing activity. Thus, the model remains to be experimentally demonstrated. An interesting alternative for contacting DNA–RNA is the R-loop or hybrid, as suggested for CTCF and ERα (Figure 4h; Sanz et al., 2016; Stork et al., 2016). This specific association could be involved in splicing regulation as DNA–RNA hybrids are present upstream of the elongating Pol II complex and could affect splicing (Conn et al., 2017). Once more, several models can be envisioned and vast gaps need to be covered to unravel TF binding abilities in vitro and assess their molecular and functional impacts in vivo (Box 2).

    BOX 2. Acquisition of multiple molecular functions by TFs

    From an evolutionary viewpoint, the burst of TFs relates to the Last Eukaryotic Common Ancestor (LECA; de Mendoza et al., 2013; de Mendoza & Sebé-Pedrós, 2019). This includes the HD, HMG, and Forkhead TF families. In contrast to these well-conserved TFs, a genome-wide comparative analysis in numerous eukaryote species revealed that ZnF TFs present the highest divergence of DNA-binding sites across various TF families (Lambert et al., 2019). In line with this, Han et al. speculated that the expansion of ZnF TFs in multicellular organisms may not only emerge from the control of transcription and transposable elements but also from the splicing complexity required for vertebrate organogenesis (Han et al., 2017). These data support a theory of TF evolution for both transcription and splicing regulation. However, it does not apply to the DNA/RNA-binding ability of TFs, which can be decoupled from their molecular function in gene expression. Instead, the DNA/RNA-binding duality may arise from a common ancestor having both abilities with great plasticity. Binding specificity may have been tuned along with the building of genome and multicellular complexities (Hudson & Ortlund, 2014). From a common DNA/RNA-binding ability, different molecular functions may have emerged throughout evolution by cooperative or independent mechanisms. Hence, some TFs developed sophisticated mechanisms to regulate splicing via their DNA-binding ability, their RNA-binding ability, or both. Other TFs diversified their activity by uncoupling functions related to DNA and RNA binding, thereby fine-tuning other regulatory layers of gene expression which could promote cell diversification.

    6 MODELS FOR INTEGRATIVE TRANSCRIPTION AND SPLICING ACTIVITY OF TFS

    6.1 Models of TF dynamics in transcription and splicing

    To find (and bind) their cognate sites, TFs have to deal with the nuclear environment organized in distinct biochemical condensates, chromatin compartments, topological domains, and local chromatin landscape (Li & Jiang, 2022; Tena & Santos-Pereira, 2021). They dynamically (and stochastically) scan the chromatin comprising many consensus and degenerate sequences (Garcia et al., 2021; Hansen et al., 2020; Mazzocca et al., 2021; Suter, 2020). In this context, TFs need to adopt innovative behavior to exert their molecular function on transcription and splicing.

    An interesting model integrating the role of TFs as DNA/RNA binding proteins has been proposed (Scherrer, 2012), accounting for different constraints such as the CRM identity (enhancer, promoter), the 3D nuclear organization and the local chromatin domain (marks, accessibility). This model includes the fact that CRMs can dynamically interact even when separated by long distances. It integrates the recruitment of cofactors on chromatin (TFs, chromatin associated proteins) and on nascent transcripts (snRNPs and associated factors). The transcription factor cycle (TFC) hypothesis proposes that TFs transfer dynamically from CRMs to pre-mRNA and recycle back to the chromatin, overall driven by a fine-tuned equilibrium between affinity and specificity (Figure 5; Scherrer, 2012). Thus, it includes TF availability as an essential parameter. However, this model cannot solely explain the various molecular mechanisms by which TFs regulate splicing (Section 4).

    FIGURE 5. Model for integrative regulation of transcription and splicing by TFs. Cartoon summarizing the model of the transcription factor cycle (TFC) proposed by Klaus Scherrer (2012) and refined with additional features impacting on TF activity. In the TFC, TFs transfer dynamically from CRMs (enhancer, promoter) to pre-mRNA and recycle back to the chromatin. This is driven by an equilibrium between affinity and specificity of TFs toward DNA and RNA. We propose that this equilibrium fine-tunes mRNA expression as follows: TFs bind CRMs and promote transcription. The accumulation of transcripts could favor TF binding toward RNA. Consequently, CRM-TF binding declines, leading to a decrease of mRNA level and a shift back of TF binding toward DNA/CRMs. Additional parameters could orchestrate TF activity in transcription and splicing, such as the formation of condensates or hubs influencing the local concentration of TFs. TFs could form a gradient diffusing from enhancer to promoter, to the nascent RNA. This could be determined by specific post-translational modifications (PTMs) of TFs which modulate the binding affinity of TFs toward DNA and RNA molecules. Finally, the binding behavior of TFs is an essential parameter for coordinating their function in transcription. TFs employ distinct behaviors to scan and find their target genes in the extended chromatin-DNA/RNA landscape. Notably, the frequency and duration of the binding event are deterministic in the regulation of gene expression. We expect that various binding behaviors would also apply to splicing regulation.

    To date, there are few data describing the dynamic regulation of splicing by TFs. In contrast, the advance of single molecule imaging unveiled their dynamic behavior for regulating transcription (Auer et al., 2020; Mazzocca et al., 2021). These studies provide substantial information to speculate about spatial and temporal TF dynamics for regulating splicing. The formation of dynamic transcriptional hubs appeared as an important process for gene regulation by optimizing the target-search of TFs in the nuclear landscape. By using live imaging with mathematical modeling, Dufourt et al. demonstrated that the pioneer TF Zelda (Zld) forms dynamic nuclear hubs in Drosophila embryos. Importantly, the local concentration of Zld in microenvironments is associated with a highly dynamic binding behavior. Through accumulation in hubs and transient chromatin-binding, Zld facilitates transcriptional activation during zygotic genome activation in Drosophila embryos (Dufourt et al., 2018). Thus, local concentration can affect the rate of TF binding occupancy and impact on transcription burst frequency (Figure 5). This model is further refined by a recent study on CTCF. Single molecule tracking and theoretical modeling revealed that CTCF local concentration facilitates transcriptional activation by transiently trapping CTCF in proximity to its DNA-binding sites (Hansen et al., 2020). This relies on the RNA-binding region (RBR) of CTCF which guides local concentration. Remarkably, the RBR domain of CTCF drives part of the CTCF-dependent chromatin loops (Hansen et al., 2019). The CTCF model of chromatin looping could apply to Sox2, which binds DNA and RNA with different protein domains. Alternatively, this could be driven solely by the DBD contacting both DNA and RNA via different amino-acid interfaces (Figure 4a). Thus, transcriptional hub/local concentration could be a major parameter that refines DNA/RNA scanning and the regulation of transcription and splicing by TFs (Figure 5). Another important parameter of TF behavior is the dwell-time of TF binding occupancy, which has been shown to impact on transcription burst duration (Brouwer & Lenstra, 2019). Interestingly, Garcia et al. used single molecule imaging to propose a continuum affinity model where the TF dwell-time distribution follows a power law behavior (Garcia et al., 2021). In biological terms, the power law means that we cannot differentiate between specific and nonspecific binding of TFs. This raised critical questions concerning the extent to which this behavior relates to DNA or RNA binding of TFs and/or to interaction with the transcription and/or splicing machineries. These data further exemplify how the heterogeneity of the nuclear environment and the variety of nucleic acid and protein–protein interactions mediated by TFs impact on their dynamic behavior.
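    The statement that a power law prevents separating specific from nonspecific binding can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not the analysis of Garcia et al.: the function names, the exponent, the time constants and the specific-binding fraction are all hypothetical values chosen for the example. It contrasts a two-population (bi-exponential) survival curve, which has two characteristic residence times, with a power-law survival curve, which has none.

```python
import math

def biexponential_survival(t, f_specific=0.2, tau_specific=10.0, tau_nonspecific=0.5):
    """Fraction of TFs still bound at time t for two discrete populations.

    All parameter values are hypothetical and chosen for illustration only.
    """
    return (f_specific * math.exp(-t / tau_specific)
            + (1.0 - f_specific) * math.exp(-t / tau_nonspecific))

def power_law_survival(t, alpha=1.0, t_min=0.1):
    """Scale-free survival curve; alpha and t_min are hypothetical values."""
    return (t / t_min) ** (-alpha)

def rescaling_ratio(survival, t, factor=2.0):
    """How much the bound fraction drops when the observation time is multiplied by `factor`."""
    return survival(factor * t) / survival(t)

if __name__ == "__main__":
    for t in (0.5, 5.0, 50.0):
        print(f"t = {t:5.1f} s | bi-exponential: {rescaling_ratio(biexponential_survival, t):.4f}"
              f" | power law: {rescaling_ratio(power_law_survival, t):.4f}")
```

    In this toy comparison, the drop of the bi-exponential curve depends on where along the time axis it is measured, which is what allows a short-lived and a long-lived population to be told apart. For the power law, the drop is the same at every time scale, so no threshold naturally separates "specific" from "nonspecific" events.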

    All in all, large gaps remain in our understanding of the molecular mechanisms of TFs in transcription and splicing. Filling them requires innovative technological development for interpreting experimental data into clear TF behaviors (Lionnet & Wu, 2021) and for assessing more precisely the features that influence their behavior at the single molecule level in vivo.

    6.2 Updated models of TF activity for integrating transcription and splicing

    Taken together, numerous models can be envisioned, compatible or exclusive, and no preferred one seems to emerge. Instead, the molecular mechanisms employed by each TF to regulate transcription and splicing rely on the target gene involved, for which the cell context instructs specificity. In the following section, we summarize key points centered on TFs, integrating direct regulatory mechanisms linking transcription and splicing (Figure 5).
    • Carried by the crowd (Sections 2.4 and 4, Figures 2e, 3j, and 4d,e)
    Local concentration and nuclear condensates favor the proximity of transcription and splicing machineries either independently or through interconnected processes such as co-transcriptional splicing. This includes chromatin looping driven by TFs. From this proximity, various options emerge for TFs to act on chromatin/DNA and RNA.
    • Importance of being anchored (Section 4.1, Figures 2c, 3a–f, and 4)
    TFs regulate transcription and splicing by chromatin/DNA binding relying on CRM identity or binding in the gene body, with various degrees of specificity. This could promote alternative TSS usage, differential regulatory proteins recruitment, modification of the chromatin landscape, or of elongation rate.
    TFs travel together with the Pol II machinery to regulate both transcription and splicing. This model could depend (or not) on chromatin/DNA binding and on chromatin looping. In contrast, it requires either a direct or indirect association with the nascent pre-mRNA to regulate splicing.
    • Associating with RNA (Sections 4.3 and 4.4, Figures 3i–k and 4)
    TFs interact directly—or indirectly via cofactors—with RNA thus affecting the recruitment of splicing machinery and promoting various alternatively spliced events. The mechanism is gene- and more likely sequence-specific. TFs can bind RNA in a transcription-dependent or transcription-independent manner, which could involve (or not) chromatin/DNA binding.
    • Contacting chromatin and RNA with distinct protein interfaces (Section 5.2, Figure 4a)
    TFs can use different domains to contact both DNA and RNA. Alternatively, we foresee that the DBD can contact both molecules via specific 3D arrangement.
    • Contacting chromatin and RNA via TF–TF dimerization (Section 5.2, Figure 4f)
    In order to contact both chromatin/DNA and RNA, TFs can dimerize: one molecule contacts DNA and the other RNA.
    • Contacting chromatin and RNA by recruiting partners (Sections 4.3 and 5.2, Figure 4g)
    Local concentration promotes interaction with cofactors. Thus, some TFs could have an essential role in splicing while they are not able to interact directly with RNA. In that configuration, they interact with cofactors which connect them to their pre-mRNA target.
    • Relying on local concentration: excess TFs over RNA and DNA (Sections 2.3 and 5.2, Figures 2e and 4d,e)
    Local concentration of TFs has been highlighted in numerous studies. Thus, it is conceivable that enough TF molecules are available to independently regulate transcription and splicing via mutually exclusive DNA and RNA binding. Alternatively, TFs could bind in a mutually exclusive manner both chromatin/DNA and RNA due to proximity using a dynamic behavior of bouncing on DNA and RNA molecules (Figure 4e).
    • Fine-tuning affinity and dynamics by TF PTMs (Section 5.2, Figure 4c)
    It has been proposed that TF PTMs could be a major parameter for close-proximity communication between CRMs (Karr et al., 2022). In particular, TF acetylation at the enhancer weakens the DNA interaction, promotes local TF diffusion from the enhancer to the promoter and initiates transcription activation. Thus, a gradient of TF PTMs more globally could be critical for enhancer–promoter communication and transcriptional regulation. It is particularly inspiring as RBP acetylation seems to modulate RNA binding in a positive (Babic et al., 2004) or negative (Arenas et al., 2020) manner. Thus, local concentration and PTMs could fine-tune the preferential affinity of TFs not only for dynamic enhancer/promoter interaction but also for the DNA–RNA equilibrium. One could speculate that the accumulation of RNA further favors TF binding toward RNA. This could negatively feed back on transcription due to the loss of CRM activation. The decrease of RNA level may switch TF binding back toward DNA, thereby recycling the TF back to the CRMs for another cycle of transcription and ultimately fine-tuning gene expression (Figure 5).
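    The speculated feedback between RNA accumulation and TF availability at CRMs can be made concrete with a deliberately simple toy model. The sketch below is not taken from Scherrer (2012) or from any cited study: the rate law, the function name and all parameter values are assumptions introduced only for illustration. It assumes that the fraction of TF bound to DNA decreases as RNA accumulates, following 1 / (1 + rna_affinity × RNA), that transcription is proportional to this DNA-bound fraction, and that RNA decays at a constant rate.

```python
def simulate_rna(k_tx=1.0, k_deg=0.1, rna_affinity=1.0, dt=0.01, n_steps=10000):
    """Toy Euler integration of dR/dt = k_tx * f_DNA(R) - k_deg * R.

    f_DNA(R) = 1 / (1 + rna_affinity * R) is a hypothetical DNA-bound TF fraction
    that shrinks as RNA titrates the TF away from CRMs. All values are illustrative.
    """
    rna = 0.0
    for _ in range(n_steps):
        dna_bound_fraction = 1.0 / (1.0 + rna_affinity * rna)
        rna += dt * (k_tx * dna_bound_fraction - k_deg * rna)
    return rna

if __name__ == "__main__":
    with_feedback = simulate_rna(rna_affinity=1.0)
    without_feedback = simulate_rna(rna_affinity=0.0)  # TF never leaves the CRMs
    print(f"RNA level with TF titration by RNA: {with_feedback:.2f}")
    print(f"RNA level without titration:        {without_feedback:.2f}")
```

    Under these assumptions, RNA-dependent titration of the TF lowers the RNA level reached compared with a TF that remains on DNA, which is one way to read the proposed fine-tuning. This minimal version settles to a steady state; reproducing repeated cycles of TF transfer and recycling would require additional ingredients such as delays or bursting, which are not modeled here.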

    7 CONCLUSION AND FUTURE DIRECTIONS

    TFs are pivotal in integrating transcription and splicing, acting at both levels through interconnected mechanisms. Nonetheless, we are still far from understanding their molecular mode of action. Several models have been proposed, yet most of them remain to be experimentally tested and connected. This is where the importance of multidisciplinary research strategies stands. It appears from the current knowledge that one experimental approach is not sufficient for unveiling their molecular activity in gene expression. Instead, key findings have been highlighted using combinatorial approaches of biochemistry, imaging and theoretical modeling. Nowadays, technological developments in the -omics field allow the precise assessment of nascent RNAs, Pol II speed and occupancy, and the accurate quantification of transcript isoforms using long-read sequencing. Similarly, the advances in microscopy provide another degree of precision by combining single molecule imaging with gold-standard biochemistry and CryoEM for structural insights. Therefore, the field is vibrant and offers exciting prospects to unravel the role of TFs in gene regulation.

    Many questions are left open on several layers. The first concerns the DNA/RNA binding of TFs, its specificity and how TFs accommodate both substrates. Another issue concerns the constraints shaping the RNA affinity toward structure rather than sequence. Finally, it is still unclear why certain TFs regulate splicing via their DNA and/or RNA binding ability and what regulates the target gene specificity. At the DNA-regulatory layer, TF binding in the gene body seems pivotal for co-transcriptional splicing. Interestingly, we observed that Ubx CRMs located in the gene body are differently impacted by a transcription inhibitor than intergenic ones (Carnesecchi et al., 2022). Whether this observation is a global feature of the TF could be determined at the genome-wide level, as well as its functional relevance in vivo. Additionally, important features are not explored in the proposed models, such as the influence of RNA editing (Roignant & Soller, 2017). Another gap concerns the requirement of transcriptional hubs and transcription factories. Although the proposed models include the formation of condensates, their requirement at all times is still a matter of debate. The same issue holds true for CRM communication (Karr et al., 2022). Therefore, are there rules that we could extract from genome-wide profiling or should we primarily envision a gene-by-gene case of study? Certainly, both approaches will be decisive, along with the cell type specific context.

    By orchestrating cell fate decision at several layers of gene expression, TFs build the remarkable complexity of multicellular organisms. In this review, we underlined the importance of splicing regulation by TFs in diverse cell functions and cell fate determination. This emphasizes the undeniable necessity to explore the impact of TF function in transcriptome data not only on mRNA expression but also at the splicing level. Even more remarkably, the function of TFs extends beyond splicing to several layers of mRNA processing, transport, translation, and cellular trafficking. This opens large perspectives to connect their molecular activity to the diversity of their biological functions. Thus, deciphering TF molecular versatility at the various layers of gene regulation offers exciting and critical challenges to unveil the molecular cues orchestrating cell fate decision.

    AUTHOR CONTRIBUTIONS

    Panagiotis Boumpas: Visualization (equal); writing – review and editing (supporting). Samir Merabet: Writing – original draft (supporting). Julie Carnesecchi: Conceptualization (lead); funding acquisition (lead); project administration (lead); supervision (lead); visualization (equal); writing – original draft (lead); writing – review and editing (lead).

    ACKNOWLEDGMENTS

    We apologize to all whose works have not been cited due to space constraints. We would like to emphasize our wish to cite the most recent reviews, which will hopefully provide further access to the many pioneer works in the various fields that the review aims to overlap. Moreover, we are deeply grateful to Didier Auboeuf and Pedro Pinto, who kindly offered insightful critiques of the review and stimulating discussions on the topic. We warmly thank James Brash for proofreading the manuscript.

      FUNDING INFORMATION

      This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement (RNA-NetHOX, 101024467) for Julie Carnesecchi.

      CONFLICT OF INTEREST

      The authors have declared no conflict of interest for this article.

      RELATED WIREs ARTICLES

      What is the switch for coupling transcription and splicing? RNA Polymerase II C-terminal domain phosphorylation, phase separation and beyond

      Transcription and splicing: A two-way street

      DATA AVAILABILITY STATEMENT

      Data sharing is not applicable to this article as no new data were created or analyzed in this study.