Introduction
We seek to fully understand the processes of epigenetic regulation, transcription initiation, elongation, and termination, and RNA processing, all of which are necessary to make functional RNAs from the genome. Although the cells that make up our body have the same set of genes, they perform diverse functions. This is made possible by the elaborate cellular machinery that controls spatiotemporal expression of the genome. We have identified and characterized a set of protein factors involved in this process. We are also trying to understand how diverse transcriptomes are generated from the genome.
Regulatory Mechanisms of Transcription Elongation
Transcription occurs in three steps: (i) the initiation step, in which RNA polymerase II (Pol II) binds to the promoter region of a gene and starts RNA synthesis, (ii) the elongation step, in which Pol II actively synthesizes RNA in a 5’-to-3’ direction, and (iii) the termination step, in which Pol II reaches the termination site and is released from DNA. These steps are accomplished by numerous factors that bind to DNA or Pol II. Turning the clock back to the 1990s, research on the transcription initiation stage was in full bloom, and it was widely believed that the transcription initiation stage was the key to on/off regulation of gene expression in eukaryotes.
Under these circumstances, our laboratory turned its attention to DRB, a specific inhibitor of RNA polymerase II transcription, which, when added to cultured mammalian cells, was reported to inhibit the synthesis of long transcripts and cause the accumulation of short transcripts. Thus, it was hypothesized that DRB might inhibit the transcription elongation step. Surprisingly, however, DRB did not inhibit purified RNA polymerase II. We therefore sought to elucidate the mechanism of DRB-mediated transcription inhibition using a biochemical approach.
As a result, we identified two new transcription elongation factors, DSIF (SPT4-SPT5) and NELF. We also found that P-TEFb, a transcription elongation factor discovered by David Price’s lab, works antagonistically with DSIF and NELF. Specifically, DSIF and NELF bind to RNA polymerase II immediately after transcription initiation and act as a brake to inhibit transcription elongation. P-TEFb (CDK9-Cyclin T) has protein kinase activity and phosphorylates the C-terminal domain (CTD) of RNA polymerase II and the C-terminal region (CTR) of DSIF, thereby inducing the release of NELF and reactivating transcription. It has become clear that P-TEFb-mediated phosphorylation triggers the recruitment of additional protein factors that bind to the CTD and CTR, resulting in the formation of a mature transcription elongation complex.
The field of transcription elongation research has evolved with the identification of numerous transcription elongation factors. Although it is now established well enough to be covered in university textbooks, it remains an active research field, with many functional and structural studies published in top journals.
Physiological Significance of Elongation Control
In parallel with mechanistic studies on transcription elongation, we are conducting research aimed at elucidating its physiological significance. Some of this research uses cultured human cells as a model system, and some is carried out as collaborative research using various model organisms.
The most obvious physiological role of elongation control is in inducing transcription of a group of genes that are rapidly expressed in response to external stimuli, collectively referred to as immediate-early genes. Elongation control is thought to pause Pol II shortly after initiation, before it completes mRNA synthesis, keeping chromatin in an active state so that transcription can be resumed promptly when an external stimulus arrives. Studies of genes rapidly expressed in response to heat shock, growth factors, hormones, and so on have indeed supported this idea.
Collaborative studies using zebrafish and Drosophila have also revealed that elongation control plays an important role in the development and differentiation of the central nervous system. The expression of a large number of cell type-specific genes fluctuates dynamically during development and differentiation; therefore, a situation similar to that of immediate-early genes seems to occur.
In addition, the recent development of next-generation sequencers has made it possible to obtain detailed information on the entire genome. This has made it possible to study the behavior of Pol II and transcription factors over individual genes genome-wide, which is difficult to approach by biochemistry alone. In fact, it has become clear that the regulation of transcription elongation by DSIF and NELF is important for overall genome expression.
Regulatory Mechanisms of Transcription Termination: How Diverse Transcriptomes Are Generated
Our lab is also actively investigating the mechanism by which transcription termination control gives rise to diverse transcriptomes. Pol II-transcribed genes have three distinct 3′ end processing pathways, one of which is normally selected for each gene. In principle, 3′ end processing is tightly coupled to transcription termination, with 3′ end processing inducing termination. We have recently obtained evidence that factors thought to be involved in transcription initiation and elongation (NELF, CBC, Mediator, LEC, and so on) are also involved in the selection among the three processing pathways. Moreover, many genes have multiple transcription termination sites, and we are identifying and analyzing factors involved in the selection of these sites. It is likely that diverse transcriptomes are generated from the same genome in different cell types in part through the regulation of the transcription termination step.
We are trying to address these challenges by combining genomic and transcriptomic analysis using next-generation sequencers with proteomic analysis using state-of-the-art mass spectrometry and genetic methods using CRISPR/Cas9. Integrating these datasets is where the bioinformatics approach becomes very important.
Introduction
The human body consists of 60 trillion cells composed of lipids, proteins, and nucleic acids. Small molecule compounds, such as vitamins, steroid hormones, and heme, modulate the intricate and sophisticated apparatus in cells. Many everyday over-the-counter medicines are also small molecule compounds. However, exactly how small molecule compounds exert their activities is not fully elucidated. Even among popular, marketed drugs, there are many for which the mechanism of action is unclear.
Small molecule compounds are extremely useful tools for understanding complex biological systems. Elucidating how a compound affects the body can lead to the elucidation of unknown molecular networks in the body involving target proteins. In the case of pharmaceutical compounds, this knowledge can also lead to drug repositioning and new drug development. Chemical biology is an interdisciplinary research field that addresses these issues, and we are conducting research aimed at drug development using our unique interaction analysis technology (proteomics) and genetic screening technology.
Research Methodology
One of the unique methods in our lab is a biochemical method using FG beads. Under the leadership of Professor Emeritus Hiroshi Handa, who retired in March 2012, our lab took on the challenge of developing a new research tool to elucidate the mechanism of action of small molecule compounds in vivo and developed FG beads, an excellent affinity chromatography carrier. FG beads are microparticles with a magnetic iron oxide core (Fe3O4) and a particle size of 140-200 nm. By using FG beads on which a small molecule compound of interest is immobilized as a ligand, target factors can be purified in one step and in a short time, with very low background and high yield. The beads are chemically stable, and their surface can be modified quite easily (introduction of carboxylic acids, amino groups, etc.). Since FG beads are also magnetic, they can be collected by magnetic separation without using a centrifuge. FG beads have been successfully commercialized and are available from Tamagawa Seiki Co. Our chemical biology studies with FG beads have revealed important and interesting mechanisms of action of various small molecule compounds.
We are also developing distance-dependent in situ biotinylation as a new method to explore and identify molecular interactions that are difficult to approach by conventional affinity chromatography. In addition, we are performing sgRNA library screens using CRISPR/Cas9 as a function-based genetic approach to complement interaction-based approaches.
Thalidomide Research and a New Direction in Drug Development
Undoubtedly, the most significant achievement of our research aimed at drug discovery so far is the identification of the thalidomide target protein. Thalidomide was developed in the 1950s as a hypnotic and sedative drug. Because its teratogenic effect was initially overlooked, many babies were born with limb deformities after their mothers took thalidomide during early pregnancy. This is remembered as one of the worst drug disasters in history. Thalidomide research continued, however, and thalidomide was found to have excellent therapeutic effects against leprosy and multiple myeloma (a type of blood cancer), among others. As a result, thalidomide and its successors have come into the spotlight in this century as drugs commonly used worldwide. The mechanism of action of thalidomide was unknown for a long time, but in 2010, together with Professor Emeritus Hiroshi Handa and Dr. Takumi Ito, we used FG bead technology to discover that the intracellular target of thalidomide is a protein called cereblon (CRBN), which brought a breakthrough in this research area.
Thalidomide and CRBN are more than a simple one-drug, one-target pair; they have the potential to develop into a new foundation for drug development. CRBN is a subunit of the E3 ubiquitin ligase complex CRL4^CRBN. It has become clear that thalidomide-related drugs alter its substrate specificity, inducing the ubiquitination and degradation of proteins that do not serve as substrates in the absence of the drugs (neosubstrates). In other words, thalidomide-related drugs seem to act as “molecular glues” that bridge CRL4^CRBN and neosubstrates and induce the degradation of various neosubstrates, thereby exerting diverse drug effects. Thus, exploration and identification of neosubstrates for CRBN has become an important research issue in recent years. Since ubiquitination and degradation of different neosubstrates can be induced by slightly changing the molecular structure of thalidomide-related drugs, it is hoped that such drugs can target proteins that have not previously been considered therapeutic targets. This research area is therefore attracting much attention as a new direction in drug development.
As described above, thalidomide-related drugs have great potential and are an area of intense research for us. We continue this work in the hope that it will lead to healthier lives for people.