Microexons In the Autistic Brain
By Laurel Oldach
April 8, 2015 | A recent transcriptomic study has revealed that hundreds of proteins have bonus snippets one to nine amino acids in length that are expressed primarily in the brain, altering protein function compared to isoforms elsewhere in the body. In autistic brains, about a third of these snippets are more often skipped. Though the omissions are almost undetectable by conventional RNA-Seq, the study’s authors report that they have widespread effects on protein binding and neuronal function.
Scientists have long recognized alternative splicing, the variable inclusion of exons into messenger RNA, as a potent way to generate diverse proteins and differentiated tissues. But until next-generation sequencing arrived, it was difficult to generate a whole-genome picture of the splicing landscape. Many of the snippets of sequence found in this study, termed microexons because they are just a fraction of the length of the average exon, had never been described because of their astonishingly small size. RNA-Seq pipelines were more likely to filter them out as noise than to identify them as legitimate isoforms. So how were these small but surprisingly potent binding regulators finally found? The authors of a study published in Cell last December (DOI http://dx.doi.org/10.1016/j.cell.2014.11.035), working in the Blencowe lab at the University of Toronto, used a novel analysis of RNA-Seq data to study alternative splicing in the mammalian nervous system.
Knowing What To Look For
According to lead author Manuel Irimia, a former postdoc in the Blencowe lab, discovering neuronally regulated microexons was a matter of approaching the data with the right questions in mind. “I designed the pipeline to find microexons, otherwise I wouldn’t have been able to find them,” he says, posing a strange tautology. Before microexons were described in the brain, how did he know to look for them? The answer, as so often in science, is the combination of background knowledge and a good hunch.
Very small exons had been experimentally validated prior to this study, typically in proteins where their presence or absence affects function; however, the majority of microexons were thought to be too small to be spliced in. Irimia was familiar with the field, having worked on a bystander gene that had several microexons early in his career. During an earlier postdoc at Stanford, he had used downtime on a project studying the evolution of genome structure to develop a first version of his microexon analysis pipeline. “Free time can be a good thing sometimes,” he reflects with a chuckle, although early forays into the fly and C. elegans genomes did not appear to reveal extensive microexon regulation. Working in the Blencowe lab at the University of Toronto, Irimia set out to study alternative splicing in the mammalian nervous system, in search of a neuronal splicing program. He incorporated a refined microexon search module into his analytical pipeline—a laborious move, but one that paid off.
The Technical Approach: Finding Needles In A Haystack
Even for conventional exons, analyzing alternative splicing patterns remains a tricky application for RNA-Seq. Reads that span splice junctions align less efficiently than reads from ordinary sequence, meaning deeper read coverage is required to establish the proportion of transcripts that include an exon of interest. To understand the splicing of conventional exons, Irimia and colleagues had aligned RNA-Seq reads with exon/exon junctions both to an expressed sequence tag (EST) library and to an additional synthetic transcriptome representing all possible combinations of annotated splice donor and acceptor sites. Their method, however, had depended on an exon’s having already been recognized and annotated.
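The proportion of transcripts that include an exon is commonly summarized as "percent spliced in" (PSI), estimated from reads that cross the inclusion and exclusion junctions. The following is a minimal sketch of that arithmetic, not the calculation used in the study: the function name, the averaging of the two inclusion junctions, and the `min_reads` coverage cutoff are all illustrative assumptions.

```python
def percent_spliced_in(inc_up, inc_down, exc, min_reads=10):
    """Estimate percent spliced in (PSI) for one exon from junction read counts.

    inc_up   -- reads spanning the upstream-exon / exon-of-interest junction
    inc_down -- reads spanning the exon-of-interest / downstream-exon junction
    exc      -- reads spanning the upstream-exon / downstream-exon (skipping) junction
    min_reads is a hypothetical coverage cutoff, not a value from the study.
    """
    # Inclusion is supported by two junctions and skipping by one,
    # so average the two inclusion counts before comparing.
    inclusion = (inc_up + inc_down) / 2
    total = inclusion + exc
    if total < min_reads:
        # Too few junction reads to estimate a proportion reliably.
        return None
    return 100 * inclusion / total


print(percent_spliced_in(30, 30, 10))  # well-covered exon
print(percent_spliced_in(2, 2, 1))     # too shallow to call
```

Returning `None` below the cutoff is one way to reflect the point that junction reads are scarce: with only a handful of reads, the estimated proportion swings widely with each additional read.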
For tiny exons that have not been annotated yet, the likelihood of accurate alignment to the genome drops still lower. Most exon/exon junction recognition modules demand at least eight nucleotides of known sequence on either side of the splice junction to identify an exon; many microexons are too short to meet that requirement, and are filtered out as noise. To solve the mapping efficiency problem, Irimia searched within introns to detect every possible pair of splice sites flanking a putative exon of three nucleotides or longer. He built a whole new synthetic transcriptome that incorporated every possible microexon combinatorially. Like the earlier synthetic transcriptome, it was made up of hypothetically possible junctions; unlike it, it did not depend on annotation, since most of these splice sites had so far escaped the notice of bioinformatics.
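The intron search described above can be illustrated with a toy sketch. It is not the vast-tools implementation: it assumes the canonical AG acceptor and GT donor dinucleotides, an upper length bound of 27 nucleotides (nine amino acids), and an eight-nucleotide flank on each side, and every function and parameter name here is hypothetical.

```python
def candidate_microexons(intron, min_len=3, max_len=27):
    """Yield (start, end) for every intronic segment preceded by AG and followed by GT."""
    # start is the first base of the putative exon; the two bases before it
    # must form the acceptor (AG) and the two bases after its end the donor (GT).
    for start in range(2, len(intron) - min_len - 1):
        if intron[start - 2:start] != "AG":
            continue
        for length in range(min_len, max_len + 1):
            end = start + length
            if end + 2 > len(intron):
                break
            if intron[end:end + 2] == "GT":
                yield start, end


def synthetic_junctions(up_exon, intron, down_exon, flank=8, **kwargs):
    """Yield (microexon, junction_sequence) pairs to align reads against."""
    for start, end in candidate_microexons(intron, **kwargs):
        micro = intron[start:end]
        # A read covering the microexon also carries sequence from both
        # neighbouring exons, which supplies the anchor the microexon lacks.
        yield micro, up_exon[-flank:] + micro + down_exon[:flank]


# Made-up sequences for illustration only.
up_exon = "AAACCCGGGTTT"
intron = "GTAAAAAG" + "CATGCC" + "GTAAAAAG"
down_exon = "TTTGGGCCCAAA"

for micro, junction in synthetic_junctions(up_exon, intron, down_exon):
    print(micro, junction)
```

Because each candidate is embedded between the ends of its neighbouring exons, a read can be matched on the combined sequence even when the microexon itself is shorter than the anchor an aligner would normally require.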
This analytical pipeline, available at https://github.com/vastgroup/vast-tools, was first used to analyze RNA-Seq datasets from a variety of tissues, many from Illumina’s Human Bodymap2 project. While comparing neuronal to other tissues, Irimia found a striking relationship between exon length and expression pattern: microexons were much more likely than longer conventional exons to be differentially expressed in the brain. The microexons Irimia’s colleagues followed up on with biochemical studies are unusually likely to encode protein sequence, and they cluster in proteins that regulate synapse formation and cell signaling. They tend to be switched on abruptly, during the terminal neurogenesis stage of development, and to measurably change the binding properties of the proteins that contain them.
The Findings: Microexons in Disease and Vertebrate Evolution
Irimia and his colleagues wondered whether changes in microexon splicing could play a role in autism, a disease with alternative splicing changes that the Blencowe lab had previously studied. Irimia’s wet lab collaborators performed paired-end Illumina sequencing on postmortem samples from autistic and control brains. When he analyzed the results with his pipeline, he found a dramatic downregulation in microexon inclusion in autistic brains.
“These genes are not useless without the microexon,” Irimia stresses. He posits a role for microexons as fine-tuners, tweaking a protein to optimize its function for the nervous system. Therefore, their effect may be very subtle and hard to test. Irimia envisions a hypothetical microexon that affects multiple interactions, but changes the efficiency of many of them subtly; though such a snippet could produce a quantitative change in the tissue where it was expressed, it would be very difficult to identify by traditional biochemical methods.
The Blencowe lab’s findings have shown that microexon regulation has the potential to explain some subtle, genetically intractable diseases and disorders. But at his own recently founded lab at the Center for Genomic Regulation Biology Research Unit in Barcelona, Irimia is now chasing an even deeper question: how microexons have contributed to vertebrate evolution. “These things have been conserved for millions of years,” Irimia says, and he wants to find out exactly why.