J69. | Yoav Mathov, Malka Nissim-Rafinia, Chen Leibson, Nir Galun, Tomas Marques-Bonet, Arye Kandel,
Meir Liebergal, Eran Meshorer and Liran Carmel (2024) Nature Ecology and Evolution.
abstract
Genome-wide premortem DNA methylation patterns can be computationally reconstructed from
high-coverage DNA sequences of ancient samples. Because DNA methylation is more conserved across
species than across tissues, and ancient DNA is typically extracted from bones and teeth, previous
works utilizing ancient DNA methylation maps focused on studying evolutionary changes in the skeletal
system. Here we suggest that DNA methylation patterns in one tissue may, under certain conditions, be
informative on DNA methylation patterns in other tissues of the same individual. Using the fact that
tissue-specific DNA methylation builds up during embryonic development, we identified the conditions
that allow for such cross-tissue inference and devised an algorithm that carries it out. We trained
the algorithm on methylation data from extant species and reached high precisions of up to 0.92 for
validation datasets. We then used the algorithm on archaic humans, and identified more than 1,850
positions for which we were able to observe differential DNA methylation in prefrontal cortex neurons.
These positions are linked to hundreds of genes, many of which are involved in neural functions such as
structural and developmental processes. Six positions are located in the neuroblastoma breakpoint
family (NBPF) gene family, which probably played a role in human brain evolution. The algorithm we
present here allows for the examination of epigenetic changes in tissues and cell types that are
absent from the palaeontological record, and therefore provides new ways to study the evolutionary
impacts of epigenetic changes.
|
J68. | Susanna Sawyer, Pere Gelabert, Benjamin Yakir, Alejandro Llanos-Lizcano, Alessandra Sperduti,
Luca Bondioli, Olivia Cheronet, Christine Neugebauer-Maresch, Maria Teschler-Nicola, Mario Novak,
Ildikó Pap, Ildikó Szikossy, Tamás Hajdu, Vyacheslav Moiseyev, Andrey Gromov, Gunita Zariņa,
Eran Meshorer, Liran Carmel and Ron Pinhasi (2024) Genome Biology 25:261.
abstract
Reconstructing premortem DNA methylation levels in ancient DNA has led to breakthrough studies
such as the prediction of anatomical features of the Denisovan. These studies rely on computationally
inferring methylation levels from damage signals in naturally deaminated cytosines, which requires
expensive high-coverage genomes. Here, we test two methods for direct methylation measurement developed
for modern DNA based on either bisulfite or enzymatic methylation treatments. Bisulfite treatment shows
the least reduction in DNA yields as well as the least biases during methylation conversion, demonstrating
that this method can be successfully applied to ancient DNA.
|
J67. | Critical Assessment of Genome Interpretation Consortium (2024) CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods Genome Biology 25:53.
abstract
Background: The Critical Assessment of Genome Interpretation (CAGI) aims to
advance the state-of-the-art for computational prediction of genetic variant impact,
particularly where relevant to disease. The five complete editions of the CAGI
community experiment comprised 50 challenges, in which participants made blind
predictions of phenotypes from genetic data, and these were evaluated by independent
assessors.
Results: Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic.
Conclusions: Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead. |
J66. | Arielle Barouch, Yoav Mathov, Eran Meshorer, Benjamin Yakir
and Liran Carmel (2024) Reconstructing DNA methylation maps of ancient populations Nucleic Acids Research 52:1602-1612.
abstract
Studying premortem DNA methylation from ancient DNA (aDNA) provides a proxy for
ancient gene activity patterns, and hence valuable information on evolutionary
changes in gene regulation. Due to statistical limitations, current methods to
reconstruct aDNA methylation maps are constrained to high-coverage shotgun samples,
which comprise a small minority of available ancient samples. Most samples are
sequenced using in-situ hybridization capture sequencing which targets a predefined
set of genomic positions. Here, we develop methods to reconstruct aDNA methylation
maps of samples that were not sequenced using high-coverage shotgun sequencing, by
way of pooling together individuals to obtain a DNA methylation map that is
characteristic of a population. We show that the resulting DNA methylation maps
capture meaningful biological information and allow for the detection of differential
methylation across populations. We offer guidelines on how to carry out comparative
studies involving ancient populations, and how to control the rate of falsely
discovered differentially methylated regions. The ability to reconstruct DNA
methylation maps of past populations allows for the development of a whole new
frontier in paleoepigenetic research, tracing DNA methylation changes throughout
human history, using data from thousands of ancient samples.
|
J65. | Lily Agranat-Tamir, Shamam Waldman, Naomi Rosen, Benjamin Yakir,
Shai Carmi and Liran Carmel (2021) LINADMIX: Evaluating the effect of ancient admixture events on modern populations Bioinformatics 37:4744-4755.
abstract
Motivation. The rise in the number of genotyped ancient individuals
provides an opportunity to estimate population admixture models for many
populations. However, in models describing modern populations as mixtures of
ancient ones, it is typically difficult to estimate the model mixing
coefficients and to evaluate its fit to the data.
Results. We present LINADMIX, designed to tackle this problem by solving a constrained linear model when both the ancient and the modern genotypes are represented in a low-dimensional space. LINADMIX estimates the mixing coefficients and their standard errors, and computes a P-value for testing the model fit to the data. We quantified the performance of LINADMIX using an extensive set of simulated studies. We show that LINADMIX can accurately estimate admixture coefficients, and is robust to factors such as population size, genetic drift, proportion of missing data and various types of model misspecification.
Availability and implementation. LINADMIX is available as Python code at https://github.com/swidler/linadmix. |
J64. | Yigal Erel, Ron Pinhasi, Alfredo Coppa, Adi Ticher,
Ofir Tirosh and Liran Carmel
(2021) Lead in Archeological Human Bones Reflecting Historical Changes in Lead Production Environmental Science & Technology 55:14407-14413.
abstract
Forty years ago, in a seminal paper published in Science, Settle and Patterson used
archeological and historical data to estimate the rates of worldwide lead production
since the discovery of cupellation, approximately 5000 years ago. Here, we record actual
lead exposure of a human population by direct measurements of the concentrations of lead
in petrous bones of individuals representing approximately 12 000 years of inhabitation in
Italy. This documentation of lead pollution throughout human history indicates that,
remarkably, much of the estimated dynamics in lead production is replicated in human
exposure. Thus, lead pollution in humans has closely followed anthropogenic lead production.
This observation raises concerns that the forecasted increase in the production of lead and
other metals might affect human health in the near future.
|
J63. | Avigayel Rabin, Michela Zaffagni, Reut Ashwal-Fluss, Ines Lucia Patop, Aarti Jajoo,
Shlomo Shenzis, Liran Carmel and Sebastian Kadener
(2021) SRCP: a comprehensive pipeline for accurate annotation and quantification of circRNAs Genome Biology 22:277.
abstract
Here we describe a new integrative approach for accurate annotation and
quantification of circRNAs named Short Read circRNA Pipeline (SRCP). Our
strategy involves two steps: annotation of validated circRNAs followed by
a quantification step. We show that SRCP is more sensitive than other
individual pipelines and allows for more comprehensive quantification of a
larger number of differentially expressed circRNAs. To facilitate the use of
SRCP, we generate a comprehensive collection of validated circRNAs in five
different organisms, including humans. We then utilize our approach and
identify a subset of circRNAs bound to the miRNA-effector protein AGO2 in
human brain samples.
|
J62. | Yifat S. Oren, Michal Irony-Tur Sinai, Anita Golec, Ofra Barchad-Avitzur,
Venkateshwar Mutyam, Yao Li, Jeong Hong, Efrat Ozeri-Galai, Aurélie Hatton,
Chen Leibson, Liran Carmel, Joel Reiter,
Eric J. Sorscher, Steve D. Wilton, Eitan Kerem, Steven M. Rowe,
Isabelle Sermet-Gaudelus and Batsheva Kerem (2021) Antisense oligonucleotide-based drug development for Cystic Fibrosis patients carrying the 3849+10kb C-to-T splicing mutation Journal of Cystic Fibrosis 2:S1569-1993(21)01287-X.
abstract
Background. Antisense oligonucleotide (ASO)-based drugs for splicing
modulation were recently approved for various genetic diseases with unmet need.
Here we aimed to develop an ASO-based splicing modulation therapy for Cystic
Fibrosis (CF) patients carrying the 3849+10 kb C-to-T splicing mutation
in the CFTR gene.
Methods. We have screened, in FRT cells expressing the 3849+10 kb C-to-T splicing mutation, ~30 2′-O-Methyl-modified phosphorothioate ASOs, targeted to prevent the recognition and inclusion of a cryptic exon generated due to the mutation. The effect of highly potent ASO candidates on the splicing pattern, protein maturation and CFTR function was further analyzed in well differentiated primary human nasal and bronchial epithelial cells, derived from patients carrying at least one 3849+10 kb C-to-T allele.
Results. A highly potent lead ASO, efficiently delivered by free uptake, was able to significantly increase the level of correctly spliced mRNA and completely restore the CFTR function to wild type levels in cells from a homozygote patient. This ASO led to CFTR function with an average of 43% of wild type levels in cells from various heterozygote patients. Optimized efficiency of the lead ASO was further obtained with 2′-Methoxy Ethyl modification (2′MOE).
Conclusion. The highly efficient splicing modulation and functional correction, achieved by free uptake of the selected lead ASO in various patients, demonstrate the ASO therapeutic potential benefit for CF patients carrying splicing mutations and is aimed to serve as the basis for our current clinical development.
Keywords. Cystic fibrosis, Antisense oligonucleotides, Drug development, Splicing modulation, 3849+10 kb C-to-T mutation |
J61. | Yoav Mathov, Daniel Batyrev, Eran Meshorer and Liran Carmel (2020) Harnessing epigenetics to study human evolution Current Opinion in Genetics & Development 62:23-29.
abstract
Recent advances in ancient DNA extraction and high-throughput sequencing technologies enabled the
high-quality sequencing of archaic genomes, including the Neanderthal and the Denisovan. While
comparisons with modern humans revealed both archaic-specific and human-specific sequence changes,
in the absence of gene expression information, understanding the functional implications of such
genetic variations remains a major challenge. To study gene regulation in archaic humans, epigenetic
research comes to our aid. DNA methylation, which is highly correlated with transcription, can be
directly measured in modern samples, as well as reconstructed in ancient samples. This puts DNA
methylation as a natural basis for comparative epigenetics between modern humans, archaic humans
and nonhuman primates.
|
J60. | Lily Agranat-Tamir, Shamam Waldman, Mario Martin, David Gokhman, Nadav Mishol,
Tzilla Eshel, Olivia Cheronet, Nadin Rohland, Swapan Mallick, Nicole Adamski, Ann Marie Lawson, Matthew Mah,
Megan Michel, Jonas Oppenheimer, Kristin Stewardson, Francesca Candilio, Denise Keating, Beatriz Gamarra,
Shay Tzur, Mario Novak, Rachel Kalisher, Shlomit Bechar, Vered Eshed, Douglas J. Kennett, Marina Faerman,
Naama Yahalom-Mack, Janet M Monge, Yehuda Govrin, Yigal Erel, Benjamin Yakir, Ron Pinhasi, Shai Carmi,
Israel Finkelstein, Liran Carmel and David Reich (2020) The Genomic History of the Bronze Age Southern Levant Cell 181:1146-1157.
abstract
We report genome-wide DNA data for 73 individuals from five archaeological sites across the Bronze
and Iron Ages Southern Levant. These individuals, who share the "Canaanite" material culture,
can be modeled as descending from two sources: (1) earlier local Neolithic populations and (2)
populations related to the Chalcolithic Zagros or the Bronze Age Caucasus. The non-local
contribution increased over time, as evinced by three outliers who can be modeled as descendants
of recent migrants. We show evidence that different "Canaanite" groups genetically resemble each
other more than other populations. We find that Levant-related modern populations typically have
substantial ancestry coming from populations related to the Chalcolithic Zagros and the Bronze Age
Southern Levant. These groups also harbor ancestry from sources we cannot fully model with the
available data, highlighting the critical role of post-Bronze-Age migrations into the region
over the past 3,000 years.
|
J59. | David Gokhman, Malka Nissim-Rafinia, Lily Agranat-Tamir, Genevieve Housman,
Raquel García-Pérez, Esther Lizano, Olivia Cheronet, Swapan Mallick,
Maria A. Nieves-Colón, Heng Li, Songül Alpaslan-Roodenberg, Mario Novak,
Hongcang Gu, Jason M. Osinski, Manuel Ferrando-Bernal, Pere Gelabert, Iddi Lipende,
Deus Mjungu, Ivanela Kondova, Ronald Bontrop, Ottmar Kullmer, Gerhard Weber,
Tal Shahar, Mona Dvir-Ginzberg, Marina Faerman, Ellen E. Quillen, Alexander Meissner,
Yonatan Lahav, Leonid Kandel, Meir Liebergall, María E. Prada, Julio M. Vidal,
Richard M. Gronostajski, Anne C. Stone, Benjamin Yakir, Carles Lalueza-Fox, Ron Pinhasi,
David Reich, Tomas Marques-Bonet, Eran Meshorer and
Liran Carmel (2020) Differential DNA methylation of vocal and facial anatomy genes in modern humans Nature Communications 11:1189.
abstract
Changes in potential regulatory elements are thought to be key drivers of phenotypic divergence.
However, identifying changes to regulatory elements that underlie human-specific traits has proven
very challenging. Here, we use 63 reconstructed and experimentally measured DNA methylation maps
of ancient and present-day humans, as well as of six chimpanzees, to detect differentially
methylated regions that likely emerged in modern humans after the split from Neanderthals and
Denisovans. We show that genes associated with face and vocal tract anatomy went through particularly
extensive methylation changes. Specifically, we identify widespread hypermethylation in a network
of face- and voice-associated genes (SOX9, ACAN, COL2A1, NFIX and XYLT1). We propose that these
repression patterns appeared after the split from Neanderthals and Denisovans, and that they might
have played a key role in shaping the modern human face and vocal tract.
|
J58. | Daniel Batyrev, Elisheva Lapid, Liran Carmel and Eran Meshorer (2020) Predicted Archaic 3D Genome Organization Reveals Genes Related to Head and Spinal Cord Separating Modern from Archaic Humans Cells 9:48.
abstract
High coverage sequences of archaic humans enabled the reconstruction of their DNA methylation patterns.
This allowed comparing gene regulation between human groups, and linking such regulatory changes to
phenotypic differences. In a previous work, a detailed comparison of DNA methylation in modern humans,
archaic humans, and chimpanzees revealed 873 modern human-derived differentially methylated regions (DMRs).
To understand the regulatory implications of these DMRs, we defined differentially methylated genes (DMGs)
as genes that harbor DMRs in their promoter or gene body. While most of the modern human-derived DMRs could
be linked to DMGs, many others remained unassigned. Here, we used information on 3D genome organization to
link ~70 out of the remaining 288 unassigned DMRs to genes. Combined with the previously identified DMGs,
we reinforce the enrichment of these genes with vocal and facial anatomy, and additionally find significant
enrichment with the spinal column, chin, hair, and scalp. These results reveal the importance of 3D genomic
organization in understanding gene regulation by DNA methylation.
Keywords: ancient DNA; epigenetics; DNA methylation; genome organization; gene regulation; archaic humans; comparative epigenomics |
J57. | Fouad Zahdeh and Liran Carmel (2019) Nucleotide composition affects codon usage toward the 3'-end PLoS ONE 14:e0225633.
abstract
The 3'-end of the coding sequence in several species is known to show specific codon usage bias.
Several factors have been suggested to underlie this phenomenon, including selection against
translation efficiency, selection for translation accuracy, and selection against RNA folding.
All are supported by some evidence, but there is no general agreement as to which factors are
the main determinants. Nor is it known how universal this phenomenon is, and whether the same
factors explain it in different species. To answer these questions, we developed a measure that
quantifies the codon usage bias at the gene end, and used it to compute this bias for 91 species
that span the three domains of life. In addition, we characterized the codons in each species by
features that allow discrimination between the different factors. Combining all these data, we
were able to show that there is a universal trend to favor AT-rich codons toward the gene end.
Moreover, we suggest that this trend is explained by avoidance of forming RNA secondary
structures around the stop codon, which may interfere with normal translation termination.
|
J56. | David Gokhman, Nadav Mishol, Marc de Manuel, David de Juan, Jonathan Shuqrun,
Eran Meshorer, Tomas Marques-Bonet, Yoel Rak and Liran Carmel (2019) Reconstructing Denisovan Anatomy Using DNA Methylation Maps Cell 179:180-192.
abstract
Denisovans are an extinct group of humans whose morphology remains unknown. Here, we present a
method for reconstructing skeletal morphology using DNA methylation patterns. Our method is based
on linking unidirectional methylation changes to loss-of-function phenotypes. We tested
performance by reconstructing Neanderthal and chimpanzee skeletal morphologies and obtained
>85% precision in identifying divergent traits. We then applied this method to the Denisovan
and offer a putative morphological profile. We suggest that Denisovans likely shared with
Neanderthals traits such as an elongated face and a wide pelvis. We also identify Denisovan-derived
changes, such as an increased dental arch and lateral cranial expansion. Our predictions match
the only morphologically informative Denisovan bone to date, as well as the Xuchang skull, which
was suggested by some to be a Denisovan. We conclude that DNA methylation can be used to
reconstruct anatomical features, including some that do not survive in the fossil record.
|
J55. | Stephen M. Mount, Ziga Avsec, Liran Carmel,
Rita Casadio, Muhammed Hasan Çelik, Ken Chen, Jun Cheng, Noa E. Cohen,
William G. Fairbrother, Tzila Fenesh, Julien Gagneur, Valer Gotea, Tamar Holzer,
Chiao-Feng Lin, Pier Luigi Martelli, Tatsuhiko Naito, Thi Yen Duong Nguyen,
Castrense Savojardo, Ron Unger, Robert Wang, Yuedong Yang and Huiying Zhao (2019) Assessing predictions of the impact of variants on splicing in CAGI5 Human Mutation 40:1215-1224.
abstract
Precision medicine and sequence-based clinical diagnostics seek to predict
disease risk or to identify causative variants from sequencing data. The
Critical Assessment of Genome Interpretation (CAGI) is a community
experiment consisting of genotype-phenotype prediction challenges;
participants build models, undergo assessment, and share key findings. In
the past, few CAGI challenges have addressed the impact of sequence variants
on splicing. In CAGI 5, two challenges (Vex-seq and MaPSY) involved prediction
of the effect of variants, primarily single nucleotide changes, on splicing.
Although there are significant differences between these two challenges, both
involved prediction of results from high-throughput exon-inclusion assays. Here,
we discuss the methods used to predict the impact of these variants on splicing,
their performance, strengths and weaknesses, and prospects for predicting the
impact of sequence variation on splicing and disease phenotypes.
|
J54. | Shelly Mahlab-Aviv, Ayub Boulos, Ayelet R. Peretz, Tsiona Eliyahu,
Liran Carmel, Ruth Sperling and Michal Linial (2018) Small RNA sequences derived from pre-microRNAs in the supraspliceosome Nucleic Acids Research 46:11014-11029.
abstract
MicroRNAs (miRNAs) are short non-coding RNAs that negatively regulate the expression and
translation of genes in healthy and diseased tissues. Herein, we characterize short RNAs
from human HeLa cells found in the supraspliceosome, a nuclear dynamic machine in which
pre-mRNA processing occurs. We sequenced small RNAs (<200 nt) extracted from the
supraspliceosome, and identified sequences that are derived from 200 miRNA genes. About
three quarters of them are mature miRNAs, whereas the rest account for various defined
regions of the pre-miRNA, and its hairpin-loop precursor. Out of these aligned sequences,
53 were undetected in cellular extract, and the abundance of additional 48 strongly differed
from that in cellular extract. Notably, we describe seven abundant miRNA-derived sequences
that overlap non-coding exons of their host gene. The rich collection of sequences identical
to pre-miRNAs at the supraspliceosome suggests overlooked nuclear functions. Specifically,
the abundant hsa-mir-99b may affect splicing of LINC01129 primary transcript through base-pairing
with its exon-intron junction. Using suppression and overexpression experiments, we show that
hsa-mir-7704 negatively regulates the level of the lncRNA HAGLR. We claim that in cases of
extended base-pairing complementarity, such supraspliceosomal pre-miRNA sequences might have a
role in transcription attenuation, maturation and processing.
|
J53. | Eitan Lavi and Liran Carmel (2018) Alu exaptation enriches the human transcriptome by introducing new gene ends RNA Biology 15:715-725.
abstract
In mammals, transposable elements are largely silenced, but under fortuitous circumstances may
be co-opted to play a functional role. Here, we show that when Alu elements are inserted within
or nearby genes in sense orientation, they may contribute to the transcriptome diversity by
forming new cleavage and polyadenylation sites. We mapped these new gene ends in human onto the
Alu sequence and identified three hotspots of cleavage and polyadenylation site formation.
Interestingly, the native Alu sequence does not contain any canonical polyadenylation signal. We
therefore studied what evolutionary processes might explain the formation of these specific
hotspots of novel gene ends. We show that two of the three hotspots might have emerged from
mutational processes that turned sequences that resemble polyadenylation signals into full-blown
canonical signals, whereas one hotspot is tightly linked to the process of Alu insertion into the
genome. Overall, Alu elements may lie behind the formation of 302 new gene end variants, affecting
a total of 243 genes. Intergenic Alu elements may elongate genes by creating a downstream cleavage
site, intronic Alu elements may lead to gene variants which code for truncated proteins, and 3'UTR
Alu elements may result in gene variants with alternative 3'UTR.
Keywords: Alu elements, exaptation, polyadenylation signals, nicking signals, gene-end, transcriptome repertoire |
J52. | Bronwyn A. Lucas, Eitan Lavi, Lily Shiue, Hana Cho, Sol Katzman, Keita Miyoshi, Mikiko C. Siomi,
Liran Carmel, Manuel Ares, Jr. and Lynne E. Maquat (2018) Evidence for convergent evolution of SINE-directed Staufen-mediated mRNA decay PNAS 115:968-973.
abstract
Primate-specific Alu short interspersed elements (SINEs) as well as rodent-specific B
and ID (B/ID) SINEs can promote Staufen-mediated decay (SMD) when present in mRNA
3'-untranslated regions (3'-UTRs). The transposable nature of SINEs, their presence in
long noncoding RNAs, their interactions with Staufen, and their rapid divergence in different
evolutionary lineages suggest they could have generated substantial modification of
posttranscriptional gene-control networks during mammalian evolution. Some of the variation
in SMD regulation produced by SINE insertion might have had a similar regulatory effect in
separate mammalian lineages, leading to parallel evolution of the Staufen network by independent
expansion of lineage-specific SINEs. To explore this possibility, we searched for orthologous
gene pairs, each carrying a species-specific 3'-UTR SINE and each regulated by SMD, by measuring
changes in mRNA abundance after individual depletion of two SMD factors, Staufen1 (STAU1) and UPF1,
in both human and mouse myoblasts. We identified and confirmed orthologous gene pairs with 3'-UTR
SINEs that independently function in SMD control of myoblast metabolism. Expanding to other species,
we demonstrated that SINE-directed SMD likely emerged in both primate and rodent lineages >20-25
million years ago. Our work reveals a mechanism for the convergent evolution of posttranscriptional
gene regulatory networks in mammals by species-specific SINE transposition and SMD.
|
J51. | David Gokhman, Anat Malul and Liran Carmel (2017) Inferring past environments from ancient epigenomes Molecular Biology and Evolution 34:2429-2438.
abstract
Analyzing the conditions in which past individuals lived is key to understanding the
environments and cultural transitions to which humans had to adapt. Here, we suggest a
methodology to probe into past environments, using reconstructed pre-mortem DNA methylation
maps of ancient individuals. We review a large body of research showing that differential DNA
methylation is associated with changes in various external and internal factors, and propose
that loci whose DNA methylation level is environmentally-responsive could serve as markers to
infer about ancient daily life, diseases, nutrition, exposure to toxins and more. We demonstrate
this approach by showing that hunger-related DNA methylation changes are found in ancient
hunter-gatherers. The strategy we present here opens a window to reconstruct previously inaccessible
aspects of the lives of past individuals.
|
J50. | Michal Chorev, Alan Joseph Bekker, Jacob Goldberger and
Liran Carmel (2017) Identification of introns harboring functional sequence elements through positional conservation Scientific Reports 7:4201.
abstract
Many human introns carry out a function, in the sense that they are critical to maintain
normal cellular activity. Their identification is fundamental to understanding cellular
processes and disease. However, being noncoding elements, such functional introns are poorly
predicted based on traditional approaches of sequence and structure conservation. Here, we
generated a dataset of human functional introns that carry out different types of functions.
We showed that functional introns share common characteristics, such as higher positional
conservation along the coding sequence and reduced loss rates, regardless of their specific
function. A unique property of the data is that if an intron is not known to be functional,
this does not mean that it is indeed non-functional. We developed a probabilistic framework
that explicitly accounts for this unique property, and predicts which specific human introns are
functional. We show that we successfully predict function even when the algorithm is trained on
introns with a different type of function. This ability has many implications in studying
regulatory networks, gene regulation, the effect of mutations outside exons on human disease,
and on our general understanding of intron evolution and their functional exaptation in mammals.
Keywords: Intron evolution, intron function, uncertain labels, intron positional conservation, gene architecture |
J49. | David Gokhman, Guy Kelman, Adir Amartely, Guy Gershon, Shira Tsur
and Liran Carmel (2017) Gene ORGANizer: Linking Genes to the Organs They Affect Nucleic Acids Research 45(W1) (Web server issue):W138-W145.
abstract
One of the biggest challenges in studying how genes work is understanding their
effect on the physiology and anatomy of the body. Existing tools try to address
this using indirect features, such as expression levels and biochemical pathways.
Here, we present Gene ORGANizer (geneorganizer.huji.ac.il), a phenotype-based tool
that directly links human genes to the body parts they affect. It is built upon an
exhaustive curated database that links >7000 genes to ~150 anatomical parts using
>150 000 gene-organ associations. The tool offers user-friendly platforms to analyze
the anatomical effects of individual genes, and identify trends within groups of genes.
We demonstrate how Gene ORGANizer can be used to make new discoveries, showing that
chromosome X is enriched with genes affecting facial features, that positive selection
targets genes with more constrained phenotypic effects, and more. We expect Gene
ORGANizer to be useful in a variety of evolutionary, medical and molecular studies
aimed at understanding the phenotypic effects of genes.
|
J48. | Fernando Racimo, David Gokhman, Matteo Fumagalli, Amy Ko, Torben Hansen, Ida Moltke,
Anders Albrechtsen, Liran Carmel, Emilia Huerta-Sánchez
and Rasmus Nielsen (2017) Archaic adaptive introgression in TBX15/WARS2 Molecular Biology and Evolution 34:509-524.
abstract
A recent study conducted the first genome-wide scan for selection in Inuit from Greenland
using SNP chip data. Here, we report that selection in the region with the second most extreme
signal of positive selection in Greenlandic Inuit favored a deeply divergent haplotype that is
closely related to the sequence in the Denisovan genome, and was likely introgressed from an
archaic population. The region contains two genes, WARS2 and TBX15, and has previously been
associated with body-fat distribution in humans. We show that the adaptively introgressed allele
has been under selection in a much larger geographic region than just Greenland. Furthermore, it
is associated with changes in expression of WARS2 and TBX15 in multiple tissues including the adrenal
gland and subcutaneous adipose tissue.
|
J47. | Topaz Halperin, Liran Carmel and Dror Hawlena (2017) Movement correlates of lizards' dorsal pigmentation patterns Functional Ecology 31:370-376.
abstract
Understanding the ecological function of an animal's pigmentation pattern is an intriguing research
challenge. We used quantitative information on lizard foraging behavior to search for movement
correlates of patterns across taxa. We hypothesized that noticeable longitudinal stripes that enhance
escape by motion-dazzle are advantageous for mobile foragers that are highly detectable against the
stationary background. Cryptic pigmentation patterns are beneficial for less-mobile foragers that
rely on camouflage to reduce predation. Using an extensive literature survey and
phylogenetically-controlled analyses, we found that striped lizards were substantially more mobile
than lizards with cryptic patterns. The percent of time spent moving was the major behavioral index
responsible for this difference. We provide empirical support for the hypothesized association between
lizard dorsal pigmentation patterns and foraging behavior. Our simple yet comprehensive explanation may
be relevant to many other taxa that present variation in body pigmentation patterns.
|
J46. | Fouad Zahdeh and Liran Carmel (2016) The role of nucleotide composition in premature termination codon recognition BMC Bioinformatics 17:519.
abstract
Background. It is not fully understood how a termination codon is recognized as
premature (PTC) by the nonsense-mediated decay (NMD) machinery. This is particularly true
for transcripts lacking an exon junction complex (EJC) along their 3' untranslated region
(3'UTR), which are therefore degraded through the EJC-independent NMD pathway.
Results. Here, we analyzed data of transcript stability change following NMD repression and identified over 200 EJC-independent NMD-targets. We examined many features characterizing these transcripts, and compared them to NMD-insensitive transcripts, as well as to a group of transcripts that are destabilized following NMD repression (destabilized transcripts).
Conclusions. We found that none of the known NMD-triggering features, such as the presence of upstream open reading frames, significantly characterizes EJC-independent NMD-targets. Instead, we saw that NMD-targets are strongly enriched with G nucleotides upstream of the termination codon, and even more so along their 3'UTR. We suggest that high G content around the termination codon impedes translation termination as a result of mRNA folding, thus triggering NMD. We also suggest that high G content in the 3'UTR helps to activate NMD by allowing for the accumulation of UPF1, or other NMD-promoting proteins, along the 3'UTR.
Keywords. Nonsense-mediated decay (NMD), EJC-independent NMD, NMD-triggering features, Stop codon GC content, Stop codon nucleotide composition, RNA secondary structure, Exon junction complex (EJC), Transcription termination
|
J45. | Ruxandra Covacu, Hagit Philip, Merja Jaronen, Jorge Almeida, Jessica Kenison, Samuel Darko,
Chun-Cheih Chao, Gur Yaari, Yoram Louzoun, Liran Carmel,
Daniel C. Douek, Sol Efroni and Francisco J. Quintana (2016) System-wide analysis of the T-cell response Cell Reports 14:2733-2744.
abstract
The T cell receptor (TCR) controls the cellular adaptive immune response to antigens, but our
understanding of TCR repertoire diversity and response to challenge is still incomplete. For
example, TCR clones shared by different individuals with minimal alteration to germline gene
sequences (public clones) are detectable in all vertebrates, but their significance is unknown.
Although small in size, the zebrafish TCR repertoire is controlled by processes similar to those
operating in mammals. Thus, we studied the zebrafish TCR repertoire and its response to stimulation
with self and foreign antigens. We found that cross-reactive public TCRs dominate the T cell response,
endowing a limited TCR repertoire with the ability to cope with diverse antigenic challenges. These
features of vertebrate public TCRs might provide a mechanism for the rapid generation of protective
T cell immunity, allowing a short temporal window for the development of more specific private T cell
responses.
|
J44. | Ranit Jaron, Nuphar Rosenfeld, Fouad Zahdeh, Shai Carmi, Liana Beni-Adani, Reeval Segel,
Sharon Zeligson, Liran Carmel, Paul Renbaum and Ephrat Levy-Lahad
(2016) Expanding the phenotype of CRB2 mutations - A new ciliopathy syndrome? Clinical Genetics 90:540-544.
abstract
Recessive CRB2 mutations were recently reported to cause both steroid resistant nephrotic
syndrome and prenatal onset ventriculomegaly with kidney disease. We report two Ashkenazi
Jewish siblings clinically diagnosed with ciliopathy. Both presented with severe congenital
hydrocephalus and mild urinary tract anomalies. One affected sibling also has lung hypoplasia
and heart defects. Exome sequencing and further CRB2 analysis revealed that both siblings are
compound heterozygotes for CRB2 mutations p.N800K and p.Gly1036Alafs*43, and heterozygous for
a deleterious splice variant in the ciliopathy gene TTC21B. CRB2 is a polarity protein which
plays a role in ciliogenesis and ciliary function. Biallelic CRB2 mutations in animal models
result in phenotypes consistent with ciliopathy. This report expands the phenotype of CRB2
mutations to include lung hypoplasia and uretero-pelvic renal anomalies, and confirms cardiac
malformation as a feature. We suggest that CRB2-associated disease is a new ciliopathy syndrome
with possible digenic/triallelic inheritance, as observed in other ciliopathies. Clinically, CRB2
should be assessed when ciliopathy is suspected, especially in Ashkenazi Jews, where we found that
p.N800K carrier frequency is 1/64. Patients harboring CRB2 mutations should be tested for the full
range of ciliopathy manifestations.
|
J43. | David Gokhman, Eran Meshorer and Liran Carmel (2016) Epigenetics: it's getting old. Past meets future in paleoepigenetics Trends in Ecology and Evolution 31:290-300.
abstract
Recent years have witnessed the rise of ancient DNA (aDNA) technology, allowing comparative genomics to
be carried out at unprecedented time resolution. While it is relatively straightforward to use aDNA to
identify recent genomic changes, it is much less clear how to utilize it to study changes in epigenetic regulation.
Here we review recent works demonstrating that highly degraded aDNA still contains sufficient information to allow
reconstruction of epigenetic signals, including DNA methylation and nucleosome positioning maps. We discuss
challenges arising from the tissue specificity of epigenetics, and show how some of them might in fact turn into
advantages. Finally, we introduce a method to infer methylation states in tissues that do not tend to be preserved
over time.
|
J42. | Michal Chorev, Lotem Guy and Liran Carmel (2016) JuncDB: an exon-exon junction database Nucleic Acids Research 44(D1) (Database issue):D101-D109.
abstract
Intron positions upon the mRNA transcript are sometimes remarkably conserved even across distantly
related eukaryotic species. This has made the comparison of intron-exon architectures across orthologous
transcripts a very useful tool for studying various evolutionary processes. Moreover, the wide range of
functions associated with introns may confer biological meaning to evolutionary changes in gene
architectures. Yet, there is currently no database that offers such comparative information. Here, we
present JuncDB (http://juncdb.carmelab.huji.ac.il/), an exon-exon junction database dedicated to the
comparison of architectures between orthologous transcripts. It covers nearly 40,000 sets of orthologous
transcripts spanning 88 eukaryotic species. JuncDB offers a user-friendly interface, access to detailed
information, instructive graphical displays of the comparative data and easy ways to download data to a
local computer. In addition, JuncDB allows the analysis to be carried out either on specific genes, or at
a genome-wide level for any selected group of species.
|
J41. | Liron Levin, Dan Bar-Yaacov, Amos Bouskila, Michal Chorev, Liran Carmel
and Dan Mishmar (2015) LEMONS - A tool for the identification of splice junctions in transcriptomes of organisms lacking reference genomes PLoS ONE 10:e0143329.
abstract
RNA-seq is becoming a preferred tool for genomics studies of model and non-model organisms. However,
DNA-based analysis of organisms lacking sequenced genomes cannot rely on RNA-seq data alone to isolate
most genes of interest, as DNA codes both exons and introns. With this in mind, we designed a novel tool,
LEMONS, that exploits the evolutionary conservation of both exon/intron boundary positions and splice junction
recognition signals to produce high throughput splice-junction predictions in the absence of a reference genome.
When tested on multiple annotated vertebrate mRNA data, LEMONS accurately identified 87% (average) of the
splice-junctions. LEMONS was then applied to our updated Mediterranean chameleon transcriptome, which lacks
a reference genome, and predicted a total of 90,820 exon-exon junctions. We experimentally verified these
splice-junction predictions by amplifying and sequencing twenty randomly selected genes from chameleon DNA
templates. Exons and introns were detected in 19 of 20 of the positions predicted by LEMONS. To the best of our
knowledge, LEMONS is currently the only experimentally verified tool that can accurately predict splice-junctions
in organisms that lack a reference genome.
|
J40. | Ariella Weinberg-Shukron, Abdulsalam Abu-Libdeh, Fouad Zahdeh, Liran Carmel,
Aviram Kogot-Levin, Lara Kamal, Moien Kanaan, Sharon Zeligson, Paul Renbaum, Ephrat Levy-Lahad and David Zangen (2015) Combined mineralocorticoid and glucocorticoid deficiency is caused by a novel founder nicotinamide nucleotide transhydrogenase mutation that alters mitochondrial morphology and increases oxidative stress Journal of Medical Genetics 52:636-641.
abstract
Background: Familial glucocorticoid deficiency (FGD) reflects specific
failure of adrenocortical glucocorticoid production in response to adrenocorticotropic hormone (ACTH).
Most cases are caused by mutations encoding ACTH-receptor components (MC2R, MRAP) or the general
steroidogenesis protein (StAR). Recently, nicotinamide nucleotide transhydrogenase (NNT) mutations were
found to cause FGD through a postulated mechanism resulting from decreased detoxification of reactive
oxygen species (ROS) in adrenocortical cells.
Methods and Results: In a consanguineous Palestinian family with combined mineralocorticoid and glucocorticoid deficiency, whole-exome sequencing revealed a novel homozygous NNT_c.598 G>A, p.G200S, mutation. Another affected, unrelated Palestinian child was also homozygous for NNT_p.G200S. Haplotype analysis showed this mutation is ancestral; carrier frequency in ethnically matched controls is 1/200. Assessment of patient fibroblasts for ROS production, ATP content and mitochondrial morphology showed that biallelic NNT mutations result in increased levels of ROS, lower ATP content and morphological mitochondrial defects.
Conclusions: This report of a novel NNT mutation, p.G200S, expands the phenotype of NNT mutations to include mineralocorticoid deficiency. We provide the first patient-based evidence that NNT mutations can cause oxidative stress and both phenotypic and functional mitochondrial defects. These results directly demonstrate the importance of NNT to mitochondrial function in the setting of adrenocortical insufficiency.
|
J39. | David Gokhman, Eitan Lavi, Kay Prüfer, Mario F. Fraga, José A. Riancho,
Janet Kelso, Svante Pääbo,
Eran Meshorer and Liran Carmel (2014) Reconstructing the DNA methylation maps of the Neandertal and the Denisovan Science 344:523-527.
abstract
Ancient DNA sequencing has recently provided high-coverage archaic human genomes.
However, the evolution of epigenetic regulation along the human lineage remains largely
unexplored. We reconstructed the full DNA methylation maps of the Neandertal and the
Denisovan by harnessing the natural degradation processes of methylated and unmethylated
cytosines. Comparing these ancient methylation maps to those of present-day humans, we
identified ~2000 differentially methylated regions (DMRs). Particularly, we found substantial
methylation changes in the HOXD cluster that may explain anatomical differences between archaic
and present-day humans. Additionally, we found that DMRs are significantly more likely to be
associated with diseases. This study provides insight into the epigenetic landscape of our
closest evolutionary relatives, and opens a window to explore the epigenomes of extinct species.
|
J38. | Michal Chorev and Liran Carmel (2013) Computational identification of functional introns: high positional conservation of introns that harbor RNA genes Nucleic Acids Research 41:5604-5613.
abstract
An appreciable fraction of introns is thought to have some function, but there is no obvious way to predict
which specific intron is likely to be functional. We hypothesize that functional introns experience a
different selection regime than non-functional ones and will therefore show distinct evolutionary histories.
In particular, we expect functional introns to be more resistant to loss, and that this would be reflected
in high conservation of their position with respect to the coding sequence. To test this hypothesis, we
focused on introns whose function comes about from microRNAs and snoRNAs that are embedded within their
sequence. We built a data set of orthologous genes across 28 eukaryotic species, reconstructed the
evolutionary histories of their introns and compared functional introns with the rest of the introns. We
found that, indeed, the position of microRNA- and snoRNA-bearing introns is significantly more conserved.
In addition, we found that both families of RNA genes settled within introns early during metazoan evolution.
We identified several easily computable intronic properties that can be used to detect functional introns in
general, thereby suggesting a new strategy to pinpoint non-coding cellular functions.
|
J37. | Liran Carmel, Eugene V. Koonin and Stella Dracheva (2012) Dependencies among Editing Sites in Serotonin 2C Receptor mRNA PLoS Computational Biology 8:e1002663.
abstract
The serotonin 2C receptor (5-HT2CR) - a key regulator of diverse neurological processes - exhibits functional
variability derived from editing of its pre-mRNA by site-specific adenosine deamination (A-to-I pre-mRNA
editing) in five distinct sites. Here we describe a statistical technique that was developed for analysis
of the dependencies among the editing states of the five sites. The statistical significance of the
observed correlations was estimated by comparing editing patterns in multiple individuals. For both human
and rat 5-HT2CR, the editing states of the physically proximal sites A and B were found to be strongly
dependent. In contrast, the editing states of sites C and D, which are also physically close, seem not to
be directly dependent but instead are linked through the dependencies on sites A and B, respectively. We
observed pronounced differences between the editing patterns in humans and rats: in humans site A is the key
determinant of the editing state of the other sites, whereas in rats this role belongs to site B. The structure
of the dependencies among the editing sites is notably simpler in rats than it is in humans implying more
complex regulation of 5-HT2CR editing and, by inference, function in the human brain. Thus, exhaustive
statistical analysis of the 5-HT2CR editing patterns indicates that the editing state of sites A and B is
the primary determinant of the editing states of the other three sites, and hence the overall editing pattern.
Taken together, these findings allow us to propose a mechanistic model of concerted action of ADAR1 and ADAR2
in 5-HT2CR editing. The statistical approach developed here can be applied to other cases of interdependencies among
modification sites in RNA and proteins.
|
J36. | Inbal Avraham-Davidi, Y. Ely, V.N. Pham, D. Castranova, M. Grunspan, G. Malkinson, L. Gibbs-Bar, O. Mayseless, G. Allmog,
B. Lo, C.M. Warren, T.T. Chen, J. Ungos, K. Kidd, K. Shaw, I. Rogachev, W. Wan, P.M. Murphy, S.A. Farber,
Liran Carmel, G.S. Shelness, M.L. Iruela-Arispe, Brant M. Weinstein
and Karina Yaniv (2012) ApoB-containing lipoproteins regulate angiogenesis by modulating expression of VEGF receptor 1 Nature Medicine 18:967-973.
abstract
Despite the clear major contribution of hyperlipidemia to the prevalence of cardiovascular
disease in the developed world, the direct effects of lipoproteins on endothelial cells have
remained obscure and are under debate. Here we report a previously uncharacterized mechanism
of vessel growth modulation by lipoprotein availability. Using a genetic screen for vascular
defects in zebrafish, we initially identified a mutation, stalactite (stl), in the gene
encoding microsomal triglyceride transfer protein (mtp), which is involved in the biosynthesis
of apolipoprotein B (ApoB)-containing lipoproteins. By manipulating lipoprotein concentrations
in zebrafish, we found that ApoB negatively regulates angiogenesis and that it is the ApoB
protein particle, rather than lipid moieties within ApoB-containing lipoproteins, that is
primarily responsible for this effect. Mechanistically, we identified downregulation of
vascular endothelial growth factor receptor 1 (VEGFR1), which acts as a decoy receptor for VEGF,
as a key mediator of the endothelial response to lipoproteins, and we observed VEGFR1
downregulation in hyperlipidemic mice. These findings may open new avenues for the treatment
of lipoprotein-related vascular disorders.
|
J35. | Igor B. Rogozin, Liran Carmel, Miklos Csuros and Eugene V. Koonin (2012) Origin and evolution of spliceosomal introns Biology Direct 7:11.
abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of
long-standing, intensive debate. The introns-early concept, later rebranded
'introns first' held that protein-coding genes were interrupted by numerous
introns even at the earliest stages of life's evolution and that introns played
a major role in the origin of proteins by facilitating recombination of
sequences coding for small protein/peptide modules. The introns-late concept
held that introns emerged only in eukaryotes and new introns have been
accumulating continuously throughout eukaryotic evolution. Analysis of
orthologous genes from completely sequenced eukaryotic genomes revealed numerous
shared intron positions in orthologous genes from animals and plants and even
between animals, plants and protists, suggesting that many ancestral introns have
persisted since the last eukaryotic common ancestor (LECA). Reconstructions of
intron gain and loss using the growing collection of genomes of diverse eukaryotes
and increasingly advanced probabilistic models convincingly show that the LECA and
the ancestors of each eukaryotic supergroup had intron-rich genes, with intron
densities comparable to those in the most intron-rich modern genomes such as those
of vertebrates. The subsequent evolution in most lineages of eukaryotes involved
primarily loss of introns, with only a few episodes of substantial intron gain that
might have accompanied major evolutionary innovations such as the origin of metazoa.
The original invasion of self-splicing Group II introns, presumably originating from
the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have
been a key factor of eukaryogenesis that in particular triggered the origin of
endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative
splicing, a major contribution to the biological complexity of multicellular eukaryotes.
There is no indication that any prokaryote has ever possessed a spliceosome or introns
in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus,
the introns-first scenario is not supported by any evidence but exon-intron structure of
protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and
introns were a major factor of evolution throughout the history of eukaryotes.
|
J34. | Michal Chorev and Liran Carmel (2012) The function of introns Frontiers in Genetics 3:55.
abstract
The intron-exon architecture of many eukaryotic genes raises the intriguing question
of whether this unique organization serves any function, or is simply a result of
the spread of functionless introns in eukaryotic genomes. In this review, we show that
introns in contemporary species fulfill a broad spectrum of functions, and are involved
in virtually every step of mRNA processing. We propose that this great diversity of
intronic functions supports the notion that introns were indeed selfish elements in
early eukaryotes, but then independently gained numerous functions in different
eukaryotic lineages. We suggest a novel criterion of evolutionary conservation,
dubbed intron positional conservation, which can identify functional introns.
|
J33. | Noa E. Cohen, Roy Shen and Liran Carmel (2012) The role of reverse-transcriptase in intron gain and loss mechanisms Molecular Biology and Evolution 29:179-186.
abstract
Intron density is highly variable across eukaryotic species. It seems that
different lineages have experienced considerably different levels of intron
gain and loss events, but the reasons for this are not well-known. A large number
of mechanisms for intron loss and gain have been suggested, and most of them
have at least some level of indirect support. We therefore reasoned that the
variability in intron density may reflect the fact that different
mechanisms are active in different lineages. Quite a number of these putative
mechanisms, both for intron loss and for intron gain, postulate that the enzyme
reverse-transcriptase has a key role in the process. In this paper we lay out
three predictions whose confirmation or falsification would indicate the
involvement of reverse-transcriptase in intron gain and loss processes. Testing
these predictions requires data on the intron gain and loss rates of individual
genes along different branches of the eukaryotic phylogenetic tree. So far, such
rates could not be computed, and hence these predictions could not be rigorously
evaluated. Here, we use a maximum likelihood algorithm that we have devised in the past,
EREM, which allows the estimation of such rates. Using this algorithm, we computed the
intron loss and gain rates of more than 300 genes, in each branch of the phylogenetic
tree of 19 eukaryotic species. Based on that, we found only little support for
reverse-transcriptase activity in intron gain. In contrast, we suggest that
reverse-transcriptase-mediated intron loss is a mechanism that is very efficient in
removing introns, and thus its levels of activity may be a major determinant of intron
number. Moreover, we found that intron gain and loss rates are negatively correlated in
intron-poor species, but are positively correlated in intron-rich species. One
explanation for this is that intron gain and loss mechanisms in intron-rich species (like
metazoans) share a common mechanistic component, albeit not a reverse-transcriptase.
|
J32. | David Zangen, Yotam Kaufman, Sharon Zeligson, Shira Perlberg, Hila Fridman, Moein Kanaan, Maha Abdulhadi-Atwan,
Abdulsalam Abu Libdeh, Ayal Gussow, Irit Kisslov, Liran Carmel,
Paul Renbaum and Ephrat Levy-Lahad (2011) XX Ovarian Dysgenesis Is Caused by a PSMC3IP/HOP2 Mutation that Abolishes Coactivation of Estrogen-Driven Transcription The American Journal of Human Genetics 89:572-579.
abstract
XX female gonadal dysgenesis (XX-GD) is a rare, genetically heterogeneous disorder characterized
by lack of spontaneous pubertal development, primary amenorrhea, uterine hypoplasia, and
hypergonadotropic hypogonadism as a result of streak gonads. Most cases are unexplained but
thought to be autosomal recessive. We elucidated the genetic basis of XX-GD in a highly
consanguineous Palestinian family by using homozygosity mapping and candidate-gene and
whole-exome sequencing. Affected females were homozygous for a 3 bp deletion (NM_016556.2,
c.600_602del) in the PSMC3IP gene, leading to deletion of a glutamic acid residue (p.Glu201del)
in the highly conserved C-terminal acidic domain. Proteasome 26S subunit, ATPase, 3-Interacting
Protein (PSMC3IP)/Tat Binding Protein Interacting Protein (TBPIP) is a nuclear, tissue-specific
protein with multiple functions. It is critical for meiotic recombination as indicated by the
known role of its yeast ortholog, Hop2. Through the C terminus (not present in yeast), PSMC3IP
also coactivates ligand-driven transcription mediated by estrogen, androgen, glucocorticoid,
progesterone, and thyroid nuclear receptors. In cell lines, the p.Glu201del mutation abolished
PSMC3IP activation of estrogen-driven transcription. Impaired estrogenic signaling can lead to
ovarian dysgenesis both by affecting the size of the follicular pool created during fetal
development and by failing to counteract follicular atresia during puberty. PSMC3IP joins
previous genes known to be mutated in XX-GD, the FSH receptor, and BMP15, highlighting the
importance of hormonal signaling in ovarian development and maintenance and suggesting a common
pathway perturbed in isolated XX-GD. By analogy to other XX-GD genes, PSMC3IP is also a candidate
gene for premature ovarian failure, and its role in folliculogenesis should be further investigated.
|
J31. | John K. Colbourne, ... , Liran Carmel, ... and Jeffrey L. Boore (2011) The ecoresponsive genome of Daphnia pulex Science 331:555-561.
abstract
We describe the draft genome of the microcrustacean Daphnia pulex, which is only
200 megabases and contains at least 30,907 genes. The high gene count is a
consequence of an elevated rate of gene duplication resulting in tandem gene
clusters. More than a third of Daphnia's genes have no detectable homologs in
any other available proteome, and the most amplified gene families are specific
to the Daphnia lineage. The coexpansion of gene families interacting within
metabolic pathways suggests that the maintenance of duplicated genes is not
random, and the analysis of gene expression under different environmental
conditions reveals that numerous paralogs acquire divergent expression patterns
soon after duplication. Daphnia-specific genes, including many additional loci
within sequenced regions that are otherwise devoid of annotations, are the most
responsive genes to ecological challenges.
|
J30. | Liran Carmel, Yuri I. Wolf, Igor B. Rogozin and Eugene V. Koonin (2010) EREM: Parameter estimation and ancestral reconstruction by expectation-maximization algorithm for a probabilistic model of genomic binary characters evolution Advances in Bioinformatics 2010:Article ID 167408.
abstract
Evolutionary binary characters are features of species or genes, indicating the absence (value zero)
or presence (value one) of some property. Examples include eukaryotic gene architecture (the presence
or absence of an intron in a particular locus), gene content, and morphological characters. In many
studies, the acquisition of such binary characters is assumed to represent a rare evolutionary event,
and consequently, their evolution is analyzed using various flavors of parsimony. However, when gain
and loss of the character are not rare enough, a probabilistic analysis becomes essential. Here, we
present a comprehensive probabilistic model to describe the evolution of binary characters on a
bifurcating phylogenetic tree. A fast software tool, EREM, is provided, using maximum likelihood to
estimate the parameters of the model and to reconstruct ancestral states (presence and absence in
internal nodes) and events (gain and loss events along branches).
|
J29. | Liran Carmel and Eugene V. Koonin (2009) A universal nonmonotonic relationship between gene compactness and expression level in multicellular eukaryotes Genome Biology and Evolution 2009:382-390.
abstract
Analysis of gene architecture and expression levels of four organisms, Homo sapiens,
Caenorhabditis elegans, Drosophila melanogaster, and Arabidopsis thaliana,
reveals a surprising, nonmonotonic, universal relationship between expression level and gene
compactness. With increasing expression level, the genes tend at first to become longer but,
from a certain level of expression, they become more and more compact, resulting in an approximate
bell-shaped dependence. There are two leading hypotheses to explain the compactness of highly
expressed genes. The selection hypothesis predicts that gene compactness is predominantly driven
by the level of expression whereas the genomic design hypothesis predicts that expression breadth
across tissues is the driving force. We observed the connection between gene expression breadth
in humans and gene compactness to be significantly weaker than the connection between expression level
and compactness, a result that is compatible with the selection hypothesis but not the genomic design
hypothesis. The initial gene elongation with increasing expression level could be explained, at least
in part, by accumulation of regulatory elements enhancing expression, in particular, in introns. This
explanation is compatible with the observed positive correlation between intron density and expression
level of a gene. Conversely, the trend toward increasing compactness for highly expressed genes could be
caused by selection for minimization of energy and time expenditure during transcription and splicing,
and for increased fidelity of transcription, splicing and/or translation that is likely to be particularly
critical for highly expressed genes. Regardless of the exact nature of the forces that shape the gene
architecture, we present evidence that, at least in animals, coding and noncoding parts of genes show
similar architectonic trends.
Keywords: eukaryotic gene structure, eukaryotic gene architecture, selection on gene compactness, genomic design, intron functionality, intron density. |
J28. | Nandor Nagy, Olive Mwizerwa, Karina Yaniv, Liran Carmel, Rafael Pieretti-Vanmarcke,
Brant M. Weinstein and Allan M. Goldstein (2009) Endothelial cells promote migration and proliferation of enteric neural crest cells via β1 integrin signaling Developmental Biology 330:263-272.
abstract
Enteric neural crest-derived cells (ENCCs) migrate along the
intestine to form a highly organized network of ganglia that
comprises the enteric nervous system (ENS). The signals driving the
migration and patterning of these cells are largely unknown. Examining
the spatiotemporal development of the intestinal neurovasculature in
avian embryos, we find endothelial cells (ECs) present in the gut prior
to the arrival of migrating ENCCs. These ECs are patterned in concentric
rings that are predictive of the positioning of later arriving crest-derived
cells, leading us to hypothesize that blood vessels may serve as a substrate
to guide ENCC migration. Immunohistochemistry at multiple stages during ENS
development reveals that ENCCs are positioned adjacent to vessels as they
colonize the gut. A similar close anatomic relationship between vessels and
enteric neurons was observed in zebrafish larvae. When EC development is
inhibited in cultured avian intestine, ENCC migration is arrested and distal
aganglionosis results, suggesting that ENCCs require the presence of vessels to
colonize the gut. Neural tube and avian midgut were explanted onto a variety of
substrates, including components of the extracellular matrix and various cell
types, such as fibroblasts, smooth muscle cells, and endothelial cells. We find
that crest-derived cells from both the neural tube and the midgut migrate avidly
onto cultured endothelial cells. This EC-induced migration is inhibited by the
presence of CSAT antibody, which blocks binding to β1 integrins expressed on the
surface of crest-derived cells. These results demonstrate that ECs provide a
substrate for the migration of ENCCs via an interaction between β1 integrins on the
ENCC surface and extracellular matrix proteins expressed by the intestinal vasculature.
These interactions may play an important role in guiding migration and patterning in
the developing ENS.
Keywords: Enteric nervous system, Endothelial cells, Blood vessels, Hirschsprung's disease, Integrins, Avian, Zebrafish. |
J27. | Rea Ravin, Dan J. Hoeppner, D. M. Munno, Liran Carmel, J. Sullivan,
D. L. Levitt, J. L. Miller, C. Athaide, D. M. Panchision and Ron D. McKay (2008) Potency and fate specification in CNS stem cell populations in vitro Cell Stem Cell 3:670-680.
abstract
To realize the promise of stem cell biology, it is important to identify
the precise time in the history of the cell when developmental potential is
restricted. To achieve this goal, we developed a real-time imaging system
that captures the transitions in fate, generating neurons, astrocytes, and
oligodendrocytes from single CNS stem cells in vitro. In the presence of bFGF,
tripotent cells normally produce specified progenitors through a bipotent
intermediate cell type. Surprisingly, the tripotent state is reset at each
passage. The cytokine CNTF is thought to instruct multipotent cells to an
astrocytic fate. We demonstrate that CNTF both directs astrogliogenesis from
tripotent cells, bypassing two of the three normal bipotent intermediates, and
later promotes the expansion of specified astrocytic progenitors. These results
show how discrete cell types emerge from a multipotent cell and provide a strong
basis for future studies to determine the molecular basis of fate specification.
|
J26. | Sol Efroni, Liran Carmel, Carl G. Schaefer and Ken H. Buetow (2008) Superposition of transcriptional behaviors determines gene state PLoS One 3:e2901.
abstract
We introduce a novel technique to determine the expression state of a gene
from quantitative information measuring its expression. Adopting a productive
abstraction from current thinking in molecular biology, we consider two
expression states for a gene - Up or Down. We determine this state by using a
statistical model that assumes the data behaves as a combination of two biological
distributions. Given a cohort of hybridizations, our algorithm predicts, for the
single reading, the probability of each gene's being in an Up or a Down state in
each hybridization. Using a series of publicly available gene expression data sets,
we demonstrate that our algorithm outperforms the prevalent algorithm. We also show
that our algorithm can be used in conjunction with expression adjustment techniques
to produce a more biologically sound gene-state call. The technique we present here
enables a routine update, where the continuously evolving expression level adjustments
feed into gene-state calculations. The technique can be applied in almost any multi-sample
gene expression experiment, and holds equal promise for protein abundance experiments.
|
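The two-state idea in the abstract above can be illustrated with a small sketch. It assumes, purely for illustration, that the "two biological distributions" are Gaussians fitted by expectation-maximization; the abstract does not specify the distributions or the fitting procedure, and the function names `fit_two_state_mixture` and `prob_up` are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_two_state_mixture(values, n_iter=100, min_sigma=1e-3):
    """Fit a two-component Gaussian mixture (Down, Up) by EM.

    Assumption: Gaussian components; the paper's actual model may differ.
    """
    lo, hi = min(values), max(values)
    mu = [lo, hi]                                  # index 0 = Down, 1 = Up
    spread = max((hi - lo) / 4.0, min_sigma)
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    n = len(values)
    for _ in range(n_iter):
        # E-step: posterior probability that each reading is in the Up state
        resp_up = []
        for x in values:
            p_down = weight[0] * normal_pdf(x, mu[0], sigma[0])
            p_up = weight[1] * normal_pdf(x, mu[1], sigma[1])
            total = p_down + p_up
            resp_up.append(p_up / total if total > 0 else 0.5)
        n_up = sum(resp_up)
        n_down = n - n_up
        if n_up < 1e-9 or n_down < 1e-9:
            break                                  # one state is empty; stop
        # M-step: re-estimate means, spreads and mixing weights
        mu[1] = sum(r * x for r, x in zip(resp_up, values)) / n_up
        mu[0] = sum((1 - r) * x for r, x in zip(resp_up, values)) / n_down
        var_up = sum(r * (x - mu[1]) ** 2 for r, x in zip(resp_up, values)) / n_up
        var_down = sum((1 - r) * (x - mu[0]) ** 2 for r, x in zip(resp_up, values)) / n_down
        sigma = [max(math.sqrt(var_down), min_sigma), max(math.sqrt(var_up), min_sigma)]
        weight = [n_down / n, n_up / n]
    return {"mu": mu, "sigma": sigma, "weight": weight}

def prob_up(x, model):
    """Posterior probability that a single reading x is in the Up state."""
    p_down = model["weight"][0] * normal_pdf(x, model["mu"][0], model["sigma"][0])
    p_up = model["weight"][1] * normal_pdf(x, model["mu"][1], model["sigma"][1])
    total = p_down + p_up
    return p_up / total if total > 0 else 0.5

# Toy cohort: expression readings of one gene across ten hybridizations
readings = [1.0, 1.2, 0.9, 1.1, 0.8, 5.0, 5.2, 4.9, 5.1, 4.8]
model = fit_two_state_mixture(readings)
```

Calling `prob_up(reading, model)` then gives a per-hybridization probability rather than a hard Up/Down call, which is the kind of output the abstract describes.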
J25. | Igor B. Rogozin, Karen Thomson, Miklos Csuros, Liran Carmel and Eugene V. Koonin (2008) Homoplasy in genome-wide analysis of rare amino acid replacements: the molecular-evolutionary basis for Vavilov's law of homologous series Biology Direct 3:7.
abstract
Background: Rare genomic changes (RGCs) that are thought to comprise derived
shared characters of individual clades are becoming an increasingly important class of markers
in genome-wide phylogenetic studies. Recently, we proposed a new type of RGCs designated
RGC_CAMs (after Conserved Amino acids-Multiple substitutions) that were inferred using
genome-wide identification of amino acid replacements that were: i) located in unambiguously
aligned regions of orthologous genes, ii) shared by two or more taxa in positions that contain
a different, conserved amino acid in a much broader range of taxa, and iii) require two or three
nucleotide substitutions. When applied to animal phylogeny, the RGC_CAM approach supported the
coelomate clade that unites deuterostomes with arthropods as opposed to the ecdysozoan (molting
animals) clade. However, a non-negligible level of homoplasy was detected.
Results: We provide a direct estimate of the level of homoplasy caused by parallel changes and reversals among the RGC_CAMs using 462 alignments of orthologous genes from 19 eukaryotic species. It is shown that the impact of parallel changes and reversals on the results of phylogenetic inference using RGC_CAMs cannot explain the observed support for the Coelomata clade. In contrast, the evidence in support of the Ecdysozoa clade, in large part, can be attributed to parallel changes. It is demonstrated that parallel changes are significantly more common in internal branches of different subtrees that are separated from the respective common ancestor by relatively short times than in terminal branches separated by longer time intervals. A similar but much weaker trend was detected for reversals. The observed evolutionary trend of parallel changes is explained in terms of the covarion model of molecular evolution. As the overlap between the covarion sets in orthologous genes from different lineages decreases with time after divergence, the likelihood of parallel changes decreases as well. Conclusions: The level of homoplasy observed here appears to be low enough to justify the utility of RGC_CAMs and other types of RGCs for resolution of hard problems in phylogeny. Parallel changes, one of the major classes of events leading to homoplasy, occur much more often in relatively recently diverged lineages than in those separated from their last common ancestor by longer time intervals. This pattern seems to provide the molecular-evolutionary underpinning of Vavilov's law of homologous series and is readily interpreted within the framework of the covarion model of molecular evolution. Reviewers: This article was reviewed by Alex Kondrashov, Nicolas Galtier, and Maximilian Telford and Robert Lanfear (nominated by Laurence Hurst). |
J24. | Malay K. Basu, Liran Carmel, Igor B. Rogozin and Eugene V. Koonin (2008) Evolution of protein domain promiscuity in eukaryotes Genome Research 18:449-461.
abstract
Numerous eukaryotic proteins contain multiple domains.
Certain domains show a tendency to occur in diverse domain architectures and can
be considered "promiscuous". These promiscuous domains are, typically, involved
in protein-protein interactions and play crucial roles in interaction networks,
particularly, those that contribute to signal transduction. A systematic
comparative-genomic analysis of promiscuous domains in eukaryotes is described.
Two quantitative measures of domain promiscuity are introduced and applied to the
analysis of 28 genomes of diverse eukaryotes. Altogether, 215 domains are
identified as strongly promiscuous. The fraction of promiscuous domains in animals
is shown to be significantly greater than that in fungi or plants. Evolutionary
reconstructions indicate that domain promiscuity is a volatile, relatively
fast-changing feature of eukaryotic proteins, with few domains remaining promiscuous
throughout the evolution of eukaryotes. Some domains appear to have attained
promiscuity independently in different lineages, e.g., animals and plants. It is
proposed that promiscuous domains persist within a relatively small pool of
evolutionarily stable domain combinations from which numerous rare architectures
emerge during evolution. Domain promiscuity positively correlates with the number
of experimentally detected domain interactions and with the strength of purifying
selection affecting a domain. Thus, evolution of promiscuous domains seems to be
constrained by the diversity of their interaction partners. The set of promiscuous
domains is enriched for domains mediating protein-protein interactions that are
involved in various forms of signal transduction, especially, in the ubiquitin system
and in the chromatin. Thus, a limited repertoire of promiscuous domains makes a major
contribution to the diversity and evolvability of eukaryotic proteomes and signaling
networks.
|
J23. | Rafi Haddad, Liran Carmel, Noam Sobel and David Harel (2008) Predicting the receptive range of olfactory receptors PLoS Computational Biology 4:e18.
abstract
Although the family of genes encoding for olfactory
receptors was identified more than 15 years ago, the difficulty of functionally
expressing these receptors in a heterologous system has, with only some
exceptions, rendered the receptive range of given olfactory receptors largely
unknown. Furthermore, even when successfully expressed, the task of probing
such a receptor with thousands of odors/ligands remains daunting. Here we
provide proof of concept for a solution to this problem. Using computational
methods we tune an electronic nose to the receptive range of an olfactory
receptor. We then use this electronic nose to predict the receptors' response
to other odorants. Our method can be used to identify the receptive range of
olfactory receptors, and can also be applied to other questions involving
receptor-ligand interactions in non-olfactory settings.
|
J22. | Igor B. Rogozin, Yuri I. Wolf, Liran Carmel and Eugene V. Koonin (2007) Analysis of rare amino acid replacements supports the Coelomata clade Molecular Biology and Evolution 24:2594-2597.
abstract
The recent analysis of a novel class of rare genomic
changes, RGC_CAMs (after Conserved Amino acids-Multiple substitutions), supported
the Coelomata clade of animals as opposed to the Ecdysozoa clade (Rogozin et al.
2007). A subsequent re-analysis, with the sequences from the sea anemone
Nematostella vectensis included in the set of outgroup species, suggested that
this result was an artefact caused by reverse amino acid replacements and claimed
support for Ecdysozoa (Irimia et al. 2007). We show that the internal branch
connecting the sea anemone to the bilaterian animals is extremely short, resulting
in a weak statistical support for the Coelomata clade. Direct estimation of the
level of homoplasy, combined with taxon sampling with different sets of outgroup
species, reinforces the support for Coelomata whereas the effect of reversals is
shown to be relatively minor.
Keywords: Phylogenetic analysis, cladistics, rare genomic changes, coelomata, ecdysozoa. |
J21. | Liran Carmel, Igor B. Rogozin, Yuri I. Wolf and Eugene V. Koonin (2007) Patterns of intron gain and conservation in eukaryotic genes BMC Evolutionary Biology 7:192.
abstract
Background: The presence of introns in protein-coding genes
is a universal feature of eukaryotic genome organization, and the genes of
multicellular eukaryotes, typically, contain multiple introns, a substantial
fraction of which share position in distant taxa, such as plants and animals.
Depending on the methods and data sets used, researchers have reached opposite
conclusions on the causes of the high fraction of shared introns in orthologous
genes from distant eukaryotes. Some studies conclude that shared intron positions
reflect, almost entirely, a remarkable evolutionary conservation, whereas others
attribute it to parallel gain of introns. To resolve these contradictions, it is
crucial to analyze the evolution of introns by using a model that minimally relies
on arbitrary assumptions.
Results: We developed a probabilistic model of evolution that allows for variability of intron gain and loss rates over branches of the phylogenetic tree, individual genes, and individual sites. Applying this model to an extended set of conserved eukaryotic genes, we find that parallel gain, on average, accounts for only ~8% of the shared intron positions. However, the distribution of parallel gains over the phylogenetic tree of eukaryotes is highly non-uniform. There are, practically, no parallel gains in closely related lineages, whereas for distant lineages, such as animals and plants, parallel gains appear to contribute up to 20% of the shared intron positions. In accord with these findings, we estimated that ancestral introns have a high probability to be retained in extant genomes, and conversely, that a substantial fraction of extant introns have retained their positions since the early stages of eukaryotic evolution. In addition, the density of sites that are available for intron insertion is estimated to be, approximately, one in seven basepairs. Conclusions: We obtained robust estimates of the contribution of parallel gain to the observed sharing of intron positions between eukaryotic species separated by different evolutionary distances. The results indicate that, although the contribution of parallel gains varies across the phylogenetic tree, the high level of intron position sharing is due, primarily, to evolutionary conservation. Accordingly, numerous introns appear to persist in the same position over hundreds of millions of years of evolution. This is compatible with recent observations of a negative correlation between the rate of intron gain and coding sequence evolution rate of a gene, suggesting that at least some of the introns are functionally relevant. |
J20. | Liran Carmel and David Harel (2007) Mix-to-mimic odor synthesis for electronic noses Sensors and Actuators B: Chemical 125:635-643.
abstract
Arrays of chemical sensors, known as electronic noses,
yield a unique pattern for a given mixture of odors. Recently, there has been
increasing interest in trying to mix odors so as to generate a desired response
in the electronic nose. So far, this intriguing problem has been tackled
only experimentally with the aid of specific apparatus. Here, we present an
algorithmic solution to the problem. We demonstrate the algorithm on data that
includes mixtures of up to five ingredients.
Keywords: odor communication, sniffer, whiffer, within-sniffer mix-to-mimic algorithm, electronic nose. |
J19. | Alissa M. Resch, Liran Carmel,
Leonardo Mariño-Ramírez, Aleksey Y. Ogurtsov, Svetlana A. Shabalina,
Igor B. Rogozin and Eugene V. Koonin (2007) Widespread positive selection in synonymous sites of mammalian genes Molecular Biology and Evolution 24:1821-1831.
abstract
Evolution of protein sequences is largely governed
by purifying selection, with a small fraction of proteins evolving under
positive selection. The evolution at synonymous positions in protein-coding
genes is not nearly as well understood, with the extent and types of selection
remaining, largely, unclear. A statistical test to identify purifying and
positive selection at synonymous sites in protein-coding genes was developed.
The method compares the rate of evolution at synonymous sites (Ks) to that in
intron sequences of the same gene after sampling the aligned intron sequences
to mimic the statistical properties of coding sequences. We detected purifying
selection at synonymous sites in 28% of the 1562 analyzed orthologous genes
from mouse and rat, and positive selection in 12% of the genes. Thus, the
fraction of genes with readily detectable positive selection at synonymous
sites is much greater than the fraction of genes with comparable positive
selection at non-synonymous sites, i.e., at the level of the protein sequence.
Unlike other genes, the genes with positive selection at synonymous sites
showed no correlation between Ks and the rate of evolution in non-synonymous
sites (Ka), indicating that evolution of synonymous sites under positive
selection is decoupled from protein evolution. The genes with purifying
selection at synonymous sites showed significant anticorrelation between Ks
and expression level and breadth indicating that highly expressed genes evolve
slowly. The genes with positive selection at synonymous sites showed the
opposite trend, i.e., highly expressed genes had, on average, higher Ks. For
the genes with positive selection at synonymous sites, a significantly lower
mRNA stability is predicted compared to the genes with negative selection. Thus,
mRNA destabilization could be an important factor driving positive selection in
non-synonymous sites, probably, through regulation of expression at the level of
mRNA degradation and, possibly, also translation rate. So, unexpectedly, we found
that positive selection at synonymous sites of mammalian genes is substantially
more common than positive selection at the level of protein sequences. Positive
selection at synonymous sites migh.
Keywords: synonymous sites, non-synonymous sites, positive selection, purifying selection, introns. |
J18. | Liran Carmel, Igor B. Rogozin, Yuri I. Wolf and Eugene V. Koonin (2007) Evolutionarily conserved genes preferentially accumulate introns Genome Research 17:1045-1050.
abstract
Introns that interrupt eukaryotic protein-coding
sequences are generally thought to be nonfunctional. However, for reasons
still poorly understood, positions of many introns are highly conserved in
evolution. Previous reconstructions of intron gain and loss events during
eukaryotic evolution used a variety of simplified evolutionary models that
yielded contradicting conclusions and are not suited to reveal some of the
key underlying processes. We combine a comprehensive probabilistic model
and an extended data set, including 391 conserved genes from 19 eukaryotes,
to uncover previously unnoticed aspects of intron evolution - in particular,
to assign intron gain and loss rates to individual genes. The rates of
intron gain and loss in a gene show moderate positive correlation. A gene's
intron gain rate shows a highly significant negative correlation with the
coding-sequence evolution rate; intron loss rate also significantly, but
positively, correlates with the sequence evolution rate. Correlations of
the opposite signs, albeit less significant ones, are observed between
intron gain and loss rates and gene expression level. It is proposed that
intron evolution includes a neutral component, which is manifest in the
positive correlation between the gain and loss rates and a selection-driven
component as reflected in the links between intron gain and loss and
sequence evolution. The increased intron gain and decreased intron loss
in evolutionarily conserved genes indicate that intron insertion often
might be adaptive, whereas some of the intron losses might be deleterious.
This apparent functional importance of introns is likely to be due, at
least in part, to their multiple effects on gene expression.
|
J17. | Liran Carmel, Yuri I. Wolf, Igor B. Rogozin and Eugene V. Koonin (2007) Three distinct modes of intron dynamics in the evolution of eukaryotes Genome Research 17:1034-1044.
abstract
Several contrasting scenarios have been proposed
for the origin and evolution of spliceosomal introns, a hallmark of eukaryotic
genes. A comprehensive probabilistic model to obtain a definitive reconstruction
of intron evolution was developed and applied to 391 sets of conserved genes from
19 eukaryotic species. It is inferred that a relatively high intron density was
reached early, i.e., the last common ancestor of eukaryotes contained >2.15
introns/kilobase, and the last common ancestor of multicellular life forms
harbored 3.4 introns/kilobase, a greater intron density than in most of the extant
fungi and in some animals. The rates of intron gain and intron loss appear to have
been dropping during the last 1.3 billion years, with the decline in the gain rate
being much steeper. Eukaryotic lineages exhibit three distinct modes of evolution
of the intron-exon structure. The primary, balanced mode, apparently, operates in
all lineages. In this mode, intron gain and loss are strongly and positively
correlated, in contrast to previous reports on inverse correlation between these
processes. The second mode involves an elevated rate of intron loss and is
prevalent in several lineages, such as fungi and insects. The third mode,
characterized by elevated rate of intron gain, is seen only in deep branches of the
tree, indicating that bursts of intron invasion occurred at key points in eukaryotic
evolution, such as the origin of animals. Intron dynamics could depend on multiple
mechanisms, and in the balanced mode, gain and loss of introns might share common
mechanistic features.
|
J16. | Igor B. Rogozin, Yuri I. Wolf, Liran Carmel and Eugene V. Koonin (2007) Ecdysozoan clade rejected by genome-wide analysis of rare amino acid replacements Molecular Biology and Evolution 24:1080-1090.
abstract
As the number of sequenced genomes from diverse walks of life
rapidly increases, phylogenetic analysis is entering a new era: reconstruction of the
evolutionary history of organisms on the basis of full-scale comparison of their genomes.
In addition to brute force, genome-wide analysis of alignments, rare genomic characters (RGCs)
that are thought to comprise derived shared characters of individual clades are increasingly
used in genome-wide phylogenetic studies. We propose a new type of RGCs designated RGC_CAMs
(after Conserved Amino acids-Multiple substitutions), which are inferred using a genome-scale
analysis of protein and underlying nucleotide sequence alignments. The RGC_CAM approach utilizes
amino acid residues conserved in major eukaryotic lineages, with the exception of a few species
comprising a putative clade, and selects for phylogenetic inference only those amino acid
replacements that require 2 or 3 nucleotide substitutions, in order to reduce homoplasy. The
RGC_CAM analysis was combined with a procedure for rigorous statistical testing of competing
phylogenetic hypotheses. The RGC_CAM method is shown to be robust to branch length differences
and taxon sampling. When applied to animal phylogeny, the RGC_CAM approach strongly supports the
coelomate clade that unites chordates with arthropods as opposed to the ecdysozoan (molting
animals) clade. This conclusion runs against the view of animal evolution that is currently
prevailing in the evo-devo community. The final solution to the coelomate-ecdysozoa controversy
will require a much larger set of complete genome sequences representing diverse animal taxa. It
is expected that RGC_CAM and other RGC-based methods will be crucial for these future, definitive
phylogenetic studies.
Keywords: Phylogenetic analysis, cladistics, rare genomic changes, coelomata, ecdysozoa, microsporidia. |
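The filter described in the abstract above keeps only amino acid replacements that require 2 or 3 nucleotide substitutions. A minimal sketch of that criterion follows, assuming the standard genetic code and taking the minimum over all codon pairs; the function names are hypothetical and this is not the authors' pipeline.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated with the first base varying slowest
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons_for(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]

def min_substitutions(aa_from, aa_to):
    """Smallest number of nucleotide changes that turns a codon for
    aa_from into a codon for aa_to."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa_from)
        for c2 in codons_for(aa_to)
    )

def requires_multiple_substitutions(aa_from, aa_to):
    """True if the replacement needs at least 2 nucleotide substitutions."""
    return min_substitutions(aa_from, aa_to) >= 2
```

For example, phenylalanine to leucine can be reached by a single change (TTT to TTA) and would be excluded, whereas methionine to tryptophan (ATG to TGG) needs two changes and would be retained.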
J15. | Rafi Haddad, Liran Carmel and David Harel (2007) A feature extraction algorithm for multi-peak signals in electronic noses Sensors and Actuators B: Chemical 120:467-472.
abstract
The Lorentzian model is a powerful feature extraction
technique for electronic noses. In a previous work, it was applied to single-peak
transient signals and was shown to achieve lower classification error rate than
other feature extraction techniques. Here, we generalize the Lorentzian model by
showing how to apply it to transient signals that are comprised of more than a
single peak. The model is based on a fast and robust fitting of the measured
signals to a physically meaningful analytic curve. We show that this model fits
equally well to sensors of different technologies and embeddings, suggesting its
applicability to a diverse repertoire of sensors and analytic devices.
Keywords: feature extraction, electronic nose, signal processing, multiple peaks. |
J14. | Ekaterina Kuznetsova, Michael Proudfoot, Claudio F. Gonzalez, Greg Brown, Marina V. Omelchenko, Ivan Borozan,
Liran Carmel, Yuri I. Wolf, Hirotada Mori, Alexei V. Savchenko, Cheryl H. Arrowsmith,
Eugene V. Koonin, Aled M. Edwards and Alexander F. Yakunin (2006) Genome-wide analysis of substrate specificities of the Escherichia coli haloacid dehalogenase-like phosphatase family Journal of Biological Chemistry 281:36149-36161.
abstract
Members of enzyme families catalyze similar reactions, but
have evolved specific biological functions. Comprehensive determination of the
substrate specificities of enzymes is the essential first step toward elucidation of
these functions and contributions of individual enzymes to the metabolome. Haloacid
dehalogenase (HAD)-like hydrolases are a vast superfamily of largely uncharacterized
enzymes, with a few members shown to possess phosphatase, β-phosphoglucomutase,
phosphonatase, and dehalogenase activities. Using a representative set of 80
phosphorylated substrates, we characterized the substrate specificities of 23
soluble HADs encoded in the Escherichia coli genome. We identified small
molecule phosphatase activity in 21 HADs and β-phosphoglucomutase activity in one
protein. The E. coli HAD phosphatases show high catalytic efficiency and
affinity to a wide range of phosphorylated metabolites (sugars, nucleotides, organic
acids, cofactors) that are intermediates of various metabolic reactions (glycolysis,
pentose phosphate pathway, gluconeogenesis, intermediary sugar and nucleotide
metabolism). Rather than following the classical "one enzyme - one substrate" model,
most of the E. coli HADs show remarkably broad and overlapping substrate
spectra. At least 12 reactions catalyzed by HADs currently have no EC numbers
assigned in the Enzyme Nomenclature. Surprisingly, most HADs hydrolyzed small
phosphodonors (acetyl phosphate, carbamoyl phosphate, phosphoramidate), which also
serve as substrates for autophosphorylation of the receiver domains of the two-component
signal transduction systems. The physiological relevance of the phosphatase activity
with the preferred substrate was validated in vivo for one of the HADs, YniC. Many
of the secondary activities of HADs might have no immediate physiological function,
but could comprise a reservoir for evolution of novel phosphatases.
|
J13. | Yuri I. Wolf, Liran Carmel and Eugene V. Koonin (2006) Unifying measures of gene function and evolution Proceedings of the Royal Society B 273:1507-1515.
abstract
Recent genome analyses revealed intriguing correlations
between variables characterizing the functioning of a gene, such as expression
level (EL), connectivity of genetic and protein-protein interaction networks,
and knockout effect, and variables describing gene evolution, such as sequence
evolution rate (ER) and propensity for gene loss. Typically, variables within
each of these classes are positively correlated, e.g. products of highly
expressed genes also have a propensity to be involved in many protein-protein
interactions, whereas variables between classes are negatively correlated, e.g.
highly expressed genes, on average, evolve slower than weakly expressed genes.
Here, we describe principal component (PC) analysis of seven genome-related
variables and propose biological interpretations for the first three PCs. The
first PC reflects a gene's 'importance', or the 'status' of a gene in the genomic
community, with positive contributions from knockout lethality, EL, number of
protein-protein interaction partners and the number of paralogues, and negative
contributions from sequence ER and gene loss propensity. The next two PCs define
a plane that seems to reflect the functional and evolutionary plasticity of a gene.
Specifically, PC2 can be interpreted as a gene's 'adaptability' whereby genes with
high adaptability readily duplicate, have many genetic interaction partners and
tend to be non-essential. PC3 also might reflect the role of a gene in organismal
adaptation albeit with a negative rather than a positive contribution of genetic
interactions; we provisionally designate this PC 'reactivity'. The interpretation
of PC2 and PC3 as measures of a gene's plasticity is compatible with the
observation that genes with high values of these PCs tend to be expressed in a
condition- or tissue-specific manner. Functional classes of genes substantially vary
in status, adaptability and reactivity, with the highest status characteristic of
the translation system and cytoskeletal proteins, highest adaptability seen in
cellular processes and signalling genes, and top reactivity characteristic of
metabolic enzymes.
Keywords: gene expression, gene dispensability, protein-protein interaction, sequence evolution rate, gene loss, principal component analysis. |
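To make the principal component analysis in the abstract above concrete, here is a minimal sketch that extracts the first PC of standardized variables by power iteration on the correlation matrix. The toy data, the three variable names and the helper functions are assumptions for illustration only; the study itself analysed seven variables and its numerical procedure is not described in the abstract.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n = len(rows)
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(r) for r in zip(*scaled_cols)]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, m = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)]
            for i in range(m)]

def first_pc(corr, n_iter=500):
    """Leading eigenvector and eigenvalue by power iteration.

    The start vector is deliberately non-uniform; a start that is exactly
    orthogonal to the leading eigenvector would not converge.
    """
    m = len(corr)
    v = [1.0 / (i + 1) for i in range(m)]
    for _ in range(n_iter):
        w = [sum(corr[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(m))
                     for i in range(m))
    return v, eigenvalue

# Toy genes: (expression level, interaction partners, evolution rate)
genes = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 4.0],
    [3.0, 6.0, 3.0],
    [4.0, 8.0, 2.0],
    [5.0, 10.0, 1.0],
]
loadings, eigenvalue = first_pc(correlation_matrix(standardize(genes)))
```

In this toy data set the first two variables load on the first PC with one sign and the evolution rate with the opposite sign, mirroring the pattern of positive and negative contributions that the abstract reports for the 'status' component.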
J12. | Liran Carmel, Sol Efroni, Peter D. White, Eric Aslakson,
Ute Vollmer-Conna and Mangalathu S. Rajeevan (2006) Gene expression profile of empirically delineated classes of unexplained chronic fatigue Pharmacogenomics 7:375-386.
abstract
Objectives: To identify the underlying gene
expression profiles of unexplained chronic fatigue subjects classified into five
or six class solutions by principal component (PCA) and latent class analyses (LCA).
Methods: Microarray expression data were available for 15,315 genes and 111 female subjects enrolled from a population-based study on chronic fatigue syndrome. Algorithms were developed to assign gene scores and threshold values that signified the contribution of each gene to discriminate the multiclasses in each LCA solution. Unsupervised dimensionality reduction was first used to remove noise or otherwise uninformative gene combinations, followed by supervised dimensionality reduction to isolate gene combinations that best separate the classes. Results: The authors' gene score and threshold algorithms identified 32 and 26 genes capable of discriminating the five and six multiclass solutions, respectively. Pair-wise comparisons suggested that some genes (zinc finger protein 350 [ZNF350], solute carrier family 1, member 6 [SLC1A6], F-box protein 7 [FBXO7] and vacuole 14 protein homolog [VAC14]) distinguished most classes of fatigued subjects from healthy subjects, whereas others (patched homolog 2 [PTCH2] and T-cell leukemia/lymphoma [TCL1A]) differentiated specific fatigue classes. Conclusion: A computational approach was developed for general use to identify discriminatory genes in any multiclass problem. Using this approach, differences in gene expression were found to discriminate some classes of unexplained chronic fatigue, particularly one termed interoception.
Keywords: chronic fatigue syndrome, Fisher quotient and discriminatory genes, gene expression and gene scores, interoception, latent class analysis, principal component analysis. |
J11. | Liran Carmel, Noa Sever and David Harel (2005) On predicting responses to mixtures in quartz microbalance sensors Sensors and Actuators B: Chemical (special issue, selected papers from ISOEN 2003 - the 10th International Symposium on Olfaction and Electronic Noses; edited by J. Kleperis, L. Grinberga, A. D'Amico & M. Koudelka-Hep) 106:128-135.
abstract
A fundamental question in studying odor patterns in
electronic noses is how to estimate the response to a mixture, given the
response curves of the pure chemicals. We study this question by proposing two
mixture-predicting models, and verify them against real data collected using
quartz microbalance sensors. We find that a simple additive law explains fairly
well the measured response patterns of binary mixtures, but that a slightly more
complicated mixing model is required in order to produce good estimations of the
response patterns of mixtures that are comprised of more than two compounds.
Keywords: electronic noses, mixtures, response prediction, mixing model, law of mixing, quartz microbalance sensors. |
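The "simple additive law" mentioned in the abstract above can be sketched as follows, assuming it means that the mixture response is the pointwise sum of the pure-compound response curves sampled on a common time grid. The exact form used in the paper (for example, any concentration weighting) is not given here, and the function names are hypothetical.

```python
import math

def predict_additive(pure_responses):
    """Predict a mixture response as the pointwise sum of pure responses.

    pure_responses: list of equally long response curves, one per compound.
    """
    if len({len(curve) for curve in pure_responses}) != 1:
        raise ValueError("all response curves must have the same length")
    return [sum(samples) for samples in zip(*pure_responses)]

def relative_error(predicted, measured):
    """Root of the squared prediction error relative to the measured energy."""
    num = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    den = sum(m * m for m in measured)
    return math.sqrt(num / den)

# Toy sensor curves for two pure compounds and their binary mixture
compound_a = [0.0, 1.0, 2.0, 1.0]
compound_b = [0.0, 0.5, 1.0, 0.5]
measured_mixture = [0.0, 1.4, 3.1, 1.5]

predicted = predict_additive([compound_a, compound_b])
error = relative_error(predicted, measured_mixture)
```

A small `error` would indicate that the additive prediction explains the measured pattern well; per the abstract, mixtures of more than two compounds needed a somewhat more complicated mixing model.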
J10. | Liran Carmel (2005) Electronic nose signal restoration - beyond the dynamic range limit Sensors and Actuators B: Chemical (special issue, selected papers from ISOEN 2003 - the 10th International Symposium on Olfaction and Electronic Noses; edited by J. Kleperis, L. Grinberga, A. D'Amico & M. Koudelka-Hep) 106:95-100.
abstract
When measuring over-concentrated stimuli, chemical sensors
tend to exhibit corrupted time signals, which are normally categorized as missing
data. Such a failure of one or more sensors occurs frequently in applications
where an eNose is exposed to a diverse repertoire of chemicals. As a rule, missing
data are removed from the dataset by leaving a potentially large portion of the
original dataset unutilized. Here we propose an algorithm to handle such missing
data by utilizing intact regions of corrupted signals to restore the damaged regions.
We do so by fitting a parametric model of the sensor response over time to the
intact regions, and using the resulting model for the restoration. We show that the
restoration is both accurate and consistent, thus allowing for the restored signals
to take part in any subsequent data analysis process.
Keywords: electronic nose, signal restoration, missing data, signal corruption, signal failure. |
J9. | Oded Shaham, Liran Carmel and David Harel (2005) On mapping between electronic noses Sensors and Actuators B: Chemical (special issue, selected papers from ISOEN 2003 - the 10th International Symposium on Olfaction and Electronic Noses; edited by J. Kleperis, L. Grinberga, A. D'Amico & M. Koudelka-Hep) 106:76-82.
abstract
We consider the task of finding a mapping between two
eNoses that employ two different sensor technologies, quartz microbalance and
conducting polymers. Such a mapping is a model that predicts the response of
one eNose based on the response of the other. eNose mappings are important for
odor communication and synthesis, as well as for eNose data integration. We
investigated a number of methods for performing this task, including principal
components regression, partial least squares, neural networks and tessellation-based
linear interpolation. Our measure of success is the percentage of predictions that
are correctly classifiable. Using two different techniques for splitting our data
set, we achieved success rates of 67% and 100%.
|
J8. | Yehuda Koren and Liran Carmel (2004) Robust linear dimensionality reduction IEEE Trans. Visualization and Computer Graphics 10:459-470.
abstract
We present a novel family of data-driven linear
transformations, aimed at finding low dimensional embeddings of multivariate
data, in a way that optimally preserves the structure of the data. The
well-studied PCA and Fisher's LDA are shown to be special members in this family
of transformations, and we demonstrate how to generalize these two methods such
as to enhance their performance. Furthermore, our technique is the only one, to
the best of our knowledge, that reflects in the resulting embedding both the data
coordinates and pairwise similarities and/or dissimilarities between the data elements.
Even more so, when information on the clustering (labeling) decomposition of the data
is known, this information can also be integrated in the linear transformation, resulting
in embeddings that clearly show the separation between the clusters, as well as their internal
structure. All this makes our technique very flexible and powerful, and lets us
cope with kinds of data that other techniques fail to describe properly.
Index terms: dimensionality reduction, visualization, classification, feature extraction, projection, linear transformation, principal component analysis, Fisher's linear discriminant analysis. |
J7. | Liran Carmel, David Harel and Yehuda Koren (2004) Combining hierarchy and energy for drawing directed graphs IEEE Trans. Visualization and Computer Graphics 10:46-57.
abstract
We present an algorithm for drawing directed graphs,
which is based on rapidly solving a unique one-dimensional optimization problem
for each of the axes. The algorithm results in a clear description of the
hierarchy structure of the graph. Nodes are not restricted to lie on fixed
horizontal layers, resulting in layouts that convey the symmetries of the graph
very naturally. The algorithm can be applied without change to cyclic or acyclic
digraphs, and even to graphs containing both directed and undirected edges. We
also derive a hierarchy index from the input digraph, which quantitatively
measures its amount of hierarchy.
Keywords: Directed graph drawing, force directed layout, hierarchy energy, Fiedler vector, minimum linear arrangement. |
J6. | Yehuda Koren, Liran Carmel and David Harel (2003) Drawing huge graphs by algebraic multigrid optimization Multiscale Modeling and Simulation 1:645-673.
abstract
We present an extremely fast graph drawing algorithm for
very large graphs, which we term ACE (for Algebraic multigrid Computation of
Eigenvectors). ACE exhibits a vast improvement over the fastest algorithms we are
currently aware of; using a serial PC, it draws graphs of millions of nodes in
less than a minute. ACE finds an optimal drawing by minimizing a quadratic energy
function. The minimization problem is expressed as a generalized eigenvalue problem,
which is solved rapidly using a novel algebraic multigrid technique. The same
generalized eigenvalue problem seems to come up also in other fields, hence ACE
appears to be applicable outside graph drawing too.
Keywords: algebraic multigrid, multiscale/multilevel optimization, graph drawing, generalized eigenvalue problem, Fiedler vector, force directed layout, the Hall energy. |
J5. | Liran Carmel, Noa Sever, Doron Lancet and David Harel (2003) An e-Nose algorithm for identifying chemicals and determining their concentration Sensors and Actuators B: Chemical (special issue, Proceedings of the 9th International Meeting on Chemical Sensors, 2002; edited by J. Stetter & S. Yao) 93:77-83.
abstract
We propose an algorithm for use with multisensor systems that is capable of
the following: a) identify an analyte independently of its concentration; b)
estimate the concentration of the analyte, even if the system was not previously
exposed to this concentration; c) tell when an analyte is of a chemical type not
previously presented to the system. The algorithm, based upon recent work of
Hopfield, uses the multiplicity of sensors explicitly, and is intuitive and easy
to implement. We have tested it against real data, and it exhibits high quality
performance.
Keywords: electronic noses, classification, identification, concentration estimation, reject option, Hopfield algorithm. |
J4. | Liran Carmel, Shlomo Levy, Doron Lancet and David Harel (2003) A feature extraction method for chemical sensors in electronic noses Sensors and Actuators B: Chemical (special issue, Proceedings of the 9th International Meeting on Chemical Sensors, 2002; edited by J. Stetter & S. Yao) 93:67-76.
abstract
We propose a new feature extraction method for use with chemical sensors. It is based
on fitting a parametric analytic model of the sensor's response over time to the measured
signal, and taking the set of best-fitting parameters as the features. The process of
finding the features is fast and robust, and the resulting set of features is shown to
significantly enhance the performance of subsequent classification algorithms. Moreover,
the model that we have developed fits equally well to sensors of different technologies
and embeddings, suggesting its applicability to a diverse repertoire of sensors and
analytic devices.
Keywords: feature extraction, electronic nose, curve fitting, quartz-microbalance sensors, metal-oxide sensors. |
J3. | David Harel, Liran Carmel and Doron Lancet (2003) Towards an odor communication system Computational Biology and Chemistry 27:121-133.
abstract
We propose a setup for an odor communication system. Its different parts
are described, and ways to realize them are outlined. Our scheme enables an
output device --- the whiffer --- to release an imitation of an odorant read in
by an input device --- the sniffer --- upon command. The heart of the system is
the novel algorithmic scheme that makes the scheme feasible. We are currently at work
researching and developing some of the components that constitute the algorithm, and
we hope that the description of the overall scheme in this paper will help to get
other groups to join in this effort.
Keywords: odor communication system, palette odorants, odor space, odorant mixing, sniffer, whiffer. |
J2. | Liran Carmel, David Harel and Doron Lancet (2001) Estimating the size of the olfactory repertoire Bulletin of Mathematical Biology 63:1063-1078.
abstract
The concept of shape space, which has been successfully implemented in
immunology, is used here to construct a model for the discrimination power
of the olfactory system. Using reasonable assumptions on the behavior of
the biological system, we are able to estimate the number of distinct
olfactory receptor types. Our estimated value of around 1000 receptor
types is in high agreement with experimental data.
|
J1. | Liran Carmel and Ady Mann (2000) Geometrical approach to two-level Hamiltonians Physical Review A 61:052113.
abstract
Two-level systems were shown to be fully described by a single function,
known sometimes as the Stueckelberg parameter. Using concepts from differential
geometry, we give geometrical meaning to the Stueckelberg parameter and to other
related quantities. As a result, a generalization of the Stueckelberg parameter
is introduced, and a relation obtained between two-level systems and spatial
one-dimensional curves in three-dimensional space. Previous authors used this
Stueckelberg parameter to solve analytically several two-level models. We further
develop this idea, and solve analytically three fundamental models, from which many
other known models emerge as special cases. We present the detailed analysis of
these models.
PACS: 03.65.Db, 34.10.+x, 31.15.-p, 42.50.-p. |