2014-07-06

Phylogenetic position of Ctenophora. A follow-up.

In a previous post I nagged about a paper reporting the genome of the ctenophore Mnemiopsis leidyi (Ryan et al. 2013), which suggested that ctenophores are the sister group to all other animals. Now another genome from this animal phylum, that of Pleurobrachia bachei, has been sequenced along with transcriptomes of 10 additional ctenophore species (Moroz et al. 2014)! Regarding the phylogenetic position of Ctenophora, the analyses by Moroz et al. are even less convincing than those by Ryan et al. (2013). Apparently only maximum likelihood analyses with RAxML were done (or at least reported), and these either support Ctenophora as the sister group to other animals or are inconclusive.

But this is not the main point of the article. Moroz et al. suggest that the nervous system and possibly the muscles of ctenophores evolved independently from those of other animals (Cnidaria and Bilateria). They showed not only that ctenophores lack many Cnidaria+Bilateria-specific genes associated with muscles, the nervous system etc., but also that ctenophores have recruited a nearly entirely different set of genes for those purposes. So it seems unlikely that Ctenophora lost muscles and a nervous system specified by the genetic toolkit found in Cnidaria and Bilateria and then invented everything from scratch once again. More likely, Ctenophora and Cnidaria+Bilateria evolved phenotypic complexity independently from simpler ancestors. This does not mean that Ctenophora has to be the sister group to other animals. However, a phylogenetic arrangement where Ctenophora is sister to Cnidaria and these together (called Coelenterata) form the sister group of Bilateria (as found by Philippe et al. 2009) seems less likely now, because it might mean exactly such a loss and re-evolution of nerves and muscles in Ctenophora. Then again, compared to Bilateria, Cnidaria, and Porifera, contemporary ctenophores are genetically very closely related to each other (see Podar et al. 2001 and Extended Data Figure 3d in Moroz et al. 2014), meaning that they diverged from each other relatively recently, leaving a long stem probably going back to the Precambrian. A lot can happen along this long stem, so who knows...

Regardless of the deepest phylogenetic relationships between animals, independent evolution of muscles and the nervous system in ctenophores seems quite likely based on the results of Moroz et al. (2014). This led me to the realization that maybe the phylogenetic relationships between Porifera, Placozoa, Ctenophora, Cnidaria, and Bilateria actually do not matter much. These relationships tend to vary from study to study, and the internal branches uniting these groups in different combinations tend to be very short. As short branches suggest only a small amount of evolution, maybe nothing remarkable happened along those branches anyway, and most of the possible relationships between those main animal lineages are more or less equivalent? Then again, considering that these lineages diverged from each other more than 540 million years ago, difficulties in reconstructing their relationships can be expected. It is too early to give up. More sophisticated analyses might help to figure out the likely causes of the conflicting results (systematic errors or a true lack of phylogenetic signal).

References

Moroz LL, Kocot KM, Citarella MR, Dosung S, Norekian TP, Povolotskaya IS, Grigorenko AP, Dailey C, Berezikov E, Buckley KM, Ptitsyn A, Reshetov D, Mukherjee K, Moroz TP, Bobkova Y, Yu F, Kapitonov VV, Jurka J, Bobkov YV, Swore JJ, Girardo DO, Fodor A, Gusev F, Sanford R, Bruders R, Kittler E, Mills CE, Rast JP, Derelle R, Solovyev VV, Kondrashov FA, Swalla BJ, Sweedler JV, Rogaev EI, Halanych KM, Kohn AB (2014) The ctenophore genome and the evolutionary origins of neural systems. Nature 510: 109–114. doi: 10.1038/nature13400
Philippe H, Derelle R, Lopez P, Pick K, Borchiellini C, Boury-Esnault N, Vacelet J, Renard E, Houliston E, Quéinnec E, Da Silva C, Wincker P, Le Guyader H, Leys S, Jackson DJ, Schreiber F, Erpenbeck D, Morgenstern B, Wörheide G, Manuel M (2009) Phylogenomics revives traditional views on deep animal relationships. Current biology 19: 706–712. doi: 10.1016/j.cub.2009.02.052
Podar M, Haddock SH, Sogin ML, Harbison GR (2001) A molecular phylogenetic framework for the phylum Ctenophora using 18S rRNA genes. Molecular phylogenetics and evolution 21: 218–230. doi: 10.1006/mpev.2001.1036
Ryan JF, Pang K, Schnitzler CE, Nguyen A-D, Moreland RT, Simmons DK, Koch BJ, Francis WR, Havlak P, Smith SA, Putnam NH, Haddock SHD, Dunn CW, Wolfsberg TG, Mullikin JC, Martindale MQ, Baxevanis AD (2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342: 1242592. doi: 10.1126/science.1242592

2014-05-11

Phylogenetic position of Ctenophora

The first genome of Ctenophora has now been sequenced (Ryan et al. 2013), specifically that of Mnemiopsis leidyi. Genomes of most of the 30 or so animal phyla are still unavailable, but this might change in the coming years (Bracken-Grissom et al. 2014). I had great hopes of getting new insights into the phylogenetic position of ctenophores. Unfortunately, the Science paper was disappointing, mainly not because the phylogenetic position of Ctenophora remains unresolved, but because the authors want to give the impression that it has been resolved. The abstract of the paper is terribly misleading. Of course, it is more attractive to present the results as if one of the long-standing questions in animal evolution has been solved than to admit that, even after sequencing the whole genome, the answer is still not certain.

Ryan et al. want to make the case that ctenophores are the sister group to all other animals. This was suggested for the first time in 2008 (Dunn et al.), but it wasn't taken very seriously even by the authors, and a year later (Philippe et al. 2009) it was shown that the likely cause of such a strange placement of ctenophores was their fast molecular evolution. Taking this into account, Philippe et al. found that ctenophores most likely belong to Eumetazoa (animals with true tissues, muscles and a nervous system) and that the sister group to all other animals is Porifera (sponges, which do not have muscles or a nervous system), as always suspected. According to Philippe et al., ctenophores are most closely related to Cnidaria, which they superficially resemble and with which they were traditionally classified together as Coelenterata. The results of Philippe et al. can by no means be considered final, however, and reliably deciphering the relationships between the main lineages of animals (Porifera, Placozoa, Ctenophora, Cnidaria, and Bilateria) is still difficult (Nosenko et al. 2013).

So what evidence do Ryan et al. provide for such a surprising phylogenetic placement of ctenophores? The two main lines of evidence were phylogenetic analyses of protein sequences and of genome gene content (presence/absence of genes).

Despite the impression the authors are giving, their phylogenetic analyses are far from conclusive. They used two methods, maximum likelihood (ML) and Bayesian inference, to construct phylogenies based on protein sequences. These methods gave different results, most likely not because of the methodological differences themselves but because of the different evolutionary models employed.

In the ML framework, they used the standard GTR model, which is a site-homogeneous model. This means that the probabilities describing different amino acid (or nucleotide) replacements (termed the replacement matrix) do not vary along the sequence. Although the overall rate of replacement can vary among sites when a gamma rate parameter is introduced into the model, the relative probabilities of amino acid replacements remain the same. In reality, however, different regions of proteins not only evolve faster or slower, but also evolve qualitatively differently because of various constraints. Depending on the position in the protein, only some types of amino acids tend to be allowed (e.g. hydrophobic or aromatic). This makes it necessary to consider different amino acid replacement probabilities for different positions even when the rate of change is the same (models of this kind are called site-heterogeneous). Fortunately, there is no need to assign every position in the sequence its own replacement matrix (which would make the analyses computationally intractable); instead, positions can be grouped into a smaller number of categories. The Bayesian CAT model (Lartillot & Philippe 2004) estimates from the data the number of categories and which kinds of amino acid replacements describe each category best. As with the GTR model, site-heterogeneous models can be combined with a gamma rate parameter to vary the overall rate among sites, adding another layer of complexity (but also making the analysis computationally more demanding).
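To make the distinction above a bit more concrete, here is a minimal Python sketch. It is not the GTR or CAT implementation of any program: it assumes a toy four-letter alphabet instead of 20 amino acids and a simple F81-style process in which a replaced residue is redrawn from an equilibrium profile. All names (transition_probs, replacement_targets, the profiles) are made up for illustration. It shows that changing only the rate leaves the relative replacement probabilities untouched, whereas giving site categories their own profiles changes which replacements are likely.

```python
import math

# Toy 4-letter alphabet standing in for the 20 amino acids (an assumption
# made purely to keep the example small).
STATES = ["hydrophobic_1", "hydrophobic_2", "polar_1", "polar_2"]


def transition_probs(profile, rate, t):
    """F81-style transition matrix: with probability exp(-rate*t) nothing
    happens, otherwise the new state is drawn from the equilibrium profile."""
    decay = math.exp(-rate * t)
    n = len(profile)
    return [
        [(decay if i == j else 0.0) + (1.0 - decay) * profile[j] for j in range(n)]
        for i in range(n)
    ]


def replacement_targets(profile, rate, t, start):
    """Distribution over end states, given that the start state was replaced."""
    row = transition_probs(profile, rate, t)[start]
    changed = [p if j != start else 0.0 for j, p in enumerate(row)]
    total = sum(changed)
    return [p / total for p in changed]


# Site-homogeneous: one shared profile; only the rate differs between sites
# (this is what a gamma rate parameter adds).
shared_profile = [0.25, 0.25, 0.25, 0.25]
slow_site = replacement_targets(shared_profile, rate=0.2, t=1.0, start=0)
fast_site = replacement_targets(shared_profile, rate=3.0, t=1.0, start=0)

# Site-heterogeneous: same rate, but each site category has its own profile.
buried_profile = [0.45, 0.45, 0.05, 0.05]   # hypothetical "hydrophobic" category
exposed_profile = [0.05, 0.05, 0.45, 0.45]  # hypothetical "polar" category
buried_site = replacement_targets(buried_profile, rate=1.0, t=1.0, start=0)
exposed_site = replacement_targets(exposed_profile, rate=1.0, t=1.0, start=0)
```

Here slow_site and fast_site are identical although the rates differ 15-fold, while buried_site and exposed_site differ strongly at the same rate. A site-homogeneous model forces every position into the first situation.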

The ML analysis using the GTR+gamma model favored a tree where ctenophores were the sister group to all other animals. The Bayesian analyses with the CAT model favored either a tree where ctenophores were the sister group to Porifera (105 000 site dataset with little missing data, but small taxon sampling) or a tree where they were positioned within Eumetazoa (88 000 site dataset with a lot of missing data, but large taxon sampling).

The GTR model is less realistic and clearly more prone to long-branch attraction artefacts than CAT (Lartillot et al. 2007). Long-branch attraction causes fast-evolving (long-branch) taxa to group together regardless of their phylogenetic affinities, or pulls them towards distant outgroup taxa. As ctenophores appear to be fast evolving, at least at the molecular level (Philippe et al. 2009; Pett et al. 2011; Kohn et al. 2012), it cannot be excluded that the position of ctenophores in the ML analysis is caused by a long-branch attraction artefact. Ryan et al.'s results also show that the ctenophores are among the faster-evolving taxa in their dataset, but because the ctenophores were not extremely fast evolving, the authors thought that this was not a problem (it is unclear to me what gave them this confidence).

Unfortunately, their Bayesian analyses are not without problems either. The small-taxon analyses (where the Mnemiopsis+sponge clade was sister to other animals) were problematic precisely because of poor taxon sampling (15–19 taxa depending on the outgroup size). A large number of taxa is required (Lartillot & Philippe 2004) to reliably estimate the parameters of the CAT model and to decide between ancestral and derived character states. For the large-taxon datasets the problem appeared to be the opposite: they were too big to give reliable results even after the analyses had run for 200 days on average. This could perhaps have been solved by excluding some of the taxa (especially among the well-sampled Bilateria) and analyzing datasets containing, for example, a random 50% of the original positions (44 000 instead of 88 000). The PhyloBayes-MPI manual mentions that getting consistent results becomes challenging already beyond 20 000 positions.
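Drawing a random 50% of the positions is simple to do. The following Python sketch shows one way, assuming the alignment is held as a dictionary of equal-length sequences; the function name, the taxon labels and the short sequences are placeholders invented for illustration, not data from Ryan et al.

```python
import random


def subsample_columns(alignment, fraction=0.5, seed=0):
    """Keep a random fraction of alignment columns, the same ones for every taxon.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    Returns the reduced alignment and the sorted list of kept column indices.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    n_keep = int(n_sites * fraction)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    kept = sorted(rng.sample(range(n_sites), n_keep))
    reduced = {
        taxon: "".join(seq[i] for i in kept) for taxon, seq in alignment.items()
    }
    return reduced, kept


# Made-up toy alignment with placeholder taxon names.
toy_alignment = {
    "taxon_A": "MKTAYLLGAV",
    "taxon_B": "MKSAYLIGAV",
    "taxon_C": "MRTAFLLGSV",
}
halved, kept_columns = subsample_columns(toy_alignment, fraction=0.5, seed=42)
```

Running this with several different seeds would give independent half-size datasets, and agreement between the trees obtained from them would be one rough check on how stable the result is.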

Although the CAT model is not available in the ML framework, there are similar alternatives for ML, for example the structural and empirical mixture models containing 2–6 matrices (instead of just one) implemented in the PhyML programs (Le & Gascuel 2010; Le et al. 2012). Some of these models have already been used in studying ancient phylogenetic relationships and have been shown to affect the results (Lasek-Nesselquist & Gogarten 2013). It is a pity that Ryan et al. did not explore these models.

The second main line of evidence Ryan et al. gave regarding the phylogenetic position of ctenophores was gene content analyses. It appears that Mnemiopsis lacks many genes that are present in all other animals (including sponges) but not in outgroup species. Although the list of these missing genes for ctenophores as a whole is somewhat smaller (the authors themselves found that a few genes missing in Mnemiopsis were in fact present in some other ctenophore species), it probably remains quite large, as all ctenophores appear to be rather closely related to each other (Podar et al. 2001). As ctenophores evolve fast and the genome of Mnemiopsis is compact and among the smallest in animals (Ryan et al. 2013), it seems likely that the missing genes have been lost secondarily. Two features of ctenophore reproductive biology might explain their fast evolution: inbreeding caused by self-fertilization (almost all ctenophores are hermaphrodites) and the capability for rapid and massive reproduction. The latter can lead to frequent massive die-offs that create genetic bottlenecks, which facilitate the accumulation of deleterious mutations (Pett et al. 2011). It is also evident from Ryan et al.'s ML phylogenetic analyses of gene content that nonsensical phylogenetic relationships can be produced: for example, Annelida was not monophyletic, because one of its species grouped with a mollusk as the sister group to a cephalochordate.

In summary, Ryan et al.'s phylogenetic analyses aren’t particularly convincing. Claiming that the phylogenetic position of ctenophores is now resolved is annoying. Before we rearrange the animal tree of life, let's wait for more thorough analyses. And more data wouldn't hurt either.

References

Bracken-Grissom H, Collins AG, Collins T, Crandall K, Distel D, Dunn C, Giribet G, Haddock S, Knowlton N, Martindale M, Medina M, Messing C, O’Brien SJ, Paulay G, Putnam N, Ravasi T, Rouse GW, Ryan JF, Schulze A, Wörheide G, Adamska M, Bailly X, Breinholt J, Browne WE, Diaz MC, Evans N, Flot J-F, Fogarty N, Johnston M, Kamel B, Kawahara AY, Laberge T, Lavrov D, Michonneau F, Moroz LL, Oakley T, Osborne K, Pomponi SA, Rhodes A, Santos SR, Satoh N, Thacker RW, Van de Peer Y, Voolstra CR, Welch DM, Winston J, Zhou X (2014) The Global Invertebrate Genomics Alliance (GIGA): developing community resources to study diverse invertebrate genomes. The Journal of heredity 105: 1–18. doi: 10.1093/jhered/est084
Dunn CW, Hejnol A, Matus DQ, Pang K, Browne WE, Smith SA, Seaver E, Rouse GW, Obst M, Edgecombe GD, Sørensen MV, Haddock SHD, Schmidt-Rhaesa A, Okusu A, Kristensen RM, Wheeler WC, Martindale MQ, Giribet G (2008) Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452: 745–749. doi: 10.1038/nature06614
Kohn AB, Citarella MR, Kocot KM, Bobkova YV, Halanych KM, Moroz LL (2012) Rapid evolution of the compact and unusual mitochondrial genome in the ctenophore, Pleurobrachia bachei. Molecular phylogenetics and evolution 63: 203–207. doi: 10.1016/j.ympev.2011.12.009
Lartillot N, Philippe H (2004) A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Molecular biology and evolution 21: 1095–109. doi: 10.1093/molbev/msh112
Lartillot N, Brinkmann H, Philippe H (2007) Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC evolutionary biology 7 Suppl 1: S4. doi: 10.1186/1471-2148-7-S1-S4
Le SQ, Gascuel O (2010) Accounting for solvent accessibility and secondary structure in protein phylogenetics is clearly beneficial. Systematic biology 59: 277–87. doi: 10.1093/sysbio/syq002
Le SQ, Dang CC, Gascuel O (2012) Modeling protein evolution with several amino acid replacement matrices depending on site rates. Molecular biology and evolution 29: 2921–36. doi: 10.1093/molbev/mss112
Lasek-Nesselquist E, Gogarten JP (2013) The effects of model choice and mitigating bias on the ribosomal tree of life. Molecular phylogenetics and evolution 69: 17–38. doi: 10.1016/j.ympev.2013.05.006
Nosenko T, Schreiber F, Adamska M, Adamski M, Eitel M, Hammel J, Maldonado M, Müller WEG, Nickel M, Schierwater B, Vacelet J, Wiens M, Wörheide G (2013) Deep metazoan phylogeny: When different genes tell different stories. Molecular phylogenetics and evolution 67: 223–233. doi: 10.1016/j.ympev.2013.01.010
Pett W, Ryan JF, Pang K, Mullikin JC, Martindale MQ, Baxevanis AD, Lavrov DV (2011) Extreme mitochondrial evolution in the ctenophore Mnemiopsis leidyi: Insight from mtDNA and the nuclear genome. Mitochondrial DNA 22: 130–142. doi: 10.3109/19401736.2011.624611
Philippe H, Derelle R, Lopez P, Pick K, Borchiellini C, Boury-Esnault N, Vacelet J, Renard E, Houliston E, Quéinnec E, Da Silva C, Wincker P, Le Guyader H, Leys S, Jackson DJ, Schreiber F, Erpenbeck D, Morgenstern B, Wörheide G, Manuel M (2009) Phylogenomics revives traditional views on deep animal relationships. Current biology 19: 706–712. doi: 10.1016/j.cub.2009.02.052
Podar M, Haddock SH, Sogin ML, Harbison GR (2001) A molecular phylogenetic framework for the phylum Ctenophora using 18S rRNA genes. Molecular phylogenetics and evolution 21: 218–230. doi: 10.1006/mpev.2001.1036
Ryan JF, Pang K, Schnitzler CE, Nguyen A-D, Moreland RT, Simmons DK, Koch BJ, Francis WR, Havlak P, Smith SA, Putnam NH, Haddock SHD, Dunn CW, Wolfsberg TG, Mullikin JC, Martindale MQ, Baxevanis AD (2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342: 1242592. doi: 10.1126/science.1242592