Comparative and evolutionary analysis of multigene families in spider (and arthropod) genomes
Understanding the origin, evolutionary diversification and functional role of multigene families in eukaryotes is a central question in evolutionary biology.
Although modern high-throughput sequencing (HTS) technologies are now accessible to many labs, the accurate identification and annotation of gene families remains one of the major challenges in the field.
This scenario is changing thanks to the emergence of so-called third-generation sequencing technologies (i.e., long-read sequencing).
Currently, our research group generates genomic and transcriptomic data (mainly from spiders) to obtain chromosome-level assemblies using long-read DNA sequence data and Omni-C-based scaffolding.
The objective of this TFM is to perform a comparative genomic study of the molecular evolution
of 1) the major gene families involved in the chemosensory system (olfactory and gustatory),
or 2) those encoding venoms and toxins,
or 3) the repetitive elements (transposable or other types of repetitive sequences) in chelicerates and, by extension, in arthropods.
Indeed, some of these families are involved in fundamental biological processes that are decisive for survival and reproduction and contribute significantly to adaptation and specialization.
Furthermore, some of the products encoded by these families are highly relevant in applied science (e.g., pest and disease control, biomedicine, biotic materials).
For the analysis, we are using comparative genomics and transcriptomics approaches, under the theoretical framework of molecular evolutionary genetics.
We apply powerful bioinformatics tools, some of them developed in our research group,
to identify the genomic regions and gene functions driving diversification, and to
carry out the appropriate comparative genomic analyses (in coding and non-coding regions, in gene copy number, and in gene expression levels).
Tasks to be carried out by the student:
Expected student skills:
Basic knowledge of comparative genomics and phylogenetics, and/or of NGS data handling, assembly and analysis. Experience with Linux operating systems and with some programming languages commonly used in bioinformatics (Perl, Python, R) is desirable.
Project supervisor:
Julio Rozas (jrozas@ub.edu)
Software developed in the research group:
Publications of the EGB research group: