Evolutionary Genomics & Bioinformatics BSc and MSc Thesis Projects - 2023

Offer of practicums, bachelor's theses (TFG) and master's theses (TFM) at the Faculty of Biology

Comparative and evolutionary analysis of multigene families in spider (and arthropod) genomes

Understanding the origin, evolutionary diversification and functional role of multigene families in eukaryotes is a central question in Evolutionary Biology. Although modern high-throughput sequencing (HTS) technologies are now accessible to many labs, the accurate identification and annotation of gene families remains one of the major challenges in the field. This scenario is changing thanks to the emergence of the so-called third-generation sequencing technologies (i.e., long-read sequencing). Currently, our research group generates genomic and transcriptomic data (mainly from spiders) to obtain chromosome-level assemblies using long-read DNA sequence data and Omni-C-based scaffolding.

The objective of this TFM is to perform a comparative genomic study of the molecular evolution of 1) the major gene families involved in the chemosensory system (olfactory and gustatory), 2) those encoding venoms and toxins, or 3) the repetitive elements (transposable elements or other types of repetitive sequences) in chelicerates and, by extension, in arthropods. Some of these families are involved in fundamental biological processes that are decisive for survival and reproduction, and contribute significantly to adaptation and specialization. Furthermore, some of the products encoded by these families are highly relevant in applied science (e.g., pest and disease control, biomedicine, biotic materials). For the analysis, we use comparative genomics and transcriptomics approaches, under the theoretical framework of molecular evolutionary genetics. We apply powerful bioinformatics tools, some of them developed in our research group, to identify the genomic regions and gene functions driving diversification, and to carry out the appropriate comparative genomic analyses (of coding and non-coding regions, gene copy number, and gene expression levels).
Tasks to be carried out by the student:

The student will participate in the identification, annotation and evolutionary analysis of different gene families in complete genomes. To do so, they will use high-quality genome sequences and bioinformatics tools (software and scripts to manipulate and visualize sequences and genomic annotations, and to identify gene family copies). The data will be analysed using phylogenetic and evolutionary genetic methods, under the theoretical framework of population genetics and molecular evolution. Many of these analyses will be carried out on our high-performance computing cluster.

Expected student skills:

Basic knowledge of comparative genomics and phylogenetics, and/or of NGS data handling, assembly and analysis. Experience with Linux operating systems and with some programming languages commonly used in bioinformatics (Perl, Python, R) is desirable.

Project supervisor:

Julio Rozas (jrozas@ub.edu)

See other projects on Alejandro Sánchez-Gracia's web page.



Software developed in the research group:

Publications of the EGB research group: