About
My two favorite words: Self-Development & Evolution
My mantra: "The biggest…
Activity
-
What a truly defining moment for Merck Group to welcome Jean-Charles Wirth and Danny Bar-Zohar to the Executive Board as our new CEOs of Merck Life…
Liked by Dr. Maria C. Dunford
-
Last day at EAE Business School where once a year I get to crack GTM strategies and startup creation with a lovely bunch of future entrepreneurs and…
Liked by Dr. Maria C. Dunford
-
We are delighted to announce that Parkwalk’s Karolina Zapadka, PhD MBA has been promoted to Investment Director! Since joining in 2021, Karolina has…
Liked by Dr. Maria C. Dunford
Experience
Education
-
Universitat Pompeu Fabra
-
One of the main and most recent challenges of modern biology is to keep up with the growing amount of biological data coming from next-generation sequencing technologies and to extract actionable biomedical insights from it. Large-scale comparative bioinformatics analyses are an integral part of this procedure. When doing comparative bioinformatics, multiple sequence alignments (MSAs) are by far the most widely used models. In this PhD thesis I discuss the current relevance of multiple sequence aligners, and I show how their current scaling up is leading to serious numerical stability issues and how these issues impact phylogenetic tree reconstruction. For this purpose, I have developed two new methods: MEGA-Coffee, a large-scale aligner, and Shootstrap, a novel bootstrapping measure. To improve the computational efficiency and reproducibility of large-scale analyses like the ones carried out in the context of these studies, I co-developed a new computational framework, Nextflow.
-
-
-
-
Thesis: “Isoelectric point estimation of peptides with post-translational modifications”
Supervisor: Assistant Professor Lukas Käll
-
-
Thesis: “Markov Models in protein sequence analysis”
Supervisor: Professor Pantelis G. Bagos
-
-
Publications
-
Nextflow enables reproducible computational workflows
Nature biotechnology
The increasing complexity of readouts for omics analyses goes hand-in-hand with concerns about the reproducibility of experiments that analyze 'big data'. When analyzing very large data sets, the main source of computational irreproducibility arises from a lack of good practice pertaining to software and database usage. Small variations across computational platforms also contribute to computational irreproducibility by producing numerical instability, which is especially relevant to high-performance computational (HPC) environments that are routinely used for omics analyses. We present a solution to this instability named Nextflow, a workflow management system that uses Docker technology for the multi-scale handling of containerized computation.
-
PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.
Nucleic Acids Res.
The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency-based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is that it allows databases of reduced complexity to be used to rapidly perform homology extension. The server also makes it possible to use reference databases of transmembrane proteins (TMPs) to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs a topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown that this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee.
-
Multiple sequence alignment modeling: methods and applications.
Briefings in Bioinformatics
This review provides an overview of the development of multiple sequence alignment (MSA) methods and their main applications. It focuses on progress made over the past decade. The first three sections review recent algorithmic developments for protein, RNA/DNA and genomic alignments. The fourth section deals with benchmarks and explores the relationship between empirical and simulated data, along with the impact on method development. The last part of the review gives an overview of available MSA local reliability estimators and their dependence on various algorithmic properties of available methods.
-
The impact of Docker containers on the performance of genomic pipelines.
PeerJ
Genomic pipelines consist of several pieces of third-party software and, because of their experimental nature, frequent changes and updates are commonly necessary, raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these problems, as they allow the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus, the question that arises is to what extent the use of Docker containers might affect the performance of these pipelines. Here we address this question and conclude that Docker containers have only a minor impact on the performance of common genomic pipelines, which is negligible when the executed jobs are long in terms of computational time.
Keywords: Bioinformatics; Docker; Pipelines; Virtualisation; Workflow
-
SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments.
Nucleic Acids Research
This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D-structure-based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee.
doi: 10.1093/nar/gku459
-
Chromatographic retention time prediction for posttranslationally modified peptides
PROTEOMICS
Keywords: Bioinformatics; Machine learning; Posttranslational modification; Retention time prediction; Reversed-phase liquid chromatography
Retention time prediction of peptides in liquid chromatography has proven to be a valuable tool for mass spectrometry-based proteomics, especially in designing more efficient procedures for state-of-the-art targeted workflows. Additionally, accurate retention time predictions can also be used to increase confidence in identifications in shotgun experiments. Despite these obvious benefits, the use of such methods has so far not been extended to (posttranslationally) modified peptides due to the absence of efficient predictors for such peptides. We therefore describe here a new retention time predictor for modified peptides, built on the foundations of our existing Elude algorithm. We evaluated our software by applying it to five types of commonly encountered modifications. Our results show that Elude now yields equally good prediction performance for modified and unmodified peptides, with correlation coefficients between predicted and observed retention times ranging from 0.93 to 0.98 for all the investigated datasets. Furthermore, we show that our predictor handles peptides carrying multiple modifications as well. This latest version of Elude is fully portable to new chromatographic conditions and can readily be applied to other types of posttranslational modifications. Elude is available under the permissive Apache2 open source License at http://per-colator.com or can be run via a web-interface at http://elude.sbc.su.se.
Projects
-
Nextflow
- Present
Nextflow is a fluent DSL modelled around the UNIX pipe concept that simplifies writing parallel and scalable pipelines in a portable manner. You can use your favourite programming language and tools, exploiting your current skills.
Nextflow's mission is to facilitate the computation and analysis of Big Data, with special emphasis on "Big BioMedical" Data.
Used:
* Groovy
* Java
-
T-Coffee
- Present
T-Coffee is one of the first bioinformatics tools created for computing Multiple Sequence Alignments (MSAs). My job is to redesign T-Coffee so that it scales up to handle and deliver MSAs of hundreds of thousands of biological sequences (up to 1 million).
Used:
* C
* C++
-
Elude
-
Implementation of a web server called Elude, a bioinformatics tool used for peptide retention time prediction.
Honors & Awards
-
BIGDATA TALENT AWARDS
ORACLE
My PhD thesis won the 2016 Big Data Talent Awards.
-
“La Caixa” International PhD Programme Fellowships
"La Caixa" Foundation
I was selected and awarded this very competitive fellowship to pursue my PhD in the field of life sciences.
-
Best poster award
Hellenic Society for Computational Biology and Bioinformatics
Best poster award at the Hellenic Society for Computational Biology and Bioinformatics 2010 Conference for the poster entitled “Mixture Transition Distribution (MTD) Markov models: Statistical modeling and prediction of protein families”.
-
Scholarship Award
The State Scholarships Foundation-ΙKY, Greece
Received a scholarship award from the State Scholarships Foundation of Greece for the academic year 2005-2006 for being the best student of my year.
Languages
-
Greek
Native or bilingual proficiency
-
English
Professional working proficiency
-
Spanish
Limited working proficiency
More activity by Dr. Maria
-
Quite proud to share that sociaw has just launched the brand new social media channels for Sentinel Online, under the European Space Agency - ESA y…
Liked by Dr. Maria C. Dunford
-
Our first knowledge graph in collaboration with The Francis Crick Institute and GSK using Neo4j is taking shape! Connecting 900.000 nodes with…
Liked by Dr. Maria C. Dunford
-
Our team at Google DeepMind and Google Research are taking a major step to accelerate therapeutic innovation with the wider research community. We're…
Liked by Dr. Maria C. Dunford
-
A week ago, I was awarded the 2025 Excellence Award for Senior Researcher in Informatics and Engineering at AUTH, among many other awardees in other…
Liked by Dr. Maria C. Dunford
-
More amazing news for Lifebit! Kudos to all of the team and of course to Dr. Maria C. Dunford! #Lifebit #StarttechVentures
Liked by Dr. Maria C. Dunford
-
We're excited to share our latest work, published in Nature Partner Journals Genomic Medicine! 🧬 This study highlights how we used Exomiser for…
Liked by Dr. Maria C. Dunford
-
AWSome partnership Thorben, looking to support Cancer Research Horizons in this extremely important endeavor.
Liked by Dr. Maria C. Dunford
-
💡 Sharing the latest tools in Life Science that are gonna make a wave! As always, excellent work from Frédéric Célerse, PhD and Mark Davies and…
Liked by Dr. Maria C. Dunford
-
Today I’m very proud to introduce my new consultancy: Amanda White Communications. My aim is to support organisations to communicate with…
Liked by Dr. Maria C. Dunford
-
And that’s a wrap! I had a wonderful time at Festival of Genomics 2025! Really enjoyed all the talks and discussions as well as the opportunities…
Liked by Dr. Maria C. Dunford
-
Welcome onboard to Lifebit to the wonderful team at Psifas Initiative for Precision Medicine
Liked by Dr. Maria C. Dunford