Paper Title

Protein-to-genome alignment with miniprot

Author

Li, Heng

Abstract

Motivation: Protein-to-genome alignment is critical to annotating genes in non-model organisms. While there are a few tools for this purpose, all of them were developed over ten years ago and did not incorporate the latest advances in alignment algorithms. They are inefficient and could not keep up with the rapid production of new genomes and quickly growing protein databases.

Results: Here we describe miniprot, a new aligner for mapping protein sequences to a complete genome. Miniprot integrates recent techniques such as k-mer sketch and SIMD-based dynamic programming. It is tens of times faster than existing tools while achieving comparable accuracy on real data.

Availability and implementation: https://github.com/lh3/miniprot
