Sunday, June 7, 2020

Week #1: Re-implementing GFAtoVCF in Rust

As part of my first GSoC week, I was asked to re-implement Flavia95's GFAtoVCF in Rust. This has two objectives:
  1. Getting me familiar with Rust (a language I had no previous experience in)
  2. Getting me familiar with variation graph concepts, and with the Handlegraph API (which also has a Rust implementation)
From what I understand, GFA (which stands for Graphical Fragment Assembly) is a format for storing the results of an assembly in the form of a graph; it can also be used to represent a pangenome. VCF (which stands for Variant Call Format) is instead a format for storing the variants found with respect to a given reference.
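To make the GFA side a bit more concrete, here is a minimal sketch of reading the two GFA 1 record types that matter for this task: S lines (segments, i.e. nodes with their sequences) and P lines (paths, i.e. ordered lists of oriented segments). The function name parse_gfa is my own invention, and this is neither Flavia's code nor the Handlegraph API. It also drops the +/- orientation of each step, which a real converter could not do.

```rust
use std::collections::HashMap;

/// Hypothetical helper: reads only the S (segment) and P (path) records of a GFA 1 text.
/// Returns (segment name -> sequence, path name -> ordered segment names).
/// Simplification: the '+'/'-' orientation of each path step is discarded.
pub fn parse_gfa(text: &str) -> (HashMap<String, String>, HashMap<String, Vec<String>>) {
    let mut segments: HashMap<String, String> = HashMap::new();
    let mut paths: HashMap<String, Vec<String>> = HashMap::new();

    for line in text.lines() {
        // GFA fields are tab-separated; the first field is the record type.
        let fields: Vec<&str> = line.split('\t').collect();
        match fields.as_slice() {
            ["S", name, seq, ..] => {
                segments.insert(name.to_string(), seq.to_string());
            }
            ["P", name, steps, ..] => {
                // Steps look like "1+,2+,4+": strip the orientation sign from each one.
                let nodes: Vec<String> = steps
                    .split(',')
                    .map(|s| s.trim_end_matches(|c: char| c == '+' || c == '-').to_string())
                    .collect();
                paths.insert(name.to_string(), nodes);
            }
            // Headers, links and anything else are ignored in this sketch.
            _ => {}
        }
    }
    (segments, paths)
}
```

Calling parse_gfa on a file's contents gives two maps that are enough to spell out the sequence of any path, which is all the conversion described next needs as input.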

So, in order to convert from GFA to VCF, I have to:
  1. Consider one (or more) paths as a reference
  2. For each reference, explore all other possible paths
  3. When a different node is reached, consider its sequence as a variant (a "bubble")
Looks simple enough! However, Rust has a few quirks (such as ownership and closures) that make it somewhat difficult to work with, at least for a beginner.
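The three steps above can be sketched on a toy graph as follows. This is only my rough understanding, not Flavia's algorithm and not code that uses rs-handlegraph: ToyGraph, Variant and find_variants are names I made up for illustration, paths are plain lists of node ids, and every other path is simply walked side by side with the reference. When the two walks disagree, the nodes up to the point where they rejoin are reported as a variant.

```rust
use std::collections::HashMap;

/// Toy stand-in for a variation graph (hypothetical, not the rs-handlegraph types).
pub struct ToyGraph {
    /// Node id -> nucleotide sequence.
    pub nodes: HashMap<u64, String>,
    /// Path name -> ordered node ids.
    pub paths: HashMap<String, Vec<u64>>,
}

/// One reported difference between the reference path and another path.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Variant {
    /// 1-based offset on the reference where the two paths diverge.
    pub pos: usize,
    pub reference: String,
    pub alternate: String,
}

/// Concatenates the sequences of the given nodes.
fn spell(graph: &ToyGraph, ids: &[u64]) -> String {
    ids.iter().map(|id| graph.nodes[id].as_str()).collect()
}

/// Compares every non-reference path against `ref_name` and collects the bubbles.
/// Simplification: insertions and deletions yield an empty allele here, whereas
/// real VCF records would need an anchoring base.
pub fn find_variants(graph: &ToyGraph, ref_name: &str) -> Vec<Variant> {
    let reference = &graph.paths[ref_name];
    let mut variants = Vec::new();

    for (name, alt) in &graph.paths {
        if name.as_str() == ref_name {
            continue;
        }
        let (mut i, mut j) = (0, 0);
        let mut offset = 1; // 1-based position on the reference
        while i < reference.len() && j < alt.len() {
            if reference[i] == alt[j] {
                // Same node on both paths: just advance along the reference.
                offset += graph.nodes[&reference[i]].len();
                i += 1;
                j += 1;
                continue;
            }
            // The paths diverge: find the first node of `alt` that is back on the reference.
            let rejoin = (j..alt.len()).find_map(|j2| {
                reference[i..]
                    .iter()
                    .position(|&n| n == alt[j2])
                    .map(|k| (i + k, j2))
            });
            let (i2, j2) = match rejoin {
                Some(pair) => pair,
                None => break, // the paths never rejoin; ignored in this sketch
            };
            let ref_allele = spell(graph, &reference[i..i2]);
            let alt_allele = spell(graph, &alt[j..j2]);
            let ref_len = ref_allele.len();
            variants.push(Variant { pos: offset, reference: ref_allele, alternate: alt_allele });
            offset += ref_len;
            i = i2;
            j = j2;
        }
    }
    // Several paths may report the same bubble; keep each variant once.
    variants.sort();
    variants.dedup();
    variants
}
```

For a graph with a reference path 1, 2, 4 and another path 1, 3, 4, the only disagreement is node 2 versus node 3, so the sequence of node 3 is reported as the alternate allele at the position where node 2 starts on the reference.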

With that in mind, I started writing Rust code by trying to stay as close as possible to Flavia's code. Unfortunately, since Python and Rust are very different languages, that didn't go as well as expected. For one, it was very hard to tell which variable types I had to use (Python is dynamically typed, so the types are never written out in the code); for another, I ran into many issues when trying to use closures/lambdas (something I had rarely done before).

Another thing I did not expect is that some methods used in Flavia's code are missing from Handlegraph's Rust implementation (specifically for_each_path_handle and for_each_step_in_path). So, I created my own fork of rs-handlegraph, but I'm still trying to figure out how to implement them correctly.
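To give an idea of what I am aiming for, here is a sketch of how two such closure-taking methods could look on a toy type. Everything in it is an assumption on my part: ToyPaths is a made-up struct, paths are identified by a plain index instead of a real path handle, and the signatures (including the convention that the callback returns false to stop early) are not the actual ones from Handlegraph or rs-handlegraph.

```rust
/// Toy container of paths (hypothetical, not an rs-handlegraph type).
pub struct ToyPaths {
    /// (path name, ordered node ids)
    pub paths: Vec<(String, Vec<u64>)>,
}

impl ToyPaths {
    /// Calls `f` with the index of every path, in order.
    /// Stops early and returns false as soon as `f` returns false.
    pub fn for_each_path_handle<F>(&self, mut f: F) -> bool
    where
        F: FnMut(usize) -> bool,
    {
        for idx in 0..self.paths.len() {
            if !f(idx) {
                return false;
            }
        }
        true
    }

    /// Calls `f` with every node id visited by the path at index `path`, in order.
    /// Stops early and returns false as soon as `f` returns false.
    pub fn for_each_step_in_path<F>(&self, path: usize, mut f: F) -> bool
    where
        F: FnMut(u64) -> bool,
    {
        for &node in &self.paths[path].1 {
            if !f(node) {
                return false;
            }
        }
        true
    }
}
```

The callbacks are declared as FnMut rather than Fn so that a closure can modify the variables it captures, for example pushing every visited node into a Vec. That is exactly the pattern the Python code relies on, and the one that kept running into the borrow checker for me.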

Currently, I still have to fix a few of the issues described above. Once I solve them, I'll start running comparisons between my program and Flavia's, to make sure they produce the same output.
