Sunday, June 14, 2020

Week #2: Debugging GFAtoVCF

As the title suggests, I spent most of this week debugging my Rust version of Flavia95's GFAtoVCF. Recall that GFAtoVCF is a script that detects simple bubbles in a Variation Graph, as opposed to superbubbles (which can be nested) and ultrabubbles (which can contain loops).

The main topic of the week was the DFS (Depth First Search), or rather the spanning tree that such an algorithm produces. Consider the following image:


[Image: the DFS spanning trees produced by the two versions of GFAtoVCF]


In blue you can see the spanning tree produced by my Rust version of GFAtoVCF, while in red you can see (part of) the one produced by Flavia's. As you can see, nodes 9, 10 and 11 are placed differently with respect to each other. Which one is correct, then?

Well, it turns out that both are correct: a graph can have more than one valid DFS spanning tree, depending on the order in which edges are explored. The difference is probably due to the different underlying implementations of the Handlegraph model (the Python version uses ODGI, while I'm using rs-handlegraph), so that when the DFS has to choose a new edge, a different one gets chosen.
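To make this concrete, here is a minimal sketch of how edge order alone changes the spanning tree. It does not use rs-handlegraph or ODGI: the graph is a plain adjacency list, and the names `Graph`, `dfs_spanning_tree` and `toy_graph` are made up for this example. The edges between nodes 9, 10 and 11 are also invented for illustration, they are not the ones in the real graph.

```rust
use std::collections::{BTreeMap, HashSet};

/// Adjacency list: node id -> successors, in the order the graph backend reports them.
/// (Hypothetical stand-in for a real Handlegraph implementation.)
type Graph = BTreeMap<u32, Vec<u32>>;

/// Returns the tree edges (parent, child) of the DFS spanning tree rooted at `root`,
/// in the order in which the DFS discovers them.
fn dfs_spanning_tree(graph: &Graph, root: u32) -> Vec<(u32, u32)> {
    let mut visited = HashSet::new();
    let mut tree = Vec::new();
    visit(graph, root, &mut visited, &mut tree);
    tree
}

fn visit(graph: &Graph, node: u32, visited: &mut HashSet<u32>, tree: &mut Vec<(u32, u32)>) {
    visited.insert(node);
    if let Some(successors) = graph.get(&node) {
        for &next in successors {
            // An edge becomes a tree edge only if it leads to an unvisited node,
            // so whichever successor is listed first "wins".
            if !visited.contains(&next) {
                tree.push((node, next));
                visit(graph, next, visited, tree);
            }
        }
    }
}

/// Toy graph with invented edges 9->10, 9->11 and 10->11.
/// With `reverse_order` set, every successor list is reversed, imitating a
/// backend that stores the same edges in a different order.
fn toy_graph(reverse_order: bool) -> Graph {
    let mut g = Graph::new();
    g.insert(9, vec![10, 11]);
    g.insert(10, vec![11]);
    g.insert(11, vec![]);
    if reverse_order {
        for successors in g.values_mut() {
            successors.reverse();
        }
    }
    g
}
```

Calling `dfs_spanning_tree(&toy_graph(false), 9)` reaches node 11 through node 10, while `dfs_spanning_tree(&toy_graph(true), 9)` hangs both 10 and 11 directly under 9. Same graph, same algorithm, two different (and equally valid) spanning trees.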

I am currently trying to fix the last few issues in GFAtoVCF. By next week, I hope I'll be able to show you the resulting VCF!
