Monday, 30 June 2014 at 1:15 pm
by Damian Kao
Trinity is a popular transcriptome assembler developed at the Broad Institute. It consists of three main programs (Inchworm, Chrysalis, and Butterfly) that process raw reads and assemble them into a transcriptome.
In the Chrysalis step, contigs are bundled together based on k-mer overlap and paired-end read information. Each bundle of contigs, which Trinity calls a "component", is then represented as a de Bruijn graph. Butterfly traverses this graph to find paths, each of which represents a possible transcript.
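To make the idea concrete, here is a minimal sketch of a de Bruijn graph built from two toy contigs, with every root-to-tip path spelled back out as a candidate sequence. This is not Trinity's code: the function names (`build_de_bruijn`, `enumerate_paths`, `spell`), the toy sequences, and the tiny k of 4 are all assumptions chosen for illustration, and the path search is a plain depth-first walk rather than Butterfly's read-supported traversal.

```python
def build_de_bruijn(sequences, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, each k-mer is an edge."""
    graph = {}
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            prefix, suffix = kmer[:-1], kmer[1:]
            graph.setdefault(prefix, set()).add(suffix)
            graph.setdefault(suffix, set())  # make sure tips are nodes too
    return graph


def enumerate_paths(graph, root):
    """Return every simple path from root to a node with no unvisited successor."""
    paths = []
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        successors = [n for n in sorted(graph[node]) if n not in path]
        if not successors:
            paths.append(path)
        for nxt in successors:
            stack.append((nxt, path + [nxt]))
    return paths


def spell(path):
    """Turn a path of overlapping (k-1)-mers back into a sequence."""
    return path[0] + "".join(node[-1] for node in path[1:])


# Two hypothetical contigs that share both ends but differ in the middle.
contigs = ["ATGCAGTTC", "ATGCTGTTC"]
graph = build_de_bruijn(contigs, k=4)
candidates = sorted(spell(p) for p in enumerate_paths(graph, "ATG"))
print(candidates)  # ['ATGCAGTTC', 'ATGCTGTTC']
```

The two contigs share their first and last nodes, so the graph forms a single "bubble" and the traversal recovers one path per branch. A real component has far more nodes and uses much longer k-mers.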
Visualizing the de Bruijn graph can be very informative. Here is my IPython notebook for rendering the de Bruijn graph of Trinity components: notebook
The notebook contains two functions that can:
- Render the graph as a simplified network of essential nodes and highlight all probable paths as described by Butterfly.
- Render all nodes of the graph as green circles with the root node in red.
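The two views can be sketched without the notebook. The code below is not the notebook's code, just a rough stand-in under some assumptions: an "essential" node is taken to mean the root or any node whose in-degree or out-degree is not 1, the graph is a hard-coded toy, and the output is Graphviz DOT text rather than a rendered figure. The names `essential_nodes`, `collapse_chains` and `to_dot` are made up for this example, and highlighting the Butterfly paths is left out.

```python
def essential_nodes(graph, root):
    """Root, tips, and branch/merge points (in- or out-degree other than 1)."""
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for nxt in successors:
            indegree[nxt] += 1
    return {n for n in graph
            if n == root or indegree[n] != 1 or len(graph[n]) != 1}


def collapse_chains(graph, root):
    """Replace each linear chain with one edge; record how many nodes it hid."""
    keep = essential_nodes(graph, root)
    edges = []
    for node in sorted(keep):
        for nxt in sorted(graph[node]):
            hidden = 0
            while nxt not in keep:
                (nxt,) = graph[nxt]  # a non-essential node has one successor
                hidden += 1
            edges.append((node, nxt, hidden))
    return keep, edges


def to_dot(nodes, edges, root):
    """Emit DOT text: green filled circles, with the root node in red."""
    lines = ["digraph component {",
             "  node [shape=circle, style=filled, fillcolor=green];",
             f'  "{root}" [fillcolor=red];']
    lines += [f'  "{n}";' for n in sorted(nodes) if n != root]
    for src, dst, hidden in edges:
        label = f' [label="{hidden} hidden"]' if hidden else ""
        lines.append(f'  "{src}" -> "{dst}"{label};')
    lines.append("}")
    return "\n".join(lines)


# Toy graph with a single bubble between TGC and GTT (hypothetical data).
toy = {
    "ATG": {"TGC"}, "TGC": {"GCA", "GCT"},
    "GCA": {"CAG"}, "CAG": {"AGT"}, "AGT": {"GTT"},
    "GCT": {"CTG"}, "CTG": {"TGT"}, "TGT": {"GTT"},
    "GTT": {"TTC"}, "TTC": set(),
}

# View 1: simplified network of essential nodes only.
keep, simple_edges = collapse_chains(toy, "ATG")
simplified_dot = to_dot(keep, simple_edges, "ATG")

# View 2: every node as a green circle, root in red.
all_edges = [(s, d, 0) for s in sorted(toy) for d in sorted(toy[s])]
full_dot = to_dot(toy, all_edges, "ATG")
```

Either string can be saved to a file and rendered with Graphviz. The collapsed edges are kept as a list rather than a set so that the two branches of the bubble stay visible as two parallel edges.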