Friday, 06 April 2012 at 10:11 am

GFF and GTF are data formats heavily used for storing annotation information. It's common to see these two formats used interchangeably. However, GFF (General Feature Format) is meant to be used for any genomic feature, while GTF (Gene Transfer Format) is used strictly for genes.

The two formats are very similar, which makes conversion fairly simple. The only problem arises when the lines of the GFF file are not arranged in feature blocks. This entry will show you the differences between the two formats and how to convert between them.
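As a rough illustration of how close the two formats are, here is a minimal Python sketch that rewrites only the ninth (attribute) column, which is where GFF3 (`key=value;key=value`) and GTF (`key "value"; key "value";`) differ most visibly. The function names are hypothetical, and the sketch makes some assumptions: it does not rename keys (a valid GTF line also needs `gene_id` and `transcript_id` attributes), and it does not reorder lines into feature blocks.

```python
import re


def gff3_attrs_to_gtf(attr_field):
    """Rewrite 'key=value;key=value' (GFF3 style) as 'key "value"; key "value";' (GTF style)."""
    pairs = []
    for item in attr_field.strip().split(";"):
        item = item.strip()
        if not item:
            continue  # skip empty pieces left by a trailing ';'
        key, _, value = item.partition("=")
        pairs.append(f'{key.strip()} "{value.strip()}";')
    return " ".join(pairs)


def gtf_attrs_to_gff3(attr_field):
    """Rewrite 'key "value"; key "value";' (GTF style) as 'key=value;key=value' (GFF3 style)."""
    pairs = re.findall(r'(\S+)\s+"([^"]*)"', attr_field)
    return ";".join(f"{key}={value}" for key, value in pairs)


def convert_line(line, attr_converter):
    """Apply attr_converter to column 9 of a tab-delimited annotation line.

    Comment lines, blank lines, and lines without exactly nine columns
    are returned unchanged (minus the trailing newline).
    """
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return line
    cols = line.split("\t")
    if len(cols) != 9:
        return line
    cols[8] = attr_converter(cols[8])
    return "\t".join(cols)
```

For example, `convert_line(line, gff3_attrs_to_gtf)` can be mapped over the lines of a GFF3 file; a full converter would still need to map `ID`/`Parent` onto the GTF gene and transcript identifiers.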

  Sunday, 01 April 2012 at 11:18 am

In part 1 of this series of posts, I showed you how to draw a simple bar graph. In this post, I will show you how to use Chrome's developer tools to better debug your code. The JavaScript console in the developer tools is an extremely powerful resource that lets you access and run JavaScript within the browser. You can even run all your code in the console to render a figure if you choose to.

To turn on the developer tools in Chrome, click the wrench icon in the upper right-hand corner of the browser window -> tools -> developer tools. Your browser will now have a new pane at the bottom showing the developer tools. The two most useful features in the developer tools are the 'Elements' and 'Console' tabs.

  Thursday, 22 March 2012 at 2:41 pm

View the demo here
HTML source is at the bottom of the post

Computers and the internet have changed academia in dramatic ways, from greater sharing of data to a larger sense of community. Science journals are now all digitized and available online, either through your web browser or downloadable as a .pdf.

Even with all the technology available for presenting data, most published papers still only contain static figures. I am not undervaluing the importance of having nicely formatted figures and graphs. But I do want to show how data can be presented with all the tools available now. 

Science papers are generally viewed on a computer through a web browser like Chrome, Firefox, or Safari, which use JavaScript/HTML/CSS for displaying information. Therefore, browser languages are ideal for ensuring accessibility of your data. JavaScript is often touted as the most prevalent programming language in the world, since every computer has a browser and most browsers can interpret JavaScript.

Here are a bunch of examples of interactive figures made using browser technologies, specifically D3.js.

  Wednesday, 21 March 2012 at 04:08 am

Gene Ontology (GO) is a controlled vocabulary of terms commonly used among biologists to describe genes. Using a strict set of classifiers allows computational biologists to analyze qualitative data in a quantitative manner.

GO is structured as a directed acyclic graph. Basically, it is a hierarchy of terms in which a child term can never be the parent of one of its own ancestors, so following parent links never loops back. It is important to note that parents can have multiple children and children can have multiple parents.

GO terms can be accessed via the AmiGO web interface, downloaded and installed as a SQL database, or downloaded as a flat file (.obo). There are also plenty of software packages available.

One of the most common operations in GO analysis is finding the ancestors or descendants of a specific term. AmiGO is great for looking up details for a single term, but it doesn't display information in an easily parsable way. SQL queries can be made to a database to get this information, but in cases where there is no database available and no pre-installed packages, you can use the .obo flat file to get this information.
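To give a sense of what working from the flat file looks like, here is a minimal Python sketch, not a full .obo parser. It assumes that only `[Term]` stanzas and their `id:` and `is_a:` lines matter, so other relationships (such as `part_of`) and obsolete terms are ignored. The function names and the `EX:` identifiers in the toy data are made up for illustration and are not real GO terms.

```python
def parse_obo_parents(lines):
    """Map each term id in [Term] stanzas to the set of its direct is_a parents."""
    parents = {}
    term_id = None
    in_term = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are of interest here.
            in_term = (line == "[Term]")
            term_id = None
        elif in_term and line.startswith("id:"):
            term_id = line[len("id:"):].strip()
            parents.setdefault(term_id, set())
        elif in_term and term_id and line.startswith("is_a:"):
            # Typical form: "is_a: <parent id> ! <parent name>"
            parent = line[len("is_a:"):].split("!")[0].strip()
            parents[term_id].add(parent)
    return parents


def ancestors(term, parents):
    """Return every term reachable from `term` by following is_a links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def descendants(term, parents):
    """Return every term that has `term` among its ancestors."""
    children = {}
    for child, direct_parents in parents.items():
        for parent in direct_parents:
            children.setdefault(parent, set()).add(child)
    return ancestors(term, children)  # same traversal, edges reversed


# Toy data with hypothetical identifiers: EX:0004 has two parents.
TOY_OBO = """\
[Term]
id: EX:0001
name: root

[Term]
id: EX:0002
is_a: EX:0001 ! root

[Term]
id: EX:0003
is_a: EX:0001 ! root

[Term]
id: EX:0004
is_a: EX:0002 ! left branch
is_a: EX:0003 ! right branch
""".splitlines()

toy_parents = parse_obo_parents(TOY_OBO)
print(sorted(ancestors("EX:0004", toy_parents)))
print(sorted(descendants("EX:0001", toy_parents)))
```

With a real ontology file, the same functions could be fed the open file handle, since `parse_obo_parents` only iterates over lines.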

  Tuesday, 20 March 2012 at 11:55 am

Most gene enrichment websites out there only allow you to find enrichments for popular model organisms using pre-established gene ontology annotations. I ran into this problem early on during my PhD, when I had to generate enrichment data for Schmidtea mediterranea.