Wednesday, 28 March 2012 at 2 pm

View the demo here
Download source code here

There are many ways of displaying gene ontology (GO) information. The most common way is to use software like Cytoscape to generate a graph diagram of GO with terms of interest highlighted. However, it is much more interesting to graph your data in an interactive way.

Here is a demo graphing only a small subset of GO. The demo starts with the root GO term, biological process, and lets you click on child terms (in orange) to navigate down the GO graph. It is a force-directed graph, so you can also drag the nodes around.

The two links at the top (fix and unfix) turn the force physics off and on. There are also two sample GO term links that let you jump to the specified GO term.

Since the entire JSON-formatted data for GO (biological process only) is around 4 MB, this demo does not contain all possible GO terms. You can download a version with all GO terms here.

The graph was rendered using D3.js. The data was parsed and formatted with Python. I am not going to go into great detail about the source code, as there are too many things to cover. I'll just talk generally about the steps in creating this visualization.
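To give a rough idea of the formatting step, here is a minimal sketch of turning parsed terms into JSON for a force-directed layout. The exact layout used by the demo is not described here, so the "nodes"/"links" shape, the function name to_force_json, and the made-up EX: identifiers are all assumptions for illustration. Links refer to nodes by their position in the node list, which is one of the forms D3's force layout accepts.

```python
import json

def to_force_json(names, parents):
    """Build a JSON string with a node list and index-based parent->child links.

    names:   dict mapping term id -> term name
    parents: dict mapping term id -> set of direct parent ids
    (Both structures are assumptions, not the demo's actual data model.)
    """
    ids = sorted(names)
    index = {term_id: i for i, term_id in enumerate(ids)}
    nodes = [{"id": term_id, "name": names[term_id]} for term_id in ids]
    links = [
        {"source": index[parent], "target": index[child]}
        for child in ids
        for parent in sorted(parents.get(child, ()))
        if parent in index  # skip parents that are not in this subset
    ]
    return json.dumps({"nodes": nodes, "links": links})

# Made-up example terms (not real GO identifiers).
names = {"EX:1": "root process", "EX:2": "process a", "EX:3": "process b"}
parents = {"EX:2": {"EX:1"}, "EX:3": {"EX:1", "EX:2"}}

graph = json.loads(to_force_json(names, parents))
print(len(graph["nodes"]), "nodes,", len(graph["links"]), "links")
```

Dropping links whose parent is missing from the subset keeps the output valid when only part of GO is exported, as in the demo.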

  Wednesday, 21 March 2012 at 04 am

Gene Ontology (GO) is a controlled vocabulary of terms used commonly among biologists to describe genes. Using a strict set of classifiers allows computational biologists to analyze qualitative data in a quantitative manner.

GO is structured as a directed acyclic graph. Basically, it is a hierarchy of terms in which a term can never be its own ancestor, so a child term cannot be the parent of any of its ancestors. It is important to note that parents can have multiple children and children can have multiple parents.
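As a small illustration of that structure, here is a sketch of a toy graph stored as a mapping from each term to its direct parents. The term names are made up and the helper functions are only an assumed way of modelling it, not part of any GO package.

```python
# Toy DAG: each term maps to the set of its direct parents.
# Term names are illustrative, not real GO terms.
parents = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A", "B"},  # a child with two parents
    "D": {"C"},
}

def ancestors(term, parents):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def is_acyclic(parents):
    """The graph is acyclic if no term appears among its own ancestors."""
    return all(term not in ancestors(term, parents) for term in parents)

print(sorted(ancestors("D", parents)))
print(is_acyclic(parents))
```

Because "C" has two parents, walking up from "D" reaches both "A" and "B" before meeting again at "root", which is what distinguishes a DAG from a simple tree.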

GO terms can be accessed via the AmiGO web interface, downloaded and installed as a SQL database, or downloaded as a flat-file (.obo). There are also plenty of software tools and packages available for working with GO.

One of the most common operations in GO analysis is finding the ancestors or descendants of a specific term. AmiGO is great for looking up details for a single term, but it doesn't display information in an easily parsable way. SQL queries can be made to a database to get this information, but when there is no database available and no pre-installed packages, you can use the .obo flat-file to get this information.
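A minimal sketch of the flat-file approach, using only the standard library, might look like the following. It assumes the usual .obo stanza layout ([Term] blocks with id:, name: and is_a: lines), follows only is_a links (other relationship types and obsolete terms are not handled), and uses made-up EX: identifiers in place of real GO terms. The function names are hypothetical.

```python
# Made-up sample stanzas in .obo syntax (the EX: ids are not real GO terms).
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: root process

[Term]
id: EX:0000002
name: process a
is_a: EX:0000001 ! root process

[Term]
id: EX:0000003
name: process b
is_a: EX:0000001 ! root process

[Term]
id: EX:0000004
name: process c
is_a: EX:0000002 ! process a
is_a: EX:0000003 ! process b

[Typedef]
id: part_of
name: part of
"""

def parse_obo(lines):
    """Return (names, parents) built from the [Term] stanzas only."""
    names, parents = {}, {}
    term_id, in_term = None, False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; ignore anything that is not a [Term].
            in_term = line == "[Term]"
            term_id = None
        elif in_term and line.startswith("id: "):
            term_id = line[len("id: "):]
            parents.setdefault(term_id, set())
        elif in_term and term_id and line.startswith("name: "):
            names[term_id] = line[len("name: "):]
        elif in_term and term_id and line.startswith("is_a: "):
            # "is_a: EX:0000001 ! root process" -> "EX:0000001"
            parents[term_id].add(line[len("is_a: "):].split()[0])
    return names, parents

def invert(parents):
    """Turn a child -> parents mapping into a parent -> children mapping."""
    children = {}
    for child, direct_parents in parents.items():
        for parent in direct_parents:
            children.setdefault(parent, set()).add(child)
    return children

def reachable(start, edges):
    """Collect every term reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

names, parents = parse_obo(SAMPLE_OBO.splitlines())
children = invert(parents)
print(sorted(reachable("EX:0000004", parents)))   # ancestors
print(sorted(reachable("EX:0000001", children)))  # descendants
```

For a real download, the same parse_obo call should work on an open file handle instead of the sample string, since it only iterates over lines.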

  Tuesday, 20 March 2012 at 11 am

Most gene enrichment websites out there only allow you to find enrichments for popular model organisms using pre-established gene ontology annotations. I ran into this problem early on during my PhD, when I had to generate enrichment data for Schmidtea mediterranea.