Wednesday, 17 September 2014 at 1 pm

In my previous entry, I showed how to add a toggle code cell button to your IPython notebook. Someone in the comments had a great solution where a code snippet is added to the custom.js file. His code is located here:

However, it seems a lot of people wanted the published notebook (NBViewer) to be able to hide the code cells as well.

It turns out that it is possible to run JavaScript in the notebook if you import the HTML class from IPython:

from IPython.display import HTML

In a code cell, add this:

HTML('''<script>
code_show=true;
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
The raw code for this IPython notebook is by default hidden for easier reading.
To toggle on/off the raw code, click <a href="javascript:code_toggle()">here</a>.''')

When you run this code cell, by default, all the code cells will now be hidden. But you can toggle it on and off by clicking on the link. This toggle link will also be present in the published (NBViewer) version.

Here is an example of an IPython notebook with this toggle link:

*Note that the above link doesn't actually contain the raw toggle script. I keep all my IPython-specific Python scripts in a Python library and import the script from there.

  Wednesday, 07 May 2014 at 2 pm

It's over. This was the thought that went through my head as I walked out of Hacker School's front door around midnight. Having had a few drinks in the preceding couple of hours as part of the end-of-term party, my steps were heavier than normal.

What is it?

Hacker School is a three-month programmers' retreat where a group of like-minded and motivated people are gathered in a room to learn as much as they can about programming. This is not unique. Organized workshops and retreats exist for many professions and careers. However, what makes Hacker School different from others is its singular devotion to learning and community. The more cynical among you might question the organizers' sincerity, as there are obvious secondary motivations in the form of recruitment fees for the organizers and landing a job in the tech industry for the attendees. Take it as you will; I did not find these practical motivations to be obstructive during my stay.

  Tuesday, 17 December 2013 at 12 pm

I got an acceptance e-mail from Hacker School last night after a short written application and 2 interviews. Hacker School is a workshop for programmers. It aims to be a safe environment for people at different skill levels to come together and learn. You spend 3 months in New York working by yourself or collaborating with other like-minded people on whatever projects interest you. It might sound kind of self-indulgent, but so is graduate school in some sense.

They accept a very diverse group of people from what I've read. Maybe I'll meet another bioinformatician there. I am looking forward to seeing how this goes.

Now I just have to finish writing this thesis...

  Thursday, 05 December 2013 at 10 am

I came across this pop science article yesterday:

The author argues that a gene-centric perspective of evolution, made popular by Dawkins with "The Selfish Gene", is not correct and we should focus our attention on other mechanisms such as gene expression. 

The fallacy with his argument stems from a misunderstanding of what Dawkins was trying to present. The selfish gene can basically be boiled down to: "The most basic unit of heredity is a gene".

This idea is only gene-centric in the sense that we think it is the most fundamental unit of heredity. Biologists understand there are many, many layers of complexity (including gene expression) above genes that ultimately contribute to the phenotype. There is plenty of research done at the level of gene expression networks, protein translation, protein folding, cell organization, tissue engineering, etc.

A more valid argument against "The Selfish Gene" is the use of the term "gene". The definition of a gene is becoming murkier than ever (here is a great paper on this:). Perhaps the most basic unit of heredity should be any genomic feature that contributes to the phenotype, whatever that may be.

  Monday, 02 September 2013 at 5 pm

I've been attending the UK NGS/Genomic Sciences meetings since they started 4 years ago. There are great talks every year, but this year they were able to get Clive Brown to give the keynote talk about Oxford Nanopore. For people in the NGS field, I don't think I need to say much about what Nanopore is (check out Oxford Nanopore's website for more details).

Before the talk, Clive put up a slide telling people he prefers there to be no tweets about the talk since he will be covering a great deal of technical details (which he did). I found that kind of strange. It seems like he doesn't want the content of his talk to be public? Why not just have all of us sign an NDA if that's the case? However, I will comply with his request and will not write much about the technical aspects of his talk. Instead, I will talk about what I think about Oxford Nanopore and its potential impact on the field.

  Saturday, 06 July 2013 at 10 pm

I put some finishing touches on Seeker: Annotation Viewer last week for visualizing sequence features such as protein domains, primers, etc. Now I am working on a genome browser. Here is an extremely early prototype (there is around 1.8 MB of files to load):

It should work on the latest versions of Chrome/Safari/Firefox. It will most likely NOT work on IE or Opera. Hopefully, this won't crash your browser. This is completely client-side. You can distribute these files on a USB stick and anyone with a modern browser will be able to open them.

The loaded data is human chromosome 1 parsed from a .gtf file downloaded from UCSC. The parsed data is around 1MB (980KB). These interactions are possible right now:

  • Dragging on the tracks will allow you to scroll through the reference chromosome
  • WASD movement. Press 'A' to scroll left, 'D' to scroll right, 'W' to scroll up, 'S' to scroll down. Anyone who plays computer games should be familiar with this layout.
  • Clicking on the bottom overview bar (blue bar) will let you jump to that position.
  • You can also click and drag on the bottom bar, but depending on how good your computer is, it might be jittery.
  • The line graph on the bottom overview bar represents feature density. The higher the amplitude, the more features there are at that locus.
  • Right now it's displaying 1 million base pair windows. I've tested up to 5 million with little trouble on my early 2012 Macbook Pro. I'll probably set maximum window size to 1 million. 
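
To make the feature density line concrete, here is a minimal sketch of how one could turn GTF lines into per-window counts. This is not the browser's actual code: parseGtfLine, featureDensity, the field names and the tiny sample records are all made up for illustration. The only thing taken as given is the GTF layout itself (nine tab-separated columns, with 1-based inclusive start and end in columns 4 and 5).

```javascript
// Sketch only: function names, field names and sample data are hypothetical.

// Parse one GTF line into a feature object.
// Returns null for empty lines, comment lines and lines with too few columns.
function parseGtfLine(line) {
  if (line.length === 0 || line.charAt(0) === '#') return null;
  var cols = line.split('\t');
  if (cols.length < 9) return null;
  return {
    seqname: cols[0],
    feature: cols[2],
    start: parseInt(cols[3], 10), // 1-based, inclusive
    end: parseInt(cols[4], 10),   // 1-based, inclusive
    strand: cols[6]
  };
}

// Count how many features start inside each fixed-width window.
function featureDensity(features, chromLength, windowSize) {
  var nBins = Math.ceil(chromLength / windowSize);
  var bins = [];
  for (var i = 0; i < nBins; i++) bins.push(0);
  features.forEach(function (f) {
    var idx = Math.floor((f.start - 1) / windowSize);
    if (idx >= 0 && idx < nBins) bins[idx] += 1;
  });
  return bins;
}

// Three made-up exon records on a 3000 bp "chromosome".
var gtf = [
  'chr1\tsketch\texon\t100\t200\t.\t+\t.\tgene_id "g1";',
  'chr1\tsketch\texon\t950\t1200\t.\t+\t.\tgene_id "g1";',
  'chr1\tsketch\texon\t1500\t1800\t.\t-\t.\tgene_id "g2";'
].join('\n');

var features = gtf.split('\n')
  .map(parseGtfLine)
  .filter(function (f) { return f !== null; });

var density = featureDensity(features, 3000, 1000);
console.log(density); // [ 2, 1, 0 ]
```

Counting by start position keeps the sketch short; counting every window a feature overlaps would be another reasonable choice.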

I'll go into more detail about how the rendering works in the future. I've implemented a "rubber-banding" scrolling system instead of the normal Google Maps style tiling system.

  Monday, 01 July 2013 at 09 am

I've wanted to learn how to build web apps with WebGL ever since I saw the crazy Unreal engine ported to HTML5 and WebGL (as a side-note, three.js is a very popular JavaScript 3D library that leverages WebGL). It has a lot of potential for data visualizations. Imagine a genome browser running on a GPU. It will be able to render millions of objects easily.

I came across this developer preview library today of a framework that allows for data visualizations using webGL and webworkers for multi-threading:

It is only a developer preview. But it looks extremely cool.

Of course the downside (as with anything running in a browser) is cross-browser compatibility. The framework also seems to use WebCL, which doesn't seem like it will be widely adopted anytime soon. Perhaps someone can make a modified Node-webkit?

  Thursday, 20 June 2013 at 8 pm

After several rounds of refactoring, version 1.0 of the annotation viewer is finished. You can use the app here:

Input to the app right now is either HMMScan domain table result or a tab delimited file. The tab delimited file is formatted with 5 columns: sequence name, feature name, start position, end position, sequence length. There are sample input data in the app for clarity.
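
As a rough illustration of the five-column format, here is a small parser sketch. It is not the app's own loader: parseFeatureTable and the field names are hypothetical, the sample rows are made up, and it assumes the file has no header row.

```javascript
// Sketch only: parseFeatureTable and its field names are hypothetical.
// Expected columns: sequence name, feature name, start, end, sequence length.
function parseFeatureTable(text) {
  return text.split('\n')
    .filter(function (line) { return line.trim().length > 0; }) // drop blank lines
    .map(function (line, i) {
      var cols = line.split('\t');
      if (cols.length !== 5) {
        throw new Error('Row ' + (i + 1) + ': expected 5 columns, found ' + cols.length);
      }
      return {
        sequence: cols[0],
        feature: cols[1],
        start: parseInt(cols[2], 10),
        end: parseInt(cols[3], 10),
        length: parseInt(cols[4], 10)
      };
    });
}

// Two made-up features on the same 300-residue sequence.
var rows = parseFeatureTable('seqA\tkinase\t10\t120\t300\nseqA\tSH2\t150\t240\t300\n');
console.log(rows.length, rows[0].feature, rows[1].start); // 2 'kinase' 150
```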

I am not sure how cross-browser it is. It was developed mostly with Chrome in mind; however, it should work on the latest versions of Chrome/Safari/Firefox.

On the technical side of things, this web app uses D3.js heavily for the SVG rendering and many DOM manipulations. All I can say is that D3.js is almost magical in how fast it re-renders objects. I also rolled my own MVC system instead of going with popular frameworks such as backbone.js, angular.js, etc. It was definitely an eye-opening experience to see how much work goes into these MVC systems.

My MVC system is not really a full MVC. A more proper description is a view-centric MVC.

Components like menus, checkboxes, sliders, drop-downs were built with a data binding system that allows them to react to changes in the data. These components are the view of the MVC pattern. However, there are no formal models in this system, hence the "view-centric". Data are just native JavaScript objects or arrays, allowing JSON-typed input. When the data is bound to a view, methods are added to the data that allow it to update the view on data change. Yes, I am aware that adding methods to the data object is dirty and a hack.

Here is an example of this view-centric MVC system:

var data = {'name':'next gen sequencing conference','attending':false};
var checkbox = new seeker.checkbox()

The data is an object consisting of two key:value pairs. To bind this piece of data to a checkbox where the label of the checkbox corresponds to "name" and the checkbox itself corresponds to "attending", we use the .bind function. This function takes in two argument objects: data and keys.

There are specific keys that let the checkbox component understand which data corresponds to the label or the checkbox. Both the 'text' key, which corresponds to the label, and the 'checkbox' key, which corresponds to the checkbox, are bound to the "data" object. The keys in the object that correspond to 'text' and 'checkbox' are 'name' and 'attending'.
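
To show the shape of this idea without the real library, here is a small stand-alone sketch. SketchCheckbox and the set helper are hypothetical stand-ins and not the seeker.checkbox implementation; the argument layout simply follows the description above, and the "view" is plain state rather than a DOM element so the snippet runs anywhere.

```javascript
// Sketch only: SketchCheckbox and `set` are hypothetical, not seeker's API.
function SketchCheckbox() {
  this.label = '';
  this.checked = false;
}

// data: maps each view part ('text', 'checkbox') to the object holding its value.
// keys: maps each view part to the property name to read from that object.
SketchCheckbox.prototype.bind = function (data, keys) {
  var view = this;
  function render() {
    view.label = data.text[keys.text];
    view.checked = data.checkbox[keys.checkbox];
  }
  // Add a helper method to each bound data object; changes made through it
  // re-render the view. This is the "methods added to the data" hack.
  [data.text, data.checkbox].forEach(function (obj) {
    Object.defineProperty(obj, 'set', {
      value: function (key, value) { this[key] = value; render(); },
      enumerable: false,  // keeps the helper out of JSON.stringify and for...in
      configurable: true  // lets a later bind replace the helper
    });
  });
  render();
  return this;
};

var data = {'name': 'next gen sequencing conference', 'attending': false};
var checkbox = new SketchCheckbox();
checkbox.bind({'text': data, 'checkbox': data}, {'text': 'name', 'checkbox': 'attending'});

data.set('attending', true);
console.log(checkbox.label, checkbox.checked); // next gen sequencing conference true
```

In this sketch a second view bound to the same object would replace the helper, so only the last view would update; a fuller version would keep a list of listeners per data object.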

This system is a bit unwieldy in argument construction. I might have to mess around with that part to make it more elegant. I also still have to optimize data unbinding. I am sure there are tons of memory leaks right now.