Friday, 20 September 2013 at 1:49 pm

Enrichment analysis is applied when you have categorical data associated with your dataset, for example gene ontology (GO) terms, Pfam families, molecular pathways or enzymatic activities. The gist of the analysis is to see whether a certain category (a GO term, a Pfam family, etc.) is over-represented in a subset of your data.

Let’s take an example. Let’s say I have:

  • A transcriptome of 20,000 genes.
  • 400 genes out of 20,000 are categorized as “cell cycle”.
  • We found 1,000 genes to be differentially expressed under a certain condition.
  • 300 of the 1,000 differentially expressed genes are categorized as “cell cycle”.

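What is the significance of this? By chance alone we would expect only about 20 “cell cycle” genes in a random sample of 1,000 (1,000 × 400 / 20,000). In other words, if we pick 1,000 genes randomly from the total pool of 20,000 genes, what are the chances that 300 or more of them will have the cell cycle category?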
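One common way to answer this kind of question is a one-sided hypergeometric test (drawing without replacement). The sketch below is only an illustration under that assumption, not necessarily the method the full post settles on. The function names are made up for this example, and it works in log space because the probability for the numbers above is far too small to handle comfortably as an ordinary float.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_hypergeom_pmf(k, total, in_category, drawn):
    """Log probability of exactly k category genes among `drawn` genes
    sampled without replacement from `total` genes, of which
    `in_category` belong to the category."""
    return (log_comb(in_category, k)
            + log_comb(total - in_category, drawn - k)
            - log_comb(total, drawn))


def log_enrichment_pvalue(k, total, in_category, drawn):
    """Natural log of P(X >= k) under the hypergeometric null."""
    lower = max(k, 0, drawn - (total - in_category))
    upper = min(in_category, drawn)
    if lower > upper:
        return float("-inf")  # observing k or more is impossible
    terms = [log_hypergeom_pmf(i, total, in_category, drawn)
             for i in range(lower, upper + 1)]
    # log-sum-exp: factor out the largest term to avoid underflow
    largest = max(terms)
    return largest + math.log(sum(math.exp(t - largest) for t in terms))


# Numbers from the example above
total, in_category, drawn, observed = 20000, 400, 1000, 300

expected = drawn * in_category / total
log_p = log_enrichment_pvalue(observed, total, in_category, drawn)

print("expected by chance:", expected)
print("log10(p-value):", log_p / math.log(10))
```

If SciPy is available, `scipy.stats.hypergeom.sf(observed - 1, total, in_category, drawn)` computes the same upper-tail probability directly, although for an enrichment this extreme the result may simply underflow to zero.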

In this post I will go through the basics of how enrichment analysis is performed and some thoughts on how informative this analysis is as applied to biological systems.

Monday, 02 September 2013 at 5:13 pm

I've been attending the UK NGS/Genomic Sciences meetings since they started 4 years ago. There are great talks every year, but this year they were able to get Clive Brown to give the keynote talk about Oxford Nanopore. For people in the NGS field, I don't think I need to say much about what Nanopore is (check out Oxford Nanopore's website for more details).

Before the talk, Clive put up a slide telling people he would prefer no tweets about the talk, since he would be covering a great deal of technical detail (which he did). I found that kind of strange. It seems like he doesn't want the content of his talk to be public? Why not just have all of us sign an NDA if that's the case? However, I will comply with his request and will not write much about the technical aspects of his talk. Instead, I will talk about what I think about Oxford Nanopore and its potential impact on the field.