Thursday, 18 April 2013 at 10:03 am

When using HMMScan with various HMM databases (Pfam, TIGRFAM, HMMSmart, ...), you can choose a thresholding method to filter out false positives:

  • Gathering threshold - This was introduced and is mainly used by Pfam. It is the threshold that Pfam curators manually determined for inclusion into the HMM alignment. The main criterion for inclusion is minimizing domain/sequence overlaps with other protein families.
  • Trusted cutoff - Score of the lowest scoring sequence within the HMM alignment.
  • Noise cutoff - Score of the highest scoring sequence that is NOT in the HMM alignment. Obtained by scanning all other protein family sequences with the HMM in question.
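
As far as I know, HMMER3-format profile files (for example Pfam-A.hmm) store these three cutoffs in each model's header as GA, TC and NC lines, each holding a per-sequence and a per-domain bit score. Here is a minimal Python sketch for reading them, assuming that layout. The function name parse_cutoffs and the header values in the example are made up for illustration and do not come from a real model.

```python
def parse_cutoffs(hmm_header_text):
    """Collect GA/TC/NC cutoffs from the header text of one HMM.

    Returns a dict such as {"GA": (seq_score, dom_score), ...} that
    contains only the tags actually present in the text.
    Assumes lines of the form "GA    25.00 25.00;" (trailing ';' optional).
    """
    cutoffs = {}
    for line in hmm_header_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] in ("GA", "TC", "NC"):
            seq_score = float(fields[1].rstrip(";"))
            dom_score = float(fields[2].rstrip(";"))
            cutoffs[fields[0]] = (seq_score, dom_score)
    return cutoffs


# Invented header fragment, for illustration only.
example_header = """NAME  Example_dom
LENG  120
GA    25.00 25.00;
TC    25.10 25.30;
NC    24.90 24.80;
"""

example_cutoffs = parse_cutoffs(example_header)
```

Databases that do not define these cutoffs simply yield an empty dict here, so it is worth checking for the tag before relying on it.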

A hit scoring above the gathering threshold or trusted cutoff is most likely a true positive, since these are the strictest cutoffs. A hit scoring between the noise cutoff and the gathering/trusted cutoff is a maybe. A hit scoring below the noise cutoff can be discarded as a false positive.
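
The three-way rule above can be sketched in a few lines of Python. This is only an illustration: the function name classify_hit, the label strings, and the handling of scores that fall exactly on a cutoff are my own assumptions, and the numbers in the usage lines are invented.

```python
def classify_hit(score, noise_cutoff, upper_cutoff):
    """Sort a hit's bit score into one of three bins.

    upper_cutoff is whichever strict cutoff you use (gathering
    threshold or trusted cutoff). Scores exactly on a cutoff are
    treated as passing the upper one and failing the noise one;
    that boundary choice is an assumption.
    """
    if noise_cutoff > upper_cutoff:
        raise ValueError("noise cutoff should not exceed the upper cutoff")
    if score >= upper_cutoff:
        return "likely true"
    if score > noise_cutoff:
        return "maybe"
    return "likely false positive"


# Invented scores and cutoffs, for illustration only.
labels = [classify_hit(s, noise_cutoff=24.9, upper_cutoff=25.1)
          for s in (30.0, 25.0, 20.0)]
```

If you would rather let the search itself apply a model's cutoffs, hmmscan accepts the --cut_ga, --cut_tc and --cut_nc options, which only work for databases whose models define those cutoffs.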