The main idea of the target-decoy approach to computing false discovery rates (FDRs) is to feed the search engine answers that you know are wrong. These are referred to as ‘decoys’ and are typically generated by reversing, randomizing, or scrambling real protein sequences. You then estimate the error rate in your resulting answer list by seeing how often these known wrong answers show up in the list in comparison to the target proteins.
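As a rough illustration of decoy generation, the sketch below builds a reversed or shuffled decoy for each target sequence. The function name `make_decoy`, the `DECOY_` prefix, and the two toy sequences are assumptions made for this example; real search engines may generate decoys differently (for instance, at the peptide level).

```python
import random


def make_decoy(sequence, method="reverse", seed=0):
    """Return a decoy version of one protein sequence (hypothetical helper)."""
    if method == "reverse":
        # Reversal keeps length and amino acid composition unchanged.
        return sequence[::-1]
    if method == "shuffle":
        # Shuffling also keeps composition; a fixed seed makes it repeatable.
        residues = list(sequence)
        random.Random(seed).shuffle(residues)
        return "".join(residues)
    raise ValueError(f"unknown method: {method}")


# Toy target database (made-up sequences, for illustration only).
targets = {"P1": "MKTAYIAKQR", "P2": "GSHMLEDPVA"}

# One decoy per target, so the decoy set is the same size as the target set.
decoys = {f"DECOY_{name}": make_decoy(seq) for name, seq in targets.items()}
```

The search is then run against targets and decoys together, and every decoy that appears in the results is a known wrong answer.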
Once you have a ranked list of target and decoy answers, you can compute FDRs. Typically, in proteomics, the global FDR is computed (top figure). Here you are asking: out of the whole population of answers accepted so far, what fraction is wrong? So, in the example below, if you want to know how many proteins are found at a 5% global FDR, you go down the list (across the x-axis) until 5% of the answers accumulated so far are wrong; in this example that happens at 970 proteins. This is a great metric for characterizing the performance of a method, comparing methods, etc.
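The walk down the ranked list can be sketched as follows. This is a minimal example under one stated assumption: the number of wrong target answers is estimated by the number of decoys seen so far, so the global FDR at a given rank is taken as decoys divided by targets. Other estimators exist, and the function name `targets_at_global_fdr` and the toy list are hypothetical.

```python
def targets_at_global_fdr(is_decoy, threshold):
    """Count targets accepted at the deepest rank where global FDR <= threshold.

    is_decoy: booleans ordered from best to worst score (True means decoy).
    Assumption: global FDR at a rank is estimated as decoys / targets so far.
    """
    decoys = 0
    targets = 0
    accepted = 0
    for flag in is_decoy:
        if flag:
            decoys += 1
        else:
            targets += 1
        # Only evaluate the ratio once at least one target has been seen.
        if targets and decoys / targets <= threshold:
            accepted = targets
    return accepted


# Toy ranked list: T T T T D T T T D D (T = target, D = decoy).
ranked = [False, False, False, False, True, False, False, False, True, True]
n_at_25_percent = targets_at_global_fdr(ranked, 0.25)
```

On this toy list, 7 targets are accepted at a 25% global FDR: after the eighth entry there is 1 decoy against 7 targets, and the next decoy pushes the ratio above the threshold.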
However, if you are more interested in understanding a key protein biomarker, or want to know whether a protein really is part of a complex, the question becomes: how sure am I of this particular protein? Here we want to compute a local FDR, something more like a confidence. So, we look at the list and ask how many answers are wrong in the neighborhood of the protein of interest. In the example below, we can see that at protein 970 in the list the decoys are accumulating at a pretty fast rate, and the local FDR is more like 50%.
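To make the contrast with the global FDR concrete, here is a crude sliding-window estimate of the local FDR. It is only a stand-in for intuition: it is not the nonlinear fitting method of Tang et al. that the software uses, and the function name `local_fdr_window`, the window size, and the decoys-over-targets estimator are all assumptions of this sketch.

```python
def local_fdr_window(is_decoy, index, half_width):
    """Estimate the local FDR around one position in a ranked list.

    is_decoy: booleans ordered from best to worst score (True means decoy).
    index: 0-based position of the answer of interest.
    half_width: number of neighbors taken on each side of that position.
    Assumption: local FDR ~ decoys / targets inside the window, capped at 1.
    """
    start = max(0, index - half_width)
    end = min(len(is_decoy), index + half_width + 1)
    window = is_decoy[start:end]
    decoys = sum(window)
    targets = len(window) - decoys
    if targets == 0:
        # Window holds only decoys: treat every nearby answer as wrong.
        return 1.0
    return min(1.0, decoys / targets)


# Toy ranked list: T T T T D T T T D D (T = target, D = decoy).
ranked = [False, False, False, False, True, False, False, False, True, True]
near_top = local_fdr_window(ranked, 1, 2)     # window T T T T
near_bottom = local_fdr_window(ranked, 8, 2)  # window T T D D
```

Near the top of the toy list the window contains no decoys, so the local estimate is 0; near the bottom decoys and targets are equally frequent, so the local estimate reaches 1 even though the global ratio over the whole list is still 3 decoys to 7 targets.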
Both global and local FDRs are reported in tabular form in the ProteinPilotReport for every ProteinPilot software search. The report also shows graphical representations of the results and the fitting used to compute the values [1], along with the corresponding confidences that can be seen when viewing the group file within the software.
[1] Tang, W. H.; Shilov, I. V.; Seymour, S. L. (2008) Nonlinear Fitting Method for Determining Local False Discovery Rates from Decoy Database Searches. J. Proteome Res. 7(9), 3661-3667.