- What is a raw FROC curve? Why is it useful?
- I have tried to calculate the trapezoidal area under the AFROC curve myself, using the search model fitted points output in the document AFROC_PLOTS.xls and extended to (1.0,1.0). However, the values are not the same as those given in the output from the JAFROC software. Is the FOM calculated using the trapezoidal area under the operating points, under the search model fitted points, or by some other method?
- Reader averaged FROC and AFROC curves are generated with the new version of the software. How are these calculated, since each reader will give different LLF and NLF at different confidence levels?
- Since the FOM is related to the area under the AFROC curve, I was going to display AFROC curves in any published data. However, I have noticed that FROC curves are often displayed instead. Which would you suggest displaying, AFROC or FROC curves?
- The old version of the software gave an error if the maximum FPF of an observer/modality pair was less than 0.1. The new version does not give this error; if an observer does have a maximum FPF of less than 0.1 should his/her data be included in the analysis?
- A question regarding the expected value of the JAFROC figure of merit.
1. Raw FROC: Imagine laying out the marks in two rows, one for NLs - lesion level "false positives" denoted by the red circles below, and one for LLs - lesion level "true positives" denoted by the green circles below. The marks are ordered with the confidence level (z) increasing to the right.
Starting on the extreme right hand side, at positive infinity, move an imaginary cutoff slowly to the left. The first mark you hit may be an LL (green circle), and just crossing it will yield an FROC operating point at (0,1/L), where L is the total number of lesions in your dataset. Keep moving: as you cross each LL the operating point moves up by 1/L, and as you cross each NL the operating point moves to the right by 1/I, where I is the total number of images. The result is a saw tooth curve, called the raw FROC curve. Why is it useful? It allows you to determine whether the observer is fully using the rating scale; see the two examples below. Ideally the raw FROC curve should approach a plateau towards the upper end. [The number of points on the raw FROC curve equals the total number of marks. The binned curve plots a point only when the number in each of the NL and LL bins defined by a pair of neighboring cutoffs reaches a minimum value, typically 5. The number of binned FROC points will therefore be considerably smaller than the number of marks.]
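The cutoff sweep described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the JAFROC software: the function name `raw_froc` and its arguments are hypothetical, and marks with tied ratings are simply crossed one at a time in sort order.

```python
def raw_froc(nl_ratings, ll_ratings, n_images, n_lesions):
    """Return the raw FROC operating points as (NLF, LLF) pairs.

    nl_ratings: ratings (z) of all NL marks in the dataset
    ll_ratings: ratings (z) of all LL marks in the dataset
    n_images:   I, the total number of images
    n_lesions:  L, the total number of lesions
    """
    # Lay out all marks and order them from highest to lowest rating,
    # which is the order in which the moving cutoff crosses them.
    marks = [(z, "NL") for z in nl_ratings] + [(z, "LL") for z in ll_ratings]
    marks.sort(key=lambda m: m[0], reverse=True)

    nl_count = 0
    ll_count = 0
    points = []
    for _, kind in marks:
        if kind == "LL":
            ll_count += 1  # operating point moves up by 1/L
        else:
            nl_count += 1  # operating point moves right by 1/I
        points.append((nl_count / n_images, ll_count / n_lesions))
    return points


# Made-up example: 2 NL marks and 3 LL marks on 4 images containing 5 lesions.
example_points = raw_froc([1.0, 3.0], [4.0, 2.0, 0.5], n_images=4, n_lesions=5)
```

In this made-up example the first mark crossed is an LL, so the first point is (0, 1/L), and the curve has one point per mark, five in all.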
2. JAFROC figure of merit vs. area under fitted AFROC curve. The data points used to calculate the search model fits have been binned to ensure at least 5 events in all LL and FP bins. The JAFROC figure of merit uses the un-binned data and calculates a Wilcoxon-like statistic. See Basics of ROC and FROC document. For this reason the two values are expected to be different. The JAFROC figure of merit is described in Acad Rad 2008,15:1554-66 [Eqn. 2 in my 2004 Medical Physics paper is not used]. It consists of comparing all pairs of lesion and FP ratings, where the FP rating of a normal case is the highest rating on that case. Unmarked lesions and unmarked normal images get a rating of -2000. When the lesion rating is higher than the FP rating one cumulates a one, if they are equal one cumulates 0.5, and if the lesion rating is lower than the FP rating one does not cumulate. The total is divided by the number of paired comparisons. This is the JAFROC figure of merit. It is the empirical probability that a lesion is rated higher than a FP.
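The pairwise comparison described above can be sketched as follows. This is a minimal illustration of the verbal description, not the JAFROC source code: the function name `jafroc_fom`, the use of `None` for an unmarked lesion, and the list-of-lists layout for normal cases are assumptions made for the example.

```python
UNMARKED = -2000.0  # rating assigned to unmarked lesions and unmarked normal images


def jafroc_fom(lesion_ratings, normal_case_marks):
    """Wilcoxon-like figure of merit from un-binned ratings.

    lesion_ratings:    one entry per lesion; None if the lesion was not marked
    normal_case_marks: one list per normal case holding the ratings of the
                       marks on that case (empty list if nothing was marked)
    """
    lesions = [UNMARKED if z is None else z for z in lesion_ratings]
    # The FP rating of a normal case is the highest rating on that case.
    fps = [max(marks) if marks else UNMARKED for marks in normal_case_marks]

    total = 0.0
    for lesion in lesions:
        for fp in fps:
            if lesion > fp:
                total += 1.0   # lesion rated higher than the FP
            elif lesion == fp:
                total += 0.5   # tie
            # lesion rated lower than the FP: nothing is added
    return total / (len(lesions) * len(fps))


# Made-up data: 3 lesions (one unmarked) and 3 normal cases (one unmarked).
example_fom = jafroc_fom([3.0, None, 1.0], [[2.0, 0.5], [], [1.0]])
```

With these made-up ratings there are 9 paired comparisons and the accumulated total is 5, so the result is 5/9.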
3. How are average curves generated? The search model parameters are averaged and used to generate the average curves.
4. Should I show FROC or AFROC curves? I recommend showing the AFROC, since the area under this curve is the figure of merit used in JAFROC analysis. The prevailing custom of showing the FROC is outdated in my opinion. The FROC is an open ended curve, and one does not know how far to the right and how far up (limit unity) it really extends. Unlike the FROC, the AFROC is fully contained within the unit square.
5. Quality of the data: This is a tricky one and I have gone back and forth on it. The question is whether the reader fails to reach larger FPF simply because his threshold is too strict (in which case the data is not the best quality), or whether the reader is so good that even at his lowest threshold he does not generate larger FPF compared to others (in which case the data is good). I think the best way to judge the quality of the data is to look for flattening of the raw FROC curve. If the curve flattens, the data is good regardless of FPF. For this reason I have removed the old warning based on the FPF. More work is needed in this area.
6. Expected value of the JAFROC figure of merit: Users of JAFROC are sometimes surprised that a modality (for example, mod-A) on which the observer marks some of the lesions without marking any normal image does not yield a perfect figure of merit ( = 1), and conversely that a modality (for example, mod-B) on which the observer marks some of the normal images and does not mark any of the lesions does not yield a zero figure of merit ( = 0). In fact it is observed that 0 < figure of merit of mod-B < figure of merit of mod-A < 1. The reason is that unmarked lesions and unmarked normal images both get the rating -2000, so every comparison between them is a tie that contributes 0.5. The observer who marks only lesions is obviously better than the observer who marks only normal images. If the observer marked every lesion and did not mark any normal image, the figure of merit would be unity, and if the observer marked every normal image and did not mark any lesion, the figure of merit would be zero.
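A small numerical check makes this concrete. The sketch below restates the pairwise comparison from answer 2 in compact form (the function name `fom_from_ratings` and the data are made up for illustration, and this is not the JAFROC source code). It follows from that definition that if a fraction f of the lesions is marked and no normal image is marked, the figure of merit is 0.5(1 + f), and if a fraction g of the normal images is marked and no lesion is marked, it is 0.5(1 - g).

```python
UNMARKED = -2000.0  # rating given to unmarked lesions and unmarked normal images


def fom_from_ratings(lesion_ratings, fp_ratings):
    """Pairwise figure of merit: 1 if lesion > FP, 0.5 if equal, 0 otherwise."""
    total = 0.0
    for lesion in lesion_ratings:
        for fp in fp_ratings:
            if lesion > fp:
                total += 1.0
            elif lesion == fp:
                total += 0.5
    return total / (len(lesion_ratings) * len(fp_ratings))


# mod-A: 1 of 4 lesions marked (f = 0.25), none of the 3 normal images marked.
fom_a = fom_from_ratings([1.0, UNMARKED, UNMARKED, UNMARKED],
                         [UNMARKED, UNMARKED, UNMARKED])

# mod-B: no lesion marked, 1 of 3 normal images marked (g = 1/3).
fom_b = fom_from_ratings([UNMARKED, UNMARKED, UNMARKED, UNMARKED],
                         [1.0, UNMARKED, UNMARKED])

print(fom_a, fom_b)
```

Here mod-A gives 0.5(1 + 0.25) = 0.625 and mod-B gives 0.5(1 - 1/3) = 1/3, so 0 < mod-B < mod-A < 1 as described above.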
7. TBA.