Reference Implementation
We provide a reference implementation of the metrics described below in the CREMI Python repository.
Neuron Segmentation
Neuron segmentations are compared to the ground truth in terms of variation of information (VOI), adapted Rand error (RAND), and tolerant edit distance (TED).
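As an illustration of the first of these measures, the following is a minimal sketch of variation of information in plain Python, not the reference implementation. It assumes both segmentations are given as flat, equally long sequences of labels with background pixels already removed; the function name and return order are hypothetical. VOI is the sum of the two conditional entropies it returns.

```python
from collections import Counter
from math import log2


def variation_of_information(seg, gt):
    """Return (H(seg|gt), H(gt|seg)) in bits; VOI is their sum."""
    if len(seg) != len(gt):
        raise ValueError("label sequences must have equal length")
    n = len(seg)

    def entropy(counts):
        # Shannon entropy of the empirical distribution given by the counts.
        return -sum(c / n * log2(c / n) for c in counts.values())

    h_joint = entropy(Counter(zip(seg, gt)))
    h_seg_given_gt = h_joint - entropy(Counter(gt))   # grows when a ground truth object is split
    h_gt_given_seg = h_joint - entropy(Counter(seg))  # grows when ground truth objects are merged
    return h_seg_given_gt, h_gt_given_seg


# One ground truth object split into two equally sized segments.
split_term, merge_term = variation_of_information([1, 1, 2, 2], [7, 7, 7, 7])
voi = split_term + merge_term
```

Only the partition matters here, so relabeling either segmentation leaves the result unchanged.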
To compensate for imprecision and ambiguity in the delineation of object boundaries in the ground truth, we exclude pixels that are close to object boundaries. To do so, all pixels that are within a threshold distance of a boundary in the same section are labeled as background. The evaluation code ignores these pixels.
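The boundary exclusion can be sketched for a single 2D section as follows. This is an illustrative brute-force version under stated assumptions, not the reference code: a pixel counts as a boundary pixel if a 4-connected neighbor has a different label, the background label is 0, and the name `mask_boundaries` is hypothetical. For real volumes, a distance transform such as `scipy.ndimage.distance_transform_edt` would likely be the more practical tool.

```python
def mask_boundaries(section, threshold, background=0):
    """Return a copy of a 2D label grid with near-boundary pixels set to background."""
    rows, cols = len(section), len(section[0])
    neighbors = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Assumption: a boundary pixel has a 4-connected neighbor with a different label.
    boundary = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if any(
            0 <= r + dr < rows
            and 0 <= c + dc < cols
            and section[r + dr][c + dc] != section[r][c]
            for dr, dc in neighbors
        )
    ]
    masked = [row[:] for row in section]
    for r in range(rows):
        for c in range(cols):
            # Compare squared Euclidean distances to avoid taking square roots.
            if any((r - br) ** 2 + (c - bc) ** 2 <= threshold ** 2 for br, bc in boundary):
                masked[r][c] = background
    return masked


section = [[1, 1, 1, 2, 2, 2]]
masked = mask_boundaries(section, threshold=1)
```

Because the function only looks at one section, applying it to each section of a volume independently keeps the masking restricted to boundaries in the same section.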
Synaptic Cleft Detection
We count all voxels in the test set that are labeled as synaptic cleft and fall beyond a threshold distance from ground truth labels as false positives (FP), and all voxels in the ground truth that are labeled as synaptic cleft and fall beyond a threshold distance from test labels as false negatives (FN). The final score is the F1-score computed from the true positives, FPs, and FNs.
For this evaluation, we do not consider the IDs of the found segments, i.e., any detection voxel inside a ground truth region grown by the threshold distance is counted as a true positive.
We also report the average Euclidean distance of all FPs and FNs without applying the threshold.
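The counting described above can be sketched in plain Python as follows. This is a brute-force illustration under stated assumptions, not the reference code: clefts are given as lists of voxel coordinates, the unthresholded averages are read as the mean distance from each detection voxel to the nearest ground truth voxel and vice versa, and the names `cleft_errors` and `f1_score` are hypothetical.

```python
from math import dist


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    # Convention chosen for this sketch: an empty comparison scores 0.
    return 2 * tp / denom if denom else 0.0


def cleft_errors(detected, truth, threshold):
    """Count thresholded FP/FN voxels and average the unthresholded distances."""

    def nearest(voxel, others):
        return min((dist(voxel, o) for o in others), default=float("inf"))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    to_truth = [nearest(v, truth) for v in detected]     # detection -> ground truth
    to_detected = [nearest(v, detected) for v in truth]  # ground truth -> detection
    fp = sum(d > threshold for d in to_truth)
    fn = sum(d > threshold for d in to_detected)
    tp = len(detected) - fp  # detection voxels inside the grown ground truth
    return {
        "fp": fp,
        "fn": fn,
        "f1": f1_score(tp, fp, fn),
        "mean_fp_distance": mean(to_truth),
        "mean_fn_distance": mean(to_detected),
    }


truth = [(0, 0), (0, 1), (0, 2)]
detected = [(0, 1), (0, 2), (0, 5)]
result = cleft_errors(detected, truth, threshold=1)
```

In this toy case the detection voxel at (0, 5) is the only one farther than the threshold from the ground truth, so it is the single FP, and no ground truth voxel is farther than the threshold from a detection.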
Examples
The following examples are given in 2D with a small matching threshold for illustration purposes. The actual evaluation is performed in 3D with a more generous threshold.
Example: False positives
This example shows a detection that extends too far to the left. All detection voxels that lie outside of the grown ground truth region are counted as FPs.
Example: False negatives
This example shows a detection that misses part of the cleft on the right. All ground truth voxels that lie outside of the grown detection are counted as FNs.
Synaptic Partner Identification
We measure the accuracy of the synaptic partner pair detections by tolerant assignment to ground truth pairs. For each synaptic site in the ground truth, we identify a matching region (also called matching area). The matching region of a synaptic site annotation is the intersection of a sphere centered at the site and the corresponding neuron segment, i.e., all voxels that
- are within a threshold distance to the ground truth annotation, and
- have the same ground truth neuron ID as the voxel under the annotation.
All detected pairs that have both annotations inside the matching areas of a ground truth pair are considered potential matches. Of all potential matches, we find true matches by solving an assignment problem minimizing the Euclidean distance. Unmatched detected pairs are considered FP, unmatched ground truth pairs FN. The final score is the F1-score computed from the true matches, FPs, and FNs.
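The matching procedure above can be sketched as follows. This is an illustration under stated assumptions, not the reference code: pairs are (pre, post) coordinate tuples, `segment_id` is a hypothetical callable that returns the ground truth neuron ID at a position, the cost of a potential match is taken to be the sum of the two annotation distances, and the assignment is solved by brute force over permutations, which only works for a handful of pairs. A solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice at realistic sizes.

```python
from itertools import permutations
from math import dist


def in_matching_region(point, site, threshold, segment_id):
    """True if point is near the site and in the same ground truth neuron."""
    return dist(point, site) <= threshold and segment_id(point) == segment_id(site)


def match_pairs(detected, truth, threshold, segment_id):
    """Return (matches, fp, fn); matches are (detected_index, truth_index) tuples."""
    cost = {}  # holds entries for potential matches only
    for i, (d_pre, d_post) in enumerate(detected):
        for j, (t_pre, t_post) in enumerate(truth):
            if in_matching_region(d_pre, t_pre, threshold, segment_id) and \
                    in_matching_region(d_post, t_post, threshold, segment_id):
                cost[i, j] = dist(d_pre, t_pre) + dist(d_post, t_post)

    # Pad to a square problem; the penalty exceeds any total of real costs,
    # so the optimum matches as many pairs as possible before minimizing distance.
    size = max(len(detected), len(truth))
    penalty = 2 * threshold * size + 1
    best = min(
        permutations(range(size)),
        key=lambda perm: sum(cost.get((i, j), penalty) for i, j in enumerate(perm)),
    )
    matches = [(i, j) for i, j in enumerate(best) if (i, j) in cost]
    return matches, len(detected) - len(matches), len(truth) - len(matches)


def segment_id(position):
    # Toy ground truth: neuron "A" left of x = 5, neuron "B" otherwise.
    return "A" if position[0] < 5 else "B"


truth = [((2, 0), (7, 0))]
correct = match_pairs([((3, 0), (6, 0))], truth, 2, segment_id)
wrong_partner = match_pairs([((3, 0), (4, 0))], truth, 2, segment_id)
```

In the second call the post-synaptic annotation lies outside the matching area of the ground truth partner, so the detected pair is not a potential match and the error is counted as one FP and one FN.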
Examples
The following examples are given in 2D with a small matching threshold for illustration purposes. The actual evaluation is performed in 3D with a more generous threshold.
Example: Correct identification
This example shows a correct identification of pair (a',b') with respect to ground truth pair (a,b). Pixels in the matching area of a and b are colored. The annotations a' and b' lie inside the matching areas of a and b, respectively. After solving the assignment problem, (a',b') will be matched with (a,b) and no error will be counted.
Example: False positive
This example shows a false positive (a',b') (assuming there are no more ground truth pairs or detected pairs). Each detected pair that cannot be matched to a ground truth pair is counted as one FP. Detected pairs cannot be matched if they are outside the matching area of any ground truth pair, or are superseded by another closer pair.
Example: False negative
This example shows a false negative (a,b) (assuming there are no more ground truth pairs or detected pairs). Each ground truth pair that cannot be matched to a detected pair is counted as one FN. Ground truth pairs cannot be matched if all matching candidates were assigned to a closer pair.
Example: Wrong partner
This example shows an incorrect identification of pair (a',b'). Pixels in the matching area of a and b are colored. The pair (a',b') will not be considered as a match for (a,b), since b' is outside the matching area of b. In this case (assuming there are no more ground truth pairs or detected pairs), this error will be counted as one FP and one FN.