**This is a very incomplete DRAFT**

## The test machine

The computer used for these tests is the following:

`Mac Pro`

`Mid 2010`

`Processor 2 x 2.66 GHz 6-Core Intel Xeon`

`Memory 24 GB 1333 MHz DDR3 ECC`

`Software Mac OS X Lion 10.7.5 (11G63)`

## Time of execution.

The tests reported here were run on an Apple Mac Pro (2 x 2.66 GHz 6-Core Intel Xeon, 24 GB 1333 MHz DDR3) running Mac OS X v10.7.5 with Java 1.6. The detectors were configured to use a single thread. Unless indicated otherwise, the median filter and the sub-pixel localization were disabled.

### The DoG & LoG detectors.

#### Processing time for a 2D image as a function of its size.

These tests use a *uint16* image of varying size, containing 200 Gaussian spots of radius 3 (all values are in pixel units).

| N (pixels) | Image size | DoG detector time (ms) | LoG detector time (ms) |
|---|---|---|---|
| 256 | 16x16 | 3.0 | 4.8 |
| 1024 | 32x32 | 2.95 | 4.3 |
| 4096 | 64x64 | 3.95 | 4.35 |
| 16384 | 128x128 | 7.9 | 5.85 |
| 65536 | 256x256 | 23.35 | 17.7 |
| 262144 | 512x512 | 88.2 | 61.15 |
| 1048576 | 1024x1024 | 357.3 | 251.65 |
| 2359296 | 1536x1536 | 789.85 | 605.4 |
| 4194304 | 2048x2048 | 1463.4 | 1201.1 |

For the DoG detector, unsurprisingly, we find that the execution time is proportional to the number of pixels, following approximately *t (ms) = 3.4e-4 x Npixels*. This is expected as all calculations are done in direct space.

The LoG detector operates in Fourier space and, because of the Fourier transform implementation we use, the images are padded with zeros to reach a size equal to a power of 2. This does not show here, as all but one of the tests use such a size. Still, the execution time deviates slightly from the linear case and shows a slight quadratic shape. The best linear fit yields a law of *t (ms) = 2.8e-4 x Npixels*, showing that the LoG detector is slightly quicker than the DoG detector.
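
The two linear laws quoted above can be checked directly from the table. The snippet below is a minimal sketch, assuming an ordinary least-squares fit of a line through the origin (the fitting procedure actually used for this page is not stated); the helper name `slope_through_origin` is ours.

```python
# Timings copied from the 2D table above: (number of pixels, DoG ms, LoG ms).
TIMINGS_2D = [
    (256, 3.0, 4.8),
    (1024, 2.95, 4.3),
    (4096, 3.95, 4.35),
    (16384, 7.9, 5.85),
    (65536, 23.35, 17.7),
    (262144, 88.2, 61.15),
    (1048576, 357.3, 251.65),
    (2359296, 789.85, 605.4),
    (4194304, 1463.4, 1201.1),
]


def slope_through_origin(xs, ys):
    """Least-squares slope a of the model y = a * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


n_pixels = [row[0] for row in TIMINGS_2D]
dog_slope = slope_through_origin(n_pixels, [row[1] for row in TIMINGS_2D])
log_slope = slope_through_origin(n_pixels, [row[2] for row in TIMINGS_2D])

print(f"DoG: t (ms) = {dog_slope:.2e} x Npixels")
print(f"LoG: t (ms) = {log_slope:.2e} x Npixels")
```

Because the fit has no intercept, it is dominated by the largest images, and the fixed overhead of a few milliseconds visible for the smallest images is ignored.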

#### Processing time for a 3D image as a function of its size.

| N (pixels) | Image size | DoG detector time (ms) | LoG detector time (ms) |
|---|---|---|---|
| 4096 | 16x16x16 | 8.7 | 24.7 |
| 32768 | 32x32x32 | 23.5 | 38.5 |
| 262144 | 64x64x64 | 129.3 | 159.2 |
| 2097152 | 128x128x128 | 875.1 | 936.3 |
| 16777216 | 256x256x256 | 7054.0 | 7462.4 |
| 134217728 | 512x512x512 | 61477.2 | 58860.6 |

And again, the processing time is found to be linear with the number of pixels. The linear fit is slightly steeper, however: *t (ms) = 4.6e-4 x Npixels*, which we attribute to the 3D kernel overhead.

Interestingly, the LoG detector seems to be the slower of the two at intermediate sizes, which I cannot explain well.
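
To make the comparison between the two detectors easier to read, the ratio of LoG time to DoG time can be computed for each row of the 3D table. This is only a sketch that re-uses the numbers above; the helper name `log_over_dog` is ours.

```python
# Timings copied from the 3D table above: (edge length, DoG ms, LoG ms).
TIMINGS_3D = [
    (16, 8.7, 24.7),
    (32, 23.5, 38.5),
    (64, 129.3, 159.2),
    (128, 875.1, 936.3),
    (256, 7054.0, 7462.4),
    (512, 61477.2, 58860.6),
]


def log_over_dog(rows):
    """Return (edge, LoG time / DoG time) for each cubic image size."""
    return [(edge, log_ms / dog_ms) for edge, dog_ms, log_ms in rows]


for edge, ratio in log_over_dog(TIMINGS_3D):
    faster = "LoG" if ratio < 1.0 else "DoG"
    print(f"{edge}x{edge}x{edge}: LoG/DoG = {ratio:.2f} ({faster} is faster)")
```

With these numbers the ratio stays above 1 up to 256x256x256 and drops below 1 only for the 512x512x512 image.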

#### Processing time for a 2D image as a function of the spot radius.

We used a 1024x1024 *uint16* image containing 200 Gaussian spots whose radius we varied. The detector was tuned to this radius each time.

We find that for the DoG detector, the processing time increases linearly with the specified radius, following approximately *t (ms) = 20.5 x radius + 260*. As the difference of Gaussians is calculated in direct space, a marked increase is expected, as there are more pixels to iterate over. Without optimization, however, we should have found the time to increase with the square of the radius, that is, the same dependence as for the image size. Thanks to the clever implementation of Gaussian filtering[1], this is avoided.
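
One common way to obtain a cost that is linear in the radius is to exploit the separability of the Gaussian kernel and filter one dimension at a time. The sketch below only counts multiply-adds per pixel for both strategies. It assumes a kernel truncated at 3 sigma, which is an illustrative choice and not necessarily what the implementation in [1] does; the function names are ours.

```python
import math


def kernel_width(sigma, n_sigmas=3.0):
    """Width in pixels of a Gaussian kernel truncated at n_sigmas * sigma (assumed cut-off)."""
    half_width = math.ceil(n_sigmas * sigma)
    return 2 * half_width + 1


def ops_full(sigma, ndim):
    """Multiply-adds per pixel for a direct convolution with a full n-dimensional kernel."""
    return kernel_width(sigma) ** ndim


def ops_separable(sigma, ndim):
    """Multiply-adds per pixel when one 1D pass is applied along each dimension."""
    return ndim * kernel_width(sigma)


for sigma in (1, 2, 4, 8):
    print(f"sigma={sigma}: full 2D = {ops_full(sigma, 2)}, separable 2D = {ops_separable(sigma, 2)}")
```

Doubling sigma roughly doubles the separable count but roughly quadruples the full 2D count, which is the difference between the linear and the quadratic behavior discussed above.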

The LoG detector shows a near-constant processing time, which makes it desirable for spots larger than 2 pixels in radius. This is due to the way we compute the convolution, which is explained below.

#### Processing time for a 3D image as a function of the spot radius.

This time we used a 256x256x256 3D image, with otherwise identical parameters.

The processing time increases, but this time it deviates slightly from linearity in the DoG case. We again see the 3D kernel overhead observed when varying the size of 3D images.

The LoG performance clearly highlights the zero-padding required by the Fourier transform: the processing time increases in a step-wise manner. We use the Fourier transform to compute the convolution by the LoG kernel. But in the implementation we use, the kernel image (and the source image as well) is padded with zeros until its size reaches a power of 2 (128, 256, 512, etc.). Whenever the required kernel size is smaller than this power of 2, its size is increased to this value. Because the processing time ultimately depends on the number of pixels, we see a constant processing time until the kernel size imposes a larger power of 2.
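
The plateau-then-jump behavior follows directly from rounding sizes up to a power of 2. The sketch below illustrates this with hypothetical "required" sizes per dimension; how the required size is derived from the spot radius depends on the implementation and is not modeled here.

```python
def next_power_of_two(n):
    """Smallest power of 2 that is greater than or equal to n (for n >= 1)."""
    power = 1
    while power < n:
        power *= 2
    return power


def padded_pixel_count(required_size, ndim=3):
    """Pixels actually processed once each dimension is padded to a power of 2."""
    return next_power_of_two(required_size) ** ndim


# Hypothetical required sizes per dimension, chosen to straddle powers of 2.
for required in (100, 128, 129, 200, 256, 257):
    padded = next_power_of_two(required)
    print(f"required {required:3d} -> padded {padded:3d} -> {padded_pixel_count(required)} pixels in 3D")
```

In 3D, each jump to the next power of 2 multiplies the number of processed pixels by 8, which is why the steps are so visible.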

#### Choosing between DoG and LoG based on performance

This stepwise evolution makes it slightly harder to choose between the LoG and DoG detectors based on performance. As a crude rule of thumb, we can remember that:

- The LoG detector outperforms the DoG detector in 2D for radii larger than 2 pixels.
- The LoG detector outperforms the DoG detector in 3D for radii larger than 4 pixels.
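
The rule of thumb above can be written as a small helper. This is only an illustrative sketch: the function name `faster_detector` is ours, and returning the DoG detector for radii at or below the threshold is an assumption, since the rule only states when the LoG detector wins.

```python
# Radius thresholds (in pixels) taken from the rule of thumb above, keyed by dimensionality.
LOG_RADIUS_THRESHOLD = {2: 2.0, 3: 4.0}


def faster_detector(ndim, radius):
    """Return the detector expected to be faster for the given dimensionality and spot radius."""
    if ndim not in LOG_RADIUS_THRESHOLD:
        raise ValueError("only 2D and 3D images are covered by these tests")
    # Assumption: below or at the threshold we fall back to the DoG detector.
    return "LoG" if radius > LOG_RADIUS_THRESHOLD[ndim] else "DoG"


print(faster_detector(2, 3.0))
print(faster_detector(3, 3.0))
```

Since these thresholds come from single-threaded tests on one machine, they are best treated as a starting point and not as a guarantee.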

[1] https://github.com/imagej/imglib/blob/master/algorithms/core/src/main/java/net/imglib2/algorithm/gauss3/Gauss3.java