Ticket #126 (closed task: fixed)
Opened 2010-07-13T13:03:11-05:00
Last modified 2010-09-07T10:39:26-05:00
OpenCL Publication
Reported by: | rlentz | Owned by: | rlentz |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | OpenCL | Version: | |
Severity: | non-issue | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | | |
Description
Develop an 18-20 page draft paper for the journal of bioinformatics, tentatively titled "A methodology for adapting software algorithms to leverage GPU based compute resources using OpenCL".
Outline the methodology and its application to a Sobel filter and iterative 3D deconvolution.
Include tables for performance metrics gathered from the last three generations of GPU hardware and current generation CPU hardware.
Include a lessons learned section covering decision criteria on JNA and floating point precision.
Include references to existing publications.
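The floating point precision item above can be illustrated with a small sketch. The ticket does not record the actual decision criteria, so this only shows why the choice matters: a long running sum kept in single precision drifts away from the same sum kept in double precision. The helper names (to_f32, sum_f32, sum_f64) and the test data are hypothetical, and plain Python stands in for the device code.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_f32(values):
    """Left-to-right sum, rounded to single precision after every addition."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total

def sum_f64(values):
    """The same left-to-right sum kept in double precision throughout."""
    total = 0.0
    for v in values:
        total += v
    return total

# Hypothetical data: 100,000 copies of 0.1, whose exact sum would be 10,000.
values = [0.1] * 100_000
single = sum_f32(values)
double = sum_f64(values)

print(f"single-precision running sum: {single:.6f}")
print(f"double-precision running sum: {double:.6f}")
print(f"absolute difference:          {abs(single - double):.6f}")
```

The gap grows with the number of accumulated terms, so kernels that sum over whole image volumes are the ones most likely to need double precision or a different summation order.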
Change History
comment:1 Changed 2010-07-13T13:03:19-05:00 by rlentz
- Owner set to rlentz
- Status changed from new to accepted
comment:2 Changed 2010-07-15T08:43:16-05:00 by rlentz
- Milestone changed from biweekly-2010: Jul-12 to Jul-23 to biweekly-2010: Jul-26 to Aug-06
Pushed back due to:
Awaiting Hardware / Starting new task
comment:3 Changed 2010-08-05T14:49:33-05:00 by rlentz
Restarted this task on 5 Aug 2010 after the hardware arrived. Began testing the direct-port implementation and discovered several parallelization errors.
comment:4 Changed 2010-08-05T14:59:26-05:00 by rlentz
On Aug 5, ported the command-line project into Eclipse (CDT) and created a makefile version. Added the m (math) and OpenCL library requirements and updated the resource paths needed to compile the existing code.
Testing has shown errors in the function copyDataMask().
comment:5 Changed 2010-08-16T10:53:38-05:00 by rlentz
Found an error in the Fast Hartley Transform. Reworked the former fcla, fclb, and fclc. Testing found a run-time compile bug in kernel srfourb.
comment:6 Changed 2010-08-16T10:56:31-05:00 by rlentz
Please reference microscopy.wisc.edu/svn/decon for details regarding source level progress.
comment:7 Changed 2010-08-17T15:42:47-05:00 by rlentz
Functions that perform reduction do not port directly from Java and need to be reimplemented for OpenCL. These parts of the OpenCL code introduce inefficiencies that exceed all the other performance benefits combined.
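To make the reduction problem concrete, here is a minimal sketch of the usual restructuring: a sequential accumulation loop (the Java-style form) next to a two-stage tree reduction of the kind typically written for a GPU. It is plain Python emulating the access pattern, not the project's kernel code; the function names and the group size of 8 are assumptions chosen for illustration.

```python
def sequential_sum(values):
    """Java-style reduction: one accumulator, strictly sequential."""
    total = 0.0
    for v in values:
        total += v
    return total

def tree_reduce(values, group_size=8):
    """Emulate a two-stage parallel sum.

    Stage 1: each group of `group_size` elements is reduced pairwise in a
    scratch buffer (the role local memory plays on a device).
    Stage 2: the per-group partial sums are combined.
    """
    # The halving loop below requires a power-of-two group size.
    assert group_size > 0 and group_size & (group_size - 1) == 0

    partials = []
    for start in range(0, len(values), group_size):
        # Copy one group into scratch storage, zero-padding the last group.
        local = list(values[start:start + group_size])
        local += [0.0] * (group_size - len(local))

        stride = group_size // 2
        while stride > 0:
            # On a device, each i below would be a separate work-item,
            # with a barrier between successive strides.
            for i in range(stride):
                local[i] += local[i + stride]
            stride //= 2
        partials.append(local[0])

    # The partial sums are few enough to combine sequentially.
    return sequential_sum(partials)

data = [float(i) for i in range(1, 101)]
print(sequential_sum(data), tree_reduce(data))
```

Each halving step needs a synchronization point and leaves half of the remaining work-items idle, which is one common source of the inefficiency described above.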
comment:8 Changed 2010-08-23T10:43:59-05:00 by rlentz
Reimplementing from Sorensen et al. 1231 (Oct 1985) using shared memory.
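For readers unfamiliar with the Hartley transform, the sketch below evaluates its textbook definition directly in O(N^2) time. It is not the fast algorithm from the cited paper and not the project's kernel; it only shows what the FHT computes and why the same routine, scaled by 1/N, serves as its own inverse. The function names and the sample signal are illustrative.

```python
import math

def dht(x):
    """Discrete Hartley transform by direct summation.

    H[k] = sum over t of x[t] * (cos(2*pi*k*t/N) + sin(2*pi*k*t/N))
    """
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for t in range(n):
            angle = 2.0 * math.pi * k * t / n
            acc += x[t] * (math.cos(angle) + math.sin(angle))
        out.append(acc)
    return out

def inverse_dht(h):
    """The transform is its own inverse up to a factor of 1/N."""
    n = len(h)
    return [v / n for v in dht(h)]

signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
spectrum = dht(signal)
restored = inverse_dht(spectrum)
print([round(v, 6) for v in restored])
```

Because input and output are both real, the transform avoids complex arithmetic, which is convenient for convolution-heavy work such as deconvolution.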
comment:10 Changed 2010-09-07T10:39:26-05:00 by rlentz
- Status changed from accepted to closed
- Resolution set to fixed
Finished reimplementing from Sorensen et al. 1231 (Oct 1985) using shared memory. Timing comparisons showed a 17x speedup for the OpenCL FHT implementation. This implementation was then interfaced with the reference implementation (a plugin provided to the community by Dr. Robert Dougherty), and the integrated result showed a 6.36x performance improvement.
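The gap between the 17x component figure and the 6.36x integrated figure is what Amdahl's law would predict when only part of the run time is accelerated. The sketch below works that arithmetic through. It assumes the FHT is the only accelerated part and that transfer and integration overheads can be ignored, neither of which the ticket confirms, so the implied fraction is a back-of-the-envelope estimate and not a measured result.

```python
def overall_speedup(fraction, component_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the original
    run time is accelerated by `component_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)

def implied_fraction(overall, component_speedup):
    """Solve Amdahl's law for the accelerated fraction of the run time."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / component_speedup)

fht_speedup = 17.0         # component figure reported in this ticket
integrated_speedup = 6.36  # integrated figure reported in this ticket

# Assumption: only the FHT was accelerated, with no other overhead.
fraction = implied_fraction(integrated_speedup, fht_speedup)
ceiling = 1.0 / (1.0 - fraction)

print(f"implied FHT share of original run time: {fraction:.3f}")
print(f"ceiling if the FHT took no time at all:  {ceiling:.2f}x")
```

Under those assumptions the FHT would account for just under 0.9 of the original run time, which would leave the remaining code as the limit on any further integrated gains.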