com.nativelibs4java.opencl
Class CLProgram

java.lang.Object
  extended by com.nativelibs4java.opencl.CLProgram

public class CLProgram
extends Object

OpenCL program.
An OpenCL program consists of a set of kernels that are identified as functions declared with the __kernel qualifier in the program source. OpenCL programs may also contain auxiliary functions and constant data that can be used by __kernel functions. The program executable can be generated online or offline by the OpenCL compiler for the appropriate target device(s).
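A program object encapsulates the following information:
  An associated context.
  A program source or binary.
  The latest successfully built program executable, the list of devices for which the program executable is built, the build options used and a build log.
  The number of kernel objects currently attached.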

A program can be compiled on the fly (costly) but its binaries can be stored and loaded back in subsequent executions to avoid recompilation.
By default, program binaries are automatically cached on stable platforms (which currently exclude ATI Stream), but caching can be forced on or off with CLContext.setCacheBinaries(boolean).
To create a program from sources, use CLContext.createProgram(java.lang.String[]).
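
The caching idea described above can be pictured with a minimal sketch. It is not JavaCL's implementation: the class BinaryCacheSketch, its build method and the stand-in compiler function are hypothetical, and the cache key is assumed to be a SHA-256 digest of the build options plus the source text.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of "compile once, reuse the binary afterwards".
class BinaryCacheSketch {
    private final Map<String, byte[]> cache = new HashMap<>();
    int compileCount = 0; // how many times the (costly) compiler was invoked

    // Assumed signature scheme: hex SHA-256 over options, a separator byte, then the source.
    static String signature(String source, String options) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(options.getBytes(StandardCharsets.UTF_8));
            md.update((byte) 0);
            md.update(source.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the cached binary when the signature is known, otherwise compiles and stores it.
    byte[] build(String source, String options, Function<String, byte[]> compiler) {
        String key = signature(source, options);
        byte[] binary = cache.get(key);
        if (binary == null) {
            compileCount++;
            binary = compiler.apply(source);
            cache.put(key, binary);
        }
        return binary;
    }
}
```

In this sketch, building the same source twice with the same options invokes the compiler once; changing either the source or the options produces a new key and therefore a new compilation.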

Author:
Olivier Chafik

Field Summary
protected  CLContext context
           
protected  T entity
           
static boolean passMacrosAsSources
           
 
Method Summary
 void addBuildOption(String option)
          Please see OpenCL's clBuildProgram documentation for details on supported build options.
 void addInclude(String path)
          Add a path (file or URL) to the list of paths searched for included files.
 void addSource(String src)
           
 void allocate()
           
 CLProgram build()
          Build the program and return it.
protected  void clear()
           
protected  String computeCacheSignature()
           
protected  Runnable copyIncludesToTemporaryDirectory()
           
 CLKernel createKernel(String name, Object... args)
          Find a kernel by its function name, and optionally bind some arguments to it.
 CLKernel[] createKernels()
          Return all the kernels found in the program.
 CLProgram defineMacro(String name, Object value)
           
 void defineMacros(Map<String,Object> macros)
           
 boolean equals(Object obj)
          Underlying implementation pointer-based equality test
protected  void finalize()
           
 Map<CLDevice,byte[]> getBinaries()
          Get the binaries of the program (one for each device, in order)
 CLContext getContext()
          Returns the context of this program
 CLDevice[] getDevices()
           
static
<E extends org.bridj.TypedPointer,A extends com.nativelibs4java.opencl.CLAbstractEntity<E>>
org.bridj.Pointer<E>
getEntities(A[] objects, org.bridj.Pointer<E> out)
           
protected  OpenCLLibrary.cl_program getEntity()
           
 String getIncludedSourceContent(String path)
           
 URL getIncludedSourceURL(String path)
           
protected  String getOptionsString()
           
protected  Set<String> getProgramBuildInfo(OpenCLLibrary.cl_program pgm, org.bridj.Pointer<OpenCLLibrary.cl_device_id> deviceIds)
           
 String getSource()
          Get the source code of this program
 int hashCode()
          Underlying implementation pointer-based hashCode computation
 boolean isCached()
           
static com.nativelibs4java.util.Pair<Map<CLDevice,byte[]>,String> readBinaries(List<CLDevice> allowedDevices, String expectedContentSignatureString, InputStream in)
           
 void release()
          Manual release of the OpenCL resources represented by this object.
 Map<String,URL> resolveInclusions()
           
protected  void setBinaries(Map<CLDevice,byte[]> binaries)
           
 void setCached(boolean cached)
           
 void setFastRelaxedMath()
          Add the -cl-fast-relaxed-math compile option.
 void setFiniteMathOnly()
          Add the -cl-finite-math-only compile option.
 void setMadEnable()
          Add the -cl-mad-enable compile option.
 void setNoSignedZero()
          Add the -cl-no-signed-zero compile option.
 void setNVMaximumRegistryCount(int N)
          Add the -cl-nv-maxrregcount=N compilation option (NVIDIA GPUs only)
Specify the maximum number of registers that GPU functions can use.
 void setNVOptimizationLevel(int N)
          Add the -cl-nv-opt-level compilation option (NVIDIA GPUs only)
Specify optimization level (default value: 3)
 void setNVVerbose()
          Add the -cl-nv-verbose compilation option (NVIDIA GPUs only)
Enable verbose mode.
 void setUnsafeMathOptimizations()
          Add the -cl-unsafe-math-optimizations option.
 void store(OutputStream out)
          Write the compiled binaries of this program (for all devices it was compiled for), so that it can be restored later using CLContext.loadProgram(java.io.InputStream)
 CLProgram undefineMacro(String name)
           
static void writeBinaries(Map<CLDevice,byte[]> binaries, String source, String contentSignatureString, OutputStream out)
           
 
Methods inherited from class java.lang.Object
clone, getClass, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

context

protected final CLContext context

passMacrosAsSources

public static boolean passMacrosAsSources

entity

protected volatile T extends org.bridj.TypedPointer entity
Method Detail

setBinaries

protected void setBinaries(Map<CLDevice,byte[]> binaries)

store

public void store(OutputStream out)
           throws CLBuildException,
                  IOException
Write the compiled binaries of this program (for all devices it was compiled for), so that it can be restored later using CLContext.loadProgram(java.io.InputStream)

Parameters:
out - will be closed
Throws:
CLBuildException
IOException
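
As a rough picture of storing binaries and reading them back, here is a minimal round-trip sketch. The on-disk layout shown is an assumption made for illustration and is not JavaCL's actual format; the class BinaryStoreSketch and its write and read methods are hypothetical, and devices are represented by plain name strings rather than CLDevice objects.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical layout: signature, entry count, then (device name, length, bytes) per entry.
class BinaryStoreSketch {
    // Writes all binaries and closes the stream, mirroring the "out - will be closed" contract.
    static void write(Map<String, byte[]> binaries, String signature, OutputStream out) throws IOException {
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeUTF(signature);
            data.writeInt(binaries.size());
            for (Map.Entry<String, byte[]> e : binaries.entrySet()) {
                data.writeUTF(e.getKey());
                data.writeInt(e.getValue().length);
                data.write(e.getValue());
            }
        }
    }

    // Reads the binaries back, rejecting content whose signature differs from the expected one.
    static Map<String, byte[]> read(String expectedSignature, InputStream in) throws IOException {
        try (DataInputStream data = new DataInputStream(in)) {
            String signature = data.readUTF();
            if (!signature.equals(expectedSignature)) {
                throw new IOException("signature mismatch");
            }
            int count = data.readInt();
            Map<String, byte[]> binaries = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String device = data.readUTF();
                byte[] bytes = new byte[data.readInt()];
                data.readFully(bytes);
                binaries.put(device, bytes);
            }
            return binaries;
        }
    }
}
```

Checking a content signature on read is one way to avoid loading stale binaries after the source or build options have changed.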

writeBinaries

public static void writeBinaries(Map<CLDevice,byte[]> binaries,
                                 String source,
                                 String contentSignatureString,
                                 OutputStream out)
                          throws IOException
Throws:
IOException

readBinaries

public static com.nativelibs4java.util.Pair<Map<CLDevice,byte[]>,String> readBinaries(List<CLDevice> allowedDevices,
                                                                                      String expectedContentSignatureString,
                                                                                      InputStream in)
                                                                               throws IOException
Throws:
IOException

getDevices

public CLDevice[] getDevices()

allocate

public void allocate()

getEntity

protected OpenCLLibrary.cl_program getEntity()

addInclude

public void addInclude(String path)
Add a path (file or URL) to the list of paths searched for included files.
OpenCL kernels may contain #include "subpath/file.cl" statements.
This automatically adds a "-Ipath" argument to the compiler's command-line options.
Note that it's not necessary to add include paths for files that are in the classpath.

Parameters:
path - A file or URL that points to the root path from which includes can be resolved.
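
The mapping from include paths to "-Ipath" arguments can be sketched as follows. The class IncludeOptionsSketch and its optionsString method are hypothetical stand-ins and do not reflect how JavaCL assembles its options internally (for instance, quoting of paths that contain spaces is ignored here).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: each registered include path becomes one -I option.
class IncludeOptionsSketch {
    private final List<String> includes = new ArrayList<>();

    void addInclude(String path) {
        includes.add(path);
    }

    // Joins the include paths into a space-separated option string.
    String optionsString() {
        StringBuilder sb = new StringBuilder();
        for (String path : includes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("-I").append(path);
        }
        return sb.toString();
    }
}
```

With this sketch, registering "kernels" and "/opt/cl/include" yields the option string "-Ikernels -I/opt/cl/include".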

addSource

public void addSource(String src)

copyIncludesToTemporaryDirectory

protected Runnable copyIncludesToTemporaryDirectory()
                                             throws IOException
Throws:
IOException

resolveInclusions

public Map<String,URL> resolveInclusions()
                                  throws IOException
Throws:
IOException

getIncludedSourceContent

public String getIncludedSourceContent(String path)
                                throws IOException
Throws:
IOException

getIncludedSourceURL

public URL getIncludedSourceURL(String path)
                         throws MalformedURLException
Throws:
MalformedURLException

getSource

public String getSource()
Get the source code of this program


getBinaries

public Map<CLDevice,byte[]> getBinaries()
                                 throws CLBuildException
Get the binaries of the program (one for each device, in order)

Returns:
map from each device the program was compiled for to the corresponding binary data
Throws:
CLBuildException

getContext

public CLContext getContext()
Returns the context of this program


defineMacro

public CLProgram defineMacro(String name,
                             Object value)

undefineMacro

public CLProgram undefineMacro(String name)

defineMacros

public void defineMacros(Map<String,Object> macros)
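
The macro methods are not described here, so the following is only a sketch of two common ways a macro map could reach the OpenCL compiler: as "-Dname=value" build options, or as "#define" lines prepended to the source (which is what the name of the passMacrosAsSources field suggests, although that is an assumption). The class MacroOptionsSketch and both of its methods are hypothetical.

```java
import java.util.Map;

// Hypothetical illustration of turning a macro map into compiler input.
class MacroOptionsSketch {
    // Renders each macro as a -Dname=value build option, in the map's iteration order.
    static String asBuildOptions(Map<String, Object> macros) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : macros.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("-D").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Renders each macro as a "#define name value" line to prepend to the program source.
    static String asSourcePrefix(Map<String, Object> macros) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : macros.entrySet()) {
            sb.append("#define ").append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

Passing an insertion-ordered map (such as a LinkedHashMap) keeps the generated options in a predictable order, which matters if the option string is part of a cache signature.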

setFastRelaxedMath

public void setFastRelaxedMath()
Add the -cl-fast-relaxed-math compile option.
Sets the optimization options -cl-finite-math-only and -cl-unsafe-math-optimizations. This allows optimizations for floating-point arithmetic that may violate the IEEE 754 standard and the OpenCL numerical compliance requirements defined in the specification in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5. This option causes the preprocessor macro __FAST_RELAXED_MATH__ to be defined in the OpenCL program.
See also: Khronos' documentation for clBuildProgram.


setNoSignedZero

public void setNoSignedZero()
Add the -cl-no-signed-zero compile option.
Allow optimizations for floating-point arithmetic that ignore the signedness of zero. IEEE 754 arithmetic specifies the behavior of distinct +0.0 and -0.0 values, which then prohibits simplification of expressions such as x+0.0 or 0.0*x (even with -cl-finite-math-only). This option implies that the sign of a zero result isn't significant.
See also: Khronos' documentation for clBuildProgram.


setMadEnable

public void setMadEnable()
Add the -cl-mad-enable compile option.
Allow a * b + c to be replaced by a mad. The mad computes a * b + c with reduced accuracy. For example, some OpenCL devices implement mad by truncating the result of a * b before adding it to c.
See also: Khronos' documentation for clBuildProgram.


setFiniteMathOnly

public void setFiniteMathOnly()
Add the -cl-finite-math-only compile option.
Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or infinities. This option may violate the OpenCL numerical compliance requirements defined in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5.
See also: Khronos' documentation for clBuildProgram.


setUnsafeMathOptimizations

public void setUnsafeMathOptimizations()
Add the -cl-unsafe-math-optimizations option.
Allow optimizations for floating-point arithmetic that (a) assume that arguments and results are valid, (b) may violate IEEE 754 standard and (c) may violate the OpenCL numerical compliance requirements as defined in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5. This option includes the -cl-no-signed-zeros and -cl-mad-enable options.
See also: Khronos' documentation for clBuildProgram.
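
The descriptions of the math options state that some options include others: -cl-fast-relaxed-math sets -cl-finite-math-only and -cl-unsafe-math-optimizations, and the latter includes -cl-no-signed-zeros and -cl-mad-enable. The sketch below expands a requested set of flags into everything it implies. The class MathFlagsSketch is hypothetical; the real expansion is done by the OpenCL compiler, not by JavaCL.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the "option A includes option B" relationships described above.
class MathFlagsSketch {
    static final Map<String, List<String>> IMPLIES = new HashMap<>();
    static {
        IMPLIES.put("-cl-fast-relaxed-math",
                Arrays.asList("-cl-finite-math-only", "-cl-unsafe-math-optimizations"));
        IMPLIES.put("-cl-unsafe-math-optimizations",
                Arrays.asList("-cl-no-signed-zeros", "-cl-mad-enable"));
    }

    // Returns the requested flags plus every flag they transitively imply, sorted by name.
    static Set<String> effectiveFlags(String... requested) {
        Set<String> result = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>(Arrays.asList(requested));
        while (!pending.isEmpty()) {
            String flag = pending.pop();
            if (result.add(flag)) {
                pending.addAll(IMPLIES.getOrDefault(flag, Collections.<String>emptyList()));
            }
        }
        return result;
    }
}
```

Under these rules, requesting only -cl-fast-relaxed-math yields five effective flags, so also calling setMadEnable() or setFiniteMathOnly() would be redundant in that case.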


setNVVerbose

public void setNVVerbose()
Add the -cl-nv-verbose compilation option (NVIDIA GPUs only)
Enable verbose mode. Output will be reported in JavaCL's log at the INFO level


setNVMaximumRegistryCount

public void setNVMaximumRegistryCount(int N)
Add the -cl-nv-maxrregcount=N compilation option (NVIDIA GPUs only)
Specify the maximum number of registers that GPU functions can use. Up to a function-specific limit, a higher value will generally increase the performance of individual GPU threads that execute this function. However, because thread registers are allocated from a global register pool on each GPU, a higher value of this option will also reduce the maximum thread block size, thereby reducing the amount of thread parallelism. Hence, a good maxrregcount value is the result of a trade-off. If this option is not specified, then no maximum is assumed. Otherwise the specified value will be rounded up to the next multiple of 4 registers, up to the GPU-specific maximum of 128 registers.

Parameters:
N - positive integer
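
The rounding rule described above can be worked through in a few lines. This sketch reads "next multiple of 4" as "rounded up to a multiple of 4" and takes the cap of 128 from the text; the class RegisterLimitSketch and its effectiveLimit method are hypothetical, and the actual rounding is performed by the NVIDIA compiler.

```java
// Hypothetical illustration of the maxrregcount rounding rule.
class RegisterLimitSketch {
    static final int STEP = 4;   // granularity stated in the description
    static final int MAX = 128;  // GPU-specific maximum stated in the description

    // Rounds the requested register count up to a multiple of STEP, capped at MAX.
    static int effectiveLimit(int requested) {
        if (requested <= 0) {
            throw new IllegalArgumentException("N must be a positive integer");
        }
        if (requested >= MAX) {
            return MAX;
        }
        return ((requested + STEP - 1) / STEP) * STEP;
    }
}
```

For example, a request of 30 becomes 32, a request of 33 becomes 36, and any request of 128 or more stays at 128.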

setNVOptimizationLevel

public void setNVOptimizationLevel(int N)
Add the -cl-nv-opt-level compilation option (NVIDIA GPUs only)
Specify optimization level (default value: 3)

Parameters:
N - positive integer, or 0 (no optimization).

addBuildOption

public void addBuildOption(String option)
Please see OpenCL's clBuildProgram documentation for details on supported build options.


getOptionsString

protected String getOptionsString()

setCached

public void setCached(boolean cached)

isCached

public boolean isCached()

computeCacheSignature

protected String computeCacheSignature()
                                throws IOException
Throws:
IOException

getProgramBuildInfo

protected Set<String> getProgramBuildInfo(OpenCLLibrary.cl_program pgm,
                                          org.bridj.Pointer<OpenCLLibrary.cl_device_id> deviceIds)

build

public CLProgram build()
                throws CLBuildException
Build the program and return it.

Throws:
CLBuildException

clear

protected void clear()

createKernels

public CLKernel[] createKernels()
                         throws CLBuildException
Return all the kernels found in the program.

Throws:
CLBuildException

createKernel

public CLKernel createKernel(String name,
                             Object... args)
                      throws CLBuildException
Find a kernel by its function name, and optionally bind some arguments to it.

Throws:
CLBuildException

release

public void release()
Manual release of the OpenCL resources represented by this object.
Note that resources are automatically released by the garbage collector, so in general there's no need to call this method.
In an environment with fast allocation/deallocation of large objects, it might be safer to call release() manually, though.
Note that release() does not necessarily free the object immediately: OpenCL maintains a reference count for all its objects, and an object released on the Java side might still be pointed to by running kernels or queued operations.
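
The reference-counting behavior described above can be modeled with a small sketch. It only mimics the counting logic and does not call OpenCL; the class RefCountSketch and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical model of a reference-counted native object.
class RefCountSketch {
    private int refCount = 1;      // the creator (here: the Java wrapper) holds one reference
    private boolean freed = false;

    // Called when something else (e.g. a queued operation) starts using the object.
    void retain() {
        if (freed) {
            throw new IllegalStateException("object already freed");
        }
        refCount++;
    }

    // Drops one reference; the object is only freed when the count reaches zero.
    void release() {
        if (freed) {
            throw new IllegalStateException("object already freed");
        }
        if (--refCount == 0) {
            freed = true;
        }
    }

    boolean isFreed() {
        return freed;
    }
}
```

In this model, if a queued operation has retained the object, the Java-side release() leaves it alive, and it is freed only once the operation drops its own reference.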


getEntities

public static <E extends org.bridj.TypedPointer,A extends com.nativelibs4java.opencl.CLAbstractEntity<E>> org.bridj.Pointer<E> getEntities(A[] objects,
                                                                                                                                           org.bridj.Pointer<E> out)

finalize

protected void finalize()
                 throws Throwable
Overrides:
finalize in class Object
Throws:
Throwable

hashCode

public int hashCode()
Underlying implementation pointer-based hashCode computation

Overrides:
hashCode in class Object

equals

public boolean equals(Object obj)
Underlying implementation pointer-based equality test

Overrides:
equals in class Object


Copyright © 2009-2012. All Rights Reserved.