• jjfumero

New Java JNI library for GPU Programming via Level Zero API Released

Key Takeaways

  • Intel Level Zero is a low-level library used by oneAPI, an ecosystem for developing heterogeneous applications in C++.

  • A new library for programming heterogeneous hardware through the Intel Level Zero API from Java is presented.

  • The Java JNI Level Zero library is ideal for developing runtime systems and compilers to handle execution on modern GPUs.

  • The Java JNI Level Zero library allows access to shared memory, energy consumption metrics, device information, and more.

 

Introduction


At the University of Manchester, we have been working in collaboration with Intel to build a new compiler backend for the TornadoVM programming framework. During this development, we built a couple of standalone libraries. One of them is a JNI library for Intel Level Zero, and we have open-sourced an initial implementation as a standalone project under the MIT license. This implementation covers the set of low-level functions that TornadoVM currently uses to manage execution for its new SPIR-V backend (SPIR-V is a binary intermediate representation for defining compute functions on heterogeneous architectures).


Intel Level Zero is a close-to-bare-metal API for programming heterogeneous architectures, such as GPUs. It is shipped as part of Intel oneAPI, which also includes an implementation of the SYCL standard for parallel programming in C++ and a set of libraries for GPU, FPGA, and multi-core programming.


Intel Level Zero can be seen as a very low-level interface for handling the execution and data of parallel programs close to the actual hardware. Among other features, Intel Level Zero allows developers to use virtual functions, unified memory, device partitioning, instrumentation, and debugging capabilities. Additionally, it provides interfaces to measure performance and energy metrics, as well as hardware diagnostics and power management. For more details, you can check our previous posts, in which we give a general introduction to Level Zero and explain how to report profiling metrics.


The Java JNI library implements a subset of the Level Zero specification 1.1.2, and we expect to continue evolving and adapting this library to the latest versions.


 

How to use it?


First, the Intel compute-runtime driver for the Intel integrated GPU needs to be installed; it includes the driver implementations for both OpenCL and Level Zero. Then, we need to build the Level Zero C++ library from source so that we can build our JNI library against it:


$ git clone https://github.com/oneapi-src/level-zero
$ cd level-zero
$ mkdir build
$ cd build
$ cmake ..
$ cmake --build . --config Release

Then, we can build the Java JNI library for Level Zero:


$ export CPLUS_INCLUDE_PATH=/PATH/to/level-zero/include:$CPLUS_INCLUDE_PATH
$ export LD_LIBRARY_PATH=/PATH/to/level-zero/build/lib:$LD_LIBRARY_PATH 
$ export ZE_SHARED_LOADER="/PATH/to/level-zero/build/lib/libze_loader.so"
$ git clone https://github.com/beehive-lab/levelzero-jni 
$ cd levelzero-jni/levelZeroLib
$ mkdir build
$ cd build
$ cmake .. 
$ make 
$ cd .. 
$ mvn clean package

To check if it is running, you can execute the following command:


$ ./scripts/compileAndRun.sh 
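
If the script fails, it can help to confirm that the variables exported during the build are visible to the JVM. The following is a minimal sketch and is not part of the levelzero-jni repository: the class `LevelZeroEnvCheck` and its `findProblems` method are hypothetical names, and it only checks that `ZE_SHARED_LOADER` points to an existing file and that `LD_LIBRARY_PATH` is set.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of levelzero-jni): sanity-checks the
// environment variables exported in the build instructions.
public class LevelZeroEnvCheck {

    // Returns a list of human-readable problems. An empty list means the
    // two variables inspected here look consistent.
    static List<String> findProblems(Map<String, String> env) {
        List<String> problems = new ArrayList<>();

        String loader = env.get("ZE_SHARED_LOADER");
        if (loader == null || loader.isEmpty()) {
            problems.add("ZE_SHARED_LOADER is not set");
        } else if (!Files.isRegularFile(Paths.get(loader))) {
            problems.add("ZE_SHARED_LOADER does not point to a file: " + loader);
        }

        String ldPath = env.get("LD_LIBRARY_PATH");
        if (ldPath == null || ldPath.isEmpty()) {
            problems.add("LD_LIBRARY_PATH is not set");
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = findProblems(System.getenv());
        if (problems.isEmpty()) {
            System.out.println("Environment looks fine");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Passing the environment as a `Map` keeps the check easy to exercise with made-up values, without touching the real process environment.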
 

Example


As a short example, the following code snippet illustrates how to query the device properties of a heterogeneous device using the Java JNI interface:


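// Assumes `driver` and `driverHandler` were initialized earlier.
// Obtain the first device (index 0) exposed by the driver.
LevelZeroDevice device = driver.getDevice(driverHandler, 0);
// Query the device properties; the call fills `deviceProperties`.
ZeDeviceProperties deviceProperties = new ZeDeviceProperties();
int result = device.zeDeviceGetProperties(device.getDeviceHandlerPtr(), deviceProperties);
// Report the status code returned by the native call.
LevelZeroUtils.errorLog("zeDeviceGetProperties", result);
System.out.println(deviceProperties);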

The output is as follows:

=========================
Device Properties
=========================
STye                : ZE_STRUCTURE_TYPE_DEVICE_PROPERTIES
pNext               : 0
Type                : ZE_DEVICE_TYPE_GPU
vendorId            : 32902
deviceId            : 39876
flags               : 1
subdeviceId         : 0
coreClockRate       : 1250
maxMemAllocSize     : 4294959104
maxHardwareContext  : 65536
maxCommandQueuePriority: 0
numThreadsPerEU     : 7
physicalEUSimdWidth : 8
numEUsPerSubslice   : 8
numSubslicesPerSlice: 3
numSlices           : 1
timerResolution     : 83
timestampValidBits  : 36
kernelTimestampValidBits: 32
uuid                : [134, 128, 0, 0, 196, 155, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
name                : Intel(R) UHD Graphics [0x9bc4]
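
Some of the raw numbers in this output are easier to read after a little arithmetic. As a small sketch that does not use the JNI library at all, the following program hard-codes the values printed above and derives a few quantities from them. The class `DevicePropertiesSummary` and its helper methods are hypothetical names introduced for illustration, and we assume that the total number of execution units (EUs) is the product of slices, subslices per slice, and EUs per subslice.

```java
import java.util.Locale;

// Hypothetical helper: interprets values copied from the printed
// device properties. It does not call into Level Zero.
public class DevicePropertiesSummary {

    // Assumption: total EUs = slices * subslices per slice * EUs per subslice.
    static int totalEUs(int numSlices, int numSubslicesPerSlice, int numEUsPerSubslice) {
        return numSlices * numSubslicesPerSlice * numEUsPerSubslice;
    }

    // Formats an identifier as a four-digit lowercase hexadecimal string.
    static String toHex(int id) {
        return String.format("0x%04x", id);
    }

    public static void main(String[] args) {
        // Values copied from the output above.
        int vendorId = 32902;
        int deviceId = 39876;
        int numThreadsPerEU = 7;
        long maxMemAllocSize = 4294959104L;

        int eus = totalEUs(1, 3, 8);
        double gib = maxMemAllocSize / (1024.0 * 1024.0 * 1024.0);

        System.out.println("vendorId   : " + toHex(vendorId));          // 0x8086
        System.out.println("deviceId   : " + toHex(deviceId));          // 0x9bc4
        System.out.println("total EUs  : " + eus);                      // 24
        System.out.println("HW threads : " + eus * numThreadsPerEU);    // 168
        System.out.println(String.format(Locale.ROOT, "max alloc  : %.2f GiB", gib)); // 4.00 GiB
    }
}
```

The vendor ID 0x8086 is Intel's PCI vendor ID, and the device ID 0x9bc4 matches the identifier shown in the device name.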

The released source code also contains a set of examples.


We hope this library is not only useful for the TornadoVM runtime and compiler, but also for other projects, especially runtimes implemented in Java, for handling heterogeneous execution and memory management from high-level programming languages on top of the JVM.

