Use Cases

GPU-accelerated Llama3.java inference in pure Java using TornadoVM.

The first Java-native implementation of Llama3 that automatically compiles and executes Java code on GPUs via TornadoVM.

GPULlama3.java is available on GitHub.

[Animation: GPULlama3.java running in interactive mode on an NVIDIA RTX 5090 GPU, with nvtop at the bottom tracking GPU utilization.]

[Animation: GPULlama3.java running in instruct mode on an NVIDIA RTX 5090 GPU.]
