Ahead-of-time (AOT) compilation of kernels.
Vulkan support via a SPIR-V backend.
2xFP16 (Half2) support and further optimizations.
Aggressive unrolling of certain loops to improve performance.
Enhanced memory-address space optimizations to improve load/store IO performance.
Support for packing two 16-bit floats into a single 32-bit word (Half2) to improve performance.
Enhanced memory buffers, 16-bit float support, and optimizations.
Support for fixed array buffers.
Performance improvements of generated kernels using if-conversion.
Support for 64-bit length memory buffers.
Support for 16-bit floats to improve performance.
Performance improvements of generated kernels, new Capability API and Kernel Information objects.
Performance improvements of generated kernels using a new O2 optimization pipeline.
Vectorized instruction generation for PTX kernels.
New O2 optimization pipeline including MemoryAddressSpace specializations.
Dynamic shared memory support for OpenCL.
New Capability API to check kernels for compatibility and to select proper software implementations for features that are not supported on a specific hardware platform.
Kernel Information objects and adapted kernel loaders to get more information about the kernels being generated.
Support for meta constants, i.e. specialization of kernels.
Support for dynamic shared memory in CPU/Cuda kernels.
Enhanced shared-memory support to configure the amount of allocated shared memory dynamically.
Support for dynamic partial evaluation of kernels.
Significant performance improvements of the generated code (up to 30%).
Significantly improved error messages.
New optimization and code-generation pipeline.
Improved PTX and OpenCL backends.
Support for local memory via linear arrays in kernels.
Bug-fix release that includes additional helper classes to simplify GPU programming and memory transfers.
Bug fixes for critical code-generation issues on multi-threaded platforms.
Helper classes to simplify GPU<->CPU memory transfers.
Enhanced support for Intel and AMD GPUs via OpenCL.
Performance improvements and bug fixes.
Support for OpenCL-compatible devices (beta).
New extension API and improved intrinsics.
Initial support for Intel and AMD GPUs via OpenCL.
Additional low-level intrinsics to access more GPU hardware functionality.
Additional high-level algorithms to express reduce and scan operations on all ILGPU-compatible devices.
New extension and intrinsic API.
.Net Standard 2.1 (e.g. .Net Core 3.0) support.
Initial support for arrays in GPU kernels.
Enhanced support for multiple GPUs and improved kernel performance.
Improved performance of Cuda kernels.
Initial support for array types in GPU kernels.
Initial native kernel debugging and profiling on GPU hardware.
Test-framework release to verify generated kernel code.
Polished and enhanced version of v0.5.0-beta.
Implementing feedback from the community.
Significantly improved version of the ILGPU compiler.
Basic kernel debugging and profiling on GPU hardware.
Updated IR for enhanced code generation.
Significant compile-time improvements.
Implementing feedback from the community.
Release v0.4.0 Beta
Public Beta Version. New Intermediate Representation (IR), code-transformation phases and backends.
Cross platform support.
Cross platform support without any native dependencies.
Required code-transformation and code-generation phases for NVIDIA GPUs.
Novel Intermediate Representation (IR) for all ILGPU programs.
LLVM dependency for code generation will be removed.
New caching concept and .Net Standard 2.0 support.
Support for selected Linux distributions via build scripts and support for portable PDB debug symbols.
New caching concept to simplify programming.
Support for selected Linux distributions via custom build scripts.
Basic support for portable PDB debug symbols. Enhanced compiler error messages based on detailed debug information.
.Net Standard 2.0 support for full flexibility and cross-platform support.
New support for .Net Core 2.0, convenient kernel loading and caching.
Convenient kernel loading and caching.
New support for .Net Core 2.0. This allows users of the ILGPU compiler to compile and run their kernels on a wide variety of target platforms in the future.
However, the native dependencies have to be adjusted as well.
In order to support new LLVM versions in the future, the LLVMSharp dependency (which is bound to LLVM 3.9.X) will be removed.
It will be replaced by custom LLVM bindings.
Version 0.1.X releases.
Several bug-fix releases based on the main features of the initial ILGPU version.
However, version 0.1.4 (which was released in August) will be the last release of this series.
The next release series, 0.2.X, will contain new features.
Initial Public Release v0.1
Initial public release on GitHub and NuGet.
This version contains all required JIT compilation features, a full-featured CPU runtime and a PTX backend.
Non-public development started in 2016.
Several developers used and tested ILGPU during the initial development phase.
Their feedback and suggestions were taken into account and considerably influenced the development.
Special thanks to Christian Hauck and Denis Müller.