<img src="https://github.com/uxlfoundation/artwork/blob/main/foundation/uxl-foundation-logo-horizontal-color.png" alt="UXL Foundation Logo" width="250"/>

oneAPI Math Library (oneMath)
oneMath is an open-source implementation of the oneMath specification. It can work with multiple devices using multiple libraries (backends) underneath. The oneMath project was previously referred to as oneMKL Interfaces.
oneMath is part of the UXL Foundation. <br/><br/>
<table> <thead> <tr align="center" > <th>User Application</th> <th>oneMath Layer</th> <th>Third-Party Library</th> <th>Hardware Backend</th> </tr> </thead> <tbody> <tr> <td rowspan=14 align="center">oneMath</td> <td rowspan=14 align="center">oneMath selector</td> <td align="center"><a href="https://software.intel.com/en-us/oneapi/onemkl">Intel(R) oneAPI Math Kernel Library (oneMKL)</a></td> <td align="center">x86 CPU, Intel GPU</td> </tr> <tr> <td align="center"><a href="https://developer.nvidia.com/cublas"> NVIDIA cuBLAS</a></td> <td align="center">NVIDIA GPU</td> </tr> <tr> <td align="center"><a href="https://developer.nvidia.com/cusolver"> NVIDIA cuSOLVER</a></td> <td align="center">NVIDIA GPU</td> </tr> <tr> <td align="center"><a href="https://developer.nvidia.com/curand"> NVIDIA cuRAND</a></td> <td align="center">NVIDIA GPU</td> </tr> <tr> <td align="center"><a href="https://developer.nvidia.com/cufft"> NVIDIA cuFFT</a></td> <td align="center">NVIDIA GPU</td> </tr> <tr> <td align="center"><a href="https://developer.nvidia.com/cusparse"> NVIDIA cuSPARSE</a></td> <td align="center">NVIDIA GPU</td> </tr> <tr> <td align="center"><a href="https://www.netlib.org"> NETLIB LAPACK</a> </td> <td align="center">x86 CPU</td> </tr> <tr> <td align="center"><a href="https://rocblas.readthedocs.io/en/rocm-4.5.2/"> AMD rocBLAS</a></td> <td align="center">AMD GPU</td> </tr> <tr> <td align="center"><a href="https://github.com/ROCmSoftwarePlatform/rocSOLVER"> AMD rocSOLVER</a></td> <td align="center">AMD GPU</td> </tr> <tr> <td align="center"><a href="https://github.com/ROCmSoftwarePlatform/rocRAND"> AMD rocRAND</a></td> <td align="center">AMD GPU</td> </tr> <tr> <td align="center"><a href="https://github.com/ROCmSoftwarePlatform/rocFFT"> AMD rocFFT</a></td> <td align="center">AMD GPU</td> </tr> <tr> <td align="center"><a href="https://github.com/ROCmSoftwarePlatform/rocSPARSE"> AMD rocSPARSE</a></td> <td align="center">AMD GPU</td> </tr> <tr> <td align="center"><a
href="https://github.com/uxlfoundation/generic-sycl-components/tree/main/onemath/sycl/blas"> generic SYCL BLAS </a></td> <td align="center">x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU, Other SYCL devices (unsupported)</td> </tr> <tr> <td align="center"><a href="https://github.com/codeplaysoftware/portFFT"> portFFT </a></td> <td align="center">x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU, Other SYCL devices (unsupported)</td> </tr> </tbody> </table>
Support and Requirements
Supported Usage Models:
Host API
There are two oneMath selector layer implementations:
- Run-time dispatching: The application is linked with the oneMath library and the required backend is loaded at run-time based on the device vendor (all libraries should be dynamic).
Example of app.cpp with run-time dispatching:
    #include "oneapi/math.hpp"
    ...
    // Pick one CPU device and one GPU device (SYCL 2020 selector objects).
    sycl::device cpu_dev(sycl::cpu_selector_v);
    sycl::device gpu_dev(sycl::gpu_selector_v);
    sycl::queue cpu_queue(cpu_dev);
    sycl::queue gpu_queue(gpu_dev);
    // The backend is chosen at run-time from the device behind each queue.
    oneapi::math::blas::column_major::gemm(cpu_queue, transA, transB, m, ...);
    oneapi::math::blas::column_major::gemm(gpu_queue, transA, transB, m, ...);
How to build an application with run-time dispatching:
On Linux, use the icpx compiler; on Windows, use the icx compiler. Linux example:
    $> icpx -fsycl -I$ONEMATH/include -c app.cpp
    $> icpx -fsycl app.o -L$ONEMATH/lib -lonemath
- Compile-time dispatching: The application uses a templated backend selector API whose template parameters specify the required backends and third-party libraries. The application is linked with the required oneMath backend wrapper libraries (libraries can be static or dynamic).
Example of app.cpp with compile-time dispatching:
    #include "oneapi/math.hpp"
    ...
    // Pick one CPU device and one GPU device (SYCL 2020 selector objects).
    sycl::device cpu_dev(sycl::cpu_selector_v);
    sycl::device gpu_dev(sycl::gpu_selector_v);
    sycl::queue cpu_queue(cpu_dev);
    sycl::queue gpu_queue(gpu_dev);
    // The backend is fixed at compile time by the selector's template parameter.
    oneapi::math::backend_selector<oneapi::math::backend::mklcpu> cpu_selector(cpu_queue);
    oneapi::math::blas::column_major::gemm(cpu_selector, transA, transB, m, ...);
    oneapi::math::blas::column_major::gemm(oneapi::math::backend_selector<oneapi::math::backend::cublas> {gpu_queue}, transA, transB, m, ...);
How to build an application with compile-time dispatching:
    $> clang++ -fsycl -I$ONEMATH/include -c app.cpp
    $> clang++ -fsycl app.o -L$ONEMATH/lib -lonemath_blas_mklcpu -lonemath_blas_cublas
Refer to Selecting a Compiler for the choice between the icpx/icx and clang++ compilers.
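The difference between the two dispatching styles can be modeled in plain C++ without SYCL or oneMath installed. The sketch below is only an illustration under stated assumptions: `mock_queue`, `vendor`, and the local `backend_selector`, `backend`, and `gemm` are hypothetical stand-ins that merely mirror the names used above, and each `gemm` simply returns the name of the wrapper library (taken from the link line above) that the call would resolve to. It is not how oneMath is implemented.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; none of these are oneMath or SYCL types.
enum class backend { mklcpu, cublas };
enum class vendor { x86_cpu, nvidia_gpu };

// Plays the role of sycl::queue: it only remembers which device it targets.
struct mock_queue {
    vendor dev_vendor;
};

// Compile-time dispatching: the backend is part of the selector's type.
template <backend B>
struct backend_selector {
    explicit backend_selector(mock_queue& q) : queue(q) {}
    mock_queue& queue;
};

// One overload per backend; overload resolution picks one while compiling.
// Each returns the wrapper library name the call would resolve to.
inline std::string gemm(backend_selector<backend::mklcpu>) { return "onemath_blas_mklcpu"; }
inline std::string gemm(backend_selector<backend::cublas>) { return "onemath_blas_cublas"; }

// Run-time dispatching: the device behind the queue is inspected on each call.
inline std::string gemm(mock_queue& q) {
    switch (q.dev_vendor) {
        case vendor::x86_cpu: return gemm(backend_selector<backend::mklcpu>{q});
        case vendor::nvidia_gpu: return gemm(backend_selector<backend::cublas>{q});
    }
    throw std::invalid_argument("no backend available for this device");
}

// Exercises both styles on a mock CPU queue and a mock GPU queue.
inline void check_dispatch() {
    mock_queue cpu_queue{vendor::x86_cpu};
    mock_queue gpu_queue{vendor::nvidia_gpu};
    assert(gemm(cpu_queue) == "onemath_blas_mklcpu");
    assert(gemm(gpu_queue) == "onemath_blas_cublas");
    assert(gemm(backend_selector<backend::cublas>{gpu_queue}) == "onemath_blas_cublas");
}
```

In this model the run-time path decides per call, which is why the real run-time selector needs every backend available as a dynamic library, while the compile-time path commits to one backend in the type and only needs the wrapper libraries that are actually named.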
Device API
The header-based, backend-independent Device API can be called within a SYCL kernel or from host code (device-rng-usage-model-example). Currently, the following domains support the Device API:
- RNG. To use the RNG Device API, include the `oneapi/math/rng/device.hpp` header file.
Supported Configurations:
Supported domains include: BLAS, LAPACK, RNG, DFT, SPARSE_BLAS
Supported compilers include:
- Intel(R) oneAPI DPC++ Compiler: Intel proprietary compiler that supports CPUs and Intel GPUs. Intel(R) oneAPI DPC++ Compiler will be referred to as "Intel DPC++" in the "Supported Compiler" column of the tables below.
- oneAPI DPC++ Compiler: Open source compiler that supports CPUs and Intel, NVIDIA, and AMD GPUs. oneAPI DPC++ Compiler will be referred to as "Open DPC++" in the "Supported Compiler" column of the tables below.
- AdaptiveCpp Compiler (formerly known as hipSYCL): Open source compiler that supports CPUs and Intel, NVIDIA, and AMD GPUs.<br/>Note: The source code and some documents in this project still use the previous name hipSYCL during this transition period.
Linux*
<table> <thead> <tr align="center" > <th>Domain</th> <th>Backend</th> <th>Library</th> <th>Supported Compiler</th> <th>Supported Link Type</th> </tr> </thead> <tbody> <tr> <td rowspan=10 align="center">BLAS</td> <td rowspan=3 align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</br>AdaptiveCpp</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">NETLIB LAPACK</td> <td align="center">Intel DPC++</br>Open DPC++</br>AdaptiveCpp</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">generic SYCL BLAS</td> <td align="center">Intel DPC++</br>Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">generic SYCL BLAS</td> <td align="center">Intel DPC++</br>Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">NVIDIA GPU</td> <td align="center">NVIDIA cuBLAS</td> <td align="center">Open DPC++</br>AdaptiveCpp</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">generic SYCL BLAS</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">AMD GPU</td> <td align="center">AMD rocBLAS</td> <td align="center">Open DPC++</br>AdaptiveCpp</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">generic SYCL BLAS</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=1 align="center">Other SYCL devices (unsupported)</td> <td align="center">generic SYCL BLAS</td> <td align="center">Intel DPC++</br>Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=4 align="center">LAPACK</td> <td align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td 
align="center">Dynamic, Static</td> </tr> <tr> <td align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">NVIDIA GPU</td> <td align="center">NVIDIA cuSOLVER</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">AMD GPU</td> <td align="center">AMD rocSOLVER</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=4 align="center">RNG</td> <td align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</br>AdaptiveCpp</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">NVIDIA GPU</td> <td align="center">NVIDIA cuRAND</td> <td align="center">Open DPC++</br>AdaptiveCpp</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">AMD GPU</td> <td align="center">AMD rocRAND</td> <td align="center">Open DPC++</br>AdaptiveCpp</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=9 align="center">DFT</td> <td rowspan=2 align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">portFFT (<a href="https://github.com/codeplaysoftware/portFFT#supported-configurations">limited API support</a>)</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">portFFT (<a href="https://github.com/codeplaysoftware/portFFT#supported-configurations">limited API support</a>)</td> <td 
align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">NVIDIA GPU</td> <td align="center">NVIDIA cuFFT</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">portFFT (<a href="https://github.com/codeplaysoftware/portFFT#supported-configurations">limited API support</a>)</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">AMD GPU</td> <td align="center">AMD rocFFT</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">portFFT (<a href="https://github.com/codeplaysoftware/portFFT#supported-configurations">limited API support</a>)</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=1 align="center">Other SYCL devices (unsupported)</td> <td align="center">portFFT</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=4 align="center">SPARSE_BLAS</td> <td align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">NVIDIA GPU</td> <td align="center">NVIDIA cuSPARSE</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">AMD GPU</td> <td align="center">AMD rocSPARSE</td> <td align="center">Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> </tbody> </table>

Windows*
<table> <thead> <tr align="center" > <th>Domain</th> <th>Backend</th> <th>Library</th> <th>Supported Compiler</th> <th>Supported Link Type</th> </tr> </thead> <tbody> <tr> <td rowspan=3 align="center">BLAS</td> <td rowspan=2 align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">NETLIB LAPACK</td> <td align="center">Intel DPC++<br/>Open DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">LAPACK</td> <td align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td rowspan=2 align="center">RNG</td> <td align="center">x86 CPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> <tr> <td align="center">Intel GPU</td> <td align="center">Intel(R) oneMKL</td> <td align="center">Intel DPC++</td> <td align="center">Dynamic, Static</td> </tr> </tbody> </table>

Hardware Platform Support
- CPU
    - Intel Atom(R) Processors
    - Intel(R) Core(TM) Processor Family
    - Intel(R) Xeon(R) Processor Family
- Accelerators
    - Intel(R) Arc(TM) A-Series Graphics
    - Intel(R) Data Center GPU Max Series
    - NVIDIA(R) A100 (Linux* only)
    - AMD(R) GPUs, tested on AMD Vega 20 (gfx906)
    - Other SYCL devices can be used, but are not supported
Supported Operating Systems
Linux*
Backend | Supported Operating System |
---|---|
x86 CPU | Red Hat Enterprise Linux* 9 (RHEL* 9) |
Intel GPU | Ubuntu 24.04 LTS |
NVIDIA GPU | Ubuntu 22.04 LTS |
Windows*
Backend | Supported Operating System |
---|---|
x86 CPU | Microsoft Windows* Server 2022 |
Intel GPU | Microsoft Windows* 11 |
Software Requirements
What should I download?
General:
<table> <thead> <tr align="center"> <th> Functional Testing </th> <th> Build Only </th> <th>Documentation</th> </tr> </thead> <tbody> <tr> <td colspan=3 align=center> CMake (version 3.13 or newer) </td> </tr> <tr> <td colspan=3 align=center> Linux* : GNU* GCC 5.1 or higher <br> Windows* : MSVS* 2017 or MSVS* 2019 (version 16.5 or newer) </td> </tr> <tr> <td colspan=3 align=center> Ninja (optional) </td> </tr> <tr> <td> GNU* FORTRAN Compiler </td> <td> - </td> <td> Sphinx </td> </tr> <tr> <td> NETLIB LAPACK </td> <td> - </td> <td> - </td> </tr> </tbody> </table>

Hardware and OS Specific:
<table> <thead> <tr align="center"> <th>Operating System</th> <th>Device</th> <th>Package</th> </tr> </thead> <tbody> <tr> <td rowspan=5> Linux*/Windows* </td> <td rowspan=2> x86 CPU </td> <td> Intel(R) oneAPI DPC++ Compiler <br> or <br> oneAPI DPC++ Compiler </td> </tr> <tr> <td> Intel(R) oneAPI Math Kernel Library </td> </tr> <tr> <td rowspan=3> Intel GPU </td> <td> Intel(R) oneAPI DPC++ Compiler </td> </tr> <tr> <td> Intel GPU driver </td> </tr> <tr> <td> Intel(R) oneAPI Math Kernel Library </td> </tr> <tr> <td rowspan=2> Linux* only </td> <td> NVIDIA GPU </td> <td> oneAPI DPC++ Compiler <br> or <br> AdaptiveCpp with CUDA backend and dependencies </td> </tr> <tr> <td> AMD GPU </td> <td> oneAPI DPC++ Compiler <br> or <br> AdaptiveCpp with ROCm backend and dependencies </td> </tr> </tbody> </table>

Product and Version Information:
Documentation
- Contents
- About
- Get Started
- Developer Reference
- Integrating a Third-Party Library
Governance
The oneMath project is governed by the UXL Foundation and you can get involved in this project in multiple ways. It is possible to join the Math Special Interest Group (SIG) meetings where the group discusses and demonstrates work using this project. Members can also join the Open Source and Specification Working Group meetings.
You can also join the mailing lists for the UXL Foundation to be informed of when meetings are happening and receive the latest information and discussions.
Contributing
You can contribute to this project and also to the specification for this project. Please read the CONTRIBUTING page for more information. You can also contact oneMath developers and maintainers via the UXL Foundation Slack using the #onemath channel.
License
Distributed under the Apache License 2.0. See LICENSE for more information.
FAQs
oneMath
Q: What is the difference between the following items?
- The oneAPI Specification for oneMath
- The oneAPI Math Library (oneMath) project
- The Intel(R) oneAPI Math Kernel Library (oneMKL) Product
A:
- The oneAPI Specification for oneMath defines the SYCL interfaces for performance math library functions. The oneMath specification can evolve faster and more frequently than implementations of the specification.
- The oneAPI Math Library (oneMath) project is an open source implementation of the specification. The project goal is to demonstrate how the SYCL interfaces documented in the oneMath specification can be implemented for any math library and work for any target hardware. While the implementation provided here may not yet be the full implementation of the specification, the goal is to build it out over time. We encourage the community to contribute to this project and help to extend support to multiple hardware targets and other math libraries.
- The Intel(R) oneAPI Math Kernel Library (oneMKL) is an Intel product provided as part of the Intel(R) oneAPI Base Toolkit. It is used for the Intel backends of oneMath. Its C++ API is very similar to the oneMath specification. It is highly optimized for Intel CPU and Intel GPU hardware.
Q: I'm trying to use oneMath in my project using `FetchContent`, but I keep running into the `ONEMATH::SYCL::SYCL target was not found` problem when I try to build the project. What should I do?
A:
Make sure you set the compiler when you configure your project, e.g. `cmake -Bbuild . -DCMAKE_CXX_COMPILER=icpx`.
Q: I'm trying to use oneMath in my project using `find_package(oneMath)`. I set up the oneMath/oneTBB and compiler environment first, then built and installed oneMath, and finally tried to build my project against the installed oneMath (e.g. `cmake -Bbuild -GNinja -DCMAKE_CXX_COMPILER=icpx -DoneMath_ROOT=<path_to_installed_oneMath> .`). I noticed that CMake adds the installed oneMath headers as a system include, which gives them lower priority than the includes of the installed Intel(R) oneAPI Math Kernel Library package that I set up earlier for building oneMath. As a result, I get conflicts between the Intel(R) oneAPI Math Kernel Library headers and the installed oneMath headers. What should I do?
A:
Having the installed oneMath headers added with `-I` instead of as system includes (`-isystem`) resolves this problem. We use `INTERFACE_INCLUDE_DIRECTORIES` to add paths to the installed oneMath headers (check `oneMathTargets.cmake` in `lib/cmake` to find it). It's a known limitation that `INTERFACE_INCLUDE_DIRECTORIES` adds header paths as system includes. To avoid that:
- Option 1: Use CMake >= 3.25. In this case oneMath will be built with the `EXPORT_NO_SYSTEM` property set to `true` and you won't see the issue.
- Option 2: If you use CMake < 3.25, set `PROPERTIES NO_SYSTEM_FROM_IMPORTED true` for your target. E.g.: `set_target_properties(test PROPERTIES NO_SYSTEM_FROM_IMPORTED true)`.