Austin Schuh | 9a24b37 | 2018-01-28 16:12:29 -0800 | 1 | BLASFEO - BLAS For Embedded Optimization |
| 2 | |
| 3 | BLASFEO provides a set of linear algebra routines optimized for use in embedded optimization. |
| 4 | It is employed, for example, in the Model Predictive Control software package HPMPC. |
| 5 | |
| 6 | BLASFEO provides three implementations of each linear algebra routine (LA): |
| 7 | - HIGH_PERFORMANCE: a high-performance implementation hand-optimized for different computer architectures. |
| 8 | - REFERENCE: a lightly-optimized version, coded entirely in C without assumptions about the computer architecture. |
| 9 | - BLAS: a wrapper to BLAS and LAPACK routines. |
| 10 | |
| 11 | The currently supported computer architectures (TARGET) are: |
| 12 | - X64_INTEL_HASWELL: Intel Haswell architecture or newer, AVX2 and FMA ISAs, 64-bit OS. |
| 13 | - X64_INTEL_SANDY_BRIDGE: Intel Sandy-Bridge architecture or newer, AVX ISA, 64-bit OS. |
| 14 | - X64_INTEL_CORE: Intel Core architecture or newer, SSE3 ISA, 64-bit OS. |
| 15 | - X64_AMD_BULLDOZER: AMD Bulldozer architecture, AVX and FMA ISAs, 64-bit OS. |
| 16 | - ARMV8A_ARM_CORTEX_A57: ARMv8A architecture, VFPv4 and NEONv2 ISAs, 64-bit OS. |
| 17 | - ARMV7A_ARM_CORTEX_A15: ARMv7A architecture, VFPv3 and NEON ISAs, 32-bit OS. |
| 18 | - GENERIC: generic target, coded in C, giving better performance if the architecture provides more than 16 scalar FP registers (e.g. many RISC architectures such as ARM). |
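
As a rough illustration of how the x86_64 entries in the list above relate to ISA extensions, the following sketch maps the compiler's predefined macros (`__AVX2__`, `__FMA__`, `__AVX__`, `__SSE3__`, as set by GCC and Clang) to a TARGET name. The helpers `demo_suggest_target` and `demo_target_is_x64` are hypothetical and not part of BLASFEO, and the macros only reflect the flags the compiler was given (for example `-march=native`), not a runtime check of the CPU. The AMD and ARM targets are left out for brevity, so everything else falls back to GENERIC.

```c
#include <string.h>

/* Hypothetical helper, not part of BLASFEO: suggests a TARGET name from
 * the ISA macros predefined by GCC/Clang for the current compile flags. */
const char *demo_suggest_target(void)
{
#if defined(__x86_64__) && defined(__AVX2__) && defined(__FMA__)
    return "X64_INTEL_HASWELL";       /* AVX2 and FMA available */
#elif defined(__x86_64__) && defined(__AVX__)
    return "X64_INTEL_SANDY_BRIDGE";  /* AVX available, no AVX2+FMA */
#elif defined(__x86_64__) && defined(__SSE3__)
    return "X64_INTEL_CORE";          /* SSE3 available, no AVX */
#else
    return "GENERIC";                 /* plain C fallback */
#endif
}

/* Hypothetical helper: true if the name denotes one of the x86_64 targets. */
int demo_target_is_x64(const char *target)
{
    return strncmp(target, "X64_", 4) == 0;
}
```

Building the sketch with different `-m` flags (for instance `-mavx2 -mfma` versus none) changes the suggested name, which mirrors the idea that the TARGET is fixed when the library is built.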
| 19 | |
| 20 | The optimized linear algebra kernels are currently provided for OS_LINUX (x86_64 64-bit, ARMv8A 64-bit, ARMv7A 32-bit), OS_WINDOWS (x86_64 64-bit) and OS_MAC (x86_64 64-bit). |
| 21 | |
| 22 | BLASFEO employs structures to describe matrices (d_strmat) and vectors (d_strvec), defined in include/blasfeo_common.h. |
| 23 | The actual implementation of d_strmat and d_strvec depends on the LA and TARGET choice. |
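
To make the idea of a build-dependent matrix structure more concrete, here is a minimal, self-contained sketch. It is not the definition from include/blasfeo_common.h: the names `demo_strmat`, `DEMO_LA_HIGH_PERFORMANCE` and `DEMO_PS` are hypothetical, and the layouts are assumptions chosen for illustration (a panel-major layout with 4-row panels for the optimized build, plain column-major otherwise). The point is that code which goes through accessor functions does not need to know which layout was compiled in.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical build switch standing in for the LA choice. */
#ifndef DEMO_LA_HIGH_PERFORMANCE
#define DEMO_LA_HIGH_PERFORMANCE 1
#endif

#define DEMO_PS 4 /* assumed panel height (rows per panel) */

struct demo_strmat {
    int m;      /* number of rows */
    int n;      /* number of columns */
#if DEMO_LA_HIGH_PERFORMANCE
    int pm;     /* rows rounded up to a multiple of DEMO_PS */
#endif
    double *pA; /* element storage, layout depends on the build */
};

/* Allocate zero-initialized storage; returns 0 on success, -1 on failure. */
int demo_create(struct demo_strmat *sA, int m, int n)
{
    sA->m = m;
    sA->n = n;
#if DEMO_LA_HIGH_PERFORMANCE
    sA->pm = (m + DEMO_PS - 1) / DEMO_PS * DEMO_PS;
    sA->pA = calloc((size_t)sA->pm * (size_t)n, sizeof(double));
#else
    sA->pA = calloc((size_t)m * (size_t)n, sizeof(double));
#endif
    return sA->pA != NULL ? 0 : -1;
}

/* Linear offset of element (i, j) in the layout selected at build time. */
size_t demo_index(const struct demo_strmat *sA, int i, int j)
{
    assert(i >= 0 && i < sA->m && j >= 0 && j < sA->n);
#if DEMO_LA_HIGH_PERFORMANCE
    /* panel-major: panels of DEMO_PS rows, column-major inside a panel */
    return (size_t)(i / DEMO_PS) * DEMO_PS * (size_t)sA->n
         + (size_t)j * DEMO_PS + (size_t)(i % DEMO_PS);
#else
    /* column-major with leading dimension m */
    return (size_t)j * (size_t)sA->m + (size_t)i;
#endif
}

double demo_get(const struct demo_strmat *sA, int i, int j)
{
    return sA->pA[demo_index(sA, i, j)];
}

void demo_set(struct demo_strmat *sA, int i, int j, double v)
{
    sA->pA[demo_index(sA, i, j)] = v;
}

void demo_free(struct demo_strmat *sA)
{
    free(sA->pA);
    sA->pA = NULL;
}
```

Callers that only use `demo_get` and `demo_set` compile unchanged with `-DDEMO_LA_HIGH_PERFORMANCE=0`, whereas code that indexes `pA` directly would silently break when the layout changes. The same reasoning suggests treating d_strmat and d_strvec as opaque and accessing them through the routines the library provides.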
| 24 | |
| 25 | More information about BLASFEO can be found in the arXiv paper at https://arxiv.org/abs/1704.02457 |