• MachSuite - MachSuite is a benchmark suite intended for accelerator-centric research.
  • nanoBench - A tool for running small microbenchmarks on recent Intel and AMD x86 CPUs using hardware performance counters.
  • kerncraft - This tool allows automatic analysis of loop kernels using the Execution-Cache-Memory (ECM) model, the Roofline model, and actual benchmarks. kerncraft provides a framework for investigating data reuse and cache requirements through static code analysis. In combination with the Intel IACA tool, kerncraft can give a good overview of both in-core and memory bottlenecks and use that data to apply performance models.
  • TAU Performance System - A portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, and Python.
  • Scalene - A high-performance CPU, GPU, and memory profiler for Python.
  • WhyProfiler - WhyProfiler is a CPU profiler for Jupyter notebooks that not only identifies hotspots but can also suggest faster alternatives.


  • CHAI - Chai is a benchmark suite of Collaborative Heterogeneous Applications for Integrated-architectures. The Chai benchmarks are designed to use the latest features of heterogeneous architectures such as shared virtual memory and system-wide atomics to achieve efficient simultaneous collaboration between host and accelerator devices.