Tools and Utilities

1. Tools

1.1. Performance Analysis

1.2. Profiling

Memory profilers:

  • Scalene - A high-performance CPU, GPU and memory profiler for Python

  • memray - Python memory profiler for tracking allocations, leaks, and peak memory usage with low overhead and detailed call stacks

  • heaptrack - Heap memory profiler for C/C++ applications that records allocations, identifies leaks, and attributes memory usage to call paths over time

  • guppy3 - Python heap analysis and profiling tool that provides heap snapshots, object graphs, and memory growth analysis for debugging memory usage
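
To make "tracking allocations and peak memory usage" concrete, here is a minimal sketch that uses Python's standard-library `tracemalloc` module. It is a stand-in, not one of the tools listed above, and those tools report far more detail. The helper names `measure_peak` and `build_table` are hypothetical and exist only for this illustration.

```python
import tracemalloc


def measure_peak(fn, *args, **kwargs):
    """Run fn under tracemalloc; return (result, peak_bytes, top_stats).

    Hypothetical helper for illustration only.
    """
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        # get_traced_memory() returns (current_bytes, peak_bytes).
        _current, peak = tracemalloc.get_traced_memory()
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Group live allocations by source line and keep the three largest.
    top = snapshot.statistics("lineno")[:3]
    return result, peak, top


def build_table(n):
    """Toy workload (an assumption): allocate n separate 1 KiB buffers."""
    return [bytes(1024) for _ in range(n)]


if __name__ == "__main__":
    table, peak, top = measure_peak(build_table, 1000)
    print(f"peak traced memory: {peak / 1024:.0f} KiB")
    for stat in top:
        print(stat)
```

The per-line statistics are a rough analogue of the call-path attribution that the dedicated profilers provide.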

1.3. Benchmarking

  • Pointer chasing - Curious Coding's walkthrough of pointer chasing and other memory optimizations

  • Multichase - Pointer-chasing benchmark that measures memory bandwidth and loaded latency

  • PARAM - Repository of communication and compute micro-benchmarks as well as full workloads for evaluating training and inference platforms

  • Google Workload Traces - Warehouse-scale traces captured using DynamoRIO's drmemtrace. The traces record instruction and memory accesses, as described in the Trace Format documentation

  • MLPerf - Consistent measurements of accuracy, speed, and efficiency on hardware for ML workloads

  • tp-parsec - Task-Parallel PARSEC

  • CHAI - A benchmark suite of Collaborative Heterogeneous Applications for Integrated-architectures. The benchmarks use recent features of heterogeneous architectures, such as shared virtual memory and system-wide atomics, to achieve efficient simultaneous collaboration between host and accelerator devices.

  • MachSuite - A benchmark suite intended for accelerator-centric research.
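
The pointer-chasing entries in this list rely on a chain of dependent loads: each access reads the index of the next one, so the hardware cannot overlap or prefetch them. A minimal Python sketch of the idea follows. It assumes a random single-cycle permutation built with Sattolo's algorithm, and the names `build_chain`, `chase`, and `ns_per_step` are hypothetical. Interpreter overhead dominates in Python, so the timings only illustrate the method. Real measurements need native code.

```python
import random
import time


def build_chain(n, seed=0):
    """Return a list nxt where following nxt[i] visits all n slots in one cycle.

    Uses Sattolo's algorithm, which always yields a single n-cycle.
    """
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Follow the chain; every load depends on the previous one."""
    idx = start
    for _ in range(steps):
        idx = nxt[idx]
    return idx


def ns_per_step(n, steps):
    """Average nanoseconds per dependent access (illustrative only)."""
    nxt = build_chain(n)
    t0 = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - t0) / steps * 1e9


if __name__ == "__main__":
    for n in (1 << 10, 1 << 16, 1 << 20):
        print(f"{n:>8} slots: {ns_per_step(n, 1_000_000):.1f} ns/step")
```

Using a single cycle matters: a plain random permutation can contain short cycles, which would keep the chase inside a small, cache-resident subset of the array.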

1.4. Simulators

1.4.1. System

  • ASTRA-Sim - Distributed deep learning training simulator, developed in collaboration between Georgia Tech, Meta and Intel.

  • SST - Structural Simulation Toolkit: using the supercomputers of today to build the supercomputers of tomorrow

1.4.2. Architectural

  • gem5 - A modular platform for computer-system architecture research, encompassing system-level architecture as well as processor microarchitecture.

  • CPUlator - Computer system simulator designed as a tool for learning assembly-language programming and computer organization

  • ESESC - A fast multiprocessor simulator with detailed power, thermal, and performance models for modern out-of-order multicores.

  • Multi2Sim - A heterogeneous system simulator of CPUs and GPUs, used to test and validate new hardware designs before they are physically manufactured.

  • SniperSim - A multi-core, parallel, high-speed and accurate x86 simulator.

  • ZSim - A fast x86-64 simulator with a focus on simulating memory hierarchies and large, heterogeneous systems

  • MacSim - A heterogeneous architecture timing model simulator.

  • CACTI - An analytical tool that takes a set of cache/memory parameters as input and calculates its access time, power, cycle time, and area.

  • WATTCH - Architectural simulator that estimates CPU power consumption (example)

  • HotSpot - An accurate and fast thermal model suitable for use in architectural studies.

  • SESC - SuperScalar simulator, a cycle-accurate architectural simulator that models a very wide set of architectures: single processors, CMPs, PIMs, and thread-level speculation.

Full-system simulators and their popularity in conferences (table from 2015)

1.4.3. Hardware Accelerator (incl. GPU) simulators

  • Accel-Sim - Simulation framework for simulating and validating programmable accelerators like GPUs

  • GPGPU-Sim - Provides a detailed simulation model of contemporary GPUs

  • FireSim - An open-source, cycle-accurate, FPGA-accelerated full-system hardware simulation platform that runs on cloud FPGAs

  • MGPUSim - A Go-based AMD GCN3 GPU simulator built on the Akita framework.

  • SCALE-Sim - A CNN accelerator simulator that provides cycle-accurate timing, power/energy, memory bandwidth and trace results for a specified accelerator configuration and neural network architecture.

  • STONNE - Simulation TOol of Neural Network Engines, a cycle-level, highly modular and highly extensible simulation framework that can plug into any high-level DNN framework as an accelerator device and perform end-to-end evaluation of flexible accelerator microarchitectures with sparsity support, running complete DNN models.

1.4.4. Micro-architecture/ISA simulators

1.4.5. Memory Simulators

1.4.6. Interconnect Simulators

1.5. Open Source Tools for Digital Design

1.5.1. Digital/Analog Simulators

1.6. Emulators

2. Utilities

2.1. Hardware Design

3. Misc
