jeremy/HON95-wiki @ 7e5f0a1d5a092168a17dece9c719b6082d07e44b - Gogs

title: CUDA breadcrumbs:

title: Software Engineering
title: HPC --- {% include header.md %}

Related Pages

{:.no_toc}

CUDA (configuration)

Tools

CUDA-GDP

TODO

Nsight Compute

TODO Description
When it reruns kernels for different tests, it restores the GPU state but not the host state. If this causes incorrect behavior, set --replay-mode=application to rerun the entire application instead.

CUDA-MEMCHECK

For checking correctness and discovering memory bugs.
Example usage: cuda-memcheck --leak-check full <application>

nvprof

For profiling CUDA applications.
No longer supported for devices with compute capability 7.5 and higher. Use Nsight Compute instead.
Example usage to show which CUDA calls and kernels tak the longest to run: sudo nvprof <application>
Example usage to show interesting metrics: sudo nvprof --metrics "eligible_warps_per_cycle,achieved_occupancy,sm_efficiency,alu_fu_utilization,dram_utilization,inst_replay_overhead,gst_transactions_per_request,l2_utilization,gst_requested_throughput,flop_count_dp,gld_transactions_per_request,global_cache_replay_overhead,flop_dp_efficiency,gld_efficiency,gld_throughput,l2_write_throughput,l2_read_throughput,branch_efficiency,local_memory_overhead" <application>

{% include footer.md %}