The NVIDIA® CUDA® Toolkit provides a comprehensive development environment for C and C++ developers building GPU-accelerated applications. The CUDA Toolkit includes a compiler for NVIDIA GPUs, math libraries, and tools for debugging and optimizing the performance of your applications. You’ll also find programming guides, user manuals, API references, and other documentation to help you quickly get started accelerating your applications with GPUs.
NEW in CUDA 4
Easier Application Porting
- Share GPUs across multiple threads
- Use all GPUs from a single host thread
- No-copy pinning of system memory, a faster alternative to cudaMallocHost()
- C++ new/delete and support for virtual functions
- Support for inline PTX assembly
- Thrust library of templated primitives
- NVIDIA Performance Primitives (NPP) library
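The "no-copy pinning" item above refers to page-locking memory the application has already allocated, rather than allocating a fresh pinned buffer with cudaMallocHost(). A minimal sketch follows, assuming the runtime call `cudaHostRegister()` is the mechanism in question; the helper name `roundtrip_with_registered_memory` is hypothetical, and some toolkit releases impose alignment requirements on the registered range, so check the documentation for your version.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: pin an existing host buffer in place, copy it to the
// device and back, then unpin it. Returns true if every CUDA call succeeded.
bool roundtrip_with_registered_memory(float* host, size_t count) {
    const size_t bytes = count * sizeof(float);
    float* device = nullptr;

    // Page-lock memory the application already owns; no extra host copy is made.
    if (cudaHostRegister(host, bytes, cudaHostRegisterDefault) != cudaSuccess)
        return false;

    bool ok = cudaMalloc(&device, bytes) == cudaSuccess
           && cudaMemcpy(device, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(host, device, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(device);          // a null pointer is accepted here
    cudaHostUnregister(host);  // release the page lock; the buffer itself stays valid
    return ok;
}
```

The buffer can come from `malloc`, `new`, or a container, which is what makes this approach convenient when porting code that already manages its own host memory.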
Faster Multi-GPU Programming
- Unified Virtual Addressing
- Peer-to-Peer Communication
New & Improved Developer Tools
- New LLVM-based compiler
- Automated performance analysis in Visual Profiler
- C++ debugging in CUDA-GDB for Linux & MacOS
- GPU binary disassembler (cuobjdump)
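The peer-to-peer communication feature listed above can be sketched with the runtime's peer-access calls. This is an illustrative sketch rather than NVIDIA's sample code: the helper name `copy_between_gpus` is hypothetical, and it assumes both buffers were allocated with `cudaMalloc` on their respective devices.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: copy `bytes` from a buffer on device `src_dev` to a
// buffer on device `dst_dev`, enabling direct peer access when the hardware
// and driver report that it is available.
cudaError_t copy_between_gpus(void* dst, int dst_dev,
                              const void* src, int src_dev, size_t bytes) {
    int previous = 0;
    cudaError_t err = cudaGetDevice(&previous);
    if (err != cudaSuccess) return err;

    int can_access = 0;
    err = cudaDeviceCanAccessPeer(&can_access, dst_dev, src_dev);
    if (err != cudaSuccess) return err;

    if (can_access) {
        // Peer access is enabled from the point of view of the current device.
        cudaSetDevice(dst_dev);
        err = cudaDeviceEnablePeerAccess(src_dev, 0);
        cudaSetDevice(previous);  // restore the caller's device
        if (err == cudaErrorPeerAccessAlreadyEnabled) {
            cudaGetLastError();   // clear the non-fatal error state
            err = cudaSuccess;
        }
        if (err != cudaSuccess) return err;
    }

    // The copy is requested the same way whether or not peer access was enabled.
    return cudaMemcpyPeer(dst, dst_dev, src, src_dev, bytes);
}
```

Because all devices are driven from one host thread here, the sketch also touches on the "use all GPUs from a single host thread" item: `cudaSetDevice` switches the device that subsequent calls target.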
CUDA 5 Release Candidate
(Now available to all developers)
Nsight, Eclipse Edition
- Develop, Debug and Optimize… All in one IDE
RDMA for GPUDirect
- Direct communication between GPUs and other PCIe devices
GPU Library Object Linking
- Libraries and plug-ins for GPU code
Dynamic Parallelism
- Easily accelerate parallel nested loops starting with Tesla K20 Kepler GPUs
Watch the 5 min CUDA 5 Overview by Ian Buck
Register for the CUDA 5 Webinars for more details on the new features in this release
Try CUDA 5 and share your feedback with us!
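To illustrate what Dynamic Parallelism means for nested loops, here is a minimal sketch in which a parent kernel handles the outer loop (rows) and launches a child grid for each inner loop (columns). The kernel and helper names are hypothetical. It assumes a GPU of compute capability 3.5 or higher and a build with relocatable device code enabled (`-rdc=true`) and linked against the device runtime library (`cudadevrt`).

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Child kernel: the inner loop. Scales one row of a row-major matrix.
__global__ void scale_row(float* row, int cols, float factor) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c < cols) row[c] *= factor;
}

// Parent kernel: the outer loop. One thread per row; each thread launches
// a child grid from the device to process its row.
__global__ void scale_rows(float* matrix, int rows, int cols, float factor) {
    int r = blockIdx.x * blockDim.x + threadIdx.x;
    if (r < rows) {
        int threads = 128;
        int blocks = (cols + threads - 1) / threads;
        scale_row<<<blocks, threads>>>(matrix + (size_t)r * cols, cols, factor);
    }
}

// Hypothetical host wrapper: launch the parent grid and wait for it, and
// therefore for all of its child grids, to finish.
cudaError_t scale_matrix(float* d_matrix, int rows, int cols, float factor) {
    int threads = 64;
    int blocks = (rows + threads - 1) / threads;
    scale_rows<<<blocks, threads>>>(d_matrix, rows, cols, factor);
    cudaError_t err = cudaGetLastError();  // catches launch-configuration errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();
}
```

For a loop this regular, a single two-dimensional grid would likely be simpler and faster; device-side launches are more interesting when the amount of inner work is only known on the GPU.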
Download CUDA 4 Today
Download CUDA 5 RC Today
Members of the CUDA Registered Developer Program can report issues and file bugs
Learn more about the GPU-accelerated libraries and development tools included in the CUDA Toolkit
GPU-Accelerated Libraries
- cuFFT – Fast Fourier Transforms Library
- cuBLAS – Complete BLAS library
- cuSPARSE – Sparse Matrix library
- cuRAND – Random Number Generator
- NPP – Thousands of Performance Primitives for Image & Video Processing
- Thrust – Templated Parallel Algorithms & Data Structures
- CUDA Math Library of high-performance math routines
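As a small taste of the libraries listed above, the following sketch uses Thrust to sort a vector on the GPU and sum its elements. The helper name `sort_and_sum` is hypothetical; the Thrust containers and algorithms are the library's own, and the file is assumed to be compiled with nvcc.

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>

// Hypothetical helper: sort `values` on the GPU, write the sorted data back
// to the host vector, and return the sum of the elements.
int sort_and_sum(thrust::host_vector<int>& values) {
    thrust::device_vector<int> d = values;              // host-to-device copy
    thrust::sort(d.begin(), d.end());                   // parallel sort on the device
    int total = thrust::reduce(d.begin(), d.end(), 0);  // parallel sum on the device
    values = d;                                         // device-to-host copy
    return total;
}
```

No kernels or explicit memory management appear in the code; the containers handle allocation and transfers, which is the main appeal of the templated interface.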
Development Tools
- Nsight integrated development environment
- Visual Profiler
- CUDA-GDB command line debugger
- CUDA-MEMCHECK memory analyzer
In addition to all the tools, libraries and documentation in the CUDA Toolkit, you’ll find hundreds of source code samples in the NVIDIA GPU Computing SDK.
If you develop applications in languages other than C or C++, please review the Getting Started page for a language solution that meets your needs. The CUDA Toolkit complements and fully supports programming with OpenACC directives.
Availability
The latest version of the CUDA Toolkit is always available at www.nvidia.com/getcuda
NVIDIA GPU Computing Registered Developers get early access to the next CUDA Toolkit release, invitations to special registered developer-only webinars, and access to NVIDIA’s online bug reporting and feature request system.
References
- GPU-Accelerated Applications
- CUDA Downloads
- CUDA LLVM Compiler SDK
- CUDA Toolkit 4.0 Features Overview
- CUDA Toolkit 4.1 Feature Overview
- Online Documentation for CUDA C/C++
- GPU Computing SDK code samples
- OpenACC, powerful directives for parallelizing your code
- GPU Computing Webinars
- More GPU Libraries and Tools