Overview
Teaching: 0 min
Exercises: 0 min

Questions
What types of operations can be accelerated using libraries?
How can libraries be used to accelerate calculations?
How can CUDA Python be used to write my own kernels?
Worked examples moving from division between vectors to sum reduction
Objectives
Learn to use CUDA libraries
Learn to accelerate Python code using CUDA
Show examples for each of the CUDA use scenarios mentioned
After visiting a great number of web pages this week, this NVIDIA page is the main source I have settled on.
There are two examples here using Anaconda NumbaPro.
There is a lot of documentation to read on the Continuum Analytics website, linked from the above site.
Anaconda Accelerate provides access to numerical libraries optimised for performance on Intel CPUs and NVIDIA GPUs. Using Accelerate, you can access
I read about @vectorize for automatically accelerating functions, but everything pointed to NumbaPro, which has been deprecated. This blog post indicates what has gone where (NumbaPro was paid-for software, now split into Numba (open source) and Accelerate (free for academic use)).
CUDA functionality can be accessed directly from Python code. Information on this page is a bit sparse.
Thankfully the Numba documentation looks fairly comprehensive and includes some examples.
This looks to be just a wrapper enabling kernels written in CUDA C to be called, which would seem to be out of the scope of this course.
FIXME: Find some examples for some of the above (more on GPU obviously). Some material here, the most useful being examples on github:
Continuum Analytics NumbaPro repo
I have tried the Mandelbrot example on Zrek, and only the first part works.
I have emailed NVIDIA and the GitHub repo owner asking for help updating
this code, which uses the deprecated NumbaPro.
No response received initially; help required!
Update: I have since received a response suggesting this has now been fixed.
Key Points