Posted by mpreda 10/26/2024
Tell HN: GpuOwl/PRPLL, GPU software used to find the largest known prime number
Feel free to ask questions about technical aspects of the GpuOwl implementation: optimizations, tricks, efficient FFT implementation on GPUs, etc. Or anything else.
[1] GpuOwl: https://github.com/preda/gpuowl
[2] GIMPS: https://www.mersenne.org/
On the other hand, CUDA only works on Nvidia, and that's a major limitation.
GpuOwl makes heavy use of FP64 ("double" floating point), and FP64 is more readily available at consumer prices on AMD GPUs. We (the GIMPS project) use a lot of Radeon VII and Radeon Pro VII GPUs, which offer great FP64 throughput at a low price (I am personally running 8x Radeon Pro VII that I bought new for about $300 apiece).
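To give a feel for why FP64 matters: GpuOwl multiplies numbers with tens of millions of digits by splitting them into "words" and convolving via a floating-point FFT (the real code uses the irrational-base discrete weighted transform, IBDWT). The convolution outputs must round back to exact integers, so the accumulated FFT rounding error has to stay below 0.5, and FP64's 53-bit mantissa is what lets us pack many bits into each word. Below is a toy CPU sketch of the idea (illustrative only, not GpuOwl code; it uses a plain base-10^4 split instead of the weighted transform):

    // Toy sketch: squaring a big integer via a floating-point FFT.
    #include <cmath>
    #include <complex>
    #include <cstdio>
    #include <vector>

    using C = std::complex<double>;

    // Simple in-place radix-2 Cooley-Tukey FFT; n must be a power of two.
    static void fft(std::vector<C>& a, bool inverse) {
      const size_t n = a.size();
      const double PI = std::acos(-1.0);
      for (size_t i = 1, j = 0; i < n; ++i) {   // bit-reversal permutation
        size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) { j ^= bit; }
        j ^= bit;
        if (i < j) { std::swap(a[i], a[j]); }
      }
      for (size_t len = 2; len <= n; len <<= 1) {
        double ang = 2 * PI / (double)len * (inverse ? 1 : -1);
        C wlen(std::cos(ang), std::sin(ang));
        for (size_t i = 0; i < n; i += len) {
          C w(1);
          for (size_t k = 0; k < len / 2; ++k, w *= wlen) {
            C u = a[i + k], v = a[i + k + len / 2] * w;
            a[i + k] = u + v;
            a[i + k + len / 2] = u - v;
          }
        }
      }
      if (inverse) { for (C& x : a) { x /= (double)n; } }
    }

    int main() {
      // 123456789 split into base-10^4 words, least significant first.
      std::vector<C> a = {6789.0, 2345.0, 1.0};
      a.resize(8);                  // zero-pad: room for the full product
      fft(a, false);
      for (C& x : a) { x *= x; }    // pointwise square == convolution
      fft(a, true);
      // Round each word back to an exact integer and propagate carries.
      // The FFT rounding error must be < 0.5 here or the result is wrong;
      // that error budget is what dictates FP64 over FP32.
      std::vector<long long> w(a.size());
      long long carry = 0;
      for (size_t i = 0; i < a.size(); ++i) {
        long long v = std::llround(a[i].real()) + carry;
        w[i] = v % 10000;
        carry = v / 10000;
      }
      size_t top = w.size() - 1;
      while (top > 0 && w[top] == 0) { --top; }
      std::printf("%lld", w[top]);
      for (size_t i = top; i-- > 0;) { std::printf("%04lld", w[i]); }
      std::printf("\n");            // 123456789^2 = 15241578750190521
    }

With FP32's 24-bit mantissa the same error budget would force far fewer bits per word and a much larger transform, which is why FP64 throughput is the number that matters for this workload.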
So you see, for us AMD GPUs are first-class citizens. Of course I want to support Nvidia GPUs as well, and OpenCL allows that. Luke Durant ran GpuOwl on a lot of Nvidia GPUs in the cloud, and I'm happy it worked well for him there.
The Nvidia A100 GPU that was used to find the new Mersenne prime has specialized dedicated hardware, like tensor cores, which on the A100 work not only with FP16 and FP32 but also with FP64. Are there any benefits to utilizing these capabilities?
And if the GPU provides some sort of FP64 matrix multiplication that we're not currently making use of, that would clearly be a big opportunity.
But somebody needs to implement it, profile it, and test it on some actual hardware.
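To sketch the shape of that opportunity: a large FFT factors into many small DFTs, and a batch of independent n-point DFTs is literally a matrix product Y = W * X, where W is the n x n DFT matrix and each column of X is one input. That is exactly the GEMM shape an FP64 matrix engine could accelerate (on the A100 presumably via cuBLAS, whose FP64 GEMM paths use the tensor cores). A minimal reference sketch of the math, not something GpuOwl currently does:

    // Reference math for the batched-DFT-as-GEMM idea.
    #include <cmath>
    #include <complex>
    #include <cstdio>
    #include <vector>

    using C = std::complex<double>;

    int main() {
      const int n = 8;       // size of each small DFT
      const int batch = 4;   // how many independent DFTs (columns of X)
      const double PI = std::acos(-1.0);

      // W[j][k] = exp(-2*pi*i*j*k/n): the DFT matrix, built once.
      std::vector<C> W(n * n);
      for (int j = 0; j < n; ++j)
        for (int k = 0; k < n; ++k)
          W[j * n + k] = std::polar(1.0, -2 * PI * j * k / n);

      // X: n x batch inputs; here just some arbitrary test data.
      std::vector<C> X(n * batch), Y(n * batch);
      for (int i = 0; i < n * batch; ++i) X[i] = C(i % 5, 0);

      // Y = W * X: a plain complex GEMM. On hardware with FP64 matrix
      // units, this triple loop is the part a GEMM call would replace.
      for (int j = 0; j < n; ++j)
        for (int b = 0; b < batch; ++b) {
          C acc = 0;
          for (int k = 0; k < n; ++k) acc += W[j * n + k] * X[k * batch + b];
          Y[j * batch + b] = acc;
        }

      // Print the DC terms (row 0): the sum of each input column.
      for (int b = 0; b < batch; ++b)
        std::printf("Y[0][%d] = %.1f\n", b, Y[b].real());
    }

Whether this would beat a well-tuned radix kernel depends on memory traffic and on the twiddle/transpose work between passes, which is exactly the implement-profile-test effort mentioned above.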
I would have thought, though, that prospective HPC users have Nvidia A100s and H100s more in mind when buying hardware.
But just to set the record straight, GpuOwl has received exactly $0 in contributions or sponsorship from anyone. It's a labor of love on my side, and it's open source so that curious minds have easy access to the algorithms and techniques implemented. I did receive great help in the form of source-code contributions, most importantly from George Woltman.