4060 ti vs 1650 #142

Open
David67821 opened this issue Sep 6, 2024 · 1 comment

Comments

@David67821

I'm seeing something strange.
On the 1650 video card it finds 10 matches per template per day.
On the 4060 Ti it finds 18-20 matches per template per day.

But the 4060 Ti is 5 times faster and shows 5 times more Mkey/s!

I don't understand why it doesn't find 5 times more matches?
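
To put numbers on the mismatch, here is a minimal sketch of the arithmetic. It assumes every checked key has the same, independent chance of matching a template, so expected matches grow linearly with the key rate. The function names are hypothetical and not part of the project.

```cpp
#include <cassert>

// Expected matches per day under the assumption above:
// keys per second * seconds per day * per-key match probability.
double expectedMatchesPerDay(double keysPerSecond, double matchProbability) {
    assert(keysPerSecond >= 0.0 && matchProbability >= 0.0);
    return keysPerSecond * 86400.0 * matchProbability;
}

// Effective speedup implied by the match counts observed on two cards.
double impliedSpeedup(double matchesSlowCard, double matchesFastCard) {
    assert(matchesSlowCard > 0.0);
    return matchesFastCard / matchesSlowCard;
}
```

With the counts reported here, `impliedSpeedup(10.0, 18.0)` and `impliedSpeedup(10.0, 20.0)` give roughly 1.8 to 2.0, well short of the 5x that the displayed Mkey/s would suggest.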

@joaoescribano

As far as I can tell, the code is poorly optimized for newer GPUs (I'm using an RTX 3080 Ti). With the standard compilation I was getting 1.4 GKey/s; after a few changes to the CUDA engine, I'm now getting 3 GKey/s.

In GPUEngine.cu#456, in the function bool GPUEngine::callKernel(), it uses nbThread / nbThreadPerGroup, nbThreadPerGroup as the CUDA launch parameters. I've been playing with them to find the best tuning.
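
For anyone tuning the same values, here is a rough host-side sketch of how those two launch parameters relate. The struct and function names are hypothetical; only `nbThread` and `nbThreadPerGroup` come from the project. It assumes the first parameter is the number of blocks and the second the threads per block, as in a standard CUDA launch.

```cpp
#include <cassert>

// Hypothetical mirror of the two launch parameters named above.
struct LaunchConfig {
    int blocks;           // grid size: nbThread / nbThreadPerGroup
    int threadsPerBlock;  // block size: nbThreadPerGroup
};

LaunchConfig makeLaunchConfig(int nbThread, int nbThreadPerGroup) {
    assert(nbThread > 0 && nbThreadPerGroup > 0);
    // CUDA allows at most 1024 threads per block on current hardware.
    assert(nbThreadPerGroup <= 1024);
    // Integer division: any remainder threads are silently dropped.
    return LaunchConfig{nbThread / nbThreadPerGroup, nbThreadPerGroup};
}

// Threads that actually launch; smaller than nbThread when
// nbThread is not a multiple of nbThreadPerGroup.
int launchedThreads(const LaunchConfig& cfg) {
    return cfg.blocks * cfg.threadsPerBlock;
}
```

For example, `makeLaunchConfig(4096, 128)` yields 32 blocks of 128 threads, while `makeLaunchConfig(1000, 128)` yields 7 blocks and launches only 896 threads, so keeping `nbThread` a multiple of `nbThreadPerGroup` avoids losing work.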

I've also changed the cores per SM for compute capability 8.6 at GPUEngine.cu#131 (_ConvertSMVer2Cores) to 1024, as my GPU can handle it.
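
For context, a lookup like `_ConvertSMVer2Cores` usually maps a compute capability to a cores-per-SM count. The sketch below is a simplified stand-in, not the project's code: `coresPerSM` and `SmCores` are hypothetical names, the table is only a subset, and the values are the ones NVIDIA's CUDA samples list as I recall them, so check them against your own copy. Whether this value affects performance in this project is not something the thread confirms.

```cpp
#include <cassert>

// One table row: SM version encoded as (major << 4) + minor, plus cores per SM.
struct SmCores {
    int sm;
    int cores;
};

// Simplified stand-in for a _ConvertSMVer2Cores-style lookup.
// Returns -1 for versions missing from this partial table.
int coresPerSM(int major, int minor) {
    static const SmCores table[] = {
        {0x75, 64},   // Turing, e.g. GTX 1650
        {0x80, 64},   // Ampere GA100
        {0x86, 128},  // Ampere GA10x, e.g. RTX 3080 Ti
        {0x89, 128},  // Ada, e.g. RTX 4060 Ti
    };
    assert(major >= 0 && minor >= 0 && minor < 16);
    const int sm = (major << 4) + minor;
    for (const SmCores& entry : table) {
        if (entry.sm == sm) {
            return entry.cores;
        }
    }
    return -1;  // unknown architecture in this sketch
}
```

Here `coresPerSM(8, 6)` returns 128, the stock value for the 8.6 entry that the change above replaces with 1024.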

Not sure if that's the cause in your case, but it helped me.
