Unexpectedly Poor Performance on AMD Ryzen 9 7950X? #61

Open
JohnTravolski opened this issue Jan 8, 2023 · 4 comments

@JohnTravolski
JohnTravolski commented Jan 8, 2023

I benchmarked a variety of Python operations on an i7 5960X (released late 2014) and an AMD Ryzen 9 7950X (released late 2022). Single-threaded performance on the 7950X is typically much higher than on the 5960X, given the improvements made over that period. However, I noticed that xxhash doesn't see the same speedup. Here's a comparison of several Python functions and the time they took to run on both machines (Python 3.8, xxhash==1.4.3):

[Image: table of benchmark timings for both machines]

The 3rd and 4th columns are the time in seconds each machine took to run each operation a given number of times, measured with either time.time() or time.perf_counter().

The code for measuring the performance of xxhash was as follows:

import xxhash
import time

hard = 10
datadict = {}

def timer(num_attempts = 10):
  def decor(func):
    def wrapper(*args):
      for timefunc, string in [(time.time, "time()"), (time.perf_counter, "perf_counter()")]:
        times = []
        for attempt_num in range(num_attempts):
          start = timefunc() # start counting time using the given timing function
          result = func(*args) # the code we're benchmarking
          end = timefunc()
          duration = end - start
          times.append(duration)
        average_time = sum(times)/len(times) # take the average of the times over num_attempts
        datadict[func.__name__ + "~" + string] = average_time
      return result
    return wrapper
  return decor

@timer(hard)
def xxhash_4kimg_5ktimes(filebits):
  j = 0
  for i in range(5000):
    j = xxhash.xxh64(filebits).hexdigest()
  return j
  
print("Timing hashing")
with open('2.jpg', 'rb') as afile:
  filebits = afile.read()
print("\txxhash:")
j = xxhash_4kimg_5ktimes(filebits)

for key, val in datadict.items(): # iterate over (name, average time) pairs, not bare keys
  print(key, val)

('2.jpg' is a 2.94 MB jpeg file, 3840x2160 resolution). I'm on Windows 10.

Any idea why the speedup isn't as high as it was for many of the other functions I tested? On average I got at least a 2x speedup for most Python functions, but only 1.08x for xxhash. It was the only one that scaled so poorly.
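To put the workload in perspective, the arithmetic below is a minimal sketch that converts one timed call into bytes hashed and throughput. It assumes "MB" means 10^6 bytes, and `example_seconds` is a hypothetical placeholder, not a measured value from this issue (the real timings are only in the screenshot). The helper names are made up for illustration.

```python
def total_gigabytes(file_megabytes, iterations):
    """Data hashed in one timed call, in GB (assuming 1 GB = 1000 MB)."""
    return file_megabytes * iterations / 1000.0


def throughput_gb_per_s(file_megabytes, iterations, seconds):
    """Average hashing throughput for one timed call."""
    return total_gigabytes(file_megabytes, iterations) / seconds


# One call to xxhash_4kimg_5ktimes hashes the 2.94 MB file 5000 times.
total = total_gigabytes(2.94, 5000)
print(f"data hashed per timed call: {total:.1f} GB")

# Hypothetical duration, only to show how to read a measured time.
example_seconds = 2.0
rate = throughput_gb_per_s(2.94, 5000, example_seconds)
print(f"throughput at {example_seconds} s: {rate:.2f} GB/s")
```

Each timed call pushes roughly 14.7 GB through the hash, so the measurement is likely dominated by streaming the buffer rather than by per-call Python overhead.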

@ifduyue
Owner

ifduyue commented Jan 10, 2023

Can you try the latest version of python-xxhash and attach the results?

@JohnTravolski
Author

> Can you try the latest version of python-xxhash and attach the results?

Good idea. I just retried it with xxhash version 3.2.0 for Python 3.8. It's definitely an improvement, but the speedup is still smaller than for most of the other Python functions:

[Image: table of benchmark timings with xxhash 3.2.0]

A 1.47x speedup is definitely better than the 1.08x I had before, but still lagging behind the average of the others I tested.

Given that the single-threaded Cinebench score of the 7950X is over double that of the 5960X, I expected something closer to 2x, which is what I got for a lot of other Python functions (as can be seen in my first post).

I'm guessing there's something about the algorithm that is causing it to not scale as well on this processor, but I don't know what it would be.
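One way to narrow this down is to check whether the time goes to fixed per-call work or to hashing the bytes themselves. The sketch below times a tiny buffer (where per-call overhead dominates) against a 1 MiB buffer (where per-byte throughput dominates). It is not part of the original benchmark: `seconds_per_call`, the buffer sizes, and the call counts are arbitrary choices for illustration, and `hashlib.blake2b` is only a stand-in so the sketch still runs where xxhash is not installed.

```python
import time

try:
    import xxhash  # third-party library discussed in this issue
    hash_func = xxhash.xxh64
except ImportError:
    import hashlib
    hash_func = hashlib.blake2b  # stand-in so the sketch runs without xxhash


def seconds_per_call(data, calls, repeats=5):
    """Best-of-`repeats` average time of one hash_func(data).hexdigest() call."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(calls):
            hash_func(data).hexdigest()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / calls)
    return best


tiny = b"x" * 16          # per-call overhead dominates
big = b"x" * (1 << 20)    # per-byte hashing dominates

tiny_s = seconds_per_call(tiny, calls=2000)
big_s = seconds_per_call(big, calls=20)

print(f"16 B buffer:  {tiny_s * 1e9:.0f} ns per call")
print(f"1 MiB buffer: {big_s * 1e6:.1f} us per call")
```

Running this on both machines and comparing the two ratios separately would show whether the small speedup comes from the per-call path or from bulk hashing.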

@ifduyue
Owner

ifduyue commented Jan 10, 2023

Thanks for the quick feedback. I'll try to find out why.

@ifduyue
Owner

ifduyue commented Jan 10, 2023

I haven't done any tests yet, but a wild guess is that it's the hexdigest code:

python-xxhash/src/_xxhash.c

Lines 268 to 276 in f8086d4

for (i = j = 0; i < XXH64_DIGESTSIZE; i++) {
    unsigned char c;
    c = (digest[i] >> 4) & 0xf;
    c = (c > 9) ? c + 'a' - 10 : c + '0';
    retbuf[j++] = c;
    c = (digest[i] & 0xf);
    c = (c > 9) ? c + 'a' - 10 : c + '0';
    retbuf[j++] = c;
}
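For readers less familiar with C, here is a rough Python rendering of what that loop does: it splits each digest byte into its high and low nibble and maps each nibble to a lowercase hex character. The function name `hex_from_digest` and the sample bytes are made up for illustration; this is a sketch of the logic, not the library's code.

```python
def hex_from_digest(digest: bytes) -> str:
    """Convert raw digest bytes to lowercase hex, nibble by nibble."""
    out = []
    for byte in digest:
        # High nibble first, then low nibble, as in the C loop.
        for nibble in ((byte >> 4) & 0xF, byte & 0xF):
            if nibble > 9:
                out.append(chr(nibble + ord("a") - 10))
            else:
                out.append(chr(nibble + ord("0")))
    return "".join(out)


# An 8-byte sample, the size of an xxh64 digest.
sample = bytes([0x00, 0x1F, 0xA5, 0xFF, 0x09, 0x90, 0xBC, 0xDE])
print(hex_from_digest(sample))  # 001fa5ff0990bcde
assert hex_from_digest(sample) == sample.hex()
```

Since an xxh64 digest is 8 bytes, the loop writes only 16 characters per call, regardless of how large the hashed input was.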

@ifduyue ifduyue assigned ifduyue and unassigned ifduyue Aug 23, 2023