Local disk cache #529

Open
wants to merge 5 commits into master from ola-disk-cache
Conversation

@ola-rozenfeld (Contributor) commented Jan 24, 2024

Implements an optional local disk cache (LRU, CAS + Action Cache) used for remote action outputs.
Goma has this feature, and many of our Reclient customers have since requested it (usually because of slow internet connections).

In my benchmarks on Linux with a relatively good internet connection, this did not reduce the runtime of a fully cached build by a statistically significant amount. But it didn't hurt either.

The current implementation runs GC asynchronously, to save time. The priority queue implementation using the container/heap package is adapted from the example in the Go documentation, with one tweak: the backing array doubles in size when it fills up instead of growing by one element at a time.
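For reference, a minimal sketch of the heap-based queue described above, adapted from the container/heap example; field names such as lat are illustrative and may not match the PR exactly.

package diskcache

import "container/heap"

// qitem is a cache entry ordered by last-access time (lat).
type qitem struct {
    key   string
    lat   int64 // last access time, Unix nanoseconds
    index int   // maintained by the heap
}

// priorityQueue keeps the least recently used item at the top.
// items always has len == cap; n is the number of live entries.
type priorityQueue struct {
    items []*qitem
    n     int
}

func (q *priorityQueue) Len() int           { return q.n }
func (q *priorityQueue) Less(i, j int) bool { return q.items[i].lat < q.items[j].lat }
func (q *priorityQueue) Swap(i, j int) {
    q.items[i], q.items[j] = q.items[j], q.items[i]
    q.items[i].index = i
    q.items[j].index = j
}

// Push doubles the backing array when it is full instead of growing
// it by one element per insertion.
func (q *priorityQueue) Push(x interface{}) {
    if q.n == cap(q.items) {
        newCap := 2 * cap(q.items)
        if newCap == 0 {
            newCap = 256
        }
        items := make([]*qitem, newCap)
        copy(items, q.items)
        q.items = items
    }
    it := x.(*qitem)
    it.index = q.n
    q.items[q.n] = it
    q.n++
}

func (q *priorityQueue) Pop() interface{} {
    q.n--
    it := q.items[q.n]
    q.items[q.n] = nil // avoid holding a stale pointer
    return it
}

var _ heap.Interface = (*priorityQueue)(nil)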

@ola-rozenfeld (Contributor, Author) commented Jan 24, 2024

This is ready for review now. I'm not sure about the lint errors -- the linter complains about unwrapped errors from other packages, but from what I can see we do this everywhere in the codebase, so I'm choosing to ignore it unless asked otherwise.

@ola-rozenfeld force-pushed the ola-disk-cache branch 15 times, most recently from abb4792 to 8979a8d on January 24, 2024 22:22
newSize := uint64(atomic.AddInt64(&d.sizeBytes, dg.Size))
if newSize > d.maxCapacityBytes {
select {
case d.gcReq <- atomic.AddUint64(&d.gcTick, 1):
@ola-rozenfeld (Contributor, Author) commented:

The only reason we count the GC cycles here is for testing. I couldn't find another way to wait for GC to finish in tests.
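As a purely illustrative sketch (gcLoop, runGC, lastGCTick and waitForGC are made-up names, not the PR's code), the idea is that the GC goroutine publishes the last tick it completed, and a test polls for it:

package diskcache

import (
    "fmt"
    "sync/atomic"
    "time"
)

// DiskCache is reduced here to the fields needed for the example.
type DiskCache struct {
    gcReq      chan uint64
    lastGCTick uint64
    // sizeBytes, maxCapacityBytes, queue, etc. elided.
}

// gcLoop runs in its own goroutine, evicts entries whenever a GC
// request arrives, then publishes the tick it just completed.
func (d *DiskCache) gcLoop() {
    for tick := range d.gcReq {
        d.runGC()
        atomic.StoreUint64(&d.lastGCTick, tick)
    }
}

func (d *DiskCache) runGC() { /* LRU eviction elided */ }

// waitForGC is what a test could call after a store pushed the cache
// over capacity and triggered GC tick `tick`.
func waitForGC(d *DiskCache, tick uint64, timeout time.Duration) error {
    deadline := time.Now().Add(timeout)
    for time.Now().Before(deadline) {
        if atomic.LoadUint64(&d.lastGCTick) >= tick {
            return nil
        }
        time.Sleep(time.Millisecond)
    }
    return fmt.Errorf("GC tick %d not observed within %v", tick, timeout)
}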

@ola-rozenfeld (Contributor, Author) commented:
Update: Brave has tried this change and sees no difference, because we still need to make an Action Cache RPC. I will now add an optional Action Cache to this change to see how much improvement they get with it.

@ola-rozenfeld marked this pull request as draft on January 26, 2024 14:11
@ola-rozenfeld force-pushed the ola-disk-cache branch 2 times, most recently from b42ecc2 to 50846d6 on January 27, 2024 18:41
@ola-rozenfeld changed the title from "Local disk CAS" to "Local disk cache" on Jan 27, 2024
@gkousik (Collaborator) commented Jan 29, 2024

Thanks for the PR, Ola! I generally like and agree with having a local disk cache feature, but I want to ask some due-diligence questions.

  1. In slow internet connection scenarios, has the racing exec strategy proven to be helpful? Is the local disk cache faster than racing in those scenarios?

  2. In my benchmarks on Linux with a relatively good internet connection this was not able to reduce the runtime of a fully cached build by a statistically significant amount. But it didn't hurt either.

    This is very interesting - what was the size of your disk cache, and what was the total output downloaded by the build? I am assuming this is a Chrome build? It's surprising that network transfer + write ~= local_disk_copy for a full build.

  3. I think StoreCas could also suffer from a race condition where the local build modifies the file after it is written out. Perhaps we need to validate the digest after the copy is finished? (A sketch follows after this list.)

  4. Finally, a request: assuming we go ahead (I just want to know the answers to (1) and (2)), could this be split into smaller CLs? I have some readability comments that may be easier to address with smaller changes :)
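A possible shape for the validation mentioned in point 3, assuming SHA-256 digests; verifyCopiedFile is a hypothetical helper, not part of the PR. It re-hashes the file after the copy and compares against the expected hash and size.

package diskcache

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "os"
)

// verifyCopiedFile re-reads the copied file and checks that its size and
// SHA-256 hash still match the digest it was stored under.
func verifyCopiedFile(path, wantHash string, wantSize int64) error {
    f, err := os.Open(path)
    if err != nil {
        return err
    }
    defer f.Close()
    h := sha256.New()
    n, err := io.Copy(h, f)
    if err != nil {
        return err
    }
    if n != wantSize {
        return fmt.Errorf("%s: size %d, want %d", path, n, wantSize)
    }
    if got := hex.EncodeToString(h.Sum(nil)); got != wantHash {
        return fmt.Errorf("%s: hash %s, want %s", path, got, wantHash)
    }
    return nil
}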

@gkousik requested a review from mrahs on January 29, 2024 14:21
@ola-rozenfeld marked this pull request as ready for review on January 29, 2024 16:42
@ola-rozenfeld (Contributor, Author) commented:
Hey Kousik and Anas! Good questions! 1 - @goodov from Brave has tried this out on a slow connection -- maybe Aleksey can comment on how it compares to racing, because I know he has tried that as well (although it's not yet the default setting we recommend to customers). I'm wondering if there's a way for me to artificially create a VM with a slow connection to try it out -- maybe get a VM in the farthest possible region from our backend?...

2 - I tried a Chromium build on Linux with a 50 GB cache size, and the actual cache size was 26 GB. Which, btw, implies that the RBE download metrics displayed at the end of the build are waaaay off (it usually says RBE Stats: down 5.78 GB).

3 - Good point! I didn't think the output files would be locally modified (are they? Do you know which actions do that?). I'll add handling for this case and a unit test. But I'm struggling to imagine this being a race condition unless it's literally two concurrent actions that output the same file -- and if that's the case, isn't it already a race condition in the build itself?...

4 - Yes, agreed; I did it in one go because I wanted @goodov to be able to give it a try easily. I think this naturally breaks into 3 PRs: the first introduces the diskcache with CAS only, plus unit tests; the second adds the Action Cache functionality, plus unit tests; the third introduces the client flags plus the instrumentation to actually use the diskcache in the client. Does that sound good?

Thank you!

@ola-rozenfeld (Contributor, Author) commented Jan 31, 2024

Updates:

  • Changed the directory structure following Marc-Antoine's (@maruel) suggestion (based on the Git and Isolate implementations) -- the cache directory is now split into 256 subdirectories keyed by the first two hex characters of the digest, with the remaining characters used for the file name. Also moved the "_ac" marker to a suffix. Both of these should improve performance, especially on Windows (see the path sketch below).
  • Split this PR out into Local disk cache: part 1 #530 (will send out part 2 after part 1 is in).

I'm leaving this PR open here as a reference for the whole thing.
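A sketch of the resulting layout; cachePath is illustrative and the exact separators in the PR may differ. The first two hex characters of the hash pick one of 256 subdirectories, the rest of the hash plus the size form the file name, and Action Cache entries get an "_ac" suffix.

package diskcache

import (
    "fmt"
    "path/filepath"
)

// cachePath maps a digest to its location under the cache root.
func cachePath(root, hash string, sizeBytes int64, isAC bool) string {
    name := fmt.Sprintf("%s.%d", hash[2:], sizeBytes)
    if isAC {
        name += "_ac"
    }
    return filepath.Join(root, hash[:2], name)
}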

@ola-rozenfeld force-pushed the ola-disk-cache branch 2 times, most recently from b01e309 to c800a0e on February 11, 2024 17:50
@ola-rozenfeld (Contributor, Author) commented:
@mrahs Friendly ping! :-) Let me know if you prefer to review the whole thing or in chunks (e.g. #530).

Thank you!

@mrahs (Collaborator) commented Feb 13, 2024

> @mrahs Friendly ping! :-) Let me know if you prefer to review the whole thing or in chunks (e.g. #530).
>
> Thank you!

Sorry I'm late. I'm trying to understand the context of such a cache within past discussions and future requirements.

> I tried a Chromium build on Linux with a 50 GB cache size, and the actual cache size was 26 GB. Which, btw, implies that the RBE download metrics displayed at the end of the build are waaaay off (it usually says RBE Stats: down 5.78 GB).

Ah, interesting. Any chance you've been able to spot the cause of this mismatch?

@ola-rozenfeld force-pushed the ola-disk-cache branch 8 times, most recently from 3cc60db to 7a31b23 on May 19, 2024 20:36
@ola-rozenfeld force-pushed the ola-disk-cache branch 2 times, most recently from 2b812c8 to 9eb7483 on May 20, 2024 23:10
}
it := &qitem{
key: k,
lat: fileInfoToAccessTime(info),

Why use the last access time instead of the last modified time?
I don't believe that using the file system to track this information is a good idea. My implementations saved this as a dumb JSON file, and that was very simple: I don't care about the actual timestamp, just the ordering, so the file is a flat list of files in LRU order for eviction. This also removes the need for the priorityQueue type.
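A rough sketch of the flat-file approach described here (lruState, saveLRU and loadLRU are illustrative names): the file stores only the eviction order, least recently used first, with no timestamps at all.

package diskcache

import (
    "encoding/json"
    "os"
)

// lruState is the entire on-disk bookkeeping: cache keys ordered from
// least to most recently used.
type lruState struct {
    Keys []string `json:"keys"`
}

func saveLRU(path string, keys []string) error {
    b, err := json.Marshal(lruState{Keys: keys})
    if err != nil {
        return err
    }
    return os.WriteFile(path, b, 0644)
}

func loadLRU(path string) ([]string, error) {
    b, err := os.ReadFile(path)
    if os.IsNotExist(err) {
        return nil, nil // first run: empty cache
    }
    if err != nil {
        return nil, err
    }
    var s lruState
    if err := json.Unmarshal(b, &s); err != nil {
        return nil, err
    }
    return s.Keys, nil
}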

d.mu.Lock()
it := heap.Pop(d.queue).(*qitem)
d.mu.Unlock()
size, err := d.getItemSize(it.key)

I don't understand why this does a stat on the file when the size is already in the digest.

diskOpsStart := time.Now()
// We only delete the files, and not the prefix directories, because the prefixes are not worth worrying about.
if err := os.Remove(d.getPath(it.key)); err != nil {
log.Errorf("Error removing file: %v", err)

It'd be good if the errors were accumulated, so that Shutdown() could be renamed to Close() and return the accumulated errors.
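One hedged way to do that (field and method names are illustrative; errors.Join needs Go 1.20+): record eviction errors as they happen and return them all from Close.

package diskcache

import (
    "errors"
    "sync"
)

type DiskCache struct {
    mu   sync.Mutex
    errs []error
    // cache state elided.
}

// recordErr is called wherever the GC currently only logs,
// e.g. on a failed os.Remove.
func (d *DiskCache) recordErr(err error) {
    d.mu.Lock()
    d.errs = append(d.errs, err)
    d.mu.Unlock()
}

// Close stops background work and reports everything that went wrong.
func (d *DiskCache) Close() error {
    // Stopping the GC goroutine is elided.
    d.mu.Lock()
    defer d.mu.Unlock()
    return errors.Join(d.errs...)
}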

return key{digest: dg, isCas: !isAc}, nil
}

func (d *DiskCache) getPath(k key) string {

Optional nit: I would have preferred files to have a uniform file name "%s-%d" % (hash, size), instead of having the size as an extension. Then you could use ".cas" as the marker for cas files. IMHO it would be a bit cleaner.
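For comparison, the naming this comment suggests might look like the following (entryName is illustrative, not proposed code in the PR):

package diskcache

import "fmt"

// entryName builds "<hash>-<size>", with ".cas" marking CAS entries.
func entryName(hash string, sizeBytes int64, isCas bool) string {
    name := fmt.Sprintf("%s-%d", hash, sizeBytes)
    if isCas {
        name += ".cas"
    }
    return name
}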

if q.n == cap(q.items) {
// Resize the queue
old := q.items
q.items = make([]*qitem, 2*cap(old)) // Initial capacity needs to be > 0.

newCap := 2 * cap(old)
if newCap == 0 {
  newCap = 256
}
q.items = make([]*qitem, newCap)

then you can remove the comment


type key struct {
digest digest.Digest
isCas bool

Is isCas meant for a performance optimization later? I don't see any behavior change based on this member.

@ola-rozenfeld (Contributor, Author) commented:

Thank you, Marc-Antoine -- I'm still tinkering with the whole thing; I just added stats and benchmark tests (not pushed yet), and I'm trying out different variations to see what's fastest. I hope I can get it all done by Monday.

@ola-rozenfeld force-pushed the ola-disk-cache branch 2 times, most recently from 9888072 to 8094227 on May 27, 2024 16:33
@@ -129,6 +140,11 @@ func (c *Client) DownloadOutputs(ctx context.Context, outs map[string]*TreeOutpu
if err := cache.Update(absPath, md); err != nil {
return fullStats, err
}
if c.diskCache != nil {
if err := c.diskCache.StoreCas(output.Digest, absPath); err != nil {

Do you think this could be done as part of DownloadFiles()? Otherwise it adds serial latency.
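One way to avoid the extra serial step, as a hedged sketch (storeOutputs, diskCacher, and the errgroup wiring are illustrative, not the PR's code): run the disk-cache writes on an errgroup so they overlap with the rest of the output processing.

package client

import "golang.org/x/sync/errgroup"

// Digest is a stand-in for digest.Digest; diskCacher is a stand-in
// for the DiskCache dependency used in the snippet above.
type Digest struct {
    Hash string
    Size int64
}

type diskCacher interface {
    StoreCas(dg Digest, path string) error
}

// storeOutputs shows the shape of the idea: every StoreCas call runs
// on the errgroup, so disk-cache writes overlap with whatever else
// DownloadOutputs still has to do instead of adding serial latency.
func storeOutputs(dc diskCacher, outputs map[string]Digest) error {
    if dc == nil {
        return nil
    }
    var g errgroup.Group
    for path, dg := range outputs {
        path, dg := path, dg // capture loop variables (pre-Go 1.22)
        g.Go(func() error {
            return dc.StoreCas(dg, path)
        })
    }
    return g.Wait()
}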

@ola-rozenfeld (Contributor, Author) commented:

I'm really sorry, but I have to admit to myself and others that I apparently have zero spare cycles to put into this in the foreseeable future :-(
Interested parties, such as @maruel or @goodov, are welcome to pick up this work and go from here, merging it in any form or parts. Again, I'm sorry! In retrospect, I should have tried to merge and iterate rather than address all the comments at once -- but you all had extremely helpful comments, which were important to address.

4 participants