The description in demo caching.hs vs actual implementation seems wrong. #16

oderwat commented Apr 17, 2019

Somehow TCache has become my favourite project for learning Haskell, and I have already done a second round of cleanups on the code with my new friend hlint.

Currently I am working through the examples and found the following, which confuses me a lot:

In the demo caching.hs the description says something like:

default policy (defaultCheck) for clearing the cache is to reduce the cache to half of max size when size exceeds the max

and

because 200 exceeds the maximum cache size (100) defaultCheck will discard the 150 older elems to reduce the cache to a half

While the documentation and the implementation do something very different:

This is a default cache clearance check. It forces to drop from the cache all the elems not accesed since half the time between now and the last sync.

So from my understanding it works like this:

If there are more elements in the (T)Cache than the limit given to the cleanup proc, all elements which match the "clearance check" get cleaned from memory. There is no check on how many elements get flushed.

The related code looks like this:

-- clearance is triggered by the element count alone; which elements
-- get dropped is decided entirely by the check inside filtercache
when (size > sizeObjects) $ forkIO (filtercache t cache lastSync elems)
  >> performGC

Here filtercache works on all elements and uses the check to decide which elements get removed from memory.

The related check code is:

defaultCheck now lastAccess lastSync
        | lastAccess > halftime = False
        | otherwise = True
        where
        -- halftime is the midpoint between the last sync and now
        halftime = now - (now - lastSync) `div` 2
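
To convince myself, I put together a minimal, self-contained model of this behaviour. This is not the actual TCache code: Entry, shouldDrop and sweep are made-up names, and the real filtercache works on weak references inside a hash table. It only reproduces the decision logic, using the example's numbers:

import Data.List (partition)

-- simplified model: an entry is its key plus a last-access time
type Entry = (String, Integer)

-- same decision as defaultCheck: True means "drop this element",
-- i.e. it was not accessed after the midpoint between lastSync and now
shouldDrop :: Integer -> Integer -> Integer -> Bool
shouldDrop now lastAccess lastSync = lastAccess <= halftime
  where halftime = now - (now - lastSync) `div` 2

-- model of filtercache: split into survivors and dropped elements;
-- note that no element count enters into the decision
sweep :: Integer -> Integer -> [Entry] -> ([Entry], [Entry])
sweep now lastSync = partition (\(_, la) -> not (shouldDrop now la lastSync))

main :: IO ()
main = do
  -- 200 elements, all last accessed during the generation phase (t = 0..9)
  let entries = [("elem" ++ show i, fromIntegral (i `mod` 10)) | i <- [1 .. 200 :: Int]]
      (kept, dropped) = sweep 20 10 entries  -- now = 20, lastSync = 10, halftime = 15
  putStrLn $ "kept " ++ show (length kept) ++ ", dropped " ++ show (length dropped)
  -- prints "kept 0, dropped 200": the whole cache goes, not 150 of 200

Either way the decision is purely time-based; the element count only decides whether a sweep happens at all.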

I can't find anything which discards "150" of the "200" elements and keeps half of the allowed cache size in memory.

My understanding of the example with the given timings are:

  • It syncs/cleans every 10 seconds, and after generating 200 elements it has to sync all 200 to disk (and it does).
  • The example then waits for 20 seconds doing nothing. This should write all elements to disk, but it should not clean elements from memory, as at that point they were only just synced.
  • Then, 1 second later (with the second sync cycle), they all get cleaned from memory, because they were not accessed in the last 5 seconds (half the time between checks); see the worked timings after this list.
  • After this the example updates all 200 elements, which have to be loaded from disk, as the cache is now "clean".
  • Then it waits again for 20 seconds while the 10-second interval syncs the data to disk and at the same time cleans everything from memory again; it seems that updating does not count as an "access", which triggers immediate removal from memory again.
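
Plugging the example's timings into the halftime formula supports this reading (a back-of-the-envelope check; the real timestamps are epoch seconds and the exact trigger instants depend on scheduling):

ghci> let halftime now lastSync = now - (now - lastSync) `div` 2
ghci> halftime 10 0     -- first cycle, right after generating the elements
5
ghci> 9 > 5             -- elems last accessed around t = 9 survive
True
ghci> halftime 20 10    -- a later cycle, after the 20 second idle wait
15
ghci> 9 > 15            -- nothing touched since t = 9: everything gets dropped
False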

Was the behaviour different in the past, or did I miss something that changes what I think happens with this example?

To check it myself, I wrote a function which (I believe) counts how many of the entries are being held in memory (GC aside), and I rewrote the example with different timings (5 seconds after an initial 6 seconds, so it triggers at 6, 11, 16, 21, ..., which makes more sense in my eyes). The resulting output seems to confirm my interpretation of the code.
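
For what it is worth, with those rewritten timings the clearance window works out as follows (assuming the triggers really fire at 6, 11, 16, 21, ... and the same integer arithmetic as in defaultCheck):

ghci> let halftime now lastSync = now - (now - lastSync) `div` 2
ghci> [ (now, halftime now lastSync) | (now, lastSync) <- [(11,6), (16,11), (21,16)] ]
[(11,9),(16,14),(21,19)]

So at each trigger, only elements accessed within the last two seconds survive the sweep.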

P.S.: I am going to create another PR with some of this work, but I wanted to ask about my findings on this topic first.
