I/O stack performance #238

Open
gijsde1ste opened this issue Jun 11, 2020 · 2 comments


@gijsde1ste

Hi all,

As some of you might know by now, I'm attempting to set up an experiment to measure the performance difference between internal and I/O-efficient algorithms. I'm trying to get some early benchmarks of how TPIE's IO stack compares to the internal stack. See this very bare-bones code example:

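#include <chrono>
#include <iostream>
#include <tpie/tpie.h>
#include <tpie/memory.h>
#include <tpie/stack.h>
#include <tpie/internal_stack.h>

void testInternalStack(){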
    std::cout << "Testing internal stack" << std::endl;
    tpie::internal_stack<double> s = tpie::internal_stack<double>(500242880);

    for (int i = 0; i < 500242880; i++){
        s.push(i);
    }

    for (int i = 0; i < 500242880; i++){
        s.pop();
    }
}

void testIOStack(){
    std::cout << "Testing IO stack" << std::endl;
    tpie::stack<double> s = tpie::stack<double>();

    for (int i = 0; i < 500242880; i++){
        s.push(i);
    }

    for (int i = 0; i < 500242880; i++){
        s.pop();
    }
}

int main() {
    tpie::tpie_init();

    size_t available_memory_mb = 128;
    tpie::get_memory_manager().set_limit(available_memory_mb*1024*1024);

    auto start = std::chrono::high_resolution_clock::now();
    testIOStack();
    //testInternalStack();
    auto stop = std::chrono::high_resolution_clock::now();

    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop-start);
    std::cout << duration.count() << std::endl;

    tpie::tpie_finish();

    return 0;
}
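Since the example above times pushes and pops together, one way to narrow down where the time goes is to time the two phases separately. The following is only a minimal sketch: `PhaseTimes` and `time_stack` are hypothetical names, not part of TPIE, and it assumes the stack type offers `push()` and `pop()` as used in the example above.

```cpp
#include <chrono>
#include <cstddef>
#include <stack>

// Hypothetical helper type (not part of TPIE): wall-clock time per phase.
struct PhaseTimes {
    std::chrono::microseconds push{0};
    std::chrono::microseconds pop{0};
};

// Pushes n doubles, then pops n times, timing each phase on its own.
// Assumption: Stack offers push(value) and pop(), as std::stack does and
// as tpie::stack / tpie::internal_stack are used in the example above.
template <typename Stack>
PhaseTimes time_stack(Stack& s, std::size_t n) {
    using clock = std::chrono::steady_clock;
    PhaseTimes t;

    auto t0 = clock::now();
    for (std::size_t i = 0; i < n; ++i) s.push(static_cast<double>(i));
    auto t1 = clock::now();
    for (std::size_t i = 0; i < n; ++i) s.pop();
    auto t2 = clock::now();

    t.push = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0);
    t.pop = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1);
    return t;
}
```

Calling `time_stack(s, 500242880)` on each stack and printing `push.count()` and `pop.count()` would show whether the slowdown sits in the write phase, the read phase, or both.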

To limit the available RAM I've set up two cgroups: 128Group, with the intended test size of 128 MB RAM and effectively unlimited (10 GB) swap, and a control group, unlimitedGroup, with 8 GB RAM and 10 GB swap. As the images below show, there is no real difference between running the program normally and running it in the unlimited cgroup, so the cgroup mechanism itself is not adding performance overhead.

The internal stack behaves as expected: it's very quick when enough RAM is available, ~1.2 seconds to push and pop about 3.7 GiB of elements (500242880 doubles). When RAM becomes the bottleneck it slows down to ~25 seconds because of swapping.
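For reference, the size of the data set relative to the 128 MB limit can be worked out directly. This is just a sketch of the arithmetic; the constant and function names are made up for illustration, and it assumes an 8-byte `double`, which is typical but platform-dependent.

```cpp
#include <cstdint>

// Element count taken from the loops in the example above.
constexpr std::uint64_t kElements = 500242880;
// 128 MiB, matching the limit passed to set_limit() and the small cgroup.
constexpr std::uint64_t kLimitBytes = 128ull * 1024 * 1024;

// Raw payload size in bytes for n elements of elem_size bytes each.
// Ignores any per-block or bookkeeping overhead of the stack itself.
constexpr std::uint64_t payload_bytes(std::uint64_t n, std::uint64_t elem_size) {
    return n * elem_size;
}
```

With `sizeof(double) == 8` this gives 4,001,943,040 bytes, roughly 3.7 GiB, which is about 29 times the 128 MiB limit.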

The IO stack, however, behaves somewhat unexpectedly. When RAM is not an issue it takes ~14-15 seconds, which is plausible since it does some I/O. As you can see, there is no TPIE warning that the 128 MB limit set in TPIE is exceeded. The 14-15 seconds is faster than the 25 seconds of the internal stack when that was limited to 128 MB, which is good. The weird thing is that when I apply the 128 MB cgroup limit to the IO stack it becomes a lot slower, ~64 seconds, which suggests it is doing many more I/Os than when there is no limit. That should not be the case, should it?

What I've already tried is playing with the swappiness setting. The OS starts swapping out RAM before it is full (otherwise it would stall too much), but even when I set swappiness to 0 (only swap when absolutely necessary) I get the same results. While monitoring the cgroup stats I can see that swap is actively used while this benchmark runs.
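Besides the cgroup stats, the process can also report its own resident and swapped memory by reading `/proc/self/status` on Linux. The following is a rough sketch of that idea, assuming the usual `Key:  <value> kB` line format; `status_field_kb` is a hypothetical helper name, not something provided by TPIE.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans text in /proc/<pid>/status format and returns
// the numeric value (in kB) of the field named key, e.g. "VmRSS" or "VmSwap".
// Returns -1 if the field is missing or has no numeric value.
long status_field_kb(std::istream& in, const std::string& key) {
    std::string line;
    while (std::getline(in, line)) {
        // Match lines of the form "<key>:<whitespace><number> kB".
        if (line.size() > key.size() &&
            line.compare(0, key.size(), key) == 0 &&
            line[key.size()] == ':') {
            std::istringstream fields(line.substr(key.size() + 1));
            long kb = -1;
            if (fields >> kb) return kb;
            return -1;
        }
    }
    return -1;
}
```

Opening `/proc/self/status` with a `std::ifstream` and calling `status_field_kb(file, "VmSwap")` before and after each phase would indicate whether the benchmark process itself is being swapped out, as opposed to only page cache being evicted.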

Can anyone shed some light on why the performance of the I/O-efficient stack depends on how much RAM is available?

Images/results of benchmark:

internal_stack
IO_stack

@adament
Collaborator

adament commented Jun 20, 2020

As a quick test, have you tried setting the memory available to TPIE to something less than 128 MB, e.g. 64 MB? That would show whether the problem is that you are overestimating the amount of memory available to TPIE.

@SSoelvsten
Contributor

@gijsde1ste has been inactive on GitHub for more than a year and left this issue without progress for two years. Maybe it should just be closed?
