Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyTable.C dereferencing elements of empty vector, crashing CLI #17381

Closed
biagas opened this issue Mar 1, 2022 · 9 comments · Fixed by #17396
Closed

PyTable.C dereferencing elements of empty vector, crashing CLI #17381

biagas opened this issue Mar 1, 2022 · 9 comments · Fixed by #17396
Assignees
Labels
bug Something isn't working crash bug caused visit to crash impact medium Productivity partially degraded (not easily mitigated bug) or improved (enhancement) likelihood medium Neither low nor high likelihood reviewed Issue has been reviewed and labeled by a developer
Milestone

Comments

@biagas
Copy link
Contributor

biagas commented Mar 1, 2022

Describe the bug

The flatten.py test is crashing the CLI on Windows.
It is happening during the box_selection test, specifically the call to test_box(True):

    TestSection('BoxSelectionNoSharedMemory')
    test_box(True)

Running through VS debugger, it segv's on this line in PyTable.C because 'results' is empty:

std::cout << results[0] << ", " << results[1] << ", " << results[2] << std::endl;

I'm sure 'results' being empty indicates a bigger problem.

Adding if-test for results.size before printing out its contents fixes the segv, but these BoxSelectionNoSharedMemory tests fail: multi_rect2d_box_nodes and mult_rect2d_box_zones

Helpful additional information

  • Did VisIt crash: CLI crashes
  • Did you get wrong results:

Desktop

  • OS and version: Windows 10
  • VisIt Version: develop
  • Server info (if applicable):
@biagas biagas added bug Something isn't working likelihood medium Neither low nor high likelihood impact medium Productivity partially degraded (not easily mitigated bug) or improved (enhancement) labels Mar 1, 2022
@brugger1 brugger1 added crash bug caused visit to crash reviewed Issue has been reviewed and labeled by a developer labels Mar 1, 2022
@brugger1 brugger1 added this to the 3.2.3 milestone Mar 1, 2022
@biagas
Copy link
Contributor Author

biagas commented Mar 3, 2022

@cyrush, can you direct me on how best to discover how/why the flatten query fails on Windows?
I have fixed the segv on my local copy and will create a PR soon for that, but the failure is what tickled the bug.

@biagas biagas mentioned this issue Mar 3, 2022
7 tasks
@cyrush
Copy link
Member

cyrush commented Mar 3, 2022

@Laganellac, could this case be an issue b/c the results are empty and their aren't guards later in the process? (We move the vector, but it is just fundamentally empty?)

@Laganellac
Copy link
Contributor

Hi @cyrush and @biagas, I can remove the std::cout prints; at this point they probably wouldn't even be useful in the debug log. I must've missed them somehow when making my cleanup pass over the code.

As for the results vector being empty - this indicates to me that the avtFlattenQuery has filled out the XMLResult with valid metadata about the size / shape of the results vector yet when it got to the CLI the results vector was empty. My gut tells me this is coming from this block of code in avtFlattenQuery.C:

if(PAR_Rank() == 0)
{
const double sizeInGigaBytes = (pimpl->outData.size() * sizeof(FloatType)) / (double)1e9;
debug5 << "avtFlattenQuery XML output:\n" << pimpl->outInfo.ToXML() << std::endl;
qA->SetXmlResult(pimpl->outInfo.ToXML());
qA->SetResultsMessage("Success!\nNOTE: This query should only "
"be used in the CLI via the Flatten() function.");
if(pimpl->useSharedMemory)
{
WriteSharedMemory();
}
else if(sizeInGigaBytes > pimpl->maxDataSize)
{
std::stringstream s;
s << "ERROR: Data too large to transport via query attributes ("
<< sizeInGigaBytes << "GB). You can override the limit by"
<< " overriding the parameter 'maxDataSize' with a (double)"
<< " size repesented in Gigabytes. (Current value = "
<< pimpl->maxDataSize << ").\nNOTE: This query should only "
<< "be used in the CLI via the Flatten() function.";
qA->SetResultsMessage(s.str());
debug1 << s.str() << std::endl;
}
else
{
SwapIntoQueryAttributes(*qA, pimpl->outData);
}
}

I can change this logic to only set the XMLResult after we've verified that the results vector will be populated. However, I'm not convinced this is the whole issue because I would expect the Flatten unit test to also fail on Linux if we were hitting the 'else if' branch in the above code.

Are there debug logs from running the test on Windows that I could take a look at?

@biagas
Copy link
Contributor Author

biagas commented Mar 10, 2022

@Laganellac here are the engine and viewer log files from a run of the test on Windows where I commented out all but test_box_selection, which is the one that fails. Interestingly enough, if I comment out the shared memory portion of test_box_selection, the non-shared portion succeeds.
engine_viewer_logs.zip

@Laganellac
Copy link
Contributor

Hi @biagas, I've made a small change related to this in PR #17423; however, I don't think this will solve the problem with the test on Windows. Unfortunately I cannot reproduce this bug on my Ubuntu machine and will have to setup a Windows build to try and debug. Based on the log output, everything coming from the engine looks correct - specifically line 479 has the correct input parameters, and lines 864 - 889 have the expected XMLInfo. Same goes later in the file when the test is run again with no shared memory.

When the test fails on Windows does it still crash (now that there are no printouts)? Or does it just fail the text diff for the file contents? On Windows, both the shared memory and non shared memory approaches should result in the exact same behavior (shared memory support is turned off for Windows in Flatten), so maybe there is a bug in my logic that always disables shared memory on windows but I'm not seeing that in the log output.

@biagas
Copy link
Contributor Author

biagas commented Mar 15, 2022

The test fails on Windows (no longer crashes).
It is the second run of the test_box that fails. If I leave the flatten.py file untouched, then the non-shared-memory test_box call fails.
Here are the results from that run:
flatten_box_no-shmem_results.zip

If I swap the order, is still the second call to test_box that fails, so this time the shared memory version.
Here are the results from that run:
flatten_box_shmem_results.zip

@cyrush
Copy link
Member

cyrush commented Jun 15, 2022

@biagas -- Revisiting -- I can't quite understand the path forward.

Do we need to get #17423 into 3.3? Or did another PR heal this?

@biagas
Copy link
Contributor Author

biagas commented Jun 16, 2022

@cyrush, the offending code was removed, and the cli no longer crashes, so I don't think anything needs to be done for 3.3.

However, the tests still fail on Windows, and the reason for the failure has yet to be discovered. #17423 may help, I can give it a try.

Here the first line of values in the expected results:

# vec_nid/c0,vec_nid/c1,vec_nid/c2,nodeIds,nodeDomains
0.000000000000000000e+00,0.000000000000000000e+00,0.000000000000000000e+00,0.000000000000000000e+00,0.000000000000000000e+00

and the same results on windows:

# vec_nid/c0,vec_nid/c1,vec_nid/c2,nodeIds,nodeDomains
3.454112000000000000e+06,7.426881860921530476e-43,3.348628000000000000e+06,7.426881860921530476e-43,1.740819699404572229e+25

Looks like standard windows garbage for uninitialized vars.

@biagas
Copy link
Contributor Author

biagas commented Jun 16, 2022

@cyrus, I added a new ticket describing the test suite problem on Windows (#17767)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working crash bug caused visit to crash impact medium Productivity partially degraded (not easily mitigated bug) or improved (enhancement) likelihood medium Neither low nor high likelihood reviewed Issue has been reviewed and labeled by a developer
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants