
Using TTD (Time Travel Debugging) to track down an out-of-bounds memory access

Trent Nelson edited this page Jan 17, 2023 · 6 revisions

Here are the steps required to build the borked version with an out-of-bounds memory access and to record the resulting crash with Time Travel Debugging. This was prompted by a Twitter discussion with Tim Misiak. The last commit before the fix is 8a71b5860d30eb97daa2bd9db62a73578c7b6b79. The fix is in this commit: spoiler alert.

In a Visual Studio developer console:

cd /d c:\src
git clone https://github.com/tpn/perfecthash ph-tdd-before
cd ph-tdd-before
git checkout 8a71b5860d30eb97daa2bd9db62a73578c7b6b79

cd ..
git clone ph-tdd-before ph-tdd-after
cd ph-tdd-after
git checkout b6ab48138ce3e8a5f56072b5134de1f067053201

cd ..
cd ph-tdd-before\src

If your CPU supports AVX2, you'll need to apply the following patch to ensure that the non-AVX2 code path (which is where the bug is) gets executed:

diff --git a/src/PerfectHash/Graph.c b/src/PerfectHash/Graph.c
index 94b79fc..762bb58 100644
--- a/src/PerfectHash/Graph.c
+++ b/src/PerfectHash/Graph.c
@@ -91,10 +91,12 @@ Return Value:
     //

     Rtl = Graph->Rtl;
+#if 0
     if (Rtl->CpuFeatures.AVX2 != FALSE) {
         Graph->Vtbl->CalculateAssignedMemoryCoverage =
             GraphCalculateAssignedMemoryCoverage_AVX2;
     }
+#endif

     //
     // We're done!  Indicate success and finish up.
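
The patch works because the coverage routine is called through a function pointer in the graph's vtable, and the AVX2 override is the only thing that replaces the default. As a minimal sketch of that dispatch pattern (all type and function names below are hypothetical stand-ins, not the real perfecthash definitions), the `force_scalar` flag plays the role of the `#if 0`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real perfecthash types. */
typedef struct Graph Graph;
typedef int (*CoverageFn)(const Graph *graph);

typedef struct { CoverageFn CalculateCoverage; } GraphVtbl;
typedef struct { bool AVX2; } CpuFeatures;

struct Graph {
    GraphVtbl Vtbl;
    CpuFeatures Cpu;
};

/* Placeholder implementations: each returns a tag identifying itself. */
static int CoverageScalar(const Graph *g) { (void)g; return 1; }
static int CoverageAvx2(const Graph *g)   { (void)g; return 2; }

/*
 * Install the default (scalar) routine, then optionally override it with
 * the AVX2 one. Passing force_scalar = true mimics wrapping the override
 * in #if 0 ... #endif, as the patch does.
 */
void GraphInitVtbl(Graph *g, bool force_scalar)
{
    assert(g != NULL);
    g->Vtbl.CalculateCoverage = CoverageScalar;
    if (!force_scalar && g->Cpu.AVX2) {
        g->Vtbl.CalculateCoverage = CoverageAvx2;
    }
}
```

With the override disabled, every call through `Vtbl.CalculateCoverage` reaches the scalar routine even on an AVX2-capable machine, which is the situation the patch sets up for the buggy code path.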

Build the debug version:

msbuild /nologo /m /t:Rebuild /p:Configuration=Debug;Platform=x64

The bug actually manifests as silent memory corruption affecting the statistics used to collect cache line occupancy information for a perfect hash table's "assigned" array. We can use gflags.exe to enable page heap verification (+hpa) for the executable, which forces the out-of-bounds access to surface as an access violation; that crash is what we want to record. From an Administrator prompt:

gflags.exe -i PerfectHashBulkCreate.exe +hpa
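
To illustrate why this class of bug stays silent without page heap, here is a small sketch of per-cache-line occupancy counting. It is not the actual code or the actual bug in Graph.c; the function name, the 64-byte line size, and the undersized-buffer scenario are all assumptions made for illustration. The sketch checks the index explicitly and counts the writes that would have gone out of bounds, so it is safe to run:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64  /* assumed cache line size */
#define VALUES_PER_LINE (CACHE_LINE_BYTES / sizeof(uint32_t))  /* 16 */

/*
 * Hypothetical helper: count the non-zero entries of `assigned` that fall
 * in each cache line, storing the counts in `per_line`.
 *
 * Returns the number of increments that would have landed past the end of
 * `per_line` (they are skipped here). In unchecked C each of those would
 * be a silent write into whatever memory follows the buffer.
 */
size_t count_per_cache_line(const uint32_t *assigned, size_t count,
                            uint8_t *per_line, size_t num_lines)
{
    size_t skipped = 0;

    assert(assigned != NULL && per_line != NULL);

    for (size_t i = 0; i < count; i++) {
        size_t line = i / VALUES_PER_LINE;
        if (assigned[i] == 0) {
            continue;               /* empty slot, nothing to count */
        }
        if (line >= num_lines) {
            skipped++;              /* would be an out-of-bounds write */
            continue;
        }
        per_line[line]++;
    }
    return skipped;
}
```

With 40 entries, a buffer sized by truncating division (40 / 16 = 2 lines) is one line short, so the entries in the final partial line have nowhere to go. Without the explicit check those increments would corrupt adjacent memory quietly; with page heap enabled, an overrun of a heap allocation hits an inaccessible page and faults immediately.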

Open WinDbg Preview as Administrator, go to Launch Executable (Advanced), and configure things as follows:

The executable: c:\src\ph-tdd-before\src\x64\Debug\PerfectHashBulkCreate.exe

The command line: c:\src\perfecthash-keys\sys32 c:\temp\ph\tdd-before Chm01 RotateMultiplyXorRotate And 1 --FindBestGraph --BestCoverageType=HighestNumberOfEmptyCacheLines --BestCoverageAttempts=10 --AttemptsBeforeTableResize=100000000 --NoFileIo --MaxNumberOfTableResizes=0 --IgnorePreviousTableSize

The start directory isn't important; set it to anything you like.

Select [x] Record with Time Travel Debugging, then press Configure and Record, and enter a save location.

Then, press Record.

Press Stop and Debug to end the recording and open the trace in the debugger.

You can then step backwards with t- to inspect the loop variables and figure out how *Assigned ends up causing the access violation. Or just take a look at the fix here: spoiler alert.

Don't forget to turn the gflags.exe setting back off when you're done. From an Administrator prompt:

gflags.exe -i PerfectHashBulkCreate.exe -hpa