
Add Mmap allocator information to error message when failing with MMap failed #9455

Closed
mohsaka wants to merge 1 commit

Conversation

@mohsaka (Contributor) commented Apr 11, 2024

We currently log very little about the state of the MMap allocator when an allocation fails. Here is an example of what we see:

```
E20240410 00:12:53.207183 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.218799 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.228353 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.239408 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.250753 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.259156 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.266788 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.274092 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.287583 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.296489 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
I20240410 00:12:53.296515 21387 AsyncDataCache.cpp:799] [CACHE] Backoff in allocation contention for 23.67ms
E20240410 00:12:53.327771 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.342656 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.387915 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.413861 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.463887 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.501030 21387 MmapAllocator.cpp:344] [MEM] Mmap failed with 4194304 pages use MmapArena false
E20240410 00:12:53.521816 21387 Exceptions.h:69] Line: /app/presto-native-execution/velox/velox/common/memory/MemoryPool.cpp:1198, Function:handleAllocationFailure, Expression:  allocateContiguous failed with 4194304 pages from Memory Pool[op.2149.4.63.HashBuild LEAF root[20240410_001004_00013_5sjzc_123] parent[node.2149] MMAP track-usage thread-safe]<max capacity 437.00GB unlimited capacity used 0B available 0B reservation [used 0B, reserved 0B, min 0B] counters [allocs 1, frees 0, reserves 0, releases 1, collisions 0])> Mmap failed with 4194304 pages use MmapArena false Failed to evict from cache state: AsyncDataCache:
Cache size: 57.10GB tinySize: 228.15KB large size: 57.09GB
Cache entries: 15536 read pins: 128 write pins: 0 pinned shared: 126.07MB pinned exclusive: 0B
 num write wait: 19610 empty entries: 112530
Cache access miss: 235115 hit: 974160 hit bytes: 2.51TB eviction: 219579 eviction checks: 296298 aged out: 0
Prefetch entries: 2 bytes: 890.92KB
Alloc Megaclocks 610291
Allocated pages: 32586619 cached pages: 14967066
, Source: RUNTIME, ErrorCode: MEM_ALLOC_ERROR
```

By adding the allocator's toString() output to the error message, we will be able to see output like:

```
Memory Allocator[MMAP total capacity 461.00GB free capacity 66.99GB allocated pages 103287293 mapped pages 104333972 external mapped pages 7670401
[size 1: 33382(130MB) allocated 60831 mapped]
[size 2: 43805(342MB) allocated 343492 mapped]
[size 4: 34963(546MB) allocated 47499 mapped]
[size 8: 28204(881MB) allocated 38870 mapped]
[size 16: 35949(2246MB) allocated 47261 mapped]
[size 32: 47909(5988MB) allocated 51124 mapped]
[size 64: 36067(9016MB) allocated 36067 mapped]
[size 128: 45642(22821MB) allocated 45642 mapped]
[size 256: 331530(331530MB) allocated 331532 mapped]
]
```
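For illustration, here is a minimal, self-contained C++ sketch of the idea behind this change. `FakeMmapAllocator` and its fields are hypothetical stand-ins, not the actual `velox::memory::MmapAllocator` API: the point is only that the mmap-failure log line gains the allocator's `toString()` state dump.

```cpp
#include <cstdint>
#include <iostream>
#include <string>

// Hypothetical stand-in for velox::memory::MmapAllocator, showing the
// shape of the change: the failure log now appends toString().
struct FakeMmapAllocator {
  int64_t numAllocatedPages{0};
  int64_t numMappedPages{0};

  // Stand-in for the allocator's toString() state summary.
  std::string toString() const {
    return "Memory Allocator[MMAP allocated pages " +
        std::to_string(numAllocatedPages) + " mapped pages " +
        std::to_string(numMappedPages) + "]";
  }

  void logMmapFailure(int64_t numPages, bool useMmapArena) const {
    // Before this PR the message ended at "use MmapArena <bool>";
    // after it, the toString() summary is appended.
    std::cerr << "[MEM] Mmap failed with " << numPages
              << " pages use MmapArena "
              << (useMmapArena ? "true" : "false") << ' ' << toString()
              << '\n';
  }
};

int main() {
  FakeMmapAllocator allocator{103287293, 104333972};  // values from the example above
  allocator.logMmapFailure(4194304, false);
}
```

In the actual PR, the change presumably appends the same kind of summary to the existing "[MEM] Mmap failed ..." log and exception message shown above.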

@facebook-github-bot added the CLA Signed label on Apr 11, 2024
netlify bot commented Apr 11, 2024

Deploy Preview for meta-velox canceled.

🔨 Latest commit: d9be7d2
🔍 Latest deploy log: https://app.netlify.com/sites/meta-velox/deploys/661f0941ea13ba0008622dda

@majetideepak (Collaborator) left a comment

Thanks, @mohsaka

@xiaoxmeng (Contributor) left a comment

@mohsaka thanks for the improvement % nit.

Review comment on velox/common/memory/MmapAllocator.cpp (outdated, resolved)
@mohsaka (Contributor, Author) commented Apr 16, 2024

@xiaoxmeng Thanks for the review. All fixed!

@mohsaka requested a review from @xiaoxmeng on April 16, 2024, 23:28
@facebook-github-bot commented:
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot commented:
@xiaoxmeng merged this pull request in ced2db6.


Conbench analyzed the 1 benchmark run on commit ced2db6c.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Jun 7, 2024
Add Mmap allocator information to error message when failing with MMap failed (facebookincubator#9455)

Summary: See the PR description above.

Pull Request resolved: facebookincubator#9455

Reviewed By: amitkdutta

Differential Revision: D56230522

Pulled By: xiaoxmeng

fbshipit-source-id: f56832115468c0354680381c27f5f175580184ac