SSM Agent cannot manage t3.nano instances. Not enough RAM. OOM during package baseline scan. #615

Open
gwharton opened this issue Feb 2, 2025 · 3 comments

Comments

@gwharton

gwharton commented Feb 2, 2025

Fire up a t3.nano instance running Ubuntu and allow it to be managed by AWS Systems Manager.

Ensure Systems Manager initiates a Software Baseline scan on the node.

Watch the SSM Agent crash and burn when the kernel OOM killer fires during the scan:

Feb 02 16:44:51 ip-10-0-0-116 python3[1498]: /usr/lib/python3/dist-packages/uaclient/apt_news.py:207: Warning: W:Download is performed unsandboxed as root as file '/run/ubuntu-advantage/aptnews.json' couldn't be accessed by user '_apt'. - pkgAcquire::Run (13: Permission denied)
Feb 02 16:44:51 ip-10-0-0-116 python3[1498]:   acq.run()
Feb 02 16:45:39 ip-10-0-0-116 kernel: irqbalance invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
Feb 02 16:45:40 ip-10-0-0-116 kernel: CPU: 1 PID: 608 Comm: irqbalance Not tainted 6.8.0-1021-aws #23-Ubuntu
Feb 02 16:45:40 ip-10-0-0-116 kernel: Hardware name: Amazon EC2 t3.nano/, BIOS 1.0 10/16/2017
Feb 02 16:45:40 ip-10-0-0-116 kernel: Call Trace:
Feb 02 16:45:40 ip-10-0-0-116 kernel:  <TASK>
Feb 02 16:45:40 ip-10-0-0-116 kernel:  dump_stack_lvl+0x76/0xa0
Feb 02 16:45:40 ip-10-0-0-116 kernel:  dump_stack+0x10/0x20
Feb 02 16:45:40 ip-10-0-0-116 kernel:  dump_header+0x47/0x200
Feb 02 16:45:40 ip-10-0-0-116 kernel:  oom_kill_process+0x116/0x270
Feb 02 16:45:40 ip-10-0-0-116 kernel:  ? oom_evaluate_task+0x143/0x1e0
Feb 02 16:45:40 ip-10-0-0-116 kernel:  out_of_memory+0x103/0x350
Feb 02 16:45:40 ip-10-0-0-116 kernel:  __alloc_pages_may_oom+0x10c/0x1d0
Feb 02 16:45:40 ip-10-0-0-116 kernel:  __alloc_pages_slowpath.constprop.0+0x42a/0xa40
Feb 02 16:45:40 ip-10-0-0-116 kernel:  __alloc_pages+0x306/0x330
Feb 02 16:45:40 ip-10-0-0-116 kernel:  alloc_pages_mpol+0x91/0x200
Feb 02 16:45:40 ip-10-0-0-116 kernel:  folio_alloc+0x64/0x120
Feb 02 16:45:40 ip-10-0-0-116 kernel:  ? filemap_get_entry+0x66/0x150
Feb 02 16:45:40 ip-10-0-0-116 kernel:  filemap_alloc_folio+0xf4/0x100
Feb 02 16:45:40 ip-10-0-0-116 kernel:  __filemap_get_folio+0x14b/0x2f0
Feb 02 16:45:40 ip-10-0-0-116 kernel:  filemap_fault+0x15c/0x880
Feb 02 16:45:40 ip-10-0-0-116 kernel:  __do_fault+0x3b/0x140
Feb 02 16:45:40 ip-10-0-0-116 kernel:  do_read_fault+0x114/0x1b0
Feb 02 16:45:40 ip-10-0-0-116 kernel:  do_fault+0x10c/0x310
Feb 02 16:45:40 ip-10-0-0-116 kernel:  handle_pte_fault+0x74/0x1c0
Feb 02 16:45:40 ip-10-0-0-116 kernel:  __handle_mm_fault+0x65e/0x7a0
Feb 02 16:45:40 ip-10-0-0-116 kernel:  handle_mm_fault+0x17c/0x380
Feb 02 16:45:40 ip-10-0-0-116 kernel:  do_user_addr_fault+0x157/0x660
Feb 02 16:45:40 ip-10-0-0-116 kernel:  exc_page_fault+0x83/0x190
Feb 02 16:45:40 ip-10-0-0-116 kernel:  asm_exc_page_fault+0x27/0x30
Feb 02 16:45:40 ip-10-0-0-116 kernel: RIP: 0033:0x737a97eb4237
Feb 02 16:45:40 ip-10-0-0-116 kernel: Code: Unable to access opcode bytes at 0x737a97eb420d.
Feb 02 16:45:40 ip-10-0-0-116 kernel: RSP: 002b:00007ffe871a1d10 EFLAGS: 00010246
Feb 02 16:45:40 ip-10-0-0-116 kernel: RAX: 0000000000000000 RBX: 000063478eb5cd40 RCX: 0000000000000001
Feb 02 16:45:40 ip-10-0-0-116 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000063478eb5d1b0
Feb 02 16:45:40 ip-10-0-0-116 kernel: RBP: 00007ffe871a1d90 R08: 0000000000000000 R09: 000000007fffffff
Feb 02 16:45:40 ip-10-0-0-116 kernel: R10: 000063478eb5cd40 R11: 0000000000000293 R12: 000063478eb5d1b0
Feb 02 16:45:40 ip-10-0-0-116 kernel: R13: 00007ffe871a1d30 R14: 000000007fffffff R15: 00007ffe871a1d28
Feb 02 16:45:40 ip-10-0-0-116 kernel:  </TASK>
Feb 02 16:45:40 ip-10-0-0-116 kernel: Mem-Info:
Feb 02 16:45:40 ip-10-0-0-116 kernel: active_anon:30821 inactive_anon:40579 isolated_anon:0
                                       active_file:311 inactive_file:499 isolated_file:0
                                       unevictable:9257 dirty:40 writeback:0
                                       slab_reclaimable:4424 slab_unreclaimable:10630
                                       mapped:2201 shmem:684 pagetables:1383
                                       sec_pagetables:0 bounce:0
                                       kernel_misc_reclaimable:0
                                       free:1067 free_pcp:13 free_cma:0
Feb 02 16:45:40 ip-10-0-0-116 kernel: Node 0 active_anon:123284kB inactive_anon:162316kB active_file:1452kB inactive_file:1788kB unevictable:37028kB isolated(anon):0kB isolated(file):0kB mapped:8804kB dirty:160kB writeback:0kB shmem:2736kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:3424kB pagetables:5532kB sec_pagetables:0kB all_unreclaimable? no
Feb 02 16:45:40 ip-10-0-0-116 kernel: Node 0 DMA free:1592kB boost:0kB min:92kB low:112kB high:132kB reserved_highatomic:0KB active_anon:4772kB inactive_anon:7760kB active_file:12kB inactive_file:4kB unevictable:0kB writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:48kB local_pcp:0kB free_cma:0kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: lowmem_reserve[]: 0 386 386 386 386
Feb 02 16:45:40 ip-10-0-0-116 kernel: Node 0 DMA32 free:2676kB boost:5548kB min:8016kB low:8632kB high:9248kB reserved_highatomic:0KB active_anon:118512kB inactive_anon:154556kB active_file:1716kB inactive_file:1500kB unevictable:37028kB writepending:160kB present:465328kB managed:411548kB mlocked:27284kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: lowmem_reserve[]: 0 0 0 0 0
Feb 02 16:45:40 ip-10-0-0-116 kernel: Node 0 DMA: 1*4kB (M) 3*8kB (UM) 4*16kB (U) 3*32kB (UM) 8*64kB (UM) 1*128kB (M) 1*256kB (M) 1*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 1596kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: Node 0 DMA32: 155*4kB (UME) 86*8kB (UME) 60*16kB (UE) 23*32kB (UME) 1*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3068kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: 5675 total pagecache pages
Feb 02 16:45:40 ip-10-0-0-116 kernel: 0 pages in swap cache
Feb 02 16:45:40 ip-10-0-0-116 kernel: Free swap  = 0kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: Total swap = 0kB
Feb 02 16:45:40 ip-10-0-0-116 kernel: 120331 pages RAM
Feb 02 16:45:40 ip-10-0-0-116 kernel: 0 pages HighMem/MovableOnly
Feb 02 16:45:40 ip-10-0-0-116 kernel: 13604 pages reserved
Feb 02 16:45:40 ip-10-0-0-116 kernel: 0 pages hwpoisoned
Feb 02 16:45:40 ip-10-0-0-116 kernel: Tasks state (memory values in pages):
Feb 02 16:45:40 ip-10-0-0-116 kernel: [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    125]     0   125     8508     1056      256      800         0    94208        0          -250 systemd-journal
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    187]     0   187    72238     6752     4608     2144         0   114688        0         -1000 multipathd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    198]     0   198     6618     1372      768      604         0    77824        0         -1000 systemd-udevd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    346]   991   346     5397     1312      544      768         0    86016        0             0 systemd-resolve
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    535]   998   535     5599     1024      256      768         0    81920        0             0 systemd-network
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    598]     0   598      680      480       32      448         0    45056        0             0 acpid
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    602]     0   602     1806      608       64      544         0    53248        0             0 cron
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    603]   101   603     2449     1056      192      864         0    69632        0          -900 dbus-daemon
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    608]     0   608    20730      832       64      768         0    69632        0             0 irqbalance
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    610]     0   610     8115     3360     2528      832         0   102400        0             0 networkd-dispat
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    613]   989   613    95769     1440      512      928         0   135168        0             0 polkitd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    624]     0   624   367226     5258     5258        0         0   307200        0          -900 snapd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    634]     0   634     4403      960      224      736         0    81920        0             0 systemd-logind
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    645]     0   645   117247     1440      512      928         0   151552        0             0 udisksd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    729]     0   729     1537      480       32      448         0    53248        0             0 agetty
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    750]   110   750     4847      800      160      640         0    65536        0             0 chronyd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    755]   110   755     2765      516      132      384         0    57344        0             0 chronyd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    773]     0   773    27499     3232     2336      896         0   122880        0             0 unattended-upgr
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    790]   102   790    55627     1248      384      864         0    98304        0             0 rsyslogd
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    813]     0   813    97969     1280      416      864         0   143360        0             0 ModemManager
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    816]     0   816     1526      448       32      416         0    53248        0             0 agetty
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    984]     0   984   458276     2944     2944        0         0   229376        0             0 amazon-ssm-agen
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1039]     0  1039   595910     4337     4337        0         0   372736        0             0 ssm-agent-worke
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1318]     0  1318   423459     3664     3664        0         0   237568        0             0 ssm-document-wo
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1343]     0  1343   404962     2944     2912       32         0   204800        0             0 ssm-document-wo
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1350]     0  1350   423331     3008     3008        0         0   221184        0             0 ssm-document-wo
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1360]     0  1360      700      448        0      448         0    49152        0             0 sh
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1361]     0  1361     1968      576       64      512         0    57344        0             0 _script.sh
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1367]     0  1367   441828     2944     2944        0         0   221184        0             0 ssm-document-wo
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1386]     0  1386      700      448        0      448         0    49152        0             0 sh
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1387]     0  1387     1968      608       96      512         0    57344        0             0 _script.sh
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1423]     0  1423    53409    11317    10581      736         0   196608        0             0 python3
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1438]     0  1438   404266     3296     2848      448         0   217088        0             0 updater
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1488]     0  1488    38604    11398    10374     1024         0   212992        0             0 python3
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1498]     0  1498    13479     4819     3955      864         0   143360        0             0 python3
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1499]     0  1499    13769     4992     4128      864         0   147456        0             0 python3
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1500]    42  1500     5815     1344      256     1088         0    90112        0             0 http
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1501]    42  1501     5815     1248      224     1024         0    86016        0             0 http
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1504]    42  1504     4019      928      128      800         0    81920        0             0 gpgv
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1550]    42  1550     4019      566      150      416         0    77824        0             0 gpgv
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1551]    42  1551      700      448        0      448         0    49152        0             0 apt-key
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1556]    42  1556     6058     2912     1920      992         0    94208        0             0 store
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1557]    42  1557     3187      768       96      672         0    69632        0             0 apt-config
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1558]     0  1558     1945      515       67      448         0    53248        0             0 cron
Feb 02 16:45:40 ip-10-0-0-116 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/snap.amazon-ssm-agent.amazon-ssm-agent.service,task=python3,pid=1488,uid=0
Feb 02 16:45:40 ip-10-0-0-116 kernel: Out of memory: Killed process 1488 (python3) total-vm:154416kB, anon-rss:41496kB, file-rss:4096kB, shmem-rss:0kB, UID:0 pgtables:208kB oom_score_adj:0
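
To see which processes hold resident memory at the moment of the kill, the "Tasks state" table above can be totalled per process name. The following is a minimal sketch, not part of the SSM Agent: the helper name `rss_by_name` and the `sample` excerpt are illustrative, and it assumes 4 kB pages (the x86-64 default) and the column layout shown in this particular dump.

```python
import re
from collections import defaultdict

PAGE_KB = 4  # assumption: 4 kB pages, the x86-64 default

# One row of the kernel's "Tasks state" table:
# [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem
# pgtables_bytes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<rss_anon>\d+)\s+(?P<rss_file>\d+)\s+(?P<rss_shmem>\d+)\s+"
    r"(?P<pgtables>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)\s*$"
)


def rss_by_name(log_text):
    """Sum resident set size in kB per process name across the task table."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        match = ROW.search(line)
        if match:  # header and non-table lines simply do not match
            totals[match.group("name")] += int(match.group("rss")) * PAGE_KB
    return dict(totals)


# Three rows copied from the dump above, used as a small self-check.
sample = """\
Feb 02 16:45:40 ip-10-0-0-116 kernel: [    984]     0   984   458276     2944     2944        0         0   229376        0             0 amazon-ssm-agen
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1423]     0  1423    53409    11317    10581      736         0   196608        0             0 python3
Feb 02 16:45:40 ip-10-0-0-116 kernel: [   1488]     0  1488    38604    11398    10374     1024         0   212992        0             0 python3
"""

if __name__ == "__main__":
    totals = rss_by_name(sample)
    for name, kb in sorted(totals.items(), key=lambda item: -item[1]):
        print(f"{name:<16} {kb:>8} kB")
```

Run over the full table, the four `python3` rows add up to 32526 pages, roughly 127 MiB resident. Not every one of them necessarily belongs to the baseline scan: pid 1498 is the `uaclient` `apt_news.py` script, according to the first log line.
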
@VishnuKarthikRavindran
Contributor

Hi @gwharton, thanks for reaching out. Could you please share the agent version in which you first saw this issue?

@gwharton
Author

gwharton commented Feb 3, 2025

It's been like this for years. The t3.nano, or any of the 512 MB low-memory instances, just doesn't have enough RAM to run the SSM AWS-RunPatchBaseline scan from Systems Manager. The Python process that collects the installed packages consumes somewhere between 150 and 200 MB of RAM all on its own. A 512 MB machine running its own workload doesn't have enough headroom for a 200 MB transient process firing up every day. Python seems quite heavyweight for this.

The only way I've found I can work with nano instances is to exclude them from Systems Manager.
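
The headroom argument can be checked against the numbers in the OOM dump earlier in this issue. This is only a worked-arithmetic sketch: the page counts are copied from that log, the 4 kB page size is an assumption (the x86-64 default), and the `fits` helper is hypothetical.

```python
PAGE_KB = 4  # assumption: 4 kB pages

# Values copied from the kernel log in this issue.
pages_ram = 120331       # "120331 pages RAM"
pages_reserved = 13604   # "13604 pages reserved"
free_pages = 1067        # "free:1067" in the Mem-Info block

# Memory the kernel actually manages, and what was free when the OOM killer ran.
managed_kb = (pages_ram - pages_reserved) * PAGE_KB
free_kb = free_pages * PAGE_KB


def fits(transient_mb, available_kb):
    """Return True if a transient process of transient_mb MB fits in available_kb."""
    return transient_mb * 1024 <= available_kb


if __name__ == "__main__":
    print(f"managed: {managed_kb} kB ({managed_kb / 1024:.1f} MiB)")
    print(f"free at OOM: {free_kb} kB")
    print("150 MB transient process fits:", fits(150, free_kb))
```

The managed figure of 426908 kB matches the sum of the `managed:` values for the DMA and DMA32 zones in the log (15360 kB + 411548 kB), so only about 417 MiB of the nominal 512 MB is usable before anything runs. With no swap configured ("Total swap = 0kB"), there is nowhere for a transient process of that size to go.
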

@equake

equake commented Feb 8, 2025

@VishnuKarthikRavindran I have just created a fresh instance and it got OOM'd.
