Overloads with idling threads #76
-
Hi, I don't see anything wrong with your htop. The threads are using anywhere from 0.7% to 2.0% CPU, which is good and the way it should be. The 106% CPU usage you see is actually a combined value: what Tor uses on one CPU plus what the workers use across multiple threads and CPUs. The Tor process itself is not using 106% of one CPU. As for compare.sh, please check /var/tmp for a file named file2. See if it has entries and how old the file is; you may have a problem downloading the most recent file. Please post the result of your compare.sh here. What's your advertised bandwidth? Do you mind telling me your server's nick so I can take a look at your bandwidth history graph?
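If it helps, here's a minimal sketch of that check (assuming GNU coreutils and the /var/tmp/file2 path mentioned above):

```bash
#!/usr/bin/env bash
# Sanity-check the file that compare.sh downloads.
FILE=/var/tmp/file2

if [ ! -f "$FILE" ]; then
    echo "file2 is missing - the download is probably failing"
    exit 1
fi

echo "Entries:  $(wc -l < "$FILE")"
echo "Modified: $(stat -c %y "$FILE")"   # GNU stat

# Warn if the file is older than a day, i.e. the download has gone stale.
if [ -n "$(find "$FILE" -mtime +0)" ]; then
    echo "Warning: file2 is more than 24 hours old"
fi
```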
-
The threads are doing alright; that's the usual load per worker. Tor looks at the number of workers, their current workload, and how long it will take them to finish the job. If the time for a worker to become available for a new job is longer than a certain threshold, Tor starts dropping NTor onion skins. If you see one or two worker threads show 0% from time to time, then you know your NumCPUs is just at the right spot. 24 should be enough for your kind of bandwidth, but I think Tor gets into trouble when you reach your burst, and with the DDoS attacks I'm assuming you reach the burst often. Figure out what kind of bandwidth you're comfortable with, keep your burst close to your maximum advertised bandwidth, and then find the sweet spot for NumCPUs.

It also seems that you restart too often. When you get an overload status, keep Tor running; the overload message will go away after two or three heartbeats. As time goes by, the ban list gets populated, you establish enough connections with other relays, and your system runs smoother and smoother. Each time you change NumCPUs, though, you'll have to restart Tor; a -HUP will not do.

As for conntrack.sh, you didn't post the result. Does it show 0 relays, or does it show low numbers? Please post the results if you can. You also didn't tell me how old file2 was.
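As a rough illustration only, the relevant torrc lines might look like this; the numbers here are placeholders, not recommendations:

```
# torrc - illustrative values only
NumCPUs 24                      # changing this requires a full restart; -HUP is not enough
RelayBandwidthRate  20 MBytes
RelayBandwidthBurst 22 MBytes   # keep the burst close to your advertised bandwidth
```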
-
So conntrack.sh is working. However, your iptables rules are not. None of those IP addresses with more than 2 connections would have been there (except for your IP and the snowflake servers) had the rules been applied, and the number of IPs with two connections should have been almost equal to the number of dual-OR relays (maybe 5 more). Are you running the newest version of the rules? Did you run conntrack.sh right after you ran the multi.sh script? You should download the newest version and make sure it runs with no errors. Make sure you have populated ipv4.txt and ipv6.txt with the correct IPs in the correct format. After you run it, confirm that the rules have been applied by typing iptables -S -t mangle, and make sure the ipsets are there too by typing ipset list.

If you like, you can copy and paste the result of iptables -S -t mangle here so I can take a look at it. But from what I see, the problem may not be NumCPUs. The problem is that the rules are not doing what they should do.
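A small sketch of that verification, run as root (the checks are generic; the actual rule and set names are whatever the script created):

```bash
#!/usr/bin/env bash
# Verify the firewall pieces are actually in place.

echo "--- mangle table rules ---"
iptables -S -t mangle

echo "--- ipsets (headers only) ---"
ipset list -t

# If no sets exist at all, multi.sh did not run cleanly.
if ! ipset list -t | grep -q '^Name:'; then
    echo "No ipsets found - re-run the rules script and watch for errors"
fi
```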
-
No, that's not normal. If you run multi.sh after having Tor run for a while, you'd see something like that, but after about 10 minutes all of it should disappear. Just noticed: I can't see your public IP in the result of your conntrack.sh, only a local IP, 10.0.0.225. Are you behind a router and on a local LAN? If that's the case, the IP addresses in your ipv4.txt and ipv6.txt are wrong. The destination for the incoming packets should be your local IP. Chances are your ban list ipset is empty too.
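A quick way to double-check which local address Tor is actually bound to (a sketch; run as root so ss can show process names):

```bash
# Which address/port is the tor process listening on?
ss -tlnp | grep -i tor

# What IPv4 addresses does this machine actually have?
ip -4 addr show
```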
-
Looking at the results again, I see that your public IPv6 address seems to be the destination of the packets, so I'd keep that in ipv6.txt for now, but definitely change the IPv4.
-
That IP seems to have very few connections, and they're probably local. The public IPv6 has 411 connections, which tells me that IP is the main destination, so keep that. Just change the IPv4 for now and we'll monitor your system to see if any other changes are necessary.
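For reference, a rough one-liner using the conntrack CLI (which these scripts rely on) to count tracked TCP connections per original destination address; the address with the largest count is the real destination of incoming traffic:

```bash
# Run as root. Prints "count address", busiest destination first.
conntrack -L -p tcp 2>/dev/null \
  | awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^dst=/) { print substr($i, 5); break } }' \
  | sort | uniq -c | sort -rn | head
```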
-
Well, again to clarify: the IPv6 with 411 connections in that image goes in ipv6.txt, and 10.0.0.225 goes in ipv4.txt. If you have done port forwarding and your local ORPort is different from your public ORPort, the local one goes there too. You don't have to restart either. If you run multi.sh and we're right, those connections will start to disappear one by one. If you see that happen, then it's working.
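If it helps, the two files would then contain something like the following. This is hypothetical: the IPv6 address is a documentation placeholder, 9001 stands in for your ORPort, and the exact line format (with or without the port) should match whatever the script's README specifies:

```
# ipv4.txt - local destination address, plus the local ORPort
# if it differs from the public one
10.0.0.225 9001

# ipv6.txt - replace with your real public IPv6 address
2001:db8::1 9001
```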
-
It works perfectly! The only items in the list are loopbacks and one other, which is another machine I'm running. Thanks so much for your help.
-
My pleasure. Glad it worked out. Eventually, once you run conntrack.sh, in the top section where it shows IPs with more than 2 connections, you should see your own IPs and two or three IPs from snowflake.
-
Hey! I'm having a strange issue where I have many worker threads (NumCPUs 24) that simply idle and do mostly nothing, yet the main process eats an entire core and still throws many overload messages, dropping NTor onion skins while the worker threads use basically no CPU. Also possibly worth mentioning: after updating to v5, I let it run for a day and checked compare.sh, yet no relays were in the list. htop screenshot below:
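For anyone hitting the same thing, a quick way to confirm the drops from the logs (a sketch; assumes Tor runs under systemd as a service named tor, and the exact log wording varies between Tor versions):

```bash
# Look for overload and onion-skin-drop indicators in the last day of logs.
journalctl -u tor --since "1 day ago" \
  | grep -Ei 'overload|onion ?skin|too slow|circuit creation'
```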