Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very slow linux download speed, excessive disk and cpu usage #83

Open
ghost opened this issue Sep 4, 2015 · 8 comments
Open

Very slow linux download speed, excessive disk and cpu usage #83

ghost opened this issue Sep 4, 2015 · 8 comments
Labels

Comments

@ghost
Copy link

ghost commented Sep 4, 2015

https://github.com/spring/pr-downloader/blob/master/src/Downloader/Rapid/Sdp.cpp#L197-L205

This closes each downloaded file, then reopens/rereads and checks md5:

bool CFileSystem::fileIsValid(const FileData* mod, const std::string& filename) const
{
HashMD5 md5hash;
int bytes;
unsigned char data[IO_BUF_SIZE];
FILE* f= propen(filename.c_str(), "rb");
gzFile inFile = gzdopen (fileno(f), "rb");

Since the curl write calls are sequential:

curl_easy_setopt(curlw->GetHandle(), CURLOPT_WRITEFUNCTION, write_streamed_data);

... this blocks the download loop until each file is opened, written to, closed, re-opened, loaded in a gzip file object, md5 checked and closed again.

There are much better/faster ways to check the files in-memory, during streamer download, by chunks, using gzip, md5 and crc objects - crc might be the fastest and the sdp provides such data, python example (though thoroughly untested early code that's really unstable):

https://github.com/serg9/rapid_nonet/blob/master/receive_loop.py#L118-L146

There's only one file close, no-rereading the file from disk, using zlib.decompressobj and binascii.crc32, they are meant for on-the-fly stream decompression, there have to be c++ ways to do it too.

Memory usage is not a problem since each received chunk fits comfortably in RAM. I don't know if I have some disk issues or it's my current fstab setup, but the current version of pr-downloader is using my disk to 100% and the download speed is ~100-500kbps max (since it waits for disk). In-ram decompression+crc check is substantially faster on my machine.

For me, this makes it unusable under Linux, might also be related to how disk caching is handled, but I didn't look into that because python kind of handles that on its own.

I can't replicate the crash I told you about now but I'll keep trying.

If not enough linux users get this, it should be marked low priority (I have special mounts).

strace http://pastebin.com/f0HEJ1X6
with time of the day http://pastebin.com/dV6070GX , this makes the sluggishness obvious.

@abma abma added the feature label Sep 4, 2015
@abma
Copy link
Contributor

abma commented Sep 4, 2015

a "cold start" here:

$ time pr-downloader ba:stable
...
real 0m27.141s
user 0m2.196s
sys 0m1.140s

with a debug build of pr-downloader (which outputs to console A LOT) to a normal hard-drive (no ssd).

thats 1,37MB/s (i've a 16MBit connection, ~1.6MB is the max speed).

i agree that i/o can be heavily optimized, but i don't have this issue. Is your harddisk maybe broken or starting to die?

@abma
Copy link
Contributor

abma commented Sep 4, 2015

If not enough linux users get this, it should be marked low priority (I have special mounts).

what params do you use to mount the disk where pr-downloader stores its files? (usally ~/.spring/)

also side note: safety was chosen over speed: currently when pr-downloader crashes / is killed it shouldn't leave broken files on disk which are read/used by spring. (with some enhancements, the same can be accomplished with insane speed :) )

very likely the cause of the speed drop is fsync()...

@ghost
Copy link
Author

ghost commented Sep 4, 2015

Hi,

I'll test this on multiple drives, it's the only program that's giving me headaches on this disk though.
Each fsync call takes ~0.10 seconds judging from that strace, that's a minute and a half for each thousand files extra, but there's something else at play, since:

Latest Ubuntu:

$ time ./pr-downloader ba:stable
pr-downloader 0.7-208-g690d0e2
[Info] Using filesystem-writepath: /home/user/.spring
[Info] Found 19 repos in /home/user/.spring/rapid/repos.springrts.com/repos.gz
[Info] opened /home/user/.spring/rapid/packages.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/s44.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/swiw.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/evo.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/zk.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/techa.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/jauria.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/chiliui.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/lups.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/modelshaders.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/jw.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/tard.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/i18n.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/jrtsc.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/feature-placer.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/spring-features.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/mcl.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/area17.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/devgame.repo.springrts.com/versions.gz
[Progress] 0% [ ] 0/0
[Download] Balanced Annihilation V9.06
[Info] opened /home/user/.spring/packages/76a5494d0fcac5cc512c0f98a30cc0a5.sdp.tmp
[Progress] 0% [ ] 0/0
[Info] Using rapid
[Info] http://packages.springrts.com/streamer.cgi?76a5494d0fcac5cc512c0f98a30cc0a5
[Progress] 100% [==============================] 25637824/25637824
[Info] Download complete!

real 6m36.453s
user 0m1.532s
sys 0m0.976s

That's clearly waiting on IO (and the network is just fine)

Gentoo in chroot:

time ./pr-downloader ba:stable
pr-downloader 0.7-208-g690d0e2
[Info] Using filesystem-writepath: /home/user/.spring
[Info] opened /home/user/.spring/rapid/repos.springrts.com/repos.gz
[Progress] 100% [==============================] 229/229
[Info] Found 19 repos in /home/user/.spring/rapid/repos.springrts.com/repos.gz
[Info] opened /home/user/.spring/rapid/packages.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/s44.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/swiw.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/evo.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/zk.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/techa.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/jauria.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/chiliui.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/lups.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/modelshaders.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/jw.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/tard.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/i18n.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/jrtsc.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/feature-placer.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/spring-features.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/mcl.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/area17.repo.springrts.com/versions.gz
[Info] opened /home/user/.spring/rapid/devgame.repo.springrts.com/versions.gz
[Progress] 100% [==============================] 48952/48952 1
[Download] Balanced Annihilation V9.06
[Info] opened /home/user/.spring/packages/76a5494d0fcac5cc512c0f98a30cc0a5.sdp.tmp
[Progress] 100% [==============================] 122787/122787
[Info] Using rapid
[Info] http://packages.springrts.com/streamer.cgi?76a5494d0fcac5cc512c0f98a30cc0a5
[Progress] 100% [==============================] 25637824/25637824
[Info] Download complete!

real 1m50.288s
user 0m1.716s
sys 0m0.788s

It still crunches the disk/cpu, but it's considerably better.

The setup I have could force most syncs to be "real" disk syncs so that's one thing, I need to change kernels to figure out what's going on because I didn't expect it to be that much lower on gentoo.

With the in-memory check and no syncs on python, download time is pretty much limited to bandwidth, so I'm getting < 10 seconds there, but this is too weird so I'll check kernel/SMART/sata settings.

@cleanrock
Copy link
Contributor

Fyi, its fine on my archlinux system:
$ time ./src/pr-downloader --filesystem-writepath dl_dir ba:stable
pr-downloader 0.7-224-g313b6b9
[Download] Balanced Annihilation V9.07
real 0m16.855s
user 0m0.507s
sys 0m0.323s

@ghost
Copy link
Author

ghost commented Sep 6, 2015

Thanks for that, my best time so far was 1:50 on a different setup with another hdd (but this one isn't new either), even with noatime, either way it's much better than 6 minutes. I don't have new hdds to test this with and no ssd either.

strace -C -tt pr-downloader ba:stable 

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 86.37    0.088000          20      4317           fsync
  3.38    0.003447           0      9279           write
  2.78    0.002830           1      4452           rename
  2.70    0.002750           1      4369           munmap
  1.46    0.001483           0     13105      4338 close
  1.06    0.001077           0     10961           read
  1.04    0.001061           0      8722           open
  0.60    0.000614           0      4469           lseek
  0.25    0.000259           0      4407           fstat
  0.25    0.000253           0     15020     13438 stat
  0.05    0.000050           0      7671           rt_sigaction
  0.02    0.000025           0      4142           poll
  0.02    0.000024           0      4413           mmap
  0.01    0.000012           0      2006           recvfrom
  0.00    0.000000           0        32           mprotect
  0.00    0.000000           0        50           brk
  0.00    0.000000           0        22           rt_sigprocmask
  0.00    0.000000           0        44           ioctl
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0        62           select
  0.00    0.000000           0        44           alarm
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0        45           socket
  0.00    0.000000           0        44        22 connect
  0.00    0.000000           0        22           sendto
  0.00    0.000000           0        22           getsockname
  0.00    0.000000           0        22           getpeername
  0.00    0.000000           0        22           getsockopt
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        44           fcntl
  0.00    0.000000           0       280           mkdir
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0        22           sendmmsg
------ ----------- ----------- --------- --------- ----------------
100.00    0.101885                 98116     17799 total

@abma
Copy link
Contributor

abma commented Sep 6, 2015

@serg9:

can you test this please?

mkdir -p /tmp/test && time pr-downloader --filesystem-writepath /tmp/test ba:stable && time rsync -a /tmp/test /tmp/test2

the fsync is/was there because when developing for several times i got 0 byte files.

@abma
Copy link
Contributor

abma commented Sep 6, 2015

also, can you please provide your mount flags?

@ghost
Copy link
Author

ghost commented Sep 6, 2015

/dev/sda1 on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)

Linux computer 3.19.0-26-generic #28-Ubuntu SMP Tue Aug 11 14:16:32 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
mkdir -p /tmp/test && time ./pr-downloader --filesystem-writepath /tmp/test ba:stable && time rsync -a /tmp/test /tmp/test2
pr-downloader 0.7-208-g690d0e2
...
[Download] Balanced Annihilation V9.07
[Info] http://packages.springrts.com/streamer.cgi?32f43f1a18d3efe18070b7b174429244
...

real 1m54.791s
user 0m1.624s
sys 0m0.908s

real 0m0.315s
user 0m0.212s
sys 0m0.212s

Interesting rsync data:

rsync -a rsync://ftp... (local fast gentoo portage mirror) .
-- get one from https://www.gentoo.org/support/rsync-mirrors/

real 0m53.743s
user 0m5.136s
sys 0m16.024s

rsync full portage dir from one drive to another:
(I think there's some caching involved here because it's too low)
real 0m26.400s
user 0m3.592s
sys 0m7.472s

and back again ...
real 0m50.820s
user 0m4.532s
sys 0m10.684s

total portage files
find | wc-l
183007

both drivers are:
relatime,errors=remount-ro,data=ordered
(I think it's default in Ubuntu)

Similar rsync numbers from:
Linux localhost 4.1.6-gentoo #3 SMP Wed Aug 26 18:01:32 EEST 2015 x86_64

back to pr-downloader/ubuntu, 4gb test partition, ba:stable:

  • mkfs.vfat...

real 1m10.029s
user 0m1.860s
sys 0m1.172s

  • mkfs.ext2...

default mount options

real 0m55.237s
user 0m1.872s
sys 0m0.692s

+noatime

real 1m2.008s
user 0m1.688s
sys 0m1.816s

  • mkfs.ext3

real 2m11.110s
user 0m1.700s
sys 0m0.932s

+commit=120,noatime data=ordered

real 2m14.939s
user 0m1.704s
sys 0m1.036s

+noatime,data=writeback

real 1m38.727s
user 0m1.620s
sys 0m0.960s

  • mkfs.ext4, default mount is relatime,data=ordered

real 1m58.384s
user 0m1.752s
sys 0m0.968s

noatime,data=writeback,commit=120

real 1m52.160s
user 0m1.748s
sys 0m0.816s

Git clones (but it looks like the network is the bottleneck here)

time git clone https://github.com/springlobby/springlobby
Cloning into 'springlobby'...
remote: Counting objects: 66539, done.
remote: Compressing objects: 100% (54/54), done.
remote: Total 66539 (delta 32), reused 0 (delta 0), pack-reused 66485
Receiving objects: 100% (66539/66539), 23.50 MiB | 128.00 KiB/s, done.
Resolving deltas: 100% (52032/52032), done.
Checking connectivity... done.

real 2m49.751s
user 0m5.768s
sys 0m1.832s

other drive
real 2m15.132s
user 0m5.460s
sys 0m1.432s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants