Improve performance for large files #355

Open
mtlynch opened this issue Dec 10, 2022 · 3 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@mtlynch
Owner

mtlynch commented Dec 10, 2022

Reddit user /u/bdpna reports that PicoShare is taking 10 hours to upload a 7 GB file to a well-resourced server (Supermicro 4U, dual Xeon CPUs, 64 GB RAM). PicoShare doesn't seem to be pegging the CPU or RAM.

They tried increasing the buffer size of SQLite data entries, but that had negligible or negative impact on performance.
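
For context, a minimal sketch of what chunked writes into SQLite look like in Go. This is not PicoShare's actual code; the `entries_data` schema, file paths, and chunk size are assumptions for illustration. The chunk size is the "buffer size" knob referred to above.

```go
package main

import (
	"database/sql"
	"io"
	"log"
	"os"

	_ "github.com/mattn/go-sqlite3"
)

// chunkSize is the amount of file data stored per SQLite row. This value is
// illustrative only; it stands in for the "buffer size" knob mentioned above.
const chunkSize = 512 * 1024

// insertFileChunks streams a reader into SQLite one chunk per row inside a
// single transaction. The entries_data schema is a hypothetical stand-in,
// not PicoShare's actual schema.
func insertFileChunks(db *sql.DB, entryID string, r io.Reader) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback()

	stmt, err := tx.Prepare(
		`INSERT INTO entries_data (entry_id, chunk_index, chunk) VALUES (?, ?, ?)`)
	if err != nil {
		return err
	}
	defer stmt.Close()

	buf := make([]byte, chunkSize)
	for idx := 0; ; idx++ {
		n, readErr := io.ReadFull(r, buf)
		if n > 0 {
			if _, err := stmt.Exec(entryID, idx, buf[:n]); err != nil {
				return err
			}
		}
		if readErr == io.EOF || readErr == io.ErrUnexpectedEOF {
			break
		}
		if readErr != nil {
			return readErr
		}
	}
	return tx.Commit()
}

func main() {
	db, err := sql.Open("sqlite3", "/data/store.db") // path is an assumption
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS entries_data (
		entry_id TEXT, chunk_index INTEGER, chunk BLOB)`); err != nil {
		log.Fatal(err)
	}

	f, err := os.Open("/tmp/example-upload") // placeholder input file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := insertFileChunks(db, "example-entry", f); err != nil {
		log.Fatal(err)
	}
}
```

Wrapping all chunk inserts in one transaction matters more than the chunk size itself: committing per chunk forces an fsync per write, which is a common cause of slow bulk inserts in SQLite.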

Other ideas:


After another year of using PicoShare on a 1x shared CPU, 256 MB RAM Fly.io instance, PicoShare seems to perform well on files below 1 GB. Above that, it sometimes crashes when receiving or serving files.


Update (2024-03-16)

As a test to see whether there was an inherent size limit in PicoShare, I spun up a Scaleway PRO2-L (32 CPU / 128 GB RAM), and I was able to upload an 11 GB file fine:

[screenshot: the 11 GB file upload completing successfully]

I'm sure there are ways to make PicoShare more efficient so that larger files work on smaller servers, but PicoShare demonstrably supports files up to 11 GB.

@mtlynch added the enhancement and help wanted labels on Dec 10, 2022
@joost00719

joost00719 commented Aug 17, 2023

I am also having performance issues. The upload is stuck on "File uploaded! Processing..." for a very long time when the data directory is mounted on an NFS share.
The initial upload goes really quickly (I have no clue where it temporarily stores the file), and then the processing step sends it to my NFS share, which is on the same physical machine but in a different VM. The speed is about 12-20 MB/s, while I can get 300 MB/s or more without PicoShare.

I assume the file is written to a temp directory first, and once it's fully uploaded, the processing step is just a copy into the SQLite database. The /data directory is on an NFS share, and that's what causes the slow write speed.

I'd rather see a direct upload to the /data folder (maybe via a /data/tmp folder) and not have the file stored in the SQLite database at all; a path reference would be plenty, I assume. That would require a cleanup routine: if a file isn't in the database, delete it from disk, and if it's not on disk, handle it on the front end (e.g., show a "file not found" error).
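
A rough sketch of what such a cleanup routine could look like, assuming a hypothetical layout where each entry's data lives under /data/uploads/<id> and an `entries` table holds only metadata (neither exists in PicoShare today):

```go
package main

import (
	"database/sql"
	"log"
	"os"
	"path/filepath"

	_ "github.com/mattn/go-sqlite3"
)

// uploadsDir and the entries table are hypothetical; each entry's data is
// assumed to live at /data/uploads/<id>.
const uploadsDir = "/data/uploads"

// pruneOrphans deletes files on disk that have no matching row in the
// database. The reverse case (a row whose file is missing) would be handled
// at download time by returning a "file not found" error to the front end.
func pruneOrphans(db *sql.DB) error {
	files, err := os.ReadDir(uploadsDir)
	if err != nil {
		return err
	}
	for _, f := range files {
		var count int
		if err := db.QueryRow(
			`SELECT COUNT(*) FROM entries WHERE id = ?`, f.Name()).Scan(&count); err != nil {
			return err
		}
		if count == 0 {
			log.Printf("removing orphaned file %s", f.Name())
			if err := os.Remove(filepath.Join(uploadsDir, f.Name())); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	db, err := sql.Open("sqlite3", "/data/store.db") // path is an assumption
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := pruneOrphans(db); err != nil {
		log.Fatal(err)
	}
}
```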

@mtlynch
Owner Author

mtlynch commented Aug 17, 2023

Thanks for reporting this, @joost00719!

Unfortunately, SQLite isn't going to work well over an NFS mount, so even if performance were better, you'd risk data corruption.

At this point, I'm unlikely to move the data out of SQLite. Keeping all data + metadata in SQLite means that Litestream handles all replication for us. If we stored the data as files, we'd have to manage our own replication logic.

I understand that this is an unusual design choice, but it's the design that works best for my use case, and I optimize PicoShare for the scenarios I use.

@bumgarb

bumgarb commented Jan 10, 2025

I'm not sure if this will be helpful for this issue.
I'm running PicoShare on a Synology DS1520+ with 8 GB RAM.
I have 1 Gb symmetric internet, and hourly speed tests during this upload showed over 900 Mbps.
The uploader has a 300 Mbps symmetric connection, but I don't know the conditions on their end at the time of the upload.

I had someone upload a 194.87 GB file to a Guest Link. They said they did not get any "valid" confirmation but instead got what looked like an error: a bunch of red text down the page, in what looked like HTML to them. They did not send a screenshot.

I could tell something was processing on my end as I could see the database slowly increasing in size. I stopped watching it around 5pm. I could see the file listed in PicoShare > Files this morning.

However, I cannot download it from PicoShare directly or through the generated links that I can see in my PicoShare admin console. I also got a red-text error when attempting to delete the file, saying essentially that the deletion failed. However, after about an hour, I noticed the file no longer appeared in PicoShare > Files.

I'm not sure how to handle such a large file more successfully.
Also, the PicoShare database is still over 194 GB in size.
Will a prune or purge happen at some point to decrease the size of my database, now that the file no longer appears in the list? Thanks!
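
For what it's worth, deleting rows never shrinks a SQLite database file on its own; the freed pages are only reused for future writes. Reclaiming the ~194 GB takes an explicit VACUUM (or incremental auto_vacuum, if it was enabled when the database was created). A rough one-off sketch, assuming the database lives at /data/store.db and the server is stopped while it runs:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	// The database path is an assumption; point this at PicoShare's actual
	// data file. VACUUM rewrites the database into a new file, so expect it
	// to need roughly the database's size in free disk space while it runs.
	db, err := sql.Open("sqlite3", "/data/store.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// VACUUM drops the free pages left behind by deleted rows, which is
	// what actually returns the space to the filesystem.
	if _, err := db.Exec("VACUUM"); err != nil {
		log.Fatal(err)
	}
}
```

The sqlite3 command-line shell can do the same thing by running a single `VACUUM;` statement against the file.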
