I would like to train dictionaries on datasets that are many gigabytes in size and sometimes consist of millions of files. I keep them in 7z archives and decompress them linearly on the fly with SharpCompress. It would be awesome to feed that data straight into dictionary training via a custom stream or something similar; a sketch of how I read the data today follows.
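For context, this is roughly my current reading loop. The SharpCompress calls are real API; the buffering at the end is the part I'd like a stream-based trainer to make unnecessary:

```csharp
using System.IO;
using SharpCompress.Archives.SevenZip;
using SharpCompress.Readers;

// ExtractAllEntries() walks a solid 7z archive linearly, decompressing
// in a single pass instead of seeking per entry.
using var archive = SevenZipArchive.Open("samples.7z");
using var reader = archive.ExtractAllEntries();
while (reader.MoveToNextEntry())
{
    if (reader.Entry.IsDirectory)
        continue;

    using var sample = reader.OpenEntryStream();
    using var buffer = new MemoryStream();
    sample.CopyTo(buffer);
    // buffer.ToArray() is what an array-based training API forces on me;
    // ideally `sample` could be handed to the trainer directly.
}
```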
As it stands, the function that does the training copies the sample array into a stream internally, which for large datasets wastes memory when the caller could supply that stream directly. The function that accepts the stream sits in a class marked internal, so I can't call it myself.
Ideally I'd love to be able to load a 7z archive holding hundreds of gigabytes of data and stream it into dictionary training without running out of RAM, because I'd rather not keep that many loose files on my hard drive, wasting space and file-system resources.
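To make the ask concrete, the kind of entry point I'm imagining might look like the sketch below. To be clear, this is purely hypothetical: `TrainFromStreams` does not exist in the library today; it only shows the shape where samples arrive lazily as streams rather than as fully materialized arrays.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical API sketch -- nothing here exists in the library today.
// The point is that samples arrive lazily as streams, so memory use is
// bounded by one sample at a time rather than the whole dataset.
public static class DictBuilderSketch
{
    public static byte[] TrainFromStreams(
        IEnumerable<Stream> samples,   // e.g. 7z entry streams yielded one by one
        int dictCapacity)              // target dictionary size in bytes
    {
        throw new NotImplementedException("API shape only");
    }
}
```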