Why drop async/await and drop ObjectStore?
We want GetRanges to return each byte range as soon as that byte range is ready (instead of waiting until all byte ranges are ready). A problem with the simple "tokio + rayon" code in "Try interleaving compute with IO" (#37) is that store.get_ranges(filename, ranges).await will block until all ranges are available.
Tokio is a relatively heavy-weight dependency. Recent flamegraphs show tokio taking up a lot of the runtime.
object_store_adaptor.rs is actually quite heavyweight, and introduces two different types of Operation.
In the future, I'd still love to extend LSIO to support Windows I/O Rings and macOS kqueue. But those platforms probably have similar abstractions to io_uring. And there's little point bogging the code down right now with unnecessary abstractions.
it'd be fairly easy to create a new adaptor for async
How?
Basically, what's already proposed in #61. LSIO would have a threadpool. The user would send operations to LSIO's threadpool via a channel.
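The threadpool-plus-channel idea can be sketched with nothing but the standard library. This is a minimal illustration, not LSIO's actual design: `Operation`, `Output`, and `spawn_pool` are hypothetical names, and an in-memory byte buffer stands in for a file read through io_uring.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical operation type; LSIO's real `Operation` may look different.
pub enum Operation {
    /// Read `len` bytes starting at `offset`.
    GetRange { offset: usize, len: usize },
}

/// Hypothetical result type: the requested offset plus the bytes that were read.
pub struct Output {
    pub offset: usize,
    pub bytes: Vec<u8>,
}

/// Start `n_workers` threads that take operations from one channel and push
/// each result onto a second channel as soon as it is ready.
/// `data` is an in-memory stand-in for a file (assumption for this sketch);
/// out-of-bounds ranges will panic the worker.
pub fn spawn_pool(
    n_workers: usize,
    data: Arc<Vec<u8>>,
) -> (Sender<Operation>, Receiver<Output>, Vec<JoinHandle<()>>) {
    let (op_tx, op_rx) = mpsc::channel::<Operation>();
    let (out_tx, out_rx) = mpsc::channel::<Output>();
    // std's Receiver is single-consumer, so the workers share it behind a Mutex.
    let op_rx = Arc::new(Mutex::new(op_rx));

    let handles: Vec<JoinHandle<()>> = (0..n_workers)
        .map(|_| {
            let op_rx = Arc::clone(&op_rx);
            let out_tx = out_tx.clone();
            let data = Arc::clone(&data);
            thread::spawn(move || loop {
                // The lock guard is a temporary, released at the end of this statement.
                let op = op_rx.lock().unwrap().recv();
                match op {
                    Ok(Operation::GetRange { offset, len }) => {
                        let bytes = data[offset..offset + len].to_vec();
                        if out_tx.send(Output { offset, bytes }).is_err() {
                            break; // the user dropped the output receiver
                        }
                    }
                    Err(_) => break, // every operation sender has been dropped
                }
            })
        })
        .collect();

    (op_tx, out_rx, handles)
}
```

A caller would send `Operation` values on the returned sender, drop the sender when finished, and iterate over the receiver; each `Output` arrives as soon as its worker completes, without waiting for the other ranges.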
Why completely drop async/await? Why not stay async and emit a Stream of completed operations?
Reasons for dropping async/await completely:
TL;DR: AFAICT, async is not necessary for this use-case, and async only adds computational overhead and code complexity 🙂
The io_uring loading code isn't async: it's a single thread right now, and it'll soon be a threadpool. The compute code won't be async (because we don't want to block an async thread). Yes, we could have a Stream connecting the IO and compute. But why add async glue between the two halves if it's not necessary?
Reasons for using Stream:
The StreamExt trait has some handy methods which, at first glance, might make it easy to compose steps of computation. But we can't run blocking computation in the async threadpool!
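To make the "no async glue" argument concrete, here is a rough sketch of an IO half and a compute half joined only by a standard-library channel. Everything in it is an assumption made for illustration: `load_and_compute` and `expensive_compute` are hypothetical names, the chunks are already in memory rather than loaded via io_uring, and a plain thread stands in for a Rayon pool.

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for CPU-heavy, blocking work such as decompression (here: a byte sum).
fn expensive_compute(chunk: &[u8]) -> u64 {
    chunk.iter().map(|&b| u64::from(b)).sum()
}

/// Run a loader thread and a compute thread side by side, joined only by a channel.
/// Returns one (chunk_index, result) pair per chunk.
pub fn load_and_compute(chunks: Vec<Vec<u8>>) -> Vec<(usize, u64)> {
    let (tx, rx) = mpsc::channel::<(usize, Vec<u8>)>();

    // IO half: in LSIO this would be the io_uring thread(s). Here it simply
    // forwards in-memory chunks one at a time, as each becomes "ready".
    let loader = thread::spawn(move || {
        for (i, chunk) in chunks.into_iter().enumerate() {
            if tx.send((i, chunk)).is_err() {
                break; // the receiver has gone away
            }
        }
        // `tx` is dropped here, which ends the receiver's iteration.
    });

    // Compute half: an ordinary thread, so blocking work is fine.
    // It starts on the first chunk while later chunks are still being sent.
    let computer = thread::spawn(move || {
        rx.iter()
            .map(|(i, chunk)| (i, expensive_compute(&chunk)))
            .collect::<Vec<_>>()
    });

    loader.join().expect("loader thread panicked");
    computer.join().expect("compute thread panicked")
}
```

Neither half needs an async runtime: the loader blocks on IO, the compute thread blocks on the channel, and the channel is the only coupling between them.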
JackKelly changed the title from "Drop ObjectStore & async/await. Use Channels / async iterators instead." to "Drop ObjectStore & async/await. Use Channels / async iterators instead. Focus entirely (for now) on io_uring for local file storage." on Feb 28, 2024
JackKelly changed the title from "Drop ObjectStore & async/await. Use Channels / async iterators instead. Focus entirely (for now) on io_uring for local file storage." to "Drop ObjectStore & async/await. Use Channels instead. Focus entirely (for now) on io_uring for local file storage." on Feb 28, 2024
Use-cases to consider
UPDATE: Use-cases are moved to issue #104.
TODO
… IoUring instance with each thread? Can we publish tasks to other threads? #103
… async, and instead only using channels and Rayon)? #104
… Item? #105
… async code
… Item returned from the threadpool needs a user-defined identifier.
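One way the user-defined identifier could work is for the caller to attach an ID to each request and get the same ID back with the result, so out-of-order results can still be matched up. This is only a sketch under assumptions: `Request`, `Item`, and `serve` are hypothetical definitions written for this example (the real `Item` type is not shown here), and an in-memory buffer stands in for file IO.

```rust
use std::ops::Range;
use std::sync::{mpsc, Arc};
use std::thread;

/// Hypothetical request: a byte range plus an identifier chosen by the caller.
pub struct Request<Id> {
    pub id: Id,
    pub range: Range<usize>,
}

/// Hypothetical `Item`: the caller's identifier travels back with the bytes.
pub struct Item<Id> {
    pub id: Id,
    pub bytes: Vec<u8>,
}

/// Serve each request on its own thread, so items may arrive in any order.
/// `data` is an in-memory stand-in for a file (assumption for this sketch).
pub fn serve<Id: Send + 'static>(
    data: Arc<Vec<u8>>,
    requests: Vec<Request<Id>>,
) -> mpsc::Receiver<Item<Id>> {
    let (tx, rx) = mpsc::channel();
    for req in requests {
        let tx = tx.clone();
        let data = Arc::clone(&data);
        thread::spawn(move || {
            let bytes = data[req.range].to_vec();
            // Ignore the error if the receiver has already been dropped.
            let _ = tx.send(Item { id: req.id, bytes });
        });
    }
    // The original `tx` is dropped on return, so the receiver's iteration
    // ends once every worker thread has sent its item.
    rx
}
```

Because `Id` is generic, the caller can use whatever is convenient (an index, a chunk coordinate, a file path) and collect the results into a map keyed by that ID.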