Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace 'tar' crate with 'tokio-tar' #3202

Merged
merged 5 commits into from
Jan 3, 2023
Merged

Replace 'tar' crate with 'tokio-tar' #3202

merged 5 commits into from
Jan 3, 2023

Conversation

hlinnaka
Copy link
Contributor

The synchronous 'tar' crate has required us to use block_in_place and SyncIoBridge to work together with the async I/O in the client connection. Switch to 'async-tar' crate that uses async I/O natively.

As part of this, move the CopyDataWriter implementation to postgres_backend_async.rs. Even though it's only used in one place currently, it's in principle generally applicable whenever you want to use COPY out.

@hlinnaka hlinnaka requested review from a team as code owners December 23, 2022 14:12
@hlinnaka hlinnaka requested review from knizhnik, SomeoneToIgnore, koivunej and a team and removed request for a team and knizhnik December 23, 2022 14:12
@hlinnaka
Copy link
Contributor Author

'async-tar' is a fork of the 'tar' crate, just replacing sync functions with corresponding async ones. I'm not sure how well maintained it is, but a crate like this doesn't really need much change or maintenance. If we go with this, it might be good to take a close look at 'async-tar' and compare if it's missing any fixes that have been made on 'tar' crate.

I'm surprised by how many dependencies the 'async-tar' crate pulls in. It depends on 'async-std'; is that really necessary, or could we update 'async-tar' to rely on the corresponding std functionality instead?

@SomeoneToIgnore
Copy link
Contributor

It depends on 'async-std'

Let's not pull that in then, it's yet another async runtime and we have tokio for that.
Could something like https://crates.io/crates/tokio-tar be a better choice?

@hlinnaka
Copy link
Contributor Author

It depends on 'async-std'

Let's not pull that in then, it's yet another async runtime and we have tokio for that.

I see.

Could something like https://crates.io/crates/tokio-tar be a better choice?

Tried that now. There's one problem with it:

error[E0477]: the type `AbortableWrite<'a, W>` does not fulfill the required lifetime
  --> pageserver/src/basebackup.rs:44:9
   |
44 |     ar: Builder<AbortableWrite<'a, W>>,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
note: type must satisfy the static lifetime as required by this binding
  --> /home/heikki/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-tar-0.3.0/src/builder.rs:15:46
   |
15 | pub struct Builder<W: Write + Unpin + Send + 'static> {
   |                                              ^^^^^^^

tokio_tar::Builder requires the writer to be 'static. The reason for that is that it has a mechanism to write the tar EOF block (1024 bytes of zeros), if you just drop the Builder without calling finish. The plain tar crate has that too, and it's straightforward there: the Drop implementation writes the EOF marker. In the async_tar, the Drop implementation uses block_on to do the same. But in tokio_tar, it uses a channel and a separately spawned task to do it.

That's hacky IMHO. The EOF-block-writing-task executes at some not-well defined time after the drop has happened. And it's super annoying for us, because we actually don't want the EOF marker to be written at all. We actually work hard to skip it, that's exactly why we have the AbortableWrite hack.

So ideally tokio-tar just didn't have that Drop implementation, and then it wouldn't require 'static, and we could also remove AbortableWrite. I'm tempted to fork it and do that. Perhaps the upstream project would be interested in a PR for that too; IMHO it's a bad idea to write the EOF block on drop anyway.

@hlinnaka
Copy link
Contributor Author

Switched to a modified version of 'tokio-tar' without the Drop implementation and the 'static requirement.

@hlinnaka hlinnaka changed the title Replace 'tar' crate with 'async-tar' Replace 'tar' crate with 'tokio-tar' Dec 23, 2022
Copy link
Contributor

@SomeoneToIgnore SomeoneToIgnore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vorot93/tokio-tar@master...hlinnaka:tokio-tar:no-termination-on-drop
As a tokio-tar maintainer, I would not want to accept a PR that could break its users' workflow just by the version bump, silently.

I would imagine, more PR-friendly way would be to define some extra field in the Builder, that allows disabling archive termination and use that in Drop.


Otherwise, I like how straightforward most of this PR is, including the old sync code transition.

Copy link
Member

@koivunej koivunej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't find anything wrong in this aside from @SomeoneToIgnore's comments. Happy to have clearer stack traces with one more block_in_place gone :)

@koivunej
Copy link
Member

koivunej commented Dec 27, 2022

Looking more, vorot93/tokio-tar#9 does look quite important in case the work not completing on one poll, however that's only on the read path which we don't use.

Looking at tokio-tar the AsyncWrite usage of slices is not cancellation safe. I doubt know if it was cancellation safe before, but now it definitely is not. I think however the only place it's cancelled is with the shutdown waiter, and then the writer is not restarted back again. Maybe a comment or two could be added about this near... I'll try to find that callsite. It doesn't seem that there is any shutdown cancellation at the point of importing basebackup.

The synchronous 'tar' crate has required us to use block_in_place and
SyncIoBridge to work together with the async I/O in the client
connection. Switch to 'tokio-tar' crate that uses async I/O natively.

As part of this, move the CopyDataWriter implementation to
postgres_backend_async.rs. Even though it's only used in one place
currently, it's in principle generally applicable whenever you want
to use COPY out.

Unfortunately we cannot use the 'tokio-tar' as it is: the Builder
implementation requires the writer to have 'static lifetime. So we
have to use a modified version without that requirement. It was
required just for the Drop implementation that writes the
end-of-archive sections if the Builder is dropped without calling
`finish`. But we don't actually want that behavior anyway; in fact
we had to jump through some hoops with the AbortableWrite hack to skip
those. With the modified version of 'tokio-tar' without that Drop
implementation, we don't need AbortableWrite either, but I kept that
for now.
@SomeoneToIgnore
Copy link
Contributor

We don't use that Drop path indeed, and I've created a PR that should explicitly help us with that: neondatabase/tokio-tar#1 , there are more details about the upstreaming activities, let's see if they ever get released.
For now, out fork can do that: #3239

Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I trust the transformations from sync to async were mostly mechanical / done carefully. Didn't review those.

The tokio-tar PR has been merged into our fork. Hence it's OK to use it in this PR now.

I scanned for FIXMEs and started conversations. Please address them, then let's get this merged.

Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash before merge

@hlinnaka hlinnaka merged commit 0a0e55c into main Jan 3, 2023
@hlinnaka hlinnaka deleted the async-tar branch January 3, 2023 10:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants