Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cats-effect 3.x implementation #1196

Merged
merged 9 commits into from
Sep 13, 2022
Merged

Cats-effect 3.x implementation #1196

merged 9 commits into from
Sep 13, 2022

Conversation

m50d
Copy link
Contributor

@m50d m50d commented Aug 10, 2022

See discussion in #1127

I'm not sure if it's complete, but it covers the main/obvious cases. Propagate the context on launching a new fiber (in the fiber itself), make it current for the duration of the run method, clear it from the current thread on suspend and restore (from what was originally saved) on resume.

@m50d m50d changed the title Rough cats-effect 3.x implementation Cats-effect 3.x implementation Aug 16, 2022
@m50d
Copy link
Contributor Author

m50d commented Aug 16, 2022

Is restoring this fiber's original context on resume good enough, or will that lead to lost updates? (i.e. do we need to store the updated current context on suspend?)

@m50d
Copy link
Contributor Author

m50d commented Aug 16, 2022

Looks like one of the 3 test runs failed - maybe the scala 3 one? How do I tell?

@ivantopo
Copy link
Contributor

Hey @m50d, thanks for starting this PR! I tried to run the tests locally and noticed that they randomly pass/fail:
image

I'm taking a look at the implementation and IOFiber 🤓 and will update you soon

@ivantopo
Copy link
Contributor

Hey @m50d, I opened a PR to Goodcover's branch with a few changes I had to do to make things work properly. Sorry for pushing changes instead of a proper review but after I finally got it to work I didn't want to touch it at all 😅

Regarding this comment:

Is restoring this fiber's original context on resume good enough, or will that lead to lost updates? (i.e. do we need to store the updated current context on suspend?)

It seems like we only need to restore the context when the Fiber was actually suspended before calling resume, as resume might be called several times without it actually being suspended before 🤷. Good that you caught the calls to schedule* and reschedule*, I never would have thought that a Fiber would have a couple of threads fighting to see who executes it!

It would be awesome if you could merge those changes to your branch and test your applications with it, ensuring the behavior is as expected. I ran many iterations of the tests locally and they passed consistently (via testUntilFailed on SBT), and even managed to keep the scheduler's threads clean with a bit of extra instrumentation so hopefully the instrumentation is good enough 🤞

There are two tiny things I would like to do before merging this to master, though:

  • Move all the instrumentation into a single module as we did with the tapir instrumentation (see Upgrade to stable tapir version #1188) so people don't need to download different artifacts. This should be simple because there is no code for CE2, just configuration. I'll look into it by the end of the week
  • Provide some helper functions for folks to create spans and do context propagation. I liked what you did with the meteredWithSpanCapture function on the tests and I think we should provide a couple of helpers for this

Really looking forward to your feedback on this 🙏 /cc @jatcwang @cmcmteixeira @hughsimpson

@cmcmteixeira
Copy link

Hello!
Code makes sense and the test cover all the scenarios I can think of 👍
Great work everyone 🎉

update the cats effect 3 instrumentation
@ivantopo
Copy link
Contributor

ivantopo commented Sep 1, 2022

Hey @dispalt, thanks for merging the PR! Did you guys have the chance to test these changes and see if it works fine with your apps? /cc @m50d

@hughsimpson
Copy link
Contributor

I've been testing this in a dev server for the last 24 hours or so and it's looking good! Not spotted any issues with it so far. Code seems reasonable. I think the todos suggested by @ivantopo make sense. Awesome work guys 🙏

@m50d
Copy link
Contributor Author

m50d commented Sep 6, 2022

Hi @ivantopo , thank you ever so much for stepping in, and no worries about pushing the changes directly - I think we all share the sentiment of just wanting to get something working to start with :)

I've just tested our application with this (updated) branch and our traces look better and more joined up than ever before, so that all seems great.

I'll get right on those todos today, at least as far as I can; I'm certainly happy to do the module organisation once I understand what has to happen. As for helpers, that sounds like a good idea in principle, but I think if there are any sticking points then we shouldn't hold up the merge, as getting basic context propagation working is the main thing; we can always add more helper functions in a follow-up PR.

Thanks again, really looking forward to having this all running.

@m50d
Copy link
Contributor Author

m50d commented Sep 6, 2022

Having looked at your other PR if you know what you want to do about the module side of things then I'm happy to leave that to you - I don't have strong feelings one way or another beyond making sure the upgrade path is easy.

@ivantopo
Copy link
Contributor

ivantopo commented Sep 6, 2022

Hey @m50d, @hughsimpson, thank you so much for trying this out! I'm really happy to see that it is working 🎉

I'll the project thing cleaned up and merged asap, hoping to release something by the end of this week! I'm not sure if I'll include the function to create Spans or not, but at least we will get the context propagation going 💪

@ivantopo
Copy link
Contributor

ivantopo commented Sep 7, 2022

I was updating the build so it compiles on Scala 3 and got it to compile, but Kanela explodes when running the tests:
image

Weirdly enough, it happens with Scala 3, but works fine with 2.x... I already pinged @dpsoft for some bytecode magic rescue. Hopefully some news will come soon 🙏

@dpsoft
Copy link
Contributor

dpsoft commented Sep 8, 2022

@ivantopo I've detected some differences in the bytecode generated by Scala 3, and in particular this:

Scala 3
image

Scala 2.x
image

I'm not sure why we lost the method parameter annotations, maybe there is a Scalac 3 parameter to preserve the metadata. In the meantime I've rewritten the Advices in plain Java and that does the trick ;)

image

@m50d
Copy link
Contributor Author

m50d commented Sep 9, 2022

I think this is actually a bit too aggressive about storing an empty context for the scheduler. Will see if I can minimize my test case.

@m50d
Copy link
Contributor Author

m50d commented Sep 9, 2022

The config change is definitely necessary (otherwise we can lose the context over a sleep).

I'm not sure whether the change to ContextCleaningWrapper is strictly required, but it seems more correct.

@ivantopo
Copy link
Contributor

Hey folks 👋

It hit me a few days ago that the problem with Scala 3 was already solved by @vaslabs on #1191! We need to add the @static annotation to our advice methods as shown here:
image

I did it locally and worked as expected, you can check the last two commits on this branch. It is a bit annoying that we also need to create a companion class to the advice objects, but at least we don't have to write code in Java 😂

Regarding this addition:

  executor-service {
    within += "cats\\.effect\\.unsafe\\.SchedulerCompanionPlatform.*"
  }

I'm not a fan of it because it means we are instrumenting the same runnable twice: once on the cats-io instrumentation to wrap the context and another from the kamon-executors instrumentation that does exactly the same thing. I would say that whatever we do, we do it in a single instrumentation place. I commented out that inclusion locally and ran the tests several times. All of them succeeded and didn't leave any dirty threads.

It seems correct to remove it but I don't know why it was there in the first place 🤔

@ivantopo
Copy link
Contributor

Hey @dispalt @m50d I opened this to your branch: goodcover#2

Had to merge master in there to get support for Scala 3 (would have preferred rebasing but there were too many commits with conflicts and got all messed up in there 😭)

Let's get those changes in and do a last round of review to get this done 💪

@ivantopo
Copy link
Contributor

BTW, I'm going to leave out the idea of having the instrumentation for CE 2.x and 3.x on the same submodule because there are no CE3 artifacts for Scala 2.11.

Will probably revisit that when we drop support for Scala 2.11

@m50d
Copy link
Contributor Author

m50d commented Sep 12, 2022

Hey @dispalt @m50d I opened this to your branch: goodcover#2

Cool, merged that!

It seems correct to remove it but I don't know why it was there in the first place

I had a test case where I was initialising Kamon partway through a test where it made the difference between propagating the span across a (CE3) sleep and not. But probably initialising Kamon at the start of one's application is more correct :). Reverted that change.

Will probably revisit that when we drop support for Scala 2.11

SGTM!

@ivantopo ivantopo merged commit 57128f7 into kamon-io:master Sep 13, 2022
@ivantopo
Copy link
Contributor

For the record: I added one extra commit 6cff554 to move the cats-io-3 module to the bundle that gets published for Scala 2.12+ only.

Thank you so much @m50d @cmcmteixeira @hughsimpson @dispalt @dpsoft for all the effort poured into making this work 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants