Requests should be cached on a per-datasource level #491

kyri-petrou · 2024-06-28T11:52:27Z

There is a rather annoying bug/issue currently with how we dedup/cache queries. The current Cache takes into consideration only the Request, but not the DataSource. This means that if a Request is reused across different DataSources.

For most usecases this shouldn't be a problem, because having multiple DataSources returning the same type is unlikely. However, this is something that has happened to me before and failed to realise until it hit prod:

  case class EntityId(id: Int) extends Request[Nothing, Boolean]

  val userExistsDs: DataSource[Any, EntityId] =
    DataSource.fromFunctionBatchedZIO[Any, Nothing, EntityId, Boolean]("UserExistsDS") { requests =>
      ???
    }

  val organisationExistsDs: DataSource[Any, EntityId] =
    DataSource.fromFunctionBatchedZIO[Any, Nothing, EntityId, Boolean]("OrganizationExistsDs") { requests =>
      ???
    }

In this case, if we execute a query against both datasources where the id exists in both of them, we would be retrieving the result from the cache even if the request was meant for a different datasource.

While the workaround is simple (create another case class for the 2nd case class), I personally believe that the existing behaviour is not correct. If I'm executing a request against a specific DataSource, I expect it to be cached on the datasource that I am making a request against.

Unfortunately, the only way to fix this is by breaking the Cache API by adding DataSource as a parameter to its methods. @paulpdaniels AFAICT you were the one that asked for the Cache to be made abstract, do you see any issues with this? (assuming you ended up using a custom Cache, otherwise ignore me 😅)

kyri-petrou · 2024-06-28T11:53:53Z

zio-query/shared/src/test/scala/zio/query/ZQuerySpec.scala

+      test("N + 1 on composed queries") {
+        for {
+          ref   <- Ref.make(0)
+          ds     = countingDs(ref)
+          query1 = identityDs.query(IdReq(1)).flatMap(i => ds.query(CountingReq(i)))
+          query2 = identityDs.query(IdReq(2)).flatMap(i => ds.query(CountingReq(i)))
+          res   <- (query1 <~> query2).run
+          i     <- ref.get
+        } yield assertTrue(i == 1, res == ("1", "2"))
+      },
+      test("requests are cached per datasource") {
+        for {
+          res <- identityDs.query(IdReq(1)).flatMap(i => plus1Ds.query(IdReq(i))).run
+        } yield assertTrue(res == 2)
+      },


This test is irrelevant with the PR. It's something I've been meaning to add for some time now just as a safeguard when making changes to batching logic

Requests should be cached on a per-datasource level

bddcb85

kyri-petrou requested review from ghostdogpr and paulpdaniels June 28, 2024 11:52

kyri-petrou commented Jun 28, 2024

View reviewed changes

kyri-petrou added 2 commits July 1, 2024 20:06

fmt and fix Scala 2.12

02827c9

Disable Mima

b2a2149

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Requests should be cached on a per-datasource level #491

Requests should be cached on a per-datasource level #491

kyri-petrou commented Jun 28, 2024

kyri-petrou Jun 28, 2024

Requests should be cached on a per-datasource level #491

Are you sure you want to change the base?

Requests should be cached on a per-datasource level #491

Conversation

kyri-petrou commented Jun 28, 2024

kyri-petrou Jun 28, 2024

Choose a reason for hiding this comment