Add pkg/pg with dialects.go & txdb.go #910
base: main
Conversation
pkg/pg/txdb_test.go (Outdated)

dbURL, ok := os.LookupEnv("CL_DATABASE_URL")
if !ok {
	// log the fallback (the result of the original fmt.Sprintf was discarded)
	t.Log("CL_DATABASE_URL not set -- falling back to testing txdb backed by an in-memory db")
	dbURL = string(InMemoryPostgres)
}
db := NewSqlxDB(t, dbURL)
Would it make sense to package this up as a test helper?
We could, although I can't think of a good use case for this other than testing txdb itself (i.e., this test). So making it a test helper, especially if it's in a place where it can be imported, might confuse someone.

Tests that depend on the db migration scripts being run (i.e., require the standard tables and indexes to be set up) should call NewSqlxDB() to connect to an actual db and just fail if they can't. Tests that don't depend on those scripts should call NewInMemoryDataSource(). Usually, if testing with NewInMemoryDataSource works, then you wouldn't have any reason to attempt testing with an actual db. In this case, it makes sense because there is some code in the Open() method of the txdb driver which parses the db url and validates that it's a well-formed database url. This gets skipped if you're testing with an in-memory data source, but everything else is tested. So using a real db tests everything, while using the in-memory db in its place tests only about 95%.
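For illustration, the kind of dsn validation described here could look roughly like the following sketch. The helper name is hypothetical; the real check lives in the txdb driver's Open() method and may differ in detail.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateDBURL is a hypothetical sketch of the kind of check described
// above: parse the dsn and confirm it looks like a well-formed postgres
// URL. This is illustrative, not the actual chainlink-common code.
func validateDBURL(dsn string) error {
	u, err := url.Parse(dsn)
	if err != nil {
		return fmt.Errorf("invalid database url %q: %w", dsn, err)
	}
	if u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return fmt.Errorf("expected a postgres:// url, got scheme %q", u.Scheme)
	}
	return nil
}

func main() {
	fmt.Println(validateDBURL("postgresql://user:pass@localhost:5432/chainlink_test"))
	fmt.Println(validateDBURL("no-scheme-here"))
}
```

A check like this only runs when a real db url is supplied, which is why backing the test with an in-memory data source skips it.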
Tests that depend on the db migration scripts being run
This is part of the pattern that we are trying to get away from though. It should not be the default.
Usually if testing with NewInMemoryDataSource is something that can work, then you wouldn't have any reason to attempt testing it with an actual db
Why do you say this? If we have the ability to trivially integration test against real postgres, then we should use it.
Why do you say this? If we have the ability to trivially integration test against real postgres, then we should use it.
I was mostly saying that because it seems unlikely anything could pass in the in-memory db but fail in a real db. (Unless we're testing the connection to the db itself, or that the db is set up properly... in which case I wouldn't expect an in-memory db to be useful.) I suppose there are probably some ways, but in general I would assume that the set of sql commands supported by an in-memory db would be a small subset of those supported by a fully fledged database.
This is part of the pattern that we are trying to get away from though. It should not be the default.
I'm not sure what the motivation for wanting to change existing patterns is, but if the reason is that we want to minimize use of the external db to conserve resources, then making this a helper function and encouraging its use would be diametrically opposed to that. If we always try to hit the external db first, and only use the in-memory one as a backup if that fails, then we'd waste a whole lot of resources in CI, since 100% of them will go to the db. I'm not sure what the purpose of an in-memory db would even be then. In the short term, it could help us with local testing before the db isolation work Keith planned out gets done. But once each relay has its own migration scripts, it won't even be useful for that.
To clarify:
In my last paragraph I was assuming this meant that you think we should always call this helper function instead of using an in-memory db by itself:
If we have the ability to trivially integration test against real postgres, then we should use it.
So that's why I say "100% of them will go to the db". If instead by that, you just mean that we ought to leave the option open to do that for some tests, rather than an all-or-nothing approach then my objection above wouldn't apply.
Update: in light of having read your other thread, I understand now that the motivation is not conserving CI resources but making it easier for devs to run tests locally. At least for some tests, it may still not be worth using the extra CI resources just to get that slight bit of extra certainty that the function you're testing behaves as expected. But given your motivations, it does seem like you're recommending an all-or-nothing approach. Since unless all of the tests were able to run in-memory you wouldn't get rid of the dependency... right?
It is partially about local vs. CI, but it is also about whether a test can bootstrap itself, or depends on particularly configured dependencies, like fixture data in an external DB. I'm not sure what you mean by all-or-nothing. If something doesn't work with in-memory, then it would be skipped 🤷 This is just like how we hide tests behind the integration build tag. Most devs don't run those locally, but they always run in CI on every PR.
If something doesn't work with in memory, then it would be skipped 🤷
Doesn't that imply that we do need to have a common SkipShortDB() method for tests requiring a db dependency to import... so that they can get skipped? Or is there a different way to skip them you're suggesting we replace that with?
No, it does not imply that we need to use short at all:
#910 (comment)
Ah, I guess this is what you had in mind...
If there are dependencies required, then typically you opt-in to those tests by e.g. including a build tag, setting a flag/env-var, etc.
So you're just recommending we move to using one of these methods instead of --short... that makes sense
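As a sketch of the env-var opt-in pattern discussed in this thread: instead of gating db-dependent tests behind go test -short, a test can decide what backend to use based on whether a database url is present. The helper name and return values below are hypothetical, purely to illustrate the shape; only the CL_DATABASE_URL variable name comes from the code under review.

```go
package main

import (
	"fmt"
	"os"
)

// dbTestMode sketches the opt-in idea: a test picks its backend based on
// whether an env var is set, rather than relying on the -short flag.
// Taking the lookup function as a parameter keeps this deterministic to
// demonstrate; real code would just call os.LookupEnv directly.
func dbTestMode(lookup func(string) (string, bool)) string {
	if _, ok := lookup("CL_DATABASE_URL"); ok {
		return "postgres" // opt in: run against the real db
	}
	return "in-memory" // fall back (or skip, for tests that need fixtures)
}

func main() {
	fmt.Println(dbTestMode(os.LookupEnv))
}
```

Build tags (e.g. an integration tag) achieve the same opt-in at compile time rather than at run time; either way, no test silently depends on an externally configured db.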
"github.com/smartcontractkit/chainlink-common/pkg/utils" | ||
) | ||
|
||
// txdb is a simplified version of https://github.com/DATA-DOG/go-txdb |
Since you are introducing our own custom txdb here (which is a copy from chainlink), have you considered just using the DATA-DOG one (which is at least regularly updated)?
cc @samsondav
We originally did use the DATA-DOG version; it was terrible: full of bugs and overly complicated. After several PRs trying to fix the upstream, I just wrote our own streamlined version, and we've pretty much never had any problems with it, which is why it has never had to be updated.
ok, this is outside of the scope of this CL, but we do have a problem with it: we essentially maintain a fork of txdb inside of chainlink (and its forked repos), and we will now do the same in chainlink-common. My hope is that in the last 5 years DATA-DOG has made enough improvements to their library.
It's not a fork, it's a rewrite, and it's relatively small. Just put it here and import it wherever you need it; then it's only needed once.
If it ain't broken don't fix it, and this ain't broke.
and we will now do the same in the chainlink-common
Just to be clear, this is a move, not a copy... the PR this is linked to in the chainlink repo changes all of the imports of it to point here and removes it from that repo.
Neither of these were in the actual pg package in the chainlink repo. dialects.go came from core/store/dialects and txdb.go from core/internal/testutils/pgtest, but neither seems to deserve its own package in chainlink-common, so we can lump all the postgres-specific common utilities under pkg/pg.
Also: convert the rest of the panics into ordinary errors
All txdb connections share the same underlying connection to the postgres db. Calling NewSqlxDB() or NewConnection() with dialect=txdb doesn't create a new pg connection; it just creates a new tx with BEGIN. Closing the connection with db.Close() issues ROLLBACK.

Both NewSqlxDB() and NewConnection() choose random UUIDs for their dsn string, so we shouldn't have a case where the same dsn is opened more than once. If that did happen, then these two different txdb "connections" would be sharing the same transaction, which would mean closing the abort channel due to a query sent over one of them would affect the other. Hopefully that's not a problem? If it is, I think our only option will be to go back to using context.Background for all queries.

Before this commit, there was only one abort channel for the entire txdb driver, meaning that even two entirely different connections opened with different dsn's could interfere with each other's queries. This should fix that case, which is presumably the only case we care about. Since each dsn corresponds to a different call to NewSqlxDB() and the UUIDs are generated randomly, there should no longer be a conflict. Each txdb connection will have its own abort channel.
… is not set. This allows us to test most of it in CI, and all of it locally.
This showed up in some of the unit tests in the linked PR in chainlink repo
NONEVM-739
Supports
smartcontractkit/chainlink#15064
smartcontractkit/chainlink-solana#921
Description

This mostly just moves chainlink/core/internal/testutils/pgtest/txdb.go and chainlink/core/store/dialects/dialects.go into a new common package, chainlink-common/pkg/pg, updating imports accordingly.

The purpose of this is so that it can be imported and used by chainlink-solana for unit testing Solana ORMs (for an example of actual use, see the linked chainlink-solana PR this supports).

Since txdb.go was a bit of a mess, it has been cleaned up a bit:
- The init() function invoked on package load has been replaced with RegisterTxDb(), which accepts a dbUrl as a param instead of automatically reading it from the CL_DATABASE_URL env var. Since there are only two places in the codebase which depend on our custom txdb dialect of sql being registered, this can simply be called in both of those places (inside NewSqlxDb() and NewConnection()) instead of relying on the global init().
- RegisterTxDb() now guards against calling sql.Register() more than once if RegisterTxDb() is called more than once (calling sql.Register() more than once with the same driver name will result in an error).
- Replaced passing of context.Background to queries with passing of a new context which gets cancelled when the db is closed. (Note: db here refers to the txdb, which is just a single connection in the lower-level pg db, corresponding to the dsn passed in, which is a randomly generated UUID.)

This also moves some unit tests of txdb itself from chainlink into chainlink-common. Since they need an actual database but don't need any fixtures, this is an ideal use case for an in-memory test db: something in between the empty interface added earlier and spinning up a full docker postgresql container with the schemas and tables all set up. go-duckdb has been used for this, which is now easy to add to our dependencies since we're already at go 1.23. If the CL_DATABASE_URL env var is set locally, txdb can be tested with a full ordinary postgresql database backing it... otherwise it falls back to testing it with the in-memory db backing it.

core ref: c7728a4eced5de67bad1cafdd21b4fab276e0e96
solana ref: 88cca3779525873e45f6d9f54bc8e0d0f2c9af26
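The de-duplication guard on driver registration mentioned in the description can be sketched with a sync.Once. This is illustrative only (the type and method names below are invented): the point is that since registering the same driver name twice with database/sql fails, RegisterTxDb must be idempotent.

```go
package main

import (
	"fmt"
	"sync"
)

// txDbRegistrar sketches the guard: Register runs the underlying
// registration callback at most once, no matter how many times it is
// called. In the real code, the callback would wrap sql.Register.
type txDbRegistrar struct{ once sync.Once }

func (r *txDbRegistrar) Register(register func()) {
	r.once.Do(register)
}

func main() {
	var r txDbRegistrar
	calls := 0
	for i := 0; i < 3; i++ {
		r.Register(func() { calls++ })
	}
	fmt.Println(calls) // registration ran exactly once
}
```

This lets both NewSqlxDb() and NewConnection() call RegisterTxDb() unconditionally without tracking whether the other has already done so.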