Make TryInsert functions within the packages module use INSERT ... ON CONFLICT #21063

zeripath · 2022-09-04T20:46:12Z

The TryInsert* functions within the packages models make incorrect assumptions
about transactional isolation within most databases. It is perfectly possible for
a SELECT to return nothing but an INSERT fail with a duplicate in most DBs as it
is only INSERT that the locking occurs.

This PR introduces a common InsertOnConflictDoNothing function which will attempt
to INSERT provided bean but will return 0 as the number of rows affected if there is a
conflict.

Fix #19586

Signed-off-by: Andrew Thornton [email protected]

The TryInsert* functions within the packages models make incorrect assumptions about transactional isolation within most databases. It is perfectly possible for a SELECT to return nothing but an INSERT fail with a duplicate in most DBs as it is only INSERT that the locking occurs. This PR changes the code to simply try to insert first and if there is an error then attempt to SELECT from the table. If the SELECT works then the INSERT error is assumed to have been related to the unique constraint failure. This technique avoids us having to parse the error returned from the DBs as these are varied and different. If the SELECT fails then the INSERT error is returned to the user. Fix go-gitea#19586 Signed-off-by: Andrew Thornton <[email protected]>

zeripath · 2022-09-05T00:30:11Z

ugh looks like my worries about transactions were right - we're gonna need to use an upsert style...

models/packages/package_file.go

zeripath · 2022-09-05T08:11:59Z

I'm going to have to mark this WIP as we actually have to use the UPSERT forms otherwise the transactions will fail.

Signed-off-by: Andrew Thornton <[email protected]>

…t-then-try-to-get

Signed-off-by: Andrew Thornton <[email protected]>

KN4CK3R · 2022-09-15T14:54:38Z

models/packages/package.go

+		return nil, err
+	}
+	if n != 0 {
+		return p, nil


The insert methods need to return the filled bean. For example it's expected that ID has a value in the return value which is not the case here.

The code auto fills the ID of the bean in the same way that xorm's Insert() does.

…t-then-try-to-get

Signed-off-by: Andrew Thornton <[email protected]>

models/db/common.go

Signed-off-by: Andrew Thornton <[email protected]>

wxiaoguang · 2023-02-08T01:47:50Z

If I understand correctly, the test cases haven't cover the behavior of comment:

// This function will update the ID of the provided bean if there is an insertion

That's the only test code for asserting m.ID, but it only tests a succeeded insertion

Update: out-dated

		_, _ = e.Exec("DELETE FROM one_unique")

		has, err := db.InsertOnConflictDoNothing(ctx, &OneUnique{})
		assert.Error(t, err)
		assert.False(t, has)

		toInsert := &OneUnique{Data: "test"}

		has, err = db.InsertOnConflictDoNothing(ctx, toInsert)
		assert.NoError(t, err)
		assert.True(t, has)
		assert.NotEqual(t, 0, toInsert.ID)

wxiaoguang · 2023-02-08T02:30:22Z

If I understand correctly, there are still some problems in code:

the has is misleading.
- If insertion succeeds, it returns true, nil, if insertion fails, it returns false, nil
- So the meaning of bool in (bool, error) is: whether the insertion succeeds, not about whether the record existed before.
assert.NotEqual(t, int(0), int64(0)) always succeeds, But the ID is a int64, so you must use NotEqualValues or NotEmpty, otherwise the test code like assert.NotEqual(t, 0, toInsert.ID) always succeeds.
There is only one ON CONFLICT DO NOTHING for SQLite3, I do not understand why it would return the ID for the duplicated insertion.
The MySQL is highly likely needed to use update id=last_insert_id(id), but not update id=id. The second one doesn't seem the proper way to update the LAST_INSERT_ID for MySQL.

Code changed so that I need to review again

lunny · 2023-02-08T06:56:21Z

Looks like things become too complex to difficult to review.

zeripath · 2023-02-08T14:17:20Z

If I understand correctly, the test cases haven't cover the behavior of comment:

It feels a bit like you're being deliberately obstructive and I don't quite understand why you're being this way.

This might be a language issue - but there are better ways to say I think you should add a line that checks if the ID remains at 0.

To which I might reply why would we care and why are you being so bloody difficult about it? That the ID remains unchanged is irrelevant. Fine I'll add it but honestly it's just making the test less clear and less easy to read for zero real world benefit.

It feels like you're targeting me for some reason and I don't understand why. You left maintainers - was it because of me?

If I understand correctly, there are still some problems in code:

the has is misleading.

If insertion succeeds, it returns true, nil, if insertion fails, it returns false, nil

So the meaning of bool in (bool, error) is: whether the insertion succeeds, not about whether the record existed before.

I was using has as in «has been inserted» but I agree inserted would be better. Frankly its minor and you should just say use inserted instead of has instead of making some major issue about it.

assert.NotEqual(t, int(0), int64(0)) always succeeds, But the ID is a int64, so you must use NotEqualValues or NotEmpty, otherwise the test code like assert.NotEqual(t, 0, toInsert.ID) always succeeds.

Again just make a suggestion. It's a good catch that I forgot about it but your tone is frankly insulting.

There is only one ON CONFLICT DO NOTHING for SQLite3, I do not understand why it would return the ID for the duplicated insertion.

I don't understand your question?

There are two cases:

INSERT proceeds because there is no conflict. The SQLite driver will return the inserted id because SQLite always does not.
INSERT fails because there is a conflict.

In the second case we do not get an ID. Nor do we attempt to use it because 0 rows are affected.

The MySQL is highly likely needed to use update id=last_insert_id(id), but not update id=id. The second one doesn't seem the proper way to update the LAST_INSERT_ID for MySQL.

This code is not attempting to update the id in fact it is deliberately not attempting to cause an update to the id - without the ON DUPLICATE KEY UPDATE id = id there is no ON CONFLICT DO NOTHING.

Signed-off-by: Andrew Thornton <[email protected]>

wxiaoguang · 2023-02-08T14:28:09Z

It feels a bit like you're being deliberately obstructive and I don't quite understand why you're being this way.

I am not obstructive, but really want to help, I have tested a lot on my side.

It depends on the ID-behavior mentioned in #21063 (comment)

I have tested SQLite3 and MySQL. I do not think your code could work for some cases, and the tests were incorrect.

For SQLite, insert ... on conflict update set id=id returning id (do nothing doesn't work and won't return the id)

For MySQL, insert ... on duplicate key update id=last_insert_id(id)

And the test has to cover: insert(keyA), insert(keyB), insert(keyA) and assert getting A-id

That's all.

It feels like you're targeting me for some reason and I don't understand why. You left maintainers - was it because of me?

No, when reviewing, I only care about the code and solution are right or not. Maybe you are also too strict for some problems, so there are more likely divergences. There are also some strong objections when my PRs get reviewed.

but your tone is frankly insulting.

Sorry I didn't mean that. But just want make the problem clear. Could you suggest about how to express the problem more gently. ps: I have made some suggestions, but I didn't find that you had got the point, and I also got "Why don't you try it", maybe I also misunderstood something, but the problems are indeed there.

We can have a private chat with the SQL problems later.

This code is not attempting to update the id in fact it is deliberately not attempting to cause an update to the id - without the ON DUPLICATE KEY UPDATE id = id there is no ON CONFLICT DO NOTHING.

There is still no comment about why INSERT IGNORE doesn't work for MySQL in such case. (there was just a joking for the first question #21063 (comment) , I am not good enough to guess).

And, maybe I really misunderstood something -- about the design or about the logic (no comment nor test at the beginning), I do not know. Maybe the divergence starts here #21063 (comment): "The code auto fills the ID of the bean in the same way that xorm's Insert() does.": The bean should has the ID filled when there is no error.

So feel free to dismiss all my comments. Thank you for your work!

Signed-off-by: Andrew Thornton <[email protected]>

…t-then-try-to-get

zeripath · 2023-02-09T00:08:31Z

We don't use or get the ID when the INSERT fails because there is a conflict because OnConflictDoNothing.

I have tested SQLite3 and MySQL. I do not think your code could work for some cases, and the tests were incorrect.

For SQLite, insert ... on conflict update set id=id returning id (do nothing doesn't work and won't return the id)

Clearly I need to separate out the code paths here.

There is no RETURNING in the SQLite code. The RETURNING is only used in Postgres.

SQLite will always return the last inserted ID as part of its driver so we don't need to do that.

If there is a conflict we don't use it - we do nothing.

For MySQL, insert ... on duplicate key update id=last_insert_id(id)

No. As far as I can see that would cause every conflict to cause an update of the primary key (or fail.)

The code as written: ON DUPLICATE KEY UPDATE id = id makes the UPDATE a no-op so that no row is changed i.e. NOTHING happens when there is a conflict.

Again the code is not attempting to return the ID when there is a conflict.

And the test has to cover: insert(keyA), insert(keyB), insert(keyA) and assert getting A-id

See above, the code does not promise to get the ID when there is a conflict. It promises to do nothing.

There is still no comment about why INSERT IGNORE doesn't work for MySQL in such case. (there was just a joking for the first question #21063 (comment) , I am not good enough to guess).

Using INSERT IGNORE didn't work in my testing when there was a primary key with autoincrement - and if I didn't instead use the ON DUPLICATE KEY UPDATE no-op it would error out. This may have been warnings being returned rather than errors.

It seems however now that INSERT IGNORE will just work by itself but I do wonder if the warnings that IGNORE could cause failures or complaints by users.

And, maybe I really misunderstood something -- about the design or about the logic (no comment nor test at the beginning), I do not know. Maybe the divergence starts here #21063 (comment): "The code auto fills the ID of the bean in the same way that xorm's Insert() does.": The bean should has the ID filled when there is no error.

Ah. Perhaps that is the problem.

I'm not promising to get the ID when there is no insert.

The function has comment on it that says it will update the ID of the provided bean if it is inserted. It doesn't say it will do that when it's not inserted - and in fact it will NOT do that - instead on conflict it will do nothing as the function name suggests.

// InsertOnConflictDoNothing will attempt to insert the provided bean but if there is a conflict it will not error out
// This function will update the ID of the provided bean if there is an insertion
// This does not do all of the conversions that xorm would do automatically but it does quite a number of them
// once xorm has a working InsertOnConflictDoNothing this function could be removed.

There were always tests - just not specific ones. The package repositories code has good coverage of this function. As requested though I've added some specific tests and covered and fixed some more edge cases which should make the function usable in other contexts too.

wxiaoguang · 2023-02-09T03:02:20Z

Using INSERT IGNORE didn't work in my testing when there was a primary key with autoincrement - and if I didn't instead use the ON DUPLICATE KEY UPDATE no-op it would error out. This may have been warnings being returned rather than errors.

It seems however now that INSERT IGNORE will just work by itself but I do wonder if the warnings that IGNORE could cause failures or complaints by users.

From my knowledge, I always use INSERT IGNORE for all cases and haven't got any problem. That's why the question first came .... because in my mind:

INSERT IGNORE should be fine for all cases, I never thought any user would be surprised or affected by the warning. If XORM is affected by the warning, that's another story. (if there was a comment for it, I could also get the point as early as possible).
Then after being told that the IGNORE doesn't work for autoincrement, I thought that since the insert on dup update id=id trick is used, it must be done by purpose to get the conflicted ID according to another comment, then the discussion happens.

Thank you for letting me know more details.

Anyway, it's clear for me now. I think this PR is good to have, maybe it could help to remove the workaround mutex for the package code after it gets merged:

Signed-off-by: Andrew Thornton <[email protected]>

…t-then-try-to-get

zeripath · 2023-02-09T19:29:08Z

Ah yes that mutex can be removed too.

Apologies for the confusion earlier - it just felt we were going round and round in circles for no good reason and I couldn't understand why.

…t-then-try-to-get

Signed-off-by: Andrew Thornton <[email protected]>

lunny · 2023-03-12T12:00:32Z

Thanks for your hard-working @zeripath . It really should be a part of XORM. :)

wxiaoguang · 2023-03-12T12:06:50Z

How about taking this PR now, and move to XORM in the future?

The "workaround mutex" really seems fragile, if there is a chance, we could use a DB insert-ignore instead of that in-process mutex for a stable release.

Signed-off-by: Andrew Thornton <[email protected]>

lunny · 2023-03-12T13:02:47Z

tests/integration/db_common_test.go

+func TestInsertOnConflictDoNothing(t *testing.T) {
+	defer tests.PrepareTestEnv(t)()
+
+	ctx := db.DefaultContext


Should we have a parallel test?

lunny · 2024-05-07T06:03:46Z

I think the key problem is the package property table have no unique keys. per #30335 (comment)

zeripath added type/bug backport/v1.17 labels Sep 4, 2022

zeripath added this to the 1.18.0 milestone Sep 4, 2022

GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Sep 5, 2022

zeripath commented Sep 5, 2022

View reviewed changes

models/packages/package_file.go Outdated Show resolved Hide resolved

lunny previously approved these changes Sep 5, 2022

View reviewed changes

GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Sep 5, 2022

zeripath added the pr/wip This PR is not ready for review label Sep 5, 2022

zeripath changed the title ~~Make TryInsert functions within the packages module try to insert first~~ WIP: Make TryInsert functions within the packages module try to insert first Sep 5, 2022

partial still broken

b29ab42

Signed-off-by: Andrew Thornton <[email protected]>

zeripath force-pushed the fix-19586-insert-first-then-try-to-get branch from 0663049 to b29ab42 Compare September 6, 2022 20:25

zeripath added 4 commits September 6, 2022 21:26

more

470064c

Signed-off-by: Andrew Thornton <[email protected]>

Merge remote-tracking branch 'origin/main' into fix-19586-insert-firs…

9c83ab8

…t-then-try-to-get

Merge remote-tracking branch 'origin/main' into fix-19586-insert-firs…

bf94b55

…t-then-try-to-get

ensure that timestamps are also set

8d3864a

Signed-off-by: Andrew Thornton <[email protected]>

zeripath removed the pr/wip This PR is not ready for review label Sep 7, 2022

zeripath changed the title ~~WIP: Make TryInsert functions within the packages module try to insert first~~ Make TryInsert functions within the packages module try to insert first Sep 7, 2022

oops lets try to get the mysql syntax right

71522ea

Signed-off-by: Andrew Thornton <[email protected]>

zeripath changed the title ~~Make TryInsert functions within the packages module try to insert first~~ WIP: Make TryInsert functions within the packages module try to insert first Sep 7, 2022

zeripath added the pr/wip This PR is not ready for review label Sep 7, 2022

KN4CK3R reviewed Sep 15, 2022

View reviewed changes

lunny modified the milestones: 1.18.0, 1.19.0 Oct 17, 2022

lunny modified the milestones: 1.19.0, 1.20.0 Jan 31, 2023

Merge remote-tracking branch 'origin/main' into fix-19586-insert-firs…

7843fe9

…t-then-try-to-get

zeripath added 2 commits February 7, 2023 16:48

more comments

04efbf9

Signed-off-by: Andrew Thornton <[email protected]>

missed setting sql

2283b23

Signed-off-by: Andrew Thornton <[email protected]>

wxiaoguang reviewed Feb 7, 2023

View reviewed changes

models/db/common.go Outdated Show resolved Hide resolved

zeripath added 3 commits February 7, 2023 19:18

add testcases

a282e66

Signed-off-by: Andrew Thornton <[email protected]>

fix fmt

62b1e20

Signed-off-by: Andrew Thornton <[email protected]>

slight bug in mysql

f1222e8

Signed-off-by: Andrew Thornton <[email protected]>

GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Feb 8, 2023

as per wxiaoguang

25abc72

Signed-off-by: Andrew Thornton <[email protected]>

zeripath added 2 commits February 8, 2023 23:53

split functions out and just use ignore

028b5a6

Signed-off-by: Andrew Thornton <[email protected]>

Merge remote-tracking branch 'origin/main' into fix-19586-insert-firs…

1c17006

…t-then-try-to-get

zeripath added 2 commits February 9, 2023 19:25

remove unnecessary mutex

ac6862a

Signed-off-by: Andrew Thornton <[email protected]>

Merge remote-tracking branch 'origin/main' into fix-19586-insert-firs…

dc4638c

…t-then-try-to-get

lunny removed the lgtm/need 1 This PR needs approval from one additional maintainer to be merged. label Feb 12, 2023

zeripath added 2 commits March 12, 2023 11:32

Merge remote-tracking branch 'origin/main' into fix-19586-insert-firs…

ecd3eea

…t-then-try-to-get

add a few comments to test;

bf18cf8

Signed-off-by: Andrew Thornton <[email protected]>

fix broken merge

1a6f5df

Signed-off-by: Andrew Thornton <[email protected]>

lunny reviewed Mar 12, 2023

View reviewed changes

delvh removed this from the 1.20.0 milestone Jun 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make TryInsert functions within the packages module use INSERT ... ON CONFLICT #21063

Make TryInsert functions within the packages module use INSERT ... ON CONFLICT #21063

zeripath commented Sep 4, 2022 •

edited

Loading

zeripath commented Sep 5, 2022

zeripath commented Sep 5, 2022

KN4CK3R Sep 15, 2022

zeripath Feb 7, 2023 •

edited

Loading

wxiaoguang commented Feb 8, 2023 •

edited

Loading

wxiaoguang commented Feb 8, 2023 •

edited

Loading

lunny commented Feb 8, 2023

zeripath commented Feb 8, 2023

wxiaoguang commented Feb 8, 2023 •

edited

Loading

zeripath commented Feb 9, 2023

wxiaoguang commented Feb 9, 2023

zeripath commented Feb 9, 2023

lunny commented Mar 12, 2023

wxiaoguang commented Mar 12, 2023 •

edited

Loading

lunny Mar 12, 2023

lunny commented May 7, 2024

Make TryInsert functions within the packages module use INSERT ... ON CONFLICT #21063

Are you sure you want to change the base?

Make TryInsert functions within the packages module use INSERT ... ON CONFLICT #21063

Conversation

zeripath commented Sep 4, 2022 • edited Loading

zeripath commented Sep 5, 2022

zeripath commented Sep 5, 2022

KN4CK3R Sep 15, 2022

Choose a reason for hiding this comment

zeripath Feb 7, 2023 • edited Loading

Choose a reason for hiding this comment

wxiaoguang commented Feb 8, 2023 • edited Loading

wxiaoguang commented Feb 8, 2023 • edited Loading

lunny commented Feb 8, 2023

zeripath commented Feb 8, 2023

wxiaoguang commented Feb 8, 2023 • edited Loading

zeripath commented Feb 9, 2023

wxiaoguang commented Feb 9, 2023

zeripath commented Feb 9, 2023

lunny commented Mar 12, 2023

wxiaoguang commented Mar 12, 2023 • edited Loading

lunny Mar 12, 2023

Choose a reason for hiding this comment

lunny commented May 7, 2024

zeripath commented Sep 4, 2022 •

edited

Loading

zeripath Feb 7, 2023 •

edited

Loading

wxiaoguang commented Feb 8, 2023 •

edited

Loading

wxiaoguang commented Feb 8, 2023 •

edited

Loading

wxiaoguang commented Feb 8, 2023 •

edited

Loading

wxiaoguang commented Mar 12, 2023 •

edited

Loading