Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

insert operation succeeds without actually writing any data to the table #82

Open
ta3pks opened this issue Aug 19, 2023 · 17 comments
Open

Comments

@ta3pks
Copy link

ta3pks commented Aug 19, 2023

image
In this example insert_twitter_user_counts method returns Ok however theres no row inserted when i check on the db
image

Here is my test case everything returns success until the fetch_one it simply fails to fetch an unexisting record

When I use .query however the same data works without problems
image

@loyd
Copy link
Collaborator

loyd commented Oct 5, 2023

I'm sorry for not getting back to you sooner. Can you provide this code instead of images so I can check it?

@ta3pks
Copy link
Author

ta3pks commented Oct 9, 2023

sorry I dont have this particular code any more i figured it was an issue with the types but no errors reported i think this still is a bug tho

@alexanderspevak
Copy link

alexanderspevak commented Nov 14, 2023

Hello, I think I am having same problem. I am using batch update and there are no errors but Quantities returned are always zero, I can not find any issue with my types.
Here is link to my repo:
link

Messages are not empty, I am sure of that as I am logging what I am writing to batch:
Screenshot 2023-11-14 at 10 02 31
Last line in the image is log of written messages and there is 0 for both entries and transactions
I have checked my types, and checked what is stored in clickhouse, but nothing is there.

@alexanderspevak
Copy link

alexanderspevak commented Nov 17, 2023

I found out that entries are inserted into database if I call inserter.end(), and inserter.commit() is not enough. Is this intended behaviour?
I have case of infinte inserting, so calling inserter.end() is not something what I want

@ta3pks
Copy link
Author

ta3pks commented Nov 18, 2023

I think i figured this out and i can produce a minimal reproducable example

@alexanderspevak
Copy link

I think i figured this out and i can produce a minimal reproducable example

that would be great. Thank you!

@ta3pks
Copy link
Author

ta3pks commented Nov 18, 2023

As seen here both inserts succeed however second insert simply was not inserted despite being successful i thought this is a clickhouse server specific issue however when i run the same commands on clickhouse cli itself (inserting deleting and inserting back) it works without a problem

@ta3pks
Copy link
Author

ta3pks commented Nov 18, 2023

to run the example i wrote simply provide the db address in full URI form
eg. cargo run -- https://me:[email protected]/mydb

@karaolidis
Copy link

Hello! Is this still an ongoing issue? I haven't used the crate yet, but this feels like a relatively important problem to address...

@ta3pks
Copy link
Author

ta3pks commented Jan 20, 2024

well i im using this in production so far as long as two consequtive ids are not the same it works fine but if your case cannot handle "but sometimes" sort of issues i dont think this is suitable for you @karaolidis

@karaolidis
Copy link

I just run a couple of tests, the minimal example provided fails with:

failed to create table: BadResponse("Code: 119. DB::Exception: Table engine is not specified in CREATE query. (ENGINE_REQUIRED) (version 23.5.3.24 (official build))")

After changing it to:

    db.query("create table if not exists testing_table (x UInt64, y UInt64) engine=MergeTree() primary key x")
        .execute()
        .await
        .expect("failed to create table");

    #[derive(clickhouse::Row, serde::Serialize, serde::Deserialize, Debug, Clone)]
    struct XY {
        x: u64,
        y: u64,
    }
    let xy = [XY { x: 1, y: 2 }, XY { x: 2, y: 4 }, XY { x: 3, y: 6 }];

    let mut insert = db.insert("testing_table").unwrap();
    for xy in &xy {
        insert.write(xy).await.unwrap();
    }
    insert.end().await.unwrap();

    let count = db
        .query("select count() from testing_table")
        .fetch_one::<usize>()
        .await
        .unwrap();

    println!("count: {}", count);
    assert_eq!(count, xy.len());

    db.query("delete from testing_table where x in ?")
        .bind([1, 2, 3])
        .execute()
        .await
        .expect("failed to drop table");

    let count = db
        .query("select count() from testing_table")
        .fetch_one::<usize>()
        .await
        .unwrap();

    println!("count: {}", count);
    assert_eq!(count, 0);

    let mut insert = db.insert("testing_table").unwrap();
    for xy in &xy {
        insert.write(xy).await.unwrap();
    }
    insert.end().await.unwrap();

    let count = db
        .query("select count() from testing_table")
        .fetch_one::<usize>()
        .await
        .unwrap();

    println!("count: {}", count);
    assert_eq!(count, xy.len());

    db.query("drop table if exists testing_table")
        .execute()
        .await
        .unwrap();

... it seems to be working fine:

count: 3
count: 0
count: 3

@loyd
Copy link
Collaborator

loyd commented Jan 20, 2024

@karaolidis

Hello! Is this still an ongoing issue? I haven't used the crate yet, but this feels like a relatively important problem to address...

We have been using this crate actively for years on multiple big production clusters with different CH versions and have never seen lost data because of this crate, so I think it's "production-ready" despite being 0.X version.

@karaolidis
Copy link

That's good to hear, thanks for the help! I couldn't reproduce the issue anyway so I'll give the crate a go 🙏

@loyd
Copy link
Collaborator

loyd commented Jan 20, 2024

@alexanderspevak

I have case of infinte inserting, so calling inserter.end() is not something what I want

See the documentation how to properly use it. You need to set some threshold (by time or by row count and soon by size too). In case of using by-time threshold, you should call commit() by your timer if you have a stream with rare updates. I think the documentation can be better, but an example depends on using an async approach (runtime or even an actor system).

@loyd
Copy link
Collaborator

loyd commented Jan 20, 2024

@ta3pks, what do you mean by "consequtive ids are not the same"?

@loyd
Copy link
Collaborator

loyd commented Jan 20, 2024

@ta3pks

second insert simply was not inserted despite being successful i thought

I run the example 100 times both in debug and release modes and it's fine. What's the version of CH do you use?

The (possibly) failed result can be explained if ALTER .. DELETE is used without mutations_sync, in this case the code contains a race. However, the example uses lightweight deletes that should be visible instantly (although finally based on ALTER .. DELETE, so it's still "bad" for using on a regular basis in CH).

however when i run the same commands on clickhouse cli itself

CH CLI uses tcp+native which has a longer handshake (yep, HTTP can be handled faster for short queries).

So, again: what's the version of CH server?

@ta3pks
Copy link
Author

ta3pks commented Jan 22, 2024

@loyd have you had a look at my minimal example ?
my clickhouse version is 23.9.2 revision 54466 i am using hosted version of clickhouse

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants