GCS #18
Hi,
Actually, we have not tried GCS with this FDW.
Is it possible to use the
Because we did not find clear information on whether the AWS SDK for C++ works with GCS, I cannot tell whether the current implementation, which uses the AWS SDK, will work with it.
FWIW I'm getting the same error with a parquet file on AWS S3, so this is not just about GCS.
FWIW I'm getting the same error with a parquet file on Ali OSS. I skimmed the source code and suspect this piece of code in parquet_s3_fdw_connection.cpp is the issue:

```cpp
if (use_minio)
{
    const Aws::String defaultEndpoint = "127.0.0.1:9000";
    clientConfig.scheme = Aws::Http::Scheme::HTTP;
    clientConfig.endpointOverride = endpoint ? (Aws::String) endpoint : defaultEndpoint;
    s3_client = new Aws::S3::S3Client(cred, clientConfig,
            Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::Never, false);
}
else
{
    const Aws::String defaultRegion = "ap-northeast-1";
    clientConfig.scheme = Aws::Http::Scheme::HTTPS;
    clientConfig.region = awsRegion ? (Aws::String) awsRegion : defaultRegion;
    s3_client = new Aws::S3::S3Client(cred, clientConfig);
}
```

I think the else block is missing the endpoint override. It should be something like:

```cpp
else
{
    const Aws::String defaultRegion = "ap-northeast-1";
    clientConfig.scheme = Aws::Http::Scheme::HTTPS;
    clientConfig.region = awsRegion ? (Aws::String) awsRegion : defaultRegion;
    // There may be a default endpoint value for AWS S3, but if you use GCS or
    // Ali OSS, you should specify the endpoint from external configuration.
    clientConfig.endpointOverride = (Aws::String) endpoint;
    s3_client = new Aws::S3::S3Client(cred, clientConfig);
}
```
+1 to support GCS.
AWS S3 (installed according to the instructions)
Access to this bucket (using the same key and secret that were specified in the USER MAPPING) works, as confirmed by the aws-cli:
Does it look like there is a problem with the extension or its dependencies?
Hello, thanks for your report. Based on the behavior, I think the problem is related to the proxy. Based on your comments and the situation above, I attached a patch file that temporarily changes the implementation; you can use region, endpoint, or both to connect. If this patch solves your problems, we will apply it in the next release.
I have the same question. Has this issue been resolved? I did not use any proxies.
@CHEN-Jing194 Thanks for your report. The root cause is not clear, so the issue has not been resolved.
I found that there is no corresponding Parquet file on my S3, and creating a foreign table does not create a new file on S3. So what should I do to use an INSERT statement after creating the table?
@CHEN-Jing194
Please also let me know if you still face the original problem of this issue.
@CHEN-Jing194, sorry for the unclear explanation. In case no parquet file exists on S3, you need to use
You can see the generated file on S3:
Please let me know if there is any problem.
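For illustration, here is a hedged sketch of creating a foreign table and then inserting into it so that a parquet file appears on S3. The server is assumed to exist as in the earlier comments; the bucket path, table name, and column layout are hypothetical, not taken from this thread:

```sql
-- Hypothetical schema and S3 path; the filename option points at the
-- parquet object the FDW should write.
CREATE FOREIGN TABLE example_insert (
    c1 int,
    c2 text
) SERVER parquet_s3_srv
  OPTIONS (filename 's3://my-bucket/example_insert.parquet');

-- If the FDW build supports modification, this INSERT should create
-- the parquet file on S3 when none exists yet.
INSERT INTO example_insert VALUES (1, 'first row');
```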
But how do I set the Access Key ID? I only see options to set a username and password.
You can set
Could you share the SQL statements you used to create the SERVER and the FOREIGN TABLE example_insert?
The SERVER must be created with option
@son-phamngoc I'm so sorry, I eventually found out that I had installed an older version of the extension. 😭
@CHEN-Jing194 No problem. |
Can foreign tables be partitioned? This should help reduce the amount of data scanned and lower costs. |
@CHEN-Jing194 Sorry for the late response. Could you confirm which of the following interpretations matches your expectation?
@son-phamngoc Additionally,
@CHEN-Jing194
Update and Delete can work only when key columns are specified.
Thank you for reporting. I can reproduce this problem.
@CHEN-Jing194
@son-phamngoc
No problem. You are welcome.
parquet_s3_fdw uses the key column values to find the correct target record to be updated/deleted.
I want to update all records that have c1 = 100, so the SQL query is:
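As an illustration of the key-column mechanism described above, here is a sketch; the table name, columns, and S3 path are assumptions, and the `key 'true'` column option follows the parquet_s3_fdw README convention for marking key columns:

```sql
-- Hypothetical table: c1 is marked as a key column so the FDW can
-- locate the target records for UPDATE/DELETE.
CREATE FOREIGN TABLE example_update (
    c1 int OPTIONS (key 'true'),
    c2 text
) SERVER parquet_s3_srv
  OPTIONS (filename 's3://my-bucket/example_update.parquet');

-- Update every record whose key column c1 equals 100.
UPDATE example_update SET c2 = 'updated' WHERE c1 = 100;
```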
@son-phamngoc
@CHEN-Jing194 I'm glad to hear that.
@son-phamngoc
@CHEN-Jing194 Thank you for your answer. @CMBCKSRL @mausch @ZhiXingHeYiApple @vitabaks
@son-phamngoc Hello, I have another question. Does the FDW download the corresponding parquet file locally and then perform the SQL operations on it? I tested it and found that the traffic is quite high. I originally thought it used the S3 Select functionality, but it seems it does not.
I am trying to use parquet_s3_fdw to connect to my GCS bucket and extract data from parquet files, but it seems to be impossible (or I've made a mistake in my code).
Here is what I do.
First, I create the extension:
CREATE EXTENSION parquet_s3_fdw;
Then I create the server:
CREATE SERVER parquet_s3_srv FOREIGN DATA WRAPPER parquet_s3_fdw OPTIONS (region 'us-west1');
My GCS bucket region is us-west1 (Oregon) but I also tried us-west2.
Afterwards, I create the user mapping:
CREATE USER MAPPING FOR CURRENT_USER SERVER parquet_s3_srv OPTIONS (user '<access_key>', password '<secret_key>');
I don't think there is a problem with these keys, because I was able to access my bucket from ClickHouse.
Finally, I create the foreign table.
But when I query this foreign table I get this error:
select * from natality_parquet limit 5;
SQL Error [XX000]: ERROR: parquet_s3_fdw: failed to exctract row groups from Parquet file: failed to open Parquet file HeadObject failed
Is it actually possible to access GCS via parquet_s3_fdw? If it is, could you please point out where I am mistaken in my code?
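One possible avenue, sketched here as an assumption rather than a confirmed recipe: GCS exposes an S3-compatible XML API that authenticates with HMAC keys, so with the maintainer's patch mentioned earlier in this thread (which lets the SERVER take region, endpoint, or both) a connection might be attempted by overriding the endpoint. The option name `endpoint` and the interoperability host are assumptions not verified for this FDW:

```sql
-- Sketch, assuming the patched FDW honors an 'endpoint' server option
-- and that GCS HMAC interoperability keys are supplied as user/password.
CREATE SERVER gcs_srv FOREIGN DATA WRAPPER parquet_s3_fdw
    OPTIONS (endpoint 'storage.googleapis.com');

CREATE USER MAPPING FOR CURRENT_USER SERVER gcs_srv
    OPTIONS (user '<gcs_hmac_access_key>', password '<gcs_hmac_secret>');
```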