Disable legacy filesystem implementation by default #23343
Conversation
lib/trino-filesystem-manager/src/main/java/io/trino/filesystem/manager/FileSystemConfig.java (review thread, outdated and resolved)
Looks like there are a whole bunch of failures that are not false alarms @anusudarsan
Force-pushed from b650ce3 to 56a5e63 (compare)
I assume you will have to update a whole bunch of catalog properties files used for testing and stuff like that @anusudarsan
```diff
@@ -222,6 +222,9 @@ public DistributedQueryRunner build()
             hiveProperties.put("hive.metastore", "file");
             hiveProperties.put("hive.metastore.catalog.dir", queryRunner.getCoordinator().getBaseDataDir().resolve("hive_data").toString());
         }
+        if (!hiveProperties.buildOrThrow().containsKey("fs.hadoop.enabled")) {
```
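The hunk is cut off above. A plausible completion, assuming the builder simply falls back to the legacy Hadoop filesystem when a test configures nothing explicitly (the property name comes from the diff; the body of the block is a guess, not the actual change):

```java
// Hypothetical completion of the truncated hunk: when a test does not pick a
// filesystem explicitly, keep the previous behavior by enabling the legacy
// Hadoop filesystem, so existing tests continue to run unchanged.
if (!hiveProperties.buildOrThrow().containsKey("fs.hadoop.enabled")) {
    hiveProperties.put("fs.hadoop.enabled", "true");
}
```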
Should we not use the new native file systems for most of the tests?
All those tests were already migrated to native-fs (including product-tests) by explicitly setting fs.hadoop.enabled to false and enabling one of the s3/azure/gcs implementations.
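For reference, the migration pattern described above looks roughly like this in a test catalog properties file. The toggles are the standard Trino filesystem properties; the exact combination shown here is illustrative:

```properties
# Disable the legacy Hadoop filesystem and enable a native implementation
# (one of s3/azure/gcs) instead.
fs.hadoop.enabled=false
fs.native-s3.enabled=true
```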
cool
So why do we need to add this?
@electrum there are still tests with no fs.hadoop.enabled value set. When the default was changed to false, those tests failed with errors like `No factory set for location /tmp..`. Hence this change.
Should we change those tests to make sure they do have fs.hadoop.enabled set to true instead?
> Should we change those tests to make sure they do have fs.hadoop.enabled set to true instead?

We could, but there would be a lot of them. I followed the pattern in the runner where we default to the file metastore when no metastore is set.
> When the default was changed to false, tests fail with errors like `No factory set for location /tmp..`

Here is a stacktrace:
```
io.trino.spi.TrinoException: Could not read database schema
    at io.trino.plugin.hive.metastore.file.FileHiveMetastore.readFile(FileHiveMetastore.java:1406)
    at io.trino.plugin.hive.metastore.file.FileHiveMetastore.readSchemaFile(FileHiveMetastore.java:1391)
    at io.trino.plugin.hive.metastore.file.FileHiveMetastore.getDatabase(FileHiveMetastore.java:285)
    at io.trino.plugin.hive.metastore.file.FileHiveMetastore.createDatabase(FileHiveMetastore.java:198)
    at io.trino.metastore.tracing.TracingHiveMetastore.lambda$createDatabase$9(TracingHiveMetastore.java:173)
    at io.trino.metastore.tracing.Tracing.lambda$withTracing$0(Tracing.java:31)
    at io.trino.metastore.tracing.Tracing.withTracing(Tracing.java:39)
    at io.trino.metastore.tracing.Tracing.withTracing(Tracing.java:30)
    at io.trino.metastore.tracing.TracingHiveMetastore.createDatabase(TracingHiveMetastore.java:173)
    at io.trino.plugin.hive.metastore.cache.CachingHiveMetastore.createDatabase(CachingHiveMetastore.java:572)
    at io.trino.plugin.deltalake.TestDeltaLakeProjectionPushdownPlans.createPlanTester(TestDeltaLakeProjectionPushdownPlans.java:119)
```
It looks to me like we need a mapping for io.trino.filesystem.local.LocalFileSystemFactory in FileSystemModule, or an extension of FileSystemModule that we use for testing purposes, in order to cope with local fs locations.
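A minimal sketch of what such a testing extension could look like, assuming FileSystemModule dispatches factories by location scheme through a Guice MapBinder. The binding key, module shape, and root path are assumptions for illustration, not the actual Trino wiring:

```java
import com.google.inject.Binder;
import com.google.inject.Module;
import io.trino.filesystem.TrinoFileSystemFactory;
import io.trino.filesystem.local.LocalFileSystemFactory;

import java.nio.file.Path;

import static com.google.inject.multibindings.MapBinder.newMapBinder;

// Test-only module that registers a factory for local locations, so paths
// like /tmp/... resolve during tests instead of failing with
// "No factory set for location".
public class TestingLocalFileSystemModule
        implements Module
{
    @Override
    public void configure(Binder binder)
    {
        newMapBinder(binder, String.class, TrinoFileSystemFactory.class)
                .addBinding("local") // assumed scheme key
                .toInstance(new LocalFileSystemFactory(Path.of("/")));
    }
}
```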
Supporting local fs would be cool anyway
Force-pushed from 56a5e63 to f5eed0b (compare)
Force-pushed from f5eed0b to 0aae449 (compare)
Release note edit suggested. Reason being, we should make it clear that there is a choice between multiple different options. As written now, it sounds like we expect users to simply re-enable the Hadoop FS they were previously using.
Force-pushed from 1bf8df1 to 8f8ab99 (compare)
@anusudarsan @electrum I'm wondering if, for a lot of the tests, we should default to another filesystem instead of the Hadoop one, to make the later removal and migration easier. But maybe that can also be a follow-up pull request.
Force-pushed from a8b6215 to 7c60795 (compare)
/test-with-secrets sha=7c60795d814d6b7d1ec2e1d3f0a4118e491d2c98
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/10838968944
Force-pushed from 7c60795 to 13ff61e (compare)
/test-with-secrets sha=13ff61e6f20bb1c208d9dae0d72a55a677335da2
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/10840306232
Force-pushed from 13ff61e to 58f28d1 (compare)
/test-with-secrets sha=58f28d1e02179c4b0c4e3b4ae37192cd94b27e7f
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/10842020393
Force-pushed from 58f28d1 to 494c0ae (compare)
/test-with-secrets sha=494c0ae41b2cca6ea43d59c445027f56d323b900
The CI workflow run with tests that require additional secrets has been started: https://github.com/trinodb/trino/actions/runs/10851945727
This means we disable all filesystem implementations by default, forcing users to explicitly configure either the legacy or the native filesystem implementation.
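Put differently, a catalog that previously relied on the implicit default now has to opt in to one of the implementations. An illustrative snippet of the two choices:

```properties
# Nothing is enabled by default anymore; pick one explicitly.
# Either keep the legacy behavior:
fs.hadoop.enabled=true
# ...or switch to a native implementation, e.g.:
# fs.native-s3.enabled=true
```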
Force-pushed from 494c0ae to 22e84c4 (compare)
rebased the PR
CI is good. The failure is a flaky false alarm.
Nice work down the rabbit hole.
@anusudarsan @mosabua Could you confirm the CI failure? https://github.com/trinodb/trino/actions/runs/10855433827/job/30128033072
Dang .. that passed on the PR in the runs with secrets
Description
Additional context and related issues
Docs must be ready and merged at the same time so the change lands in a release with the required docs. This is a breaking change for most users.
#23366
Release notes
(x) Release notes are required, with the following suggested text:
and link to migration guides