Rebase twitter's commit onto prestodb master #248

Open
wants to merge 129 commits into
base: prestodb-twitter-master
from

Conversation


@beinan beinan commented Apr 21, 2020

No description provided.

shixuan-fan and others added 30 commits April 21, 2020 13:37
When SORTED_WRITE_TO_TEMP_PATH_ENABLED is true, require
a temporary path for sorted writes.
Soft memory limits are default memory limits given to each query that can be overridden using session properties up to the hard limit set by the existing configuration properties.

Having soft limits makes it easier to migrate a workload to lower memory limits by allowing only the queries that require higher limits to specify them while defaulting other queries to lower limits.

Available soft memory limit configuration properties:

query.soft-max-memory-per-node
query.soft-max-total-memory-per-node
query.soft-max-total-memory
query.soft-max-memory
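
A minimal sketch of how a soft limit, a session override, and a hard limit might combine. The function name and the choice to clamp an oversized override to the hard limit (rather than reject the query) are assumptions for illustration, not taken from the commit.

```python
from typing import Optional


def effective_memory_limit(
    soft_limit: int, hard_limit: int, session_override: Optional[int] = None
) -> int:
    """Return the memory limit, in bytes, that applies to a query.

    Hypothetical helper: the soft limit is the default, a session
    property may raise it, and the hard limit is never exceeded.
    """
    if session_override is None:
        # No override: the query gets the soft (default) limit.
        return min(soft_limit, hard_limit)
    # Override present: honor it, but only up to the hard limit
    # (clamping is an assumption of this sketch).
    return min(session_override, hard_limit)
```

With a soft limit of 4 and a hard limit of 16 (units arbitrary), a query without an override gets 4, a query requesting 8 gets 8, and a query requesting 32 is capped at 16.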
Add a configuration property for the compression codec used by the ORC
and DWRF storage formats. Use hive.orc_compression_codec to override
the generic compression codec for these two formats. The extra
property is needed because compression codecs are not uniformly
supported across storage formats: the ZSTD compression codec is
available only for the ORC and DWRF storage formats.
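
The override described above can be sketched as follows. The function and parameter names are hypothetical; only the idea that an ORC-specific codec takes precedence over the generic codec for ORC and DWRF comes from the commit message.

```python
from typing import Optional

# Formats covered by the ORC-specific codec property, per the commit message.
ORC_FAMILY = {"ORC", "DWRF"}


def resolve_codec(
    storage_format: str, generic_codec: str, orc_codec: Optional[str] = None
) -> str:
    """Pick the compression codec for a write (illustrative only)."""
    if storage_format.upper() in ORC_FAMILY and orc_codec is not None:
        # The ORC-specific setting overrides the generic one.
        return orc_codec
    # All other formats, or no override set: use the generic codec.
    return generic_codec
```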
We need this function in several places, and it is purely geometric.
Adds a parent abstract class to PrestoS3FileSystemMetricsCollector
so that other SDK clients can share the metrics collector support.

Adds reporting for client retry pause time indicating how long the
thread was asleep between request retries in the client itself.

Fixes the reporting of client timings. Previously, when the client
retried a request, only the first request's timings were recorded
in the stats. Now, all request timings are reported individually.
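
To illustrate the difference between recording only the first attempt and recording every attempt, here is a small sketch. `RequestStats` and its fields are invented for this example and do not mirror the actual collector classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RequestStats:
    """Hypothetical stats holder for one logical client request."""

    request_times_ms: List[float] = field(default_factory=list)
    retry_pause_ms: float = 0.0

    def record_attempts(
        self, attempt_times_ms: List[float], pause_times_ms: List[float]
    ) -> None:
        # Record every attempt individually, not just the first one.
        self.request_times_ms.extend(attempt_times_ms)
        # Time spent asleep between retries is tracked separately.
        self.retry_pause_ms += sum(pause_times_ms)
```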
Previously, a separate instance of PrestoS3FileSystemStats was
created in PrestoS3ClientFactory, which meant S3 client stats were
not reported to the instance registered with JMX. This would
only have affected PrestoS3Select clients. Now the same metric
instance is shared with PrestoS3FileSystem.
In SHOW FUNCTIONS results, list the built-in functions first, and then
the SQL functions, in alphabetical order of the qualified function
names.
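
One way to express this ordering is a two-part sort key. The tuple layout and the function names in the usage note are made up for illustration.

```python
from typing import List, Tuple


def order_functions(functions: List[Tuple[str, bool]]) -> List[Tuple[str, bool]]:
    """Sort (qualified_name, is_builtin) pairs: built-ins first, then SQL
    functions, each group alphabetically by qualified name."""
    # `not is_builtin` is False for built-ins, so they sort ahead of SQL functions.
    return sorted(functions, key=lambda f: (not f[1], f[0]))
```

For example, given the hypothetical names `unittest.memory.square` (SQL), `presto.default.abs` (built-in), `presto.default.array_sort` (built-in), and `unittest.memory.cube` (SQL), the two built-ins come first, followed by `cube` and then `square`.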
Minor variable renames
The page sink commit mechanism is a general connector capability and is
not restricted to partition commits.
It is not limited to committing lifespans or physical partitions;
it can be used to commit any page sink write.
Tasks in Spark are often retried and run speculatively, so a
commit protocol is required for table writes to avoid data corruption.

Co-authored-by: Andrii Rosa <[email protected]>
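
A generic sketch of why a commit step protects against retried or speculative tasks: each attempt writes to its own staging file, and only the first attempt to commit is published. This is a common stage-then-rename pattern chosen for illustration, not the protocol implemented in the commit; all names are hypothetical.

```python
import os
import tempfile


def write_attempt(staging_dir, task_id, attempt, rows):
    """Write one task attempt's output to its own staging file."""
    path = os.path.join(staging_dir, f"{task_id}.attempt-{attempt}")
    with open(path, "w") as f:
        f.write("\n".join(rows))
    return path


def commit(staged_path, final_dir, task_id, committed):
    """Publish a staged file unless another attempt already committed."""
    if task_id in committed:
        os.remove(staged_path)  # discard the duplicate attempt
        return False
    os.replace(staged_path, os.path.join(final_dir, task_id))
    committed.add(task_id)
    return True


def run_demo():
    """Two attempts of the same task; only one output should be published."""
    with tempfile.TemporaryDirectory() as staging, tempfile.TemporaryDirectory() as final:
        committed = set()
        first = write_attempt(staging, "task-0", 0, ["a", "b"])
        second = write_attempt(staging, "task-0", 1, ["a", "b"])  # speculative duplicate
        results = [
            commit(first, final, "task-0", committed),
            commit(second, final, "task-0", committed),
        ]
        return results, sorted(os.listdir(final))
```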
A footer consists of two parts.
 - offset of each stripe's start location.
 - footer's total size in bytes.
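
A rough sketch of a footer with these two parts. The byte layout is an assumption made for illustration (big-endian 8-byte offsets followed by a 4-byte total size that counts itself); the commit message does not specify the encoding.

```python
import struct
from typing import List


def encode_footer(stripe_offsets: List[int]) -> bytes:
    """Encode stripe start offsets followed by the footer's total size."""
    body = struct.pack(f">{len(stripe_offsets)}q", *stripe_offsets)
    total_size = len(body) + 4  # assumed: the size field counts itself
    return body + struct.pack(">i", total_size)


def decode_footer(data: bytes) -> List[int]:
    """Read the trailing size first, then walk back to the stripe offsets."""
    (total_size,) = struct.unpack(">i", data[-4:])
    body = data[-total_size:-4]
    return list(struct.unpack(f">{len(body) // 8}q", body))
```

Putting the size last lets a reader locate the footer by reading the final bytes of the file and then seeking backwards.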
TestRowBasedSerialization sometimes fails when calling
createRandomLongDecimalsBlock with fewer than 10 positions. We should
allow blocks with fewer than 10 positions to be created when
needed. This commit removes the positionCount check and adds
comments suggesting that the user choose a larger
positionCount when the desired nullRate > 0.
beinan and others added 26 commits May 8, 2020 21:15
… Parquet schema mismatch checking (twitter-forks#245)

* Compare type by (name,type) pair rather than (index,type) pair during Parquet schema mismatch checking

* add unit test for parquet schema mismatch checker
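
The idea of matching by (name, type) rather than (index, type) can be sketched as below. Column lists are modeled as (name, type) tuples, and skipping columns that are absent from the file is an assumption of this example, not a statement about the actual checker.

```python
from typing import List, Tuple


def find_mismatches(
    table_columns: List[Tuple[str, str]], file_columns: List[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (name, table_type, file_type) for columns whose types differ.

    Columns are looked up by name, so a different column order in the
    file does not by itself count as a mismatch.
    """
    file_types = dict(file_columns)
    return [
        (name, expected, file_types[name])
        for name, expected in table_columns
        if name in file_types and file_types[name] != expected
    ]
```

With index-based comparison, a file that stores `name` before `id` would be flagged even though every column has the expected type; name-based lookup reports only genuine type differences.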
@beinan beinan force-pushed the prestodb-twitter-234-druid branch from cf7de87 to 3ff4f01 Compare May 9, 2020 04:24