Athena Tools

A set of tools to accomplish the following:

Creation of Apache ORC files
Inferring a schema from JSON
A Lambda function that encodes S3 objects as ORC, triggered by S3 notifications
Simple Athena SQL execution from the command line

ORC Encoding via AWS Lambda

The AWS-prescribed method for encoding S3 data into an efficient data format for Athena is awful. Rarely does one want the first instruction of anything to involve creating a Hadoop cluster.

ORC S3 Notification Encoder

Environment Variable	Description
DESTINATION_S3_BUCKET	Bucket where the ORC files will be stored.
DESTINATION_S3_PREFIX	Prefix to add to the S3 key.
PARTITION_BY	Optional function to partition rows by. Evaluated as Clojure code.
PARTITION_KEY	Name of the variable used as part of the partition.
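
To make the variables above concrete, here is a minimal sketch of how an encoder might read them and derive a destination key. It is not taken from this project's source. The names `load-config` and `destination-key`, the shape of a row (a map with a `"timestamp"` entry), and the Hive-style `key=value/` layout are all assumptions made for illustration.

```clojure
(ns example.encoder-config
  (:require [clojure.string :as str]))

(defn load-config
  "Builds a config map from a map of environment variables.
  Hypothetical helper: pass (System/getenv) in a real Lambda,
  or a plain map when experimenting at the REPL."
  [env]
  (let [partition-src (get env "PARTITION_BY")]
    {:bucket        (get env "DESTINATION_S3_BUCKET")
     :prefix        (get env "DESTINATION_S3_PREFIX" "")
     ;; PARTITION_BY is optional. When set, the string is read and
     ;; evaluated as Clojure code, so it must come from a trusted source.
     :partition-fn  (when-not (str/blank? partition-src)
                      (eval (read-string partition-src)))
     :partition-key (get env "PARTITION_KEY")}))

(defn destination-key
  "Assumed layout: <prefix><partition-key>=<value>/<file>.orc when
  partitioning is configured, otherwise <prefix><file>.orc."
  [{:keys [prefix partition-fn partition-key]} source-key row]
  (let [file (str (last (str/split source-key #"/")) ".orc")]
    (if (and partition-fn partition-key)
      (str prefix partition-key "=" (partition-fn row) "/" file)
      (str prefix file))))

;; Example values only; the bucket name and row shape are made up.
(def config
  (load-config
   {"DESTINATION_S3_BUCKET" "my-orc-bucket"
    "DESTINATION_S3_PREFIX" "events/"
    "PARTITION_BY"          "(fn [row] (subs (get row \"timestamp\") 0 10))"
    "PARTITION_KEY"         "dt"}))

(destination-key config
                 "raw/2017/file.json"
                 {"timestamp" "2017-03-01T12:00:00Z"})
;; => "events/dt=2017-03-01/file.json.orc"
```

The `key=value/` path segment follows the Hive partition convention that Athena can read. Whether this project lays out keys the same way is not stated here.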

Usage

FIXME

License

Distributed under the Eclipse Public License, either version 1.0 or (at your option) any later version.
