This transform example demonstrates how to create an Apache Beam pipeline, write a new transformation, and use it together with GBIF transforms and core classes:
- Avro schema - example-record.avsc is used to generate the target data class.
- Interpretation - ExampleInterpreter.java applies interpretation logic to the source data object and sets the result on the target object.
- ExampleTransform.java is an Apache Beam ParDo transformation that uses ExampleInterpreter.java and Interpretation.java (see the sketch after this list).
- ExamplePipeline.java is an Apache Beam pipeline that uses ExampleTransform.java as a ParDo transformation. An example Darwin Core Archive (example.zip) and example pipeline options (example.properties) are provided to run the pipeline.
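For orientation, below is a minimal sketch of how such a ParDo transformation can wrap interpretation logic. The class and field names (SketchTransform, TargetRecord, InterpretFn) and the "interpretation" itself are illustrative stand-ins, not the actual classes shipped in this module:

```java
import org.apache.beam.sdk.coders.DefaultCoder;
import org.apache.beam.sdk.coders.SerializableCoder;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

import java.io.Serializable;

/** Illustrative stand-in for the Avro-generated target class (example-record.avsc). */
@DefaultCoder(SerializableCoder.class)
class TargetRecord implements Serializable {
  String interpretedValue;
}

/**
 * Minimal sketch of a ParDo-based transform that wraps interpretation logic.
 * The types and the "interpretation" below are placeholders only.
 */
public class SketchTransform extends PTransform<PCollection<String>, PCollection<TargetRecord>> {

  @Override
  public PCollection<TargetRecord> expand(PCollection<String> input) {
    return input.apply("Interpret record", ParDo.of(new InterpretFn()));
  }

  /** DoFn that reads each source element, applies some logic, and sets data on the target object. */
  private static class InterpretFn extends DoFn<String, TargetRecord> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      TargetRecord target = new TargetRecord();
      target.interpretedValue = c.element().trim().toUpperCase(); // placeholder "interpretation"
      c.output(target);
    }
  }
}
```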
Please change BUILD_VERSION to the current project version:

```shell
java -jar target/examples-BUILD_VERSION-shaded.jar src/main/resources/example.properties
```
You can find the output files in the `output` directory.
The example uses the DirectRunner; if your dataset contains more than 1,000 records, please use a standalone Spark instance instead.
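If you do switch runners, one possible approach is to configure the Spark runner programmatically through Beam's SparkPipelineOptions. This is only a sketch: it assumes the beam-runners-spark dependency is on the classpath, and the Spark master URL is a placeholder, not a value from this project.

```java
import org.apache.beam.runners.spark.SparkPipelineOptions;
import org.apache.beam.runners.spark.SparkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class SparkRunnerSketch {
  public static void main(String[] args) {
    // Build pipeline options that target a standalone Spark master instead of the DirectRunner.
    SparkPipelineOptions options = PipelineOptionsFactory.fromArgs(args).as(SparkPipelineOptions.class);
    options.setRunner(SparkRunner.class);
    options.setSparkMaster("spark://your-spark-master:7077"); // placeholder host and port

    Pipeline pipeline = Pipeline.create(options);
    // ... apply the same transforms as in ExamplePipeline.java, then:
    pipeline.run().waitUntilFinish();
  }
}
```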