Transform example

The Transform example demonstrates how to create an Apache Beam pipeline, define a new transformation, and use it together with GBIF transforms and core classes.

  1. Avro schema: example-record.avsc is used to generate the target data class.
  2. Interpretation: the ExampleInterpreter.java class applies some logic to the source data object and sets the result on the target object.
  3. ExampleTransform.java is an Apache Beam ParDo transformation that uses ExampleInterpreter.java and Interpretation.java.
  4. ExamplePipeline.java is an Apache Beam pipeline that uses ExampleTransform.java as a ParDo transformation. You can also find an example Darwin Core Archive, example.zip, and an example of pipeline options, example.properties, to run the pipeline.
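The interpretation step in item 2 can be illustrated with a minimal plain-Java sketch. It is not the actual ExampleInterpreter.java: the class InterpretationSketch, the ExampleRecord fields, the "scientificName" term and the issue label are all hypothetical, and the sketch deliberately avoids Beam and Avro dependencies so it runs on its own. In a Beam pipeline, logic like this would typically be called from the DoFn that a ParDo transformation applies to each element.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InterpretationSketch {

  /** Hypothetical stand-in for the target class generated from an Avro schema. */
  public static final class ExampleRecord {
    public String id;
    public String scientificName;
    public final List<String> issues = new ArrayList<>();
  }

  /**
   * Reads a raw value from the source terms, applies some logic to it,
   * and sets the result on a new target object (assumed field names).
   */
  public static ExampleRecord interpret(String id, Map<String, String> rawTerms) {
    ExampleRecord target = new ExampleRecord();
    target.id = id;

    String raw = rawTerms.get("scientificName");
    if (raw == null || raw.trim().isEmpty()) {
      // Record a problem instead of failing the whole record.
      target.issues.add("SCIENTIFIC_NAME_MISSING");
    } else {
      target.scientificName = raw.trim();
    }
    return target;
  }

  public static void main(String[] args) {
    Map<String, String> terms = new HashMap<>();
    terms.put("scientificName", "  Puma concolor ");

    ExampleRecord record = interpret("1", terms);
    System.out.println(record.id + " " + record.scientificName + " " + record.issues);
  }
}
```

Keeping the interpretation logic in a plain method like this, separate from the transformation that calls it, makes it testable without starting a pipeline.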

How to run:

Please change BUILD_VERSION to the current project version

java -jar target/examples-BUILD_VERSION-shaded.jar src/main/resources/example.properties

You can find the output files in the output directory.

Spark standalone:

The example uses DirectRunner. If your dataset contains more than 1000 records, please use a Spark standalone instance.