mario-renau-a/Avro-test


To run the tests, run sbt test. If you are working on improvements that target speed, you can generate sample Avro files and measure how long it takes to read them using the following commands:

sbt "test:runMain com.mastria.spark.avro.AvroFileGenerator 10000 2"

This creates sample Avro files in target/avroForBenchmark/. You can specify the number of records per file, as well as the total number of files.

sbt "test:runMain com.mastria.spark.avro.AvroReadBenchmark"

This runs count() on the data in target/avroForBenchmark/ and reports how long the operation took.

Similarly, you can benchmark how long it takes to write a DataFrame as an Avro file with

sbt "test:runMain com.mastria.spark.avro.AvroWriteBenchmark NUMBER_OF_ROWS"

where NUMBER_OF_ROWS is an optional parameter that sets the number of rows in the DataFrame to be written.
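
The read and write benchmarks above each report the elapsed time of a single operation. For ad hoc measurements of your own changes, the same idea can be sketched in plain Scala. This is only an illustration, not the code used by AvroReadBenchmark or AvroWriteBenchmark: the object name TimingSketch and the helper timed are hypothetical, and the summed range is a stand-in for a real workload such as a count() over the generated Avro files.

```scala
object TimingSketch {
  /** Runs `block` once and returns its result together with the elapsed wall-clock milliseconds. */
  def timed[T](block: => T): (T, Long) = {
    val start = System.nanoTime()
    val result = block
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in workload (assumption): replace with the operation you want to measure,
    // for example a count() over the data in target/avroForBenchmark/.
    val (total, ms) = timed { (1L to 1000000L).sum }
    println(s"sum=$total took $ms ms")
  }
}
```

A single run like this includes JVM warm-up, so it is worth repeating the measurement a few times before comparing numbers.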

Schema Registry

Download: https://github.com/renukaradhya/confluentplatform

Start Zookeeper

.\zookeeper-server-start.bat ..\..\etc\kafka\zookeeper.properties

Start Kafka Server

.\kafka-server-start.bat ..\..\etc\kafka\server.properties

Start Schema Registry

.\schema-registry-start.bat ..\..\etc\schema-registry\schema-registry.properties
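
Once the three services are running, one way to confirm that Schema Registry is reachable is to request its list of subjects over HTTP. The sketch below assumes the registry listens on localhost at its default port 8081; if the listeners setting in schema-registry.properties differs, adjust the base URL. The object and method names are hypothetical and not part of this repository.

```scala
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object SchemaRegistryCheck {
  // Assumption: default Schema Registry address; change it to match schema-registry.properties.
  val DefaultBaseUrl = "http://localhost:8081"

  /** Builds the URL of the subjects listing endpoint for a given base URL. */
  def subjectsUrl(baseUrl: String): String = baseUrl.stripSuffix("/") + "/subjects"

  /** Issues GET /subjects and returns the raw JSON response body. */
  def listSubjects(baseUrl: String = DefaultBaseUrl): String = {
    val conn = new URL(subjectsUrl(baseUrl)).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("GET")
    conn.setConnectTimeout(2000)
    conn.setReadTimeout(2000)
    try {
      val src = Source.fromInputStream(conn.getInputStream, "UTF-8")
      try src.mkString finally src.close()
    } finally conn.disconnect()
  }

  def main(args: Array[String]): Unit = {
    // Throws an exception (for example a connection error) if the registry is not reachable.
    println(listSubjects())
  }
}
```

A freshly started registry with no registered schemas should answer with an empty JSON array.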
