Skip to content

Commit

Permalink
Merge pull request #18 from jrnd-io/0.4.0-dev
Browse files Browse the repository at this point in the history
0.4.0 dev
  • Loading branch information
hifly81 authored Sep 30, 2024
2 parents 14f3cc9 + d874fd8 commit 07e6ffe
Show file tree
Hide file tree
Showing 12 changed files with 274 additions and 42 deletions.
139 changes: 124 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,22 +63,22 @@ tear-down.sh

JR Source Connector can be configured with:

Parameter | Description | Default
-|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-
`template` | A valid JR existing template name. Skipped when __embedded_template_ is set. For a list of available templates see: https://jrnd.io/docs/#listing-existing-templates | net_device
`embedded_template` | Location of a file containing a valid custom JR template. This property will take precedence over _template_. File must exist on Kafka Connect Worker nodes. |
`topic` | destination topic on Kafka |
`frequency` | Repeat the creation of a random object every 'frequency' milliseconds. | 5000
Parameter | Description | Default
-|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-
`template` | A valid JR existing template name. Skipped when __embedded_template_ is set. For a list of available templates see: https://jrnd.io/docs/#listing-existing-templates | net_device
`embedded_template` | Location of a file or URL, containing a valid custom JR template. This property will take precedence over _template_. File must exist on Kafka Connect Worker nodes. |
`topic` | destination topic on Kafka |
`frequency` | Repeat the creation of a random object every 'frequency' milliseconds. | 5000
`duration` | Set a time bound to the entire object creation. The duration is calculated starting from the first run and is expressed in milliseconds. At least one run will always been scheduled, regardless of the value for 'duration'. If not set creation will run forever. | -1
`objects` | Number of objects to create at every run. | 1
`key_field_name` | Name for key field, for example 'ID'. This is an _OPTIONAL_ config, if not set, objects will be created without a key. Skipped when _key_embedded_template_ is set. Value for key will be calculated using JR function _key_, https://jrnd.io/docs/functions/#key |
`key_value_interval_max` | Maximum interval value for key value, for example 150 (0 to key_value_interval_max). Skipped when _key_embedded_template_ is set. | 100
`key_embedded_template` | Location of a file containing a valid custom JR template for keys. This property will take precedence over _key_field_name_ and _key_value_interval_max_. File must exist on Kafka Connect Worker nodes. |
`jr_executable_path` | Location for JR executable on workers. If not set, jr executable will be searched using $PATH variable. |
`value.converter` | one between _org.apache.kafka.connect.storage.StringConverter_, _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_ |org.apache.kafka.connect.storage.StringConverter
`value.converter.schema.registry.url` | Only if _value.converter_ is set to _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_. URL for _Schema Registry._ |
`key.converter` | one between _org.apache.kafka.connect.storage.StringConverter_, _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_ |org.apache.kafka.connect.storage.StringConverter
`key.converter.schema.registry.url` | Only if _key.converter_ is set to _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_. URL for _Schema Registry._ |
`objects` | Number of objects to create at every run. | 1
`key_field_name` | Name for key field, for example 'ID'. This is an _OPTIONAL_ config, if not set, objects will be created without a key. Skipped when _key_embedded_template_ is set. Value for key will be calculated using JR function _key_, https://jrnd.io/docs/functions/#key |
`key_value_interval_max` | Maximum interval value for key value, for example 150 (0 to key_value_interval_max). Skipped when _key_embedded_template_ is set. | 100
`key_embedded_template` | Location of a file or URL, containing a valid custom JR template for keys. This property will take precedence over _key_field_name_ and _key_value_interval_max_. File must exist on Kafka Connect Worker nodes. |
`jr_executable_path` | Location for JR executable on workers. If not set, jr executable will be searched using $PATH variable. |
`value.converter` | one between _org.apache.kafka.connect.storage.StringConverter_, _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_ |org.apache.kafka.connect.storage.StringConverter
`value.converter.schema.registry.url` | Only if _value.converter_ is set to _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_. URL for _Schema Registry._ |
`key.converter` | one between _org.apache.kafka.connect.storage.StringConverter_, _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_ |org.apache.kafka.connect.storage.StringConverter
`key.converter.schema.registry.url` | Only if _key.converter_ is set to _io.confluent.connect.avro.AvroConverter_, _io.confluent.connect.json.JsonSchemaConverter_ or _io.confluent.connect.protobuf.ProtobufConverter_. URL for _Schema Registry._ |


## Format
Expand Down Expand Up @@ -327,6 +327,25 @@ In this example a JR connector job with a custom template for values will be ins

Template definition is loaded from file _/tmp/customer-template.json_ existing on Kafka Connect Worker nodes.

Definition for _customer-template.json_:

```
{
"customer_id": "{{uuid}}",
"first_name": "{{name}}",
"last_name": "{{surname}}",
"email": "{{email}}",
"phone_number": "{{phone}}",
"street_address": "{{city}}, {{street}} {{building 2}}, {{zip}}",
"state": "{{state}}",
"zip_code": "{{zip}}",
"country": "United States",
"country_code": "US"
}
```

Connector job:

```
{
"name" : "jr-avro-custom-quickstart",
Expand Down Expand Up @@ -366,6 +385,68 @@ kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic customer -
{"customer_id":"a57911e5-dc9e-4da4-b280-1c0b0143538e","first_name":"Charles","last_name":"Thompson","email":"[email protected]","phone_number":"726 39040449","street_address":"Richmond, Hillcrest Road 6, 43215","state":"Indiana","zip_code":"43215","country":"United States","country_code":"US"}
```

In this second example a JR connector job with a custom template for values will be instantiated and produce 5 new random messages to _customer_ topic every 5 seconds, using the _Confluent Schema Registry_ to register the _Avro_ schema.

Template definition is loaded from URL _http://web/job-template.json_ .

Definition for _job-template.json_:

```
{
"job_title": "{{randoms "Software Engineer|Data Scientist|DevOps Engineer|Product Manager|UI/UX Designer"}}",
"job_department": "{{randoms "Engineering|Data Science|Operations|Product|Design"}}",
"job_location": "{{randoms "New York|San Francisco|Remote|London|Berlin"}}",
"job_description": "{{randoms "Join our innovative team to build cutting-edge software solutions|Lead projects in developing next-gen data products|Help design and implement cloud infrastructure for our services|Collaborate with cross-functional teams to enhance our product line|Design user-centered interfaces for our web and mobile applications"}}",
"required_skills": [
"{{randoms "Python|Java|JavaScript|AWS|Docker"}}",
"{{randoms "Agile methodologies|SQL|React|Node.js|Machine Learning"}}",
"{{randoms "Kubernetes|Figma|Project Management|Data Visualization|Microservices Architecture"}}"],
"salary": "{{randoms "$80,000 - $100,000|$100,000 - $120,000|$120,000 - $150,000"}}",
"experience_level": "{{randoms "Entry Level|Mid Level|Senior Level"}}"
}
```

Connector job:

```
{
"name" : "jr-avro-job-template-quickstart",
"config": {
"connector.class" : "io.jrnd.kafka.connect.connector.JRSourceConnector",
"embedded_template" : "http://web/job-template.json",
"topic": "jobs",
"frequency" : 5000,
"objects": 5,
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"tasks.max": 1
}
}
```

Show the _Avro_ schema registered for value:

```
curl -v http://localhost:8081/subjects/jobs-value/versions/1/schema
< HTTP/1.1 200 OK
< Content-Type: application/vnd.schemaregistry.v1+json
{"type":"record","name":"recordvalueRecord","fields":[{"name":"job_title","type":"string"},{"name":"job_department","type":"string"},{"name":"job_location","type":"string"},{"name":"job_description","type":"string"},{"name":"required_skills","type":{"type":"array","items":"string"}},{"name":"salary","type":"string"},{"name":"experience_level","type":"string"}],"connect.name":"recordvalueRecord"}
```

Consume from _jobs_ topic:

```
kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic jobs --from-beginning --property schema.registry.url=http://localhost:8081
{"job_title":"UI/UX Designer","job_department":"Data Science","job_location":"Remote","job_description":"Collaborate with cross-functional teams to enhance our product line","required_skills":["JavaScript","Node.js","Figma"],"salary":"$120,000 - $150,000","experience_level":"Senior Level"}
{"job_title":"DevOps Engineer","job_department":"Design","job_location":"New York","job_description":"Help design and implement cloud infrastructure for our services","required_skills":["JavaScript","Machine Learning","Figma"],"salary":"$120,000 - $150,000","experience_level":"Senior Level"}
{"job_title":"DevOps Engineer","job_department":"Product","job_location":"London","job_description":"Lead projects in developing next-gen data products","required_skills":["JavaScript","Node.js","Microservices Architecture"],"salary":"$100,000 - $120,000","experience_level":"Mid Level"}
{"job_title":"Product Manager","job_department":"Engineering","job_location":"Remote","job_description":"Collaborate with cross-functional teams to enhance our product line","required_skills":["Java","Machine Learning","Project Management"],"salary":"$80,000 - $100,000","experience_level":"Mid Level"}
{"job_title":"UI/UX Designer","job_department":"Product","job_location":"London","job_description":"Design user-centered interfaces for our web and mobile applications","required_skills":["JavaScript","Machine Learning","Data Visualization"],"salary":"$80,000 - $100,000","experience_level":"Entry Level"}
```

#### Custom templates for keys

In this example a JR connector job using a custom template for values will be instantiated and produce 5 new random messages to _customer_full_ topic every 5 seconds, using the _Confluent Schema Registry_ to register the _Avro_ schema. Message keys are also created using a custom template, using the _Confluent Schema Registry_ to register the _Avro_ schema.
Expand All @@ -374,6 +455,34 @@ Template definition is loaded from file _/tmp/customer-template.json_ existing o

Key Template definition is loaded from file _/tmp/key-customer-template.json_ existing on Kafka Connect Worker nodes.

Definition for _customer-template.json_:

```
{
"customer_id": "{{uuid}}",
"first_name": "{{name}}",
"last_name": "{{surname}}",
"email": "{{email}}",
"phone_number": "{{phone}}",
"street_address": "{{city}}, {{street}} {{building 2}}, {{zip}}",
"state": "{{state}}",
"zip_code": "{{zip}}",
"country": "United States",
"country_code": "US"
}
```

Definition for _key-customer-template.json_:

```
{
"customer_id": "{{uuid}}",
"last_name": "{{surname}}"
}
```

Connector job:

```
{
"name" : "jr-avro-custom-full-quickstart",
Expand Down
6 changes: 3 additions & 3 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
<modelVersion>4.0.0</modelVersion>

<groupId>io.jrnd</groupId>
<version>0.3.0</version>
<version>0.4.0</version>
<artifactId>jr-kafka-connect-source</artifactId>
<packaging>jar</packaging>

Expand All @@ -15,9 +15,9 @@
<kafka.version>3.8.0</kafka.version>
<avro.version>1.11.3</avro.version>
<jackson.version>2.13.4.2</jackson.version>
<protobuf.version>3.25.4</protobuf.version>
<protobuf.version>3.25.5</protobuf.version>
<slf4j.version>1.7.15</slf4j.version>
<junit.version>5.8.2</junit.version>
<junit.version>5.10.0</junit.version>
<mockito.version>5.0.0</mockito.version>
<curator.version>5.0.0</curator.version>
</properties>
Expand Down
2 changes: 1 addition & 1 deletion quickstart/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ RUN CGO_ENABLED=1 GOOS=linux go build -tags static_all -v -ldflags="-X 'github.c

FROM confluentinc/cp-server-connect:7.7.1

ARG JR_SOURCE_CONNECTOR_VERSION=0.3.0
ARG JR_SOURCE_CONNECTOR_VERSION=0.4.0
ARG JR_PACKAGE_NAME=jrndio-jr-kafka-connect-source

COPY --from=builder /tmp/jr-main/templates/ /home/appuser/.jr/templates/
Expand Down
2 changes: 1 addition & 1 deletion quickstart/Dockerfile-arm64
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ RUN CGO_ENABLED=1 GOOS=linux go build -tags static_all -v -ldflags="-X 'github.c

FROM confluentinc/cp-server-connect:7.7.1

ARG JR_SOURCE_CONNECTOR_VERSION=0.3.0
ARG JR_SOURCE_CONNECTOR_VERSION=0.4.0
ARG JR_PACKAGE_NAME=jrndio-jr-kafka-connect-source

COPY --from=builder /tmp/jr-main/templates/ /home/appuser/.jr/templates/
Expand Down
2 changes: 1 addition & 1 deletion quickstart/build-image.sh
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

DOCKERFILE=quickstart/Dockerfile
IMAGE_NAME=jrndio/kafka-connect-demo-image
IMAGE_VERSION=0.3.0
IMAGE_VERSION=0.4.0

if [[ $(uname -m) == 'arm64' ]]; then
DOCKERFILE=quickstart/Dockerfile-arm64
Expand Down
12 changes: 12 additions & 0 deletions quickstart/config/html/job-template.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"job_title": "{{randoms "Software Engineer|Data Scientist|DevOps Engineer|Product Manager|UI/UX Designer"}}",
"job_department": "{{randoms "Engineering|Data Science|Operations|Product|Design"}}",
"job_location": "{{randoms "New York|San Francisco|Remote|London|Berlin"}}",
"job_description": "{{randoms "Join our innovative team to build cutting-edge software solutions|Lead projects in developing next-gen data products|Help design and implement cloud infrastructure for our services|Collaborate with cross-functional teams to enhance our product line|Design user-centered interfaces for our web and mobile applications"}}",
"required_skills": [
"{{randoms "Python|Java|JavaScript|AWS|Docker"}}",
"{{randoms "Agile methodologies|SQL|React|Node.js|Machine Learning"}}",
"{{randoms "Kubernetes|Figma|Project Management|Data Visualization|Microservices Architecture"}}"],
"salary": "{{randoms "$80,000 - $100,000|$100,000 - $120,000|$120,000 - $150,000"}}",
"experience_level": "{{randoms "Entry Level|Mid Level|Senior Level"}}"
}
14 changes: 14 additions & 0 deletions quickstart/config/jr-source.avro.job-template.quickstart.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
{
"name" : "jr-avro-job-template-quickstart",
"config": {
"connector.class" : "io.jrnd.kafka.connect.connector.JRSourceConnector",
"embedded_template" : "http://web/job-template.json",
"topic": "jobs",
"frequency" : 5000,
"objects": 5,
"duration": 150000,
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"tasks.max": 1
}
}
13 changes: 11 additions & 2 deletions quickstart/docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ services:
SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9092'

connect:
image: jrndio/kafka-connect-demo-image:0.3.0
image: jrndio/kafka-connect-demo-image:0.4.0
hostname: connect
container_name: connect
depends_on:
Expand Down Expand Up @@ -83,4 +83,13 @@ services:
CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
CONTROL_CENTER_STREAMS_NUM_STREAM_THREADS: 1
CONTROL_CENTER_CONSUMER_FETCH_MAX_BYTES: 52428800
CONTROL_CENTER_CONSUMER_MAX_POLL_RECORDS: 500
CONTROL_CENTER_CONSUMER_MAX_POLL_RECORDS: 500

web:
image: nginx:alpine
hostname: web
container_name: web
ports:
- "8080:80"
volumes:
- ./config/html:/usr/share/nginx/html:ro
2 changes: 1 addition & 1 deletion src/assembly/manifest.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name" : "jr-source-connector",
"version" : "0.3.0",
"version" : "0.4.0",
"title" : "JR Source Connector",
"description" : "A Kafka Connector for JR, the leading streaming quality data generator.",
"logo" : "assets/jr.png",
Expand Down
Loading

0 comments on commit 07e6ffe

Please sign in to comment.