Added Schema less option #25

Open
wants to merge 14 commits into
base: master
from
24 changes: 22 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
Not sure, but this connector might not work with Avro

### Introduction

The Kafka-Kinesis-Connector is a connector to be used with [Kafka Connect](https://kafka.apache.org/documentation/#connect) to publish messages from Kafka to [Amazon Kinesis Streams](https://aws.amazon.com/kinesis/streams/) or [Amazon Kinesis Firehose](https://aws.amazon.com/kinesis/firehose/).
@@ -16,7 +18,9 @@ You can build the project by running "maven package" and it will build amazon-ki

1. Make sure you create a delivery stream in AWS Console/CLI/SDK – See more details [here](http://docs.aws.amazon.com/firehose/latest/dev/basic-create.html) and configure destination.

2. Connector uses [DefaultAWSCredentialsProviderChain](http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html) for authentication. It looks for credentials in the following order: environment variables, Java system properties, the credentials profile file at the default location (~/.aws/credentials), credentials delivered through the Amazon EC2 container service, and instance profile credentials delivered through the Amazon EC2 metadata service. Make sure the user has at least permission to list streams/delivery streams, describe streams/delivery streams, and put records to a stream/delivery stream.
2. If you specify neither the AWS user key nor the AWS secret key, the connector uses [DefaultAWSCredentialsProviderChain](http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html) for authentication. It looks for credentials in the following order: environment variables, Java system properties, the credentials profile file at the default location (~/.aws/credentials), credentials delivered through the Amazon EC2 container service, and instance profile credentials delivered through the Amazon EC2 metadata service. Make sure the user has at least permission to list streams/delivery streams, describe streams/delivery streams, and put records to a stream/delivery stream.

3. If you don't specify a Kinesis host or a CloudWatch host, the default AWS endpoint is used.
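
The credential rule in step 2 can be sketched as a small decision function. This is a minimal illustration, not the connector's code: the class `CredentialChoice` and the enum `Source` are hypothetical names, and treating "only one of the two values set" as a fallback to the default chain is an assumption, since the README does not say what happens in that case.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; it is not part of the connector. */
public class CredentialChoice {

    /** The two credential sources described in step 2. */
    public enum Source { STATIC_KEYS, DEFAULT_CHAIN }

    /**
     * Picks STATIC_KEYS only when both awsKey and awsSecret are present and non-blank.
     * Assumption: if just one of them is set, fall back to the default provider chain.
     */
    public static Source choose(Map<String, String> props) {
        String key = props.get("awsKey");
        String secret = props.get("awsSecret");
        boolean hasKey = key != null && !key.trim().isEmpty();
        boolean hasSecret = secret != null && !secret.trim().isEmpty();
        return (hasKey && hasSecret) ? Source.STATIC_KEYS : Source.DEFAULT_CHAIN;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        System.out.println(choose(props)); // no keys configured

        props.put("awsKey", "EXAMPLE_KEY");       // placeholder value
        props.put("awsSecret", "EXAMPLE_SECRET"); // placeholder value
        System.out.println(choose(props)); // both keys configured
    }
}
```

With the AWS SDK for Java 1.x, `STATIC_KEYS` would typically map to an `AWSStaticCredentialsProvider` wrapping `BasicAWSCredentials`, and `DEFAULT_CHAIN` to `DefaultAWSCredentialsProviderChain`; whether this pull request wires it that way is not visible in the hunks shown here.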

### Running a Connector

@@ -39,6 +43,13 @@ You can build the project by running "maven package" and it will build amazon-ki
| connector.class | Class for Amazon Kinesis Firehose Connector | com.amazon.kinesis.kafka.FirehoseSinkConnector |
| topics | Kafka topics from where you want to consume messages. It can be single topic or comma separated list of topics | - |
| region| Specify region of your Kinesis Firehose | - |
| awsKey | AWS user key.| - |
| awsSecret | AWS user secret key.| - |
| awsKinesisHost | AWS Kinesis host. (optional, must go with awsKinesisPort)| localhost |
| awsKinesisPort | AWS Kinesis port. (optional, must go with awsKinesisHost)| 7700 |
| awsCloudWatchHost | AWS CloudWatch host. (optional, must go with awsCloudWatchPort)| localhost |
| awsCloudWatchPort | AWS CloudWatch port. (optional, must go with awsCloudWatchHost)| 7701 |
| awsValidateCertificate | Should certificate be validated| true |
| batch | Connector batches messages before sending to Kinesis Firehose (true/false) | true |
| batchSize | Number of messages to be batched together. Firehose accepts at max 500 messages in one batch. | 500 |
| batchSizeInBytes | Message size in bytes when batched together. Firehose accepts at max 4MB in one batch. | 3670016 |
@@ -49,7 +60,9 @@ You can build the project by running "maven package" and it will build amazon-ki

1. Make sure you create Kinesis stream in AWS Console/CLI/SDK – See more details [here](http://docs.aws.amazon.com/streams/latest/dev/learning-kinesis-module-one-create-stream.html).

2. Connector uses [DefaultAWSCredentialsProviderChain](http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html) for authentication. It looks for credentials in the following order: environment variables, Java system properties, the credentials profile file at the default location (~/.aws/credentials), credentials delivered through the Amazon EC2 container service, and instance profile credentials delivered through the Amazon EC2 metadata service. Make sure the user has at least permission to list streams/delivery streams, describe streams/delivery streams, and put records to a stream/delivery stream.
2. If you specify neither the AWS user key nor the AWS secret key, the connector uses [DefaultAWSCredentialsProviderChain](http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html) for authentication. It looks for credentials in the following order: environment variables, Java system properties, the credentials profile file at the default location (~/.aws/credentials), credentials delivered through the Amazon EC2 container service, and instance profile credentials delivered through the Amazon EC2 metadata service. Make sure the user has at least permission to list streams/delivery streams, describe streams/delivery streams, and put records to a stream/delivery stream.

3. If you don't specify a Kinesis host or a CloudWatch host, the default AWS endpoint is used.
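
The configuration table for this connector marks each host option as "must go with" its port option. One way to enforce that pairing is sketched below. The class `EndpointOverride` and its `resolve` method are hypothetical, and rejecting a half-configured pair with an exception is an assumption; the diff does not show how the task actually handles that case.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper for illustration only; it is not part of the connector. */
public class EndpointOverride {

    /**
     * Returns "host:port" when both options are set, and an empty Optional when
     * neither is set (meaning: keep the default endpoint).
     * Assumption: setting only one of the two is treated as a configuration error.
     */
    public static Optional<String> resolve(Map<String, String> props, String hostKey, String portKey) {
        String host = props.get(hostKey);
        String port = props.get(portKey);
        if (host == null && port == null) {
            return Optional.empty();
        }
        if (host == null || port == null) {
            throw new IllegalArgumentException(hostKey + " and " + portKey + " must be set together");
        }
        // Integer.parseInt throws NumberFormatException for non-numeric input.
        int parsed = Integer.parseInt(port.trim());
        if (parsed < 1 || parsed > 65535) {
            throw new IllegalArgumentException(portKey + " out of range: " + parsed);
        }
        return Optional.of(host.trim() + ":" + parsed);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("awsKinesisHost", "localhost");
        props.put("awsKinesisPort", "7700");
        System.out.println(resolve(props, "awsKinesisHost", "awsKinesisPort"));       // Optional[localhost:7700]
        System.out.println(resolve(props, "awsCloudWatchHost", "awsCloudWatchPort")); // Optional.empty
    }
}
```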

### Running a Connector

@@ -73,6 +86,13 @@ You can build the project by running "maven package" and it will build amazon-ki
| topics | Kafka topics from where you want to consume messages. It can be single topic or comma separated list of topics | - |
| region| Specify region of your Kinesis Firehose | - |
| streamName | Kinesis Stream Name.| - |
| awsKey | AWS user key.| - |
| awsSecret | AWS user secret key.| - |
| awsKinesisHost | AWS Kinesis host. (optional, must go with awsKinesisPort)| localhost |
| awsKinesisPort | AWS Kinesis port. (optional, must go with awsKinesisHost)| 7700 |
| awsCloudWatchHost | AWS CloudWatch host. (optional, must go with awsCloudWatchPort)| localhost |
| awsCloudWatchPort | AWS CloudWatch port. (optional, must go with awsCloudWatchHost)| 7701 |
| awsValidateCertificate | Should certificate be validated| true |
| usePartitionAsHashKey | Using Kafka partition key as hash key for Kinesis streams. | false |
| maxBufferedTime | Maximum amount of time (milliseconds) a record may spend being buffered before it gets sent. Records may be sent sooner than this depending on the other buffering limits. Range: [100..... 9223372036854775807] | 15000 |
| maxConnections | Maximum number of connections to open to the backend. HTTP requests are sent in parallel over multiple connections. Range: [1...256]. | 24 |
76 changes: 76 additions & 0 deletions dependency-reduced-pom.xml
@@ -0,0 +1,76 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-kafka-connecter</artifactId>
<name>amazon-kinesis-kafka-connecter</name>
<version>0.0.9-SNAPSHOT</version>
<url>http://maven.apache.org</url>
<build>
<plugins>
<plugin>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<artifactSet>
<excludes>
<exclude>org.apache.kafka:*</exclude>
<exclude>com.fasterxml.jackson.annotation:*</exclude>
<exclude>com.fasterxml.jackson.core:*</exclude>
<exclude>com.fasterxml.jackson.dataformat:*</exclude>
<exclude>com.fasterxml.jackson.dataformat:*</exclude>
<exclude>jackson*:jackson-databind:jar:</exclude>
</excludes>
</artifactSet>
</configuration>
</execution>
</executions>
<configuration>
<createDependencyReducedPom>true</createDependencyReducedPom>
</configuration>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.6.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>connect-api</artifactId>
<version>0.11.0.2</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>6.9.10</version>
<scope>test</scope>
<exclusions>
<exclusion>
<artifactId>jcommander</artifactId>
<groupId>com.beust</groupId>
</exclusion>
<exclusion>
<artifactId>bsh</artifactId>
<groupId>org.beanshell</groupId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
</project>

17 changes: 10 additions & 7 deletions pom.xml
@@ -4,7 +4,7 @@

<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-kafka-connector</artifactId>
<version>0.0.9-SNAPSHOT</version>
<version>0.1.0-SNAPSHOT</version>
<packaging>jar</packaging>

<name>amazon-kinesis-kafka-connector</name>
@@ -15,30 +15,33 @@
</properties>

<dependencies>

<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>connect-api</artifactId>
<version>0.11.0.2</version>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-client</artifactId>
<version>1.7.3</version>
<version>1.13.0</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-producer</artifactId>
<version>0.12.8</version>
<version>0.14.0</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>26.0-jre</version>
</dependency>
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>6.9.10</version>
<scope>test</scope>
</dependency>

</dependencies>
</dependencies>
<build>
<plugins>
<plugin>
@@ -46,6 +46,22 @@ public class AmazonKinesisSinkConnector extends SinkConnector {

public static final String SLEEP_CYCLES = "sleepCycles";

public static final String SCHEMA_ENABLE = "schemaEnable";

public static final String AWS_KEY = "awsKey";

public static final String AWS_SECRET = "awsSecret";

public static final String AWS_KINESIS_HOST = "awsKinesisHost";

public static final String AWS_KINESIS_PORT = "awsKinesisPort";

public static final String AWS_CLOUDWATCH_HOST = "awsCloudWatchHost";

public static final String AWS_CLOUDWATCH_PORT = "awsCloudWatchPort";

public static final String AWS_VALIDATE_CERTIFICATE = "awsValidateCertificate";

private String region;

private String streamName;
@@ -80,6 +96,16 @@ public class AmazonKinesisSinkConnector extends SinkConnector {

private String sleepCycles;

private String schemaEnable;
private String awsKey;
private String awsSecret;

private String awsKinesisHost;
private String awsKinesisPort;
private String awsCloudWatchHost;
private String awsCloudWatchPort;
private String awsValidateCertificate;

@Override
public void start(Map<String, String> props) {
region = props.get(REGION);
@@ -99,6 +125,14 @@ public void start(Map<String, String> props) {
outstandingRecordsThreshold = props.get(OUTSTANDING_RECORDS_THRESHOLD);
sleepPeriod = props.get(SLEEP_PERIOD);
sleepCycles = props.get(SLEEP_CYCLES);
schemaEnable = props.get(SCHEMA_ENABLE);
awsKey = props.get(AWS_KEY);
awsSecret = props.get(AWS_SECRET);
awsKinesisHost = props.get(AWS_KINESIS_HOST);
awsKinesisPort = props.get(AWS_KINESIS_PORT);
awsCloudWatchHost = props.get(AWS_CLOUDWATCH_HOST);
awsCloudWatchPort = props.get(AWS_CLOUDWATCH_PORT);
awsValidateCertificate = props.get(AWS_VALIDATE_CERTIFICATE);
}

@Override
@@ -198,6 +232,29 @@ public List<Map<String, String>> taskConfigs(int maxTasks) {
config.put(SLEEP_CYCLES, sleepCycles);
else
config.put(SLEEP_CYCLES, "10");

if(schemaEnable != null)
config.put(SCHEMA_ENABLE, schemaEnable);
else
config.put(SCHEMA_ENABLE, "true");

if (awsKinesisHost != null)
config.put(AWS_KINESIS_HOST, awsKinesisHost);

if (awsKinesisPort != null)
config.put(AWS_KINESIS_PORT, awsKinesisPort);

if (awsCloudWatchHost != null)
config.put(AWS_CLOUDWATCH_HOST, awsCloudWatchHost);

if (awsCloudWatchPort != null)
config.put(AWS_CLOUDWATCH_PORT, awsCloudWatchPort);

if (awsValidateCertificate != null)
config.put(AWS_VALIDATE_CERTIFICATE, awsValidateCertificate);

config.put(AWS_KEY, awsKey);
config.put(AWS_SECRET, awsSecret);

configs.add(config);
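
The hunk above repeats two patterns: "use the configured value, else a default" (as for `schemaEnable`) and "copy only when set" (as for the host and port options). A minimal sketch of both as helpers follows. The class `TaskConfigDefaults` and its method names are hypothetical and not part of this pull request. Note that the diff puts `awsKey` and `awsSecret` into the map unconditionally, so they may be stored as `null`; the sketch uses the "copy only when set" helper for them instead, which is a design choice of the sketch and not the PR's behavior.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helpers for illustration only; they are not part of the connector. */
public class TaskConfigDefaults {

    /** Stores value under key, or fallback when value is null. */
    static void putOrDefault(Map<String, String> config, String key, String value, String fallback) {
        config.put(key, value != null ? value : fallback);
    }

    /** Stores value under key only when it is non-null, so the map never holds a null entry. */
    static void putIfPresent(Map<String, String> config, String key, String value) {
        if (value != null) {
            config.put(key, value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        putOrDefault(config, "schemaEnable", null, "true");  // unset, so the default applies
        putIfPresent(config, "awsKinesisHost", "localhost"); // set, so it is copied
        putIfPresent(config, "awsKey", null);                // unset, so the key is omitted
        System.out.println(config.get("schemaEnable"));      // true
        System.out.println(config.containsKey("awsKey"));    // false
    }
}
```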
