This repository has been archived by the owner on May 12, 2021. It is now read-only.

METRON-2265: Update Kerberos settings #1519


merrimanr

Contributor Comments

This PR includes code and documentation changes for enabling Kerberos in HDP 3.1. The primary changes needed were minor Maven changes and making the Metron JAAS setting available to Kafka through KAFKA_OPTS.

Changes Included

  • Changes to KDC setup documentation for CentOS 7
  • Changes to Kerberos setup documentation for HDP 3.1
  • Minor Mpack change for looking up a host name
  • Addition of a CLIENT_JAAS_ARG property to /etc/default/metron. This is needed separately from METRON_JVMFLAGS
  • Updated the Kafka security.protocol setting to SASL_PLAINTEXT
  • Maven changes to handle classpath errors that result from the extra jars added by Storm when Kerberos is enabled. The extra jars are located in /usr/hdp/current/storm-supervisor/external/storm-autocreds/*.
  • Replaced a call to org.apache.commons.math.util.MathUtils with a standard Java implementation. This dependency was being pulled in transitively.
  • Fixed a bug in metron-data-management that resulted from com.google.thirdparty not being shaded.

Testing

I attempted to test the various components (Storm, Kafka, HBase, Hadoop, etc.) to verify they still work after Kerberos is enabled. The instructions in Kerberos-manual-setup.md should now be correct. Also, no additional steps should be needed to enable the sensor stubs other than a simple restart. I tested the following after spinning up full dev:

  • Verified data is flowing through to ES and HDFS and the Alerts UI works.
  • Verified the various Profiler README examples work
  • Verified the flatfile_loader.sh script works as expected
  • Verified Stellar works with Kafka (added docs for this)

I believe this should cover almost all of the components. No Solr-specific changes were required (that work was done in the Solr upgrade PR), but we might want to test Solr too.

Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.
Please refer to our Development Guidelines for the complete guide to follow for contributions.
Please refer also to our Build Verification Guidelines for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

For all changes:

  • Is there a JIRA ticket associated with this PR? If not one needs to be created at Metron Jira.
  • Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
  • Has your PR been rebased against the latest commit within the target branch (typically master)?

For code changes:

  • Have you included steps to reproduce the behavior or problem that is being changed or addressed?

  • Have you included steps or a guide to how the change may be verified and tested manually?

  • Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:

    mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
    
  • Have you written or updated unit tests and/or integration tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

  • Have you ensured that the format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not, then run the following commands and then verify the changes via site-book/target/site/index.html:

    cd site-book
    mvn site
    
  • Have you ensured that any documentation diagrams have been updated, along with their source files, using draw.io? See Metron Development Guidelines for instructions.

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.


@nickwallen nickwallen left a comment

Lots of really good stuff here, @merrimanr. I want to try running this up myself. I have a few feedback items.

@tigerquoll

tigerquoll commented Sep 25, 2019

I did a full dev install and followed
https://github.com/apache/metron/blob/429c5eac55554496c967ca9f6e935f6e0b2d4781/metron-deployment/Kerberos-manual-setup.md

Used the following environment:

export BROKERLIST=node1:6667
export KAFKA_HOME=/usr/hdp/3.1.4.0-315/kafka
export [email protected]
export METRON_SERVICE_KEYTAB=/etc/security/keytabs/metron.headless.keytab 
export CLIENT_JAAS_ARG=/etc/kafka/conf/kafka_client_jaas.conf
export KAFKA_SECURITY_PROTOCOL=SASL_PLAINTEXT
export ELASTICSEARCH=node1:9200
export KAFKA_OPTS="-Djava.security.auth.login.config=$CLIENT_JAAS_ARG"

Please note that the KAFKA_OPTS value suggested in the guide is wrong. It was copied from the HDP page, which most likely contains a typo.
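
For reference, here is a minimal sketch of the form used in the environment above: KAFKA_OPTS carries the full `-Djava.security.auth.login.config=<path>` JVM system property rather than a bare path. The helper name `build_kafka_opts` is hypothetical (it is not part of Metron or Kafka), and the JAAS path is simply the one listed above.

```shell
# Hypothetical helper (an assumption of this sketch, not a Metron script):
# wrap a JAAS config path in the JVM system property passed via KAFKA_OPTS.
build_kafka_opts() {
  if [ -z "$1" ]; then
    echo "usage: build_kafka_opts <path-to-jaas.conf>" >&2
    return 1
  fi
  printf '%s\n' "-Djava.security.auth.login.config=$1"
}

# Path taken from the environment listed above.
CLIENT_JAAS_ARG=/etc/kafka/conf/kafka_client_jaas.conf
KAFKA_OPTS="$(build_kafka_opts "$CLIENT_JAAS_ARG")"
export KAFKA_OPTS
echo "$KAFKA_OPTS"
```

Keeping the path and the JVM flag in separate variables makes it harder to export the bare path into KAFKA_OPTS by accident.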

The Elasticsearch count check at the "Push Data" stage is misleading, because the guide does not delete the previous collection before enabling Kerberos. Repeating the "add sample-bro.txt" stage does not increase the document count.

I used ${KAFKA_HOME}/bin/kafka-consumer-groups.sh to monitor Kafka activity, with the following recipe:

  1. Create the file /home/metron/kafka.command.config with the content
security.protocol=SASL_PLAINTEXT
  2. Run the command
${KAFKA_HOME}/bin/kafka-consumer-groups.sh --command-config=/home/metron/kafka.command.config  --bootstrap-server ${BROKERLIST} --describe --group bro_parser

This showed that nothing was consuming from the bro topic.

Consumer group 'bro_parser' has no active members.

TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID     HOST            CLIENT-ID
bro             0          13536           13898           362             -               -               -
[2019-09-25 04:45:03,310] WARN [Principal=null]: TGT renewal thread has been interrupted and will exit. (org.apache.kafka.common.security.kerberos.KerberosLogin)
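
To make the lag easier to watch over repeated runs, the describe output can be reduced to a single number. This is a rough sketch that assumes the column layout shown above (LAG as the fifth column); `total_lag` is a hypothetical helper name, and other Kafka versions may print the columns in a different order.

```shell
# Hypothetical helper: sum the LAG column of the describe output read on stdin.
# Assumes the header row starts with TOPIC and LAG is the fifth column.
total_lag() {
  awk '
    $1 == "TOPIC" { in_table = 1; next }        # header row marks the table start
    in_table && $5 ~ /^[0-9]+$/ { sum += $5 }   # skip non-numeric rows such as log lines
    END { print sum + 0 }
  '
}

# Demo using the output pasted above; in practice, pipe the
# kafka-consumer-groups.sh command from step 2 into total_lag instead.
total_lag <<'EOF'
Consumer group 'bro_parser' has no active members.

TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID     HOST            CLIENT-ID
bro             0          13536           13898           362             -               -               -
EOF
# prints 362
```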

Further investigation suggests that the Storm workers are unable to communicate with ZooKeeper.

2019-09-25 00:36:21.805 o.a.k.c.NetworkClient Thread-18-kafkaSpout-bro-executor[4 4] [WARN] [Consumer clientId=consumer-2, groupId=bro_parser] Connection to node 1001 could not be established. Broker may not be available.
2019-09-25 00:36:21.906 o.a.k.c.NetworkClient Thread-12-kafkaSpout-yaf-executor[6 6] [WARN] [Consumer clientId=consumer-1, groupId=yaf_parser] Connection to node 1001 could not be established. Broker may not be available.
2019-09-25 00:36:22.012 o.a.k.c.NetworkClient Thread-16-kafkaSpout-snort-executor[5 5] [WARN] [Consumer clientId=consumer-3, groupId=snort_parser] Connection to node 1001 could not be established. Broker may not be available.
2019-09-25 00:36:22.053 o.a.s.s.o.a.z.ClientCnxn main-SendThread(node1:2181) [INFO] Opening socket connection to server node1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2019-09-25 00:36:22.053 o.a.s.s.o.a.z.ClientCnxn main-SendThread(node1:2181) [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
...java.lang.RuntimeException: ("Error when processing an event")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.1.3.1.4.0-315.jar:1.2.1.3.1.4.0-315]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$mk_halting_timer$fn__10399.invoke(worker.clj:259) [storm-core-1.2.1.3.1.4.0-315.jar:1.2.1.3.1.4.0-315]
        at org.apache.storm.timer$mk_timer$fn__1639$fn__1640.invoke(timer.clj:71) [storm-core-1.2.1.3.1.4.0-315.jar:1.2.1.3.1.4.0-315]
        at org.apache.storm.timer$mk_timer$fn__1639.invoke(timer.clj:42) [storm-core-1.2.1.3.1.4.0-315.jar:1.2.1.3.1.4.0-315]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
2019-09-25 00:36:22.191 o.a.s.d.worker Thread-21 [INFO] Shutting down worker bro__snort__yaf-4-1569366291 a0046483-4363-4209-b4e0-d1ad9f6deea2 6701
2019-09-25 00:36:22.191 o.a.s.d.worker Thread-21 [INFO] Terminating messaging context
2019-09-25 00:36:22.191 o.a.s.d.worker Thread-21 [INFO] Shutting down executors

I tried backtracking through the HDP/Storm/Kerberos docs to verify the setup, but ran out of time.

@merrimanr merrimanr closed this Sep 25, 2019
@merrimanr merrimanr reopened this Sep 25, 2019
@merrimanr

@tigerquoll it's not clear to me which instructions you are following or where you are running into issues. Do you mind providing a step-by-step description of what you've done and where things didn't work?

@nickwallen

nickwallen commented Sep 27, 2019

+1 I was able to kerberize successfully. I am finding that running kerberized in the development environment with HDP 3.1 is a bit unstable due to limited resources in the VM. This has always been a problem and requires a bit of massaging.

Here are some brief notes on things I did to get this to work in our resource-constrained dev environment.

  • Before you begin the kerberization wizard in Ambari, stop all sensor stubs.

    systemctl stop sensor-stubs-bro
    systemctl stop sensor-stubs-yaf
    systemctl stop sensor-stubs-snort
    
  • Elasticsearch will fail the service check after kerberization. Its status will remain "yellow" due to resource constraints. This will prevent the other services from being started automatically.

  • When this service check fails, just close the Kerberos wizard and continue on. Then manually restart the HDP and Metron services that you want to test. Avoid starting anything that you don't need, like YARN.

  • Restart a single telemetry source to see telemetry flow end-to-end. No need to run all 3 of the demo telemetry sources and this lightens the load a bit.

    systemctl start sensor-stubs-bro
    
  • When trying to validate, shut down as many services as you can that are not directly in use, like YARN, MR, Profiler, Pcap, etc.
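
The sensor-stub steps in these notes could be wrapped roughly as follows. This is a sketch, not something shipped with Metron: the function names and the `SYSTEMCTL` override (handy for a dry run with `SYSTEMCTL=echo`) are assumptions, while the service names are the ones used in the commands above.

```shell
# SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) to preview the commands;
# this knob is an assumption of the sketch.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
SENSORS="bro yaf snort"

# Stop every sensor stub before starting the Kerberos wizard.
stop_all_stubs() {
  for sensor in $SENSORS; do
    "$SYSTEMCTL" stop "sensor-stubs-${sensor}"
  done
}

# Start a single telemetry source to lighten the load on the VM.
start_stub() {
  "$SYSTEMCTL" start "sensor-stubs-$1"
}

# Usage (as root on the full-dev VM):
#   stop_all_stubs      # before kerberizing
#   start_stub bro      # afterwards, to check end-to-end flow
```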

@tigerquoll Might any of this help explain what might have happened in your environment?

@tigerquoll

tigerquoll commented Oct 3, 2019

OK, I can get Kerberized Metron processing data with this PR. The jars and everything else appear fine; it was the included deployment instructions that were causing me grief.

I had to deviate from the provided "metron-deployment/Kerberos-manual-setup.md" file in the following ways. (Can somebody update the Markdown file in this PR, or will a new PR be needed for those changes?)

Initial environment:

source /etc/default/metron
export KAFKA_HOME="/usr/hdp/current/kafka-broker"
export BROKERLIST=node1:6667
export HDP_HOME="/usr/hdp/current"
export KAFKA_HOME="${HDP_HOME}/kafka-broker"
export CLIENT_JAAS_ARG=/etc/kafka/conf/kafka_client_jaas.conf
export KAFKA_SECURITY_PROTOCOL=SASL_PLAINTEXT
export ELASTICSEARCH=node1:9200
export KAFKA_OPTS="-Djava.security.auth.login.config=$CLIENT_JAAS_ARG"

Verify KDC

Step 2:
add
kinit metron
before
klist -f
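
One way to express "kinit only if needed" is to test the ticket cache first. This is a sketch assuming MIT Kerberos, where `klist -s` prints nothing and exits non-zero if the cache is missing or expired. `ensure_ticket` and the `KLIST`/`KINIT` overrides are hypothetical names introduced for illustration.

```shell
# KLIST and KINIT can be overridden for testing; by default the real tools run.
KLIST="${KLIST:-klist}"
KINIT="${KINIT:-kinit}"

# Hypothetical helper: obtain a ticket for the given principal only when the
# current credentials cache is missing or expired.
ensure_ticket() {
  if "$KLIST" -s; then
    echo "valid ticket cache found"
  else
    "$KINIT" "$1"
  fi
}

# Usage:
#   ensure_ticket metron && klist -f
```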

Enable kerberos

Step 3:
metron.headless.keytab appears to already be generated
so add to start of step 3:
rm metron.headless.keytab
change:
cp metron.headless.keytab /etc/security/keytabs
to:
cp -n metron.headless.keytab /etc/security/keytabs

Kafka Authorization:

The Metron user does not have permission to edit ACLs. The Kafka service account does, so we can temporarily use that keytab to add ACLs.

Add to start of Step 3:
export CLIENT_JAAS_ARG=/etc/kafka/conf/kafka_jaas.conf
export KAFKA_OPTS="-Djava.security.auth.login.config=$CLIENT_JAAS_ARG"
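
Since the switch to the Kafka service JAAS file is meant to be temporary, one option is to scope it to a subshell so the caller's settings are untouched afterwards. This is a sketch only: `with_jaas` is a hypothetical helper, not something in the guide or in Metron.

```shell
# Hypothetical helper: run a command with CLIENT_JAAS_ARG and KAFKA_OPTS
# pointing at the given JAAS file, without changing the calling shell.
with_jaas() {
  jaas_path="$1"
  shift
  (
    CLIENT_JAAS_ARG="$jaas_path"
    KAFKA_OPTS="-Djava.security.auth.login.config=${CLIENT_JAAS_ARG}"
    export CLIENT_JAAS_ARG KAFKA_OPTS
    "$@"
  )
}

# Usage: prefix the ACL commands from the guide, for example
#   with_jaas /etc/kafka/conf/kafka_jaas.conf <ACL command from the guide>
```

After the command returns, KAFKA_OPTS in the calling shell still points at the Metron client JAAS file, so later steps do not need to reset it.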

Storm Authorization

An additional step should be done before the others
su metron -

Step 7 requires root access
so add
exit
id

source /etc/default/metron
before proceeding with the rest of the script

Start metron

Step 1:
add source /etc/default/metron

Push Data

remove
export KAFKA_OPTS=$CLIENT_JAAS_ARG
Add
source /etc/default/metron
export ELASTICSEARCH=node1:9200
export KAFKA_SECURITY_PROTOCOL=SASL_PLAINTEXT
export KAFKA_HOME="/usr/hdp/current/kafka-broker"
export CLIENT_JAAS_ARG=/etc/kafka/conf/kafka_client_jaas.conf
export KAFKA_OPTS="-Djava.security.auth.login.config=$CLIENT_JAAS_ARG"

Add
curl -XGET "${ELASTICSEARCH}/bro*/_count"
before dumping the new sample data to Kafka, to get the count beforehand for comparison.
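
The before/after comparison could be scripted along these lines. This is a sketch under the assumption that the `_count` response is a JSON object with a top-level numeric `count` field, as Elasticsearch returns; `extract_count` is a hypothetical helper and uses sed rather than a JSON parser to avoid extra dependencies.

```shell
# Hypothetical helper: pull the numeric "count" value out of an
# Elasticsearch _count response read on stdin.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Demo with a literal response body (values are illustrative only).
printf '%s\n' '{"count":13898,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0}}' | extract_count

# Usage against the cluster:
#   before=$(curl -s -XGET "${ELASTICSEARCH}/bro*/_count" | extract_count)
#   ... push the sample data to Kafka ...
#   after=$(curl -s -XGET "${ELASTICSEARCH}/bro*/_count" | extract_count)
#   echo "new documents: $((after - before))"
```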

@nickwallen

Good stuff, @tigerquoll. I'm glad we have your fresh eyes on this.

I would prefer if you submit your changes above as a separate PR, if you don't mind. I think this PR is a net-positive and gets Kerberos working on the upgrade branch. Your additions would go beyond what we have here and address many of the finer details of making Kerberos work in the development environment.

@mmiklavc

mmiklavc commented Oct 3, 2019

@tigerquoll Would you mind creating a Jira task in the feature branch epic along the lines of "Enhance Kerberos instructions"? I'm not sure some of these steps should be taken verbatim, e.g. the env vars. @nickwallen and @merrimanr seem to be making it through, so it might make sense for us to verify these additional steps separately.

@nickwallen

This has been merged into the feature branch.

@nickwallen nickwallen closed this Oct 3, 2019