Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

METRON-2246 rpm-docker - minimise use of bind mounts due to performance #1501

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tigerquoll
Copy link
Contributor

@tigerquoll tigerquoll commented Sep 4, 2019

Contributor Comments

Docker bind shared folders have a reputation as being fairly slow and inefficient on Docker for Mac. This project restructures the rpm-docker project to stop docker bind folders from being used by the RPM process for input and scratch pad use. Docker bind folders are still used for output of the constructed RPMs.

This PR restructures the rpm-docker project to use a three-stage docker build process. The first docker file is a template with the RPM building tools installed, that will be cached by docker once it is build.

The second phase of the docker build process involves constructing a new docker image based on the 1st phase docker image plus the Metron jars.

The third phase involves running the second phase docker image which will process the Metron Jars and place the packaged rpms into the appropriate folders in the project directory.

Performance improvement.

On my 2019 MBP, the following performance improvement were obtained:
Unmodified rpm-docker project:

mvn package -DskipTests -T 2C -P HDP-2.5.0.0,mpack
cd metron-deployment/
time mvn clean package -Pbuild-rpms -T 2C

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] metron-deployment 0.7.2 ............................ SUCCESS [  0.763 s]
[INFO] metron-rpm 0.7.2 ................................... SUCCESS [12:21 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:22 min (Wall Clock)
[INFO] Finished at: 2019-09-04T18:35:12+10:00
[INFO] ------------------------------------------------------------------------

real 12m23.361s
user 0m11.263s
sys 0m8.577s

This PR:

mvn package -DskipTests -T 2C -P HDP-2.5.0.0,mpack
cd metron-deployment/
time mvn clean package -Pbuild-rpms -T 2C

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] metron-deployment 0.7.2 ............................ SUCCESS [  0.848 s]
[INFO] metron-rpm 0.7.2 ................................... SUCCESS [09:28 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 09:29 min (Wall Clock)
[INFO] Finished at: 2019-09-04T19:16:38+10:00
[INFO] ------------------------------------------------------------------------

real	9m30.377s
user	0m10.831s
sys	0m7.999s

Test plan:

  1. run mvn package -DskipTests -T 2C -P HDP-2.5.0.0,mpack
  2. cd metron-deployment
  3. mvn clean package -Pbuild-rpms -T 2C
  4. Check metron-deployment/packaging/docker/rpm-docker/target/RPMS for presence of RPMS.

Thank you for submitting a contribution to Apache Metron.
Please refer to our Development Guidelines for the complete guide to follow for contributions.
Please refer also to our Build Verification Guidelines for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

For all changes:

  • Is there a JIRA ticket associated with this PR? If not one needs to be created at Metron Jira.
  • Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
  • Has your PR been rebased against the latest commit within the target branch (typically master)?

For code changes:

  • Have you included steps to reproduce the behavior or problem that is being changed or addressed?

  • Have you included steps or a guide to how the change may be verified and tested manually?

  • n/a Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:

    mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
    
  • n/a Have you written or updated unit tests and or integration tests to verify your changes?

  • n/a If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

  • n.a Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via site-book/target/site/index.html:

    cd site-book
    mvn site
    
  • n/a Have you ensured that any documentation diagrams have been updated, along with their source files, using draw.io? See Metron Development Guidelines for instructions.

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.

@ottobackwards
Copy link
Contributor

I was considering this type of thing as my next step in my pr. I'll review

@ottobackwards
Copy link
Contributor

Can you do a scan of the documentation and make sure that the RPM location is correctly referenced?

@mmiklavc
Copy link
Contributor

mmiklavc commented Sep 4, 2019

This PR restructures the rpm-docker project to use a three-stage docker build process. The first docker file is a template with the RPM building tools installed, that will be cached by docker once it is build.

That's excellent

The second phase of the docker build process involves constructing a new docker image based on the 1st phase docker image plus the Metron jars.

This makes me nervous. Can you clarify this a bit more? How is this going to impact the build lifecycle for fulldev? If I do a vagrant up can I expect this step is still going to completely rebuild and package Metron with the latest code and deploy fresh, not an old cached version of the jars?

@tigerquoll
Copy link
Contributor Author

tigerquoll commented Sep 4, 2019

@ottobackwards
First some clarification: The old docker build process mounted the entire project directory inside the docker file as a bind share, and utilised the project subdirectories for various purposes.

The new process moves away from this by:

  1. Building in the rpm build source data directories (SPEC and SOURCES) directly into the image rather than using BIND share with the associated IO penalty.
  2. Having the rpm build working directories (BUILD and BUILDROOT), as simply runtime ephemeral directories inside the docker image, again skipping the IO penalty of bind shares.
  3. Having the rpm build output directories (RPMS and SRPMS) as simply runtime ephemeral directories inside the docker image as well. RPMLINT is run on the rpms in these output directories as one of the last steps in the docker rpmbuild process. Running RPMlint on a bind share was a big IO penalty.
  4. A new last step which actually used to be run by maven. Copying files from RPMS/SRPMS directories to the project "target" directory. The "target" directory is now the only shared bind directory used to move data from the running docker image to the project build directory.

Pushing this last step inside the running docker image allows rpmlint to process the rpms while they are still sitting on the higher-performing in-docker filesystems, rather then while the RPMS are sitting on a shared bind directory.

The readme was updated to show the new location that rpms are located once the docker image has finished running as it does this "move to target directory" step rather then the maven process;

@tigerquoll
Copy link
Contributor Author

tigerquoll commented Sep 4, 2019

@mmiklavc Fair question.
During the maven prepare phase for the packaging project, the jars from all of the metron sub-project target directories get copied into the rpm-docker/SOURCES directory. This then becomes the source of the new docker image.

Docker keeps track (i.e. checksums) all of the source data used to build each step of a docker image (This is called the docker context). When you rebuild a docker image, docker compares the checksums to the source data, and skips the build process if nothing in the docker context of that image has changed.

The rpm-docker build process builds both docker images every times it runs. Most times the first image has no changes in its source files, so the build is skipped. The second build process (the one that embeds Metron jars) will actually skip as well if none of the Metron source jars have changed. (bonus optimisation!). But in all cases, the jars embedded in the docker image will be a 100% match with the jars that are in rpm-docker/SOURCES .

This "keeping track of docker contexts" is fundamental to how the docker build process runs, and has been validated and tested by everyone who utilises docker over many years.

@tigerquoll
Copy link
Contributor Author

@tigerquoll tigerquoll changed the title Metron-2246 rpm-docker - minimise use of bind mounts due to performance METRON-2246 rpm-docker - minimise use of bind mounts due to performance Sep 5, 2019
@tigerquoll tigerquoll closed this Sep 7, 2019
@tigerquoll tigerquoll reopened this Sep 7, 2019
@ottobackwards
Copy link
Contributor

@tigerquoll, for my issue, i was going to take it a step further and just do a ROOT level docker file, and copy/add . /code.

Mainly because i want to do full END to END in the container:

  • run ansible -> builds code -> builds rpm -> deploys to vagrant or "other" host.

Having gotten that far in my thinking I stopped however, because I haven't had time to think through all the restructuring it would take, given the way we have the docker stuff buried and purposed right now.

Does that make sense?

@tigerquoll
Copy link
Contributor Author

Heh @ottobackwards,
I don't think that you can get virtual box running inside a docker container, so you would have to pass the completed RPM's out of the container back to the host machine so that vagrant could stand up a VM and pass the RPMs to it to install.

@ottobackwards
Copy link
Contributor

hahah

@ottobackwards
Copy link
Contributor

I have a PR up already. The idea is to have the virtual box created and up, and have the docker container run ansible with it as the target.

@ottobackwards
Copy link
Contributor

ottobackwards commented Sep 13, 2019

#1261 (comment)

It is currently conflicted, I am going to get back to it until you are done here, really. Feel free to review and comment.

@mmiklavc
Copy link
Contributor

Hm, this doesn't appear to have made a difference on my machine. Am I missing something?

2nd run with the new approach
mvn clean package -Pbuild-rpms -T 2C 15.87s user 11.44s system 3% cpu 14:53.15 total

Master branch for Metron
mvn clean package -Pbuild-rpms -T 2C 15.55s user 12.85s system 3% cpu 14:40.47 total

@tigerquoll
Copy link
Contributor Author

That is a bit weird, I'll pull the branch and re-run my tests.

@tigerquoll
Copy link
Contributor Author

Current master:
------------------------------------------------------------------------
Reactor Summary for metron-deployment 0.7.2:

metron-deployment .................................. SUCCESS [  0.772 s]
metron-rpm ......................................... SUCCESS [14:15 min]
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time:  14:16 min (Wall Clock)
Finished at: 2019-09-27T12:34:58+10:00
------------------------------------------------------------------------

real 14m22.823s
user 0m13.724s
sys 0m9.627s



PR - Second build:

------------------------------------------------------------------------
Reactor Summary for metron-deployment 0.7.2:

metron-deployment .................................. SUCCESS [  0.895 s]
metron-rpm ......................................... SUCCESS [10:32 min]
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time:  10:33 min (Wall Clock)
Finished at: 2019-09-27T12:48:57+10:00
------------------------------------------------------------------------

real 10m39.657s
user 0m13.474s
sys 0m9.916s

Are you using a Mac for your testing?

@nickwallen
Copy link
Contributor

Here is what I am noticing with this change.

If I both build the code and create the RPMs, I see times of roughly 11 - 12 minutes for RPM creation consistently. Said a different way, the first time I create the RPMs, it takes 11 -12 minutes.

mvn package -DskipTests -T 2C -P HDP-2.5.0.0,mpack
cd metron-deployment/
# this step takes 11 - 12 minutes
time mvn clean package -Pbuild-rpms -T 2C

Now, if I then attempt to create the RPMs again (without rebuilding the code), I do see improved times of roughly 8 - 9 minutes.

mvn package -DskipTests -T 2C -P HDP-2.5.0.0,mpack
cd metron-deployment/

# right after I built the code, this takes 11 - 12 minutes
time mvn clean package -Pbuild-rpms -T 2C

# if I create the RPMs again, this takes 8 - 9 minutes
time mvn clean package -Pbuild-rpms -T 2C

I believe that once I do a mvn clean or rebuild the code, the next time I go about creating RPMs, it takes 11-12 minutes. In master, no matter how I go about it, RPM creation takes 11 - 12 minutes.

@tigerquoll Why do you think this is? Could the time savings here actually be from re-use of the phase 1 container? Does the phase 1 container get removed after a clean?

I think your use of multi-stage Docker builds makes sense here, but I have not had a chance to dig-in to this enough to explain what I am seeing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants