The parcel defines script

jayesh edited this page Dec 14, 2015 · 8 revisions

The parcel defines script is the key component that allows a parcel to affect processes managed by Cloudera Manager. As parcels live in their own separate directories, nothing will automatically be aware of them, or look inside them for files. The defines script is what bridges this gap.

Basic Principles

The essential principle is simple: the defines script defines and exports environment variables. The process launch scripts source the defines script, which makes those variables available to them.

Important

Because the defines script is sourced into the launch environment of the processes offered by the parcel, it must never call the bash exit command. Doing so would terminate the launch script itself, not just the defines script.
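As a minimal sketch of this rule, a defines script can wrap its logic in a function and leave early with return rather than exit. The parcel name MYAPP, the function name myapp_define_env, and the MYAPP_HOME variable are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical defines script fragment for an illustrative parcel "MYAPP".

myapp_define_env() {
  # Leave with `return`, never `exit`: this code runs inside the launch
  # script's own shell because the defines script is sourced.
  if [ -z "${PARCELS_ROOT}" ]; then
    echo "PARCELS_ROOT is not set; skipping MYAPP environment" >&2
    return 1
  fi
  export MYAPP_HOME="$PARCELS_ROOT/${PARCEL_DIRNAME:-MYAPP}/lib/myapp"
}

# `|| true` keeps a failure here from aborting a launch script that
# happens to run with `set -e`.
myapp_define_env || true
```

With this shape, a problem in the defines script produces a warning, and the launch script decides for itself how to proceed.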

An Example: HBase

In this example, we'll consider how CM runs HBase out of the CDH parcel. The following lines appear in the actual scripts; everything else has been elided for simplicity.

In cdh_env.sh:

CDH_DIRNAME=${PARCEL_DIRNAME:-"CDH-5.0.0-0.cdh5b2.p0.19"}
export CDH_HADOOP_HOME=$PARCELS_ROOT/$CDH_DIRNAME/lib/hadoop
export CDH_HBASE_HOME=$PARCELS_ROOT/$CDH_DIRNAME/lib/hbase

In the hbase.sh launch script:

export HADOOP_HOME=$CDH_HADOOP_HOME
export HBASE_HOME=$CDH_HBASE_HOME
...
HBASE_BIN="$HBASE_HOME/../../bin/hbase"

CM uses this indirection through $CDH_HBASE_HOME to avoid namespace collisions; third-party parcels are not required to do the same. The mechanism is nonetheless clear: the parcel exports an environment variable that the launch script expects, and the launch script uses it to locate program binaries and/or to set other variables relevant to the program being run.
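The hand-off can be sketched from the launch script's side. The fragment below is not CM's actual hbase.sh; the function name resolve_hbase_bin and the missing-variable check are assumptions added for illustration. Only the variable names and the HBASE_BIN path come from the lines quoted above.

```shell
#!/bin/bash
# Hypothetical launch-script fragment: consume a variable that the parcel's
# defines script is expected to have exported.

resolve_hbase_bin() {
  if [ -z "${CDH_HBASE_HOME}" ]; then
    echo "CDH_HBASE_HOME is not set; was the defines script sourced?" >&2
    return 1
  fi
  # Map the parcel-specific name onto the name the program itself expects.
  export HBASE_HOME="$CDH_HBASE_HOME"
  HBASE_BIN="$HBASE_HOME/../../bin/hbase"
}

if resolve_hbase_bin; then
  echo "Would launch: $HBASE_BIN"
fi
```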

An Example: LZO Plugin

The other major form of parcel is one that provides a plugin for an existing program. The LZO plugin is a good example.

The full gplextras_env.sh:

#!/bin/bash
GPLEXTRAS_DIRNAME=${PARCEL_DIRNAME:-"GPLEXTRAS-5.0.0-0.gplextras5b2.p0.20"}

if [ -n "${HADOOP_CLASSPATH}" ]; then
  export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/*"
else
  export HADOOP_CLASSPATH="$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/*"
fi

if [ -n "${HBASE_CLASSPATH}" ]; then
  export HBASE_CLASSPATH="${HBASE_CLASSPATH}:$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/*"
else
  export HBASE_CLASSPATH="$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/*"
fi

if [ -n "${FLUME_CLASSPATH}" ]; then
  export FLUME_CLASSPATH="${FLUME_CLASSPATH}:$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/*"
else
  export FLUME_CLASSPATH="$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/*"
fi

if [ -n "${JAVA_LIBRARY_PATH}" ]; then
  export JAVA_LIBRARY_PATH="${JAVA_LIBRARY_PATH}:$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/native"
else
  export JAVA_LIBRARY_PATH="$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/native"
fi

if [ -n "${LD_LIBRARY_PATH}" ]; then
  export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/impala/lib"
else
  export LD_LIBRARY_PATH="$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/impala/lib"
fi

This script explicitly appends the LZO plugin's paths to several environment variables that are picked up by different services in CDH. Because the script sets the relevant variables directly, the launch scripts generally don't have to do any marshalling to make them available to the launched programs.

In these cases it is very important to preserve any pre-existing values of these shared variables, so that your parcel can cooperate with any other plugin parcels that are active in the system.
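The repeated if/else pattern in the script above could be factored into a small helper. This is a sketch only: append_to_var is a hypothetical function, not part of CM or of the GPLEXTRAS parcel, and it relies on bash indirect expansion, so it assumes the defines script is sourced by bash.

```shell
#!/bin/bash
# Hypothetical helper: append an entry to a colon-separated variable while
# keeping whatever value another parcel may already have put there.
append_to_var() {
  local var_name="$1" new_entry="$2"
  # ${!var_name:-} reads the variable whose name is stored in var_name,
  # yielding an empty string when it is unset.
  local current="${!var_name:-}"
  if [ -n "$current" ]; then
    export "$var_name=$current:$new_entry"
  else
    export "$var_name=$new_entry"
  fi
}

# Example usage, mirroring the first block of the script above:
#   append_to_var HADOOP_CLASSPATH "$PARCELS_ROOT/$GPLEXTRAS_DIRNAME/lib/hadoop/lib/*"
```

Quoting the entry keeps the trailing * literal, which is what a Java classpath wildcard needs.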

Special parcel variables

When a parcel's environment script is sourced, two special variables are set by CM to allow the parcel to establish where it is located.

  • $PARCELS_ROOT: This is the directory on the filesystem where all parcels are located. (Default: /opt/cloudera/parcels)
  • $PARCEL_DIRNAME: This is the name of the parcel's directory under $PARCELS_ROOT. Consequently, the absolute path to the parcel's directory is: $PARCELS_ROOT/$PARCEL_DIRNAME.
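As a sketch of how a defines script might combine the two variables, the fragment below builds the parcel's absolute path. The function name parcel_home, the parcel name MYAPP-1.0.0, and the MYAPP_HOME variable are hypothetical. Falling back to /opt/cloudera/parcels when $PARCELS_ROOT is unset is an assumption made here for illustration; the fallback for $PARCEL_DIRNAME follows the pattern used in the examples above.

```shell
#!/bin/bash
# Hypothetical helper: print the absolute path of this parcel's directory.
# $1 is the directory name to use when CM has not set PARCEL_DIRNAME.
parcel_home() {
  local root="${PARCELS_ROOT:-/opt/cloudera/parcels}"
  local dirname="${PARCEL_DIRNAME:-$1}"
  printf '%s\n' "$root/$dirname"
}

# Illustrative use in a defines script for a parcel named MYAPP.
export MYAPP_HOME="$(parcel_home "MYAPP-1.0.0")"
```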