
CM Diagnostic bundle supported Role log Naming formats

dseth edited this page Jun 17, 2021 · 1 revision

Role Log Naming conventions

The diagnostic bundle collection command in Cloudera Manager (CM) includes a step that gathers the role logs on selected hosts, or on all hosts, in the cluster. This step iterates through all the role log types (e.g. Hive On Tez, ImpalaD or HDFS DataNode role logs), sorts the log files, and then picks the relevant files for bundling based on the start/end time or the maximum file size limit set by the diagnostic bundle collection command.
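The selection step described above can be sketched roughly as follows. This is an illustration only, not the CM implementation: `select_logs` and the `(name, timestamp, size_bytes)` tuple shape are hypothetical, and the sketch assumes that newer files are preferred and that the time window and size limit are applied together.

```python
from datetime import datetime


def select_logs(logs, start, end, max_bytes):
    """Pick log files inside [start, end] until max_bytes would be exceeded.

    logs is an iterable of (name, timestamp, size_bytes) tuples.
    This shape is an assumption made for the example.
    """
    selected = []
    total = 0
    # Walk the files newest first (assumed preference).
    for name, ts, size in sorted(logs, key=lambda entry: entry[1], reverse=True):
        if not (start <= ts <= end):
            continue  # outside the requested time window
        if total + size > max_bytes:
            break  # adding this file would exceed the size limit
        selected.append(name)
        total += size
    return selected
```

The point of the sketch is that the selection only works if every file can be mapped to a timestamp or an ordering, which is why the file name format matters.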

Currently, the sorting of role logs depends on the naming convention used for the log files.

For instance, a role log named as:

  • hadoop-cmf-hive_on_tez-HIVESERVER2-lannister-003.edh.cloudera.com.log.out.2021-05-15-2
  • hadoop-cmf-hive_on_tez-HIVESERVER2-lannister-003.edh.cloudera.com.log.out.2021-05-16-5
  • hadoop-cmf-hive_on_tez-HIVESERVER2-lannister-003.edh.cloudera.com.log.out.2021-05-17-8

must be sorted differently from log files named:

  • impalad.nightly-5.ent.cloudera.com.impala.log.INFO.20121130-061408.14555
  • impalad.nightly-5.ent.cloudera.com.impala.log.INFO.20121130-061409.12345

Some of the naming conventions that CM currently supports are:

  • x.y.z.log.out.2021-03-14-9
  • x.y.z.log.out-2021-06-02-4
  • x.y.z.log.INFO.20121130-061408.14555
  • x.y.z.log.20121130-061408.14555
  • x.y.z.log.1, x.y.z.log.2, x.y.z.log.3
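As a rough illustration of why each convention needs its own parsing, the sketch below extracts a sortable key from the suffixes listed above. It is not taken from `log_retrieve.py`: the regular expressions, the `parse_suffix` name, and the returned tuple shape are all assumptions made for this example, and the meaning of the trailing number in each format is left uninterpreted.

```python
import re
from datetime import datetime

# Hypothetical patterns, one per dated convention listed above.
_DATED_PATTERNS = [
    # x.y.z.log.out.2021-03-14-9 and x.y.z.log.out-2021-06-02-4
    (re.compile(r"\.log\.out[.-](\d{4}-\d{2}-\d{2})-(\d+)$"), "%Y-%m-%d"),
    # x.y.z.log.INFO.20121130-061408.14555 and x.y.z.log.20121130-061408.14555
    (re.compile(r"\.log\.(?:[A-Z]+\.)?(\d{8}-\d{6})\.(\d+)$"), "%Y%m%d-%H%M%S"),
]
# x.y.z.log.1, x.y.z.log.2, ...
_NUMBERED_PATTERN = re.compile(r"\.log\.(\d+)$")


def parse_suffix(name):
    """Return a sortable tuple for a supported name, or None if unrecognized."""
    for pattern, fmt in _DATED_PATTERNS:
        match = pattern.search(name)
        if match:
            stamp = datetime.strptime(match.group(1), fmt)
            return ("dated", stamp, int(match.group(2)))
    match = _NUMBERED_PATTERN.search(name)
    if match:
        return ("numbered", None, int(match.group(1)))
    return None  # an unsupported convention would need new parsing logic


names = [
    "x.y.z.log.out.2021-05-17-8",
    "x.y.z.log.out.2021-05-15-2",
    "x.y.z.log.out.2021-05-16-5",
]
# Sort files of one convention by their parsed (timestamp, number) key.
ordered = sorted(names, key=parse_suffix)
```

A name that matches none of the patterns returns `None`, which corresponds to the case where a new role log type would need its own parsing logic.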

Hence, for CM to search for and bundle the logs of your role, you must implement parsing logic in CM if your role log naming format is not in the supported list above.

The existing parsing code is located at: https://github.infra.cloudera.com/Starship/cmf/blob/cdpd-master/agents/cmf/src/cmf/log_retrieve.py#L382