Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging enhancements #19583

Draft
wants to merge 7 commits into
base: dev
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 81 additions & 0 deletions Logging.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
This is a planning and discussion issue for enhancements I would like to make to Galaxy's logging system.

# TL;DR

I want to add `logger.trace()` calls to the Galaxy codebase, but before doing that I would like to get some infrastructure in place.

# What is proposed

1. Enhance support for the `trace()` method that is already present in `galaxy.util.custom_logging`
1. Add a simplified logging configuration section to either `galaxy.yml` or a separate logging config file
1. Add API endpoints to query and set logging levels at runtime
1. Add an admin panel that gives administrators a GUI for setting log levels on a running instance.

# What is not proposed

1. Providing an API that allows arbitrary logger reconfiguration. We only need (want) to change the logging level at runtime. No other configuration changes will be allowed.
1. Changing the way loggers are currently configured. The logging configuration changes proposed here work transparently with the current logging configuration methods.
2. Doing everything now. This is a first pass and we can add more features and enhancements later.

# Proof of concept

I have a proof of concept running that implements the first three items (everything but the admin panel). The code is available in the [logging-enhancements](https://github.com/ksuderman/galaxy/tree/logging-enhancements) branch of my Galaxy fork.

However, before I start submitting PRs there are a number of design decisions to resolve.

## The trace() method

My proof of concept is based on the collective wisdom of [StackOverflow](https://stackoverflow.com/a/35804945/1691778).

If we monkeypatch Python's `logging` library early enough in the code loading process then the `trace()` method is automagically available to all loggers. There are ~750 configured loggers in a running Galaxy instance.

This works well in a running Galaxy instance, but may present problems for Galaxy packages meant to be used as libraries. Calling `galaxy.util.logging.addTraceLoggingLevel()` in the [package.__init__.py](https://github.com/galaxyproject/galaxy/blob/dev/packages/package.__init__.py) may be sufficient. The goal is to make it as easy as possible for developers to add `trace()` calls to their code and not require the use of a special logger class.

## Logging configuration

The proof of concept implementation adds a `logging_levels` section to the `galaxy.yml` file. This could also be done in a separate configuration file. The purpose of the `logging_levels` section is to provide an easier way to set the logging levels for groups of loggers in a single statement.

```
logging_levels:
galaxy.*: ERROR
galaxy.jobs.runners.*: DEBUG
galaxy.jobs.runners.kubernetes: TRACE
```

## Changes at runtime

Adding `trace()` level logging statements has the potential to blow up log files and Galaxy should not be configured to use `TRACE` as the default logging level. Therefore we will need some way to enable `TRACE` level logging at runtime. Options include, but are not limited to:

1. set up a file watcher on the config file and reconfigure loggers when changes are detected
2. provide an API that allows log levels to be changed.

These are not mutually exclusive options, but in some situations, e.g. AnVIL, users may not have access to the configuration files.

The proof of concept provides a simple API that allows the level to be changed at runtime. Adding a config watcher can be added at a later date if needed.

<img width="835" alt="Screenshot 2024-11-20 at 3 59 52 PM" src="https://github.com/user-attachments/assets/df0b6689-5832-4d7b-98b1-ada2fc5a5b97">

```
curl http://localhost:8080/api/logging
curl http://localhost:8080/api/logging/galaxy.jobs.runners.*
curl -X POST http://localhost:8080/api/logging/galaxy.jobs.runners.*?level=DEBUG
curl -X POST http://localhost:8080/api/logging/galaxy.jobs.runners.kubernetes?level=TRACE
```
For brevity I have omitted setting the API key. Only administrators can list or change the logging levels.


The first `GET` method returns JSON containing information for all currently configured loggers. The second `GET` method returns information for a single logger or all loggers in a particular package. The `POST` method can be used to set the logging level for a single logger or all the loggers in a package.

# TODO

## Authorization

All API endpoints specify `require_admin=True` in the `@router` decorator. Is this sufficient? I was thinking of adding checks in relevant `galaxy.util.logging` methods, but is this necessary?

## Admin GUI

I have not started on an admin panel yet, but I envision it would be a new option in the *Server* section of the admin panel. The logging panel would contain a table and/or a tree view of all the loggers and their current levels and the ability to set the levels for loggers. I will need help and guidance with this task.

## Naming

I've used that I thought were reasonable names for the names of packages, API endpoints, function names, and configuration options, but I'm open to suggestions. This also applies to general code layout and structure.
8 changes: 8 additions & 0 deletions lib/galaxy/app.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,14 @@
import collections
import errno
import logging

# This needs to be done before any other galaxy imports to ensure all logging is
# captured. There is no harm in calling addTraceLoggingLevel multiple times, but
# it is only needed once provided we do it early enough in the code loading
# process.
from galaxy.util.logging import addTraceLoggingLevel
addTraceLoggingLevel()

import os
import signal
import sys
Expand Down
11 changes: 9 additions & 2 deletions lib/galaxy/config/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@
resource_path,
)
from galaxy.util.themes import flatten_theme
from ..util.logging import set_logging_levels_from_config
from ..version import (
VERSION_MAJOR,
VERSION_MINOR,
Expand Down Expand Up @@ -168,8 +169,7 @@ def configure_logging(config, facts=None):
or a simple dictionary of configuration variables.
"""
facts = facts or get_facts(config=config)
# Get root logger
logging.addLevelName(LOGLV_TRACE, "TRACE")

# PasteScript will have already configured the logger if the
# 'loggers' section was found in the config file, otherwise we do
# some simple setup using the 'log_*' values from the config.
Expand Down Expand Up @@ -198,6 +198,10 @@ def configure_logging(config, facts=None):
conf["filename"] = conf.pop("filename_template").format(**facts)
logging_conf["handlers"][name] = conf
logging.config.dictConfig(logging_conf)
logging_levels = config.get("logging_levels", None)
if logging_levels:
set_logging_levels_from_config(logging_levels)



def find_root(kwargs) -> str:
Expand Down Expand Up @@ -541,6 +545,9 @@ def resolve(key):

def _check_against_root(self, key: str):
def get_path(current_path, initial_path):
# TODO: Not sure why this is needed for the logging API tests...
if initial_path is None:
return current_path
# if path does not exist and was set as relative:
if not self._path_exists(current_path) and not os.path.isabs(initial_path):
new_path = self._in_root_dir(initial_path)
Expand Down
12 changes: 12 additions & 0 deletions lib/galaxy/queue_worker.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
from galaxy.tools import ToolBox
from galaxy.tools.data_manager.manager import DataManagers
from galaxy.tools.special_tools import load_lib_tools
from galaxy.util.logging import set_log_levels

logging.getLogger("kombu").setLevel(logging.WARNING)
log = logging.getLogger(__name__)
Expand Down Expand Up @@ -303,6 +304,16 @@ def admin_job_lock(app, **kwargs):
log.info(f"Administrative Job Lock is now set to {job_lock}. Jobs will {'not' if job_lock else 'now'} dispatch.")


def set_logging_levels(app, **kwargs):
"""
Set levels for loggers.
"""
name = kwargs.get("name")
level = kwargs.get("level")
log.debug("Setting logging levels for %s to %s.", name, level)
set_log_levels(name, level)


control_message_to_task = {
"create_panel_section": create_panel_section,
"reload_tool": reload_tool,
Expand All @@ -318,6 +329,7 @@ def admin_job_lock(app, **kwargs):
"reconfigure_watcher": reconfigure_watcher,
"reload_tour": reload_tour,
"reload_core_config": reload_core_config,
"set_logging_levels": set_logging_levels
}


Expand Down
219 changes: 219 additions & 0 deletions lib/galaxy/util/logging/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,219 @@
import logging
from typing import List, Dict


TRACE = logging.DEBUG - 5

log = logging.getLogger(__name__)


def addTraceLoggingLevel():
addLoggingLevel('TRACE', TRACE)


def set_logging_levels_from_config(configuration: dict):
all_logger_names = logging.Logger.manager.loggerDict.keys()
settings = dict()
for name,level in configuration.items():
if type(level) == int:
level = logging.getLevelName(level)
else:
level = level.upper()
if name.endswith(".*"):
pattern = name[:-2]
for name in all_logger_names:
if name.startswith(pattern):
settings[name] = level
else:
settings[name] = level
for name,level in settings.items():
logging.getLogger(name).setLevel(level)


def _get_level_info(logger) -> Dict[str, str]:
'''
Get the level and effective level of a logger

:param logger: The logger to get the info for
:type logger: logging.Logger

:return: The level and effective level of the logger
:rtype: Dict[str, str]
'''
if logger is None:
return {
"name": "None",
"level": "NOTSET",
"effective": "NOTSET"
}
return {
"name": logger.name,
"level": logging.getLevelName(logger.level),
"effective": logging.getLevelName(logger.getEffectiveLevel())
}


def get_logger_names() -> List[str]:
'''
Gets the names of all the currently configured loggers.

:return: The names of all the currently configured loggers
:rtype: List[str]
'''
log.info("Getting a list of all configured loggers")
logger_dict = logging.Logger.manager.loggerDict
loggers = [name for name in logger_dict if isinstance(logger_dict[name], logging.Logger)]
return loggers



def get_log_levels(name) -> Dict[str, Dict[str, str]]:
'''
Get the log level for a one or more loggers. If no name is provided then
the levels for all loggers is returned.

:param name: The name of the logger to get the level for
:type name: str

:return: The log level for the logger
:rtype: Dict[str, Dict[str, str]]
'''
# if not trans.user_is_admin:
# log.warning("Only admins can get log level")
# raise AdminRequiredException()
log.info("Getting level for logger %s", name)
loggers = get_logger_names()
if name is None:
result = {}
for logger_name in loggers:
logger = logging.getLogger(logger_name)
result[logger_name] = _get_level_info(logger)
return result
elif name.endswith(".*"):
result = {}
pattern = name[:-2]
for logger_name in [logger for logger in loggers if logger.startswith(pattern)]:
logger = logging.getLogger(logger_name)
result[logger_name] = _get_level_info(logger)
return result
elif name in loggers:
logger = logging.getLogger(name)
return { name: _get_level_info(logger) }
log.warning("Logger %s not found", name)
return { "UNKNOWN": _get_level_info(None) }


def set_log_levels(name, level) -> List[Dict[str,str]]:
'''
Set the log level for a one or more loggers.

To set the level for a single logger, pass the name of the logger. To set
the level for all loggers that start with a certain prefix, e.g. all the logger
in a particular package, pass the prefix followed by ".*".


:param name: The name of the logger(s) to set the level for
:type name: str
:param level: The level to set the logger to
:type level: str

:return: The log level for the logger
:rtype: LoggerLevelInfo
'''
# if not trans.user_is_admin:
# log.warning("Only admins can set log level")
# raise AdminRequiredException()
log.info("Setting level for logger %s to %s", name, level)
result = []
loggers = get_logger_names()
if name.endswith(".*"):
pattern = name[:-2]
for logger_name in [logger for logger in loggers if logger.startswith(pattern)]:
logger = logging.getLogger(logger_name)
logger.setLevel(level)
result.append(_get_level_info(logger))
return result
elif name in loggers:
logger = logging.getLogger(name)
logger.setLevel(level)
result.append(_get_level_info(logger))
else:
log.warning("Logger %s not found", name)
return result


def addLoggingLevel(levelName, levelNum, methodName=None):
"""
A modified version of the method found at
https://stackoverflow.com/a/35804945/1691778

Rather than raising an AttributeError we simply return if the levelName or
methodName already exist.

--- Original Docstring ---

Comprehensively adds a new logging level to the `logging` module and the
currently configured logging class.

`levelName` becomes an attribute of the `logging` module with the value
`levelNum`. `methodName` becomes a convenience method for both `logging`
itself and the class returned by `logging.getLoggerClass()` (usually just
`logging.Logger`). If `methodName` is not specified, `levelName.lower()` is
used.

This method was inspired by the answers to Stack Overflow post
http://stackoverflow.com/q/2183233/2988730, especially
http://stackoverflow.com/a/13638084/2988730

Example
-------
>>> addLoggingLevel('TRACE', logging.DEBUG - 5)
>>> logging.getLogger(__name__).setLevel("TRACE")
>>> logging.getLogger(__name__).trace('that worked')
>>> logging.trace('so did this')
>>> logging.TRACE
5

"""
if not methodName:
methodName = levelName.lower()

if hasattr(logging, levelName) or hasattr(logging, methodName) or hasattr(logging.getLoggerClass(), methodName):
logging.warning("Attempted to add logging level %s with level number %d and method name %s, but one or more already exist", levelName, levelNum, methodName)
# traceback.print_stack()
return

# TDOD: Do we really want to do this here?
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s [%(levelname)s] %(name)s %(filename)s:%(lineno)d - %(message)s')

def logForLevel(self, message, *args, **kwargs):
if self.isEnabledFor(levelNum):
self._log(levelNum, message, args, **kwargs)
def logToRoot(message, *args, **kwargs):
logging.log(levelNum, message, *args, **kwargs)

logging.addLevelName(levelNum, levelName)
setattr(logging, levelName, levelNum)
setattr(logging.getLoggerClass(), methodName, logForLevel)
setattr(logging, methodName, logToRoot)
logging.info("Trace level logging has been enabled")


class DebuggingLogHander(logging.Handler):
"""
A log handler used during testing to capture log records in memory so we
can validate what has been logged.
"""
def __init__(self):
logging.Handler.__init__(self)
self.records = []

def emit(self, record):
self.records.append(record)

def reset(self):
self.records = []

def get_records(self):
return self.records
Loading
Loading