Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(unit_test,userspace): allow env var to get expanded in yaml even when part of a string #2918

Merged
merged 4 commits into from
Dec 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
124 changes: 102 additions & 22 deletions unit_tests/falco/test_configuration.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -114,11 +114,26 @@ TEST(Configuration, configuration_environment_variables)
// Set an environment variable for testing purposes
std::string env_var_value = "envVarValue";
std::string env_var_name = "ENV_VAR";
std::string default_value = "default";
SET_ENV_VAR(env_var_name.c_str(), env_var_value.c_str());
yaml_helper conf;

std::string sample_yaml =
std::string embedded_env_var_value = "${ENV_VAR}";
std::string embedded_env_var_name = "ENV_VAR_EMBEDDED";
SET_ENV_VAR(embedded_env_var_name.c_str(), embedded_env_var_value.c_str());

std::string bool_env_var_value = "true";
std::string bool_env_var_name = "ENV_VAR_BOOL";
SET_ENV_VAR(bool_env_var_name.c_str(), bool_env_var_value.c_str());

std::string int_env_var_value = "12";
std::string int_env_var_name = "ENV_VAR_INT";
SET_ENV_VAR(int_env_var_name.c_str(), int_env_var_value.c_str());

std::string empty_env_var_value = "";
std::string empty_env_var_name = "ENV_VAR_EMPTY";
SET_ENV_VAR(empty_env_var_name.c_str(), empty_env_var_value.c_str());

std::string default_value = "default";
std::string env_var_sample_yaml =
"base_value:\n"
" id: $ENV_VAR\n"
" name: '${ENV_VAR}'\n"
Expand All @@ -133,52 +148,117 @@ TEST(Configuration, configuration_environment_variables)
" sample_list:\n"
" - ${ENV_VAR}\n"
" - ' ${ENV_VAR}'\n"
" - $UNSED_XX_X_X_VAR\n";
conf.load_from_string(sample_yaml);
" - '${ENV_VAR} '\n"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests were expanded:

  • new case with whitespace AFTER the expanded string. Moreover, since we now support env variables embedded inside bigger strings, both ${ENV_VAR} and ${ENV_VAR} are now expanded
  • we do now tests that env variables found in the middle of strings are properly expanded (paths[0,1,2,3] cases)
  • paths[4] checks an even weirder case, ie: when the env variable gets expanded to another env variable. It must get resolved twice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We check that plugins config (that is getted as a sequence) is still correctly expanded.

" - $UNSED_XX_X_X_VAR\n"
"paths:\n"
" - ${ENV_VAR}/foo\n"
" - $ENV_VAR/foo\n"
" - /foo/${ENV_VAR}/\n"
" - /${ENV_VAR}/${ENV_VAR}${ENV_VAR}/foo\n"
" - ${ENV_VAR_EMBEDDED}/foo\n"
"is_test: ${ENV_VAR_BOOL}\n"
"num_test: ${ENV_VAR_INT}\n"
"empty_test: ${ENV_VAR_EMPTY}\n"
"plugins:\n"
" - name: k8saudit\n"
" library_path: /foo/${ENV_VAR}/libk8saudit.so\n"
" open_params: ${ENV_VAR_INT}\n";

yaml_helper conf;
conf.load_from_string(env_var_sample_yaml);

/* Check if the base values are defined */
ASSERT_TRUE(conf.is_defined("base_value"));
ASSERT_TRUE(conf.is_defined("base_value_2"));
ASSERT_TRUE(conf.is_defined("paths"));
ASSERT_FALSE(conf.is_defined("unknown_base_value"));

/* Test fetching of a regular string without any environment variable */
std::string base_value_string = conf.get_scalar<std::string>("base_value.string", default_value);
auto base_value_string = conf.get_scalar<std::string>("base_value.string", default_value);
ASSERT_EQ(base_value_string, "my_string");

/* Test fetching of escaped environment variable format. Should return the string as-is after stripping the leading `$` */
std::string base_value_invalid = conf.get_scalar<std::string>("base_value.invalid", default_value);
auto base_value_invalid = conf.get_scalar<std::string>("base_value.invalid", default_value);
ASSERT_EQ(base_value_invalid, "${ENV_VAR}");

/* Test fetching of invalid escaped environment variable format. Should return the string as-is */
std::string base_value_invalid_env = conf.get_scalar<std::string>("base_value.invalid_env", default_value);
/* Test fetching of invalid escaped environment variable format. Should return the string as-is */
auto base_value_invalid_env = conf.get_scalar<std::string>("base_value.invalid_env", default_value);
ASSERT_EQ(base_value_invalid_env, "$$ENV_VAR");

/* Test fetching of strings that contain environment variables */
std::string base_value_id = conf.get_scalar<std::string>("base_value.id", default_value);
auto base_value_id = conf.get_scalar<std::string>("base_value.id", default_value);
ASSERT_EQ(base_value_id, "$ENV_VAR"); // Does not follow the `${VAR}` format, so it should be treated as a regular string

std::string base_value_name = conf.get_scalar<std::string>("base_value.name", default_value);
auto base_value_name = conf.get_scalar<std::string>("base_value.name", default_value);
ASSERT_EQ(base_value_name, env_var_value); // Proper environment variable format

std::string base_value_escaped = conf.get_scalar<std::string>("base_value.escaped", default_value);
auto base_value_escaped = conf.get_scalar<std::string>("base_value.escaped", default_value);
ASSERT_EQ(base_value_escaped, env_var_value); // Environment variable within quotes

/* Test fetching of an undefined environment variable. Expected to return the default value.*/
std::string unknown_boolean = conf.get_scalar<std::string>("base_value.subvalue.subvalue2.boolean", default_value);
ASSERT_EQ(unknown_boolean, default_value);
/* Test fetching of an undefined environment variable. Resolves to empty string. */
auto unknown_boolean = conf.get_scalar<std::string>("base_value.subvalue.subvalue2.boolean", default_value);
ASSERT_EQ(unknown_boolean, "");

/* Test fetching of environment variables from a list */
std::string base_value_2_list_0 = conf.get_scalar<std::string>("base_value_2.sample_list[0]", default_value);
auto base_value_2_list_0 = conf.get_scalar<std::string>("base_value_2.sample_list[0]", default_value);
ASSERT_EQ(base_value_2_list_0, env_var_value); // Proper environment variable format

std::string base_value_2_list_1 = conf.get_scalar<std::string>("base_value_2.sample_list[1]", default_value);
ASSERT_EQ(base_value_2_list_1, " ${ENV_VAR}"); // Environment variable preceded by a space, hence treated as a regular string
auto base_value_2_list_1 = conf.get_scalar<std::string>("base_value_2.sample_list[1]", default_value);
ASSERT_EQ(base_value_2_list_1, " " + env_var_value); // Environment variable preceded by a space, still extracted env var with leading space

std::string base_value_2_list_2 = conf.get_scalar<std::string>("base_value_2.sample_list[2]", default_value);
ASSERT_EQ(base_value_2_list_2, "$UNSED_XX_X_X_VAR"); // Does not follow the `${VAR}` format, so should be treated as a regular string
auto base_value_2_list_2 = conf.get_scalar<std::string>("base_value_2.sample_list[2]", default_value);
ASSERT_EQ(base_value_2_list_2, env_var_value + " "); // Environment variable followed by a space, still extracted env var with trailing space

/* Clear the set environment variable after testing */
SET_ENV_VAR(env_var_name.c_str(), env_var_value.c_str());
auto base_value_2_list_3 = conf.get_scalar<std::string>("base_value_2.sample_list[3]", default_value);
ASSERT_EQ(base_value_2_list_3, "$UNSED_XX_X_X_VAR"); // Does not follow the `${VAR}` format, so should be treated as a regular string

/* Test expansion of environment variables within strings */
auto path_list_0 = conf.get_scalar<std::string>("paths[0]", default_value);
ASSERT_EQ(path_list_0, env_var_value + "/foo"); // Even if env var is part of bigger string, it gets expanded

auto path_list_1 = conf.get_scalar<std::string>("paths[1]", default_value);
ASSERT_EQ(path_list_1, "$ENV_VAR/foo"); // Does not follow the `${VAR}` format, so should be treated as a regular string

auto path_list_2 = conf.get_scalar<std::string>("paths[2]", default_value);
ASSERT_EQ(path_list_2, "/foo/" + env_var_value + "/"); // Even when env var is in the middle of a string. it gets expanded

auto path_list_3 = conf.get_scalar<std::string>("paths[3]", default_value);
ASSERT_EQ(path_list_3, "/" + env_var_value + "/" + env_var_value + env_var_value + "/foo"); // Even when the string contains multiple env vars they are correctly expanded

auto path_list_4 = conf.get_scalar<std::string>("paths[4]", default_value);
ASSERT_EQ(path_list_4, env_var_value + "/foo"); // Even when the env var contains another env var, it gets correctly double-expanded

/* Check that variable expansion is type-aware */
auto boolean = conf.get_scalar<bool>("is_test", false);
ASSERT_EQ(boolean, true); // `true` can be parsed to bool.

auto boolean_as_str = conf.get_scalar<std::string>("is_test", "false");
ASSERT_EQ(boolean_as_str, "true"); // `true` can be parsed to string.

auto boolean_as_int = conf.get_scalar<int32_t>("is_test", 0);
ASSERT_EQ(boolean_as_int, 0); // `true` cannot be parsed to integer.

auto integer = conf.get_scalar<int32_t>("num_test", -1);
ASSERT_EQ(integer, 12);

// An env var that resolves to an empty string returns ""
auto empty_default_str = conf.get_scalar<std::string>("empty_test", default_value);
ASSERT_EQ(empty_default_str, "");

std::list<falco_configuration::plugin_config> plugins;
conf.get_sequence<std::list<falco_configuration::plugin_config>>(plugins, std::string("plugins"));
std::vector<falco_configuration::plugin_config> m_plugins{ std::make_move_iterator(std::begin(plugins)),
std::make_move_iterator(std::end(plugins)) };
ASSERT_EQ(m_plugins[0].m_name, "k8saudit");
ASSERT_EQ(m_plugins[0].m_library_path, "/foo/" + env_var_value + "/libk8saudit.so");
ASSERT_EQ(m_plugins[0].m_open_params, "12");

/* Clear the set environment variables after testing */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just unsetenv?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On windows we should map unsetenv to _putenv(string,"") anyway, so i decided to avoid adding a new macro and just use what i got :)

SET_ENV_VAR(env_var_name.c_str(), "");
SET_ENV_VAR(embedded_env_var_name.c_str(), "");
SET_ENV_VAR(bool_env_var_name.c_str(), "");
SET_ENV_VAR(int_env_var_name.c_str(), "");
SET_ENV_VAR(empty_env_var_name.c_str(), "");
}

TEST(Configuration, configuration_webserver_ip)
Expand Down
139 changes: 102 additions & 37 deletions userspace/falco/yaml_helper.h
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,40 @@ limitations under the License.
#include "event_drops.h"
#include "falco_outputs.h"

class yaml_helper;

class yaml_visitor {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yaml_visitor is a small visitor API for yaml files, to call a callback function on each scalar value.
It is private to yaml_helper since it receives non-constant YAML::Nodes (thus it can modify the parsed yaml structure).

private:
using Callback = std::function<void(YAML::Node&)>;
yaml_visitor(Callback cb): seen(), cb(std::move(cb)) {}

void operator()(YAML::Node &cur) {
seen.push_back(cur);
if (cur.IsMap()) {
for (YAML::detail::iterator_value pair : cur) {
descend(pair.second);
}
} else if (cur.IsSequence()) {
for (YAML::detail::iterator_value child : cur) {
descend(child);
}
} else if (cur.IsScalar()) {
cb(cur);
}
}

void descend(YAML::Node &target) {
if (std::find(seen.begin(), seen.end(), target) == seen.end()) {
(*this)(target);
}
}

std::vector<YAML::Node> seen;
Callback cb;

friend class yaml_helper;
};

/**
* @brief An helper class for reading and editing YAML documents
*/
Expand All @@ -49,6 +83,7 @@ class yaml_helper
void load_from_string(const std::string& input)
{
m_root = YAML::Load(input);
pre_process_env_vars();
}

/**
Expand All @@ -57,6 +92,7 @@ class yaml_helper
void load_from_file(const std::string& path)
{
m_root = YAML::LoadFile(path);
pre_process_env_vars();
}

/**
Expand All @@ -77,44 +113,8 @@ class yaml_helper
get_node(node, key);
if(node.IsDefined())
{
std::string value = node.as<std::string>();

// Helper function to convert string to the desired type T
auto convert_str_to_t = [&default_value](const std::string& str) -> T {
std::stringstream ss(str);
T result;
if (ss >> result) return result;
return default_value;
};

// If the value starts with `$$`, check for a subsequent `{...}`
if (value.size() >= 3 && value[0] == '$' && value[1] == '$')
{
// If after stripping the first `$`, the string format is like `${VAR}`, treat it as a plain string and don't resolve.
if (value[2] == '{' && value[value.size() - 1] == '}')
{
value = value.substr(1);
return convert_str_to_t(value);
}
else return convert_str_to_t(value);
}

// Check if the value is an environment variable reference
if(value.size() >= 2 && value[0] == '$' && value[1] == '{' && value[value.size() - 1] == '}')
{
// Format: ${ENV_VAR_NAME}
std::string env_var = value.substr(2, value.size() - 3);

const char* env_value = std::getenv(env_var.c_str()); // Get the environment variable value
if(env_value) return convert_str_to_t(env_value);

return default_value;
}

// If it's not an environment variable reference, return the value as is
return node.as<T>();
return node.as<T>(default_value);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more custom logic needed.

Copy link
Contributor Author

@FedeDP FedeDP Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BREAKING CHANGE: in #2562 env var that resolved to empty string (both non-set env vars and env vars set to "") returned default value.
This was in constrast with everything else in our yaml impl (ie: we only returned default value if node is not defined).
I reset to the same behavior of all other variables.

Copy link
Contributor Author

@FedeDP FedeDP Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BREAKING CHANGE: node.as<T>(default_value); will return default_value when it is not able to parse the node to T.
It makes sense IMHO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

return default_value;
}

Expand Down Expand Up @@ -153,6 +153,71 @@ class yaml_helper
private:
YAML::Node m_root;

/*
* When loading a yaml file,
* we immediately pre process all scalar values through a visitor private API,
* and resolve any "${env_var}" to its value;
* moreover, any "$${str}" is resolved to simply "${str}".
*/
void pre_process_env_vars()
{
yaml_visitor([](YAML::Node &scalar) {
auto value = scalar.as<std::string>();
auto start_pos = value.find('$');
while (start_pos != std::string::npos)
{
auto substr = value.substr(start_pos);
// Case 1 -> ${}
if (substr.rfind("${", 0) == 0)
{
auto end_pos = substr.find('}');
if (end_pos != std::string::npos)
{
// Eat "${" and "}" when getting the env var name
auto env_str = substr.substr(2, end_pos - 2);
const char* env_value = std::getenv(env_str.c_str()); // Get the environment variable value
if(env_value)
{
// env variable name + "${}"
value.replace(start_pos, env_str.length() + 3, env_value);
}
else
{
value.erase(start_pos, env_str.length() + 3);
}
}
else
{
// There are no "}" chars anymore; just break leaving rest of value untouched.
break;
}
}
// Case 2 -> $${}
else if (substr.rfind("$${", 0) == 0)
{
auto end_pos = substr.find('}');
if (end_pos != std::string::npos)
{
// Consume first "$" token
value.erase(start_pos, 1);
}
else
{
// There are no "}" chars anymore; just break leaving rest of value untouched.
break;
}
start_pos++; // consume the second '$' token
}
else
{
start_pos += substr.length();
}
start_pos = value.find("$", start_pos);
}
scalar = value;
})(m_root);
}

/**
* Key is a string representing a node in the YAML document.
* The provided key string can navigate the document in its
Expand Down
Loading