
Andy bennett/684 continue adding datapipeline #91

Open · wants to merge 20 commits into base: dev-pipeline
Conversation

@andyb96247 andyb96247 commented Aug 27, 2020

Data pipeline read functionality should be complete in its current form. The GitHub workflow still doesn't work.

Resolves ScottishCovidResponse/SCRCIssueTracking#684

@andyb96247 andyb96247 requested a review from a team August 27, 2020 16:53
    - name: Compile
      env:
        CC: ${{ matrix.config.compiler }}
        CXX: ${{ matrix.config.compilerpp }}
      run: |
        source .venv/bin/activate
Contributor

I think the reason the CI is currently failing on macOS is that you run source .venv/bin/activate for both the macOS and Ubuntu builds, but the virtual environment is only set up for Ubuntu in the "Data pipeline Dependencies" step.

I guess you would need to split the compile step in two, one for macOS and one for Ubuntu, to fix it.
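
One way to sketch the suggested split, assuming GitHub Actions workflow syntax (the step names here are hypothetical; runner.os is the standard Actions context variable):

```yaml
# Hypothetical sketch: gate each compile step on the runner OS so that
# "source .venv/bin/activate" only runs where the venv was actually created.
- name: Compile (Ubuntu)
  if: runner.os == 'Linux'
  env:
    CC: ${{ matrix.config.compiler }}
    CXX: ${{ matrix.config.compilerpp }}
  run: |
    source .venv/bin/activate
    # build commands as before

- name: Compile (macOS)
  if: runner.os == 'macOS'
  env:
    CC: ${{ matrix.config.compiler }}
    CXX: ${{ matrix.config.compilerpp }}
  run: |
    # build commands without the venv activation
```

Alternatively, a single step could activate the venv conditionally, but two explicit steps keep the per-OS behaviour visible in the workflow log.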

Author

That's a fair point. I'm not familiar enough with macOS to know how to fix this without looking things up. Hopefully it isn't too difficult.

@@ -6,7 +6,7 @@ find_package(GSL REQUIRED)
set(PRE_CONFIGURE_FILE "Git.cpp.in")
set(POST_CONFIGURE_FILE "${CMAKE_CURRENT_BINARY_DIR}/Git.cpp")
include(${CMAKE_SOURCE_DIR}/cmake/git_watcher.cmake)
-add_library(git SHARED ${POST_CONFIGURE_FILE})
+add_library(git STATIC ${POST_CONFIGURE_FILE})
Contributor

Why the change to STATIC?

Author

The CMake for libdatapipeline.a is not building it with '-fPIC', so STATIC is needed here. A better fix might be to add '-fPIC'/SHARED there and keep SHARED here.
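
The alternative fix could look roughly like this in the data pipeline's own CMake (the target name datapipeline is an assumption for illustration; POSITION_INDEPENDENT_CODE is the standard CMake property behind -fPIC):

```cmake
# Hypothetical sketch: build the static data pipeline library with -fPIC so
# that a SHARED git library can still link against it.
set_target_properties(datapipeline PROPERTIES POSITION_INDEPENDENT_CODE ON)

# or, applied to everything in that project:
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
```

With that in place, the original add_library(git SHARED ${POST_CONFIGURE_FILE}) could be kept unchanged.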

// be used in this sum. The value of obs.cases[0][0] will be -1, so unlikely to show
// much?

// for (unsigned int region = 0; region < obs.cases.size() - 1; ++region) {
Contributor

Good spot. Looking at the code, I think you are right: it is incorrect. I've added an issue to track this here: ScottishCovidResponse/SCRCIssueTracking#751

@@ -240,22 +208,64 @@ ObservationsForInference ReadInferenceObservations(const std::string& configDir,
return observations;
}

PredictionConfig ReadPredictionConfig(const std::string& configDir, int index, Utilities::logging_stream::Sptr log, const CommonModelInputParameters& commonParameters)
Contributor

I don't understand the split between ReadPredictionConfig and ReadLocalPredictionConfig. It seems as though all of the data in both functions comes from local sources, rather than the pipeline, so why the split?

Author

The difference is that the "Local" function is always called, with or without the data pipeline (there is a call to it from both paths), since those values are always read from the local files. ReadPredictionConfig only gets called in the non-data-pipeline case, with the equivalent IOdatapipeline::ReadPredictionConfig called in the data pipeline case.

I just wanted to avoid some duplication. The same applies to the LocalInference function.

EXPECT_EQ(parse.getArgs().datapipeline_path, "my_config.yaml");
}

TEST(AnArgumentParser, TerminatesIfNoDataPipelinePath)
Contributor

This test doesn't do what is intended. You can confirm this by altering the body of the test to:

    const char* _test_args_default[] = {"exe", "-c", "foo"};

    EXPECT_EXIT( ArgumentParser(3, _test_args_default), ExitedWithCode(1), "");

The test still passes, when in fact it should fail, given that a path "foo" has been supplied.

The reason that the original test passes is because the model structure and run mode are missing. These are required arguments, and their absence causes the observed termination, not the absence of the data pipeline path.

The correct implementation would be:

    const char* _test_args_default[] = {"exe", "-m", "inference", "-s", "original", "-c"};

    EXPECT_EXIT( ArgumentParser(6, _test_args_default), ExitedWithCode(1), "");

Author

Fixed

const std::string out_dir = std::string(ROOT_DIR)+"/outputs";
Utilities::logging_stream::Sptr logger = std::make_shared<Utilities::logging_stream>(out_dir);

EXPECT_ANY_THROW(IO::IOdatapipeline idp2("../test/datapipeline/parameters.ini", "", logger, "NoValidPath.yaml"););
Contributor

I'd suggest using the ROOT_DIR macro to set up this path, rather than a relative path. With the relative path, the test will not work if it is run from somewhere other than the build directory.

Same comment applies elsewhere in this file.

Author

Changed this

}

namespace {
Contributor

I think this code has been duplicated in a few places. If you need it multiple times, I'd suggest moving it into the Utilities.h file and including it from there.

Author

I think the "operator<<" functions were there just for debugging so I've removed them in both places.

@@ -4,7 +4,7 @@ set(PROJECT_NAME Covid19EERAModel)

project(${PROJECT_NAME} VERSION 0.10.0 LANGUAGES CXX)

-set(CMAKE_CXX_STANDARD 11)
+set(CMAKE_CXX_STANDARD 17)
Contributor

What's the reason for requiring C++17 support?

My concern would be that only relatively up-to-date compilers support it: people working on older systems may struggle to build this code if we require 17. But if we need a feature from it, then it's okay.

Author

The datapipeline.hh file currently uses std::is_same_v, which is a C++17 feature.
