Postpone removal of old common data to #1676
alecandido committed Feb 22, 2023
1 parent 37b4aab commit d74b3ce
Showing 1,927 changed files with 174,968 additions and 0 deletions.
77 changes: 77 additions & 0 deletions buildmaster/CMakeLists.txt
@@ -0,0 +1,77 @@
cmake_minimum_required (VERSION 3.0.2)

# Set a default build type if none was specified

if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
message(STATUS "Setting build type to 'Release' as none was specified.")
set(CMAKE_BUILD_TYPE Release CACHE STRING "Choose the type of build." FORCE)
# Set the possible values of build type for cmake-gui
set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS "Debug" "Release"
"MinSizeRel" "RelWithDebInfo")
endif()

project(buildmaster)

IF(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)
SET(CMAKE_INSTALL_PREFIX
"${CMAKE_SOURCE_DIR}" CACHE PATH "buildmaster install location" FORCE)
ENDIF(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)

SET(CMAKE_BUILD_WITH_INSTALL_RPATH FALSE)

# Allow people to set their own rpath on command line
SET(CONDA_PREFIX "$ENV{CONDA_PREFIX}")

if((NOT CMAKE_INSTALL_RPATH) AND (CONDA_PREFIX))
message(STATUS "Setting rpath to ${CONDA_PREFIX}/lib.")
SET(CMAKE_INSTALL_RPATH "$ENV{CONDA_PREFIX}/lib")
endif()

set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)

# check for dependencies

find_package(PkgConfig REQUIRED)
pkg_search_module(GSL REQUIRED gsl)
pkg_search_module(YAML REQUIRED yaml-cpp)
pkg_search_module(NNPDF REQUIRED nnpdf)

# linux complains if we don't explicitly link these

pkg_search_module(SQLITE REQUIRED sqlite3)
pkg_search_module(LHAPDF REQUIRED lhapdf)
pkg_search_module(LIBARCHIVE REQUIRED libarchive)

set(DEFAULT_CXX_OPTIONS "-Wall -Wextra -march=nocona -mtune=haswell \
-fvisibility-inlines-hidden -fmessage-length=0 \
-ftree-vectorize -fPIC -fstack-protector-strong \
-fsanitize=address \
-O2 -pipe")

# Get rid of annoying semi colon in pkg_config output - slightly hacky way
# https://stackoverflow.com/a/35458217

string(REPLACE ";" " " NNPDF_CFLAGS "${NNPDF_CFLAGS}")

set(CMAKE_CXX_FLAGS
"${DEFAULT_CXX_OPTIONS} ${NNPDF_CFLAGS} ${GSL_CFLAGS} ${YAML_CFLAGS}"
CACHE STRING "compile flags" FORCE)
set(CMAKE_EXE_LINKER_FLAGS
"${CMAKE_EXE_LINKER_FLAGS}" CACHE STRING "linker flags" FORCE)

AUX_SOURCE_DIRECTORY(filters FILTER_SRC)

add_executable(buildmaster src/buildmaster.cc
${FILTER_SRC}
src/common.cc
src/buildmaster_utils.cc)
include_directories(src inc)
target_link_libraries(buildmaster
${NNPDF_LDFLAGS} ${YAML_LDFLAGS}
${GSL_LDFLAGS} ${SQLITE_LDFLAGS}
${LHAPDF_LDFLAGS} ${LIBARCHIVE_LDFLAGS})
install(TARGETS buildmaster DESTINATION .
PERMISSIONS OWNER_READ OWNER_WRITE OWNER_EXECUTE GROUP_READ
GROUP_EXECUTE WORLD_READ WORLD_EXECUTE)
133 changes: 133 additions & 0 deletions buildmaster/README.md
@@ -0,0 +1,133 @@
# buildmaster

The `buildmaster` project provides a systematic mechanism to convert
experimental data provided in various formats into a standard (or *master*)
version in the NNPDF CommonData format.

## Dependencies and compilation

`buildmaster` depends on:

Run:

- NNPDF/nnpdf
- yaml-cpp
- gsl

Build:

- cmake >= 3.0.2
- sqlite3
- libarchive
- LHAPDF

(Note that yaml-cpp, gsl, cmake, sqlite3, libarchive and LHAPDF are also
dependencies of NNPDF/nnpdf).

In order to compile, it is recommended to create a fresh build directory:

```
$ mkdir bld
$ cd bld
$ cmake ..
$ make -j && make install
```

## Running the code

To generate a master copy of all experimental data, run the `buildmaster`
program, which by default is installed in the root of this repository.
For each dataset the program generates:

- `DATA_[setname].dat`, placed in the `results` folder
- `SYSTYPE_[setname]_DEFAULT.dat`, placed in `results/systypes`

After generating these files, the user can copy them to the `nnpdfcpp/data/commondata` folder.
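
The output layout described above can be summarised in a small helper. This is only an illustrative sketch: the function names `dataPath` and `systypePath` are hypothetical and are not part of the `buildmaster` sources; only the file-name patterns are taken from the text.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers (not part of buildmaster): build the output paths
// described above, relative to the directory that contains `results`.
std::string dataPath(const std::string& setname) {
  assert(!setname.empty());  // a dataset name is required
  return "results/DATA_" + setname + ".dat";
}

std::string systypePath(const std::string& setname) {
  assert(!setname.empty());  // a dataset name is required
  return "results/systypes/SYSTYPE_" + setname + "_DEFAULT.dat";
}
```

For a dataset named `ATLAS1JET11`, these helpers give `results/DATA_ATLAS1JET11.dat` and `results/systypes/SYSTYPE_ATLAS1JET11_DEFAULT.dat`.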


## Implementing new datasets

In order to implement a new dataset the developer has to:

1. Create input files in `rawdata/<exp>`, where `<exp>` is the name of
the new dataset and must coincide with the `apfelcomb` and `applgrid`
definitions. The input files are raw data files obtained from papers
(copy/paste) or from the HEPData website. The user is free to choose
their preferred format, but csv or plain text are the recommended
formats.

2. Create a metadata file in `meta`, in YAML, with the format
```yaml
# DATASET.yaml
ndata: <number of datapoints>
nsys: <number of systematic errors>
setname: <setname in double quotes, e.g. "ATLAS1JET11">
proctype: <process type (see nnpdf/nnpdfcpp/doc) in double quotes, e.g. "JET">
```
3. Create a new class with the dataset name in `inc` (`*.h`) and
`filters` (`*.cc`), following the pattern of the other datasets: in the
header, create a new class which inherits from `CommonData`:
```c++
class MY_NEW_DATASET_CLASSFilter: public CommonData {
public: MY_NEW_DATASET_CLASSFilter():
  CommonData("MY_NEW_DATASET_NAME") { ReadData(); }
private:
  void ReadData();
};
```
and in the C++ file implement the `void ReadData()` method.

4. The previous class must read from `rawdata` all the required
information about the data, and fill the attributes of the `CommonData`
class in `libnnpdf`. The required entries are:
- the kinematic information: `fKin1`, `fKin2`, `fKin3`
- the data: `fData`
- the statistical uncertainties: `fStat`
- the systematic uncertainties: `fSys`

5. Edit `src/buildmaster.cc` by including the header file of the new
dataset and by pushing back to the `target` object a new instance of
the class created in step 3. Namely:
```c++
target.push_back(new MY_NEW_DATASET_CLASSFilter());
```
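
The steps above can be sketched end to end in a self-contained form. Everything here is an assumption made for illustration: `MockCommonData` is a simplified stand-in for the real `CommonData` class (which lives in `libnnpdf` and differs in detail), `ExampleFilter` and `buildTargets` are hypothetical names, and the raw data is read from an in-memory string with an invented column layout rather than from a file in `rawdata`.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for CommonData, reduced to the entries of step 4.
struct MockCommonData {
  explicit MockCommonData(std::string name) : fSetName(std::move(name)) {}
  virtual ~MockCommonData() = default;
  std::string fSetName;
  std::vector<double> fKin1, fKin2, fKin3;  // kinematic information
  std::vector<double> fData;                // central values
  std::vector<double> fStat;                // statistical uncertainties
  std::vector<std::vector<double>> fSys;    // systematics, one row per point
};

// Hypothetical filter. Assumed row layout (whitespace separated):
//   kin1 kin2 kin3 data stat sys1 sys2 ...
class ExampleFilter : public MockCommonData {
 public:
  explicit ExampleFilter(const std::string& raw) : MockCommonData("EXAMPLE") {
    ReadData(raw);
  }

 private:
  void ReadData(const std::string& raw) {
    std::istringstream lines(raw);
    std::string line;
    while (std::getline(lines, line)) {
      if (line.empty() || line[0] == '#') continue;  // skip blanks and comments
      std::istringstream row(line);
      double k1, k2, k3, value, stat;
      if (!(row >> k1 >> k2 >> k3 >> value >> stat)) continue;  // malformed row
      std::vector<double> sys;
      for (double s; row >> s;) sys.push_back(s);  // remaining columns
      fKin1.push_back(k1);
      fKin2.push_back(k2);
      fKin3.push_back(k3);
      fData.push_back(value);
      fStat.push_back(stat);
      fSys.push_back(sys);
    }
    assert(fData.size() == fSys.size());  // one systematics row per data point
  }
};

// Analogue of step 5: collect filter instances in a `target` container
// (owned through unique_ptr here, unlike the raw `new` in the text).
std::vector<std::unique_ptr<MockCommonData>> buildTargets(const std::string& raw) {
  std::vector<std::unique_ptr<MockCommonData>> target;
  target.push_back(std::make_unique<ExampleFilter>(raw));
  return target;
}
```

Calling `buildTargets` with a string holding two data rows returns a single filter whose `fData`, `fStat` and `fSys` each have two entries. In a real filter the parsing would follow whatever format was chosen in step 1.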

## Important notes

### Symmetrising uncertainties

Occasionally experiments present uncertainties that are asymmetric, i.e.

```
\sigma + \Delta_+ - \Delta_-
```
These must be symmetrised in `buildmaster`, as the CommonData format accepts only
symmetric uncertainties. When provided, the symmetrisation procedure suggested
by the experimental paper should be used. If no such procedure is suggested, we
follow the symmetrisation procedure of D'Agostini [physics/0403086]. This is
implemented in the function `symmetriseErrors` provided in `buildmaster_utils`.
**Be careful** with signs when using this function: it expects all
signs to be present in its arguments. So when symmetrising an error that
appears as above, you should call
```
symmetriseErrors(\Delta_+, - \Delta_-, ... )
```
where the downward uncertainty is passed with its minus sign intact.
The function returns a symmetrised error along with a *shift* to be
applied to the data central value.
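
To make the sign convention concrete, here is a minimal sketch of a symmetrisation of this kind. It is not the `buildmaster_utils` implementation: the name `symmetriseSketch`, the `Symmetrised` struct and the return-by-value interface are hypothetical, and the formulas (shift equal to half the sum of the signed errors, variance equal to the squared mean magnitude plus twice the squared shift) are an assumed form of the prescription that should be checked against the reference and the actual code.

```cpp
#include <cassert>
#include <cmath>

// Result of symmetrising one asymmetric uncertainty.
struct Symmetrised {
  double sigma;  // symmetric uncertainty
  double shift;  // shift to apply to the data central value
};

// Hypothetical sketch. Both arguments carry their signs:
// `up` is +Delta_+ and `down` is -Delta_-.
Symmetrised symmetriseSketch(double up, double down) {
  // This sketch only covers the standard case of the sign convention.
  assert(up >= 0.0 && down <= 0.0);
  const double shift = 0.5 * (up + down);    // half the sum of signed errors
  const double average = 0.5 * (up - down);  // mean of the two magnitudes
  return {std::sqrt(average * average + 2.0 * shift * shift), shift};
}
```

Under these assumptions, an uncertainty quoted as +0.3 and -0.1 is passed as `symmetriseSketch(0.3, -0.1)` and gives a shift of 0.1 and a symmetric error of about 0.245. A symmetric input such as `symmetriseSketch(0.2, -0.2)` gives no shift and an error of 0.2.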

## Code development policy/rules

Developers must never commit code structure modifications to master. The development pattern should follow these rules:

- Open an issue explaining your bug or feature request. If you report a bug, include the information needed to reproduce it.
- Issues must be resolved in a new branch through a pull request.
- If you have a local version of the code that you would like to merge into master, open a pull request.
- The pull request must be reviewed by at least 2 core developers.

## File format documentation

For specifications of the data file formats, please check the `nnpdf` repository at `nnpdf/doc/data/`.