package formulas #6

Open
zbeekman opened this issue Jul 4, 2015 · 17 comments

@zbeekman
Member

zbeekman commented Jul 4, 2015

Hi all,
Just a suggestion about how to manage different packages, and a place to discuss.

I really like the way Homebrew keeps track of packages. It stores instructions for downloading, building and installing packages as relatively simple ruby scripts. I think this idea is much more powerful than trying to do something like duplicating package sources via cloning existing git repos. If FLATPack used a similar formula structure, the formula can indicate which compilers a certain package works with (and what versions and/or which language features or standards are required) and contain specialized build instructions for each compiler.

Here is a link to the Homebrew documentation for writing formulas and for submitting pull requests to add them to Homebrew.

Here is the json-fortran Homebrew formula for reference. It’s pretty concise and simple. Homebrew also has a create command: you pass it a URL, and it extracts the version info and the sha256 hash and generates a stub formula for you to expand. The bottle do block just lists precompiled binaries, which are built during the formula testing process and added automatically by the testbot:

class JsonFortran < Formula
  desc "A Fortran 2008 JSON API"
  homepage "https://github.com/jacobwilliams/json-fortran"
  url "https://github.com/jacobwilliams/json-fortran/archive/4.1.1.tar.gz"
  sha256 "97f258d28536035ef70e9ead5c7053e654106760a12db2cc652587ed61b76124"

  head "https://github.com/jacobwilliams/json-fortran.git"

  bottle do
    cellar :any
    sha256 "3b6410ef26c24d63f90e420aae0157f7d97b4d154b398305863e2be6c24eed8d" => :yosemite
    sha256 "5608f04857515ce6b38d6a7ade2cf50a15541cb307ff97cbde1d367af3b19801" => :mavericks
    sha256 "443ce5965a801c7e3dda0dfc5762b9f84ec97bf450d98eedfe0385d3681a725e" => :mountain_lion
  end

  option "with-unicode-support", "Build json-fortran to support unicode text in json objects and files"
  option "without-test", "Skip running build-time tests (not recommended)"
  option "without-robodoc", "Do not build and install ROBODoc generated documentation for json-fortran"

  depends_on "robodoc" => [:recommended, :build]
  depends_on "cmake" => :build
  depends_on :fortran

  def install
    mkdir "build" do
      args = std_cmake_args
      args << "-DUSE_GNU_INSTALL_CONVENTION:BOOL=TRUE" # Use more GNU/Homebrew-like install layout
      args << "-DENABLE_UNICODE:BOOL=TRUE" if build.with? "unicode-support"
      args << "-DSKIP_DOC_GEN:BOOL=TRUE" if build.without? "robodoc"
      system "cmake", "..", *args
      system "make", "check" if build.with? "test"
      system "make", "install"
    end
  end

  test do
    ENV.fortran
    (testpath/"json_test.f90").write <<-EOS.undent
      program json_test
        use json_module
        use ,intrinsic :: iso_fortran_env ,only: error_unit
        implicit none
        call json_initialize()
        if ( json_failed() ) then
          call json_print_error_message(error_unit)
          stop 2
        endif
      end program
    EOS
    system ENV.fc, "-ojson_test", "-ljsonfortran", "-I#{HOMEBREW_PREFIX}/include", testpath/"json_test.f90"
    system "./json_test"
  end
end

This encodes everything Homebrew needs to know about json-fortran, and these are the build options and info it presents to the user:

$ brew info json-fortran
json-fortran: stable 4.1.1 (bottled), HEAD
A Fortran 2008 JSON API
https://github.com/jacobwilliams/json-fortran
/usr/local/homebrew/Cellar/json-fortran/4.1.1 (34 files, 1.3M) *
  Built from source with: --with-unicode-support
From: https://github.com/Homebrew/homebrew/blob/master/Library/Formula/json-fortran.rb
==> Dependencies
Build: robodoc ✔, cmake ✘
Recommended: robodoc ✔
==> Options
--with-unicode-support
    Build json-fortran to support unicode text in json objects and files
--without-robodoc
    Do not build and install ROBODoc generated documentation for json-fortran
--without-test
    Skip running build-time tests (not recommended)
--HEAD
    Install HEAD version
@cmacmackin
Contributor

We were already thinking of doing something along these lines. Originally we were wondering if perhaps this information could just be encoded in JSON, though. My thinking was that within the JSON would be the URL for a git repository. Then there would be a dictionary mapping version numbers to commit SHAs. As you say, there would also need to be instructions on how to go about compiling including, perhaps, certain compile options. We'd also need a list of compilers (and versions) which will work, perhaps changing depending on the version of the package. Much of this logic would work regardless of whether this was in JSON or Python. The advantage of JSON would be that the package maintainer wouldn't need to know Python. Of course, doing it in Python makes it easier for the maintainer to add more complicated options and build logic. It would also mean that we (the authors of FLATPack) would not need to worry about messing around converting JSON to a Package object.
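A minimal sketch of the JSON idea described above, assuming a formula layout that nobody has agreed on yet: the field names, the compiler list, the build commands and the SHA placeholder are all illustrative, and only the standard library is used.

```python
import json
from dataclasses import dataclass, field

# Hypothetical formula layout; none of these field names are settled.
FORMULA_JSON = """
{
  "name": "json-fortran",
  "repository": "https://github.com/jacobwilliams/json-fortran.git",
  "versions": {"4.1.1": "<commit SHA for 4.1.1>"},
  "compilers": ["gnu", "intel"],
  "build": ["cmake ..", "make", "make install"]
}
"""


@dataclass
class Package:
    name: str
    repository: str                 # URL of the git repository
    versions: dict                  # version string -> commit SHA
    compilers: list                 # names of compilers known to work
    build: list = field(default_factory=list)  # shell commands, in order

    def commit_for(self, version):
        """Return the commit SHA recorded for a version."""
        return self.versions[version]


def load_package(text):
    """Build a Package object from the JSON text of a formula."""
    return Package(**json.loads(text))


package = load_package(FORMULA_JSON)
print(package.name, sorted(package.versions))
```

The conversion step is a single call here because the JSON keys match the dataclass fields one to one; anything more elaborate (per-version compiler lists, per-compiler build commands) would need real conversion logic.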

Final thought: how do we go about choosing the compiler at build time? This would, presumably, depend on the build system. Perhaps have the package maintainer include a different set of build instructions for each compiler?

@zbeekman
Member Author

zbeekman commented Jul 4, 2015

See #7... Basically I think you just build a version of the package for all possible compilers when you install the package. This will be the intersection of available compilers on the system and compilers supported by the package. Maybe the user can also limit what is built.
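As a rough sketch, that selection is a set intersection with an optional user restriction. The function name and the compiler names are made up for illustration.

```python
def compilers_to_build(available, supported, only=None):
    """Return the sorted list of compilers to build a package with.

    available: compilers found on the system
    supported: compilers the package formula says it works with
    only:      optional user restriction on what gets built
    """
    targets = set(available) & set(supported)
    if only is not None:
        targets &= set(only)
    return sorted(targets)


print(compilers_to_build(["gnu", "intel"], ["gnu", "nag"]))  # ['gnu']
```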

@cmacmackin
Contributor

I realize that. What I mean is, how do we tell the build system which compiler to use? This may be a trivial question, but I'm not that familiar with anything more complicated than make.
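For reference, one common convention: both GNU make's built-in rules and CMake read the Fortran compiler from the FC environment variable (CMake only on the first configure of a fresh build directory), and CMake also accepts -DCMAKE_Fortran_COMPILER=... on the command line. A sketch of how FLATPack might use that follows; the mapping from compiler names to executables and both function names are assumptions, and packages with hand-written Makefiles may ignore FC entirely.

```python
import os

# Assumed mapping from a FLATPack compiler name to its executable.
COMPILER_EXECUTABLES = {"gnu": "gfortran", "intel": "ifort", "nag": "nagfor"}


def build_environment(compiler, base_env=None):
    """Return a copy of the environment with FC set for the chosen compiler."""
    env = dict(os.environ if base_env is None else base_env)
    env["FC"] = COMPILER_EXECUTABLES[compiler]
    return env


def cmake_configure_command(compiler, source_dir=".."):
    """Return a cmake command line that names the Fortran compiler explicitly."""
    return ["cmake", source_dir,
            "-DCMAKE_Fortran_COMPILER=" + COMPILER_EXECUTABLES[compiler]]


env = build_environment("intel", base_env={})
print(env["FC"])
print(cmake_configure_command("gnu"))
# The environment would then be handed to the build tool, for example:
# subprocess.run(["make"], env=build_environment("gnu"), check=True)
```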

@milancurcic

Some ideas and suggestions:

  1. At initial setup, flatpack could scan the system and/or prompt the user for paths to available compilers. The compilers accepted would have to be on flatpack's list of supported compilers, since each comes with its own peculiarities and flags that flatpack would have to know about.
  2. flatpack should ask the user what will be the default build environment (compiler, memory model etc.), and use the default whenever the build options are not specified. The default (say, gnu) could be overridden by the user with a flag, for example:
    flatpack build package-name [--build-env=intel]
  3. Building with all (available & supported) compilers by default may not be ideal in all cases - some codes may take over an hour to compile, especially if optimized. I think having a single default build setting that can be overridden is the way to go (my $.02)
  4. Another difficulty that you guys touched on is that each package comes with its own build system. Some use make, cmake, scons; some may support multiple, some may support none (i.e. build yourself). flatpack will need to have defined rules (say, in a json file) how to build a specific package.
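The default-with-override behaviour from point 2 could look something like this with argparse. The list of environments and the default are placeholders; in practice both would come from the initial setup step in point 1.

```python
import argparse

SUPPORTED_ENVS = ("gnu", "intel")   # illustrative list of build environments
DEFAULT_ENV = "gnu"                 # would come from the initial setup


def make_parser():
    """Build a parser for: flatpack build package-name [--build-env=ENV]."""
    parser = argparse.ArgumentParser(prog="flatpack")
    subcommands = parser.add_subparsers(dest="command", required=True)
    build = subcommands.add_parser("build", help="build a package")
    build.add_argument("package")
    build.add_argument("--build-env", choices=SUPPORTED_ENVS, default=DEFAULT_ENV)
    return parser


args = make_parser().parse_args(["build", "json-fortran", "--build-env=intel"])
print(args.package, args.build_env)
```

Using choices means an unsupported environment is rejected by the parser itself, before any build starts.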

@cmacmackin
Contributor

I've pushed some sample JSON files to the repo, to provoke conversation. compilers.json would be a system file describing the available compilers. json-fortran.json is an example package file for json-fortran. The build commands which I provide for the Intel and NAG compilers are meaningless--I'm not sure what the proper way to choose the compiler is, so I just added something random to indicate that the commands would be different compared to GNU.

@cmacmackin
Contributor

Looking at YAML, I've concluded that it looks like a much nicer (and certainly more Pythonic) serialization format to use than JSON. There is the additional advantage that PyYAML makes it trivially easy to convert the YAML data into a Python object. Take a look at the compilers.yml and json-fortran.yml, as well as compiler.py to see.

@szaghi
Member

szaghi commented Jul 6, 2015

Hi @cmacmackin
I am using YAML and PyYAML in a testing version of MaTiSSe. It seems very handy, but in my case (YAML mixed into dirty markdown) it has some limitations, so I am testing a fork that does a better job of returning more data from the parsed stream (in particular the line and column of each parsed YAML token). I will study your newly pushed examples ASAP.

@zbeekman
Member Author

zbeekman commented Jul 6, 2015

In response to @milancurcic’s comment:

At initial setup, flatpack could scan the system and/or prompt the user for paths to available compilers. The compilers accepted would have to be on flatpack’s list of supported compilers, since each comes with its own peculiarities and flags that flatpack would have to know about.

👍

flatpack should ask the user what will be the default build environment (compiler, memory model etc.), and use the default whenever the build options are not specified. The default (say, gnu) could be overridden by the user with a flag, for example:
flatpack build package-name [--build-env=intel]

See #7. I think in general FLATPack should build a version for each supported compiler on the system, with the option for the user to restrict the build to a single compiler. Using @cmacmackin’s example files, when installing json-fortran FLATPack would by default cross-reference the compilers listed in the json-fortran.yml formula with those available on the given system according to compilers.yml, and build a version of json-fortran for each compiler that appears in both. Then the user can switch which compiler environment is available, as described in #7. Building a version for each compiler will be time consuming, but it will make installation of a given package a one-time event for sysadmins and will make each package highly available to users. Thoughts?

Building with all (available & supported) compilers by default may not be ideal in all cases - some codes may take over an hour to compile, especially if optimized. I think having a single default build setting that can be overridden is the way to go (my $.02)

Fair point… perhaps this is best handled with a setting. Let the user/sysadmin choose the desired behavior? Or maybe large/slow projects can be flagged in their formulas as single-environment installs only?

Another difficulty that you guys touched on is that each package comes with its own build system. Some use make, cmake, scons; some may support multiple, some may support none (i.e. build yourself). flatpack will need to have defined rules (say, in a json file) how to build a specific package.

Yes, it certainly will. I think that this is unavoidable and that we should follow the Homebrew philosophy of defining package formulas, kept under version control, that anybody can contribute via a pull request. I don’t know of any other way to get around this issue.

@zbeekman
Member Author

zbeekman commented Jul 6, 2015

@cmacmackin : I love the ease of use of YAML with Python and the simple syntax. The proposed formula looks great in JSON or YAML. I think that we want to keep as little duplicate information in the package formulae as possible though—I’d prefer to leave out author, maintainer and email info. That information should be available from the project site.

My one hesitation about YAML is that it can be a bit awkward to embed shell commands in it sometimes due to its strict indentation rules, but that certainly is not a deal breaker. I think the benefits probably outweigh this shortcoming.

I think most packages have builds that can be categorized into two to five phases, and it might be worthwhile for FLATPack to know about them:

  1. Configure: Let FLATPack formulas have an opportunity to communicate to the package what the install structure will look like, and let the user pass in special options. In GNU Autotools this is the ./configure … step and for CMake this is the cmake … or ccmake … or cmake-gui … step. Homebrew has a set of std_cmake_args that can be specified in formulae to help set the install prefix etc. Not all packages need a configure stage; some may not have anything that needs to be done here.
  2. Build: Most of the time make will likely be executed here, but it’s possible that this step could be a shell script or a build with SCons, FoBiS.py, foray or some other tool. All packages will need to be compiled.
  3. Build verification: This will be something like make check to verify that the package that was just built passes unit tests and regression tests by the author. Again this step might be optional but encouraged.
  4. Installation: All packages will need to be installed after they are built. Binary objects are put in their proper locations, according to the convention outlined in Proposed installation management #7 or a different agreed-upon convention.
  5. Installation verification: Similar to build verification, but in this step access to the installed package from client code is tested. Can we use the module defined in the package and link against the libraries it provides? This step can likely be done completely in lieu of the Build Verification because it can test both functionality and availability. This is similar to Homebrew’s test do formula block.
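The five phases above could be modelled roughly as follows. This is only a sketch: the phase names, the Formula class and the plan function are hypothetical, and the commands are taken from the json-fortran example rather than from any agreed formula format.

```python
from dataclasses import dataclass, field

# Phases in execution order; only build and install are mandatory.
PHASES = ("configure", "build", "check", "install", "test")
REQUIRED = {"build", "install"}


@dataclass
class Formula:
    name: str
    # phase name -> list of commands, each command a list of arguments
    commands: dict = field(default_factory=dict)


def plan(formula, skip=()):
    """Return (phase, command) pairs in the order they would be run."""
    missing = REQUIRED - set(formula.commands)
    if missing:
        raise ValueError(f"{formula.name}: missing required phase(s) {sorted(missing)}")
    steps = []
    for phase in PHASES:
        if phase in skip and phase not in REQUIRED:
            continue  # optional phases may be skipped on request
        for command in formula.commands.get(phase, []):
            steps.append((phase, command))
    return steps


formula = Formula("json-fortran", {
    "configure": [["cmake", ".."]],
    "build": [["make"]],
    "check": [["make", "check"]],
    "install": [["make", "install"]],
})
for phase, command in plan(formula, skip=("check",)):
    print(phase, " ".join(command))
```

Returning a plan instead of running the commands directly keeps the ordering logic testable without a compiler; a runner would pass each command to subprocess.run in turn.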

@cmacmackin
Contributor

@zbeekman @milancurcic
I think perhaps the way to go is to have a flag for large packages, indicating that they will take a long time to build. If this is flagged then FLATPack can query the user as to whether they want to build for all compilers or just a subset of them.
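Something along these lines, assuming the formula carries a boolean flag (called slow_build here, a made-up name) and that FLATPack supplies some way of asking the user:

```python
def choose_compilers(candidates, slow_build, ask):
    """Pick the compilers to build a package with.

    candidates: compilers that are both available and supported
    slow_build: True if the formula flags the package as slow to build
    ask:        callable that takes the candidates and returns the
                subset the user wants (only called for slow packages)
    """
    if not slow_build:
        return list(candidates)
    wanted = set(ask(candidates))
    # Keep the original order and ignore anything that is not a candidate.
    return [c for c in candidates if c in wanted]


print(choose_compilers(["gnu", "intel"], True, lambda names: ["intel"]))  # ['intel']
```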

@zbeekman
Member Author

zbeekman commented Jul 7, 2015

Sounds good

@cmacmackin
Contributor

As I've been thinking more about this project, I've realized how little I know about build processes. As such, I'm probably not the best person to be designing the structure of FLATPack formulas. Some initial ideas can be found here. In particular, I would be interested in people's thoughts about how to specify different compile options. For example, how to compile with documentation, how to compile with multi-threading, etc. Or perhaps it would be better just to have separate packages for docs (as done in Debian/Ubuntu) and for multi-threaded versions.

@zbeekman
Member Author

I think that the formula just needs to be able to be told about build details that are provided by the actual software packages. The initial formulas look like they are definitely on the right track. FLATPack need not force conventions on packages, rather it should just require them to adhere to sane Unix-like install conventions. I can work on this some more when I get some free time, but right now finishing my thesis is the main priority.

@cmacmackin
Contributor

Okay, take your time. Good luck on your thesis.

@szaghi
Member

szaghi commented Jul 24, 2015

Go Zaak, this is your time to fly! We are all with you!

@cmacmackin I am not sure how I can help you. I like Zaak's idea that FLATPack should be as agnostic as possible. Why are you bothering with compilation flags and other such low-level "garbage"?

@cmacmackin
Contributor

Well, I'd be fine with avoiding that sort of stuff. I just noticed that it was present in the Homebrew formula. For example, there are the lines specifying different install options:

  option "with-unicode-support", "Build json-fortran to support unicode text in json objects and files"
  option "without-test", "Skip running build-time tests (not recommended)"
  option "without-robodoc", "Do not build and install ROBODoc generated documentation for json-fortran"

There are also some lines specifying arguments:

      args = std_cmake_args
      args << "-DUSE_GNU_INSTALL_CONVENTION:BOOL=TRUE" # Use more GNU/Homebrew-like install layout
      args << "-DENABLE_UNICODE:BOOL=TRUE" if build.with? "unicode-support"
      args << "-DSKIP_DOC_GEN:BOOL=TRUE" if build.without? "robodoc"

That may not be necessary for FLATPack, but as I'm not experienced with build systems I am not sure.

@zbeekman
Member Author

Yes, we will need some generic install related options for CMake, Makefiles etc. The formula just provides specification of those things if necessary to conform with the FLATPack install structure, etc… There are standard conventions for relocating install directories that we can take advantage of. I’m happy to help out with this logic, but probably not for a month or two.
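To make the relocation conventions concrete, here is a sketch of how FLATPack might translate an install prefix into the argument each build system expects. -DCMAKE_INSTALL_PREFIX and ./configure --prefix are the standard mechanisms for CMake and Autotools; PREFIX= for plain Makefiles is only a widespread convention that individual packages may not honor. The function name and the example install path are hypothetical.

```python
def prefix_arguments(build_system, prefix):
    """Return the extra argument(s) that relocate the install directory."""
    if build_system == "cmake":
        return ["-DCMAKE_INSTALL_PREFIX=" + prefix]
    if build_system == "autotools":
        return ["--prefix=" + prefix]
    if build_system == "make":
        # Convention only: not every hand-written Makefile reads PREFIX.
        return ["PREFIX=" + prefix]
    raise ValueError("unknown build system: " + build_system)


# Hypothetical install location; the real layout is whatever #7 settles on.
prefix = "/opt/flatpack/json-fortran/4.1.1/gnu"
print(["cmake", ".."] + prefix_arguments("cmake", prefix))
print(["./configure"] + prefix_arguments("autotools", prefix))
```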
