Use ENV variables for configuration, more upgradable (and Heroku support) #36

Open
wants to merge 3 commits into
base: master

davidjrice

So, I'm aware there's another open issue, #33; however, I think the point has been lost amid the discussion of JRuby etc. None of that is required to get it running on Heroku.

The few changes made have allowed me to create the app using the following process:

heroku create
heroku labs:enable user-env-compile

heroku config:add SQUASH_SECRET_TOKEN="xxx"
heroku config:add SQUASH_AUTHENTICATION_PASSWORD_SALT="xxx"
git push heroku master

I'm not happy with where the ENV variable hacks currently live. More of the configuration set during setup.rb still needs to be extracted, and the setup itself may need modifying. However, this approach:

  • encourages passwords / secret tokens etc. not to be included in the source repository
  • allows easy upgradability (as no files are modified this would be as simple as a git pull && git push heroku)
  • allows configuration changes without requiring deployments

I would use this in conjunction with a .env file (added to .gitignore) and instantiating those ENV variables before running the server with

export $(cat .env)

or using the dotenv gem.
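For anyone who would rather do this from Ruby than from the shell, here is a minimal sketch of what loading a .env file amounts to. It assumes simple KEY=value lines; `load_env_file` is a hypothetical helper written for illustration, not part of Squash or of any gem, and it ignores most of the quoting rules a maintained library would handle.

```ruby
# Hypothetical helper: copy simple KEY=value lines from a file into a target
# (ENV by default). Assumption: no multi-line values and no shell expansion.
def load_env_file(path, target = ENV)
  return target unless File.exist?(path)

  File.foreach(path) do |line|
    line = line.strip
    # Skip blank lines and comments.
    next if line.empty? || line.start_with?("#")

    key, value = line.split("=", 2)
    next if key.nil? || value.nil?

    key = key.strip
    # Strip one pair of matching surrounding quotes, if present.
    value = value.strip.sub(/\A(["'])(.*)\1\z/, '\2')
    # Variables already set in the target take precedence over the file.
    target[key] = value unless target.key?(key)
  end
  target
end
```

Calling `load_env_file(".env")` before the server boots would then populate ENV without overriding anything already exported in the shell.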

I would continue further here, but the Heroku file system concerns I've outlined in #35 have pretty much stopped my progress. I'm not sure running on Heroku will be possible without drastic changes to the git repository integration.

Therefore, this is more of a proof of concept, but I thought I would document it here because:

  • a) using ENV variables would still be preferable, regardless of Heroku
  • b) this information may help or inspire anyone else to get this running on Heroku 😃

@RISCfuture
Contributor

So what options do we have for Git repository access? About the only one I can think of is to somehow serialize the repository to the database, and create an in-memory representation of it at runtime. Which sounds very slow.

@nzifnab

nzifnab commented Feb 4, 2013

Can someone fill me in on which parts of the workflow require special git integration? I have a working Squash server running on Heroku that my development app can successfully report errors to, and bugs appear to be properly created with stack traces, a git blame, and everything. Attempting to do the same from a production app gave me the error about current_revision being unavailable, but looking at the source of the error, it appears that I can just set a configuration value with the revision hash (I have yet to try this; I will soon). Is there something else in the production app that is going to require further git repo access?

Somehow I was under the impression that the squash web app is the one that decides what the git blame looks like for any given bug (I see in the model code that it calls git.clone and clones into the tmp/ dir). Is the concern that it has to do this too frequently and severely affects performance? (It does appear to work - although I'm not sure at what point in the lifecycle the repo is cloned).

@RISCfuture
Contributor

is there something else in the production app that is going to require further git repo access?

No, but performance will be very slow because each time your dynos restart all local copies of your projects' repositories are discarded. So you will be constantly re-cloning your repos. I'm struggling to find an optimal solution to this.

@RISCfuture
Contributor

OK, before this pull request is merged, we'll need

  • to figure out the filesystem issue
  • to get these ENV calls into Heroku-based projects only

@nzifnab

nzifnab commented Feb 4, 2013

each time your dynos restart

Ya... And I believe they wind down anytime they haven't been accessed recently (which will happen quite often on an application that only 4 developers are ever likely to look at)...

I see the problem here. I can think of two possible solutions:

  1. Use the github API to get the data you need access to (Not sure if it gives access to everything you'd need, I couldn't find an API endpoint for git blame data, for instance).

-- The app would need to store the credentials of a user that had access to the repos in question (Perhaps via environment variables) for private repos

  2. [Update] I'm not sure this would work (at all); skip to the "To recap" for why. [/Update] As I understand it, it's the squash app that needs to clone the client app's repo and record changes when the client app gets deployed (correct me if I'm wrong). It may be possible to use a buildpack on the client app that would force-trigger a deploy of the associated squash app, whose own buildpack would clone or update a checked-out git repo of the client. Files made during the build in the build_dir will remain there even when the dynos are restarted.

-- This would require having a custom buildpack on all apps you intend on using with Squash
-- I'm not sure what it would take to trigger the deploy of a separate app, it may have to make an innocuous change to a txt file in squash/web and then git push heroku...

To recap:

Squash Web ➡️ Buildpack clones client apps into... 💩 Shit. Just realized if you have any reasonable number of apps this would take forever, and it's not very feasible to clone everything into the app because heroku will end up limiting your slug size. It'd also be annoying to tell squash which apps it should pre-clone anytime you needed to add another app.

Heroku integration is more complicated than it seems at first! It may be best to try to get the data you need via github API requests and cache or save reusable responses in the database... That of course limits people on heroku to using github and not other VCS's like bitbucket etc.

@bjeanes
Contributor

bjeanes commented Feb 20, 2013

I think Heroku support with a FS-based git repo is completely unfeasible — by design, on Heroku's part. The Git Data API could be used to support GitHub (Enterprise) repositories, but that's about it.

However, we shouldn't conflate the Heroku issue with the configuration changes; they have independent value. I am not deploying to Heroku but really don't want to have to use yml files for configuration, for a few reasons. Upgradeability is a huge concern and, as much as possible, the app should be runnable and (reasonably) configurable without having to maintain your own fork of the project (or even make any commits at all).

@RISCfuture
Contributor

That's a valid point. We should split this into two issues.

@bjeanes
Contributor

bjeanes commented Feb 21, 2013

@RISCfuture agreed. So far, this PR is completely about configuration being decoupled from yml files that are committed. I think that's great for a number of reasons...

I'm trying to get Squash deployed and running at Groupon at the moment, and am interested in making a few changes that would allow people in my situation to run Squash as unmodified as possible, so that upstream changes stay mergeable. Ideally, I shouldn't have to make a single deployment-specific commit to the project to get it running. This allows homogeneity, which is awesome for things like community, upstream contributions, and cross-community support.

I'm planning to make a few PRs this week to this end, if you'd be interested:

  1. JRuby -> WAR creation pipeline that can be dropped straight into something like a Tomcat or self-executed. It should not interfere with people not using JRuby or people wanting to run the app on JRuby in a traditional Ruby deployment (capistrano, exploded project directory, etc)
  2. Make configuration easily configurable from ENV vars and/or Java properties.
  3. Standardize the worker enqueuing interface to allow things like Resque to be plugged in at a single point without having to run a non-standard variant of the main codebase
  4. Other things I find that need to be different to support my deployment constraints but can be built in a generic way and leverage common code paths.

This PR starts on point 2 which is why I'm interested in it...

Crap that was a long comment and a total tangent.

@RISCfuture
Contributor

Throw in Sidekiq for 3) and I'd appreciate it :) I'm considering the switch here at Square. Good luck and let me know if you need anything explained.

@davidjrice
Author

@bjeanes @RISCfuture cool.

I think continuing on with improving configuration / upgradeability from this start would be a good thing.

The other points @bjeanes brought up are pretty related but could be separated out into further issues. All useful stuff 👍

On the git file system access, I have some ideas which I will try and document in a new issue when I get a chance. In short though, I think all of the git interaction could be separated to an interface. This could be the github API (if they released a blame API feature), github enterprise, a git repo fronted by a simple sinatra app on a separate server with a writable file system. You name it.
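To make the "separated to an interface" idea concrete, here is a rough sketch under stated assumptions. The class and method names (`GitBackend`, `InMemoryBackend`, `blame`) are hypothetical and are not Squash's actual API; the in-memory backend only stands in for a real one (local clone, GitHub API, or a small HTTP service) so the shape of the interface is visible.

```ruby
# Hypothetical interface: callers depend only on this contract, not on
# where the repository actually lives.
class GitBackend
  # Returns whoever is responsible for `line` of `file` at `revision`.
  def blame(revision, file, line)
    raise NotImplementedError, "#{self.class} must implement blame"
  end
end

# Illustrative stand-in backed by a plain Hash. A real backend might shell
# out to a local clone or call a remote service instead.
class InMemoryBackend < GitBackend
  def initialize(data)
    @data = data
  end

  # Looks up the [revision, file, line] triple; returns nil when unknown.
  def blame(revision, file, line)
    @data[[revision, file, line]]
  end
end
```

With this shape, swapping the storage strategy would mean providing another subclass rather than touching the code that asks for blame data.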

@RISCfuture
Contributor

OK (3) is done. 600adf0

@@ -1,4 +1,5 @@
source :rubygems
ruby '1.9.3'

I'd suggest dropping your eb50ef0 commit, because:

  • Ruby 1.9.3 is specified in Gemfile as of ab1b97d.
  • This isn't relevant to using ENV variables for configuration.
  • Merging your branch will conflict because the Gemfile source has also been changed.

@pda

pda commented Mar 16, 2013

I've submitted a pull request at RISCfuture/Configoro#2 to enable ERB preprocessing of YAML configuration files.

This will allow item: <%= ENV["SOMETHING_CONFIGURABLE"] || "some_default" %> throughout Squash configuration files.
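As a rough illustration of what ERB preprocessing of YAML means, using only Ruby's standard library: the helper name `load_yaml_with_erb` is hypothetical, and this is a sketch of the general technique, not Configoro's actual implementation.

```ruby
require "erb"
require "yaml"

# Hypothetical helper: evaluate ERB tags in a YAML string first, then parse
# the rendered text as YAML.
def load_yaml_with_erb(source)
  YAML.load(ERB.new(source).result)
end

config = load_yaml_with_erb(<<~YAML)
  item: <%= ENV["SOMETHING_CONFIGURABLE"] || "some_default" %>
YAML

# Prints the environment value when set, otherwise the default.
puts config["item"]
```

Because the ERB pass runs before YAML parsing, the committed file can stay unchanged while each deployment supplies its own values.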

@jeroenvandijk

@RISCfuture I think you can prevent restarts by keeping the application alive, pinging the server every so often (this is what I've done with all my apps on Heroku for better response times). This can be done, for example, with the NewRelic addon through monitoring. When you do this, a restart only happens once a day. One git clone per day is still OK, right?

@RISCfuture
Contributor

One git-clone per day is fine. I've accepted the Configoro PR but it's ugly to have those ENV[] || foos all over the place.

@bjeanes
Contributor

bjeanes commented Apr 19, 2013

@RISCfuture it doesn't have to be the default, the Configoro change just enables people to do that if they choose. They could also draw it from other arbitrary sources (Java System.property comes to mind, for JRuby)...

@pda

pda commented Apr 19, 2013

Ugly maybe, but I'll take it over tools to rewrite and then commit config files :)
Separation of config from the app as per The Twelve-Factor App makes a lot of sense to me.
I'd also use ENV.fetch("SOMETHING_CONFIGURABLE") where there's no sensible default.
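A small sketch of the difference between the two lookups, for anyone following along. The helper names are hypothetical; `env` defaults to the real ENV but accepts any Hash-like object so the behaviour can be tried without touching the environment.

```ruby
# Hypothetical helper for optional settings: falls back to `default` when
# the variable is not set.
def optional_setting(name, default, env = ENV)
  env.fetch(name, default)
end

# Hypothetical helper for required settings: fetch without a default raises
# KeyError, so a missing variable fails loudly instead of silently
# becoming nil.
def required_setting(name, env = ENV)
  env.fetch(name)
end
```

For example, `required_setting("SQUASH_SECRET_TOKEN")` would raise at boot on a machine where the token was never configured, which is usually easier to diagnose than a nil surfacing later.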

@RISCfuture
Contributor

Yeah, I agree with the idea; I'm just debating where best to abstract the logic. Open to ideas.

@pda

pda commented Apr 19, 2013

Yeah I've wondered that on various projects; ENV["…"] lookups scattered throughout the codebase, or wrapping them up in a single Configuration object which manages sourcing and defaulting those settings. I'm not sure.
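For comparison, the "single Configuration object" option might look something like the sketch below. This is only an illustration of the trade-off, not a proposal for Squash's actual structure; the class name and method names are hypothetical, and the variable names are the ones that appear earlier in this thread.

```ruby
# Hypothetical Configuration object: every environment lookup and default
# lives here, so the rest of the codebase never touches ENV directly.
class Configuration
  def initialize(env = ENV)
    @env = env
  end

  # Required: raises KeyError when the variable is missing.
  def secret_token
    @env.fetch("SQUASH_SECRET_TOKEN")
  end

  # Required: raises KeyError when the variable is missing.
  def password_salt
    @env.fetch("SQUASH_AUTHENTICATION_PASSWORD_SALT")
  end

  # Optional: falls back to a default.
  def something_configurable
    @env.fetch("SOMETHING_CONFIGURABLE", "some_default")
  end
end
```

The cost is one more class to maintain; the benefit is that sourcing (ENV, YAML, Java properties) can change in one place.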
