Add workflows for "fork and clone" and "pull upstream changes"
Closes jennybc#40 by admitting there is no viable browser-only workflow
jennybc committed Jan 10, 2019
1 parent 783b440 commit 2260fc3
Showing 12 changed files with 220 additions and 0 deletions.
2 changes: 2 additions & 0 deletions _bookdown.yml
Expand Up @@ -43,6 +43,8 @@ rmd_files: [
"workflows-intro.Rmd",
"workflows-repeated-amend.Rmd",
"workflows-pull.Rmd",
"workflows-fork-and-clone.Rmd",
"workflows-upstream-changes-into-fork.Rmd",
"workflows-explore-extend-pull-request.Rmd",
"workflows-make-github-repo-browsable.Rmd",

Binary file added img/clone-theirs-sad.png
Binary file added img/clone-theirs.png
Binary file added img/clone-yours.png
Binary file added img/fork-and-clone.png
Binary file added img/fork-no-upstream-sad.png
Binary file added img/fork-triangle-happy.png
Binary file added img/fork.png
Binary file added img/pull-push-yours.png
Binary file added img/rstudio-new-branch.png
77 changes: 77 additions & 0 deletions workflows-fork-and-clone.Rmd
@@ -0,0 +1,77 @@
# Fork and clone {#fork-and-clone}

Use "fork and clone" to get a copy of someone else's repo if there's any chance you will want to propose a change to the owner, i.e. send a "pull request". If you are waffling between "clone" and "fork and clone", go with "fork and clone".

## Initial workflow

On [GitHub](https://github.com), make sure you are signed in and navigate to the repo of interest. Think of this as `OWNER/REPO`, where `OWNER` is the user or organization who owns the repository named `REPO`.

In the upper right hand corner, click **Fork**.

This creates a copy of `REPO` in your GitHub account and takes you there in the browser. Now we are looking at `YOU/REPO`.

**Clone** `YOU/REPO`, which is your copy of the repo, a.k.a. your fork, to your local machine. You have two options:

* [Existing project, GitHub first](#existing-github-first), an RStudio workflow we've used before.
  - Your fork `YOU/REPO` plays the role of the existing GitHub repo, in this case -- not the original repo!
  - Make a conscious decision about the local destination directory and HTTPS vs SSH URL.
* Execute `git clone https://github.com/YOU/REPO.git` (or `git clone git@github.com:YOU/REPO.git`) in the shell.
  - Clone your fork `YOU/REPO` -- not the original repo!
  - `cd` to the desired parent directory first. Make a conscious decision about HTTPS vs SSH URL.
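
The shell option can be sketched as a small helper. This is only an illustration, not part of the workflow proper: the function name `clone_fork` and its two arguments are made up here, and it assumes the URL ends in `REPO.git` so that the new folder is named `REPO`.

```bash
# Hypothetical helper: clone your fork into a chosen parent directory,
# move into the new folder, and list its remotes.
clone_fork() {
  fork_url=$1     # e.g. https://github.com/YOU/REPO.git (your fork, not the original)
  parent_dir=$2   # the directory that should contain the new REPO folder
  cd "$parent_dir" &&
    git clone "$fork_url" &&
    cd "$(basename "$fork_url" .git)" &&
    git remote -v
}

# Usage (the destination directory is up to you):
#   clone_fork https://github.com/YOU/REPO.git ~/some/parent/dir
```

If all goes well, the final `git remote -v` should list a single remote, `origin`, pointing at `YOU/REPO`.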

We're doing this:

![](img/fork-and-clone.png)

## `usethis::create_from_github("OWNER/REPO")`

The [usethis package](https://usethis.r-lib.org) has a convenience function, [`create_from_github()`](https://usethis.r-lib.org/reference/create_from_github.html), that can do "fork and clone". In fact, it can go even further and [set the `upstream` remote](#upstream-changes). However, `create_from_github()` requires that you have [configured a GitHub personal access token](#github-pat). It hides lots of detail and can feel quite magical.

Because of these differences, we won't dwell on `create_from_github()` here. But once you get tired of doing all of this "by hand", check it out!

## Engage with the new repo

If you did "fork and clone" via [Existing project, GitHub first](#existing-github-first), you are probably in an RStudio Project for this new repo.

Regardless, get yourself into this project, whatever that means for you, using your usual method.

Explore the new repo in some suitable way. If it is a package, you could run the tests or check it. If it is a data analysis project, run a script or render an Rmd. Convince yourself that you have gotten the code.

## Don't mess with `master` {#dont-touch-master}

If you make any commits in your local repository, I **strongly recommend** that you work in [a new branch](#git-branches), not `master`.

I **strongly recommend** that you do not make commits to `master` of a fork.

This will make your life much easier if you want to [pull upstream work](#upstream-changes) into your copy. The `OWNER` of `REPO` will also be happier to receive your pull request from a non-`master` branch.
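
In the shell, that advice might look like the sketch below. The helper name `start_work_branch` and the branch name `fix-typo` are invented for illustration; the only real Git command involved is `git checkout -b`.

```bash
# Hypothetical helper: create a new branch that starts at local master
# and switch to it, so your commits never land on master itself.
start_work_branch() {
  git checkout -b "$1" master
}

# Usage, from inside your local clone:
#   start_work_branch fix-typo
```

Naming `master` explicitly as the starting point means the new branch begins there even if you happen to be on some other branch when you run it.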

## The original repo as a remote

Remember we are here:

![](img/fork-and-clone.png)

Here is the current situation in words:

* You have a fork `YOU/REPO`, which is a repo on GitHub.
* You have a local clone of your fork.
* Your fork `YOU/REPO` is the remote known as `origin` for your local repo.
* You are well positioned to make a pull request to `OWNER/REPO`.

But notice the lack of a direct connection between your local copy of this repo and the original `OWNER/REPO`. This is a problem.

![](img/fork-no-upstream-sad.png)

As time goes on, the original repository `OWNER/REPO` will continue to evolve. You probably want the ability to keep your copy up-to-date. In Git lingo, you will need to get the "upstream changes".

![](img/fork-triangle-happy.png)

See the workflow [Get upstream changes for a fork](#upstream-changes) for how to inspect your remotes, add `OWNER/REPO` as `upstream` if necessary, and pull changes, i.e. how to complete the "triangle" in the figure above.

### No, you can't do this via GitHub

You might hope that GitHub could automatically keep your fork `YOU/REPO` synced up with the original `OWNER/REPO`. Or that you could do this in the browser interface. Then you could pull those upstream changes into your local repo.

But you can't.

There are some tantalizing, janky ways to sort of do parts of this. But they have fatal flaws that make them unsustainable. I believe you really do need to [add `upstream` as a second remote on your repo and pull from there](#upstream-changes).
141 changes: 141 additions & 0 deletions workflows-upstream-changes-into-fork.Rmd
@@ -0,0 +1,141 @@
# Get upstream changes for a fork {#upstream-changes}

This workflow is relevant if you have done [fork and clone](#fork-and-clone) and now you need to pull subsequent changes from the original repo into your copy.

Sometimes you set this up right away, when you fork and clone, even though you don't need it yet. Congratulations, you are planning for the future!

It's also very typical to do this step a few days or months later. Maybe you're taking an interest in someone else's work for the second time and you want to make another pull request. Or you just want your copy to reflect their recent work. It is also totally normal to set this up upon first need.

Vocabulary: `OWNER/REPO` refers to the original GitHub repo, owned by `OWNER`, who is not you. `YOU/REPO` refers to your copy on GitHub or "fork".

## No, you can't do this via GitHub

You might hope that GitHub could automatically keep your fork `YOU/REPO` synced up with the original `OWNER/REPO`. Or that you could do this in the browser interface. Then you could pull those upstream changes into your local repo.

But you can't.

There are some tantalizing, janky ways to sort of do parts of this. But they have fatal flaws that make them unsustainable. I believe you really do need to [add `upstream` as a second remote on your repo and pull from there](#upstream-changes).

## Initial conditions

Get into the repo of interest, i.e. your local copy. For many of you, this means launching it as an RStudio Project. You'll probably also want to open a terminal within RStudio for some Git work.

Make sure you are on the `master` branch and your "working tree is clean". BTW I recommend that you [never make your own commits to the `master` branch of a fork](#dont-touch-master). `git status` should show something like:

``` bash
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
```
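
If you would rather have the shell check this than read the `git status` output yourself, here is a minimal sketch. The function name `ready_for_upstream_pull` is made up, and it is stricter than the prose above in one respect: it treats untracked files as "not clean". It does not check whether you are up to date with `origin/master`.

```bash
# Hypothetical pre-flight check: succeed only if the current branch is
# master and there are no modified, staged, or untracked files.
ready_for_upstream_pull() {
  # symbolic-ref prints the current branch name (and fails on a detached HEAD)
  if [ "$(git symbolic-ref --short HEAD 2>/dev/null)" != "master" ]; then
    echo "Not on master" >&2
    return 1
  fi
  # --porcelain prints one line per changed or untracked file; no output means clean
  if [ -n "$(git status --porcelain)" ]; then
    echo "Working tree is not clean" >&2
    return 1
  fi
  return 0
}

# Usage, from inside your local clone:
#   ready_for_upstream_pull && echo "good to go"
```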

## List your remotes

Let's inspect [the current remotes](#git-remotes) for your local repo. In the shell:

``` bash
git remote -v
```

Most of you will see output along these lines (call this BEFORE):

``` bash
origin https://github.com/YOU/REPO.git (fetch)
origin https://github.com/YOU/REPO.git (push)
```

There is only one remote, named `origin`, corresponding to your fork on GitHub. This figure depicts such a BEFORE scenario:

![](img/fork-no-upstream-sad.png)

This is sad, because there is no direct connection between `OWNER/REPO` and your local copy of the repo.

The state we want to see is like this (call this AFTER):

``` bash
origin https://github.com/YOU/REPO.git (fetch)
origin https://github.com/YOU/REPO.git (push)
upstream https://github.com/OWNER/REPO.git (fetch)
upstream https://github.com/OWNER/REPO.git (push)
```

Notice the second remote, named `upstream`, corresponding to the original repo on GitHub. This figure depicts AFTER, the scenario we want to achieve:

![](img/fork-triangle-happy.png)

Sidebar: If you used `usethis::create_from_github("OWNER/REPO")` for your original "fork and clone", the `upstream` should already be set up. In that case, you can skip the next step.

## Add the `upstream` remote

Let us add `OWNER/REPO` as the `upstream` remote.

On [GitHub](https://github.com), make sure you are signed in and navigate to the original repo, `OWNER/REPO`. It is easy to get to from your fork, `YOU/REPO`, via "forked from" links near the top.

Use the big green "Clone or download" button to get the URL for `OWNER/REPO` on your clipboard. Be intentional about whether you copy the HTTPS or SSH URL.

### Command line Git

``` bash
git remote add upstream https://github.com/OWNER/REPO.git
```

The nickname `upstream` can technically be whatever you want. There is a strong tradition of using `upstream` in this context and, even though I have better ideas, I believe it is best to conform. Every book, blog post, and Stack Overflow thread that you read will use `upstream` here. Save your psychic energy for other things.
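
If you are not sure whether `upstream` already exists (say, because you used `usethis::create_from_github()`), a guarded version avoids the error Git raises when you add a remote name twice. This is a sketch only; the function name `add_upstream` is invented, and it assumes a Git recent enough to have `git remote get-url`.

```bash
# Hypothetical helper: add OWNER/REPO as "upstream" unless a remote
# with that name is already configured.
add_upstream() {
  if git remote get-url upstream >/dev/null 2>&1; then
    echo "upstream already set to $(git remote get-url upstream)"
  else
    git remote add upstream "$1"
  fi
}

# Usage, with the URL you copied from GitHub:
#   add_upstream https://github.com/OWNER/REPO.git
```

Note that it leaves an existing `upstream` alone, even if its URL differs from the one you pass in.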

### RStudio

This feels a bit odd, but humor me. Click on "New Branch" in the Git pane.

![](img/rstudio-new-branch.png)

This will reveal a button to "Add Remote". Click it. Enter `upstream` as the remote name and paste the URL for `OWNER/REPO` that you got from GitHub. Click "Add". Decline the opportunity to add a new branch by clicking "Cancel".

## Verify your `upstream` remote

Let's inspect [the current remotes](#git-remotes) for your local repo AGAIN. In the shell:

``` bash
git remote -v
```

Now you should see something like:

``` bash
origin https://github.com/YOU/REPO.git (fetch)
origin https://github.com/YOU/REPO.git (push)
upstream https://github.com/OWNER/REPO.git (fetch)
upstream https://github.com/OWNER/REPO.git (push)
```

Notice the second remote, named `upstream`, corresponding to the original repo on GitHub. We have gotten to this:

![](img/fork-triangle-happy.png)

## Pull changes from `upstream`

Now we can pull the changes that we don't have from `OWNER/REPO` into our local copy.

``` bash
git pull upstream master
```

This says: "pull the changes from the `master` branch of the remote known as `upstream` into the current branch of my local repo", which should be `master`.

We are being explicit about the remote and the branch in this case because `upstream/master` is **not** the default tracking branch for local `master`. As the earlier `git status` output showed, local `master` tracks `origin/master`.
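
If you have kept your own commits off `master`, the upstream changes should apply as a simple fast-forward. A slightly more defensive variant, offered as a sketch rather than as the required command, adds `--ff-only` so that Git refuses to proceed instead of creating a merge commit when local `master` has diverged. The function name `pull_upstream_master` is made up.

```bash
# Hypothetical helper: switch to master, then update it from upstream,
# allowing only a fast-forward (no merge commit, no rebase).
pull_upstream_master() {
  git checkout master &&
    git pull --ff-only upstream master
}

# Usage, from inside your local clone with a clean working tree:
#   pull_upstream_master
```

If this fails, it usually means `master` holds commits that `OWNER/REPO` does not have, which is exactly the situation the "don't touch `master`" advice is meant to prevent.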

## Push these changes to `origin/master`

This is, frankly, totally optional and many people who are facile with Git do not bother.

If you take my advice to [never work in `master` of a fork](#dont-touch-master), then the state of the `master` branch in your fork `YOU/REPO` does not matter. You will never make a pull request from `master`.

If, however, your grasp of all these Git concepts is tenuous at best, it can be helpful to try to keep things simple and orderly and synced up.

Feel free to push the state of local `master` to your fork and enjoy the satisfaction of being "caught up".

In the shell:

``` bash
git push
```

Or use the green "Push" button in RStudio.
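
If you want to see whether there is anything to push, or prefer to spell out the remote and branch, here is a small sketch. Both function names are invented for illustration. The count is based on what your local repo last learned about `origin` and does not contact GitHub.

```bash
# Hypothetical helper: print how many commits local master has that
# origin/master lacks; 0 means your fork is caught up.
commits_ahead_of_fork() {
  git rev-list --count origin/master..master
}

# Hypothetical helper: push local master to your fork, naming the remote
# and branch explicitly instead of relying on defaults.
push_master_to_fork() {
  git push origin master
}

# Usage, from inside your local clone:
#   commits_ahead_of_fork
#   push_master_to_fork
```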
