Merge pull request #43 from yarikoptic/enh-codespell
codespell: config, workflow + typos fixed
jsheunis authored Dec 6, 2024
2 parents 71d83d8 + 06300ed commit 4447bb7
Showing 10 changed files with 41 additions and 14 deletions.
5 changes: 5 additions & 0 deletions .codespellrc
@@ -0,0 +1,5 @@
[codespell]
skip = .git,*.pdf,*.svg,*.min.js,*.map,*.scss,*.css
ignore-regex = (Nat\.? Commun.?|highlighter: rouge)
#
# ignore-words-list =
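The commit leaves `ignore-words-list` commented out as a placeholder for suppressing future false positives. A hypothetical sketch of how it could later be filled in (the example words below are illustrative and not part of this commit):

```ini
[codespell]
skip = .git,*.pdf,*.svg,*.min.js,*.map,*.scss,*.css
ignore-regex = (Nat\.? Commun.?|highlighter: rouge)
# comma-separated lowercase tokens codespell should never flag,
# e.g. intentional spellings or domain-specific terms (illustrative values):
ignore-words-list = nd,te
```

With the key populated, both the CI workflow and local `codespell` runs would skip those tokens while still checking everything else.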
22 changes: 22 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,22 @@
---
name: Codespell

on:
  push:
    branches: [gh-pages]
  pull_request:
    branches: [gh-pages]

permissions:
  contents: read

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Codespell
        uses: codespell-project/actions-codespell@v2
8 changes: 4 additions & 4 deletions _episodes/03-remote-collaboration.md
@@ -63,7 +63,7 @@ locations. Although each scenario will be slightly different, the
setup steps that we will cover with GIN will look similar on other
git-based hosting solutions.

-## Prelude: file availability, getting and droping content
+## Prelude: file availability, getting and dropping content

Before we proceed to data publishing let's first take a look at the
dataset we created during the first module. We used two ways of adding
@@ -122,7 +122,7 @@ datalad drop inputs/images/chinstrap_01.jpg
drop(error): /home/alice/Documents/rdm-workshop/example-dataset/inputs/images/chinstrap_01.jpg (file)
[unsafe; Could only verify the existence of 0 out of 1 necessary copy; (Use --reckless availability to override this check, or adjust numcopies.)]
-# If you were to run this with DataLad version < 0.16.0, the safety check would be overriden with --nocheck instead of --reckless availablity)
+# If you were to run this with DataLad version < 0.16.0, the safety check would be overridden with --nocheck instead of --reckless availability)
```
{: .output}

@@ -716,7 +716,7 @@ With a cloned dataset, you can do the following:

- Change a (text) file. For example, in the
`inputs/images/chinstrap_02.yaml` file we entered `penguin_count:
-2`, but if you look closely at the related fotograph, there are
+2`, but if you look closely at the related photograph, there are
actually three penguins (one is partially behind a rock). Edit the
file and save the changes with an informative commit message, such
as "Include penguins occluded by rocks in the count" or something
@@ -738,7 +738,7 @@ With a cloned dataset, you can do the following:
location on the website.

#### Contributing back
-When ready, you can contribute back wih `datalad push`. If the other
+When ready, you can contribute back with `datalad push`. If the other
person has granted you access to their repository (as should be the
case during the workshop), you can do it right away. Note that in this
case you are pushing to `origin` - this is a default name given to a
4 changes: 2 additions & 2 deletions _episodes/04-dataset-management.md
@@ -363,7 +363,7 @@ This is important: a *superdataset* does not record individual changes within th
In other words, it points to the subdataset location and a point in its life (indicated by a specific commit).

Let's acknowledge that we want our superdataset to point to the updated version of the subdataset (ie. that which has all three tabular files) by saving this change in the superdataset's history.
-In other words, while the subdataset progressed by three comits, in the parent dataset we can record it as a single change (from empty to populated subdataset):
+In other words, while the subdataset progressed by three commits, in the parent dataset we can record it as a single change (from empty to populated subdataset):

~~~
datalad save -d . -m "Progress the subdataset version"
@@ -566,6 +566,6 @@ save(ok): . (dataset)
~~~
{: .output}

-The end! We have produced a nested datset:
+The end! We have produced a nested dataset:
- the superdataset (penguin-report) directly contains our code, figures, and report (tracking their history), and includes inputs as a subdatset.
- the subdataset (inputs) tracks the history of the raw data files.
4 changes: 2 additions & 2 deletions _episodes/91-branching.md
@@ -263,7 +263,7 @@ $ git merge preproc

#### And... what now?

-Branching opens up the possibility to keep parallel developments neat and orderly next to eachother, hidden away in branches. A `checkout` of your favourite branch lets you travel to its timeline and view all of the changes it contains, and a `merge` combines one or more timelines into another one.
+Branching opens up the possibility to keep parallel developments neat and orderly next to each other, hidden away in branches. A `checkout` of your favourite branch lets you travel to its timeline and view all of the changes it contains, and a `merge` combines one or more timelines into another one.

> ## Exercise
>
@@ -383,7 +383,7 @@ $ datalad save -m "Fix: Change absolute to relative paths</code></td>

In order to propose the fix to the central dataset as an addition, the collaborator pushes their branch to the central sibling.
When the central sibling is on GitHub or a similar hosting service, the hosting service assists with merging `fix-paths` to `main` with a **pull request** - a browser-based description and overview of the changes a branch carries.
-Collaborators can conviently take a look and decide whether they accept the pull request and thereby merge the `fix-paths` into `upstream`'s `main`.
+Collaborators can conveniently take a look and decide whether they accept the pull request and thereby merge the `fix-paths` into `upstream`'s `main`.
You can see how opening and merging PRs look like in GitHub's interface in the expandable box below.

> ## Creating a PR on GitHub
2 changes: 1 addition & 1 deletion _episodes/92-filesystem-operations.md
@@ -231,7 +231,7 @@ datalad remove -d local-dataset
uninstall(error): . (dataset) [to-be-dropped dataset has revisions that are not available at any known sibling. Use `datalad push --to ...` to push these before dropping the local dataset, or ignore via `--reckless availability`. Unique revisions: ['main']]
~~~

-``remove`` advises to either ``push`` the "unique revisions" to a different place (i.e., creating a sibling to host your pristine, version-controlled changes), or, similarily to how it was done for ``drop``, to disable the availability check with ``--reckless availability``.
+``remove`` advises to either ``push`` the "unique revisions" to a different place (i.e., creating a sibling to host your pristine, version-controlled changes), or, similarly to how it was done for ``drop``, to disable the availability check with ``--reckless availability``.

~~~
datalad remove -d local-dataset --reckless availability
4 changes: 2 additions & 2 deletions _extras/for_instructors.md
@@ -154,7 +154,7 @@ This can be done with [Let's Encrypt](https://letsencrypt.org/) by following ins
~~~
sudo tljh-config set https.enabled true
~~~
-2. Set your email addres for Let's Encrypt:
+2. Set your email address for Let's Encrypt:
~~~
sudo tljh-config set https.letsencrypt.email <you@example.com>
~~~
@@ -326,7 +326,7 @@ Different authentication options are possible (e.g. admin can also authenticate
2022/12/01 12:40:30Z: OsProductName: Ubuntu
2022/12/01 12:40:30Z: OsVersion: 22.04
~~~
-- If the JupyterHub bootsrap script succeeded, within the last 30 lines you will find:
+- If the JupyterHub bootstrap script succeeded, within the last 30 lines you will find:
~~~
[ 210.143720] cloud-init[1233]: Waiting for JupyterHub to come up (1/20 tries)
[ 210.147437] cloud-init[1233]: Done!
2 changes: 1 addition & 1 deletion bin/dependencies.R
@@ -79,7 +79,7 @@ create_description <- function(required_pkgs) {
# We have to write the description twice to get the hidden dependencies
# because renv only considers explicit dependencies.
#
-# This is needed because some of the hidden dependencis will require system
+# This is needed because some of the hidden dependencies will require system
# libraries to be configured.
suppressMessages(repo <- BiocManager::repositories())
deps <- remotes::dev_package_deps(dependencies = TRUE, repos = repo)
2 changes: 1 addition & 1 deletion bin/repo_check.py
@@ -146,7 +146,7 @@ def check_labels(reporter, repo_url):
for name in sorted(overlap):
reporter.check(EXPECTED[name].lower() == actual[name].lower(),
None,
-'Color mis-match for label {0} in {1}: expected {2}, found {3}',
+'Color mismatch for label {0} in {1}: expected {2}, found {3}',
name, repo_url, EXPECTED[name], actual[name])


2 changes: 1 addition & 1 deletion setup.md
@@ -29,7 +29,7 @@ about collaboration you won't be able to publish all example data).

## Participate with own computer: install software

-If you want to follow the exaples on your own machine, you will need
+If you want to follow the examples on your own machine, you will need
to install DataLad and some additional software which we will use
during the walkthrough. Note that Linux or MacOS are strongly
recommended for this workshop; although DataLad works on all main
