add global table uswds component + pages documentation #2595

Merged · 2 commits · Feb 25, 2025
18 changes: 18 additions & 0 deletions _includes/content-table.html
@@ -0,0 +1,18 @@
<table class="usa-table usa-table--borderless">
{% if include.caption %}
<caption>
{{ include.caption }}
</caption>
{% endif %}
<thead>
<tr>
<th scope="col">{{ include.header1 }}</th>
<th scope="col">{{ include.header2 }}</th>
<th scope="col">{{ include.header3 }}</th>
<th scope="col">{{ include.header4 }}</th>
</tr>
</thead>
<tbody>
{{ include.content }}
</tbody>
</table>
9 changes: 8 additions & 1 deletion _pages/pages/documentation/previews.md
@@ -31,4 +31,11 @@ forked repositories.
## Builds and Logs
Build history and logs for every build are available in the Pages web application. Note: build logs will only be available for **180** days after the build completes.

![Build logs screenshot]({{site.baseurl}}/assets/images/pages/buildlogs.png)

**Absolute URL management**

Although Pages automatically sets `BASEURL`, it is best to define your production URL in the site config file (`site.yaml`), for example `url: "https://agency-production-url.gov"`, so that absolute URLs can be constructed throughout an Eleventy site. This allows the sitemap to build proper absolute URLs from `site.url` and `page.url` instead of the `BASEURL` value, maintaining consistency across builds.
{% raw %}
`<loc>{{ site.url }}{{ page.url }}</loc>`
{% endraw %}
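
For illustration, a minimal Eleventy sitemap template built on this pattern might look like the sketch below. It assumes `site.url` is exposed from a global data file (for example the `site.yaml` mentioned above) and iterates over Eleventy's built-in `collections.all`:

{% raw %}
```
---
permalink: /sitemap.xml
eleventyExcludeFromCollections: true
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for page in collections.all %}
  <url>
    <loc>{{ site.url }}{{ page.url }}</loc>
  </url>
  {% endfor %}
</urlset>
```
{% endraw %}

Because `site.url` is defined once in the config, the template produces the same absolute URLs on every build, regardless of the preview `BASEURL`.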
46 changes: 46 additions & 0 deletions _pages/pages/documentation/search.md
Expand Up @@ -13,3 +13,49 @@ We recommend using [Search.gov][], a free site search and search analytics servi
If you'd prefer another solution, you can configure a tool like [lunrjs](https://lunrjs.com/), which creates a search function that runs in the client browser. An example of this is the [18F blog](https://18f.gsa.gov/blog/). This avoids any dependency on another service, but the search results are not as robust.

[Search.gov]: https://search.gov/

**Crawl/Index Pages sites**

Pages automatically handles search engine visibility for preview URLs via the Pages proxy. For traffic served through a preview site, the Pages proxy automatically serves the appropriate HTTP robots header, `robots: none`; preview URLs are not crawlable or indexable by design. Only webpages on the production domain are served with the `robots: all` directive, which tells crawlers and bots such as Search.gov to index the site and enable search capabilities.

{% capture search_table_content %}
<tr>
<th scope="row">1</th>
<td><p> <strong>robots.txt in your Pages site</strong> <br> <br> Discourages robots from crawling the page or pages listed. Webpages that aren’t crawled generally can’t be indexed.</p></td>
<td><code>User-agent: *</code><br><code>Disallow: /directory</code></td>
<td>N/A, crawling is allowed by default</td>
</tr>
<tr>
<th scope="row">2</th>
<td><p> <strong>X-Robots-Tag HTTP header (served by Pages via the Pages proxy)</strong> <br> <br> Encourages or discourages robots to read and index the content on this page or use it to find more links to crawl.</p></td>
<td><code>robots: none</code> (this is automatically served to visitors of all Pages preview builds)</td>
<td><code>robots: all</code> (this is automatically served to visitors of custom/production domains)</td>
</tr>
<tr>
<th scope="row">3</th>
<td><p> <strong>&lt;meta name="robots"&gt; in your Pages site webpage HTML</strong> <br> <br> Discourages robots from crawling the page or pages listed. Webpages that aren’t crawled generally can’t be indexed.</p></td>
<td><code>content="noindex, nofollow"</code></td>
<td>N/A, indexing is allowed by default</td>
</tr>
{% endcapture %}

{% include content-table.html
caption="Search with Pages"
header1="Priority"
header2="Method to manage robot behavior"
header3="How to <u>prevent</u> indexing/crawling"
header4="How to <u>allow</u> indexing/crawling"
content=search_table_content %}

If you want to disable crawling and indexing for specific pages of your production site, you can include the `noindex, nofollow` meta tag in the `<head>` of those pages, or list those folders in your `robots.txt`, if your site generates one.
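
For example (a sketch only; the folder name is a placeholder), the meta tag and a corresponding `robots.txt` entry might look like this:

```
<!-- In the <head> of a page that should not be indexed or followed -->
<meta name="robots" content="noindex, nofollow">
```

```
# robots.txt: discourage crawling of a placeholder folder
User-agent: *
Disallow: /internal-reports/
```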

**Conditionally set robots - Eleventy (11ty)**

Take advantage of Pages-provided environment variables to enable environment-specific functionality. Hardcode the condition and meta tags to check the branch from the `process.env` environment variable. This differs from how it is handled on a Jekyll site; in Eleventy you can add specificity with `process.env.BRANCH`.
You can use this code sample:
```
{% unless process.env.BRANCH == "main" %}
<meta name="robots" content="noindex, nofollow">
{% endunless %}
```
See additional documentation on [build environment variables](https://cloud.gov/pages/documentation/env-vars-on-pages-builds/).