Merge pull request #65 from bobmatyas/updates/111024
Updates/111024
bobmatyas authored Nov 10, 2024
2 parents 2942b3c + 62d493c commit 28337ab
Showing 3 changed files with 20 additions and 3 deletions.
block-ai-crawlers.php (3 additions, 1 deletion)
@@ -5,7 +5,7 @@
  * Author: Bob Matyas
  * Author URI: https://www.bobmatyas.com
  * Text Domain: block-ai-crawlers
- * Version: 1.4.0
+ * Version: 1.4.1
  * License: GPL-2.0-or-later
  * License URI: https://www.gnu.org/licenses/gpl-2.0.html
  *
@@ -29,6 +29,7 @@
 function block_ai_robots_txt( $robots ) {
 $robots .= "\n# Block AI Crawlers\n\n";
 $robots .= "User-agent: AI2Bot\n";
+$robots .= "User-agent: Ai2Bot-Dolma\n";
 $robots .= "User-agent: AmazonBot\n";
 $robots .= "User-agent: Applebot-Extended\n";
 $robots .= "User-agent: anthropic-ai\n";
@@ -56,6 +57,7 @@ function block_ai_robots_txt( $robots ) {
 $robots .= "User-agent: SentiBot\n";
 $robots .= "User-agent: sentibot\n";
 $robots .= "User-agent: Timpibot\n";
+$robots .= "User-agent: TurnitinBot\n";
 $robots .= "User-agent: YouBot\n";
 $robots .= "User-agent: webzio\n";
 $robots .= "User-agent: webzio-extended\n";
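For context, here is a minimal sketch of how a function like block_ai_robots_txt() is typically attached to WordPress's virtual /robots.txt. The add_filter() call and the closing "Disallow: /" directive are assumptions made for illustration and do not appear in the hunks above; only the User-agent lines are taken from this commit.

<?php
// Minimal sketch, not the plugin's full source: a filter appended to the
// output of WordPress's virtual /robots.txt. The add_filter() call and the
// "Disallow: /" line are assumptions; only the User-agent lines below come
// from this commit.
function block_ai_robots_txt_sketch( $robots ) {
	$robots .= "\n# Block AI Crawlers\n\n";

	// One "User-agent:" line per crawler; these two are the ones added here.
	$robots .= "User-agent: Ai2Bot-Dolma\n";
	$robots .= "User-agent: TurnitinBot\n";

	// A single Disallow applies to every User-agent listed above it (assumed).
	$robots .= "Disallow: /\n";

	return $robots;
}

// Core applies the 'robots_txt' filter when it serves /robots.txt, so the
// appended block is delivered to every crawler that requests the file.
add_filter( 'robots_txt', 'block_ai_robots_txt_sketch' );

With a filter like this in place, requesting /robots.txt returns the "# Block AI Crawlers" comment, the User-agent list, and the Disallow rule, which asks each listed crawler to skip the entire site.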
inc/settings-html.php (10 additions, 0 deletions)
@@ -27,6 +27,11 @@
 <td><p>Explores sites for web content that is used to train open language models</p></td>
 <td><a href="https://allenai.org/crawler" target=_blank>More Info <span class="dashicons dashicons-external link"></span></a></td>
 </tr>
+<tr>
+<th>Ai2Bot-Dolma</th>
+<td><p>Generates data sets used to train open language models</p></td>
+<td><a href="https://allenai.org/dolma" target=_blank>More Info <span class="dashicons dashicons-external link"></span></a></td>
+</tr>
 <tr>
 <th>AmazonBot</th>
 <td><p>Used by Amazon's Alexa AI to provide AI answers.</p></td>
@@ -127,6 +132,11 @@
 <td><p>Used by Timpi; likely for their Wilson AI Product.</p></td>
 <td><a href="https://timpi.io/wilson-ai/" target=_blank>More Info <span class="dashicons dashicons-external link"></span></a></td>
 </tr>
+<tr>
+<th>TurnitinBot</th>
+<td><p>Used by Turnitin to scrape data for plagiarism detection</p></td>
+<td><a href="https://www.turnitin.com/robot/crawlerinfo.html" target=_blank>More Info <span class="dashicons dashicons-external link"></span></a></td>
+</tr>
 <tr>
 <th>Webzio</th>
 <td><p>Used by Webz.io for their social listening and intelligence platforms.</p></td>
readme.txt (7 additions, 2 deletions)
@@ -2,9 +2,9 @@
 Contributors: lastsplash
 Tags: ai, robots.txt, chatgpt, crawlers
 Requires at least: 5.6
-Tested up to: 6.6.2
+Tested up to: 6.7
 Requires PHP: 7.4
-Stable tag: 1.4.0
+Stable tag: 1.4.1
 License: GPLv2 or later
 License URI: https://www.gnu.org/licenses/gpl-2.0.html

@@ -95,6 +95,11 @@ No. Search engines follow different `robots.txt` rules.

 == Changelog ==

+= 1.4.1 =
+- New: Block TurnitinBot
+- New: Block Ai2Bot-Dolma
+- Enhancement: WordPress 6.7 compatibility
+
 = 1.4.0 =
 - New: Block Kangaroo Bot
 - New: Block sentibot
