CANTINA-995: Disable crawling for VIP convenience domains #5129
Codecov Report
Attention:
Additional details and impacted files

@@               Coverage Diff               @@
##           develop    #5129      +/-     ##
=============================================
- Coverage      28.84%   28.76%   -0.08%
  Complexity      4775     4775
=============================================
  Files            279      279
  Lines          21005    21000       -5
=============================================
- Hits            6059     6041      -18
- Misses         14946    14959      +13

☔ View full report in Codecov by Sentry.
	return $output;
}
// phpcs:ignore WordPressVIPMinimum.Hooks.RestrictedHooks.robots_txt
add_filter( 'robots_txt', __NAMESPACE__ . '\vip_convenience_domain_robots_txt' );
WP must have loaded by this point to be able to use add_filter(). Has any customer application code been loaded yet? That is, running this at priority 10 isn't going to overwrite any existing changes they may have made?
But they still have the chance to unhook it if they wish? It would be good to document an example of the right hook and process for running the remove_filter( 'robots_txt', '\Automattic\VIP\Core\Privacy\vip_convenience_domain_robots_txt' ) call.
No customer application code should be loaded at this point yet. But yeah, @yolih, if you wanted to add this to documentation, that'd be fine.
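As a starting point for that documentation, the unhook might look like the sketch below. It assumes the callback lives in the Automattic\VIP\Core\Privacy namespace (taken from the remove_filter() call quoted above), that the filter was added at the default priority of 10, and that `init` is a suitable hook for customer code; the hook choice is an assumption and is not confirmed in this thread. WordPress matches string callbacks by their exact name, so the sketch omits the leading backslash to match what `__NAMESPACE__ . '\vip_convenience_domain_robots_txt'` produces.

```php
<?php
// Sketch for customer code (for example, a theme's functions.php).
// Assumption: the callback name below matches the one registered by MU-plugins.
if ( function_exists( 'add_action' ) ) {
	add_action(
		'init',
		function () {
			// The priority must match the one used with add_filter() (default 10).
			remove_filter(
				'robots_txt',
				'Automattic\VIP\Core\Privacy\vip_convenience_domain_robots_txt',
				10
			);
		}
	);
}
```

remove_filter() returns false when no matching callback is found, so checking its return value is a quick way to confirm that the name and priority line up.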
 */
function vip_convenience_domain_robots_txt( $output ) {
	$host = strtolower( $_SERVER['HTTP_HOST'] ?? '' ); // phpcs:ignore WordPress.Security.ValidatedSanitizedInput.InputNotSanitized
	if ( false !== strpos( $host, '.go-vip.co' ) || false !== strpos( $host, '.go-vip.net' ) ) {
Should these check that the strings appear at the end of $host, rather than anywhere in it? develop.go-vip.company.com would satisfy this condition, for instance.
Do any IPv4 or IPv6 addresses need to be taken into account as potential values of $host?
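One way to address the suffix concern is sketched below. The helper name `is_convenience_domain_host` is hypothetical and not part of this PR, and stripping the port and a trailing dot are assumptions about how the Host header might arrive. It uses substr() rather than str_ends_with() so it does not depend on PHP 8.

```php
<?php
/**
 * Hypothetical helper (not part of the PR): returns true only when the host
 * ends with one of the convenience-domain suffixes.
 */
function is_convenience_domain_host( string $raw_host ): bool {
	$host = strtolower( trim( $raw_host ) );

	// Drop a trailing ":port" if present, e.g. "example.go-vip.net:443".
	// A bracketed IPv6 literal such as "[::1]:8080" becomes "[::1]".
	$host = (string) preg_replace( '/:\d+$/', '', $host );

	// Drop the trailing dot of a fully qualified name ("example.go-vip.net.").
	$host = rtrim( $host, '.' );

	foreach ( array( '.go-vip.co', '.go-vip.net' ) as $suffix ) {
		$length = strlen( $suffix );
		// Require at least one label before the suffix and an exact match at the end.
		if ( strlen( $host ) > $length && substr( $host, -$length ) === $suffix ) {
			return true;
		}
	}

	return false;
}
```

With this check, develop.go-vip.company.com no longer matches, and IPv4 or bracketed IPv6 hosts fall through to false because they cannot end with either suffix.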
 * @return string The modified robots.txt content.
 */
function vip_convenience_domain_robots_txt( $output ) {
	$host = strtolower( $_SERVER['HTTP_HOST'] ?? '' ); // phpcs:ignore WordPress.Security.ValidatedSanitizedInput.InputNotSanitized
Is $_SERVER['HTTP_HOST'] set for curl requests?
I see the null coalescing, but I want to avoid customers making requests with curl and with a browser on their convenience domain sites and getting different results.
I think so, unless the host parameter is set otherwise?
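To illustrate the point: curl sends a Host header derived from the URL by default, so PHP sees the same HTTP_HOST value as it would for a browser unless the client overrides the header. The sketch below isolates the lookup from the PR in a hypothetical helper, `vip_example_request_host`, which takes the server array as a parameter so the fallback can be exercised without a web request.

```php
<?php
/**
 * Hypothetical helper mirroring the lookup in the PR: returns the lowercased
 * Host header, or '' when the server array has no HTTP_HOST entry.
 */
function vip_example_request_host( array $server ): string {
	return strtolower( $server['HTTP_HOST'] ?? '' );
}

// A browser and curl requesting the same URL send the same Host header.
$browser = vip_example_request_host( array( 'HTTP_HOST' => 'example.go-vip.net' ) );
$curl    = vip_example_request_host( array( 'HTTP_HOST' => 'example.go-vip.net' ) );

// No Host header at all (for instance a CLI invocation): falls back to ''.
$cli = vip_example_request_host( array() );
```

The empty-string fallback means the convenience-domain condition is simply not met when no Host header is available.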
function vip_convenience_domain_robots_txt( $output ) {
	$host = strtolower( $_SERVER['HTTP_HOST'] ?? '' ); // phpcs:ignore WordPress.Security.ValidatedSanitizedInput.InputNotSanitized
	if ( false !== strpos( $host, '.go-vip.co' ) || false !== strpos( $host, '.go-vip.net' ) ) {
		$output = "# Crawling is blocked for go-vip.co and go-vip.net domains\n";
Suggested change:
- $output = "# Crawling is blocked for go-vip.co and go-vip.net domains\n";
+ $output = "# Crawling is blocked for go-vip.co and go-vip.net domains.\n";
Description
Should be no change functionally since we are moving this logic into MU-plugins.
Pre-review checklist
Please make sure the items below have been covered before requesting a review:
Pre-deploy checklist
Steps to Test
vip-cli to incorporate this nginx update change and then link it
if ( false !== strpos( $host, '.go-vip.co' ) || false !== strpos( $host, '.go-vip.net' ) ) {
to
if ( false !== strpos( $host, '.vip-dev.lndo.site' ) || false !== strpos( $host, '.go-vip.net' ) ) {