Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Studio: Fix site url after import #437

Merged
merged 9 commits into from
Aug 13, 2024
Merged

Studio: Fix site url after import #437

merged 9 commits into from
Aug 13, 2024

Conversation

kozer
Copy link
Contributor

@kozer kozer commented Aug 7, 2024

Related to https://github.com/Automattic/dotcom-forge/issues/8462

Proposed Changes

After import, the URLs are pointing to the old site, and not to the URL used by studio.
This pr fixes that, by replacing the URLs in the database.

Relation with sqlite PR

  1. wp search-replace among other queries, it runs LIKE REPLACE queries as well ( I manage to make it do that after adding DROP table if exists in each sql file of a jetpack backup and try to import that ).
  2. This sql statement, makes search-replace to fail, and although it reports that it did replacements, it actually doesn't.
  3. You can easily test that, if you navigate locally to ~/Studio/< a site> and run wp search-replace.

Testing Instructions

  1. Create a new site in studio.
  2. Create and download a jetpack backup via activity log.
  3. Locate your Studio site locally. Navigate to wp-content/mu-plugins/sqlite-database-integration/, and ensure that you apply the changes under this pr in your local plugin.
  4. Start Studio, using STUDIO_IMPORT_EXPORT=true flag.
  5. Navigate to import/export tab, and import the backup you have previously downloaded.
  6. After import, export the database of your site.
  7. Open the .sql file, and search for baseurl and/or siteurl. Ensure it's the URL of your local site.

NOTE: Some URLs are not getting updated, but this is normal, as stated under this issue.

Pre-merge Checklist

  • Have you checked for TypeScript, React or other console errors?

@kozer kozer self-assigned this Aug 7, 2024
@kozer kozer marked this pull request as draft August 7, 2024 16:37
@kozer kozer requested a review from a team August 8, 2024 05:56
@kozer kozer marked this pull request as ready for review August 8, 2024 05:56
@wojtekn
Copy link
Contributor

wojtekn commented Aug 8, 2024

I left two comments and I didn't test it yet. Props for the clean solution!

@fluiddot
Copy link
Contributor

fluiddot commented Aug 8, 2024

[...] and ensure that you apply the changes under WordPress/sqlite-database-integration#149 in your local plugin.

@kozer I'm curious about why changes from Sqlite: Fix LIKE BINARY queries are required to test the PR. What's the connection between rewriting the site URL and that fix?

@kozer
Copy link
Contributor Author

kozer commented Aug 8, 2024

[...] and ensure that you apply the changes under WordPress/sqlite-database-integration#149 in your local plugin.

@kozer I'm curious about why changes from Sqlite: Fix LIKE BINARY queries are required to test the PR. What's the connection between rewriting the site URL and that fix?

Hey @fluiddot!

  1. wp search-replace among other queries, it runs LIKE REPLACE queries as well ( I manage to make it do that after adding DROP table if exists in each sql file of a jetpack backup and try to import that ).
  2. This sql statement, makes search-replace to fail, and although it reports that it did replacements, it actually doesn't.
  3. You can easily test that, if you navigate locally to ~/Studio/, and run wp search-replace.

Copy link
Contributor

@fluiddot fluiddot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explored the database after importing a Jetpack backup and I still see references to the original site URL. Specifically, in the guid column of the wp_posts table. Similarly, when I export the database, I see in the SQL file references to the original site URL. Not sure if I'm testing it incorrectly, @kozer are you getting different results?

src/lib/import-export/import/importers/importer.ts Outdated Show resolved Hide resolved
src/lib/import-export/import/importers/importer.ts Outdated Show resolved Hide resolved
@kozer
Copy link
Contributor Author

kozer commented Aug 8, 2024

I explored the database after importing a Jetpack backup and I still see references to the original site URL. Specifically, in the guid column of the wp_posts table. Similarly, when I export the database, I see in the SQL file references to the original site URL. Not sure if I'm testing it incorrectly, @kozer are you getting different results?

No @fluiddot , you are right on that! I suspect that this is due to what I'm referring to in the description.

NOTE: Some URLs are not getting updated, but this is normal, as stated under wp-cli/wp-cli#4871.

Note that I have tried --all-tables as well, without any difference on those urls.

@fluiddot
Copy link
Contributor

fluiddot commented Aug 8, 2024

NOTE: Some URLs are not getting updated, but this is normal, as stated under wp-cli/wp-cli#4871.

Note that I have tried --all-tables as well, without any difference on those URLs.

That's interesting. After further exploring the database, I find that the options siteurl and home of the wp_optionstable also remain with the original site URL. Is this expected?

Curiously enough, the output of the WP-CLI command search-replace states a lot of replacements but I couldn't find exactly those when exploring the database 🤔.

+-------------------------+-----------------------+--------------+------+
| Table                   | Column                | Replacements | Type |
+-------------------------+-----------------------+--------------+------+
| _mysql_data_types_cache | table                 | 131          | SQL  |
| _mysql_data_types_cache | column_or_index       | 131          | SQL  |
| _mysql_data_types_cache | mysql_type            | 131          | SQL  |
| wp_commentmeta          | meta_key              | 0            | SQL  |
| wp_commentmeta          | meta_value            | 0            | SQL  |
| sqlite_sequence         |                       | skipped      |      |
| wp_comments             | comment_author        | 0            | SQL  |
| wp_comments             | comment_author_email  | 0            | SQL  |
| wp_comments             | comment_author_url    | 0            | SQL  |
| wp_comments             | comment_author_IP     | 0            | SQL  |
| wp_comments             | comment_content       | 0            | SQL  |
| wp_comments             | comment_approved      | 0            | SQL  |
| wp_comments             | comment_agent         | 0            | SQL  |
| wp_comments             | comment_type          | 0            | SQL  |
| wp_jetpack_sync_queue   | queue_id              | 0            | SQL  |
| wp_jetpack_sync_queue   | event_id              | 0            | SQL  |
| wp_jetpack_sync_queue   | event_payload         | 0            | SQL  |
| wp_links                | link_url              | 10           | SQL  |
| wp_links                | link_name             | 10           | SQL  |
| wp_links                | link_image            | 10           | SQL  |
| wp_links                | link_target           | 10           | SQL  |
| wp_links                | link_description      | 10           | SQL  |
| wp_links                | link_visible          | 10           | SQL  |
| wp_links                | link_rel              | 10           | SQL  |
| wp_links                | link_notes            | 10           | SQL  |
| wp_links                | link_rss              | 10           | SQL  |
| wp_options              | option_name           | 290          | SQL  |
| wp_options              | option_value          | 0            | PHP  |
| wp_options              | autoload              | 290          | SQL  |
| wp_postmeta             | meta_key              | 123          | SQL  |
| wp_postmeta             | meta_value            | 0            | PHP  |
| wp_posts                | post_content          | 57           | SQL  |
| wp_posts                | post_title            | 57           | SQL  |
| wp_posts                | post_excerpt          | 57           | SQL  |
| wp_posts                | post_status           | 57           | SQL  |
| wp_posts                | comment_status        | 57           | SQL  |
| wp_posts                | ping_status           | 57           | SQL  |
| wp_posts                | post_password         | 57           | SQL  |
| wp_posts                | post_name             | 57           | SQL  |
| wp_posts                | to_ping               | 57           | SQL  |
| wp_posts                | pinged                | 57           | SQL  |
| wp_posts                | post_content_filtered | 0            | PHP  |
| wp_posts                | guid                  | 57           | SQL  |
| wp_posts                | post_type             | 57           | SQL  |
| wp_posts                | post_mime_type        | 57           | SQL  |
| wp_term_taxonomy        | taxonomy              | 4            | SQL  |
| wp_term_taxonomy        | description           | 4            | SQL  |
| wp_termmeta             | meta_key              | 0            | SQL  |
| wp_termmeta             | meta_value            | 0            | SQL  |
| wp_terms                | name                  | 4            | SQL  |
| wp_terms                | slug                  | 4            | SQL  |
| wp_usermeta             | meta_key              | 53           | SQL  |
| wp_usermeta             | meta_value            | 0            | PHP  |
| wp_users                | user_login            | 2            | SQL  |
| wp_users                | user_nicename         | 2            | SQL  |
| wp_users                | user_email            | 2            | SQL  |
| wp_users                | user_url              | 2            | SQL  |
| wp_users                | user_activation_key   | 2            | SQL  |
| wp_users                | display_name          | 2            | SQL  |
+-------------------------+-----------------------+--------------+------+
Success: Made 2008 replacements.

@kozer have you been able to identify any of those replacements? What do we expect to replace when executing the function replaceSiteUrl?

@kozer
Copy link
Contributor Author

kozer commented Aug 8, 2024

I find that the options siteurl and home of the wp_optionstable also remain with the original site URL. Is this expected?

I find that the options siteurl and home of the wp_optionstable also remain with the original site URL. Is this expected?

For me those URLs get replaced... It's strange.

have you been able to identify any of those replacements?

Yes, the URLs in wp_options, mostly. I didn't check the wp search-replace functionality in depth.

What do we expect to replace when executing the function replaceSiteUrl?

I'm not sure what portion of URLs gets replaced. I guess it's in the internals of search-replace. Do you think we need to check it more?

@kozer
Copy link
Contributor Author

kozer commented Aug 8, 2024

@fluiddot , I did a longer investigation.

First of all, what wp search-replace is reporting is waaaaaay off. ( Success: Made 1420 replacements. ).

I used the following process:

  1. Created a site in Studio.
  2. Exported it's database.
  3. Checked the lines that localhost:<port> is appearing.
  4. Import a backup ( without the sqlite fix )
  5. Checked again the lines localhost:<port> appears
  6. Applied the fix
  7. Re-imported, and, once again, counted the lines.

Before import

grep 'localhost:8891' studio-backup-my-serene-website-2024-08-08-14-46-25.sql | wc -l
7

This included links in wp_options, wp_posts and wp_users.
Note that it was only 7 lines but 9 occurrences ( in one line, inside the posts table, there was a row with 2 occurrences ).

After import - before fix

grep 'localhost:8891' studio-backup-my-serene-website-2024-08-08-14-46-25_2.sql | wc -l
0

All URLs were replaced,

After import - after fix

grep 'localhost:8891' studio-backup-my-serene-website-2024-08-08-14-46-25_3.sql | wc -l
2

This is correct as:

  1. Some URLs in the wp_posts were not recovered. ( as I already have mentioned above )
  2. All wp_options replaced as expected
  3. The user in wp_users was replaced by the imported user ( and for some reason, this imported user, doesn't have any URLs stored in order to get replaced )

@fluiddot
Copy link
Contributor

fluiddot commented Aug 8, 2024

I'm not sure what portion of URLs gets replaced. I guess it's in the internals of search-replace. Do you think we need to check it more?
@fluiddot , I did a longer investigation.

Thank you so much @kozer for investigating this 🙇 ! I think the problem I was having was related to not applying the changes to the SQLite integration (ref). I tried again and now I can see some of the replacements.

  • SQL export WITH replacements: 70 occurrences of original site URL.
    • 65 of those references are post GUIDs.
    • 5 are located in wp_options table, associated with options with prefix jpsq_sync- or jb_transient_jetpack_boost_speed_scores_.
  • SQL export WITHOUT replacements: 73 occurrences of original site URL.
    • 65 of those references are post GUIDs.
    • 5 are located in wp_options table, associated with options with prefix jpsq_sync- or jb_transient_jetpack_boost_speed_scores_.
    • ✅ 2 are the WP options siteurl and home
    • ✅ 1 is located in a post content.

Seems that wp search-replace is actually doing the replacements but doesn't apply to:

  • Some WP options associated with Jetpack.
  • Post GUIDs.

@kozer kozer requested review from wojtekn and fluiddot August 9, 2024 07:34
Copy link
Contributor

@fluiddot fluiddot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kozer Regarding the changes from WordPress/sqlite-database-integration#149, should we hold the merge of this PR until a new version of sqlite-database-integration is released with the needed changes?

  1. Open the .sql file, and search for baseurl. Ensure it's the URL of your local site.

The testing instructions mention baseurl but I wonder if we should check siteurl (in wp_options table) instead. If so, I confirmed it changes this value to the proper site URL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker for this PR but it would be great if we could cover this functionality with unit tests.

@kozer
Copy link
Contributor Author

kozer commented Aug 12, 2024

@kozer Regarding the changes from WordPress/sqlite-database-integration#149, should we hold the merge of this PR until a new version of sqlite-database-integration is released with the needed changes?

  1. Open the .sql file, and search for baseurl. Ensure it's the URL of your local site.

The testing instructions mention baseurl but I wonder if we should check siteurl (in wp_options table) instead. If so, I confirmed it changes this value to the proper site URL.

@fluiddot , we can wait, although it's not mandatory.
I suggest to merge it, in order to have one less PR to track it's status ( given that it doesn't produce any errors ).
What do you think?

@fluiddot
Copy link
Contributor

@fluiddot , we can wait, although it's not mandatory.
I suggest to merge it, in order to have one less PR to track it's status ( given that it doesn't produce any errors ).
What do you think?

It's true that the only side-effect of not including the fix is that the rewrite isn't happening. Apart from that, it won't produce any errors, so it sounds good to me to merge it as-is. We could add a note to the associated GitHub issue explaining that the issue is pending to be solved upon merging WordPress/sqlite-database-integration#149. WDYT?

@kozer kozer merged commit d708522 into trunk Aug 13, 2024
10 checks passed
@kozer kozer deleted the fix/site_url_after_import branch August 13, 2024 06:57
@sejas
Copy link
Member

sejas commented Aug 13, 2024

@kozer, I've noticed this will fail for new sites. I've tried running the importDatabase after the importWpContent, but it's still displaying an error message int he console. Do you have any ideas how to fix it?

This line "const { stdout: currentSiteUrl } = await server.executeWpCliCommand( option get siteurl );" is returning a wp-cli error:

wp-config-not-found

UPDATE:

The import works, but probably the search-replace do not, because the wp-content files and sqlite database integration are not ready.

Steps to reproduce:

  1. Click add site
  2. Select a backup file
  3. Observe the error in the console.

@kozer
Copy link
Contributor Author

kozer commented Aug 13, 2024

@kozer, I've noticed this will fail for new sites. I've tried running the importDatabase after the importWpContent, but it's still displaying an error message int he console. Do you have any ideas how to fix it?

This line "const { stdout: currentSiteUrl } = await server.executeWpCliCommand( option get siteurl );" is returning a wp-cli error:

wp-config-not-found

On it! I ll check it out. Thanks for mentioning!

@kozer
Copy link
Contributor Author

kozer commented Aug 13, 2024

As discussed, #455 solves the above issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants