feat(source-nodes): optimize node sourcing and transformation for improved performance #703

demirgazetic · 2024-07-09T12:11:53Z

Changes made:

Implement concurrent fetching of pages using Promise.all to speed up data retrieval.
Refactor variable names for better readability and maintainability.
Enhance node processing logic to handle content fields more efficiently.
Ensure robust handling of datasource entries and their dimensions.
Maintain support for local assets with improved caching mechanism.
Remove unused code.
Fix linting issues.
Update README.md.

Reason for the Change

Our goal was to enable incremental builds in our Gatsby project using the official gatsby-source-storyblok plugin. During implementation, we noticed that the source and transform nodes step was very slow, which led us to reconsider using this plugin.

After reviewing the source code of the Storyblok plugin, we identified several potential improvements. This pull request (PR) details the enhancements we've made.

Key Changes and Benefits

Sequential Fetching to Concurrent Fetching:
- Improved Performance: Parallel requests significantly boost performance, especially for large datasets.
- Rate Limit Handling: Concurrent fetching allows for more careful management of rate limits and server load.
- High-Performance Applications: Though more complex, this approach is necessary for optimizing high-performance applications.

By implementing concurrent fetching, we observed a 20-30% speed increase for large spaces with over 5000 pages and more than 20 datasources, some containing over 10,000 values.
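
As a rough illustration of the sequential-to-concurrent change described above, here is a minimal sketch. It is not the plugin's actual code: `fetchPage`, `fetchAllSequential`, and `fetchAllConcurrent` are hypothetical names, and the page size and item count are made up so the example can run on its own.

```javascript
// Hypothetical stand-in for a paginated API call; not part of the plugin.
// Resolves to the items on the given page after a short simulated delay.
function fetchPage(page, perPage = 2, total = 5) {
  const start = (page - 1) * perPage;
  const items = [];
  for (let i = start; i < Math.min(start + perPage, total); i++) {
    items.push({ id: i + 1 });
  }
  return new Promise((resolve) => setTimeout(() => resolve(items), 10));
}

// Sequential: each request waits for the previous one to finish.
async function fetchAllSequential(lastPage) {
  const all = [];
  for (let page = 1; page <= lastPage; page++) {
    all.push(...(await fetchPage(page)));
  }
  return all;
}

// Concurrent: every request is started up front, then awaited together.
// Promise.all returns results in the same order as the input promises.
async function fetchAllConcurrent(lastPage) {
  const pending = [];
  for (let page = 1; page <= lastPage; page++) {
    pending.push(fetchPage(page));
  }
  const pages = await Promise.all(pending);
  return pages.flat();
}
```

Both functions return the same items in the same order; only the waiting pattern differs. Note that Promise.all rejects as soon as any single request fails, and starting every request at once can exceed an API rate limit on large spaces, so a real implementation would likely need batching or throttling.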

Optimizing Datasource Fetching:

Focused Fetching: Some datasources are used only internally in Storyblok (icons, in our case) and are not required on the frontend. By fetching only the datasources that are needed, we further improved performance.
New Option includeDatasources: This option specifies which datasources to fetch, avoiding unnecessary data retrieval.

Example configuration in gatsby-config.js:

{
  resolve: 'gatsby-source-storyblok',
  options: {
    accessToken: 'YOUR_TOKEN',
    version: 'draft',
    resolveRelations: [''],
    includeLinks: false,
    includeDatasources: ['datasource1', 'datasource2', 'datasource3']
  }
}

Implementation logic:

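// Start with no datasources, so an empty includeDatasources array fetches none.
let datasources = [];
if (options.includeDatasources === undefined) {
  // Option not set: fetch every datasource in the space.
  datasources = await fetchAllDatasources();
} else if (options.includeDatasources.length > 0) {
  // Option set: use only the listed datasources.
  datasources = options.includeDatasources;
}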
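
To make the three possible outcomes of this logic explicit, the sketch below wraps it in a small function and calls it with a stubbed fetcher. The function name `resolveDatasources`, its signature, and the stub are assumptions for illustration; they are not taken from the plugin.

```javascript
// Hypothetical helper mirroring the logic above; name and signature are
// assumptions for illustration only.
async function resolveDatasources(options, fetchAllDatasources) {
  const { includeDatasources } = options;
  if (includeDatasources === undefined) {
    // Option omitted: fall back to fetching every datasource.
    return fetchAllDatasources();
  }
  // Option provided: use exactly the listed names (an empty list means none).
  return includeDatasources;
}

// Stub standing in for the real "fetch everything" call.
const fetchAllStub = async () => ['icons', 'colors', 'countries'];

async function demo() {
  const all = await resolveDatasources({}, fetchAllStub);
  const some = await resolveDatasources({ includeDatasources: ['colors'] }, fetchAllStub);
  const none = await resolveDatasources({ includeDatasources: [] }, fetchAllStub);
  return { all, some, none };
}
```

Under these assumptions, omitting the option keeps the previous fetch-everything behaviour, a non-empty array narrows the fetch, and an empty array skips datasources altogether.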

Fetching Tags
Since not everyone uses Storyblok Tags, having the option to disable this feature would also save time.

Summary

These changes result in significant performance improvements, making the plugin more suitable for large projects. By implementing concurrent fetching and selective datasource fetching, we have optimized the source and transform nodes step, making the Gatsby build process more efficient.

…roved performance - Implement concurrent fetching of pages using Promise.all to speed up data retrieval. - Refactor variable names for better readability and maintainability. - Enhance node processing logic to handle content fields more efficiently. - Ensure robust handling of datasource entries and their dimensions. - Maintain support for local assets with improved caching mechanism. - Remove unused code. - Fix linting issues. - Update README.md.

lib/src/sync.js

schabibi1 · 2024-07-30T17:42:23Z

@demirgazetic Thank you for creating a PR! Also, thank you for providing me with the details above.

I wrote a few questions for review. Please feel free to comment there, as I have a few things I would like to hear more details about.
In my opinion, changing from sequential fetching to concurrent fetching sounds good for scaling Gatsby projects while minimizing the risk of hitting the rate limit unnecessarily.
Since there is only so much that can be done on the Gatsby side to improve performance, using Promise.all so that it resolves only once the required arrays of Promises (i.e. stories, datasources, and tags) have resolved seems to be what can be done from your side.

You also removed the old code from gatsby-node.js and reduced the scope levels with getAll where possible.
Making includeDatasources and includeTags opt-in options in gatsby-config is a good approach, as we already did for links with includeLinks.

As Gatsby users constantly need to pay attention to rate limits and performance cost, opt-in settings in the options object provided by Gatsby look like the way to go. Again, that is what we did for includeLinks.

…ation logic - Updated the pagination logic to use a single `page` variable for both the initial response and the for loop. - Changed the calculation of `lastPage` by dividing the total number by 100 instead of 25 to optimize the page count.
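
The page-count arithmetic mentioned in this commit message can be sketched as follows. This is only an illustration: `lastPageFor` and `PER_PAGE` are hypothetical names, and the page size of 100 is taken from the commit message rather than from the plugin source.

```javascript
// Assumed page size, as implied by the commit message above.
const PER_PAGE = 100;

// Number of pages needed to cover `total` items at `perPage` items per page.
// Math.ceil makes sure a partially filled final page is still requested.
function lastPageFor(total, perPage = PER_PAGE) {
  return Math.ceil(total / perPage);
}

console.log(lastPageFor(5000));     // 50 pages at 100 items per page
console.log(lastPageFor(5000, 25)); // 200 pages at 25 items per page
console.log(lastPageFor(101));      // 2 pages: the last one holds a single item
```

The divisor has to match the page size actually requested; dividing by 25 while requesting 100 items per page would compute four times as many pages as needed.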

github-actions · 2024-08-13T07:58:16Z

🎉 This issue has been resolved in version 7.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

notion-workspace · 2024-08-13T08:13:10Z

Merge

demirgazetic added 3 commits July 9, 2024 13:45

chore:Revert to CommonJS Syntax for Compatibility with Gatsby

35c1fe4

fix: missing data_source_dimension in included datasources

04bb879

schabibi1 added the investigation [Issue] The Storyblok team is investigating label Jul 30, 2024

schabibi1 reviewed Jul 30, 2024

schabibi1 assigned schabibi1 and dipankarmaikap Aug 5, 2024

schabibi1 requested a review from dipankarmaikap August 5, 2024 15:50

dipankarmaikap approved these changes Aug 5, 2024

schabibi1 approved these changes Aug 6, 2024

schabibi1 merged commit 2adeafa into storyblok:master Aug 13, 2024
2 checks passed

github-actions bot added the released label Aug 13, 2024
