- remove '\n' and '\r' characters in fulltext extraction
- detect required 'utf8mb4' charset for MariaDB and throw a readable error in the migrate command
- update README: utf8mb4 is required when using MariaDB
Extensible, low impact, self-contained, SQL-pure stupidly simple search integration for Neos. Search with the power of SQL fulltext queries. No need for additional infrastructure, works purely with your existing DB.
Search configuration is largely backward compatible with Neos.SearchPlugin / SimpleSearch / ElasticSearch.
Supports:
- MariaDB version >= 10.6
- MySQL version >= 8.0
- PostgreSQL -> supported very soon
Next Steps:
- migration tooling
- PostgreSQL support (first working draft)
still early WIP phase, TODO more documentation!
- no additional infrastructure required (like ElasticSearch or MySQLite)
- no explicit index building after changing nodes required, run the SQL migration... that's it :)
- still performant due to fulltext indexing on database level
- easy to extend with additional SearchResultTypes (e.g. tables from your database and/or custom Flow entities like products)
- high level and low level extension API
- comes with Neos Content as default SearchResultType
- designed to be database agnostic
- search multiple sources with a single, performant SQL query
KISSearch has some abstraction layers:
- underlying database system (MariaDB, MySQL, Postgres, ...)
- search result types API, easily usable for custom search result types that live in your database (e.g. products in a shop)
- the neos_content search result type ships with this package; internally it uses the same API
- each search result type can declare its own additional query parameters (see e.g. the [neos_content parameter section](# neos_content additional parameters))
Configuration happens via Settings.yaml
There are three public APIs that you can use for searching:
- PHP API: the Flow singleton SearchService class, e.g. for use in your Flow controller
- Fusion API: all objects in the Sandstorm.KISSearch namespace; run search queries from within your Fusion components
- Flow CLI command: run search commands on the command line; most likely for debugging
- migrations must be transactional -> currently, they are not and may leave an inconsistent migration status on errors
TODO: use composer - doc after first release
Default config that lives inside the Sandstorm.KISSearch package:
Sandstorm:
  KISSearch:
    # database type; mainly required for minimum version checking
    # supported databases:
    #  - MySQL
    #  - MariaDB
    #  - PostgreSQL (coming soon)
    databaseType: 'MariaDB'

    # all registered search result types
    searchResultTypes:
      # default search result type: Neos Content (Nodes from Content Repository)
      neos_content: 'Sandstorm\KISSearch\SearchResultTypes\NeosContent\NeosContentSearchResultType'

    # configuration for the Neos Content search result type
    neosContent:
      # explicit filter to exclude specific node types from search indexing
      excludedNodeTypes:
        - 'Neos.Neos:Shortcut'
      # only document node types that extend this get indexed
      # also, only content nodes that live below documents extending this get indexed
      baseDocumentNodeType: 'Neos.Neos:Document'
      # only content node types that extend this get indexed
      baseContentNodeType: 'Neos.Neos:Content'
Write your own search result types and register them via config:
Sandstorm:
  KISSearch:
    searchResultTypes:
      # extensibility for your custom search result types
      my_products: 'Vendor\YourProject\SearchResultTypes\YourTypes\ProductSearchResultType'
Important: The custom search result class must be a Flow service known by the object manager, and it must implement the interface Sandstorm\KISSearch\SearchResultTypes\SearchResultTypeInterface.
Internal name: neos_content
Implemented using the public search result type API, KISSearch comes with Neos content search shipped <3
Name | Description | Type | Required |
---|---|---|---|
neosContentSiteNodeName | site node name allow list filter | string, array of strings (also supports NodeName value objects) | optional |
neosContentExcludedSiteNodeName | site node name deny list filter | string, array of strings (also supports NodeName value objects) | optional |
neosContentDimensionValues | content dimension value node filter | array of arrays, dimension name mapped to array of target values | optional |
Example CLI:
./flow kissearch:search --query "Neos" --additional-params '{"neosContentDimensionValues": {"language": ["en_US"]}, "neosContentSiteNodeName": "neosdemo"}'
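The parameter table and the CLI example above translate directly into a PHP array. The following is a minimal sketch that only builds and encodes the parameters and does not call KISSearch; the values `neosdemo` and `en_US` are the placeholders from the CLI example, and the variable names are illustrative.

```php
<?php
declare(strict_types=1);

// Additional parameters for the neos_content search result type, using the
// parameter names from the table above. The values are placeholders.
$additionalParameters = [
    // dimension name mapped to the list of accepted dimension values
    'neosContentDimensionValues' => [
        'language' => ['en_US'],
    ],
    // allow list: a single site node name (an array of names is accepted as well)
    'neosContentSiteNodeName' => 'neosdemo',
];

// JSON string in the shape expected by the --additional-params CLI option
$json = json_encode($additionalParameters, JSON_THROW_ON_ERROR);

echo $json, PHP_EOL;
// prints {"neosContentDimensionValues":{"language":["en_US"]},"neosContentSiteNodeName":"neosdemo"}
```

The same array shape is what the Flow Service API example passes as the second argument of `SearchQueryInput`.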
Mode: Extract text into single bucket.
'Vendor:My.NodeType':
  properties:
    'myProperty':
      search:
        # possible values: 'critical', 'major', 'normal', 'minor'
        bucket: 'major'
Mode: Extract HTML tags into specific buckets.
'Vendor:My.ExampleText':
  superTypes:
    'Neos.NodeTypes.BaseMixins:TextMixin': true
  properties:
    'text':
      search:
        # possible values: 'all', 'critical', 'major', 'normal', 'minor'
        # or an array containing multiple values of: 'critical', 'major', 'normal', 'minor'
        # 'all' is not supported as array value
        # 'all' is equivalent to ['critical', 'major', 'normal', 'minor']
        extractHtmlInto: 'all'
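The comments in the configuration above describe how `extractHtmlInto` values relate to the four buckets. As a sketch of those rules only, the following uses a hypothetical helper named `expandBuckets`; it is not part of the package, and the package's own validation may behave differently.

```php
<?php
declare(strict_types=1);

// The four bucket names listed in the configuration comments.
const ALL_BUCKETS = ['critical', 'major', 'normal', 'minor'];

/**
 * Hypothetical helper: expands an extractHtmlInto value to a list of bucket names.
 *
 * @param string|string[] $value 'all', a single bucket name, or an array of bucket names
 * @return string[]
 */
function expandBuckets(string|array $value): array
{
    // 'all' is shorthand for every bucket, but only as a plain string
    if ($value === 'all') {
        return ALL_BUCKETS;
    }
    $buckets = is_array($value) ? array_values($value) : [$value];
    foreach ($buckets as $bucket) {
        // rejects unknown names, and also 'all' when used inside an array
        if (!in_array($bucket, ALL_BUCKETS, true)) {
            throw new InvalidArgumentException("Unsupported bucket: $bucket");
        }
    }
    return $buckets;
}

echo implode(', ', expandBuckets('all')), PHP_EOL;                 // critical, major, normal, minor
echo implode(', ', expandBuckets(['critical', 'major'])), PHP_EOL; // critical, major
```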
This package is compatible with the fulltext extraction configuration used by Neos.SearchPlugin / Neos.SimpleSearch / Neos.ElasticSearch.
Create SQL index:
./flow kissearch:migrate
Remove SQL index:
./flow kissearch:remove
Check required minimum database version:
# if not fulfilled, this will print an error message and exit with code 1
./flow kissearch:checkVersion
In general, search result types can define their own additional query parameters. These are key-value pairs and appear in two places:
- as values that you pass into the public search API
- as parameters declared in the SQL query, which receive those values dynamically
In general, you can choose between two search modes (or even use both in different places):
Use this API if you want results ranked purely by their overall score. A possible side effect is that many high-scoring results from a single result type "push out" results from other types.
Example, let's say you have:
- two different search result types: neos_content (built in) and products (custom)
- a global query limit of 10
Let's also say: without applying the limit, the search result query matches 20 results from neos_content and 20 results from products. All 20 results from products have a higher score than the results from neos_content. That means, after applying the global limit of 10, the end result will be 10 products.
This works well if you want the most important results across all result types. It does not work well if you want at least a few results from each result type. For those use cases, the next search mode might be the better approach.
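The "push out" effect of a global limit can be reproduced with plain PHP arrays. This is an illustrative simulation only: the function `applyGlobalLimit`, the hit structure, and the scores are assumptions made for this sketch, and it does not reproduce the SQL query that KISSearch actually runs.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: rank all hits by score and cut off at one global limit.
 *
 * @param array<int, array{type: string, id: string, score: float}> $hits
 * @return array<int, array{type: string, id: string, score: float}>
 */
function applyGlobalLimit(array $hits, int $limit): array
{
    // highest score first, regardless of result type
    usort($hits, fn(array $a, array $b): int => $b['score'] <=> $a['score']);
    return array_slice($hits, 0, $limit);
}

// 20 neos_content hits with scores 1..20 and 20 product hits with scores 21..40,
// so every product outranks every content node
$hits = [];
for ($i = 1; $i <= 20; $i++) {
    $hits[] = ['type' => 'neos_content', 'id' => "node-$i", 'score' => (float)$i];
    $hits[] = ['type' => 'products', 'id' => "product-$i", 'score' => (float)(20 + $i)];
}

$top = applyGlobalLimit($hits, 10);
$countsPerType = array_count_values(array_column($top, 'type'));

print_r($countsPerType); // only 'products' remains, with a count of 10
```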
Use this API if you want at least a few results from all search result types.
TODO more doc -> "push out" cannot happen, but lower-scored results from other types will be returned.
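A limit per result type can be simulated with plain PHP arrays as well. This is a sketch under stated assumptions: the function `applyLimitPerResultType`, the hit structure, and the scores are invented for illustration, and the package's SQL implementation is not shown here.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: keep the best-scored hits of each result type separately.
 *
 * @param array<int, array{type: string, id: string, score: float}> $hits
 * @param array<string, int> $limitPerType result type name mapped to its limit
 * @return array<int, array{type: string, id: string, score: float}>
 */
function applyLimitPerResultType(array $hits, array $limitPerType): array
{
    // highest score first, so each type keeps its best hits
    usort($hits, fn(array $a, array $b): int => $b['score'] <=> $a['score']);

    $kept = [];
    $keptPerType = [];
    foreach ($hits as $hit) {
        $type = $hit['type'];
        $alreadyKept = $keptPerType[$type] ?? 0;
        // types without a configured limit are dropped in this sketch
        if ($alreadyKept < ($limitPerType[$type] ?? 0)) {
            $kept[] = $hit;
            $keptPerType[$type] = $alreadyKept + 1;
        }
    }
    return $kept;
}

// 20 neos_content hits with scores 1..20 and 20 product hits with scores 21..40
$hits = [];
for ($i = 1; $i <= 20; $i++) {
    $hits[] = ['type' => 'neos_content', 'id' => "node-$i", 'score' => (float)$i];
    $hits[] = ['type' => 'products', 'id' => "product-$i", 'score' => (float)(20 + $i)];
}

$kept = applyLimitPerResultType($hits, ['neos_content' => 5, 'products' => 5]);
$countsPerType = array_count_values(array_column($kept, 'type'));

print_r($countsPerType); // 5 hits for 'products' and 5 for 'neos_content'
```

In this simulation the content nodes with scores 16 to 20 are returned, while the products with scores 21 to 35 are not. That is the trade-off of this mode: every type is represented, at the cost of returning some lower-scored results.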
Search on command line:
# without URL generator
./flow kissearch:search --query "Neos" --limit 100
# with URL generator
./flow kissearch:searchFrontend --query "Neos" --limit 100
# limit by result type - without URL generator
./flow kissearch:searchLimitPerResultType --query "Neos" --limit '{"neos_content": 50, "product": 50}'
# limit by result type - with URL generator
./flow kissearch:searchFrontendLimitPerResultType --query "Neos" --limit '{"neos_content": 50, "product": 50}'
Flow Service API:
use Sandstorm\KISSearch\Service\SearchQueryInput;
use Sandstorm\KISSearch\Service\SearchService;
use Sandstorm\KISSearch\SearchResultTypes\SearchResult;
use Sandstorm\KISSearch\SearchResultTypes\SearchResultFrontend;
use Neos\Flow\Mvc\Controller\ActionController;
use Neos\Flow\Annotations\Scope;
use Neos\Flow\Annotations\Inject;

#[Scope('singleton')]
class SearchController extends ActionController {

    private const DEFAULT_SEARCH_LIMIT = 100;

    #[Inject]
    protected SearchService $searchService;

    /**
     * Includes URLs to the document nodes.
     *
     * @param string $searchQueryUserInput the search term user input
     * @return string search results as JSON
     */
    public function searchFrontendAction(string $searchQueryUserInput): string
    {
        /** @var SearchResultFrontend[] $searchResults */
        $searchResults = $this->searchService->searchFrontend(
            new SearchQueryInput(
                $searchQueryUserInput,
                [
                    'neosContentSiteNodeName' => ['foobar'],
                    'neosContentExcludedSiteNodeName' => ['site-i-want-to-exclude'],
                    // dimension name => list of accepted values, e.g. 'language' => ['en_US']
                    'neosContentDimensionValues' => [
                    ]
                ]
            ),
            self::DEFAULT_SEARCH_LIMIT
        );
        return json_encode($searchResults);
    }

    /**
     * Includes only IDs of the document nodes.
     *
     * @param string $searchQueryUserInput the search term user input
     * @return string search results as JSON
     */
    public function searchAction(string $searchQueryUserInput): string
    {
        /** @var SearchResult[] $searchResults */
        $searchResults = $this->searchService->search(new SearchQueryInput($searchQueryUserInput), self::DEFAULT_SEARCH_LIMIT);
        return json_encode($searchResults);
    }
}
Fusion Object API:
// includes URLs of the document nodes
root = Sandstorm.KISSearch:SearchFrontend {
    query = ${request.arguments.q}
    limit = 100
    @process.json = ${Json.stringify(value)}
}

// includes only IDs to the document nodes
root = Sandstorm.KISSearch:Search {
    query = ${request.arguments.q}
    limit = 100
    @process.json = ${Json.stringify(value)}
}
Fusion Eel Helper API:
// includes URLs of the document nodes
root = ${KISSearch.searchFrontend(request.arguments.q, 100)}
root.@process.json = ${Json.stringify(value)}
// includes only IDs to the document nodes
root = ${KISSearch.search(request.arguments.q, 100)}
root.@process.json = ${Json.stringify(value)}
TODO document additionalParameters!
TODO document on how to use with Fusion caching
Objects.yaml:
'Neos\Neos\Domain\Service\NodeSearchServiceInterface':
  className: 'Sandstorm\KISSearch\Service\KISSearchNodeSearchService'
TODO
run in container:
bin/behat -c Packages/Application/Sandstorm.KISSearch/Tests/Behavior/behat.yml.dist -vvv Packages/Application/Sandstorm.KISSearch/Tests/Behavior/Features/