Enable Opensearch compatibility #271

Open
Ark74 opened this issue Jul 24, 2023 · 15 comments

Comments

@Ark74
Contributor

Ark74 commented Jul 24, 2023

Hello!
Since 2021 there has been a re-implementation that uses Opensearch 1.x¹, which is compatible enough to stand in for ES 7.x.

Starting with version 26, this is no longer possible; there are even checks in place to reject any server other than Elasticsearch.
For example, in ProductCheckTrait.php:

...
        if ($statusCode >= 200 && $statusCode < 300) {
            $product = $response->getHeaderLine(Elasticsearch::HEADER_CHECK);
            if (empty($product) || $product !== Elasticsearch::PRODUCT_NAME) {
                throw new ProductCheckException(
                    'The client noticed that the server is not Elasticsearch and we do not support this unknown product'
                );
...

I wonder whether there is any particular reason to side with one project, or whether fulltextsearch could be a more agnostic FTS client.

Best regards.

¹ https://github.com/nextcloud/vm/blob/master/apps/fulltextsearch.sh
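
For readers who want to see what the ProductCheckTrait.php excerpt above reacts to, here is a minimal shell sketch of the same test applied to raw response headers. It assumes that `Elasticsearch::HEADER_CHECK` is the `X-Elastic-Product` header and `Elasticsearch::PRODUCT_NAME` is the value `Elasticsearch`; both are worth verifying against the vendored client. The function name `is_elasticsearch_product` and the URL in the usage comment are illustrative only.

```shell
#!/bin/sh
# Reads raw HTTP response headers on stdin and succeeds only when an
# X-Elastic-Product header is present with the value "Elasticsearch".
# ASSUMPTION: header name and value mirror the client's HEADER_CHECK and
# PRODUCT_NAME constants; check the vendored library before relying on this.
is_elasticsearch_product() {
    # Strip carriage returns, then compare the header name case-insensitively.
    tr -d '\r' | awk -F': *' '
        tolower($1) == "x-elastic-product" { value = $2 }
        END { exit (value == "Elasticsearch" ? 0 : 1) }
    '
}

# Possible usage against a local node (hypothetical URL):
#   curl -s -D - -o /dev/null http://localhost:9200 | is_elasticsearch_product \
#       && echo "client would accept this server" \
#       || echo "client would raise ProductCheckException"
```

A server that does not send that header with that value fails the test, which matches the behaviour described in this issue.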

@enoch85
Member

enoch85 commented Jul 24, 2023

Don't know if it makes any difference, but the Docker container in question which is used by the Nextcloud VM has more than 10 million pulls. This breaking change affects many users. Please consider fixing it.

cc @ArtificialOwl

@enoch85
Member

enoch85 commented Aug 14, 2023

Not so crucial to me anymore since we changed to Elasticsearch, but I still think it would be great if you supported Opensearch.

@hashworks

What is the show-stopper here? Is it general incompatibility or just a simple version check? If it's the latter it would be nice if you could include a flag to ignore any unknown versions (which then voids any support).

@Ark74
Contributor Author

Ark74 commented Aug 30, 2023

@hashworks Might wanna check: apps/fulltextsearch_elasticsearch/vendor/elasticsearch/

Seems like a general incompatibility, as Elasticsearch will not allow Opensearch connections.

@poiNt3D

poiNt3D commented Feb 7, 2024

Seems strange that Nextcloud relies on a commercial product instead of an open-source fork.

@ProfZiebart

ProfZiebart commented Oct 12, 2024

Opensearch states that the Elasticsearch client should be fully compatible with their server. Furthermore, they state that they support both kinds of clients: Elasticsearch and Opensearch. The Elasticsearch clients, however, block the Opensearch server, as stated above.
I would vote to replace the elastic-client with an opensearch-client under these circumstances as it would give Nextcloud Users more flexibility.

@arminfelder

I just did a quick PoC, changing the lib from ElasticSearch to OpenSearch: https://github.com/arminfelder/fulltextsearch_elasticsearch/tree/opensearch-test-30
There is almost no difference, except that the ElasticSearch library has a built-in check to block OpenSearch. The queries issued by this plugin do not require any Elastic-specific features, so replacing the library with OpenSearch, or extending the app to support it, would certainly be a good idea, given the benefit for users of not having to deal with license adventures.

@rasos

rasos commented Oct 27, 2024

This OpenSearch fork looks awesome and searches blazingly fast.
@ArtificialOwl should this rather be merged into the fulltextsearch_elasticsearch app, or live as a separate one? Even if a separate app can register as a new search service, it would mean double the maintenance effort.

@ProfZiebart

ProfZiebart commented Oct 27, 2024

I am stuck with OpenSearch, as Elastic doesn't quite seem to run on my Raspi 5, whilst OpenSearch does. I use it in a BlueSpice wiki.

  1. Is there anybody out there with an Elastic server who could test whether OpenSearch's statement that both are compatible is correct?
  2. Are there any known test cases to run for this extension?
    If the test cases run as promised, we could stick with this extension without double maintenance effort, as it could act as a client for both Elastic and OpenSearch servers.

@ProfZiebart

ProfZiebart commented Oct 28, 2024 via email

@arminfelder

arminfelder commented Oct 28, 2024

I just looked into it, works fine on my machine :). From the error message, I would guess that the opensearch lib was not properly downloaded or processed on your machine; maybe try deleting lib/Vendor and running make again.

here is the build I use in my nextcloud(version 30) test env
fulltextsearch_elasticsearch.tar.gz

@ProfZiebart

ProfZiebart commented Oct 28, 2024

I figured it out: when I cloned your branch, opensearch was in vendor/opensearch-project/opensearch-php/OpenSearch. I had to move that folder to vendor/OpenSearch.
So for the files:

cd /var/www/nextcloud/apps
git clone https://github.com/arminfelder/fulltextsearch_elasticsearch.git
cd fulltextsearch_elasticsearch
git checkout origin/opensearch-test-30
mv ./vendor/opensearch-project/opensearch-php/OpenSearch ./
curl -sS https://getcomposer.org/installer | php
php composer.phar install
php /var/www/nextcloud/occ app:enable fulltextsearch_elasticsearch

Now I get nearly 90% of the occ fulltextsearch:test done but it throws an error:

.Testing your current setup:
Creating mocked content provider. ok
Testing mocked provider: get indexable documents. (2 items) ok
Loading search platform. (Elasticsearch) ok
Testing search platform. ok
Locking process ok
Removing test. ok
Pausing 3 seconds 1 2 3 ok
Initializing index mapping. ok
Indexing generated documents. ok
Pausing 3 seconds 1 2 3 ok
Retreiving content from a big index (license). (size: 32386) ok
Comparing document with source. ok
Searching basic keywords:
 - 'test' (result: 1, expected: ["simple"]) ok
 - 'document is a simple test' (result: 2, expected: ["simple","license"]) ok
 - '"document is a test"' (result: 0, expected: []) ok
 - '"document is a simple test"' (result: 1, expected: ["simple"]) ok
 - 'document is a simple -test' (result: 1, expected: ["license"]) ok
 - 'document is a simple +test' (result: 1, expected: ["simple"]) ok
 - '-document is a simple test' (result: 0, expected: []) ok
 - 'document is a simple +test +testing' (result: 1, expected: ["simple"]) ok
 - 'document is a simple +test -testing' (result: 0, expected: []) ok
 - 'document is a +simple -test -testing' (result: 0, expected: []) ok
 - '+document is a simple -test -testing' (result: 1, expected: ["license"]) ok
 - 'document is a +simple -license +testing' (result: 1, expected: ["simple"]) ok
Updating documents access. ok
Pausing 3 seconds 1 2 3 ok
Searching with group access rights:
 - 'license' - [] -  (result: 0, expected: []) ok
 - 'license' - ["group_1"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_1","Group_2"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_3","Group_2"] -  (result: 0, expected: ["license"]) fail
Error detected, unlocking process ok
In Test.php line 676:

  Unexpected SearchResult: {"provider":{"id":"test_provider","name":"Test Provider"},"platform":{"id":"elastic_search","name":"Elast
  icsearch"},"documents":[],"info":[],"meta":{"timedOut":false,"time":10,"count":0,"total":0,"maxScore":0}}


fulltextsearch:test [--output [OUTPUT]] [-j|--json] [-d|--platform_delay PLATFORM_DELAY]

@arminfelder

arminfelder commented Oct 28, 2024

could you try:
occ fulltextsearch:reset (clears everything from Elastic/OpenSearch)
occ fulltextsearch:index (creates index and mappings)
occ fulltextsearch:test

I get:

root@391dc97c9c0b:/var/www/html# sudo -u www-data php occ fulltextsearch:test

Warning: Failed to set memory limit to 0 bytes (Current memory usage is 2097152 bytes) in Unknown on line 0
The current PHP memory limit is below the recommended value of 512MB.
 
.Testing your current setup:  
Creating mocked content provider. ok  
Testing mocked provider: get indexable documents. (2 items) ok  
Loading search platform. (Elasticsearch) ok  
Testing search platform. ok  
Locking process ok  
Removing test. ok  
Pausing 3 seconds 1 2 3 ok  
Initializing index mapping. ok  
Indexing generated documents. ok  
Pausing 3 seconds 1 2 3 ok  
Retreiving content from a big index (license). (size: 32386) ok  
Comparing document with source. ok  
Searching basic keywords:  
 - 'test' (result: 1, expected: ["simple"]) ok  
 - 'document is a simple test' (result: 2, expected: ["simple","license"]) ok  
 - '"document is a test"' (result: 0, expected: []) ok  
 - '"document is a simple test"' (result: 1, expected: ["simple"]) ok  
 - 'document is a simple -test' (result: 1, expected: ["license"]) ok  
 - 'document is a simple +test' (result: 1, expected: ["simple"]) ok  
 - '-document is a simple test' (result: 0, expected: []) ok  
 - 'document is a simple +test +testing' (result: 1, expected: ["simple"]) ok  
 - 'document is a simple +test -testing' (result: 0, expected: []) ok  
 - 'document is a +simple -test -testing' (result: 0, expected: []) ok  
 - '+document is a simple -test -testing' (result: 1, expected: ["license"]) ok  
 - 'document is a +simple -license +testing' (result: 1, expected: ["simple"]) ok  
Updating documents access. ok  
Pausing 3 seconds 1 2 3 ok  
Searching with group access rights:  
 - 'license' - [] -  (result: 0, expected: []) ok  
 - 'license' - ["group_1"] -  (result: 1, expected: ["license"]) ok  
 - 'license' - ["group_1","Group_2"] -  (result: 1, expected: ["license"]) ok  
 - 'license' - ["group_3","Group_2"] -  (result: 1, expected: ["license"]) ok  
 - 'license' - ["group_3"] -  (result: 0, expected: []) ok  
Searching with share rights:  
 - 'license' - notuser -  (result: 0, expected: []) ok  
 - 'license' - User number_2 -  (result: 1, expected: ["license"]) ok  
 - 'license' - User3 -  (result: 1, expected: ["license"]) ok  
 - 'license' - User@4 -  (result: 1, expected: ["license"]) ok  
Removing test. ok  
Unlocking process ok  

my dev setup(docker compose) is:

services:
  opensearch: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/)
    image: opensearchproject/opensearch:latest # Specifying the latest available image - modify if you want a specific version
    environment:
      - cluster.name=opensearch-cluster # Name the cluster
      - node.name=opensearch-node1 # Name the node that will run in this container
      - discovery.seed_hosts=opensearch-node1 # Nodes to look for when discovering the cluster
      - cluster.initial_cluster_manager_nodes=opensearch-node1
      - bootstrap.memory_lock=true # Disable JVM heap memory swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=vvQYA7sVFUWq    # Sets the demo admin user password when using demo configuration, required for OpenSearch 2.12 and later
      - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true
      - plugins.security.ssl.http.enabled=false
    ulimits:
      memlock:
        soft: -1 # Set memlock to unlimited (no soft or hard limit)
        hard: -1
      nofile:
        soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container
      - opensearch-plugins:/usr/share/opensearch/plugins/
    ports:
      - 9200:9200 # REST API
      - 9600:9600 # Performance Analyzer
    networks:
      - opensearch-net
      - nextcloud
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:latest # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
    container_name: opensearch-dashboards
    ports:
      - 5601:5601 # Map host port 5601 to container port 5601
    expose:
      - "5601" # Expose port 5601 for web access to OpenSearch Dashboards
    environment:
      OPENSEARCH_HOSTS: '["http://opensearch:9200"]' # Define the OpenSearch nodes that OpenSearch Dashboards will query
    networks:
      - opensearch-net
  db:
    image: mariadb:10.6
    command: --transaction-isolation=READ-COMMITTED --log-bin=binlog --binlog-format=ROW
    volumes:
      - db:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=utDaYtTzn4dk
      - MYSQL_PASSWORD=utDaYtTzn4dk
      - MYSQL_DATABASE=nextcloud
      - MYSQL_USER=nextcloud
    networks:
      - nextcloud

  app:
    image: nextcloud
    ports:
      - 8080:80
    volumes:
      - nextcloud:/var/www/html
    environment:
      - MYSQL_PASSWORD=utDaYtTzn4dk
      - MYSQL_DATABASE=nextcloud
      - MYSQL_USER=nextcloud
      - MYSQL_HOST=db
      - NEXTCLOUD_ADMIN_USER=admin
      - NEXTCLOUD_ADMIN_PASSWORD=1234
    networks:
      - nextcloud

networks:
  nextcloud:
  opensearch-net:

volumes:
  nextcloud:
  db:
  opensearch-plugins:
  opensearch-data1:
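
To confirm which product is actually answering on the published port 9200 of a setup like this, the root endpoint can be queried and its JSON inspected. The following is a minimal sketch, assuming the OpenSearch root response carries a `version.distribution` field set to `opensearch` and that the demo `admin` user with the configured initial password is accepted over plain HTTP; the function name `server_distribution` is illustrative.

```shell
#!/bin/sh
# Reads the JSON body of the root endpoint on stdin and prints the value of
# the "distribution" field, or "unknown" when the field is absent.
# ASSUMPTION: OpenSearch reports "distribution": "opensearch" in its root
# response; servers that omit the field are reported as "unknown".
server_distribution() {
    # Join all lines so the pattern also matches pretty-printed JSON.
    dist=$(tr -d '\n' | sed -n 's/.*"distribution"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
    printf '%s\n' "${dist:-unknown}"
}

# Possible usage against the compose setup above (credentials are assumptions):
#   curl -s -u "admin:<OPENSEARCH_INITIAL_ADMIN_PASSWORD>" http://localhost:9200 \
#       | server_distribution
```

The pattern match is deliberately crude; a JSON-aware tool would be more robust, but this keeps the sketch free of extra dependencies.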

I did run:

  • make
  • extract the tar.gz from ./build to Nextcloud's apps folder, e.g.: ~/.local/share/containers/storage/volumes/nextcloud_debug-nextcloud/_data/apps/
  • chown 100032:100032 ~/.local/share/containers/storage/volumes/nextcloud_debug-nextcloud/_data/apps/ -R
  • inside the nextcloud container: sudo -u www-data php occ app:install fulltextsearch_elasticsearch

@ProfZiebart

I tried that one.
I get:

Searching with group access rights:
 - 'license' - [] -  (result: 0, expected: []) ok
 - 'license' - ["group_1"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_1","Group_2"] -  (result: 1, expected: ["license"]) ok
 - 'license' - ["group_3","Group_2"] -  (result: 0, expected: ["license"]) fail
Error detected, unlocking process ok
In Test.php line 676:

  Unexpected SearchResult: {"provider":{"id":"test_provider","name":"Test Provider"},"platform":{"id":"elastic_search","name":"Elast
  icsearch"},"documents":[],"info":[],"meta":{"timedOut":false,"time":3,"count":0,"total":0,"maxScore":0}}


fulltextsearch:test [--output [OUTPUT]] [-j|--json] [-d|--platform_delay PLATFORM_DELAY]

@arminfelder

ok, I just wiped my test installation, did a rerun, and was able to reproduce your issue.

  • :index must be run first, to initialize the Elastic/OpenSearch index and ingest pipeline
  • if you have nothing to index, as with an empty installation without the files plugin, the index won't be configured properly. I did not investigate further, but I guess it is related to index mapping:dynamic; in other words, some data might need to be ingested first for OpenSearch to determine the types of fields not initially included in the mapping.

when installing the fulltext search files plugin as well, before running :index, the error disappears
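
The order of operations described in this comment could be scripted roughly as follows. This is a sketch under assumptions: the app id `files_fulltextsearch` is taken to be the files plugin meant above, the default `OCC` invocation copies the one used earlier in this thread, and the function name `run_fts_selftest` is illustrative.

```shell
#!/bin/sh
# Command used to invoke occ; override OCC for other setups.
OCC=${OCC:-"sudo -u www-data php occ"}

# Runs the self-test only after a content provider exists and the index
# has been built, stopping at the first failing step.
run_fts_selftest() {
    # ASSUMPTION: files_fulltextsearch is the "file plugin" referred to above.
    # $OCC is intentionally unquoted so that it splits into command and arguments.
    $OCC app:enable files_fulltextsearch &&
    $OCC fulltextsearch:index &&
    $OCC fulltextsearch:test
}
```

Setting `OCC=echo` before calling `run_fts_selftest` prints the three commands instead of executing them, which is a cheap way to review the sequence first.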

7 participants