Add analyzer script
laysauchoa committed Oct 20, 2022
1 parent 57b5839 commit b217de0
Showing 3 changed files with 49 additions and 5 deletions.
9 changes: 5 additions & 4 deletions README.rst
@@ -1,8 +1,8 @@
.. image:: /images/opensearch-python.png
   :alt: OpenSearch Python dive in

-OpenSearch® search queries with Python
-======================================
+OpenSearch® with Python
+========================

This repository contains code examples related to `OpenSearch with Python queries <https://developer.aiven.io/docs/products/opensearch/howto/opensearch-search-and-python.html>`_.

@@ -18,8 +18,9 @@ Dataset
-------
You can download the `Kaggle recipe dataset <https://www.kaggle.com/hugodarwood/epirecipes?select=full_format_recipes.json>`_, and save the file as ``recipes.json`` in this current folder.

-Search examples
----------------
+OpenSearch® search queries with Python
+---------------------------------------

The available search options can be found by using the `--help` command::

    python search.py --help
43 changes: 43 additions & 0 deletions analyzer.py
@@ -0,0 +1,43 @@
"""
This file contains code samples for analyzers.
Run the following to print the tokens each built-in analyzer produces:
.. code-block:: shell
    python analyzer.py
"""
import typer
from rich.console import Console

from config import INDEX_NAME, client
from helpers import log_titles
from typing import List

app = typer.Typer(rich_markup_mode="rich")
console = Console()


analyzers = [
    "standard",
    "simple",
    "whitespace",
    "stop",
    "keyword",
    "pattern",
    "fingerprint",
]


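def generate_tokens(analyzer, text):
    # Run `text` through the named built-in analyzer using the client from config.
    res = client.indices.analyze(body={"analyzer": analyzer, "text": [text]})
    # Keep only the token strings from the response.
    tokens = [sample["token"] for sample in res["tokens"]]
    print(f"{analyzer} \n")
    print(f"Tokens: {tokens} \n")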


if __name__ == "__main__":
    text = "Hello my Name is Laysa.12345"
    for analyzer in analyzers:
        generate_tokens(analyzer, text)
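
Review note: `generate_tokens` prints its result, so it can only be checked by hand against a live cluster. A minimal sketch of one alternative, assuming the client is passed in as an argument and the tokens are returned instead of printed. The names `collect_tokens` and `make_stub_client` are hypothetical and not part of this commit; the stub only mimics the two response keys the script already reads (`"tokens"` and `"token"`) and does no real analysis.

```python
from types import SimpleNamespace


def collect_tokens(client, analyzer, text):
    """Return the token strings that `analyzer` produces for `text`."""
    res = client.indices.analyze(body={"analyzer": analyzer, "text": [text]})
    return [sample["token"] for sample in res["tokens"]]


def make_stub_client(canned_tokens):
    """Hypothetical stand-in for the real client; it replays canned tokens."""

    def analyze(body):
        # Mirror only the response keys that collect_tokens reads.
        return {"tokens": [{"token": token} for token in canned_tokens]}

    return SimpleNamespace(indices=SimpleNamespace(analyze=analyze))


# Exercise the function without a running OpenSearch service.
stub = make_stub_client(["hello", "my", "name"])
tokens = collect_tokens(stub, "simple", "Hello my Name")
print(tokens)
```

With the real `client` from `config`, the same function could be called in the loop over `analyzers`, leaving the printing to the caller.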
2 changes: 1 addition & 1 deletion search.py
@@ -10,7 +10,7 @@
import typer
from rich.console import Console

-from config import INDEX_NAME, SERVICE_URI, client
+from config import INDEX_NAME, client
from helpers import log_titles
from typing import List
