Merge pull request #192 from deepmipt/dev

Release v0.2.2
deeppavlov · Aug 24, 2022 · 2b50d6d
2 parents 82a19df + 0304479
commit 2b50d6d
Showing 17 changed files with 459 additions and 69 deletions.
diff --git a/.gitignore b/.gitignore
@@ -140,3 +140,5 @@ network.yml
 
 kubernetes/models
 docker-compose-one-replica.yml
+
+*.pem
diff --git a/README.md b/README.md
@@ -118,17 +118,32 @@ AGENT_PORT=4242 docker-compose -f docker-compose.yml -f assistant_dists/dream/do
 ```
 
 ### Let's chat
+
+DeepPavlov Agent provides several options for interaction: a command-line interface, an HTTP API, and a Telegram bot.
+
+#### CLI
 In a separate terminal tab run:
 
 ```
-docker-compose exec agent python -m deeppavlov_agent.run -pl assistant_dists/dream/pipeline_conf.json
+docker-compose exec agent python -m deeppavlov_agent.run agent.channel=cmd agent.pipeline_config=assistant_dists/dream/pipeline_conf.json
 ```
 
 Enter your username and have a chat with Dream!
 
+#### HTTP API
+Once you've started the bot, DeepPavlov's Agent API will run on `http://localhost:4242`.
+You can learn about the API from the [DeepPavlov Agent Docs](https://deeppavlov-agent.readthedocs.io/en/latest/intro/overview.html#http-api-server).
+
+A basic chat interface will be available at `http://localhost:4242/chat`.
 
-### Let's talk via HTTP API
-Once you've started the bot, DeepPavlov's Agent API will run on `http://localhost:4242'. You can learn about its API from the [DeepPavlov Agent Docs](https://deeppavlov-agent.readthedocs.io/en/latest/intro/overview.html#http-api-server).
+#### Telegram Bot
+Currently, the Telegram bot is deployed **instead** of the HTTP API.
+To enable it, edit the `agent` `command` definition in `docker-compose.override.yml`:
+```
+agent:
+  command: sh -c 'bin/wait && python -m deeppavlov_agent.run agent.channel=telegram agent.telegram_token=<TELEGRAM_BOT_TOKEN> agent.pipeline_config=assistant_dists/dream/pipeline_conf.json'
+```
+**NOTE:** treat your Telegram token as a secret and do not commit it to public repositories!
 
 # Configuration and proxy usage
 Dream uses several docker-compose configuration files:
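
For the HTTP API section added in the README hunk above, a minimal Python client could be sketched as follows. It assumes the agent accepts a JSON POST at the root URL with `user_id` and `payload` fields, as described in the linked DeepPavlov Agent docs; the helper names `build_request` and `send` are hypothetical, and the field names should be checked against that documentation before relying on them.

```python
import json
import urllib.request

AGENT_URL = "http://localhost:4242"  # address given in the README hunk


def build_request(user_id: str, text: str, url: str = AGENT_URL) -> urllib.request.Request:
    """Build a JSON POST request for the agent.

    The "user_id" and "payload" field names are an assumption taken from
    the DeepPavlov Agent docs, not from this diff.
    """
    body = json.dumps({"user_id": user_id, "payload": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(user_id: str, text: str, timeout: float = 30.0) -> dict:
    """Send one utterance to a running agent and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(user_id, text), timeout=timeout) as resp:
        return json.load(resp)


# Usage, with the agent running locally:
#   reply = send("test_user", "Hello!")
#   print(reply)
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running agent.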

diff --git a/README_ru.md b/README_ru.md
@@ -115,19 +115,32 @@ AGENT_PORT=4242 docker-compose -f docker-compose.yml -f assistant_dists/dream/do
 ```
 
 ### Usage
-In a separate terminal tab run:
+
+DeepPavlov Agent provides 3 options for interaction: a command-line interface, an API, and a Telegram bot.
+
+#### CLI
+In a separate terminal tab, run:
 
 ```
-docker-compose exec agent python -m deeppavlov_agent.run -pl assistant_dists/dream/pipeline_conf.json
+docker-compose exec agent python -m deeppavlov_agent.run agent.channel=cmd agent.pipeline_config=assistant_dists/dream/pipeline_conf.json
 ```
 
-Enter your username and you can start chatting away with Dream!
+Enter your username and you can start talking with Dream!
+
+#### HTTP API
+Once you have started the bot, the Agent API becomes available at `http://localhost:4242`.
+You can learn more about the API in the [DeepPavlov Agent Docs](https://deeppavlov-agent.readthedocs.io/en/latest/intro/overview.html#http-api-server).
 
+The browser chat interface in DeepPavlov Agent is available at `http://localhost:4242/chat`.
 
-### Usage with the HTTP API
-Once you have started the bot, DeepPavlov's Agent API runs at `http://localhost:4242'.
-The default browser interface in DeepPavlov Agent is available at `http://localhost:4242/chat'.
-You can learn more about the API in the [DeepPavlov Agent Docs](https://deeppavlov-agent.readthedocs.io/en/latest/intro/overview.html#http-api-server).
+#### Telegram Bot
+Currently, the Telegram bot is deployed **instead** of the HTTP API.
+Edit the `agent` `command` definition inside `docker-compose.override.yml`:
+```
+agent:
+  command: sh -c 'bin/wait && python -m deeppavlov_agent.run agent.channel=telegram agent.telegram_token=<TELEGRAM_BOT_TOKEN> agent.pipeline_config=assistant_dists/dream/pipeline_conf.json'
+```
+**IMPORTANT:** Do not store the bot token in a public repository!
 
 # Configuration and proxy usage
 Dream uses several configuration files for docker-compose:

diff --git a/annotators/NER_deeppavlov/Dockerfile b/annotators/NER_deeppavlov/Dockerfile
@@ -16,5 +16,6 @@ COPY $SRC_DIR /src
 WORKDIR /src
 
 RUN python -m deeppavlov install $CONFIG
+RUN python -m deeppavlov download $CONFIG
 
 CMD gunicorn  --workers=1 --timeout 500 server:app -b 0.0.0.0:8021
diff --git a/annotators/NER_deeppavlov/ner_case_agnostic_multilingual_bert_base_extended.json b/annotators/NER_deeppavlov/ner_case_agnostic_multilingual_bert_base_extended.json
@@ -0,0 +1,157 @@
+{
+    "dataset_reader": {
+        "class_name": "conll2003_reader",
+        "data_path": "{DOWNLOADS_PATH}/conll2003/",
+        "dataset_name": "conll2003",
+        "provide_pos": false
+    },
+    "dataset_iterator": {
+        "class_name": "data_learning_iterator"
+    },
+    "chainer": {
+        "in": [
+            "x"
+        ],
+        "in_y": [
+            "y"
+        ],
+        "pipe": [
+            {
+                "class_name": "torch_transformers_ner_preprocessor",
+                "vocab_file": "{TRANSFORMER}",
+                "do_lower_case": false,
+                "max_seq_length": 512,
+                "max_subword_length": 15,
+                "token_masking_prob": 0.0,
+                "in": [
+                    "x"
+                ],
+                "out": [
+                    "x_tokens",
+                    "x_subword_tokens",
+                    "x_subword_tok_ids",
+                    "startofword_markers",
+                    "attention_mask"
+                ]
+            },
+            {
+                "id": "tag_vocab",
+                "class_name": "simple_vocab",
+                "unk_token": [
+                    "O"
+                ],
+                "pad_with_zeros": true,
+                "save_path": "{MODEL_PATH}/tag.dict",
+                "load_path": "{MODEL_PATH}/tag.dict",
+                "fit_on": [
+                    "y"
+                ],
+                "in": [
+                    "y"
+                ],
+                "out": [
+                    "y_ind"
+                ]
+            },
+            {
+                "class_name": "torch_transformers_sequence_tagger",
+                "n_tags": "#tag_vocab.len",
+                "pretrained_bert": "{TRANSFORMER}",
+                "attention_probs_keep_prob": 0.5,
+                "return_probas": false,
+                "use_crf": true,
+                "encoder_layer_ids": [
+                    -1
+                ],
+                "optimizer": "AdamW",
+                "optimizer_parameters": {
+                    "lr": 2e-05,
+                    "weight_decay": 1e-06,
+                    "betas": [
+                        0.9,
+                        0.999
+                    ],
+                    "eps": 1e-06
+                },
+                "clip_norm": 1.0,
+                "min_learning_rate": 1e-07,
+                "learning_rate_drop_patience": 20,
+                "learning_rate_drop_div": 1.5,
+                "load_before_drop": true,
+                "save_path": "{MODEL_PATH}/model",
+                "load_path": "{MODEL_PATH}/model",
+                "in": [
+                    "x_subword_tok_ids",
+                    "attention_mask",
+                    "startofword_markers"
+                ],
+                "in_y": [
+                    "y_ind"
+                ],
+                "out": [
+                    "y_pred_ind",
+                    "probas"
+                ]
+            },
+            {
+                "ref": "tag_vocab",
+                "in": [
+                    "y_pred_ind"
+                ],
+                "out": [
+                    "y_pred"
+                ]
+            }
+        ],
+        "out": [
+            "x_tokens",
+            "y_pred"
+        ]
+    },
+    "train": {
+        "epochs": 50,
+        "batch_size": 100,
+        "metrics": [
+            {
+                "name": "ner_f1",
+                "inputs": [
+                    "y",
+                    "y_pred"
+                ]
+            },
+            {
+                "name": "ner_token_f1",
+                "inputs": [
+                    "y",
+                    "y_pred"
+                ],
+                "print_results": true
+            }
+        ],
+        "validation_patience": 100,
+        "val_every_n_batches": 50,
+        "log_every_n_batches": 50,
+        "show_examples": false,
+        "pytest_max_batches": 2,
+        "pytest_batch_size": 8,
+        "evaluation_targets": [
+            "test"
+        ],
+        "class_name": "torch_trainer"
+    },
+    "metadata": {
+        "variables": {
+            "ROOT_PATH": "~/.deeppavlov",
+            "DOWNLOADS_PATH": "~/.deeppavlov/downloads",
+            "MODELS_PATH": "~/.deeppavlov/models",
+            "TRANSFORMER": "bert-base-multilingual-cased",
+            "MODEL_PATH": "{MODELS_PATH}/ner/mbert_dream_with_numbers_rus_ext"
+        },
+        "download": [
+            {
+                "url": "http://files.deeppavlov.ai/v1/ner/mbert_dream_with_numbers.tar.gz",
+                "subdir": "{MODEL_PATH}"
+            }
+        ]
+    }
+}
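
The `{DOWNLOADS_PATH}`, `{MODEL_PATH}` and `{TRANSFORMER}` placeholders in the config above refer to entries under `metadata.variables`. As an illustration only (this is not DeepPavlov's own config parser), the sketch below shows how such placeholders could expand. The function name `resolve_variables` is hypothetical, and the sketch assumes each variable only references variables defined before it.

```python
# Variables copied from the "metadata.variables" section of the config.
variables = {
    "ROOT_PATH": "~/.deeppavlov",
    "DOWNLOADS_PATH": "~/.deeppavlov/downloads",
    "MODELS_PATH": "~/.deeppavlov/models",
    "TRANSFORMER": "bert-base-multilingual-cased",
    "MODEL_PATH": "{MODELS_PATH}/ner/mbert_dream_with_numbers_rus_ext",
}


def resolve_variables(raw: dict) -> dict:
    """Expand {NAME} references in order of definition.

    Hypothetical helper: raises KeyError if a value refers to a
    variable that has not been defined earlier in the mapping.
    """
    resolved = {}
    for name, value in raw.items():
        resolved[name] = value.format(**resolved)
    return resolved


resolved = resolve_variables(variables)

# Expand a path used by the "tag_vocab" component of the pipeline.
tag_dict_path = "{MODEL_PATH}/tag.dict".format(**resolved)
print(tag_dict_path)
# ~/.deeppavlov/models/ner/mbert_dream_with_numbers_rus_ext/tag.dict
```

The leading `~` is left unexpanded here; `os.path.expanduser` would turn it into an absolute path on a given machine.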