
Commit

Merge pull request #475 from deeppavlov/dev
Release v1.5.1
dilyararimovna authored May 29, 2023
2 parents 9b2708f + 633cd43 commit 528e063
Showing 211 changed files with 786 additions and 314 deletions.
4 changes: 2 additions & 2 deletions .env
@@ -20,15 +20,15 @@ TEXT_QA_URL=http://text-qa:8078/model
BADLIST_ANNOTATOR_URL=http://badlisted-words:8018/badlisted_words_batch
COMET_ATOMIC_SERVICE_URL=http://comet-atomic:8053/comet
COMET_CONCEPTNET_SERVICE_URL=http://comet-conceptnet:8065/comet
-MASKED_LM_SERVICE_URL=http://masked-lm:8088/respond
+MASKED_LM_SERVICE_URL=http://masked-lm:8102/respond
DP_WIKIDATA_URL=http://wiki-parser:8077/model
DP_ENTITY_LINKING_URL=http://entity-linking:8075/model
KNOWLEDGE_GROUNDING_SERVICE_URL=http://knowledge-grounding:8083/respond
WIKIDATA_DIALOGUE_SERVICE_URL=http://wikidata-dial-service:8092/model
NEWS_API_ANNOTATOR_URL=http://news-api-annotator:8112/respond
WIKI_FACTS_URL=http://wiki-facts:8116/respond
FACT_RANDOM_SERVICE_URL=http://fact-random:8119/respond
-INFILLING_SERVICE_URL=http://infilling:8122/respond
+INFILLING_SERVICE_URL=http://infilling:8106/respond
DIALOGPT_CONTINUE_SERVICE_URL=http://dialogpt:8125/continue
PROMPT_STORYGPT_SERVICE_URL=http://prompt-storygpt:8127/respond
STORYGPT_SERVICE_URL=http://storygpt:8126/respond
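These service URLs follow the pattern `http://<container-name>:<port>/<endpoint>`, so a port change in `.env` only takes effect if the service itself moves to the same port. A minimal sketch of the compose-side wiring that has to agree with the new `masked-lm` URL (abbreviated; the full definitions appear in the compose hunks later in this diff):

```yaml
# Sketch: the three compose settings that must match
# MASKED_LM_SERVICE_URL=http://masked-lm:8102/respond
masked-lm:
  build:
    args:
      SERVICE_PORT: 8102                 # build arg the service image listens on
  command: flask run -h 0.0.0.0 -p 8102  # Flask bound to the same port
  ports:
    - 8102:8102                          # host:container mapping for dev access
```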
1 change: 1 addition & 0 deletions MODELS.md
@@ -11,3 +11,4 @@ Here you may find a list of models that currently available for use in Generativ
| Open-Assistant Pythia 12B | transformers-lm-oasst12b | [link](https://huggingface.co/OpenAssistant/pythia-12b-sft-v8-7k-steps) | yes | 12B | 26GB (half-precision) | 5,120 tokens | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-4 | openai-api-gpt4 | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 8,192 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
| GPT-4 32K | openai-api-gpt4-32k | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 32,768 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. Same capabilities as the base gpt-4 model but with 4x the context length. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
+| GPT-JT 6B | transformers-lm-gptjt | [link](https://huggingface.co/togethercomputer/GPT-JT-6B-v1) | yes | 6B | 26GB | 2,048 tokens | An open-source English-only large language model which was fine-tuned for instruction following but is NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. |
34 changes: 18 additions & 16 deletions README.md
@@ -260,22 +260,24 @@ Dream Architecture is presented in the following image:
| Wiki Facts | 1.7 GB RAM | model that extracts related facts from Wikipedia and WikiHow pages |

## Services
| Name | Requirements | Description |
|------|--------------|-------------|
| DialoGPT | 1.2 GB RAM, 2.1 GB GPU | generative service based on a Transformers generative model; the model is set in the docker compose argument `PRETRAINED_MODEL_NAME_OR_PATH` (for example, `microsoft/DialoGPT-small` with 0.2-0.5 sec response time on GPU) |
| DialoGPT Persona-based | 1.2 GB RAM, 2.1 GB GPU | generative service based on a Transformers generative model; the model was pre-trained on the PersonaChat dataset to generate a response conditioned on several sentences of the socialbot's persona |
| Image Captioning | 4 GB RAM, 5.4 GB GPU | creates a text representation of a received image |
| Infilling | 1 GB RAM, 1.2 GB GPU | (turned off but the code is available) generative service based on an Infilling model; for the given utterance, returns an utterance where `_` tokens in the original text are replaced with generated tokens |
| Knowledge Grounding | 2 GB RAM, 2.1 GB GPU | generative service based on the BlenderBot architecture, providing a response to the context while taking into account an additional text paragraph |
| Masked LM | 1.1 GB RAM, 1 GB GPU | (turned off but the code is available) |
| Seq2seq Persona-based | 1.5 GB RAM, 1.5 GB GPU | generative service based on a Transformers seq2seq model; the model was pre-trained on the PersonaChat dataset to generate a response conditioned on several sentences of the socialbot's persona |
| Sentence Ranker | 1.2 GB RAM, 2.1 GB GPU | ranking model given as `PRETRAINED_MODEL_NAME_OR_PATH` which, for a pair of sentences, returns a float score of correspondence |
| StoryGPT | 2.6 GB RAM, 2.15 GB GPU | generative service based on a fine-tuned GPT-2; for the given set of keywords, returns a short story using those keywords |
| GPT-3.5 | 100 MB RAM | generative service based on the OpenAI API; the model is set in the docker compose argument `PRETRAINED_MODEL_NAME_OR_PATH` (in particular, in this service, `text-davinci-003` is used) |
| ChatGPT | 100 MB RAM | generative service based on the OpenAI API; the model is set in the docker compose argument `PRETRAINED_MODEL_NAME_OR_PATH` (in particular, in this service, `gpt-3.5-turbo` is used) |
| Prompt StoryGPT | 3 GB RAM, 4 GB GPU | generative service based on a fine-tuned GPT-2; for the given topic represented by one noun, returns a short story on that topic |
| GPT-J 6B | 1.5 GB RAM, 24.2 GB GPU | generative service based on a Transformers generative model; the model is set in the docker compose argument `PRETRAINED_MODEL_NAME_OR_PATH` (in particular, in this service, the [GPT-J model](https://huggingface.co/EleutherAI/gpt-j-6B) is used) |
| BLOOMZ 7B | 2.5 GB RAM, 29 GB GPU | generative service based on a Transformers generative model; the model is set in the docker compose argument `PRETRAINED_MODEL_NAME_OR_PATH` (in particular, in this service, the [BLOOMZ-7b1 model](https://huggingface.co/bigscience/bloomz-7b1) is used) |
| GPT-JT 6B | 2.5 GB RAM, 25.1 GB GPU | generative service based on a Transformers generative model; the model is set in the docker compose argument `PRETRAINED_MODEL_NAME_OR_PATH` (in particular, in this service, the [GPT-JT model](https://huggingface.co/togethercomputer/GPT-JT-6B-v1) is used) |
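For the Transformers-based services in this table, the model is selected entirely through the `PRETRAINED_MODEL_NAME_OR_PATH` docker compose build argument. A hedged sketch of what such a service definition looks like, assuming the `transformers-lm-gptjt` name and port `8161` referenced elsewhere in this release (the exact block is not shown in this excerpt, and the `./services/transformers_lm/` context path is an assumption):

```yaml
transformers-lm-gptjt:
  build:
    context: ./services/transformers_lm/   # assumed location of the shared service code
    args:
      SERVICE_PORT: 8161
      SERVICE_NAME: transformers_lm_gptjt
      PRETRAINED_MODEL_NAME_OR_PATH: togethercomputer/GPT-JT-6B-v1  # swap to change the model
  command: flask run -h 0.0.0.0 -p 8161
  environment:
    - CUDA_VISIBLE_DEVICES=0   # GPU services in this repo pin a device this way
    - FLASK_APP=server
  ports:
    - 8161:8161
```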


## Skills
| Name | Requirements | Description |
@@ -9,13 +9,13 @@ compose:
CONFIG: fact_retrieval_rus.json
COMMIT: c8264bf82eaa3ed138395ab68f71d47a4175f2fc
TOP_N: 20
-SERVICE_PORT: 8130
+SERVICE_PORT: 8110
SRC_DIR: annotators/fact_retrieval_rus
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
context: ./
dockerfile: annotators/fact_retrieval_rus/Dockerfile
-command: flask run -h 0.0.0.0 -p 8130
+command: flask run -h 0.0.0.0 -p 8110
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
@@ -29,5 +29,5 @@ compose:
- ./annotators/fact_retrieval_rus:/src
- ~/.deeppavlov:/root/.deeppavlov
ports:
-- 8130:8130
+- 8110:8110
proxy: null
4 changes: 3 additions & 1 deletion annotators/spacy_annotator/requirements.txt
@@ -7,4 +7,6 @@ spacy==3.2.0
typer==0.4.1
click<=8.0.4
jinja2<=3.0.3
-Werkzeug<=2.0.3
+Werkzeug<=2.0.3
+typing-inspect==0.8.0
+typing_extensions==4.5.0
@@ -8,12 +8,12 @@ compose:
build:
context: ./annotators/toxic_classification_ru/
args:
-SERVICE_PORT: 8126
+SERVICE_PORT: 8118
PRETRAINED_MODEL_NAME_OR_PATH: s-nlp/russian_toxicity_classifier
LANGUAGE: RU
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
-command: flask run -h 0.0.0.0 -p 8126
+command: flask run -h 0.0.0.0 -p 8118
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
@@ -27,5 +27,5 @@ compose:
- ./annotators/toxic_classification_ru:/src
- ~/.deeppavlov/cache:/root/.cache
ports:
-- 8126:8126
+- 8118:8118
proxy: null
4 changes: 2 additions & 2 deletions assistant_dists/dream/dev.yml
@@ -446,14 +446,14 @@ services:
- "./services/infilling:/src"
- "~/.deeppavlov:/root/.deeppavlov"
ports:
-- 8139:8139
+- 8106:8106
masked-lm:
volumes:
- "./services/masked_lm:/src"
- "./common:/src/common"
- "~/.deeppavlov/cache:/root/.cache"
ports:
-- 8141:8141
+- 8102:8102
dff-template-skill:
volumes:
- "./skills/dff_template_skill:/src"
8 changes: 4 additions & 4 deletions assistant_dists/dream/docker-compose.override.yml
@@ -1355,9 +1355,9 @@ services:
build:
context: ./services/infilling/
args:
-SERVICE_PORT: 8139
+SERVICE_PORT: 8106
SERVICE_NAME: infilling
-command: flask run -h 0.0.0.0 -p 8139
+command: flask run -h 0.0.0.0 -p 8106
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
@@ -1373,10 +1373,10 @@ services:
build:
context: ./services/masked_lm/
args:
-SERVICE_PORT: 8141
+SERVICE_PORT: 8102
SERVICE_NAME: masked_lm
PRETRAINED_MODEL_NAME_OR_PATH: "bert-base-uncased"
-command: flask run -h 0.0.0.0 -p 8141
+command: flask run -h 0.0.0.0 -p 8102
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
2 changes: 1 addition & 1 deletion assistant_dists/dream_multimodal/dev.yml
@@ -75,5 +75,5 @@ services:
- "./skills/dff_image_skill:/src"
- "./common:/src/common"
ports:
-- 8188:8188
+- 8124:8124
version: "3.7"
6 changes: 3 additions & 3 deletions assistant_dists/dream_multimodal/docker-compose.override.yml
@@ -4,7 +4,7 @@ services:
environment:
WAIT_HOSTS: "dff-program-y-skill:8008, sentseg:8011, convers-evaluation-selector:8009,
dff-intent-responder-skill:8012, intent-catcher:8014, badlisted-words:8018,
-spelling-preprocessing:8074, dialogpt:8125, sentence-ranker:8128, image-captioning:8123, dff-image-skill:8188"
+spelling-preprocessing:8074, dialogpt:8125, sentence-ranker:8128, image-captioning:8123, dff-image-skill:8124"
WAIT_HOSTS_TIMEOUT: ${WAIT_TIMEOUT:-1200}
HIGH_PRIORITY_INTENTS: 1
RESTRICTION_FOR_SENSITIVE_CASE: 1
@@ -216,12 +216,12 @@ services:
env_file: [.env]
build:
args:
-SERVICE_PORT: 8188
+SERVICE_PORT: 8124
SERVICE_NAME: dff_image_skill
LANGUAGE: EN
context: .
dockerfile: ./skills/dff_image_skill/Dockerfile
-command: gunicorn --workers=1 server:app -b 0.0.0.0:8188 --reload
+command: gunicorn --workers=1 server:app -b 0.0.0.0:8124 --reload
deploy:
resources:
limits:
2 changes: 1 addition & 1 deletion assistant_dists/dream_multimodal/pipeline_conf.json
@@ -318,7 +318,7 @@
"connector": {
"protocol": "http",
"timeout": 2.0,
"url": "http://dff-image-skill:8188/respond"
"url": "http://dff-image-skill:8124/respond"
},
"dialog_formatter": "state_formatters.dp_formatters:dff_image_skill_formatter",
"response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service",
8 changes: 4 additions & 4 deletions assistant_dists/dream_persona_prompted/README.md
@@ -76,10 +76,10 @@ If one wants to create a new prompted distribution (distribution containing prom
to an unused one.
3. Choose the generative service to be used. For that one needs to:
1. in `dream/assistant_dists/dream_custom_prompted/` folder in files `docker-compose.override.yml`, `dev.yml`
-replace `transformers-lm-gptj` container description to a new one.
+replace the `transformers-lm-gptjt` container description with a new one.
In particular, one may replace in `PRETRAINED_MODEL_NAME_OR_PATH` parameter
-a utilized Language Model (LM) `GPT-J` with another one from `Transformers` library.
-Please change a port (`8130` for `transformers-lm-gptj`) to unused ones.
+a utilized Language Model (LM) `GPT-JT` with another one from the `Transformers` library.
+Please change the port (`8161` for `transformers-lm-gptjt`) to an unused one.
2. in all prompted skills' containers descriptions change `GENERATIVE_SERVICE_URL` to your generative model.
Take into account that the service name is constructed as `http://<container-name>:<port>/<endpoint>`.
4. For each prompted skill, one needs to create an input state formatter. To do that, one needs to:
@@ -99,7 +99,7 @@ If one wants to create a new prompted distribution (distribution containing prom
"connector": {
"protocol": "http",
"timeout": 4.5,
"url": "http://dff-dream-persona-gpt-j-prompted-skill:8134/respond"
"url": "http://dff-dream-persona-gpt-jt-prompted-skill:8134/respond"
},
"dialog_formatter": {
"name": "state_formatters.dp_formatters:dff_prompted_skill_formatter",
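To make the README's steps 3.i-3.ii above concrete: a hedged sketch of a prompted skill's compose entry pointing at the GPT-JT generative service. The skill name and port `8135` are illustrative placeholders; the `transformers-lm-gptjt` name and port `8161` come from the instructions above, and the URL follows the documented `http://<container-name>:<port>/<endpoint>` pattern.

```yaml
dff-my-persona-prompted-skill:          # placeholder skill name
  build:
    args:
      SERVICE_PORT: 8135                # illustrative; must be an unused port
      SERVICE_NAME: dff_my_persona_prompted_skill
      GENERATIVE_SERVICE_URL: http://transformers-lm-gptjt:8161/respond  # <container-name>:<port>/<endpoint>
  command: gunicorn --workers=1 server:app -b 0.0.0.0:8135 --reload
  ports:
    - 8135:8135
```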
2 changes: 1 addition & 1 deletion assistant_dists/dream_persona_prompted/cpu.yml
@@ -8,7 +8,7 @@ services:
environment:
DEVICE: cpu
CUDA_VISIBLE_DEVICES: ""
-transformers-lm-gptj:
+transformers-lm-gptjt:
environment:
DEVICE: cpu
CUDA_VISIBLE_DEVICES: ""

