Merge pull request #107 from alan-turing-institute/openai-multimodal
Refactor multimodal prompting with VertexAI and Gemini models
rchan26 authored Oct 21, 2024
2 parents 2a9ee06 + 7ba488c commit 771be5f
Showing 43 changed files with 1,117 additions and 280 deletions.
9 changes: 9 additions & 0 deletions examples/gemini/README.md
@@ -9,6 +9,15 @@ We provide an example experiment file in [data/input/gemini-example.jsonl](https
prompto_run_experiment --file data/input/gemini-example.jsonl --max-queries 30
```

## Multimodal prompting

Multimodal prompting is available with the Gemini API. We provide an example in the [Multimodal prompting with Vertex AI notebook](./gemini-multimodal.ipynb) and an example experiment file in [data/input/gemini-multimodal-example.jsonl](https://github.com/alan-turing-institute/prompto/blob/main/examples/gemini/data/input/gemini-multimodal-example.jsonl). You can run it with the following command:
```bash
prompto_run_experiment --file data/input/gemini-multimodal-example.jsonl --max-queries 30
```

## Environment variables

To run the experiment, you will need to set the following environment variables first:
```bash
export GEMINI_API_KEY=<YOUR-GEMINI-KEY>
3 changes: 3 additions & 0 deletions examples/gemini/data/input/gemini-multimodal-example.jsonl
@@ -0,0 +1,3 @@
{"id": 0, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": ["describe what is happening in this image", {"type": "image", "media": "pantani_giro.jpg"}]}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}}
{"id": 1, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": [{"type": "image", "media": "mortadella.jpg"}, "what is this?"]}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}}
{"id": 2, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": ["what is in this image?", {"type": "image", "media": "pantani_giro.jpg"}]}, {"role": "model", "parts": "This is image shows a group of cyclists."}, {"role": "user", "parts": "are there any notable cyclists in this image? what are their names?"}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}}
Binary file added examples/gemini/data/media/mortadella.jpg
Binary file added examples/gemini/data/media/pantani_giro.jpg
@@ -0,0 +1,6 @@
{"id": 0, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": "How does technology impact us?", "safety_filter": "none", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}, "timestamp_sent": "18-10-2024-09-57-31", "response": "Technology impacts us in profound and multifaceted ways, shaping our lives in every aspect from communication and work to entertainment and healthcare. Here's a breakdown of key impacts:\n\n**Positive Impacts:**\n\n* **Enhanced Communication:** Technology has revolutionized communication. We can connect with people across the globe instantly through video calls, instant messaging, and social media. This fosters stronger relationships, facilitates business, and enables global collaboration.\n* **Increased Efficiency and Productivity:** Technology automates tasks, streamlines processes,", "safety_attributes": {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "MAX_TOKENS"}}
{"id": 1, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "How does technology impact us?", "safety_filter": "few", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}, "timestamp_sent": "18-10-2024-09-57-33", "response": "**Positive Impacts of Technology**\n\n* **Enhanced Communication:** Social media, messaging apps, and video conferencing facilitate seamless connection and information exchange globally.\n* **Increased Accessibility:** Technology provides equal access to education, healthcare, and information for individuals regardless of location or circumstances.\n* **Improved Productivity:** Automation, artificial intelligence, and data analytics streamline processes, improve efficiency, and reduce time consumption.\n* **Medical Advancements:** Medical devices, surgical robots, and telemedicine enhance patient care, improve", "safety_attributes": {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "MAX_TOKENS"}}
{"id": 2, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": ["How does international trade create jobs?", "I want a joke about that"], "safety_filter": "some", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}, "timestamp_sent": "18-10-2024-09-57-35", "response": ["International trade creates jobs in a multitude of ways, boosting both domestic and global economies. Here's a breakdown of how it works:\n\n**1. Increased Production and Export Jobs:**\n\n* **Specialization and Comparative Advantage:** Countries specialize in producing goods and services where they have a comparative advantage (lower opportunity cost). This leads to increased efficiency and higher production levels.\n* **Export Jobs:** Producers of these specialized goods and services find new markets abroad, leading to increased demand for their products", "Why did the economist get a job in international trade?\n\nBecause he was tired of being a \"jobless\" economist! \ud83d\ude1c \n"], "safety_attributes": [{"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "MAX_TOKENS"}, {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "2", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "STOP"}]}
{"id": 3, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "system", "parts": "You are a helpful assistant designed to answer questions briefly."}, {"role": "user", "parts": "What efforts are being made to keep the hakka language alive?"}], "safety_filter": "default", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}, "timestamp_sent": "18-10-2024-09-57-37", "response": "Efforts to preserve Hakka include language schools, cultural festivals, and online resources. \n", "safety_attributes": {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "STOP"}}
{"id": 4, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": "What efforts are being made to keep the hakka language alive?"}, {"role": "system", "parts": "You are a helpful assistant designed to answer questions briefly."}], "safety_filter": "most", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}, "timestamp_sent": "18-10-2024-09-57-39", "response": "TypeError - if api == 'gemini', then the prompt must be a str, list[str], or list[dict[str,str]] where the dictionary contains the keys 'role' and 'parts' only, and the values for 'role' must be one of 'user' or 'model', except for the first message in the list of dictionaries can be a system message with the key 'role' set to 'system'."}
{"id": 5, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "system", "parts": "You are a helpful assistant designed to answer questions briefly."}, {"role": "user", "parts": "Hello, I'm Bob and I'm 6 years old"}, {"role": "model", "parts": "Hi Bob, how may I assist you?"}, {"role": "user", "parts": "How old will I be next year?"}], "safety_filter": "most", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}, "timestamp_sent": "18-10-2024-09-57-41", "response": "You will be 7 years old next year! \n", "safety_attributes": {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "STOP"}}
@@ -0,0 +1,2 @@
18-10-2024, 09:57: Error (i=5, id=4): TypeError - if api == 'gemini', then the prompt must be a str, list[str], or list[dict[str,str]] where the dictionary contains the keys 'role' and 'parts' only, and the values for 'role' must be one of 'user' or 'model', except for the first message in the list of dictionaries can be a system message with the key 'role' set to 'system'.
18-10-2024, 09:57: Completed experiment: gemini-example.jsonl! Experiment processing time: 12.673 seconds, Average time per query: 2.112 seconds
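
The `TypeError` logged for `id=4` describes the role rule: messages must use the roles `user` or `model`, and only the first message may be a `system` message. The following is a rough Python sketch of that rule as worded in the error message. It is an illustration only, not prompto's actual validation code, and the function name `is_valid_gemini_prompt` is hypothetical. It deliberately does not inspect the type of `parts`, since the multimodal examples in this commit use lists there.

```python
def is_valid_gemini_prompt(prompt) -> bool:
    """Check a prompt against the rule quoted in the error message.

    Hypothetical sketch, not prompto's implementation. Treating an
    empty list as invalid is an assumption made here.
    """
    if isinstance(prompt, str):
        return True
    if not isinstance(prompt, list) or not prompt:
        return False
    if all(isinstance(item, str) for item in prompt):
        return True  # list[str] is accepted as-is
    for index, message in enumerate(prompt):
        # Each message must be a dict with exactly the keys 'role' and 'parts'.
        if not isinstance(message, dict) or set(message) != {"role", "parts"}:
            return False
        role = message["role"]
        allowed = {"user", "model"}
        if index == 0:
            allowed = allowed | {"system"}  # system only in first position
        if not isinstance(role, str) or role not in allowed:
            return False
    return True


system_first = [
    {"role": "system", "parts": "You are a helpful assistant."},
    {"role": "user", "parts": "Hello"},
]
system_second = list(reversed(system_first))

print(is_valid_gemini_prompt(system_first))   # True
print(is_valid_gemini_prompt(system_second))  # False
```

The second case mirrors the record with `id` 4, where the `system` message follows the `user` message and the experiment records an error instead of a model response.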

This file was deleted.

This file was deleted.

@@ -0,0 +1,3 @@
{"id": 0, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": ["describe what is happening in this image", {"type": "image", "media": "pantani_giro.jpg"}]}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}, "timestamp_sent": "21-10-2024-12-22-49", "response": "A group of cyclists are racing down a road. The cyclist in the lead is wearing a pink and yellow jersey and is riding a black bicycle. He is closely followed by a cyclist in a yellow and red jersey on a yellow and black bicycle, and a cyclist in a green and yellow jersey on a blue bicycle. The cyclist in the pink jersey is looking over his shoulder at the other cyclists, and the cyclists in the yellow and red and green and yellow jerseys are looking straight ahead. The cyclists are all in a single file line, and the road is paved and has a white line down the middle. There are a few other cyclists behind them, but they are not as clear in the image. There is a stone wall on the right side of the road, and some green grass and trees in the background.", "safety_attributes": {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "STOP"}}
{"id": 1, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": [{"type": "image", "media": "mortadella.jpg"}, "what is this?"]}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}, "timestamp_sent": "21-10-2024-12-22-51", "response": "This is Mortadella, an Italian cured pork sausage. It is known for its characteristic marbling of fat.", "safety_attributes": {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "STOP"}}
{"id": 2, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": ["what is in this image?", {"type": "image", "media": "pantani_giro.jpg"}]}, {"role": "model", "parts": "This is image shows a group of cyclists."}, {"role": "user", "parts": "are there any notable cyclists in this image? what are their names?"}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}, "timestamp_sent": "21-10-2024-12-22-53", "response": "The image features several notable cyclists, including:\n\n* **Miguel Indurain:** The cyclist in pink, wearing a yellow helmet and sunglasses, is the Spanish legend Miguel Indurain. He is a five-time winner of the Tour de France.\n* **Claudio Chiappucci:** The cyclist in the red, white, and yellow jersey is Claudio Chiappucci, an Italian cyclist known for his strong performances in the mountains. \n* **Marco Pantani:** The cyclist in the green and yellow jersey is Marco Pantani, an Italian climber who won the Tour de France in 1998 and the Giro d'Italia in 1998 and 1999.\n\nIt's important to note that this image is from the 1990s, and these cyclists are riding for their respective teams at the time. \n", "safety_attributes": {"HARM_CATEGORY_SEXUALLY_EXPLICIT": "1", "HARM_CATEGORY_HATE_SPEECH": "1", "HARM_CATEGORY_HARASSMENT": "1", "HARM_CATEGORY_DANGEROUS_CONTENT": "1", "blocked": "[False, False, False, False]", "finish_reason": "STOP"}}
@@ -0,0 +1,3 @@
{"id": 0, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": ["describe what is happening in this image", {"type": "image", "media": "pantani_giro.jpg"}]}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}}
{"id": 1, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": [{"type": "image", "media": "mortadella.jpg"}, "what is this?"]}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}}
{"id": 2, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": [{"role": "user", "parts": ["what is in this image?", {"type": "image", "media": "pantani_giro.jpg"}]}, {"role": "model", "parts": "This is image shows a group of cyclists."}, {"role": "user", "parts": "are there any notable cyclists in this image? what are their names?"}], "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 1000}}
@@ -0,0 +1 @@
21-10-2024, 12:22: Completed experiment: gemini-multimodal-example.jsonl! Experiment processing time: 8.451 seconds, Average time per query: 2.817 seconds
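
The timings in the log line are consistent with the average being the total processing time divided by the number of queries (8.451 / 3 ≈ 2.817). The following is a small Python sketch, assuming the log wording shown above stays fixed, that recovers the query count from such a line. The names `LOG_PATTERN` and `implied_query_count` are hypothetical.

```python
import re

# Matches the timing part of a "Completed experiment" log line as it
# appears in this commit; the exact wording is an assumption.
LOG_PATTERN = re.compile(
    r"Experiment processing time: (?P<total>[\d.]+) seconds, "
    r"Average time per query: (?P<average>[\d.]+) seconds"
)


def implied_query_count(log_line: str) -> int:
    """Estimate how many queries ran, as total time / average time."""
    match = LOG_PATTERN.search(log_line)
    if match is None:
        raise ValueError("no timing information found in log line")
    return round(float(match["total"]) / float(match["average"]))


log_line = (
    "21-10-2024, 12:22: Completed experiment: gemini-multimodal-example.jsonl! "
    "Experiment processing time: 8.451 seconds, "
    "Average time per query: 2.817 seconds"
)
print(implied_query_count(log_line))  # 3
```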