diff --git a/14-the-generative-ai-application-lifecycle/README.md b/14-the-generative-ai-application-lifecycle/README.md
index c62d01ba7..cf044dddc 100644
--- a/14-the-generative-ai-application-lifecycle/README.md
+++ b/14-the-generative-ai-application-lifecycle/README.md
@@ -23,11 +23,11 @@ With this, we need a new Paradigm to adapt this tool in a dynamic, with the corr
 
 ![LLMOps vs. MLOps comparison](./images/01-llmops-shift.png?WT.mc_id=academic-105485-koreys)
 
-Notice that in LLMOps, we are more focused in the App Developers, using integrations as a key point, using "Models-as-a-Service" and thinking in the following points for metrics.
+Notice that in LLMOps, we are more focused on the App Developers, using integrations as a key point, using "Models-as-a-Service" and thinking in the following points for metrics.
 
 - Quality: Response quality
 - Harm: Responsible AI
-- Honesty: Response groundness (Makes sense? It is correct?)
+- Honesty: Response groundedness (Makes sense? It is correct?)
 - Cost: Solution Budget
 - Latency: Avg. time for token response
 
@@ -37,11 +37,11 @@ First, to understand the lifecycle and the modifications, let's note the next in
 
 ![LLMOps infographic](./images/02-llmops.png?WT.mc_id=academic-105485-koreys)
 
-As you may note, this is different from the usual Lifecycles from MLOps. LLMs have many new requirements, as Prompting, different tecniques to improve quality (Fine-Tuning, RAG, Meta-Prompts), different assessment and responsability with responsible AI, lastly, new evaluation metrics (Quality, Harm, Honesty, Cost and Latency).
+As you may note, this is different from the usual Lifecycles from MLOps. LLMs have many new requirements, as Prompting, different techniques to improve quality (Fine-Tuning, RAG, Meta-Prompts), different assessment and responsability with responsible AI, lastly, new evaluation metrics (Quality, Harm, Honesty, Cost and Latency).
 
-For instance, take a look how we ideate. Using prompt engineering to experiment with various LLMs to explore possibilities to tests if their Hypothesis could be correct.
+For instance, take a look at how we ideate. Using prompt engineering to experiment with various LLMs to explore possibilities to test if their Hypothesis could be correct.
 
-Note that this is not linear, but integrated loops, iterative and with an overacrching cycle.
+Note that this is not linear, but integrated loops, iterative and with an overarching cycle.
 
 How could we explore those steps? Let's step into detail in how could we build a lifecycle.
 
@@ -81,7 +81,7 @@ Construct, from Proof-of-Concept(POC) until large scale applications with Prompt
 
 ## Great! Continue your Learning!
 
-Amazing, now learn more how we structure an application to use the concepts with the [Contoso Chat App](https://nitya.github.io/contoso-chat/?WT.mc_id=academic-105485-koreyst), to check how Cloud Advocacy adds those concepts in demonstations. For more content, check our [Ignite breakout session!
+Amazing, now learn more about how we structure an application to use the concepts with the [Contoso Chat App](https://nitya.github.io/contoso-chat/?WT.mc_id=academic-105485-koreyst), to check how Cloud Advocacy adds those concepts in demonstrations. For more content, check our [Ignite breakout session!
 ](https://www.youtube.com/watch?v=DdOylyrTOWg)
 
 Now, check Lesson 15, to understand how [Retrieval Augmented Generation and Vector Databases](../15-rag-and-vector-databases/README.md?WT.mc_id=academic-105485-koreyst) impact Generative AI and to make more engaging Applications!
diff --git a/15-rag-and-vector-databases/README.md b/15-rag-and-vector-databases/README.md
index 5ef8fb27a..3925d686f 100644
--- a/15-rag-and-vector-databases/README.md
+++ b/15-rag-and-vector-databases/README.md
@@ -123,7 +123,7 @@ An example of embedded text using OpenAI's `text-embedding-ada-002` model is:
 
 ## Retrieval and Vector Search
 
-When a user asks a question, the retriever transforms it into a vector using the query encoder, it then searches the through our document search index for relevant vectors in the document that are related to the input. Once done, it converts both the input vector and document vectors into text and passes it through the LLM.
+When a user asks a question, the retriever transforms it into a vector using the query encoder, it then searches through our document search index for relevant vectors in the document that are related to the input. Once done, it converts both the input vector and document vectors into text and passes it through the LLM.
 
 ### Retrieval
 
@@ -207,7 +207,7 @@ def chatbot(user_input):
 
     # create a message object
     messages=[
-        {"role": "system", "content": "You are an AI assiatant that helps with AI questions."},
+        {"role": "system", "content": "You are an AI assistant that helps with AI questions."},
         {"role": "user", "content": history[-1]}
     ]
 
diff --git a/16-open-source-models/README.md b/16-open-source-models/README.md
index bbb5f3de8..d1e931675 100644
--- a/16-open-source-models/README.md
+++ b/16-open-source-models/README.md
@@ -12,7 +12,7 @@ The world of open-source LLMs is exciting and constantly evolving. This lesson a
 
 ## What are Open Source Models?
 
-Open source software has played a crucial role in the growth of technology across various fields.The Open Source Initiative (OSI) has defined [10 criteria for software](https://opensource.org/osd?WT.mc_id=academic-105485-koreyst) to be classified as open source. The source code must be openly shared under a license approved by the OSI.
+Open source software has played a crucial role in the growth of technology across various fields. The Open Source Initiative (OSI) has defined [10 criteria for software](https://opensource.org/osd?WT.mc_id=academic-105485-koreyst) to be classified as open source. The source code must be openly shared under a license approved by the OSI.
 
 While the development of LLMs has similar elements to developing software, the process is not exactly the same. This has brought much discussion in the community on the definition of open source in the context of LLMs. For a model to be aligned with the traditional definition of open source the following information should be publicly available:
 
@@ -28,14 +28,14 @@ For this lesson, we will refer to the models as "open models" going forward as t
 
 ## Benefits of Open Models
 
-**Highly Customizable** - Since open models are released with detailed training information, researchers and developers can modify the model's internals. This enables the creation of highly specialized models that are fine-tuned for a specific task or area of study. Some examples of this is code generation, mathematical operations and biology.
+**Highly Customizable** - Since open models are released with detailed training information, researchers and developers can modify the model's internals. This enables the creation of highly specialized models that are fine-tuned for a specific task or area of study. Some examples of this are code generation, mathematical operations and biology.
 
 **Cost** - The cost per token for using and deploying these models is lower than that of proprietary models. When building Generative AI applications, looking at performance vs price when working with these models on your use case should be done.
 
 ![Model Cost](./images/model-price.png?WT.mc_id=academic-105485-koreyst)
-Source: Artifical Anayslsis
+Source: Artificial Analysis
 
-**Flexibility** - Working with open models enables you do be flexible on in terms of using different models or combining them. An example of this is the [HuggingChat Assistants ](https://huggingface.co/chat?WT.mc_id=academic-105485-koreyst) where a users can select the model being used directly in the user interface:
+**Flexibility** - Working with open models enables you to be flexible in terms of using different models or combining them. An example of this is the [HuggingChat Assistants ](https://huggingface.co/chat?WT.mc_id=academic-105485-koreyst) where a user can select the model being used directly in the user interface:
 
 ![Choose Model](./images/choose-model.png?WT.mc_id=academic-105485-koreyst)
 
@@ -49,19 +49,19 @@ Some examples of fine-tuned versions of Llama include [Japanese Llama](https://h
 
 ### Mistral
 
-[Mistral](https://huggingface.co/mistralai?WT.mc_id=academic-105485-koreyst)is an open model with a strong focus of high performance and efficiency. It uses the Mixture-of-Experts approach which combines a group of specialized expert models into one system where depending on the input, certain models are selected to be used. This makes the computation more effective as models are only addressing the inputs they are specalized in.
+[Mistral](https://huggingface.co/mistralai?WT.mc_id=academic-105485-koreyst) is an open model with a strong focus of high performance and efficiency. It uses the Mixture-of-Experts approach which combines a group of specialized expert models into one system where depending on the input, certain models are selected to be used. This makes the computation more effective as models are only addressing the inputs they are specialized in.
 
 Some examples of fine-tuned versions of Mistral include [BioMistral](https://huggingface.co/BioMistral/BioMistral-7B?text=Mon+nom+est+Thomas+et+mon+principal?WT.mc_id=academic-105485-koreyst), which is focused on the medical domain and [OpenMath Mistral](https://huggingface.co/nvidia/OpenMath-Mistral-7B-v0.1-hf?WT.mc_id=academic-105485-koreyst), which performs mathematical computation.
 
 ### Falcon
 
-[Falcon](https://huggingface.co/tiiuae?WT.mc_id=academic-105485-koreyst) is an LLM created by the Technology Innovation Institute (**TII**) .The Falcon-40B was trained on 40 billion parameters which has been shown to perform better than GPT-3 with less compute budget. This is dues to its use of the FlashAttention algorithm and multiquery attention that enables it to cut down on the memory requirements at inference time. With this reduced inference time, the Falcon-40B is suitable for chat applications.
+[Falcon](https://huggingface.co/tiiuae?WT.mc_id=academic-105485-koreyst) is an LLM created by the Technology Innovation Institute (**TII**). The Falcon-40B was trained on 40 billion parameters which has been shown to perform better than GPT-3 with less compute budget. This is due to its use of the FlashAttention algorithm and multiquery attention that enables it to cut down on the memory requirements at inference time. With this reduced inference time, the Falcon-40B is suitable for chat applications.
 
 Some examples of fine-tuned versions of Falcon are the [OpenAssistant](https://huggingface.co/OpenAssistant/falcon-40b-sft-top1-560?WT.mc_id=academic-105485-koreyst), an assistant built on open models and [GPT4ALL](https://huggingface.co/nomic-ai/gpt4all-falcon?WT.mc_id=academic-105485-koreyst), which delivers higher performance than the base model.
 
 ## How to Choose
 
-There is no one answer for choosing an open model. A good place to start is by using the Azure AI Studio's filter by task feature. This will help you understand what types of tasks the model has been trained for. Hugging Face also maintains an LLM Leaderboard which shows you the best performing models base on certain metrics.
+There is no one answer for choosing an open model. A good place to start is by using the Azure AI Studio's filter by task feature. This will help you understand what types of tasks the model has been trained for. Hugging Face also maintains an LLM Leaderboard which shows you the best performing models based on certain metrics.
 
 When looking to compare LLMs across the different types, [Artificial Analysis](https://artificialanalysis.ai/?WT.mc_id=academic-105485-koreyst) is another great resource:
 
@@ -72,7 +72,7 @@ If working on a specific use case, searching for fine-tuned versions that are fo
 
 ## Next Steps
 
-The best part about open models is that you can get started working with them pretty quickly. Check out the [Azure AI Studio Model Catalog](https://ai.azure.com?WT.mc_id=academic-105485-koreyst) , which features a specific Hugging Face collection with these models we discussed here.
+The best part about open models is that you can get started working with them pretty quickly. Check out the [Azure AI Studio Model Catalog](https://ai.azure.com?WT.mc_id=academic-105485-koreyst), which features a specific Hugging Face collection with these models we discussed here.
 
 ## Learning does not stop here, continue the Journey