From bbdcabda3e2d190f89d4ea028e95797315bbf0e8 Mon Sep 17 00:00:00 2001
From: hieu-jan <150573299+hieu-jan@users.noreply.github.com>
Date: Sat, 2 Mar 2024 15:11:37 +0900
Subject: [PATCH] docs: update the figure style

---
 ...-chatgpt-with-open-source-alternatives.mdx | 23 +++++++++----------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/docs/docs/blog/00-surpassing-chatgpt-with-open-source-alternatives.mdx b/docs/docs/blog/00-surpassing-chatgpt-with-open-source-alternatives.mdx
index 052f54bc5..f75c6fd1a 100644
--- a/docs/docs/blog/00-surpassing-chatgpt-with-open-source-alternatives.mdx
+++ b/docs/docs/blog/00-surpassing-chatgpt-with-open-source-alternatives.mdx
@@ -35,7 +35,7 @@ Problems still arise with catastrophic forgetting in general tasks, commonly obs
 
 ![Mistral vs LLama vs Gemma](assets/mistral-comparasion.png)
 
-_Figure 1. Mistral 7B excels in benchmarks, ranking among the top foundational models._
+_Figure 1._ Mistral 7B excels in benchmarks, ranking among the top foundational models.
 
 _Note: we are not sponsored by the Mistral team. Though many folks in their community do like to run Mistral locally using our desktop client - [Jan](https://jan.ai/)._
 
@@ -45,7 +45,7 @@ Mistral alone has known, poor math capabilities, which we needed for our highly
 
 ![Merged model vs finetuned models](assets/stealth-comparasion.png)
 
-_Figure 2: The merged model, Stealth, doubles the mathematical capabilities of its foundational model while retaining the performance in other tasks._
+_Figure 2._ The merged model, Stealth, doubles the mathematical capabilities of its foundational model while retaining the performance in other tasks.
 
 We found merging models is quick and cost-effective, enabling fast adjustments based on the result of each iteration.
 
@@ -95,13 +95,13 @@ This final model is publicly available at https://huggingface.co/jan-hq/nitro-v1
 
 ![Using LLM locally](assets/nitro-on-jan.png)
 
-_Figure 3. Using the new finetuned model in [Jan](https://jan.ai/)_
+_Figure 3._ Using the new finetuned model in [Jan](https://jan.ai/)
 
 ## Improving Results With Rag
 
 As an additional step, we also added [Retrieval Augmented Generation (RAG)](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/) as an experiment parameter.
 
-A simple RAG setup was done using **[Llamaindex](https://www.llamaindex.ai/)** and the **[bge-en-base-v1.5 embedding](https://huggingface.co/BAAI/bge-base-en-v1.5)** model for efficient documentation retrieval and question-answering. You can find the RAG implementation [here](https://github.com/janhq/open-foundry/blob/main/rag-is-not-enough/rag/nitro_rag.ipynb).
+A simple RAG setup was done using **[Llamaindex](https://www.llamaindex.ai/)** and the **[bge-en-base-v1.5 embedding](https://huggingface.co/BAAI/bge-base-en-v1.5)** model for efficient documentation retrieval and question-answering. The RAG implementation is publicly available at https://github.com/janhq/open-foundry/blob/main/rag-is-not-enough/rag/nitro_rag.ipynb
 
 ## Benchmarking the Results
 
@@ -109,16 +109,15 @@ We curated a new set of [50 multiple-choice questions](https://github.com/janhq/
 
 ![Opensource model outperforms GPT](assets/rag-comparasion.png)
 
-_Figure 4. Comparison between fine-tuned model and OpenAI's GPT._
+_Figure 4._ Comparison between fine-tuned model and OpenAI's GPT.
 
-**Results**
-
-| Approach | Performance |
+_Table 1._ Benchmark results.
+| Approach | Performance |
 | ----------------------------------------------------------------------------------- | ----------- |
-| GPT-3.5 with RAG | 56.7% |
-| GPT-4 with RAG | 64.3% |
-| Merged 7B Model ([Stealth 7B](https://huggingface.co/jan-hq/stealth-v1.3)) with RAG | 47.7% |
-| Finetuned 7B Model (Nitro 7B) with RAG | 57.8% |
+| GPT-3.5 with RAG | 56.7% |
+| GPT-4 with RAG | 64.3% |
+| Merged 7B Model ([Stealth 7B](https://huggingface.co/jan-hq/stealth-v1.3)) with RAG | 47.7% |
+| Finetuned 7B Model (Nitro 7B) with RAG | 57.8% |
 
 This indicates that with task-specific training, we can improve an open-source, Small Language Model to the level of GPT-3.5 on domain knowledge.
 