From 17172db6bc304aa97e74c7de69ff47c5ecc5e905 Mon Sep 17 00:00:00 2001
From: Ho Duc Hieu <150573299+hieu-jan@users.noreply.github.com>
Date: Mon, 8 Jan 2024 21:23:06 +0700
Subject: [PATCH] docs: finalize customize engine settings

---
 .../guides/04-using-models/04-customize-engine-settings.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/docs/guides/04-using-models/04-customize-engine-settings.mdx b/docs/docs/guides/04-using-models/04-customize-engine-settings.mdx
index 8c3984a00..01c32b796 100644
--- a/docs/docs/guides/04-using-models/04-customize-engine-settings.mdx
+++ b/docs/docs/guides/04-using-models/04-customize-engine-settings.mdx
@@ -71,7 +71,7 @@ In this guide, we will show you how to customize the engine settings.
 :::tip
 
-- By default, the value of `ngl` is set to 100, which indicates that it will offload all. If you wish to offload only 50% of the GPU, you can set `ngl` to 15. This is because most models on Mistral or Llama is around ~ 30 layers.
+- By default, `ngl` is set to 100, which offloads all model layers to the GPU. If you wish to offload only about 50% of the layers, you can set `ngl` to 15, since most Mistral- or Llama-based models have around 30 layers.
 - To utilize the embedding feature, include the JSON parameter `"embedding": true`. It will enable Nitro to process inferences with embedding capabilities. For a more detailed explanation, please refer to the [Embedding in the Nitro documentation](https://nitro.jan.ai/features/embed).
 - To utilize the continuous batching feature to boost throughput and minimize latency in large language model (LLM) inference, please refer to the [Continuous Batching in the Nitro documentation](https://nitro.jan.ai/features/cont-batch).
 
 :::
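
A side note for reviewers: below is a minimal sketch of how the parameters discussed in the tip might sit together in an engine settings file. Only `ngl`, `embedding`, and `cont_batching` are named in the tip; the `ctx_len` field and all concrete values are illustrative assumptions, not taken from the guide, and your actual settings file may contain different fields and defaults:

```json
{
  "ctx_len": 2048,
  "ngl": 15,
  "embedding": true,
  "cont_batching": true
}
```

In this sketch, `"ngl": 15` offloads roughly half the layers of a ~30-layer model to the GPU, `"embedding": true` enables embedding-capable inference, and `"cont_batching": true` turns on continuous batching for higher throughput. JSON does not permit comments, which is why each field is explained here rather than inline.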