Fine-tune your first large language model (LLM) with LoRA, llama.cpp, and KitOps in 5 easy steps

Getting started with LLMs can be intimidating. In this tutorial, we'll show you how to fine-tune a large language model using LoRA, with help from tools like llama.cpp and KitOps.

LoRA, or Low-Rank Adaptation, is a technique for efficiently adapting pre-trained models with minimal computational overhead. This tutorial aims to guide you through the setup, adaptation process, and deployment using easy-to-follow steps.
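To build intuition for why LoRA is efficient, here is a minimal NumPy sketch of the low-rank update (not part of this tutorial's toolchain): instead of learning a full d×d weight update, LoRA trains two small matrices A and B whose product approximates it. The dimensions and scaling factor below are illustrative assumptions, not values used by llama.cpp.

```python
import numpy as np

d, r = 4096, 8          # hidden size and LoRA rank (illustrative values)
alpha = 16              # LoRA scaling hyperparameter

W = np.random.randn(d, d)          # frozen pre-trained weight
A = np.random.randn(r, d) * 0.01   # trainable down-projection
B = np.zeros((d, r))               # trainable up-projection (zero-initialized)

# Effective weight after adaptation: W + (alpha / r) * B @ A
delta = (alpha / r) * B @ A
W_adapted = W + delta

full_params = W.size            # parameters in a full weight update
lora_params = A.size + B.size   # parameters LoRA actually trains
print(f"full: {full_params:,}  lora: {lora_params:,}  "
      f"ratio: {lora_params / full_params:.4%}")
```

At rank 8 on a 4096-wide layer, the adapter trains well under 1% of the parameters a full update would, which is why fine-tuning with LoRA is feasible on modest hardware.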

Prerequisites

Before we get started, you'll need to install the Kit CLI and download the llama.cpp executables.

1. Environment setup

  1. Create a project directory (lora_finetuning) and change into it:

    mkdir ./lora_finetuning
    cd ./lora_finetuning
    
  2. Sign up for a free Jozu Hub account.

  3. Log in to Jozu Hub and click on ADD NEW REPOSITORY +, name the Repository fine-tuning and click the CREATE REPOSITORY button.

  4. Log in to Jozu Hub using the Kit CLI. This will require your username and password. Alternatively, you can use any other registry that is compatible with OCI artifacts:

    kit login jozu.ml --username 
    

2. Create your Kitfile and refer to your base model

  1. In your project directory (lora_finetuning), create a Kitfile with the following contents:

    manifestVersion: "1.0"
    package:
      name: llama3 fine-tuned
      version: 3.0.0
      authors: ["Jozu AI"]
    model:
      name: llama3-8B-instruct-q4_0
      path: jozu.ml/jozu/llama3-8b:8B-instruct-q4_0
      description: Llama 3 8B instruct model
      license: Apache 2.0
    

[!NOTE] Notice how the path refers to another ModelKit? In this case jozu.ml/jozu/llama3-8b:8B-instruct-q4_0.

  2. In your project directory (lora_finetuning), let's create our initial ModelKit by invoking:

    kit pack . -t fine-tuning:untuned
    
  3. Now let's unpack our ModelKit into our work folder. Notice that the Kit CLI also unpacks the model you referenced into the work folder so that you can work with it.

    cd ..
    kit unpack fine-tuning:untuned -d ./lora_finetuning --overwrite
    

3. Create your LoRA adapter

  1. Prepare your training data and store it in a text file, ideally separating each example with a common string that doesn't appear elsewhere in your training data, e.g., "<s>":

     <s>Example one text here.
     <s>Example two text here.
    

[!NOTE] The following data file has been provided for your convenience: ./lora_finetuning/training-data.txt
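Before kicking off a long fine-tuning run, it can be worth sanity-checking that your separator splits the file into the number of examples you expect. The helper below is a hypothetical convenience script (not part of KitOps or llama.cpp); it uses an inline sample string for demonstration, but you could read `./training-data.txt` instead.

```python
# Count training examples by splitting on the separator string.
SEPARATOR = "<s>"

# Inline sample mirroring the tutorial's format; in practice you would
# read this from ./training-data.txt.
sample = "<s>Example one text here.\n<s>Example two text here.\n"

def count_examples(text: str, sep: str = SEPARATOR) -> int:
    # Splitting on the separator yields one leading empty chunk plus
    # one chunk per example; ignore empty/whitespace-only chunks.
    return sum(1 for chunk in text.split(sep) if chunk.strip())

print(count_examples(sample))  # prints 2 for the sample above
```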

  2. Run the fine-tuning command from your project directory (lora_finetuning):

    cd ./lora_finetuning
    llama.cpp\finetune.exe --model-base ./llama3-8B-instruct-q4_0.gguf --train-data ./training-data.txt --threads 8 --sample-start "<s>" --lora-out lora_adapter.gguf
    

4. Update the Kitfile with your LoRA adapter

Update the Kitfile to include the LoRA adapter as a model part and the training data as a dataset:

manifestVersion: "1.0"
package:
  name: llama3 fine-tuned
  version: 3.0.0
  authors: ["Jozu AI"]
model:
  name: llama3-8B-instruct-q4_0
  path: jozu.ml/jozu/llama3-8b:8B-instruct-q4_0
  description: Llama 3 8B instruct model
  license: Apache 2.0
  parts:
    - path: ./lora_adapter.gguf
      type: lora-adapter
datasets:
  - name: fine-tune-data
    path: ./training-data.txt

[!TIP] We recommend including training data in your ModelKits; it's a best practice that makes further training iterations easier. However, it isn't required for production deployments.

5. Pack and ship your Adapter in a ModelKit

  1. Package your tuned model:

    kit pack . -t fine-tuning:tuned
    
  2. Tag your model for Jozu Hub:

    kit tag fine-tuning:tuned jozu.ml//fine-tuning:latest
    
  3. Push the ModelKit to Jozu Hub.

    kit push jozu.ml//fine-tuning:latest
    

Conclusion

Congratulations, your fine-tuned model is now packaged and ready for deployment. You've successfully updated the Kitfile, packed the model, and pushed it to Jozu Hub, making it accessible for implementation in various applications.

You can access it in Jozu Hub from the My Repositories page.