rfq-rag

A demo sharing intuition for how RAG and LLMs can be used to automate finance tasks that require comprehension of written text



Deployment

This project demonstrates RFQ (Request for Quote) parsing using Retrieval Augmented Generation (RAG), deploying open-source components with Docker.
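
To make the task concrete, here is a hypothetical illustration of what parsing an RFQ means here: unstructured request text in, structured fields out. The field names and values are assumptions for illustration, not the project's actual schema.

     # Hypothetical illustration only: free-text RFQ in, structured fields out.
     rfq_text = "Please quote 10mm EUR 5y payer swaption, strike ATM+25"
     parsed = {
         "notional": 10_000_000,   # "10mm"
         "currency": "EUR",
         "tenor": "5Y",
         "instrument": "swaption",
         "direction": "payer",
         "strike": "ATM+25",
     }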

Demo project repository & full code: https://github.com/parrisma/rfq-rag


Table of Contents

  • Deployment
  • Technology Install
  • Prerequisites
  • Setup Instructions

Technology Install

Components:

  1. Ollama: (ollama.com) - Downloads and runs Large Language Models (LLMs).
  2. ChromaDB: (trychroma.com) - A vector database for storing example RFQs. This enables the Retrieval Augmented part of RAG: retrieved examples are used to specialize the LLM prompt (see the sketch below).
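
A minimal sketch of how these two components combine, assuming both services are running locally on their default ports and the chromadb and ollama Python clients are installed; the collection name, model name, and RFQ text are illustrative assumptions, not the project's actual values:

     # Minimal RAG sketch: retrieve similar example RFQs from ChromaDB,
     # then use them to specialize the prompt sent to the LLM via Ollama.
     import chromadb
     import ollama

     client = chromadb.HttpClient(host="localhost", port=8000)
     collection = client.get_or_create_collection("rfq_examples")  # hypothetical name

     rfq = "Please quote 10mm EUR 5y payer swaption, strike ATM+25"

     # Retrieval: find the stored example RFQs most similar to the request.
     results = collection.query(query_texts=[rfq], n_results=3)
     examples = "\n".join(results["documents"][0])

     # Augmentation: fold the retrieved examples into the prompt.
     prompt = (
         "Using these example RFQs as a guide:\n"
         f"{examples}\n\n"
         f"Extract the structured fields from this RFQ:\n{rfq}"
     )

     # Generation: ask the locally hosted model to parse the RFQ.
     response = ollama.chat(model="llama3", messages=[{"role": "user", "content": prompt}])
     print(response["message"]["content"])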

Prerequisites

  1. Linux Environment:
    • While compatible with both Windows and Linux, a Linux environment is recommended.
    • For Windows users, utilize Windows Subsystem for Linux (WSL 2) (ubuntu.com/desktop/wsl).
    • Docker Desktop integrates with WSL 2.
    • Execute project scripts within a Linux (Ubuntu) shell on WSL 2 or a native Linux system.
  2. Docker:
    • A basic Docker installation is required (docker.com).
    • This project uses simple Docker containers, without Kubernetes or complex orchestration.
  3. GPU and Memory:
    • A dedicated GPU and ample system memory are strongly recommended for optimal performance when running LLMs locally.
    • Testing was conducted on a system with an NVIDIA RTX 4090 and 128GB of RAM.
    • For systems with lower resources, explore smaller models from the Ollama model directory.
    • Smaller models offer faster processing but may exhibit reduced accuracy for complex reasoning tasks.
    • The demo will still run on lower-spec machines.
  4. Python:
    • Conda and pip are used to manage the virtual environment.
    • The environment.yml file is provided in the repository; edit the last line of the file so the prefix matches your home directory (for example, prefix: /home/<your-user-name>/miniconda3/envs/..., depending on your conda install).
    • Create the environment (see the import check after this list):
     conda env create -f environment.yml
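
Once the environment is created and activated, a quick sanity check (a sketch; the authoritative dependency list is environment.yml) is to confirm the two client libraries used in the examples above import cleanly:

     # Sanity check: the key client libraries should import without error.
     import chromadb
     import ollama

     print("chromadb version:", chromadb.__version__)
     print("ollama client imported OK")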

Setup Instructions

Follow these steps to set up the project:

  1. Open a Linux Shell:
    • Start a Linux terminal on your system (WSL 2 or native Linux).
  2. Clone the Repository:
    • Use git clone to download the project files, then enter the project directory:
     git clone https://github.com/parrisma/rfq-rag
     cd rfq-rag
    
  3. Navigate to the Scripts Directory:
    • Change your current directory to the scripts folder:
     cd scripts
    
  4. Run the Bootstrap Script:
    • Execute the bootstrap.sh script, providing your username as an argument:
     ./bootstrap.sh <your-user-name>
    
    • What this script does:
      • Downloads and starts both Ollama and ChromaDB using Docker.
      • Creates necessary directories in your home folder for persisting data from these services.
      • Downloads and starts the default Large Language Model (LLM) within Ollama.
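
Once bootstrap.sh completes, an optional health check can confirm both services are reachable; this sketch assumes the default local ports (ChromaDB on 8000, and Ollama on 11434, the ollama client's default):

     # Optional post-bootstrap health check (assumes default local ports).
     import chromadb
     import ollama

     # ChromaDB: heartbeat returns a nanosecond timestamp if the server is up.
     chroma = chromadb.HttpClient(host="localhost", port=8000)
     print("ChromaDB heartbeat:", chroma.heartbeat())

     # Ollama: list the models the server has pulled so far.
     print("Ollama models:", ollama.list())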