Instructions to use Macmill/Fyve-AI with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Macmill/Fyve-AI with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Macmill/Fyve-AI",
	filename="fyve-ai.Q4_K_M.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use Macmill/Fyve-AI with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Macmill/Fyve-AI:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Macmill/Fyve-AI:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Macmill/Fyve-AI:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Macmill/Fyve-AI:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Macmill/Fyve-AI:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Macmill/Fyve-AI:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Macmill/Fyve-AI:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Macmill/Fyve-AI:Q4_K_M

Use Docker

docker model run hf.co/Macmill/Fyve-AI:Q4_K_M

LM Studio
Jan
Ollama
How to use Macmill/Fyve-AI with Ollama:
```
ollama run hf.co/Macmill/Fyve-AI:Q4_K_M
```

Unsloth Studio new

How to use Macmill/Fyve-AI with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Macmill/Fyve-AI to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Macmill/Fyve-AI to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Macmill/Fyve-AI to start chatting

Docker Model Runner
How to use Macmill/Fyve-AI with Docker Model Runner:
```
docker model run hf.co/Macmill/Fyve-AI:Q4_K_M
```

Lemonade

How to use Macmill/Fyve-AI with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Macmill/Fyve-AI:Q4_K_M

Run and chat with the model

lemonade run user.Fyve-AI-Q4_K_M

List all available models

lemonade list

Fyve-AI

Fyve-AI is a fine-tuned version of Qwen3-4B trained for one specific task: reading a student's broken Python code and responding with a Socratic 3-sentence hint — not the answer.

It is the AI model powering PyFyve, a fully offline Python tutoring application.

What It Does

Given a task description, the student's buggy code, and the Python error it produced, the model outputs a JSON object with two fields:

reasoning — internal diagnosis of what went wrong and why
hint — exactly 3 sentences following a fixed structure:
1. Diagnosis — names the specific variable, expression, or construct that caused the error
2. Rule — states the Python rule that was violated
3. Directive — starts with "Think about..." or "Consider..." and guides without giving the fix

The model never gives corrected code. It never gives more than 3 sentences. It does not know how to do anything outside this task.

Input Format

Task:
<what the student was asked to do>
Code:
<the student's broken code>
Error:
<the Python error message>

Output Format

{
  "reasoning": "...",
  "hint": "Sentence 1.\nSentence 2.\nSentence 3."
}

Example

Input:

Task:
Create a variable score = 95 and print its value.
Code:
score = 95
print(Score)
Error:
NameError: name 'Score' is not defined at line 2

Output:

{
  "reasoning": "score is defined lowercase but Score (capital S) is used in print. I name both spellings, explain case sensitivity, and direct toward comparing the two usages.",
  "hint": "You defined a variable called score on line 1 but referenced Score on line 2.\nIn Python, variable names are case-sensitive, so score and Score are treated as two completely different identifiers.\nConsider whether the capitalisation of the variable name is consistent between where it was defined and where it is used."
}

Training Details

Detail	Value
Base model	Qwen3-4B
Method	QLoRA via Unsloth
Hardware	Google Colab T4 (free tier)
Dataset	555 curated (task, code, error, hint) pairs
Dataset source	Synthetic — generated using Qwen3-30B-A3B as teacher model
Error types covered	SyntaxError, NameError, TypeError, IndexError, KeyError, ValueError, AttributeError, UnboundLocalError, RecursionError, ZeroDivisionError, and more

The training data was generated by a 30B teacher model, manually reviewed for quality, and filtered through a validation pipeline that checks hint structure, sentence count, and semantic rules (e.g. AttributeError on strings must guide toward + or +=, not list conversion).

Intended Use

This model is designed exclusively for use inside the PyFyve app. It is not a general-purpose assistant and will produce poor results for tasks outside its training distribution.

It is not designed to:

Answer general Python questions
Explain concepts freely
Write or complete code
Serve as a chatbot

Limitations

Trained on 555 examples — covers common beginner and intermediate Python errors well, but unusual or advanced errors may produce weaker hints
No coverage of logic errors (code that runs but produces wrong output)
Some uncommon syntax patterns (e.g. trailing comma creating a tuple) are outside the training distribution
The 3-sentence format is enforced by the prompt at inference time — removing the few-shot examples from the prompt degrades output quality significantly

Usage with Ollama

This model is distributed as a GGUF file for use with Ollama. The Modelfile in this repository contains the Ollama model definition.

ollama create fyve-ai -f Modelfile

Or use the PyFyve app, which handles setup automatically.

License

The fine-tuned weights are released under the same license as the base model: the Apache License.

Please read it before redistributing — it permits research and personal use but has restrictions on commercial use above certain usage thresholds.

Citation

If you use this model in research or build on it, please link back to the PyFyve repository.

Downloads last month: 76

GGUF

Model size

4B params

Architecture

qwen3

Hardware compatibility

4-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Macmill/Fyve-AI

Base model

Qwen/Qwen3-4B-Base

Finetuned

Qwen/Qwen3-4B

Quantized

(217)

this model