
Publishing LLMs to Hugging Face and Ollama.com

I recently trained my first ML model: it recognizes Japanese kanji characters from an image. Using this model, I built a web page with WebAssembly, where a kanji (or hiragana) drawn on a canvas is fed to the model for recognition.

Publishing a CNN-RNN Model to Hugging Face and Ollama

Common Prerequisites

  • Trained model saved (PyTorch format)
  • Account on target platform(s)
  • Access token/authentication credentials
  • Model documentation ready
  • requirements.txt with dependencies

Common Setup Steps

1. Install Dependencies

pip install huggingface-hub transformers torch ollama gguf

2. Authenticate

Hugging Face:

huggingface-cli login
# Or set environment variable:
export HF_TOKEN=your_access_token
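
The same login can be done from Python via huggingface_hub, which is convenient in notebooks; a minimal sketch:

```python
from huggingface_hub import login

# Pass the token directly, or call login() with no
# arguments to be prompted for it interactively
login(token="your_access_token")
```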

Ollama:

# No authentication needed for local use
# To push to ollama.com, create an account there and add your Ollama
# public key (~/.ollama/id_ed25519.pub) in the account settings

3. Prepare Core Model Files

Create these files in your repository:

config.json (model configuration)

{
  "model_type": "your_model_type",
  "architecture": "cnn_rnn",
  "hidden_size": 768,
  "num_layers": 12
}
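
To sanity-check the file before uploading, it can be loaded back with transformers; a minimal sketch (the field values above are placeholders):

```python
from transformers import PretrainedConfig

# Load the hand-written config.json back into a config object
config = PretrainedConfig.from_json_file("config.json")
print(config.hidden_size, config.num_layers)
```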

README.md (comprehensive documentation)

# Kanji Recognition CNN-RNN Model

## Description

A CNN-RNN model for recognizing handwritten and printed Kanji characters from images. The CNN extracts visual features from input images, while the RNN refines predictions through sequential processing.

## Model Details

- Architecture: CNN-RNN
- Framework: PyTorch
- Task: Kanji character recognition from images
- Input: Images (3-channel, 224x224 recommended)
- Output: Kanji character predictions with confidence scores
- Parameters: [your count]

## Training

- Dataset: [your dataset name/description]
- Methodology: [training approach]
- Hyperparameters: [learning rate, batch size, epochs, etc.]
- Training time: [duration]

## Usage

### Hugging Face

```python
from transformers import AutoModel
from PIL import Image
import torch
from torchvision import transforms

model = AutoModel.from_pretrained("your-username/kanji-recognition-cnn-rnn")
image = Image.open("kanji_sample.jpg")

# Preprocess the image into a normalized tensor batch
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
inputs = preprocess(image).unsqueeze(0)

# Get predictions
with torch.no_grad():
    predictions = model(inputs)
kanji_char = predictions.argmax(dim=1)
confidence = predictions.max(dim=1).values
```

### Ollama

```bash
ollama pull your-username/kanji-recognition-cnn-rnn
ollama run your-username/kanji-recognition-cnn-rnn
```

## Performance

- Accuracy: [percentage]
- Precision: [percentage]
- Recall: [percentage]
- F1-Score: [score]
- Test set size: [number of samples]

## Character Coverage

- Total Kanji characters: [count]
- Coverage: [Jōyō/Jinmeiyō/Other]

## Limitations

[Known limitations or edge cases]

## Citation

[If applicable]


requirements.txt

torch>=2.0.0
transformers>=4.30.0
numpy>=1.21.0

LICENSE (add a license file, e.g., MIT, Apache 2.0)

4. Prepare Model Weights

# Save your trained model
python -c "
import torch
model = YourModel()
# ... load trained weights ...
torch.save(model.state_dict(), 'pytorch_model.bin')
"

Publishing to Hugging Face

Steps and Commands

1. Create Repository

huggingface-cli repo create your-model-name --type model
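
The repository can also be created from Python with huggingface_hub, assuming you are already authenticated:

```python
from huggingface_hub import create_repo

# exist_ok avoids an error if the repo was already created
create_repo("your-model-name", repo_type="model", exist_ok=True)
```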

2. Clone Repository

git clone https://huggingface.co/your-username/your-model-name
cd your-model-name

3. Convert Weights to SafeTensors

import torch
from safetensors.torch import save_file

# Load your trained model (load_from_checkpoint assumes a PyTorch Lightning
# module; a plain PyTorch model can use load_state_dict instead)
model = YourCNNRNNModel.load_from_checkpoint("path/to/checkpoint.pt")
state_dict = model.state_dict()

# Convert to SafeTensors format
save_file(state_dict, "model.safetensors")

Or using transformers:

pip install safetensors
python -c "
from transformers import AutoModel
from safetensors.torch import save_file

model = AutoModel.from_pretrained('your-model-path')
save_file(model.state_dict(), 'model.safetensors')
"

4. Add Files

cp model.safetensors .
# or keep pytorch_model.bin if not converting
cp config.json .
cp README.md .
cp requirements.txt .
cp LICENSE .
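# Note: large weight files go through Git LFS; repos created on the Hub
# include a .gitattributes that already tracks *.safetensors and *.bin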

5. Custom Model Support (If Needed)

Create modeling_custom.py:

from transformers import PreTrainedModel, PretrainedConfig
from torch import nn

class YourConfig(PretrainedConfig):
    model_type = "your_model_type"

class YourCNNRNNModel(PreTrainedModel):
    config_class = YourConfig

    def __init__(self, config):
        super().__init__(config)
        # Your model architecture (CNN feature extractor + RNN head)

    def forward(self, pixel_values):
        # Forward pass over a batch of image tensors
        pass

Update config.json:

{
  "architectures": ["YourCNNRNNModel"],
  "model_type": "your_model_type",
  "auto_map": {
    "AutoModel": "modeling_custom.YourCNNRNNModel"
  }
}
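
With auto_map in place, loading the model requires explicitly allowing remote code, since the architecture lives in modeling_custom.py inside the repo rather than in the transformers library:

```python
from transformers import AutoModel

# trust_remote_code permits running modeling_custom.py from the repo
model = AutoModel.from_pretrained(
    "your-username/your-model-name",
    trust_remote_code=True,
)
```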

6. Push to Hub

git add .
git commit -m "Initial model commit"
git push

Or use Python API:

from huggingface_hub import upload_folder

upload_folder(
    folder_path="./your-model-folder",
    repo_id="your-username/your-model-name",
    repo_type="model"
)

7. Verify

Visit https://huggingface.co/your-username/your-model-name
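
The check can also be scripted with huggingface_hub; a small sketch that lists the uploaded files:

```python
from huggingface_hub import model_info

# Fetch repository metadata and list the files it contains
info = model_info("your-username/your-model-name")
print([f.rfilename for f in info.siblings])
```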

Publishing to Ollama

Ollama requires models in GGUF format for optimized local inference. Note that the stock tooling (llama.cpp's converters and the Ollama runtime) targets supported LLM architectures, so treat the steps below as a template: a custom CNN-RNN generally needs its own conversion path.

Steps and Commands

1. Convert PyTorch to GGUF Format

Step 1: Export PyTorch model to standard format

import torch
import json
from pathlib import Path

# Load your trained model
model = YourCNNRNNModel()
checkpoint = torch.load("path/to/checkpoint.pt")
model.load_state_dict(checkpoint)

# Save in standard PyTorch format
torch.save(model.state_dict(), "pytorch_model.bin")

# Save config as JSON
config = {
    "model_type": "cnn_rnn",
    "architecture": "cnn_rnn",
    "hidden_size": 768,
    "num_layers": 12
}
with open("config.json", "w") as f:
    json.dump(config, f)

Step 2: Convert to GGUF using llama.cpp

# Clone and setup llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
pip install -r requirements.txt

# Convert a Hugging Face format model directory to GGUF
# (recent llama.cpp versions ship convert_hf_to_gguf.py; older ones used convert.py)
python convert_hf_to_gguf.py . --outfile model.gguf

Or use a more direct conversion:

import torch
from gguf import GGUFWriter

# Load model
model = YourCNNRNNModel()
checkpoint = torch.load("path/to/checkpoint.pt")
model.load_state_dict(checkpoint)

# Write each tensor from the state_dict into a GGUF file
writer = GGUFWriter("model.gguf", "cnn_rnn")
for name, tensor in model.state_dict().items():
    writer.add_tensor(name, tensor.cpu().numpy())
writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()

print("Model converted to model.gguf")

2. Quantize the Model

# Quantize to Q4_K_M (recommended balance of size/quality);
# the binary is named llama-quantize in recent llama.cpp builds
./llama-quantize model.gguf model-q4.gguf Q4_K_M

3. Create Modelfile

Create Modelfile:

FROM ./model-q4.gguf

TEMPLATE """
[INST] {{ .Prompt }} [/INST]
"""

PARAMETER num_ctx 2048
PARAMETER temperature 0.7
PARAMETER top_k 40
PARAMETER top_p 0.9

SYSTEM """You are a helpful AI assistant."""

4. Build and Test Locally

ollama create your-model-name -f Modelfile
ollama run your-model-name "Your test prompt"
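
Since the ollama Python package was installed during setup, the same smoke test can be run from Python; a minimal sketch:

```python
import ollama

# Send a test prompt to the locally built model
response = ollama.generate(model="your-model-name", prompt="Your test prompt")
print(response["response"])
```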

5. Publish to ollama.com

  1. Create an account on ollama.com and add your Ollama public key (from ~/.ollama/id_ed25519.pub) to the account settings.
  2. Copy the model under your namespace and push it:

ollama cp your-model-name your-username/your-model-name
ollama push your-username/your-model-name

6. Users Can Access With

ollama pull your-username/your-model-name
ollama run your-username/your-model-name
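
Users of the ollama Python client can do the same programmatically; a small sketch:

```python
import ollama

# Download the published model, then query it
ollama.pull("your-username/your-model-name")
result = ollama.generate(
    model="your-username/your-model-name",
    prompt="Your test prompt",
)
print(result["response"])
```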
