Building the future of
AI Infrastructure

Open-source tools for GPU-accelerated computing, intelligent retrieval systems, and production-grade ML operations.

Projects

Production-ready tools designed for real-world AI workloads

Retrieval

Pensive

Hierarchical Context Retrieval

A three-tier context management system with an L1 hot cache, L2 vector retrieval via FAISS, and an L3 persistent archive. Handles 1M+ token contexts with sub-350ms latency; a minimal sketch of the hybrid retrieval step follows the tag list below.

  • Hybrid BM25 + dense vector fusion
  • Async dependency chain orchestration
  • KV summarization & deduplication
  • Multi-hop logic chaining
FAISS · sentence-transformers · BM25
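
To give a feel for the hybrid BM25 + dense vector fusion bullet above, here is a minimal sketch using rank_bm25 and sentence-transformers (the latter named in the tags). The corpus, the alpha weighting, and the hybrid_search function are illustrative assumptions, not Pensive's actual API.

```python
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

# Toy corpus standing in for an L2 retrieval tier (illustrative only).
docs = [
    "ROCm kernels for quantized matrix multiplication",
    "FAISS index maintenance and shard compaction",
    "KV-cache summarization for long-context inference",
]

# Sparse scorer: BM25 over whitespace-tokenized documents.
bm25 = BM25Okapi([d.lower().split() for d in docs])

# Dense scorer: cosine similarity over sentence embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = model.encode(docs, convert_to_tensor=True)

def hybrid_search(query: str, alpha: float = 0.5):
    """Blend normalized BM25 and dense scores; alpha is an assumed knob."""
    sparse = bm25.get_scores(query.lower().split())
    sparse = sparse / (sparse.max() + 1e-9)  # scale to roughly [0, 1]
    q_emb = model.encode(query, convert_to_tensor=True)
    dense = util.cos_sim(q_emb, doc_emb)[0].cpu().numpy()
    fused = alpha * sparse + (1 - alpha) * dense
    return sorted(zip(docs, fused), key=lambda p: -p[1])

print(hybrid_search("long context KV summarization")[0])
```
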
Training

mud-puppy

ROCm-First LLM Fine-tuning

A lightweight fine-tuning framework optimized for AMD GPUs. Supports LoRA, QLoRA, DPO, GRPO, and GPTQ quantization with no bitsandbytes dependency; a minimal LoRA sketch follows the tag list below.

  • Full, LoRA, and QLoRA fine-tuning
  • DPO/IPO/KTO/ORPO preference tuning
  • Custom ROCm kernels (qgemm, fbgemm)
  • Memory-efficient streaming & offloading
ROCm · TRL · HuggingFace · GPTQ
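
Since the tags point at the standard HuggingFace + PEFT stack, here is a minimal sketch of what wrapping a model with LoRA adapters looks like using stock PEFT. The base checkpoint, rank, alpha, and target modules are illustrative assumptions, not mud-puppy's own API or defaults.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# ROCm builds of PyTorch expose AMD GPUs through the usual torch.cuda API.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Small public checkpoint used purely for illustration.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m").to(device)

# A typical LoRA config: low-rank adapters on the attention projections.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of weights train
```

Because only the small adapter matrices receive gradients, this style of fine-tuning fits comfortably in consumer-GPU VRAM, which is the point of the memory-efficient streaming and offloading bullet above.
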

Built For

AMD ROCm

First-class support for AMD GPUs with custom kernels

Production ML

Battle-tested infrastructure for real workloads

Open Source

Transparent, auditable, community-driven

Memory Efficient

Optimized for maximum VRAM utilization on consumer hardware

About Tuklus Labs

We build tools that make advanced AI accessible. Our focus is on GPU-vendor-agnostic infrastructure that works on real hardware—not just datacenter clusters with unlimited CUDA cores.

Every project is designed with ROCm-first principles, ensuring AMD GPU users aren't second-class citizens in the AI ecosystem. We believe in efficient, production-ready code over flashy demos.

3 Core Projects
ROCm-First Design
100% Open Source