Local LLM: MITHRIL

DEPLOY FORWARD LLC
Free
4.3 out of 5

About Local LLM: MITHRIL

Run quantized large language models directly on your iPhone. No cloud, no internet required.

Access state-of-the-art quantized AI models optimized for mobile hardware. Download GGUF-format models that compress billion-parameter networks into mobile-friendly sizes while maintaining performance.
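
As a rough illustration of what quantization buys (the parameter count and bits-per-weight below are illustrative figures, not the app's exact numbers), a ~3B-parameter model drops from roughly 6 GiB at 16-bit precision to under 2 GiB at 4-bit:

import Foundation

// Rough size estimate for a quantized model. Real GGUF files add metadata
// and per-block scale factors, so treat these numbers as ballpark only.
func estimatedModelSizeGiB(parameters: Double, bitsPerWeight: Double) -> Double {
    (parameters * bitsPerWeight / 8.0) / 1_073_741_824.0  // bits -> bytes -> GiB
}

let fp16 = estimatedModelSizeGiB(parameters: 3.2e9, bitsPerWeight: 16.0)  // ~5.96 GiB
let q4   = estimatedModelSizeGiB(parameters: 3.2e9, bitsPerWeight: 4.5)   // ~1.68 GiB (4-bit plus scale overhead)
print(String(format: "FP16: %.2f GiB, Q4: %.2f GiB", fp16, q4))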

COMPLETE MODEL SUITE
• Llama 3.2 1B/3B (Meta) - Q4/Q8 quantization
• Gemma 3 270M/2B/9B (Google) - IQ4_NL optimization
• Qwen 2.5 0.5B-7B (Alibaba) - Multiple quantization levels
• LLaVA 1.5/1.6 (Vision) - Multimodal image understanding
• Direct integration with Hugging Face model repository
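
A minimal sketch of fetching a GGUF file from the Hugging Face hub with plain URLSession; the repository and filename are illustrative placeholders, and the app's actual download flow is not documented here:

import Foundation

// Hypothetical example repo and file; substitute any GGUF model you want to pull.
let repo = "bartowski/Llama-3.2-1B-Instruct-GGUF"
let file = "Llama-3.2-1B-Instruct-Q4_K_M.gguf"
let url = URL(string: "https://huggingface.co/\(repo)/resolve/main/\(file)")!

let task = URLSession.shared.downloadTask(with: url) { tempURL, response, error in
    guard let tempURL = tempURL, error == nil else {
        print("Download failed: \(error?.localizedDescription ?? "unknown error")")
        return
    }
    // Move the temporary file into the app's Documents directory for later loading.
    let docs = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
    let dest = docs.appendingPathComponent(file)
    try? FileManager.default.moveItem(at: tempURL, to: dest)
    print("Saved model to \(dest.path)")
}
task.resume()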

TECHNICAL FEATURES
• GGML/llama.cpp inference engine
• Metal GPU acceleration on Apple Silicon
• Dynamic context window management (2K-8K tokens)
• Retrieval-Augmented Generation (RAG) with embeddings
• Real-time streaming with token/second metrics
• SQLite conversation storage with vector search
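
A minimal sketch of the retrieval step behind the RAG and vector-search items above, assuming embeddings are already computed; the Chunk type and top-k ranking here are illustrative, not the app's internal storage schema:

// Rank stored text chunks by cosine similarity against a query embedding,
// then return the top-k matches to feed into the prompt as context.
struct Chunk {
    let text: String
    let embedding: [Float]
}

func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    let dot = zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
    let normA = a.reduce(0) { $0 + $1 * $1 }.squareRoot()
    let normB = b.reduce(0) { $0 + $1 * $1 }.squareRoot()
    return dot / (normA * normB + 1e-9)  // small epsilon avoids division by zero
}

func retrieve(query: [Float], from chunks: [Chunk], topK: Int = 3) -> [Chunk] {
    Array(
        chunks
            .sorted { cosineSimilarity(query, $0.embedding) > cosineSimilarity(query, $1.embedding) }
            .prefix(topK)
    )
}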

SYSTEM REQUIREMENTS
Models run efficiently when the model file size is at or below available RAM. A minimum of 6 GB of RAM is recommended for larger models; iPhone 15 Pro and Pro Max perform best. iOS 26 is required for the Apple foundation model.
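
A minimal sketch of that sizing rule, assuming the check compares the model file's size against total device RAM with some headroom (the 0.7 factor is an assumption, not a documented threshold):

import Foundation

// Decide whether a downloaded model is likely to fit before loading it.
func canLikelyLoad(modelAt url: URL) -> Bool {
    let attributes = try? FileManager.default.attributesOfItem(atPath: url.path)
    let fileSize = (attributes?[.size] as? UInt64) ?? UInt64.max
    let physicalRAM = ProcessInfo.processInfo.physicalMemory  // total device RAM in bytes
    // Leave headroom for the OS, the app itself, and the KV cache.
    return Double(fileSize) <= Double(physicalRAM) * 0.7
}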

Zero telemetry. Zero data transmission. Pure local AI computing.
