Episode 636
Red Hat's James Huang
December 19th, 2025
20 mins 53 secs
About this Episode
Links
James on LinkedIn
Mike on LinkedIn
Mike's Blog
Show on Discord
- AI on Red Hat Enterprise Linux (RHEL)
Trust and Stability: RHEL provides the mission-critical foundation needed for workloads where security and reliability cannot be compromised.
Predictive vs. Generative: Acknowledging the hype of GenAI while maintaining support for traditional machine learning algorithms.
Determinism: The challenge of bringing consistency and security to emerging AI technologies in production environments.
- RamaLama & Containerization
Developer Simplicity: RamaLama helps developers run local LLMs easily without being "locked in" to specific engines; it supports Podman, Docker, and various inference engines such as llama.cpp and whisper.cpp.
Production Path: The tool is designed to "fade away" after helping package the model and stack into a container that can be deployed directly to Kubernetes.
Behind the Firewall: Addressing the needs of industries (like aircraft maintenance) that require AI to stay strictly on-premises.
- Enterprise AI Infrastructure
Red Hat AI: A commercial product offering tools for model customization, including pre-training, fine-tuning, and RAG (Retrieval-Augmented Generation).
Inference Engines: James highlights the difference between llama.cpp (for smaller, edge-class hardware) and vLLM, which has become the enterprise standard for multi-GPU data center inference.
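
The RAG pattern mentioned above can be sketched in a few lines. This is a minimal illustration of the idea, not Red Hat AI's actual tooling: all function names are invented for this example, and a toy bag-of-words cosine similarity stands in for a real embedding model and vector store.

```python
# Minimal RAG (Retrieval-Augmented Generation) sketch.
# Real systems use an embedding model and vector database for retrieval;
# here a bag-of-words cosine similarity stands in for both.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a term-frequency vector over lowercase word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Augment the user's question with retrieved context before it
    reaches the LLM -- the core idea of RAG."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "vLLM is an inference engine optimized for multi-GPU data centers.",
    "llama.cpp targets smaller and edge-class hardware.",
]
prompt = build_prompt("Which engine suits edge hardware?", docs)
```

The augmented prompt is then sent to the model, which answers from the retrieved context instead of relying only on its training data; this is how the customization tooling described above grounds a model in private, behind-the-firewall documents.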