8bit.tr Journal
Optimization
3 articles tagged with Optimization.
December 21, 2025
Context Window Allocation: Budgeting Tokens for Maximum Signal
How to divide a context window's token budget among system prompts, memory, and retrieval to maximize model performance.
December 20, 2025
RLHF and Preference Optimization: Aligning LLMs With Real Users
A deep dive into RLHF pipelines, preference data, and practical alignment strategies for production LLMs.
December 16, 2025
Speculative Decoding and Fast Inference: Making LLMs Feel Instant
A technical guide to speculative decoding, draft models, and system-level tricks that cut latency without sacrificing quality.