Model Weights
Parameters learned during training that define a model's behavior. Stored in files (e.g., .bin, .pt) and loaded at inference time. Size directly determines compute and memory requirements: roughly parameters × bytes per parameter (e.g., 20B parameters at 16-bit precision ≈ 40 GB).
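The ≈40 GB figure follows from parameters × bytes per parameter. A minimal sketch (the dtype byte sizes are standard; model sizes and precisions shown are illustrative):

```python
# Back-of-envelope weight storage: parameters x bytes-per-parameter.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1, "int4": 0.5}

def weight_size_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate raw weight size in gigabytes for a given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

# A 20B-parameter model at 16-bit precision needs ~40 GB for weights alone;
# 4-bit quantization cuts that to ~10 GB.
print(f"20B @ fp16: {weight_size_gb(20e9, 'fp16'):.0f} GB")
print(f"20B @ int4: {weight_size_gb(20e9, 'int4'):.0f} GB")
```

This only counts the weights themselves; actual inference memory is higher once activations and the KV cache are included.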
Key Characteristics
- Open-weight models (e.g., gpt-oss-20b) publicly share weights while keeping training code proprietary
- Local deployment requires downloading weights (e.g., via the Hugging Face Hub)
- Inference runs on the downloaded weights without a cloud dependency; loading and running involve inference engines, memory mapping, and performance optimization rather than simple file execution (see 2026 04 22 LLM Inference Engines Memory Mapping and Performance Optimization)
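The memory-mapping point above can be illustrated with Python's standard library: the OS pages weight data in on demand rather than copying a multi-GB file into RAM at load time. A toy sketch (the file layout here is illustrative, not a real checkpoint format):

```python
# Sketch of memory-mapped weight access, the technique inference engines use
# to avoid reading an entire weight file into memory up front.
import mmap
import os
import struct
import tempfile

# Write a tiny stand-in "weight file": four float32 values.
weights = [0.5, -1.25, 3.0, 2.75]
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"{len(weights)}f", *weights))

# Map the file; bytes are faulted in lazily as they are touched.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # Read only the third weight (offset 2 * 4 bytes), not the whole file.
        (w2,) = struct.unpack_from("f", mm, offset=2 * 4)
        print(w2)  # -> 3.0
```

The same idea, at scale, is why formats like safetensors and engines like llama.cpp can start serving a model whose weights far exceed the bytes actually read at startup.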
Recent Developments
- Jeredblu demonstrates running gpt-oss-20b (OpenAI’s open-weight LLM) locally (see 2026 04 14 Jeredblu running LLM locally)
Source Notes
- 2026-04-14: Prompt Engineering channel: new RAG multi-modal approach
- 2026-04-14: I Looked At Amazon After They Fired 16,000 Engineers. Their AI Broke Everything.
- 2026-04-14: Julian Goldie SEO channel, GLM 4.7 (https://www.youtube.com/watch?v=uy7F7u8A0jo): "GLM-4.7: Advancing the Coding Capability & Business Automation". GLM-4.7 is the latest [[concepts/open-source|open-source]]… (Julian Goldie SEO channel GLM 4.7)
- 2026-04-22: [[lab-notes/2026-04-22-LLM-Inference-Engines-Memory-Mapping-and-Performance-Optimization|LLM Inference: Engines, Memory Mapping, and Performance Optimization]]
- 2026-04-24: [[lab-notes/2026-04-24-DeepSeek-V4-Next-Gen-Open-Source-LLM-Performance-and-Efficiency-Analysis|DeepSeek V4: Next-Gen Open-Source LLM Performance and Efficiency Analysis]]