To experiment and cut down on token usage, I have been running a few open-weight LLMs (e.g., llama3.2 for text and qwen3-coder:30b for code) locally on a MacBook via ollama within VS Code. I noticed that long-context tasks (e.g., analyzing a complex codebase and generating a new feature) can take a long time and a lot of compute, eventually heating the device to a hazardous degree, especially when the ambient room temperature is above 90°F (about 32°C).
https://www.instagram.com/p/DaXY3tDj8nM/
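
To put rough numbers on how long a given prompt takes, here is a minimal sketch that times a single request against a local ollama server. It assumes the server is listening on its default address (http://localhost:11434) and uses the `/api/generate` endpoint with streaming turned off. The helper names (`build_payload`, `summarize_timing`, `run_prompt`) are my own for illustration and are not part of ollama. The optional `num_ctx` value is just one knob to try when comparing short and long contexts, not a recommended setting.

```python
import json
import urllib.request

# Assumption: ollama is serving its default local REST endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt, num_ctx=None):
    """Build a non-streaming /api/generate request body (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if num_ctx is not None:
        # num_ctx sets the context window size in tokens.
        payload["options"] = {"num_ctx": num_ctx}
    return payload


def summarize_timing(result):
    """Convert the nanosecond timing fields of a response to seconds and tokens/s."""
    total_s = result.get("total_duration", 0) / 1e9
    eval_s = result.get("eval_duration", 0) / 1e9
    eval_count = result.get("eval_count", 0)
    return {
        "prompt_tokens": result.get("prompt_eval_count", 0),
        "output_tokens": eval_count,
        "total_seconds": total_s,
        # Guard against a missing or zero duration.
        "tokens_per_second": eval_count / eval_s if eval_s > 0 else 0.0,
    }


def run_prompt(model, prompt, num_ctx=None, url=OLLAMA_URL):
    """Send one prompt to the local server and return (text, timing summary)."""
    body = json.dumps(build_payload(model, prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return result["response"], summarize_timing(result)
```

With the server running, something like `text, stats = run_prompt("llama3.2", "Explain what a mutex is.", num_ctx=4096)` followed by `print(stats)` should show the prompt size, total time, and generation speed for that one request, which makes it easier to compare a short prompt against a long-context one.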


