Hardware-aware LLM training and inference