SLO-aware GPU Frequency Scaling for Energy Efficient LLM Inference Serving https://arxiv.org/abs/2408.05235 #cs.DC #cs.AI #cs.AR #cs.LG