Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden
July 27, 2024