Abstract: Transformer-based deep learning models are evolving rapidly as large language models (LLMs) see wider adoption and face growing computational demands.