Make your code more readable, concise, and efficient using “einsum”
I had never heard of einsum before and was fascinated to find out more. I also found this additional article useful:
Understanding einsum for Deep learning: implement a transformer with multi-head self-attention from scratch
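For anyone else new to it, here is a minimal NumPy sketch of the idea. It is not taken from either article; the array shapes and variable names are made up for illustration. The subscript string names each axis, and any index that appears in the inputs but not in the output is summed over.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matrix product: the shared index k is summed away, leaving (i, j).
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = np.einsum("ik,kj->ij", A, B)
assert np.allclose(C, A @ B)

# Attention-style scores (hypothetical shapes): batch=2, heads=8,
# tokens=10, head dimension=16. The feature index d is summed away,
# giving one score per (query token i, key token j) pair.
q = rng.standard_normal((2, 8, 10, 16))
k = rng.standard_normal((2, 8, 10, 16))
scores = np.einsum("bhid,bhjd->bhij", q, k)
assert scores.shape == (2, 8, 10, 10)

# The same result without einsum needs an explicit transpose of k.
assert np.allclose(scores, q @ k.transpose(0, 1, 3, 2))
```

The second call is roughly the query-key product that a multi-head self-attention layer computes before scaling and softmax; writing it with einsum avoids having to keep track of which axes to transpose.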