AMD interview question

How does the self-attention layer work in transformers?
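To make the question concrete, here is a minimal sketch of what is usually meant by a single-head self-attention layer: each token is projected to a query, key, and value; query-key dot products are scaled by the square root of the key dimension and passed through a softmax; the resulting weights mix the value vectors. The function names (`matmul`, `softmax`, `self_attention`), the projection matrices `w_q`, `w_k`, `w_v`, and the toy inputs are all illustrative assumptions, not something from the interview. Masking, multiple heads, and the output projection are left out.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x is n_tokens x d_model; w_q, w_k, w_v are d_model x d_k projections.
    Returns (output, weights), where weights is n_tokens x n_tokens.
    """
    q = matmul(x, w_q)  # queries: what each token is looking for
    k = matmul(x, w_k)  # keys: what each token offers for matching
    v = matmul(x, w_v)  # values: the content that gets mixed
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    # Scale the dot products by sqrt(d_k) before the softmax.
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    # Each row of weights sums to 1: how much one token attends to every token.
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


# Toy input: 3 tokens with d_model = 2, identity projections (assumed values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out, weights = self_attention(x, identity, identity, identity)
```

Every row of `weights` is a probability distribution over the input tokens, and each row of `out` is the corresponding weighted average of the value vectors, so every output token can draw on every other token in the sequence.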