Showing 1–2 of 2 results for author: Fastowski, A

  1. arXiv:2405.13536 [pdf, other]

    cs.LG cs.AI cs.CL

    Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers

    Authors: Tobias Leemann, Alina Fastowski, Felix Pfeiffer, Gjergji Kasneci

    Abstract: We address the critical challenge of applying feature attribution methods to the transformer architecture, which dominates current applications in natural language processing and beyond. Traditional attribution methods to explainable AI (XAI) explicitly or implicitly rely on linear or additive surrogate models to quantify the impact of input features on a model's output. In this work, we formally…

    Submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2306.00458 [pdf, other]

    cs.CL

    Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity

    Authors: Katharina Hämmerl, Alina Fastowski, Jindřich Libovický, Alexander Fraser

    Abstract: Previous work has shown that the representations output by contextual language models are more anisotropic than static type embeddings, and typically display outlier dimensions. This seems to be true for both monolingual and multilingual models, although much less work has been done on the multilingual context. Why these outliers occur and how they affect the representations is still an active are…

    Submitted 7 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: To appear in ACL Findings 2023. Fixed a citation in this version.