Showing 1–4 of 4 results for author: Piękos, P

  1. arXiv:2312.07987

    cs.LG cs.CL cs.NE

    SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention

    Authors: Róbert Csordás, Piotr Piękos, Kazuki Irie, Jürgen Schmidhuber

    Abstract: The costly self-attention layers in modern Transformers require memory and compute quadratic in sequence length. Existing approximation methods usually underperform and fail to obtain significant speedups in practice. Here we present SwitchHead - a novel method that reduces both compute and memory requirements and achieves wall-clock speedup, while matching the language modeling performance of bas…

    Submitted 14 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

  2. arXiv:2305.17066

    cs.AI cs.CL cs.CV cs.LG cs.MA

    Mindstorms in Natural Language-Based Societies of Mind

    Authors: Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, et al. (1 additional author not shown)

    Abstract: Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overco…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 9 pages in main text + 7 pages of references + 38 pages of appendices, 14 figures in main text + 13 in appendices, 7 tables in appendices

    MSC Class: 68T07

    ACM Class: I.2.6; I.2.11

  3. arXiv:2206.00702

    cs.AI cs.LG

    Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

    Authors: Michał Zawalski, Michał Tyrolski, Konrad Czechowski, Tomasz Odrzygóźdź, Damian Stachura, Piotr Piękos, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś

    Abstract: Complex reasoning problems contain states that vary in the computational cost required to determine a good action plan. Taking advantage of this property, we propose Adaptive Subgoal Search (AdaSubS), a search method that adaptively adjusts the planning horizon. To this end, AdaSubS generates diverse sets of subgoals at different distances. A verification mechanism is employed to filter out unreac…

    Submitted 25 May, 2024; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: ICLR 2023 (notable-top-5%); website: https://sites.google.com/view/adaptivesubgoalsearch/

    ACM Class: I.2.8; I.2.6

  4. arXiv:2106.03921

    cs.CL cs.AI cs.LG

    Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning

    Authors: Piotr Piękos, Henryk Michalewski, Mateusz Malinowski

    Abstract: Imagine you are in a supermarket. You have two bananas in your basket and want to buy four apples. How many fruits do you have in total? This seemingly straightforward question can be challenging for data-driven language models, even if trained at scale. However, we would expect such generic language models to possess some mathematical abilities in addition to typical linguistic competence. Toward…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: The paper has been accepted to the ACL-IJCNLP 2021 conference

    ACM Class: I.2.7