-
Learning to Understand: Identifying Interactions via the Möbius Transform
Authors:
Justin S. Kang,
Yigit E. Erginbas,
Landon Butler,
Ramtin Pedarsani,
Kannan Ramchandran
Abstract:
One of the key challenges in machine learning is to find interpretable representations of learned functions. The Möbius transform is essential for this purpose, as its coefficients correspond to unique importance scores for sets of input variables. This transform is closely related to widely used game-theoretic notions of importance like the Shapley and Bhanzaf value, but it also captures crucial…
▽ More
One of the key challenges in machine learning is to find interpretable representations of learned functions. The Möbius transform is essential for this purpose, as its coefficients correspond to unique importance scores for sets of input variables. This transform is closely related to widely used game-theoretic notions of importance like the Shapley and Bhanzaf value, but it also captures crucial higher-order interactions. Although computing the obius Transform of a function with $n$ inputs involves $2^n$ coefficients, it becomes tractable when the function is sparse and of low-degree as we show is the case for many real-world functions. Under these conditions, the complexity of the transform computation is significantly reduced. When there are $K$ non-zero coefficients, our algorithm recovers the Möbius transform in $O(Kn)$ samples and $O(Kn^2)$ time asymptotically under certain assumptions, the first non-adaptive algorithm to do so. We also uncover a surprising connection between group testing and the Möbius transform. For functions where all interactions involve at most $t$ inputs, we use group testing results to compute the Möbius transform with $O(Kt\log n)$ sample complexity and $O(K\mathrm{poly}(n))$ time. A robust version of this algorithm withstands noise and maintains this complexity. This marks the first $n$ sub-linear query complexity, noise-tolerant algorithm for the Möbius transform. In several examples, we observe that representations generated via sparse Möbius transform are up to twice as faithful to the original function, as compared to Shaply and Banzhaf values, while using the same number of terms.
△ Less
Submitted 15 June, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
-
Efficiently Computing Sparse Fourier Transforms of $q$-ary Functions
Authors:
Yigit Efe Erginbas,
Justin Singh Kang,
Amirali Aghazadeh,
Kannan Ramchandran
Abstract:
Fourier transformations of pseudo-Boolean functions are popular tools for analyzing functions of binary sequences. Real-world functions often have structures that manifest in a sparse Fourier transform, and previous works have shown that under the assumption of sparsity the transform can be computed efficiently. But what if we want to compute the Fourier transform of functions defined over a $q$-a…
▽ More
Fourier transformations of pseudo-Boolean functions are popular tools for analyzing functions of binary sequences. Real-world functions often have structures that manifest in a sparse Fourier transform, and previous works have shown that under the assumption of sparsity the transform can be computed efficiently. But what if we want to compute the Fourier transform of functions defined over a $q$-ary alphabet? These types of functions arise naturally in many areas including biology. A typical workaround is to encode the $q$-ary sequence in binary, however, this approach is computationally inefficient and fundamentally incompatible with the existing sparse Fourier transform techniques. Herein, we develop a sparse Fourier transform algorithm specifically for $q$-ary functions of length $n$ sequences, dubbed $q$-SFT, which provably computes an $S$-sparse transform with vanishing error as $q^n \rightarrow \infty$ in $O(Sn)$ function evaluations and $O(S n^2 \log q)$ computations, where $S = q^{nδ}$ for some $δ< 1$. Under certain assumptions, we show that for fixed $q$, a robust version of $q$-SFT has a sample complexity of $O(Sn^2)$ and a computational complexity of $O(Sn^3)$ with the same asymptotic guarantees. We present numerical simulations on synthetic and real-world RNA data, demonstrating the scalability of $q$-SFT to massively high dimensional $q$-ary functions.
△ Less
Submitted 15 January, 2023;
originally announced January 2023.
-
Minimum Feedback for Collision-Free Scheduling in Massive Random Access
Authors:
Justin Singh Kang,
Wei Yu
Abstract:
Consider a massive random access scenario in which a small set of $k$ active users out of a large number of $n$ potential users need to be scheduled in $b\ge k$ slots. What is the minimum common feedback to the users needed to ensure that scheduling is collision-free? Instead of a naive scheme of listing the indices of the $k$ active users in the order in which they should transmit, at a cost of…
▽ More
Consider a massive random access scenario in which a small set of $k$ active users out of a large number of $n$ potential users need to be scheduled in $b\ge k$ slots. What is the minimum common feedback to the users needed to ensure that scheduling is collision-free? Instead of a naive scheme of listing the indices of the $k$ active users in the order in which they should transmit, at a cost of $k\log(n)$ bits, this paper shows that for the case of $b=k$, the rate of the minimum fixed-length common feedback code scales only as $k \log(e)$ bits, plus an additive term that scales in $n$ as $Θ\left(\log \log(n) \right)$ for fixed $k$. If a variable-length code can be used, assuming uniform activity among the users, the minimum average common feedback rate still requires $k \log(e)$ bits, but the dependence on $n$ can be reduced to $O(1)$. When $b>k$, the number of feedback bits needed for collision-free scheduling can be significantly further reduced. Moreover, a similar scaling on the minimum feedback rate is derived for the case of scheduling $m$ users per slot, when $k \le mb$. The problem of constructing a minimum collision-free feedback scheduling code is connected to that of constructing a perfect hashing family, which allows practical feedback scheduling codes to be constructed from perfect hashing algorithms.
△ Less
Submitted 20 September, 2021; v1 submitted 30 July, 2020;
originally announced July 2020.
-
A Cross-Domain Transferable Neural Coherence Model
Authors:
Peng Xu,
Hamidreza Saghir,
Jin Sung Kang,
Teng Long,
Avishek Joey Bose,
Yanshuai Cao,
Jackie Chi Kit Cheung
Abstract:
Coherence is an important aspect of text quality and is crucial for ensuring its readability. One important limitation of existing coherence models is that training on one domain does not easily generalize to unseen categories of text. Previous work advocates for generative models for cross-domain generalization, because for discriminative models, the space of incoherent sentence orderings to disc…
▽ More
Coherence is an important aspect of text quality and is crucial for ensuring its readability. One important limitation of existing coherence models is that training on one domain does not easily generalize to unseen categories of text. Previous work advocates for generative models for cross-domain generalization, because for discriminative models, the space of incoherent sentence orderings to discriminate against during training is prohibitively large. In this work, we propose a local discriminative neural model with a much smaller negative sampling space that can efficiently learn against incorrect orderings. The proposed coherence model is simple in structure, yet it significantly outperforms previous state-of-art methods on a standard benchmark dataset on the Wall Street Journal corpus, as well as in multiple new challenging settings of transfer to unseen categories of discourse on Wikipedia articles.
△ Less
Submitted 9 July, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
PoMo: Generating Entity-Specific Post-Modifiers in Context
Authors:
Jun Seok Kang,
Robert L. Logan IV,
Zewei Chu,
Yang Chen,
Dheeru Dua,
Kevin Gimpel,
Sameer Singh,
Niranjan Balasubramanian
Abstract:
We introduce entity post-modifier generation as an instance of a collaborative writing task. Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity. For example, for the sentence, "Barack Obama, _______, supported the #MeToo movement.", the phrase "a father of two girls" is a contextually…
▽ More
We introduce entity post-modifier generation as an instance of a collaborative writing task. Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity. For example, for the sentence, "Barack Obama, _______, supported the #MeToo movement.", the phrase "a father of two girls" is a contextually relevant post-modifier. To this end, we build PoMo, a post-modifier dataset created automatically from news articles reflecting a journalistic need for incorporating entity information that is relevant to a particular news event. PoMo consists of more than 231K sentences with post-modifiers and associated facts extracted from Wikidata for around 57K unique entities. We use crowdsourcing to show that modeling contextual relevance is necessary for accurate post-modifier generation. We adapt a number of existing generation approaches as baselines for this dataset. Our results show there is large room for improvement in terms of both identifying relevant facts to include (knowing which claims are relevant gives a >20% improvement in BLEU score), and generating appropriate post-modifier text for the context (providing relevant claims is not sufficient for accurate generation). We conduct an error analysis that suggests promising directions for future research.
△ Less
Submitted 8 April, 2019; v1 submitted 5 April, 2019;
originally announced April 2019.