Showing 1–50 of 126 results for author: Galstyan, A

  1. arXiv:2406.17304  [pdf, other]

    cs.CL

    Leveraging LLMs for Dialogue Quality Measurement

    Authors: Jinghan Jia, Abi Komma, Timothy Leffel, Xujun Peng, Ajay Nagesh, Tamer Soliman, Aram Galstyan, Anoop Kumar

    Abstract: In task-oriented conversational AI evaluation, unsupervised methods poorly correlate with human judgments, and supervised approaches lack generalization. Recent advances in large language models (LLMs) show robust zero-shot and few-shot capabilities across NLP tasks. This paper explores using LLMs for automated dialogue quality evaluation, experimenting with various configurations on public and pro…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2402.15833  [pdf, other]

    cs.CL cs.LG

    Prompt Perturbation Consistency Learning for Robust Language Models

    Authors: Yao Qiang, Subhrangshu Nandi, Ninareh Mehrabi, Greg Ver Steeg, Anoop Kumar, Anna Rumshisky, Aram Galstyan

    Abstract: Large language models (LLMs) have demonstrated impressive performance on a number of natural language processing tasks, such as question answering and text summarization. However, their performance on sequence labeling tasks such as intent classification and slot filling (IC-SF), which is a central component in personal assistant systems, lags significantly behind discriminative models. Furthermor…

    Submitted 24 February, 2024; originally announced February 2024.

  3. arXiv:2402.11138  [pdf, other]

    cs.CL cs.AI cs.LG

    Contrastive Instruction Tuning

    Authors: Tianyi Lorena Yan, Fei Wang, James Y. Huang, Wenxuan Zhou, Fan Yin, Aram Galstyan, Wenpeng Yin, Muhao Chen

    Abstract: Instruction tuning has been used as a promising approach to improve the performance of large language models (LLMs) on unseen tasks. However, current LLMs exhibit limited robustness to unseen instructions, generating inconsistent outputs when the same instruction is phrased with slightly varied forms or language styles. This behavior indicates LLMs' lack of robustness to textual variations and gen…

    Submitted 6 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Findings

  4. arXiv:2312.11779  [pdf, other]

    cs.CL cs.AI cs.LG

    Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies

    Authors: Anaelia Ovalle, Ninareh Mehrabi, Palash Goyal, Jwala Dhamala, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Yuval Pinter, Rahul Gupta

    Abstract: Gender-inclusive NLP research has documented the harmful limitations of gender binary-centric large language models (LLMs), such as the inability to correctly use gender-diverse English neopronouns (e.g., xe, zir, fae). While data scarcity is a known culprit, the precise mechanisms through which scarcity affects this behavior remain underexplored. We discover LLM misgendering is significantly influ…

    Submitted 6 April, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted to NAACL 2024 findings

  5. arXiv:2311.09473  [pdf, other]

    cs.AI cs.CL

    JAB: Joint Adversarial Prompting and Belief Augmentation

    Authors: Ninareh Mehrabi, Palash Goyal, Anil Ramakrishna, Jwala Dhamala, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta

    Abstract: With the recent surge of language models in different applications, attention to safety and robustness of these models has gained significant importance. Here we introduce a joint framework in which we simultaneously probe and improve the robustness of a black-box target model via adversarial prompting and belief augmentation using iterative feedback loops. This framework utilizes an automated red…

    Submitted 15 November, 2023; originally announced November 2023.

  6. arXiv:2311.04978  [pdf, other]

    cs.CL

    On the steerability of large language models toward data-driven personas

    Authors: Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta

    Abstract: Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented. Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs, which can be leveraged to produce multiple perspectives and to reflect diverse opinions. Moving beyond the traditional reliance on demographics like a…

    Submitted 2 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

  7. arXiv:2308.04265  [pdf, other]

    cs.AI

    FLIRT: Feedback Loop In-context Red Teaming

    Authors: Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta

    Abstract: Warning: this paper contains content that may be inappropriate or offensive. As generative models become available for public use in various applications, testing and analyzing vulnerabilities of these models has become a priority. Here we propose an automatic red teaming framework that evaluates a given model and exposes its vulnerabilities against unsafe and inappropriate content generation. O…

    Submitted 8 August, 2023; originally announced August 2023.

  8. arXiv:2306.09520  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Ensembled Prediction Intervals for Causal Outcomes Under Hidden Confounding

    Authors: Myrl G. Marmarelis, Greg Ver Steeg, Aram Galstyan, Fred Morstatter

    Abstract: Causal inference of exact individual treatment outcomes in the presence of hidden confounders is rarely possible. Recent work has extended prediction intervals with finite-sample guarantees to partially identifiable causal outcomes, by means of a sensitivity model for hidden confounding. In deep learning, predictors can exploit their inductive biases for better generalization out of sample. We arg…

    Submitted 1 November, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

  9. arXiv:2306.06302  [pdf, other]

    cs.IR cs.LG

    Multi-Task Knowledge Enhancement for Zero-Shot and Multi-Domain Recommendation in an AI Assistant Application

    Authors: Elan Markowitz, Ziyan Jiang, Fan Yang, Xing Fan, Tony Chen, Greg Ver Steeg, Aram Galstyan

    Abstract: Recommender systems have found significant commercial success but still struggle with integrating new users. Since users often interact with content in different domains, it is possible to leverage a user's interactions in previous domains to improve that user's recommendations in a new one (multi-domain recommendation). A separate research thread on knowledge graph enhancement uses external knowl…

    Submitted 9 June, 2023; originally announced June 2023.

  10. arXiv:2306.03984  [pdf, other]

    cs.CL cs.LG

    Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs

    Authors: Abishek Komma, Nagesh Panyam Chandrasekarasastry, Timothy Leffel, Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas, Aram Galstyan

    Abstract: Measurement of interaction quality is a critical task for the improvement of spoken dialog systems. Existing approaches to dialog quality estimation either focus on evaluating the quality of individual turns, or collect dialog-level quality measurements from end users immediately following an interaction. In contrast to these approaches, we introduce a new dialog-level annotation workflow called D…

    Submitted 8 June, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  11. arXiv:2305.19264  [pdf, other]

    cs.CL cs.LG

    Jointly Reparametrized Multi-Layer Adaptation for Efficient and Private Tuning

    Authors: Umang Gupta, Aram Galstyan, Greg Ver Steeg

    Abstract: Efficient finetuning of pretrained language transformers is becoming increasingly prevalent for solving natural language processing tasks. While effective, it can still require a large number of tunable parameters. This can be a drawback for low-resource applications and training with differential-privacy constraints, where excessive noise may be introduced during finetuning. To this end, we propo…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: To appear in the Findings of ACL 2023. Code available at https://github.com/umgupta/jointly-reparametrized-finetuning

  12. arXiv:2305.18675  [pdf, other]

    cs.LG cs.AI cs.CL

    History Repeats: Overcoming Catastrophic Forgetting For Event-Centric Temporal Knowledge Graph Completion

    Authors: Mehrnoosh Mirtaheri, Mohammad Rostami, Aram Galstyan

    Abstract: Temporal knowledge graph (TKG) completion models typically rely on having access to the entire graph during training. However, in real-world scenarios, TKG data is often received incrementally as events unfold, leading to a dynamic non-stationary data distribution over time. While one could incorporate fine-tuning into existing methods to allow them to adapt to evolving TKG data, this can lead to fo…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 14 pages, 6 figures

    ACM Class: I.2.6; I.2.7

  13. arXiv:2305.16597  [pdf, other]

    cs.CL cs.AI cs.LG

    Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models

    Authors: Neal Lawton, Anoop Kumar, Govind Thattai, Aram Galstyan, Greg Ver Steeg

    Abstract: Parameter-efficient tuning (PET) methods fit pre-trained language models (PLMs) to downstream tasks by either computing a small compressed update for a subset of model parameters, or appending and fine-tuning a small number of new model parameters to the pre-trained network. Hand-designed PET architectures from the literature perform well in practice, but have the potential to be improved via auto…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 8 pages, 3 figures, ACL 2023

    ACM Class: I.2.7

  14. arXiv:2305.16585  [pdf, other]

    cs.CL

    ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation

    Authors: Kuan-Hao Huang, Varun Iyer, I-Hung Hsu, Anoop Kumar, Kai-Wei Chang, Aram Galstyan

    Abstract: Paraphrase generation is a long-standing task in natural language processing (NLP). Supervised paraphrase generation models, which rely on human-annotated paraphrase pairs, are cost-inefficient and hard to scale up. On the other hand, automatically annotated paraphrase pairs (e.g., by machine back-translation) usually suffer from a lack of syntactic diversity -- the generated paraphrase sentenc…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  15. arXiv:2305.14449  [pdf, other]

    cs.AI cs.IR cs.LG

    Graph Meets LLM: A Novel Approach to Collaborative Filtering for Robust Conversational Understanding

    Authors: Zheng Chen, Ziyan Jiang, Fan Yang, Eunah Cho, Xing Fan, Xiaojiang Huang, Yanbin Lu, Aram Galstyan

    Abstract: Conversational AI systems such as Alexa need to understand defective queries to ensure robust conversational understanding and reduce user friction. These defective queries often arise from user ambiguities, mistakes, or errors in automatic speech recognition (ASR) and natural language understanding (NLU). Personalized query rewriting is an approach that focuses on reducing defects in queries by…

    Submitted 19 June, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    ACM Class: F.2.2; I.2.7

  16. arXiv:2305.10625  [pdf, other]

    cs.LG

    Measuring and Mitigating Local Instability in Deep Neural Networks

    Authors: Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie, Anoop Kumar, Aram Galstyan

    Abstract: Deep Neural Networks (DNNs) are becoming integral components of real-world services relied upon by millions of users. Unfortunately, architects of these systems can find it difficult to ensure reliable performance as irrelevant details like random initialization can unexpectedly change the outputs of a trained system with potentially disastrous consequences. We formulate the model stability proble…

    Submitted 18 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: To be published in Findings of the Association for Computational Linguistics (ACL), 2023

  17. arXiv:2305.09941  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

    Authors: Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta

    Abstract: Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB i…

    Submitted 1 June, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    ACM Class: I.2; I.7; K.4

    Journal ref: 2023 ACM Conference on Fairness, Accountability, and Transparency

  18. arXiv:2305.07797  [pdf, other]

    cs.CL

    ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems

    Authors: Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, Nanyun Peng

    Abstract: Commonsense reasoning is omnipresent in human communications and thus is an important feature for open-domain dialogue systems. However, evaluating commonsense in dialogue systems is still an open challenge. We take the first step by focusing on event commonsense that considers events and their relations, and is crucial in both dialogues and general commonsense reasoning. We propose ACCENT, an eve…

    Submitted 3 November, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  19. arXiv:2211.12503  [pdf, other]

    cs.CL cs.CV cs.LG cs.MM

    Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

    Authors: Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Rahul Gupta

    Abstract: Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambiguities effectively by asking clarifying questions and/or relying on contextual cues and common-sense knowledge, resolving ambiguities can be notoriously hard for machines. In this work, we study ambiguities that arise in text-to-image generative models. We curate a benc…

    Submitted 17 November, 2022; originally announced November 2022.

  20. arXiv:2211.00881  [pdf, other]

    cs.CL

    Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations

    Authors: Kuan-Hao Huang, Varun Iyer, Anoop Kumar, Sriram Venkatapathy, Kai-Wei Chang, Aram Galstyan

    Abstract: Syntactically controlled paraphrase generation has become an emerging research direction in recent years. Most existing approaches require annotated paraphrase pairs for training and are thus costly to extend to new domains. Unsupervised approaches, on the other hand, do not need paraphrase pairs but suffer from relatively poor performance in terms of syntactic control and quality of generated par…

    Submitted 2 November, 2022; originally announced November 2022.

    Comments: Paper accepted by EMNLP 2022 Findings. The first two authors contribute equally

  21. arXiv:2210.03826  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    An Analysis of the Effects of Decoding Algorithms on Fairness in Open-Ended Language Generation

    Authors: Jwala Dhamala, Varun Kumar, Rahul Gupta, Kai-Wei Chang, Aram Galstyan

    Abstract: Several prior works have shown that language models (LMs) can generate text containing harmful social biases and stereotypes. While decoding algorithms play a central role in determining properties of LM-generated text, their impact on the fairness of the generations has not been studied. We present a systematic analysis of the impact of decoding algorithms on LM fairness, and analyze the trade-of…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted at IEEE SLT 2022

  22. Formal limitations of sample-wise information-theoretic generalization bounds

    Authors: Hrayr Harutyunyan, Greg Ver Steeg, Aram Galstyan

    Abstract: Some of the tightest information-theoretic generalization bounds depend on the average information between the learned hypothesis and a single training example. However, these sample-wise bounds were derived only for the expected generalization gap. We show that even for the expected squared generalization gap no such sample-wise information-theoretic bounds exist. The same is true for PAC-Bayes and singl…

    Submitted 13 December, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

    Comments: 2022 IEEE Information Theory Workshop

  23. arXiv:2205.02392  [pdf, other]

    cs.CL cs.AI

    Robust Conversational Agents against Imperceptible Toxicity Triggers

    Authors: Ninareh Mehrabi, Ahmad Beirami, Fred Morstatter, Aram Galstyan

    Abstract: Warning: this paper contains content that may be offensive or upsetting. Recent research in Natural Language Processing (NLP) has advanced the development of various toxicity detection models with the intention of identifying and mitigating toxic language from existing systems. Despite the abundance of research in this area, less attention has been given to adversarial attacks that force the system…

    Submitted 4 May, 2022; originally announced May 2022.

  24. arXiv:2204.11206  [pdf, other]

    stat.ME cs.LG stat.ML

    Partial Identification of Dose Responses with Hidden Confounders

    Authors: Myrl G. Marmarelis, Elizabeth Haddad, Andrew Jesson, Neda Jahanshad, Aram Galstyan, Greg Ver Steeg

    Abstract: Inferring causal effects of continuous-valued treatments from observational data is a crucial task promising to better inform policy- and decision-makers. A critical assumption needed to identify these effects is that all confounding variables -- causal parents of both the treatment and the outcome -- are included as covariates. Unfortunately, given observational data alone, we cannot know with ce…

    Submitted 12 June, 2023; v1 submitted 24 April, 2022; originally announced April 2022.

  25. arXiv:2203.13928  [pdf, other]

    cs.CL

    On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations

    Authors: Yang Trista Cao, Yada Pruksachatkun, Kai-Wei Chang, Rahul Gupta, Varun Kumar, Jwala Dhamala, Aram Galstyan

    Abstract: Multiple metrics have been introduced to measure fairness in various natural language processing tasks. These metrics can be roughly categorized into two categories: 1) extrinsic metrics for evaluating fairness in downstream applications and 2) intrinsic metrics for estimating fairness in upstream contextualized language representation models. In this paper, we conduct an extensive c…

    Submitted 25 March, 2022; originally announced March 2022.

    Journal ref: ACL 2022

  26. arXiv:2203.12574  [pdf, other]

    cs.CL cs.LG

    Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal

    Authors: Umang Gupta, Jwala Dhamala, Varun Kumar, Apurv Verma, Yada Pruksachatkun, Satyapriya Krishna, Rahul Gupta, Kai-Wei Chang, Greg Ver Steeg, Aram Galstyan

    Abstract: Language models excel at generating coherent text, and model compression techniques such as knowledge distillation have enabled their use in resource-constrained settings. However, these models can be biased in multiple ways, including the unfounded association of male and female genders with gender-neutral professions. Therefore, knowledge distillation without any fairness constraints may preserv…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: To appear in the Findings of ACL 2022

  27. arXiv:2203.10204  [pdf, other]

    cond-mat.mtrl-sci cond-mat.dis-nn cs.CV cs.LG

    Inferring topological transitions in pattern-forming processes with self-supervised learning

    Authors: Marcin Abram, Keith Burghardt, Greg Ver Steeg, Aram Galstyan, Remi Dingreville

    Abstract: The identification and classification of transitions in topological and microstructural regimes in pattern-forming processes are critical for understanding and fabricating microstructurally precise novel materials in many application domains. Unfortunately, relevant microstructure transitions may depend on process parameters in subtle and complex ways that are not captured by the classic theory of…

    Submitted 10 August, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: 17 pages, 6 figures, 8 pages of supplementary information

    ACM Class: I.2.6; I.4.7; I.5.4; I.6.m; J.2

  28. arXiv:2203.09711  [pdf, other]

    cs.CL

    DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations

    Authors: Sarik Ghazarian, Nuan Wen, Aram Galstyan, Nanyun Peng

    Abstract: Automatic evaluation metrics are essential for the rapid development of open-domain dialogue systems as they facilitate hyper-parameter tuning and comparison between models. Although recently proposed trainable conversation-level metrics have shown encouraging results, the quality of the metrics is strongly dependent on the quality of training data. Prior works mainly resort to heuristic text-leve…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Association for Computational Linguistics (ACL 2022)

  29. arXiv:2111.13733  [pdf, other]

    cs.LG

    Failure Modes of Domain Generalization Algorithms

    Authors: Tigran Galstyan, Hrayr Harutyunyan, Hrant Khachatrian, Greg Ver Steeg, Aram Galstyan

    Abstract: Domain generalization algorithms use training data from multiple domains to learn models that generalize well to unseen domains. While recently proposed benchmarks demonstrate that most of the existing algorithms do not outperform simple baselines, the established evaluation methods fail to expose the impact of various factors that contribute to the poor performance. In this paper we propose an ev…

    Submitted 26 November, 2021; originally announced November 2021.

  30. arXiv:2111.06312  [pdf, other]

    cs.LG cs.AI cs.MS cs.SI

    Implicit SVD for Graph Representation Learning

    Authors: Sami Abu-El-Haija, Hesham Mostafa, Marcel Nassar, Valentino Crespi, Greg Ver Steeg, Aram Galstyan

    Abstract: Recent improvements in the performance of state-of-the-art (SOTA) methods for Graph Representational Learning (GRL) have come at the cost of significant computational resource requirements for training, e.g., for calculating gradients via backprop over many data epochs. Meanwhile, Singular Value Decomposition (SVD) can find closed-form solutions to convex problems, using merely a handful of epochs…

    Submitted 11 November, 2021; originally announced November 2021.

    Journal ref: Advances in Neural Information Processing Systems (NeurIPS) 2021

  31. arXiv:2111.02434  [pdf, other]

    cs.LG physics.comp-ph

    Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

    Authors: Greg Ver Steeg, Aram Galstyan

    Abstract: Sampling from an unnormalized probability distribution is a fundamental problem in machine learning with applications including Bayesian modeling, latent factor inference, and energy-based model training. After decades of research, variations of MCMC remain the default approach to sampling despite slow convergence. Auxiliary neural models can learn to speed up MCMC, but the overhead for training t…

    Submitted 29 December, 2021; v1 submitted 3 November, 2021; originally announced November 2021.

    Comments: 31 pages, 19 figures. Advances in Neural Information Processing Systems (NeurIPS), 2021. Animations at https://sites.google.com/view/esh-dynamics/home, code at https://github.com/gregversteeg/esh_dynamics

  32. arXiv:2110.04662  [pdf, other]

    cs.LG cs.AI

    Cognitively Inspired Learning of Incremental Drifting Concepts

    Authors: Mohammad Rostami, Aram Galstyan

    Abstract: Humans continually expand their learned knowledge to new domains and learn new concepts without any interference with past learned experiences. In contrast, machine learning models perform poorly in a continual learning setting, where input data distribution changes over time. Inspired by the nervous system learning mechanisms, we develop a computational model that enables a deep neural network to…

    Submitted 21 April, 2023; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: 2023 International Joint Conference on Artificial Intelligence

  33. arXiv:2110.01584  [pdf, other]

    cs.LG stat.ML

    Information-theoretic generalization bounds for black-box learning algorithms

    Authors: Hrayr Harutyunyan, Maxim Raginsky, Greg Ver Steeg, Aram Galstyan

    Abstract: We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms…

    Submitted 5 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021

  34. arXiv:2109.07607  [pdf, other]

    cs.CV

    Partner-Assisted Learning for Few-Shot Image Classification

    Authors: Jiawei Ma, Hanchen Xie, Guangxing Han, Shih-Fu Chang, Aram Galstyan, Wael Abd-Almageed

    Abstract: Few-shot learning has been studied to mimic human visual capabilities and learn effective models without the need for exhaustive human annotation. Even though the idea of meta-learning for adaptation has dominated the few-shot learning methods, how to train a feature extractor is still a challenge. In this paper, we focus on the design of training strategy to obtain an elemental representation such…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: ICCV2021 Camera Ready

  35. arXiv:2109.03952  [pdf, other]

    cs.AI

    Attributing Fair Decisions with Attention Interventions

    Authors: Ninareh Mehrabi, Umang Gupta, Fred Morstatter, Greg Ver Steeg, Aram Galstyan

    Abstract: The widespread use of Artificial Intelligence (AI) in consequential domains, such as healthcare and parole decision-making systems, has drawn intense scrutiny on the fairness of these methods. However, ensuring fairness is often insufficient as the rationale for a contentious decision needs to be audited, understood, and defended. We propose that the attention mechanism can be used to ensure fair…

    Submitted 8 September, 2021; originally announced September 2021.

  36. arXiv:2107.01598  [pdf, other]

    cs.CL cs.LG

    Domain Adaptation for Sentiment Analysis Using Increased Intraclass Separation

    Authors: Mohammad Rostami, Aram Galstyan

    Abstract: Sentiment analysis is a costly yet necessary task for enterprises to study the opinions of their customers to improve their products and to determine optimal marketing strategies. Due to the existence of a wide range of domains across different products and services, cross-domain sentiment analysis methods have received significant attention. These methods mitigate the domain gap between different…

    Submitted 4 July, 2021; originally announced July 2021.

  37. arXiv:2107.00745  [pdf, other]

    cs.LG cs.AI stat.ML

    q-Paths: Generalizing the Geometric Annealing Path using Power Means

    Authors: Vaden Masrani, Rob Brekelmans, Thang Bui, Frank Nielsen, Aram Galstyan, Greg Ver Steeg, Frank Wood

    Abstract: Many common machine learning methods involve the geometric annealing path, a sequence of intermediate densities between two distributions of interest constructed using the geometric average. While alternatives such as the moment-averaging path have demonstrated performance gains in some settings, their practical applicability remains limited by exponential family endpoint assumptions and a lack of…

    Submitted 1 July, 2021; originally announced July 2021.

    Comments: arXiv admin note: text overlap with arXiv:2012.07823

  38. arXiv:2104.10232  [pdf, other]

    cs.CR

    Identifying botnet IP address clusters using natural language processing techniques on honeypot command logs

    Authors: Valentino Crespi, Wes Hardaker, Sami Abu-El-Haija, Aram Galstyan

    Abstract: Computer security has been plagued by increasingly formidable, dynamic, hard-to-detect, hard-to-predict, and hard-to-characterize hacking techniques. Such techniques are very often deployed in self-propagating worms capable of automatically infecting vulnerable computer systems and then building large bot networks, which are then used to launch coordinated attacks on designated targets. In this work…

    Submitted 20 April, 2021; originally announced April 2021.

  39. arXiv:2104.05801  [pdf, other]

    cs.CL cs.LG

    Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation

    Authors: Sarik Ghazarian, Zixi Liu, Akash SM, Ralph Weischedel, Aram Galstyan, Nanyun Peng

    Abstract: With the recent advances in open-domain story generation, the lack of reliable automatic evaluation metrics has become an increasingly pressing issue that hinders the fast development of story generation. According to research in this regard, learnable evaluation metrics have promised more accurate assessments by having higher correlations with human judgments. A critical bottleneck of…

    Submitted 25 May, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: NAACL 2021

  40. arXiv:2103.11320  [pdf, other]

    cs.CL

    Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources

    Authors: Ninareh Mehrabi, Pei Zhou, Fred Morstatter, Jay Pujara, Xiang Ren, Aram Galstyan

    Abstract: Warning: this paper contains content that may be offensive or upsetting. Numerous natural language processing models have tried injecting commonsense by using the ConceptNet knowledge base to improve performance on different tasks. ConceptNet, however, is mostly crowdsourced from humans and may reflect human biases such as "lawyers are dishonest." It is important that these biases are not confla…

    Submitted 10 September, 2021; v1 submitted 21 March, 2021; originally announced March 2021.

  41. arXiv:2102.08530  [pdf, other]

    cs.LG cs.MS cs.SI

    Fast Graph Learning with Unique Optimal Solutions

    Authors: Sami Abu-El-Haija, Valentino Crespi, Greg Ver Steeg, Aram Galstyan

    Abstract: We consider two popular Graph Representation Learning (GRL) methods: message passing for node classification and network embedding for link prediction. For each, we pick a popular model that we: (i) linearize and (ii) switch its training objective to Frobenius norm error minimization. These simplifications can cast the training into finding the optimal parameters in closed-form. We program in…

    Submitted 22 April, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Journal ref: ICLR 2021 Workshop on Geometrical and Topological Representation Learning

  42. arXiv:2102.04350  [pdf, other]

    cs.LG

    Graph Traversal with Tensor Functionals: A Meta-Algorithm for Scalable Learning

    Authors: Elan Markowitz, Keshav Balasubramanian, Mehrnoosh Mirtaheri, Sami Abu-El-Haija, Bryan Perozzi, Greg Ver Steeg, Aram Galstyan

    Abstract: Graph Representation Learning (GRL) methods have impacted fields from chemistry to social science. However, their algorithmic implementations are specialized to specific use-cases, e.g., message passing methods are run differently from node embedding ones. Despite their apparent differences, all these methods utilize the graph structure, and therefore, their learning can be approximated with stochast…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: To appear in ICLR 2021

  43. arXiv:2102.02191  [pdf, other]

    cs.CL

    DiSCoL: Toward Engaging Dialogue Systems through Conversational Line Guided Response Generation

    Authors: Sarik Ghazarian, Zixi Liu, Tuhin Chakrabarty, Xuezhe Ma, Aram Galstyan, Nanyun Peng

    Abstract: Having engaging and informative conversations with users is the utmost goal for open-domain conversational systems. Recent advances in transformer-based language models and their applications to dialogue systems have succeeded in generating fluent and human-like responses. However, they still lack control over the generation process towards producing contentful responses and achieving engaging conve…

    Submitted 3 February, 2021; originally announced February 2021.

  44. arXiv:2012.15480  [pdf, other]

    cs.LG cs.IT stat.ML

    Likelihood Ratio Exponential Families

    Authors: Rob Brekelmans, Frank Nielsen, Alireza Makhzani, Aram Galstyan, Greg Ver Steeg

    Abstract: The exponential family is well known in machine learning and statistical physics as the maximum entropy distribution subject to a set of observed constraints, while the geometric mixture path is common in MCMC methods such as annealed importance sampling. Linking these two ideas, recent work has interpreted the geometric mixture path as an exponential family of distributions to analyze the thermod…

    Submitted 15 January, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: NeurIPS Workshop on Deep Learning through Information Geometry

  45. arXiv:2012.08723  [pdf, other]

    cs.LG cs.AI cs.CR

    Exacerbating Algorithmic Bias through Fairness Attacks

    Authors: Ninareh Mehrabi, Muhammad Naveed, Fred Morstatter, Aram Galstyan

    Abstract: Algorithmic fairness has attracted significant attention in recent years, with many quantitative measures suggested for characterizing the fairness of different machine learning algorithms. Despite this interest, the robustness of those fairness measures with respect to an intentional adversarial attack has not been properly addressed. Indeed, most adversarial machine learning has focused on the i…

    Submitted 15 December, 2020; originally announced December 2020.

  46. arXiv:2012.07823  [pdf, other]

    cs.LG

    Annealed Importance Sampling with q-Paths

    Authors: Rob Brekelmans, Vaden Masrani, Thang Bui, Frank Wood, Aram Galstyan, Greg Ver Steeg, Frank Nielsen

    Abstract: Annealed importance sampling (AIS) is the gold standard for estimating partition functions or marginal likelihoods, corresponding to importance sampling over a path of distributions between a tractable base and an unnormalized target. While AIS yields an unbiased estimator for any path, existing literature has been primarily limited to the geometric mixture or moment-averaged paths associated with…

    Submitted 14 December, 2020; originally announced December 2020.

    Comments: NeurIPS Workshop on Deep Learning through Information Geometry (Best Paper Award)

    Journal ref: Published at UAI 2021 https://arxiv.org/abs/2107.00745

  47. arXiv:2012.00150  [pdf, other]

    cs.LG cs.CV

    MUSCLE: Strengthening Semi-Supervised Learning Via Concurrent Unsupervised Learning Using Mutual Information Maximization

    Authors: Hanchen Xie, Mohamed E. Hussein, Aram Galstyan, Wael Abd-Almageed

    Abstract: Deep neural networks are powerful, massively parameterized machine learning models that have been shown to perform well in supervised learning tasks. However, very large amounts of labeled data are usually needed to train deep neural networks. Several semi-supervised learning approaches have been proposed to train neural networks using smaller amounts of labeled data with a large amount of unlabel…

    Submitted 30 November, 2020; originally announced December 2020.

    Comments: 10 pages, 3 figures, Accepted to WACV2021

  48. arXiv:2010.12144  [pdf, other]

    cs.LG cs.AI

    One-shot Learning for Temporal Knowledge Graphs

    Authors: Mehrnoosh Mirtaheri, Mohammad Rostami, Xiang Ren, Fred Morstatter, Aram Galstyan

    Abstract: Most real-world knowledge graphs are characterized by a long-tail relation frequency distribution where a significant fraction of relations occurs only a handful of times. This observation has given rise to recent interest in low-shot learning methods that are able to generalize from only a few examples. The existing approaches, however, are tailored to static knowledge graphs and not easily gener…

    Submitted 22 October, 2020; originally announced October 2020.

  49. arXiv:2009.01966  [pdf, other]

    cs.HC cs.LG

    Leveraging Clickstream Trajectories to Reveal Low-Quality Workers in Crowdsourced Forecasting Platforms

    Authors: Akira Matsui, Emilio Ferrara, Fred Morstatter, Andres Abeliuk, Aram Galstyan

    Abstract: Crowdwork often entails tackling cognitively-demanding and time-consuming tasks. Crowdsourcing can be used for complex annotation tasks, from medical imaging to geospatial data, and such data powers sensitive applications, such as health diagnostics or autonomous driving. However, the existence and prevalence of underperforming crowdworkers is well-recognized, and can pose a threat to the validity…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: 12 pages, 8 figures

  50. arXiv:2007.14917  [pdf, other]

    cs.LG stat.ML

    Compressing Deep Neural Networks via Layer Fusion

    Authors: James O' Neill, Greg Ver Steeg, Aram Galstyan

    Abstract: This paper proposes layer fusion, a model compression technique that discovers which weights to combine and then fuses weights of similar fully-connected, convolutional and attention layers. Layer fusion can significantly reduce the number of layers of the original network with little additional computation overhead, while maintaining competitive performance. From experiments on CIFAR-10…

    Submitted 29 July, 2020; originally announced July 2020.