Showing 1–50 of 65 results for author: Bouneffouf, D

  1. arXiv:2405.05060  [pdf, other]

    cs.CL

    Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models

    Authors: Aylin Gunal, Baihan Lin, Djallel Bouneffouf

    Abstract: Given the increasing demand for mental health assistance, artificial intelligence (AI), particularly large language models (LLMs), may be valuable for integration into automated clinical support systems. In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. The architecture is utilized fo…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 5 pages excluding references, 3 figures; accepted at Clinical NLP Workshop @ NAACL 2024

  2. arXiv:2403.12805  [pdf, other]

    cs.AI cs.CL

    Contextual Moral Value Alignment Through Context-Based Aggregation

    Authors: Pierre Dognin, Jesus Rios, Ronny Luss, Inkit Padhi, Matthew D Riemer, Miao Liu, Prasanna Sattigeri, Manish Nagireddy, Kush R. Varshney, Djallel Bouneffouf

    Abstract: Developing value-aligned AI agents is a complex undertaking and an ongoing challenge in the field of AI. Specifically within the domain of Large Language Models (LLMs), the capability to consolidate multiple independently trained dialogue agents, each aligned with a distinct moral value, into a unified system that can adapt to and be aligned with multiple moral values is of paramount importance. I…

    Submitted 19 March, 2024; originally announced March 2024.

  3. arXiv:2403.09704  [pdf, other]

    cs.CL cs.AI cs.LG

    Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

    Authors: Swapnaja Achintalwar, Ioana Baldini, Djallel Bouneffouf, Joan Byamugisha, Maria Chang, Pierre Dognin, Eitan Farchi, Ndivhuwo Makondo, Aleksandra Mojsilovic, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Inkit Padhi, Orna Raz, Jesus Rios, Prasanna Sattigeri, Moninder Singh, Siphiwe Thwala, Rosario A. Uceda-Sosa, Kush R. Varshney

    Abstract: The alignment of large language models is usually done by model providers to add or control behaviors that are common or universally understood across use cases and contexts. In contrast, in this article, we present an approach and architecture that empowers application developers to tune a model to their particular values, social norms, laws and other regulations, and orchestrate between potentia…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures

  4. arXiv:2403.06009  [pdf, other]

    cs.LG

    Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

    Authors: Swapnaja Achintalwar, Adriana Alvarado Garcia, Ateret Anaby-Tavor, Ioana Baldini, Sara E. Berger, Bishwaranjan Bhattacharjee, Djallel Bouneffouf, Subhajit Chaudhury, Pin-Yu Chen, Lamogha Chiazor, Elizabeth M. Daly, Kirushikesh DB, Rogério Abreu de Paula, Pierre Dognin, Eitan Farchi, Soumya Ghosh, Michael Hind, Raya Horesh, George Kour, Ja Young Lee, Nishtha Madaan, Sameep Mehta, Erik Miehling, Keerthiram Murugesan, Manish Nagireddy, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we presen…

    Submitted 13 June, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  5. arXiv:2402.14701  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG q-bio.NC

    COMPASS: Computational Mapping of Patient-Therapist Alliance Strategies with Language Modeling

    Authors: Baihan Lin, Djallel Bouneffouf, Yulia Landa, Rachel Jespersen, Cheryl Corcoran, Guillermo Cecchi

    Abstract: The therapeutic working alliance is a critical factor in predicting the success of psychotherapy treatment. Traditionally, working alliance assessment relies on questionnaires completed by both therapists and patients. In this paper, we present COMPASS, a novel framework to directly infer the therapeutic working alliance from the natural language used in psychotherapy sessions. Our approach utiliz…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: This work extends our research series in computational psychiatry (e.g. auto annotation in arXiv:2204.05522, topic extraction in arXiv:2204.10189, and diagnosis in arXiv:2210.15603) with the introduction of LLMs to complete the full cycle of interpreting and understanding psychotherapy strategies as a comprehensive analytical framework

  6. arXiv:2306.10050  [pdf, other]

    cs.IR cs.CY cs.GT cs.LG

    Interpolating Item and User Fairness in Multi-Sided Recommendations

    Authors: Qinyi Chen, Jason Cheuk Nam Liang, Negin Golrezaei, Djallel Bouneffouf

    Abstract: Today's online platforms heavily lean on algorithmic recommendations for bolstering user engagement and driving revenue. However, these recommendations can impact multiple stakeholders simultaneously -- the platform, items (sellers), and users (customers) -- each with their unique objectives, making it difficult to find the right middle ground that accommodates all stakeholders. To address this, w…

    Submitted 25 May, 2024; v1 submitted 12 June, 2023; originally announced June 2023.

  7. arXiv:2306.03902  [pdf, other]

    cs.CL cs.AI cs.LO q-bio.NC

    Utterance Classification with Logical Neural Network: Explainable AI for Mental Disorder Diagnosis

    Authors: Yeldar Toleubay, Don Joven Agravante, Daiki Kimura, Baihan Lin, Djallel Bouneffouf, Michiaki Tatsubori

    Abstract: In response to the global challenge of mental health problems, we propose a Logical Neural Network (LNN) based Neuro-Symbolic AI method for the diagnosis of mental disorders. Due to the lack of effective therapy coverage for mental disorders, there is a need for an AI solution that can assist therapists with the diagnosis. However, current Neural Network models lack explainability and may not be…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: ACL 2023

  8. arXiv:2304.00416  [pdf, other]

    cs.AI cs.CL cs.CY cs.HC cs.LG

    Towards Healthy AI: Large Language Models Need Therapists Too

    Authors: Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi, Kush R. Varshney

    Abstract: Recent advances in large language models (LLMs) have led to the development of powerful AI chatbots capable of engaging in natural and human-like conversations. However, these chatbots can be potentially harmful, exhibiting manipulative, gaslighting, and narcissistic behaviors. We define Healthy AI to be safe, trustworthy and ethical. To create healthy AI systems, we present the SafeguardGPT frame…

    Submitted 1 April, 2023; originally announced April 2023.

  9. arXiv:2303.09601  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC q-bio.NC

    Psychotherapy AI Companion with Reinforcement Learning Recommendations and Interpretable Policy Dynamics

    Authors: Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf

    Abstract: We introduce a Reinforcement Learning Psychotherapy AI Companion that generates topic recommendations for therapists based on patient responses. The system uses Deep Reinforcement Learning (DRL) to generate multi-objective policies for four different psychiatric conditions: anxiety, depression, schizophrenia, and suicidal cases. We present our experimental results on the accuracy of recommended to…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: WWW 2023. This work supersedes our prior work arXiv:2208.13077 by studying the interpretability of RL-based therapy agents with policy visualizations

  10. arXiv:2302.10845  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    TherapyView: Visualizing Therapy Sessions with Temporal Topic Modeling and AI-Generated Arts

    Authors: Baihan Lin, Stefan Zecevic, Djallel Bouneffouf, Guillermo Cecchi

    Abstract: We present TherapyView, a demonstration system to help therapists visualize the dynamic contents of past treatment sessions, enabled by state-of-the-art neural topic modeling techniques to analyze the topical tendencies of various psychiatric conditions and a deep learning-based image generation engine to provide a visual summary. The system incorporates temporal modeling to provide a time-s…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: This work extends our prior empirical work on topic modeling (arXiv:2204.10189) to now provide an interpretable and interactive data visualization platform with AI-generated artworks as a concrete user scenario for therapists

  11. arXiv:2302.01067  [pdf, other]

    cs.AI cs.LG cs.SC

    A Survey on Compositional Generalization in Applications

    Authors: Baihan Lin, Djallel Bouneffouf, Irina Rish

    Abstract: The field of compositional generalization is currently experiencing a renaissance in AI, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical compositional generalization problem. This article aims to provide a comprehensive review of top recent developments in multiple real-life applications of the compositiona…

    Submitted 2 February, 2023; originally announced February 2023.

  12. arXiv:2210.16386  [pdf, other]

    cs.LG cs.DS

    Non-Stationary Bandits with Auto-Regressive Temporal Dependency

    Authors: Qinyi Chen, Negin Golrezaei, Djallel Bouneffouf

    Abstract: Traditional multi-armed bandit (MAB) frameworks, predominantly examined under stochastic or adversarial settings, often overlook the temporal dynamics inherent in many real-world applications such as recommendation systems and online advertising. This paper introduces a novel non-stationary MAB framework that captures the temporal structure of these real-world dynamics through an auto-regressive (…

    Submitted 12 December, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

    Comments: 45 pages, 8 figures

  13. arXiv:2210.15603  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG q-bio.NC

    Working Alliance Transformer for Psychotherapy Dialogue Classification

    Authors: Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf

    Abstract: As a predictive measure of the treatment outcome in psychotherapy, the working alliance measures the agreement of the patient and the therapist in terms of their bond, task and goal. Although it has long been a clinical quantity estimated from the patients' and therapists' self-evaluative reports, we believe that the working alliance can be better characterized using natural language processing techniques directly i…

    Submitted 27 October, 2022; originally announced October 2022.

  14. arXiv:2209.12618  [pdf, ps, other]

    cs.AI cs.SC

    Survey on Applications of Neurosymbolic Artificial Intelligence

    Authors: Djallel Bouneffouf, Charu C. Aggarwal

    Abstract: In recent years, the Neurosymbolic framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance. This success is due to its stellar performance combined with attractive properties, such as learning and reasoning. The new emerging Neurosymbolic field is currently experiencing a renaissance, as novel frameworks and a…

    Submitted 8 September, 2022; originally announced September 2022.

  15. arXiv:2208.13077  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG q-bio.NC

    SupervisorBot: NLP-Annotated Real-Time Recommendations of Psychotherapy Treatment Strategies with Deep Reinforcement Learning

    Authors: Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf

    Abstract: We propose a recommendation system that suggests treatment strategies to a therapist during the psychotherapy session in real-time. Our system uses a turn-level rating mechanism that predicts the therapeutic outcome by computing a similarity score between the deep embedding of a scoring inventory, and the current sentence that the patient is speaking. The system automatically transcribes a continu…

    Submitted 29 October, 2022; v1 submitted 27 August, 2022; originally announced August 2022.

    Comments: This work extends our work series in interactive speech or text systems for psychotherapy (e.g. arXiv:2006.04376, arXiv:2204.05522 and arXiv:2204.10189) and proposes a novel recommendation setting

  16. arXiv:2208.10627  [pdf, other]

    cs.SI cs.LG

    Targeted Advertising on Social Networks Using Online Variational Tensor Regression

    Authors: Tsuyoshi Idé, Keerthiram Murugesan, Djallel Bouneffouf, Naoki Abe

    Abstract: This paper is concerned with online targeted advertising on social networks. The main technical task we address is to estimate the activation probability for user pairs, which quantifies the influence one user may have on another towards purchasing decisions. This is a challenging task because one marketing episode typically involves a multitude of marketing campaigns/strategies of different produ…

    Submitted 9 October, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: 18 pages, 7 figures

    MSC Class: 68T05

  17. arXiv:2204.10189  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG q-bio.NC

    Neural Topic Modeling of Psychotherapy Sessions

    Authors: Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi, Ravi Tejwani

    Abstract: In this work, we compare different neural topic modeling methods in learning the topical propensities of different psychiatric conditions from the psychotherapy session transcripts parsed from speech recordings. We also incorporate temporal modeling to put this additional interpretability to action by parsing out topic similarities as a time series in a turn-level resolution. We believe this topic…

    Submitted 3 November, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

    Comments: This work extends our research series in computational linguistics for psychiatry (e.g. working alliance analysis in arXiv:2204.05522) with a systematic investigation of neural topic modeling approaches to provide interpretable insights in psychotherapy

  18. arXiv:2204.05522  [pdf, other]

    q-bio.NC cs.AI cs.CL cs.HC cs.LG

    Deep Annotation of Therapeutic Working Alliance in Psychotherapy

    Authors: Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf

    Abstract: The therapeutic working alliance is an important predictor of the outcome of the psychotherapy treatment. In practice, the working alliance is estimated from a set of scoring questionnaires in an inventory that both the patient and the therapists fill out. In this work, we propose an analytical framework of directly inferring the therapeutic working alliance from the natural language within the ps…

    Submitted 12 April, 2022; originally announced April 2022.

  19. arXiv:2106.15808  [pdf, other]

    cs.LG cs.AI cs.CY

    Optimal Epidemic Control as a Contextual Combinatorial Bandit with Budget

    Authors: Baihan Lin, Djallel Bouneffouf

    Abstract: In light of the COVID-19 pandemic, it is an open challenge and critical practical problem to find an optimal way to dynamically prescribe the best policies that balance both the governmental resources and epidemic control in different countries and regions. To solve this multi-dimensional tradeoff of exploitation and exploration, we formulate this technical challenge as a contextual combinatorial b…

    Submitted 26 April, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

    Comments: Proceedings of FUZZ-IEEE 2022. This work extends our prior work on real-world applications of budgeted bandits (e.g. arXiv:1906.09384), and aims to solve the critical problem of epidemic control. Code at: https://github.com/doerlbh/BanditZoo

  20. arXiv:2103.08241  [pdf, other]

    cs.LG

    Reinforcement Learning with Algorithms from Probabilistic Structure Estimation

    Authors: Jonathan P. Epperlein, Roman Overko, Sergiy Zhuk, Christopher King, Djallel Bouneffouf, Andrew Cullen, Robert Shorten

    Abstract: Reinforcement learning (RL) algorithms aim to learn optimal decisions in unknown environments through experience of taking actions and observing the rewards gained. In some cases, the environment is not influenced by the actions of the RL agent, in which case the problem can be modeled as a contextual multi-armed bandit and lightweight myopic algorithms can be employed. On the other hand, when the…

    Submitted 1 June, 2022; v1 submitted 15 March, 2021; originally announced March 2021.

  21. arXiv:2101.00001  [pdf, ps, other]

    cs.LG cs.AI

    Etat de l'art sur l'application des bandits multi-bras

    Authors: Djallel Bouneffouf

    Abstract: The multi-armed bandit offers the advantage of learning and exploiting already learnt knowledge at the same time. This capability allows the approach to be applied in different domains, ranging from clinical trials, where the goal is investigating the effects of different experimental treatments while minimizing patient losses, to adaptive routing, where the goal is to minimize the delays in a network.…

    Submitted 4 January, 2021; originally announced January 2021.

    Comments: in French

  22. arXiv:2010.11413  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Predicting human decision making in psychological tasks with recurrent neural networks

    Authors: Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi

    Abstract: Unlike traditional time series, the action sequences of human decision making usually involve many cognitive processes such as beliefs, desires, intentions, and theory of mind, i.e., what others are thinking. This makes it challenging to predict human decision-making while remaining agnostic to the underlying psychological mechanisms. We propose here to use a recurrent neural network architecture b…

    Submitted 20 April, 2022; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: To appear in PLOS ONE. Code at https://github.com/doerlbh/HumanLSTM

    Journal ref: PLOS ONE 17(5): e0267907 (2022)

  23. arXiv:2010.09473  [pdf, other]

    cs.LG cs.AI

    Double-Linear Thompson Sampling for Context-Attentive Bandits

    Authors: Djallel Bouneffouf, Raphaël Féraud, Sohini Upadhyay, Yasaman Khazaeni, Irina Rish

    Abstract: In this paper, we analyze and extend an online learning framework known as Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has the freedom to choose which variables to observe. We de…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: text overlap with arXiv:1906.09384

  24. arXiv:2009.13714  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Learning to Generate Image Source-Agnostic Universal Adversarial Perturbations

    Authors: Pu Zhao, Parikshit Ram, Songtao Lu, Yuguang Yao, Djallel Bouneffouf, Xue Lin, Sijia Liu

    Abstract: Adversarial perturbations are critical for certifying the robustness of deep learning models. A universal adversarial perturbation (UAP) can simultaneously attack multiple images, and thus offers a more unified threat model, obviating an image-wise attack algorithm. However, the existing UAP generator is underdeveloped when images are drawn from different image sources (e.g., with different image…

    Submitted 17 August, 2022; v1 submitted 28 September, 2020; originally announced September 2020.

  25. arXiv:2007.11967  [pdf, other]

    stat.ML cs.AI cs.LG

    Computing the Dirichlet-Multinomial Log-Likelihood Function

    Authors: Djallel Bouneffouf

    Abstract: The Dirichlet-multinomial (DMN) distribution is commonly used to model over-dispersion in count data. Precise and fast numerical computation of the DMN log-likelihood function is important for performing statistical inference using this distribution, and remains a challenge. To address this, we use mathematical properties of the gamma function to derive a closed-form expression for the DMN log-likelih…

    Submitted 17 July, 2020; originally announced July 2020.

  26. arXiv:2007.11416  [pdf, ps, other]

    cs.LG stat.ML

    Spectral Clustering using Eigenspectrum Shape Based Nystrom Sampling

    Authors: Djallel Bouneffouf

    Abstract: Spectral clustering has shown superior performance in analyzing cluster structure. However, its computational complexity limits its application in analyzing large-scale data. To address this problem, many low-rank matrix approximation algorithms have been proposed, including the Nystrom method - an approach with proven approximation error bounds. There are several algorithms that provide recipes to…

    Submitted 21 July, 2020; originally announced July 2020.

  27. arXiv:2007.06368  [pdf, other]

    cs.LG cs.AI stat.ML

    Contextual Bandit with Missing Rewards

    Authors: Djallel Bouneffouf, Sohini Upadhyay, Yasaman Khazaeni

    Abstract: We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the reward associated with each context-based decision may not always be observed ("missing rewards"). This new problem is motivated by certain online settings including clinical trial and ad recommendation applications. In order to addre…

    Submitted 18 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

  28. arXiv:2006.15194  [pdf, other]

    cs.LG stat.ML

    Online learning with Corrupted context: Corrupted Contextual Bandits

    Authors: Djallel Bouneffouf

    Abstract: We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the context used at each decision may be corrupted ("useless context"). This new problem is motivated by certain on-line settings including clinical trial and ad recommendation applications. In order to address the corrupted-context sett…

    Submitted 26 June, 2020; originally announced June 2020.

  29. arXiv:2006.09635  [pdf, other]

    cs.LG math.OC stat.ML

    Solving Constrained CASH Problems with ADMM

    Authors: Parikshit Ram, Sijia Liu, Deepak Vijaykeerthi, Dakuo Wang, Djallel Bouneffouf, Greg Bramble, Horst Samulowitz, Alexander G. Gray

    Abstract: The CASH problem has been widely studied in the context of automated configurations of machine learning (ML) pipelines and various solvers and toolkits are available. However, CASH solvers do not directly handle black-box constraints such as fairness, robustness or other domain-specific custom constraints. We present our recent approach [Liu, et al., 2020] that leverages the ADMM optimization fram…

    Submitted 10 July, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: 7th ICML Workshop on Automated Machine Learning (2020)

  30. arXiv:2006.06580  [pdf, other]

    cs.GT cs.AI cs.LG cs.MA q-bio.NC

    Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior

    Authors: Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi

    Abstract: As an important psychological and social experiment, the Iterated Prisoner's Dilemma (IPD) treats the choice to cooperate or defect as an atomic action. We propose to study the behaviors of online learning algorithms in the Iterated Prisoner's Dilemma (IPD) game, where we investigate the full spectrum of reinforcement learning agents: multi-armed bandits, contextual bandits and reinforcement learn…

    Submitted 26 August, 2022; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: Proceedings of PRICAI 2022. To the best of our knowledge, this is the first attempt to explore the full spectrum of reinforcement learning agents (multi-armed bandits, contextual bandits and reinforcement learning) in the sequential social dilemma. Code at https://github.com/doerlbh/dilemmaRL

  31. arXiv:2005.04544  [pdf, other]

    cs.AI cs.LG q-bio.NC stat.ML

    Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL

    Authors: Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf, Jenna Reinen, Irina Rish

    Abstract: Artificial behavioral agents are often evaluated based on their consistent behaviors and performance to take sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by clinical literature of a wide range of neuro…

    Submitted 27 December, 2021; v1 submitted 9 May, 2020; originally announced May 2020.

    Comments: Proceedings of HBAI 2020. This article supersedes and extends our work arXiv:1706.02897 (MAB) and arXiv:1906.11286 (RL) into the Contextual Bandit (CB) framework. It generalizes extensively into multi-armed bandits, contextual bandits and RL settings to create a unified framework of human behavioral agents

    Journal ref: In Human Brain and Artificial Intelligence (pp. 14-33). Springer 2021

  32. arXiv:2005.02209  [pdf, ps, other]

    cs.LG stat.ML

    Hyper-parameter Tuning for the Contextual Bandit

    Authors: Djallel Bouneffouf, Emmanuelle Claeys

    Abstract: We study here the problem of learning the exploration-exploitation trade-off in the contextual bandit problem with a linear reward function setting. In the traditional algorithms that solve the contextual bandit problem, the exploration is a parameter that is tuned by the user. However, our proposed algorithm learns to choose the right exploration parameters in an online manner based on the observed…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: arXiv admin note: text overlap with arXiv:1705.03821

  33. arXiv:1910.14436  [pdf, other]

    cs.AI cs.LG

    How can AI Automate End-to-End Data Science?

    Authors: Charu Aggarwal, Djallel Bouneffouf, Horst Samulowitz, Beat Buesser, Thanh Hoang, Udayan Khurana, Sijia Liu, Tejaswini Pedapati, Parikshit Ram, Ambrish Rawat, Martin Wistuba, Alexander Gray

    Abstract: Data science is labor-intensive and human experts are scarce but heavily involved in every aspect of it. This makes data science time consuming and restricted to experts with the resulting quality heavily dependent on their experience and skills. To make data science more accessible and scalable, we need its democratization. Automated Data Science (AutoDS) is aimed towards that goal and is emergin…

    Submitted 22 October, 2019; originally announced October 2019.

  34. arXiv:1906.12350  [pdf, other]

    cs.LG cs.AI cs.MA q-bio.NC stat.ML

    Split Q Learning: Reinforcement Learning with Two-Stream Rewards

    Authors: Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi

    Abstract: Drawing inspiration from behavioral studies of human decision making, we propose here a general parametric framework for a reinforcement learning problem, which extends the standard Q-learning approach to incorporate a two-stream framework of reward processing with biases biologically associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases…

    Submitted 12 November, 2019; v1 submitted 20 June, 2019; originally announced June 2019.

    Comments: IJCAI 2019. This article supersedes our work arXiv:1706.02897 into RL setting, with a different focus by applying Inverse Reinforcement Learning to model human clinical behavioral bias. It also precedes our work arXiv:1906.11286 which introduces extensive emphases in RL games

  35. arXiv:1906.11286  [pdf, other]

    cs.LG cs.AI cs.MA q-bio.NC stat.ML

    A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry

    Authors: Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf, Jenna Reinen, Irina Rish

    Abstract: Drawing inspiration from behavioral studies of human decision making, we propose here a more general and flexible parametric framework for reinforcement learning that extends standard Q-learning to a two-stream model for processing positive and negative rewards, and allows incorporating a wide range of reward-processing biases -- an important component of human decision making which can help u…

    Submitted 14 April, 2020; v1 submitted 20 June, 2019; originally announced June 2019.

    Comments: Published in AAMAS 2020 as a full paper. This article supersedes our work arXiv:1706.02897 into RL setting and extends extensively into RL games, cognitive modeling, and gambling tasks in lifelong learning setting

  36. arXiv:1906.03979  [pdf, other]

    cs.LG stat.ML

    Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit

    Authors: Djallel Bouneffouf, Srinivasan Parthasarathy, Horst Samulowitz, Martin Wistuba

    Abstract: We consider the stochastic multi-armed bandit problem and the contextual bandit problem with historical observations and pre-clustered arms. The historical observations can contain any number of instances for each arm, and the pre-clustering information is a fixed clustering of arms provided as part of the input. We develop a variety of algorithms which incorporate this offline information effecti…

    Submitted 31 May, 2019; originally announced June 2019.

    Comments: IJCAI 2019, International Joint Conferences on Artificial Intelligence

  37. arXiv:1905.00424  [pdf, other]

    cs.LG stat.ML

    An ADMM Based Framework for AutoML Pipeline Configuration

    Authors: Sijia Liu, Parikshit Ram, Deepak Vijaykeerthy, Djallel Bouneffouf, Gregory Bramble, Horst Samulowitz, Dakuo Wang, Andrew Conn, Alexander Gray

    Abstract: We study the AutoML problem of automatically configuring machine learning pipelines by jointly selecting algorithms and their appropriate hyper-parameters for all steps in supervised learning pipelines. This black-box (gradient-free) optimization with mixed integer & continuous variables is a challenging problem. We propose a novel AutoML scheme by leveraging the alternating direction method of mu…

    Submitted 6 December, 2019; v1 submitted 1 May, 2019; originally announced May 2019.

    Journal ref: published at AAAI 2020

  38. arXiv:1904.10040  [pdf, ps, other]

    cs.LG stat.ML

    A Survey on Practical Applications of Multi-Armed and Contextual Bandits

    Authors: Djallel Bouneffouf, Irina Rish

    Abstract: In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance, due to its stellar performance combined with certain attractive properties, such as learning from less feedback. The multi-armed bandit field is currently flourishing, as novel problem settings and algorithms mot…

    Submitted 2 April, 2019; originally announced April 2019.

    Comments: under review by IJCAI 2019 Survey

  39. arXiv:1809.08343  [pdf, other]

    cs.LG cs.AI stat.ML

    Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration

    Authors: Ritesh Noothigattu, Djallel Bouneffouf, Nicholas Mattei, Rachita Chandra, Piyush Madan, Kush Varshney, Murray Campbell, Moninder Singh, Francesca Rossi

    Abstract: Autonomous cyber-physical agents and systems play an increasingly large role in our lives. To ensure that agents behave in ways aligned with the values of the societies in which they operate, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. These constraints and norms can come f…

    Submitted 21 September, 2018; originally announced September 2018.

    Comments: 8 pages, 3 figures

  40. arXiv:1809.05720  [pdf, other]

    cs.AI cs.HC cs.LG

    Incorporating Behavioral Constraints in Online AI Systems

    Authors: Avinash Balakrishnan, Djallel Bouneffouf, Nicholas Mattei, Francesca Rossi

    Abstract: AI systems that learn through reward feedback about the actions they take are increasingly deployed in domains that have significant impact on our daily life. However, in many cases the online rewards should not be the only guiding criteria, as there are additional constraints and/or priorities imposed by regulations, values, preferences, or ethical principles. We detail a novel online agent that…

    Submitted 15 September, 2018; originally announced September 2018.

    Comments: 9 pages, 6 figures

  41. arXiv:1806.09077  [pdf, other]

    stat.ML cs.LG

    Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

    Authors: Anna Choromanska, Benjamin Cowen, Sadhana Kumaravel, Ronny Luss, Mattia Rigotti, Irina Rish, Brian Kingsbury, Paolo DiAchille, Viatcheslav Gurev, Ravi Tejwani, Djallel Bouneffouf

    Abstract: Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function. State-of-the-art methods rely on error backpropagation, which suffers from several well-known issues, such as vanishing and exploding gradients, inability to handle non-differentiable nonlinearities and to parallelize weight-updates across la…

    Submitted 5 June, 2019; v1 submitted 23 June, 2018; originally announced June 2018.

    Comments: First six authors contributed equally to this work: A.C. - theory, manuscript, B.C. - code, experiments, S.K. - code, experiments, R.L. - algorithm, experiments, M.R. - code, experiments, I.R. - algorithm, manuscript

  42. arXiv:1802.00981  [pdf, other]

    cs.AI cs.LG stat.ML

    Contextual Bandit with Adaptive Feature Extraction

    Authors: Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi, Irina Rish

    Abstract: We consider an online decision making setting known as contextual bandit problem, and propose an approach for improving contextual bandit performance by using an adaptive feature extraction (representation learning) based on online clustering. Our approach starts with an off-line pre-training on unlabeled history of contexts (which can be exploited by our approach, but not by the standard contextu…

    Submitted 14 September, 2020; v1 submitted 3 February, 2018; originally announced February 2018.

    Comments: IEEE ICDMW 2018

  43. arXiv:1711.06761  [pdf, other]

    cs.LG cs.AI cs.NE

    Scalable Recollections for Continual Lifelong Learning

    Authors: Matthew Riemer, Tim Klinger, Djallel Bouneffouf, Michele Franceschini

    Abstract: Given the recent success of Deep Learning applied to a variety of single tasks, it is natural to consider more human-realistic settings. Perhaps the most difficult of these settings is that of continual lifelong learning, where the model must learn online over a continuous stream of non-stationary data. A successful continual lifelong learning system must have three key capabilities: it must learn…

    Submitted 19 December, 2018; v1 submitted 17 November, 2017; originally announced November 2017.

    Comments: AAAI 2019

  44. arXiv:1706.02897  [pdf, ps, other]

    cs.AI

    Bandit Models of Human Behavior: Reward Processing in Mental Disorders

    Authors: Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi

    Abstract: Drawing an inspiration from behavioral studies of human decision making, we propose here a general parametric framework for multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder…

    Submitted 7 June, 2017; originally announced June 2017.

    Comments: Conference on Artificial General Intelligence, AGI-17

  45. arXiv:1705.03821  [pdf, other]

    cs.AI cs.LG stat.ML

    Context Attentive Bandits: Contextual Bandit with Restricted Context

    Authors: Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi, Raphael Feraud

    Abstract: We consider a novel formulation of the multi-armed bandit model, which we call the contextual bandit with restricted context, where only a limited number of features can be accessed by the learner at every iteration. This novel formulation is motivated by different online problems arising in clinical trials, recommender systems and attention modeling. Herein, we adapt the standard multi-armed band…

    Submitted 7 June, 2017; v1 submitted 10 May, 2017; originally announced May 2017.

    Comments: IJCAI 2017

  46. arXiv:1508.07091  [pdf, other]

    cs.LG

    Multi-armed Bandit Problem with Known Trend

    Authors: Djallel Bouneffouf, Raphaël Feraud

    Abstract: We consider a variant of the multi-armed bandit model, which we call multi-armed bandit problem with known trend, where the gambler knows the shape of the reward function of each arm but not its distribution. This new problem is motivated by different online problems like active learning, music and interface recommendation applications, where when an arm is sampled by the model the received reward…

    Submitted 10 May, 2017; v1 submitted 28 August, 2015; originally announced August 2015.

    Comments: Neurocomputing 2016. arXiv admin note: text overlap with arXiv:0805.3415 by other authors

    ACM Class: I.2

  47. arXiv:1409.8572  [pdf, other]

    cs.IR cs.LG

    Freshness-Aware Thompson Sampling

    Authors: Djallel Bouneffouf

    Abstract: To follow the dynamicity of the user's content, researchers have recently started to model interactions between users and the Context-Aware Recommender Systems (CARS) as a bandit problem where the system needs to deal with exploration and exploitation dilemma. In this sense, we propose to study the freshness of the user's content in CARS through the bandit problem. We introduce in this paper an al…

    Submitted 29 September, 2014; originally announced September 2014.

    Comments: 21st International Conference on Neural Information Processing. arXiv admin note: text overlap with arXiv:1409.7729

    ACM Class: I.2

  48. arXiv:1409.8191  [pdf, other]

    cs.NE cs.LG

    A Neural Networks Committee for the Contextual Bandit Problem

    Authors: Robin Allesiardo, Raphael Feraud, Djallel Bouneffouf

    Abstract: This paper presents a new contextual bandit algorithm, NeuralBandit, which does not need hypothesis on stationarity of contexts and rewards. Several neural networks are trained to modelize the value of rewards knowing the context. Two variants, based on multi-experts approach, are proposed to choose online the parameters of multi-layer perceptrons. The proposed algorithms are successfully tested o…

    Submitted 29 September, 2014; originally announced September 2014.

    Comments: 21st International Conference on Neural Information Processing

    ACM Class: I.2

  49. arXiv:1409.7729  [pdf, other]

    cs.IR

    Context-Based Information Retrieval in Risky Environment

    Authors: Djallel Bouneffouf

    Abstract: Context-Based Information Retrieval is recently modelled as an exploration/ exploitation trade-off (exr/exp) problem, where the system has to choose between maximizing its expected rewards dealing with its current knowledge (exploitation) and learning more about the unknown user's preferences to improve its knowledge (exploration). This problem has been addressed by the reinforcement learning comm…

    Submitted 30 July, 2014; originally announced September 2014.

    Comments: arXiv admin note: substantial text overlap with arXiv:1408.2195

    ACM Class: I.2

  50. arXiv:1408.2196  [pdf, other]

    cs.LG cs.AI

    Exponentiated Gradient Exploration for Active Learning

    Authors: Djallel Bouneffouf

    Abstract: Active learning strategies respond to the costly labelling task in a supervised classification by selecting the most useful unlabelled examples in training a predictive model. Many conventional active learning algorithms focus on refining the decision boundary, rather than exploring new regions that can be more informative. In this setting, we propose a sequential algorithm named EG-Active that ca…

    Submitted 10 August, 2014; originally announced August 2014.

    ACM Class: I.2