Showing 1–9 of 9 results for author: Tipirneni, S

  1. arXiv:2405.00988  [pdf, other]

    cs.CL cs.LG

    Context-Aware Clustering using Large Language Models

    Authors: Sindhu Tipirneni, Ravinarayana Adkathimar, Nurendra Choudhary, Gaurush Hiranandani, Rana Ali Amjad, Vassilis N. Ioannidis, Changhe Yuan, Chandan K. Reddy

    Abstract: Despite the remarkable success of Large Language Models (LLMs) in text understanding and generation, their potential for text clustering tasks remains underexplored. We observed that powerful closed-source LLMs provide good quality clusterings of entity sets but are not scalable due to the massive compute power required and the associated costs. Thus, we propose CACTUS (Context-Aware ClusTering wi…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 16 pages

    ACM Class: I.2.7; I.2.m

  2. arXiv:2303.13024  [pdf]

    cs.LG cs.AI eess.SP

    Identifying TBI Physiological States by Clustering Multivariate Clinical Time-Series Data

    Authors: Hamid Ghaderi, Brandon Foreman, Amin Nayebi, Sindhu Tipirneni, Chandan K. Reddy, Vignesh Subbian

    Abstract: Determining clinically relevant physiological states from multivariate time series data with missing values is essential for providing appropriate treatment for acute conditions such as Traumatic Brain Injury (TBI), respiratory failure, and heart failure. Utilizing non-temporal clustering or data imputation and aggregation techniques may lead to loss of valuable information and biased analyses. In…

    Submitted 17 July, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 10 pages, 7 figures, 2 tables

    Journal ref: AMIA Annu Symp Proc. 2024 Jan 11;2023:379-388

  3. arXiv:2302.13457  [pdf]

    cs.LG cs.AI

    A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values (SLAC-Time): An Application to TBI Phenotyping

    Authors: Hamid Ghaderi, Brandon Foreman, Amin Nayebi, Sindhu Tipirneni, Chandan K. Reddy, Vignesh Subbian

    Abstract: Self-supervised learning approaches provide a promising direction for clustering multivariate time-series data. However, real-world time-series data often include missing values, and the existing approaches require imputing missing values before clustering, which may cause extensive computations and noise and result in invalid interpretations. To address these challenges, we present a Self-supervi…

    Submitted 27 May, 2023; v1 submitted 26 February, 2023; originally announced February 2023.

    Comments: Submitted to the Journal of Biomedical Informatics

  4. arXiv:2301.13816  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL

    Execution-based Code Generation using Deep Reinforcement Learning

    Authors: Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy

    Abstract: The utilization of programming language (PL) models, pre-trained on large-scale code corpora, as a means of automating software engineering processes has demonstrated considerable potential in streamlining various code generation tasks such as code completion, code translation, and program synthesis. However, current approaches mainly rely on supervised fine-tuning objectives borrowed from text ge…

    Submitted 19 July, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: Published in Transactions on Machine Learning Research (TMLR), 2023

    Journal ref: Transactions on Machine Learning Research (TMLR), 2023

  5. arXiv:2211.06507  [pdf]

    cs.LG cs.AI

    WindowSHAP: An Efficient Framework for Explaining Time-series Classifiers based on Shapley Values

    Authors: Amin Nayebi, Sindhu Tipirneni, Chandan K. Reddy, Brandon Foreman, Vignesh Subbian

    Abstract: Unpacking and comprehending how black-box machine learning algorithms make decisions has been a persistent challenge for researchers and end-users. Explaining time-series predictive models is useful for clinical applications with high stakes to understand the behavior of prediction models. However, existing approaches to explain such models are frequently unique to data where the features do not h…

    Submitted 8 May, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Submitted to the Journal of Biomedical Informatics

  6. arXiv:2208.06717  [pdf]

    cs.AI cs.LG

    An Empirical Comparison of Explainable Artificial Intelligence Methods for Clinical Data: A Case Study on Traumatic Brain Injury

    Authors: Amin Nayebi, Sindhu Tipirneni, Brandon Foreman, Chandan K. Reddy, Vignesh Subbian

    Abstract: A longstanding challenge surrounding deep learning algorithms is unpacking and understanding how they make their decisions. Explainable Artificial Intelligence (XAI) offers methods to provide explanations of internal functions of algorithms and reasons behind their decisions in ways that are interpretable and understandable to human users. Numerous XAI approaches have been developed thus far, an…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: Accepted at American Medical Informatics Association (AMIA) Annual Symposium 2022, 10 pages, 6 figures, 2 tables

  7. arXiv:2206.08474  [pdf, other]

    cs.SE cs.AI cs.LG

    XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence

    Authors: Ming Zhu, Aneesh Jain, Karthik Suresh, Roshan Ravindran, Sindhu Tipirneni, Chandan K. Reddy

    Abstract: Recent advances in machine learning have significantly improved the understanding of source code data and achieved good performance on a number of downstream tasks. Open source repositories like GitHub enable this process with rich unlabeled code data. However, the lack of high quality labeled data has largely hindered the progress of several code related tasks, such as program translation, summar…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 20 pages, 11 tables, 2 figures

  8. arXiv:2206.05239  [pdf, other]

    cs.LG cs.SE

    StructCoder: Structure-Aware Transformer for Code Generation

    Authors: Sindhu Tipirneni, Ming Zhu, Chandan K. Reddy

    Abstract: There has been a recent surge of interest in automating software engineering tasks using deep learning. This paper addresses the problem of code generation, where the goal is to generate target code given source code in a different language or a natural language description. Most state-of-the-art deep learning models for code generation use training strategies primarily designed for natural langua…

    Submitted 30 January, 2024; v1 submitted 10 June, 2022; originally announced June 2022.

  9. arXiv:2107.14293  [pdf, other]

    cs.LG

    Self-Supervised Transformer for Sparse and Irregularly Sampled Multivariate Clinical Time-Series

    Authors: Sindhu Tipirneni, Chandan K. Reddy

    Abstract: Multivariate time-series data are frequently observed in critical care settings and are typically characterized by sparsity (missing information) and irregular time intervals. Existing approaches for learning representations in this domain handle these challenges by either aggregation or imputation of values, which in turn suppresses the fine-grained information and adds undesirable noise/overhead…

    Submitted 16 February, 2022; v1 submitted 29 July, 2021; originally announced July 2021.

    Comments: Changed title to better reflect the challenges dealt with in the paper. Improved section 4.6. Changed the format to use ACM camera-ready template

    ACM Class: I.2.1; I.2.6