-
Generation and De-Identification of Indian Clinical Discharge Summaries using LLMs
Authors:
Sanjeet Singh,
Shreya Gupta,
Niralee Gupta,
Naimish Sharma,
Lokesh Srivastava,
Vibhu Agarwal,
Ashutosh Modi
Abstract:
The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially significant for healthcare organizations in India that are managing rapid digitization while still establishing data governance procedures that align with the lett…
▽ More
The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially significant for healthcare organizations in India that are managing rapid digitization while still establishing data governance procedures that align with the letter and spirit of the law. Computer-based systems for de-identification of personal information are vulnerable to data drift, often rendering them ineffective in cross-institution settings. Therefore, a rigorous assessment of existing de-identification against local health datasets is imperative to support the safe adoption of digital health initiatives in India. Using a small set of de-identified patient discharge summaries provided by an Indian healthcare institution, in this paper, we report the nominal performance of de-identification algorithms (based on language models) trained on publicly available non-Indian datasets, pointing towards a lack of cross-institutional generalization. Similarly, experimentation with off-the-shelf de-identification systems reveals potential risks associated with the approach. To overcome data scarcity, we explore generating synthetic clinical reports (using publicly available and Indian summaries) by performing in-context learning over Large Language Models (LLMs). Our experiments demonstrate the use of generated reports as an effective strategy for creating high-performing de-identification systems with good generalization capabilities.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
iSign: A Benchmark for Indian Sign Language Processing
Authors:
Abhinav Joshi,
Romit Mohanty,
Mounika Kanakanti,
Andesha Mangla,
Sudeep Choudhary,
Monali Barbate,
Ashutosh Modi
Abstract:
Indian Sign Language has limited resources for developing machine learning and data-driven approaches for automated language processing. Though text/audio-based language processing techniques have shown colossal research interest and tremendous improvements in the last few years, Sign Languages still need to catch up due to the need for more resources. To bridge this gap, in this work, we propose…
▽ More
Indian Sign Language has limited resources for developing machine learning and data-driven approaches for automated language processing. Though text/audio-based language processing techniques have shown colossal research interest and tremendous improvements in the last few years, Sign Languages still need to catch up due to the need for more resources. To bridge this gap, in this work, we propose iSign: a benchmark for Indian Sign Language (ISL) Processing. We make three primary contributions to this work. First, we release one of the largest ISL-English datasets with more than 118K video-sentence/phrase pairs. To the best of our knowledge, it is the largest sign language dataset available for ISL. Second, we propose multiple NLP-specific tasks (including SignVideo2Text, SignPose2Text, Text2Pose, Word Prediction, and Sign Semantics) and benchmark them with the baseline models for easier access to the research community. Third, we provide detailed insights into the proposed benchmarks with a few linguistic insights into the workings of ISL. We streamline the evaluation of Sign Language processing, addressing the gaps in the NLP research community for Sign Languages. We release the dataset, tasks, and models via the following website: https://exploration-lab.github.io/iSign/
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning
Authors:
Abhinav Joshi,
Shounak Paul,
Akshat Sharma,
Pawan Goyal,
Saptarshi Ghosh,
Ashutosh Modi
Abstract:
Legal systems worldwide are inundated with exponential growth in cases and documents. There is an imminent need to develop NLP and ML techniques for automatically processing and understanding legal documents to streamline the legal system. However, evaluating and comparing various NLP models designed specifically for the legal domain is challenging. This paper addresses this challenge by proposing…
▽ More
Legal systems worldwide are inundated with exponential growth in cases and documents. There is an imminent need to develop NLP and ML techniques for automatically processing and understanding legal documents to streamline the legal system. However, evaluating and comparing various NLP models designed specifically for the legal domain is challenging. This paper addresses this challenge by proposing IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning. IL-TUR contains monolingual (English, Hindi) and multi-lingual (9 Indian languages) domain-specific tasks that address different aspects of the legal system from the point of view of understanding and reasoning over Indian legal documents. We present baseline models (including LLM-based) for each task, outlining the gap between models and the ground truth. To foster further research in the legal domain, we create a leaderboard (available at: https://exploration-lab.github.io/IL-TUR/) where the research community can upload and compare legal text understanding systems.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain
Authors:
Rahul Kumar,
Amar Raja Dibbu,
Shrutendra Harsola,
Vignesh Subrahmaniam,
Ashutosh Modi
Abstract:
Several large-scale datasets (e.g., WikiSQL, Spider) for developing natural language interfaces to databases have recently been proposed. These datasets cover a wide breadth of domains but fall short on some essential domains, such as finance and accounting. Given that accounting databases are used worldwide, particularly by non-technical people, there is an imminent need to develop models that co…
▽ More
Several large-scale datasets (e.g., WikiSQL, Spider) for developing natural language interfaces to databases have recently been proposed. These datasets cover a wide breadth of domains but fall short on some essential domains, such as finance and accounting. Given that accounting databases are used worldwide, particularly by non-technical people, there is an imminent need to develop models that could help extract information from accounting databases via natural language queries. In this resource paper, we aim to fill this gap by proposing a new large-scale Text-to-SQL dataset for the accounting and financial domain: BookSQL. The dataset consists of 100k natural language queries-SQL pairs, and accounting databases of 1 million records. We experiment with and analyze existing state-of-the-art models (including GPT-4) for the Text-to-SQL task on BookSQL. We find significant performance gaps, thus pointing towards developing more focused models for this domain.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Multi-Stain Multi-Level Convolutional Network for Multi-Tissue Breast Cancer Image Segmentation
Authors:
Akash Modi,
Sumit Kumar Jha,
Purnendu Mishra,
Rajiv Kumar,
Kiran Aatre,
Gursewak Singh,
Shubham Mathur
Abstract:
Digital pathology and microscopy image analysis are widely employed in the segmentation of digitally scanned IHC slides, primarily to identify cancer and pinpoint regions of interest (ROI) indicative of tumor presence. However, current ROI segmentation models are either stain-specific or suffer from the issues of stain and scanner variance due to different staining protocols or modalities across m…
▽ More
Digital pathology and microscopy image analysis are widely employed in the segmentation of digitally scanned IHC slides, primarily to identify cancer and pinpoint regions of interest (ROI) indicative of tumor presence. However, current ROI segmentation models are either stain-specific or suffer from the issues of stain and scanner variance due to different staining protocols or modalities across multiple labs. Also, tissues like Ductal Carcinoma in Situ (DCIS), acini, etc. are often classified as Tumors due to their structural similarities and color compositions. In this paper, we proposed a novel convolutional neural network (CNN) based Multi-class Tissue Segmentation model for histopathology whole-slide Breast slides which classify tumors and segments other tissue regions such as Ducts, acini, DCIS, Squamous epithelium, Blood Vessels, Necrosis, etc. as a separate class. Our unique pixel-aligned non-linear merge across spatial resolutions empowers models with both local and global fields of view for accurate detection of various classes. Our proposed model is also able to separate bad regions such as folds, artifacts, blurry regions, bubbles, etc. from tissue regions using multi-level context from different resolutions of WSI. Multi-phase iterative training with context-aware augmentation and increasing noise was used to efficiently train a multi-stain generic model with partial and noisy annotations from 513 slides. Our training pipeline used 12 million patches generated using context-aware augmentations which made our model stain and scanner invariant across data sources. To extrapolate stain and scanner invariance, our model was evaluated on 23000 patches which were for a completely new stain (Hematoxylin and Eosin) from a completely new scanner (Motic) from a different lab. The mean IOU was 0.72 which is on par with model performance on other data sources and scanners.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
IITK at SemEval-2024 Task 10: Who is the speaker? Improving Emotion Recognition and Flip Reasoning in Conversations via Speaker Embeddings
Authors:
Shubham Patel,
Divyaksh Shukla,
Ashutosh Modi
Abstract:
This paper presents our approach for the SemEval-2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversations. For the Emotion Recognition in Conversations (ERC) task, we utilize a masked-memory network along with speaker participation. We propose a transformer-based speaker-centric model for the Emotion Flip Reasoning (EFR) task. We also introduce Probable Trigger Zone, a region of the…
▽ More
This paper presents our approach for the SemEval-2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversations. For the Emotion Recognition in Conversations (ERC) task, we utilize a masked-memory network along with speaker participation. We propose a transformer-based speaker-centric model for the Emotion Flip Reasoning (EFR) task. We also introduce Probable Trigger Zone, a region of the conversation that is more likely to contain the utterances causing the emotion to flip. For sub-task 3, the proposed approach achieves a 5.9 (F1 score) improvement over the task baseline. The ablation study results highlight the significance of various design choices in the proposed method.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
IITK at SemEval-2024 Task 4: Hierarchical Embeddings for Detection of Persuasion Techniques in Memes
Authors:
Shreenaga Chikoti,
Shrey Mehta,
Ashutosh Modi
Abstract:
Memes are one of the most popular types of content used in an online disinformation campaign. They are primarily effective on social media platforms since they can easily reach many users. Memes in a disinformation campaign achieve their goal of influencing the users through several rhetorical and psychological techniques, such as causal oversimplification, name-calling, and smear. The SemEval 202…
▽ More
Memes are one of the most popular types of content used in an online disinformation campaign. They are primarily effective on social media platforms since they can easily reach many users. Memes in a disinformation campaign achieve their goal of influencing the users through several rhetorical and psychological techniques, such as causal oversimplification, name-calling, and smear. The SemEval 2024 Task 4 \textit{Multilingual Detection of Persuasion Technique in Memes} on identifying such techniques in the memes is divided across three sub-tasks: ($\mathbf{1}$) Hierarchical multi-label classification using only textual content of the meme, ($\mathbf{2}$) Hierarchical multi-label classification using both, textual and visual content of the meme and ($\mathbf{3}$) Binary classification of whether the meme contains a persuasion technique or not using it's textual and visual content. This paper proposes an ensemble of Class Definition Prediction (CDP) and hyperbolic embeddings-based approaches for this task. We enhance meme classification accuracy and comprehensiveness by integrating HypEmo's hierarchical label embeddings (Chen et al., 2023) and a multi-task learning framework for emotion prediction. We achieve a hierarchical F1-score of 0.60, 0.67, and 0.48 on the respective sub-tasks.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
IITK at SemEval-2024 Task 1: Contrastive Learning and Autoencoders for Semantic Textual Relatedness in Multilingual Texts
Authors:
Udvas Basak,
Rajarshi Dutta,
Shivam Pandey,
Ashutosh Modi
Abstract:
This paper describes our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness. The challenge is focused on automatically detecting the degree of relatedness between pairs of sentences for 14 languages including both high and low-resource Asian and African languages. Our team participated in two subtasks consisting of Track A: supervised and Track B: unsupervised. This paper f…
▽ More
This paper describes our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness. The challenge is focused on automatically detecting the degree of relatedness between pairs of sentences for 14 languages including both high and low-resource Asian and African languages. Our team participated in two subtasks consisting of Track A: supervised and Track B: unsupervised. This paper focuses on a BERT-based contrastive learning and similarity metric based approach primarily for the supervised track while exploring autoencoders for the unsupervised track. It also aims on the creation of a bigram relatedness corpus using negative sampling strategy, thereby producing refined word embeddings.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials
Authors:
Shreyasi Mandal,
Ashutosh Modi
Abstract:
Large Language models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks across multiple domains, yet they are prone to shortcut learning and factual inconsistencies. This research investigates LLMs' robustness, consistency, and faithful reasoning when performing Natural Language Inference (NLI) on breast cancer Clinical Trial Reports (CTRs) in…
▽ More
Large Language models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks across multiple domains, yet they are prone to shortcut learning and factual inconsistencies. This research investigates LLMs' robustness, consistency, and faithful reasoning when performing Natural Language Inference (NLI) on breast cancer Clinical Trial Reports (CTRs) in the context of SemEval 2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. We examine the reasoning capabilities of LLMs and their adeptness at logical problem-solving. A comparative analysis is conducted on pre-trained language models (PLMs), GPT-3.5, and Gemini Pro under zero-shot settings using Retrieval-Augmented Generation (RAG) framework, integrating various reasoning chains. The evaluation yields an F1 score of 0.69, consistency of 0.71, and a faithfulness score of 0.90 on the test dataset.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
Towards Measuring and Modeling "Culture" in LLMs: A Survey
Authors:
Muhammad Farid Adilazuarda,
Sagnik Mukherjee,
Pradhyumna Lavania,
Siddhant Singh,
Alham Fikri Aji,
Jacki O'Neill,
Ashutosh Modi,
Monojit Choudhury
Abstract:
We present a survey of more than 90 recent papers that aim to study cultural representation and inclusion in large language models (LLMs). We observe that none of the studies explicitly define "culture, which is a complex, multifaceted concept; instead, they probe the models on some specially designed datasets which represent certain aspects of "culture". We call these aspects the proxies of cultu…
▽ More
We present a survey of more than 90 recent papers that aim to study cultural representation and inclusion in large language models (LLMs). We observe that none of the studies explicitly define "culture, which is a complex, multifaceted concept; instead, they probe the models on some specially designed datasets which represent certain aspects of "culture". We call these aspects the proxies of culture, and organize them across two dimensions of demographic and semantic proxies. We also categorize the probing methods employed. Our analysis indicates that only certain aspects of ``culture,'' such as values and objectives, have been studied, leaving several other interesting and important facets, especially the multitude of semantic domains (Thompson et al., 2020) and aboutness (Hershcovich et al., 2022), unexplored. Two other crucial gaps are the lack of robustness of probing techniques and situated studies on the impact of cultural mis- and under-representation in LLM-based applications.
△ Less
Submitted 19 June, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
Data Driven Approaches to Cybersecurity Governance for Board Decision-Making -- A Systematic Review
Authors:
Anita Modi,
Ievgeniia Kuzminykh,
Bogdan Ghita
Abstract:
Cybersecurity governance influences the quality of strategic decision-making to ensure cyber risks are managed effectively. Board of Directors are the decisions-makers held accountable for managing this risk; however, they lack adequate and efficient information necessary for making such decisions. In addition to the myriad of challenges they face, they are often insufficiently versed in the techn…
▽ More
Cybersecurity governance influences the quality of strategic decision-making to ensure cyber risks are managed effectively. Board of Directors are the decisions-makers held accountable for managing this risk; however, they lack adequate and efficient information necessary for making such decisions. In addition to the myriad of challenges they face, they are often insufficiently versed in the technology or cybersecurity terminology or not provided with the correct tools to support them to make sound decisions to govern cybersecurity effectively. A different approach is needed to ensure BoDs are clear on the approach the business is taking to build a cyber resilient organization. This systematic literature review investigates the existing risk measurement instruments, cybersecurity metrics, and associated models for supporting BoDs. We identified seven conceptual themes through literature analysis that form the basis of this study's main contribution. The findings showed that, although sophisticated cybersecurity tools exist and are developing, there is limited information for Board of Directors to support them in terms of metrics and models to govern cybersecurity in a language they understand. The review also provides some recommendations on theories and models that can be further investigated to provide support to Board of Directors.
△ Less
Submitted 29 November, 2023;
originally announced November 2023.
-
EtiCor: Corpus for Analyzing LLMs for Etiquettes
Authors:
Ashutosh Dwivedi,
Pradhyumna Lavania,
Ashutosh Modi
Abstract:
Etiquettes are an essential ingredient of day-to-day interactions among people. Moreover, etiquettes are region-specific, and etiquettes in one region might contradict those in other regions. In this paper, we propose EtiCor, an Etiquettes Corpus, having texts about social norms from five different regions across the globe. The corpus provides a test bed for evaluating LLMs for knowledge and under…
▽ More
Etiquettes are an essential ingredient of day-to-day interactions among people. Moreover, etiquettes are region-specific, and etiquettes in one region might contradict those in other regions. In this paper, we propose EtiCor, an Etiquettes Corpus, having texts about social norms from five different regions across the globe. The corpus provides a test bed for evaluating LLMs for knowledge and understanding of region-specific etiquettes. Additionally, we propose the task of Etiquette Sensitivity. We experiment with state-of-the-art LLMs (Delphi, Falcon40B, and GPT-3.5). Initial results indicate that LLMs, mostly fail to understand etiquettes from regions from non-Western world.
△ Less
Submitted 29 October, 2023;
originally announced October 2023.
-
ISLTranslate: Dataset for Translating Indian Sign Language
Authors:
Abhinav Joshi,
Susmit Agrawal,
Ashutosh Modi
Abstract:
Sign languages are the primary means of communication for many hard-of-hearing people worldwide. Recently, to bridge the communication gap between the hard-of-hearing community and the rest of the population, several sign language translation datasets have been proposed to enable the development of statistical sign language translation systems. However, there is a dearth of sign language resources…
▽ More
Sign languages are the primary means of communication for many hard-of-hearing people worldwide. Recently, to bridge the communication gap between the hard-of-hearing community and the rest of the population, several sign language translation datasets have been proposed to enable the development of statistical sign language translation systems. However, there is a dearth of sign language resources for the Indian sign language. This resource paper introduces ISLTranslate, a translation dataset for continuous Indian Sign Language (ISL) consisting of 31k ISL-English sentence/phrase pairs. To the best of our knowledge, it is the largest translation dataset for continuous Indian Sign Language. We provide a detailed analysis of the dataset. To validate the performance of existing end-to-end Sign language to spoken language translation systems, we benchmark the created dataset with a transformer-based model for ISL translation.
△ Less
Submitted 11 July, 2023;
originally announced July 2023.
-
U-CREAT: Unsupervised Case Retrieval using Events extrAcTion
Authors:
Abhinav Joshi,
Akshat Sharma,
Sai Kiran Tanikella,
Ashutosh Modi
Abstract:
The task of Prior Case Retrieval (PCR) in the legal domain is about automatically citing relevant (based on facts and precedence) prior legal cases in a given query case. To further promote research in PCR, in this paper, we propose a new large benchmark (in English) for the PCR task: IL-PCR (Indian Legal Prior Case Retrieval) corpus. Given the complex nature of case relevance and the long size of…
▽ More
The task of Prior Case Retrieval (PCR) in the legal domain is about automatically citing relevant (based on facts and precedence) prior legal cases in a given query case. To further promote research in PCR, in this paper, we propose a new large benchmark (in English) for the PCR task: IL-PCR (Indian Legal Prior Case Retrieval) corpus. Given the complex nature of case relevance and the long size of legal documents, BM25 remains a strong baseline for ranking the cited prior documents. In this work, we explore the role of events in legal case retrieval and propose an unsupervised retrieval method-based pipeline U-CREAT (Unsupervised Case Retrieval using Events Extraction). We find that the proposed unsupervised retrieval method significantly increases performance compared to BM25 and makes retrieval faster by a considerable margin, making it applicable to real-time case retrieval systems. Our proposed system is generic, we show that it generalizes across two different legal systems (Indian and Canadian), and it shows state-of-the-art performance on the benchmarks for both the legal systems (IL-PCR and COLIEE corpora).
△ Less
Submitted 11 July, 2023;
originally announced July 2023.
-
ScriptWorld: Text Based Environment For Learning Procedural Knowledge
Authors:
Abhinav Joshi,
Areeb Ahmad,
Umang Pandey,
Ashutosh Modi
Abstract:
Text-based games provide a framework for developing natural language understanding and commonsense knowledge about the world in reinforcement learning based agents. Existing text-based environments often rely on fictional situations and characters to create a gaming framework and are far from real-world scenarios. In this paper, we introduce ScriptWorld: a text-based environment for teaching agent…
▽ More
Text-based games provide a framework for developing natural language understanding and commonsense knowledge about the world in reinforcement learning based agents. Existing text-based environments often rely on fictional situations and characters to create a gaming framework and are far from real-world scenarios. In this paper, we introduce ScriptWorld: a text-based environment for teaching agents about real-world daily chores and hence imparting commonsense knowledge. To the best of our knowledge, it is the first interactive text-based gaming framework that consists of daily real-world human activities designed using scripts dataset. We provide gaming environments for 10 daily activities and perform a detailed analysis of the proposed environment. We develop RL-based baseline models/agents to play the games in Scriptworld. To understand the role of language models in such environments, we leverage features obtained from pre-trained language models in the RL agents. Our experiments show that prior knowledge obtained from a pre-trained language model helps to solve real-world text-based gaming environments. We release the environment via Github: https://github.com/Exploration-Lab/ScriptWorld
△ Less
Submitted 8 July, 2023;
originally announced July 2023.
-
Quadratic Functional Encryption for Secure Training in Vertical Federated Learning
Authors:
Shuangyi Chen,
Anuja Modi,
Shweta Agrawal,
Ashish Khisti
Abstract:
Vertical federated learning (VFL) enables the collaborative training of machine learning (ML) models in settings where the data is distributed amongst multiple parties who wish to protect the privacy of their individual data. Notably, in VFL, the labels are available to a single party and the complete feature set is formed only when data from all parties is combined. Recently, Xu et al. proposed a…
▽ More
Vertical federated learning (VFL) enables the collaborative training of machine learning (ML) models in settings where the data is distributed amongst multiple parties who wish to protect the privacy of their individual data. Notably, in VFL, the labels are available to a single party and the complete feature set is formed only when data from all parties is combined. Recently, Xu et al. proposed a new framework called FedV for secure gradient computation for VFL using multi-input functional encryption. In this work, we explain how some of the information leakage in Xu et al. can be avoided by using Quadratic functional encryption when training generalized linear models for vertical federated learning.
△ Less
Submitted 19 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
SemEval 2023 Task 6: LegalEval - Understanding Legal Texts
Authors:
Ashutosh Modi,
Prathamesh Kalamkar,
Saurabh Karn,
Aman Tiwari,
Abhinav Joshi,
Sai Kiran Tanikella,
Shouvik Kumar Guha,
Sachin Malhan,
Vivek Raghavan
Abstract:
In populous countries, pending legal cases have been growing exponentially. There is a need for developing NLP-based techniques for processing and automatically understanding legal documents. To promote research in the area of Legal NLP we organized the shared task LegalEval - Understanding Legal Texts at SemEval 2023. LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about…
▽ More
In populous countries, pending legal cases have been growing exponentially. There is a need for developing NLP-based techniques for processing and automatically understanding legal documents. To promote research in the area of Legal NLP we organized the shared task LegalEval - Understanding Legal Texts at SemEval 2023. LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units, Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document and Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case along with providing an explanation for the prediction. In total 26 teams (approx. 100 participants spread across the world) submitted systems paper. In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is a lot of scope for improvement. This paper describes the tasks, and analyzes techniques proposed by various teams.
△ Less
Submitted 1 May, 2023; v1 submitted 19 April, 2023;
originally announced April 2023.
-
MLOps with enhanced performance control and observability
Authors:
Indradumna Banerjee,
Dinesh Ghanta,
Girish Nautiyal,
Pradeep Sanchana,
Prateek Katageri,
Atin Modi
Abstract:
The explosion of data and its ever increasing complexity in the last few years, has made MLOps systems more prone to failure, and new tools need to be embedded in such systems to avoid such failure. In this demo, we will introduce crucial tools in the observability module of a MLOps system that target difficult issues like data drfit and model version control for optimum model selection. We believ…
▽ More
The explosion of data and its ever increasing complexity in the last few years, has made MLOps systems more prone to failure, and new tools need to be embedded in such systems to avoid such failure. In this demo, we will introduce crucial tools in the observability module of a MLOps system that target difficult issues like data drfit and model version control for optimum model selection. We believe integrating these features in our MLOps pipeline would go a long way in building a robust system immune to early stage ML system failures.
△ Less
Submitted 2 February, 2023;
originally announced February 2023.
-
Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations
Authors:
Ashwani Bhat,
Ashutosh Modi
Abstract:
Predicting emotions expressed in text is a well-studied problem in the NLP community. Recently there has been active research in extracting the cause of an emotion expressed in text. Most of the previous work has done causal emotion entailment in documents. In this work, we propose neural models to extract emotion cause span and entailment in conversations. For learning such models, we use RECCON…
▽ More
Predicting emotions expressed in text is a well-studied problem in the NLP community. Recently there has been active research in extracting the cause of an emotion expressed in text. Most of the previous work has done causal emotion entailment in documents. In this work, we propose neural models to extract emotion cause span and entailment in conversations. For learning such models, we use RECCON dataset, which is annotated with cause spans at the utterance level. In particular, we propose MuTEC, an end-to-end Multi-Task learning framework for extracting emotions, emotion cause, and entailment in conversations. This is in contrast to existing baseline models that use ground truth emotions to extract the cause. MuTEC performs better than the baselines for most of the data folds provided in the dataset.
△ Less
Submitted 7 November, 2022;
originally announced November 2022.
-
Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments
Authors:
Abhinav Joshi,
Naman Gupta,
Jinang Shah,
Binod Bhattarai,
Ashutosh Modi,
Danail Stoyanov
Abstract:
A real-world application or setting involves interaction between different modalities (e.g., video, speech, text). In order to process the multimodal information automatically and use it for an end application, Multimodal Representation Learning (MRL) has emerged as an active area of research in recent times. MRL involves learning reliable and robust representations of information from heterogeneo…
▽ More
A real-world application or setting involves interaction between different modalities (e.g., video, speech, text). In order to process the multimodal information automatically and use it for an end application, Multimodal Representation Learning (MRL) has emerged as an active area of research in recent times. MRL involves learning reliable and robust representations of information from heterogeneous sources and fusing them. However, in practice, the data acquired from different sources are typically noisy. In some extreme cases, a noise of large magnitude can completely alter the semantics of the data leading to inconsistencies in the parallel multimodal data. In this paper, we propose a novel method for multimodal representation learning in a noisy environment via the generalized product of experts technique. In the proposed method, we train a separate network for each modality to assess the credibility of information coming from that modality, and subsequently, the contribution from each modality is dynamically varied while estimating the joint distribution. We evaluate our method on two challenging benchmarks from two diverse domains: multimodal 3D hand-pose estimation and multimodal surgical video segmentation. We attain state-of-the-art performance on both benchmarks. Our extensive quantitative and qualitative evaluations show the advantages of our method compared to previous approaches.
△ Less
Submitted 7 November, 2022;
originally announced November 2022.
-
BabyNet: A Lightweight Network for Infant Reaching Action Recognition in Unconstrained Environments to Support Future Pediatric Rehabilitation Applications
Authors:
Amel Dechemi,
Vikarn Bhakri,
Ipsita Sahin,
Arjun Modi,
Julya Mestas,
Pamodya Peiris,
Dannya Enriquez Barrundia,
Elena Kokkoni,
Konstantinos Karydis
Abstract:
Action recognition is an important component to improve autonomy of physical rehabilitation devices, such as wearable robotic exoskeletons. Existing human action recognition algorithms focus on adult applications rather than pediatric ones. In this paper, we introduce BabyNet, a light-weight (in terms of trainable parameters) network structure to recognize infant reaching action from off-body stat…
▽ More
Action recognition is an important component to improve autonomy of physical rehabilitation devices, such as wearable robotic exoskeletons. Existing human action recognition algorithms focus on adult applications rather than pediatric ones. In this paper, we introduce BabyNet, a light-weight (in terms of trainable parameters) network structure to recognize infant reaching action from off-body stationary cameras. We develop an annotated dataset that includes diverse reaches performed while in a sitting posture by different infants in unconstrained environments (e.g., in home settings, etc.). Our approach uses the spatial and temporal connection of annotated bounding boxes to interpret onset and offset of reaching, and to detect a complete reaching action. We evaluate the efficiency of our proposed approach and compare its performance against other learning-based network structures in terms of capability of capturing temporal inter-dependencies and accuracy of detection of reaching onset and offset. Results indicate our BabyNet can attain solid performance in terms of (average) testing accuracy that exceeds that of other larger networks, and can hence serve as a light-weight data-driven framework for video-based infant reaching action recognition.
△ Less
Submitted 12 October, 2022; v1 submitted 9 August, 2022;
originally announced August 2022.
-
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
Authors:
Jinglin Chen,
Aditya Modi,
Akshay Krishnamurthy,
Nan Jiang,
Alekh Agarwal
Abstract:
We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions. On the positive side, we propose the RFOLIVE (Reward-Free OLIVE) algorithm for sample-efficient reward-free exploration under minimal structural assumptions, which covers the previously studied settings…
▽ More
We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions. On the positive side, we propose the RFOLIVE (Reward-Free OLIVE) algorithm for sample-efficient reward-free exploration under minimal structural assumptions, which covers the previously studied settings of linear MDPs (Jin et al., 2020b), linear completeness (Zanette et al., 2020b) and low-rank MDPs with unknown representation (Modi et al., 2021). Our analyses indicate that the explorability or reachability assumptions, previously made for the latter two settings, are not necessary statistically for reward-free exploration. On the negative side, we provide a statistical hardness result for both reward-free and reward-aware exploration under linear completeness assumptions when the underlying features are unknown, showing an exponential separation between low-rank and linear completeness settings.
△ Less
Submitted 22 October, 2022; v1 submitted 21 June, 2022;
originally announced June 2022.
-
COGMEN: COntextualized GNN based Multimodal Emotion recognitioN
Authors:
Abhinav Joshi,
Ashwani Bhat,
Ayush Jain,
Atin Vikram Singh,
Ashutosh Modi
Abstract:
Emotions are an inherent part of human interactions, and consequently, it is imperative to develop AI systems that understand and recognize human emotions. During a conversation involving various people, a person's emotions are influenced by the other speaker's utterances and their own emotional state over the utterances. In this paper, we propose COntextualized Graph Neural Network based Multimod…
▽ More
Emotions are an inherent part of human interactions, and consequently, it is imperative to develop AI systems that understand and recognize human emotions. During a conversation involving various people, a person's emotions are influenced by the other speaker's utterances and their own emotional state over the utterances. In this paper, we propose COntextualized Graph Neural Network based Multimodal Emotion recognitioN (COGMEN) system that leverages local information (i.e., inter/intra dependency between speakers) and global information (context). The proposed model uses Graph Neural Network (GNN) based architecture to model the complex dependencies (local and global information) in a conversation. Our model gives state-of-the-art (SOTA) results on IEMOCAP and MOSEI datasets, and detailed ablation experiments show the importance of modeling information at both levels.
△ Less
Submitted 5 May, 2022;
originally announced May 2022.
-
HLDC: Hindi Legal Documents Corpus
Authors:
Arnav Kapoor,
Mudit Dhawan,
Anmol Goel,
T. H. Arjun,
Akshala Bhatnagar,
Vibhu Agrawal,
Amul Agrawal,
Arnab Bhattacharya,
Ponnurangam Kumaraguru,
Ashutosh Modi
Abstract:
Many populous countries including India are burdened with a considerable backlog of legal cases. Development of automated systems that could process legal documents and augment legal practitioners can mitigate this. However, there is a dearth of high-quality corpora that is needed to develop such data-driven systems. The problem gets even more pronounced in the case of low resource languages such…
▽ More
Many populous countries including India are burdened with a considerable backlog of legal cases. Development of automated systems that could process legal documents and augment legal practitioners can mitigate this. However, there is a dearth of high-quality corpora that is needed to develop such data-driven systems. The problem gets even more pronounced in the case of low resource languages such as Hindi. In this resource paper, we introduce the Hindi Legal Documents Corpus (HLDC), a corpus of more than 900K legal documents in Hindi. Documents are cleaned and structured to enable the development of downstream applications. Further, as a use-case for the corpus, we introduce the task of bail prediction. We experiment with a battery of models and propose a Multi-Task Learning (MTL) based model for the same. MTL models use summarization as an auxiliary task along with bail prediction as the main task. Experiments with different models are indicative of the need for further research in this area. We release the corpus and model implementation code with this paper: https://github.com/Exploration-Lab/HLDC
△ Less
Submitted 24 May, 2024; v1 submitted 2 April, 2022;
originally announced April 2022.
-
Corpus for Automatic Structuring of Legal Documents
Authors:
Prathamesh Kalamkar,
Aman Tiwari,
Astha Agarwal,
Saurabh Karn,
Smita Gupta,
Vivek Raghavan,
Ashutosh Modi
Abstract:
In populous countries, pending legal cases have been growing exponentially. There is a need for developing techniques for processing and organizing legal documents. In this paper, we introduce a new corpus for structuring legal documents. In particular, we introduce a corpus of legal judgment documents in English that are segmented into topical and coherent parts. Each of these parts is annotated…
▽ More
In populous countries, pending legal cases have been growing exponentially. There is a need for developing techniques for processing and organizing legal documents. In this paper, we introduce a new corpus for structuring legal documents. In particular, we introduce a corpus of legal judgment documents in English that are segmented into topical and coherent parts. Each of these parts is annotated with a label coming from a list of pre-defined Rhetorical Roles. We develop baseline models for automatically predicting rhetorical roles in a legal document based on the annotated corpus. Further, we show the application of rhetorical roles to improve performance on the tasks of summarization and legal judgment prediction. We release the corpus and baseline model code along with the paper.
△ Less
Submitted 19 September, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Joint Learning-Based Stabilization of Multiple Unknown Linear Systems
Authors:
Mohamad Kazem Shirani Faradonbeh,
Aditya Modi
Abstract:
Learning-based control of linear systems received a lot of attentions recently. In popular settings, the true dynamical models are unknown to the decision-maker and need to be interactively learned by applying control inputs to the systems. Unlike the matured literature of efficient reinforcement learning policies for adaptive control of a single system, results on joint learning of multiple syste…
▽ More
Learning-based control of linear systems received a lot of attentions recently. In popular settings, the true dynamical models are unknown to the decision-maker and need to be interactively learned by applying control inputs to the systems. Unlike the matured literature of efficient reinforcement learning policies for adaptive control of a single system, results on joint learning of multiple systems are not currently available. Especially, the important problem of fast and reliable joint-stabilization remains unaddressed and so is the focus of this work. We propose a novel joint learning-based stabilization algorithm for quickly learning stabilizing policies for all systems understudy, from the data of unstable state trajectories. The presented procedure is shown to be notably effective such that it stabilizes the family of dynamical systems in an extremely short time period.
△ Less
Submitted 1 January, 2022;
originally announced January 2022.
-
Joint Learning of Linear Time-Invariant Dynamical Systems
Authors:
Aditya Modi,
Mohamad Kazem Shirani Faradonbeh,
Ambuj Tewari,
George Michailidis
Abstract:
Linear time-invariant systems are very popular models in system theory and applications. A fundamental problem in system identification that remains rather unaddressed in extant literature is to leverage commonalities amongst related linear systems to estimate their transition matrices more accurately. To address this problem, the current paper investigates methods for jointly estimating the trans…
▽ More
Linear time-invariant systems are very popular models in system theory and applications. A fundamental problem in system identification that remains rather unaddressed in extant literature is to leverage commonalities amongst related linear systems to estimate their transition matrices more accurately. To address this problem, the current paper investigates methods for jointly estimating the transition matrices of multiple systems. It is assumed that the transition matrices are unknown linear functions of some unknown shared basis matrices. We establish finite-time estimation error rates that fully reflect the roles of trajectory lengths, dimension, and number of systems under consideration. The presented results are fairly general and show the significant gains that can be achieved by pooling data across systems in comparison to learning each system individually. Further, they are shown to be robust against model misspecifications. To obtain the results, we develop novel techniques that are of interest for addressing similar joint-learning problems. They include tightly bounding estimation errors in terms of the eigen-structures of transition matrices, establishing sharp high probability bounds for singular values of dependent random matrices, and capturing effects of misspecified transition matrices as the systems evolve over time.
△ Less
Submitted 2 January, 2024; v1 submitted 20 December, 2021;
originally announced December 2021.
-
Shapes of Emotions: Multimodal Emotion Recognition in Conversations via Emotion Shifts
Authors:
Harsh Agarwal,
Keshav Bansal,
Abhinav Joshi,
Ashutosh Modi
Abstract:
Emotion Recognition in Conversations (ERC) is an important and active research area. Recent work has shown the benefits of using multiple modalities (e.g., text, audio, and video) for the ERC task. In a conversation, participants tend to maintain a particular emotional state unless some stimuli evokes a change. There is a continuous ebb and flow of emotions in a conversation. Inspired by this obse…
▽ More
Emotion Recognition in Conversations (ERC) is an important and active research area. Recent work has shown the benefits of using multiple modalities (e.g., text, audio, and video) for the ERC task. In a conversation, participants tend to maintain a particular emotional state unless some stimuli evokes a change. There is a continuous ebb and flow of emotions in a conversation. Inspired by this observation, we propose a multimodal ERC model and augment it with an emotion-shift component that improves performance. The proposed emotion-shift component is modular and can be added to any existing multimodal ERC model (with a few modifications). We experiment with different variants of the model, and results show that the inclusion of emotion shift signal helps the model to outperform existing models for ERC on MOSEI and IEMOCAP datasets.
△ Less
Submitted 7 November, 2022; v1 submitted 3 December, 2021;
originally announced December 2021.
-
Semantic Segmentation of Legal Documents via Rhetorical Roles
Authors:
Vijit Malik,
Rishabh Sanjay,
Shouvik Kumar Guha,
Angshuman Hazarika,
Shubham Nigam,
Arnab Bhattacharya,
Ashutosh Modi
Abstract:
Legal documents are unstructured, use legal jargon, and have considerable length, making them difficult to process automatically via conventional text processing techniques. A legal document processing system would benefit substantially if the documents could be segmented into coherent information units. This paper proposes a new corpus of legal documents annotated (with the help of legal experts)…
▽ More
Legal documents are unstructured, use legal jargon, and have considerable length, making them difficult to process automatically via conventional text processing techniques. A legal document processing system would benefit substantially if the documents could be segmented into coherent information units. This paper proposes a new corpus of legal documents annotated (with the help of legal experts) with a set of 13 semantically coherent units labels (referred to as Rhetorical Roles), e.g., facts, arguments, statute, issue, precedent, ruling, and ratio. We perform a thorough analysis of the corpus and the annotations. For automatically segmenting the legal documents, we experiment with the task of rhetorical role prediction: given a document, predict the text segments corresponding to various roles. Using the created corpus, we experiment extensively with various deep learning-based baseline models for the task. Further, we develop a multitask learning (MTL) based deep model with document rhetorical role label shift as an auxiliary task for segmenting a legal document. The proposed model shows superior performance over the existing models. We also experiment with model performance in the case of domain transfer and model distillation techniques to see the model performance in limited data conditions.
△ Less
Submitted 7 November, 2022; v1 submitted 3 December, 2021;
originally announced December 2021.
-
Design and Evaluation of Reconfigurable Intelligent Surfaces in Real-World Environment
Authors:
Georgios C. Trichopoulos,
Panagiotis Theofanopoulos,
Bharath Kashyap,
Aditya Shekhawat,
Anuj Modi,
Tawfik Osman,
Sanjay Kumar,
Anand Sengar,
Arkajyoti Chang,
Ahmed Alkhateeb
Abstract:
Reconfigurable intelligent surfaces (RISs) have promising coverage and data rate gains for wireless communication systems in 5G and beyond. Prior work has mainly focused on analyzing the performance of these surfaces using computer simulations or lab-level prototypes. To draw accurate insights about the actual performance of these systems, this paper develops an RIS proof-of-concept prototype and…
▽ More
Reconfigurable intelligent surfaces (RISs) have promising coverage and data rate gains for wireless communication systems in 5G and beyond. Prior work has mainly focused on analyzing the performance of these surfaces using computer simulations or lab-level prototypes. To draw accurate insights about the actual performance of these systems, this paper develops an RIS proof-of-concept prototype and extensively evaluates its potential gains in the field and under realistic wireless communication settings. In particular, a 160-element reconfigurable surface, operating at a 5.8GHz band, is first designed, fabricated, and accurately measured in the anechoic chamber. This surface is then integrated into a wireless communication system and the beamforming gains, path-loss, and coverage improvements are evaluated in realistic outdoor communication scenarios. When both the transmitter and receiver employ directional antennas and with 5m and 10m distances between the transmitter-RIS and RIS-receiver, the developed RIS achieves $15$-$20$dB gain in the signal-to-noise ratio (SNR) in a range of $\pm60^\circ$ beamforming angles. In terms of coverage, and considering a far-field experiment with a blockage between a base station and a grid of mobile users and with an average distance of $35m$ between base station (BS) and the user (through the RIS), the RIS provides an average SNR improvement of $6$dB (max $8$dB) within an area $> 75$m$^2$. Thanks to the scalable RIS design, these SNR gains can be directly increased with larger RIS areas. For example, a 1,600-element RIS with the same design is expected to provide around $26$dB SNR gain for a similar deployment. These results, among others, draw useful insights into the design and performance of RIS systems and provide an important proof for their potential gains in real-world far-field wireless communication environments.
△ Less
Submitted 16 September, 2021;
originally announced September 2021.
-
Fine-Grained Emotion Prediction by Modeling Emotion Definitions
Authors:
Gargi Singh,
Dhanajit Brahma,
Piyush Rai,
Ashutosh Modi
Abstract:
In this paper, we propose a new framework for fine-grained emotion prediction in the text through emotion definition modeling. Our approach involves a multi-task learning framework that models definitions of emotions as an auxiliary task while being trained on the primary task of emotion prediction. We model definitions using masked language modeling and class definition prediction tasks. Our mode…
▽ More
In this paper, we propose a new framework for fine-grained emotion prediction in the text through emotion definition modeling. Our approach involves a multi-task learning framework that models definitions of emotions as an auxiliary task while being trained on the primary task of emotion prediction. We model definitions using masked language modeling and class definition prediction tasks. Our models outperform existing state-of-the-art for fine-grained emotion dataset GoEmotions. We further show that this trained model can be used for transfer learning on other benchmark datasets in emotion prediction with varying emotion label sets, domains, and sizes. The proposed models outperform the baselines on transfer learning experiments demonstrating the generalization capability of the models.
△ Less
Submitted 26 July, 2021;
originally announced July 2021.
-
Pre-trained Language Models as Prior Knowledge for Playing Text-based Games
Authors:
Ishika Singh,
Gargi Singh,
Ashutosh Modi
Abstract:
Recently, text world games have been proposed to enable artificial agents to understand and reason about real-world scenarios. These text-based games are challenging for artificial agents, as it requires an understanding of and interaction using natural language in a partially observable environment. Agents observe the environment via textual descriptions designed to be challenging enough for even…
▽ More
Recently, text world games have been proposed to enable artificial agents to understand and reason about real-world scenarios. These text-based games are challenging for artificial agents, as it requires an understanding of and interaction using natural language in a partially observable environment. Agents observe the environment via textual descriptions designed to be challenging enough for even human players. Past approaches have not paid enough attention to the language understanding capability of the proposed agents. Typically, these approaches train from scratch, an agent that learns both textual representations and the gameplay online during training using a temporal loss function. Given the sample-inefficiency of RL approaches, it is inefficient to learn rich enough textual representations to be able to understand and reason using the textual observation in such a complicated game environment setting. In this paper, we improve the semantic understanding of the agent by proposing a simple RL with LM framework where we use transformer-based language models with Deep RL models. We perform a detailed study of our framework to demonstrate how our model outperforms all existing agents on the popular game, Zork1, to achieve a score of 44.7, which is 1.6 higher than the state-of-the-art model. Overall, our proposed approach outperforms 4 games out of the 14 text-based games, while performing comparable to the state-of-the-art models on the remaining games.
△ Less
Submitted 23 December, 2021; v1 submitted 18 July, 2021;
originally announced July 2021.
-
Delta Sampling R-BERT for limited data and low-light action recognition
Authors:
Sanchit Hira,
Ritwik Das,
Abhinav Modi,
Daniil Pakhomov
Abstract:
We present an approach to perform supervised action recognition in the dark. In this work, we present our results on the ARID dataset. Most previous works only evaluate performance on large, well illuminated datasets like Kinetics and HMDB51. We demonstrate that our work is able to achieve a very low error rate while being trained on a much smaller dataset of dark videos. We also explore a variety…
▽ More
We present an approach to perform supervised action recognition in the dark. In this work, we present our results on the ARID dataset. Most previous works only evaluate performance on large, well illuminated datasets like Kinetics and HMDB51. We demonstrate that our work is able to achieve a very low error rate while being trained on a much smaller dataset of dark videos. We also explore a variety of training and inference strategies including domain transfer methodologies and also propose a simple but useful frame selection strategy. Our empirical results demonstrate that we beat previously published baseline models by 11%.
△ Less
Submitted 12 July, 2021;
originally announced July 2021.
-
KEA: Tuning an Exabyte-Scale Data Infrastructure
Authors:
Yiwen Zhu,
Subru Krishnan,
Konstantinos Karanasos,
Isha Tarte,
Conor Power,
Abhishek Modi,
Manoj Kumar,
Deli Zhang,
Kartheek Muthyala,
Nick Jurgens,
Sarvesh Sakalanaga,
Sudhir Darbha,
Minu Iyer,
Ankita Agarwal,
Carlo Curino
Abstract:
Microsoft's internal big-data infrastructure is one of the largest in the world -- with over 300k machines running billions of tasks from over 0.6M daily jobs. Operating this infrastructure is a costly and complex endeavor, and efficiency is paramount. In fact, for over 15 years, a dedicated engineering team has tuned almost every aspect of this infrastructure, achieving state-of-the-art efficienc…
▽ More
Microsoft's internal big-data infrastructure is one of the largest in the world -- with over 300k machines running billions of tasks from over 0.6M daily jobs. Operating this infrastructure is a costly and complex endeavor, and efficiency is paramount. In fact, for over 15 years, a dedicated engineering team has tuned almost every aspect of this infrastructure, achieving state-of-the-art efficiency (>60% average CPU utilization across all clusters). Despite rich telemetry and strong expertise, faced with evolving hardware/software/workloads this manual tuning approach had reached its limit -- we had plateaued.
In this paper, we present KEA, a multi-year effort to automate our tuning processes to be fully data/model-driven. KEA leverages a mix of domain knowledge and principled data science to capture the essence of our cluster dynamic behavior in a set of machine learning (ML) models based on collected system data. These models power automated optimization procedures for parameter tuning, and inform our leadership in critical decisions around engineering and capacity management (such as hardware and data center design, software investments, etc.). We combine "observational" tuning (i.e., using models to predict system behavior without direct experimentation) with judicious use of "flighting" (i.e., conservative testing in production). This allows us to support a broad range of applications that we discuss in this paper.
KEA continuously tunes our cluster configurations and is on track to save Microsoft tens of millions of dollars per year. At the best of our knowledge, this paper is the first to discuss research challenges and practical learnings that emerge when tuning an exabyte-scale data infrastructure.
△ Less
Submitted 21 June, 2021;
originally announced June 2021.
-
ILDC for CJPE: Indian Legal Documents Corpus for Court Judgment Prediction and Explanation
Authors:
Vijit Malik,
Rishabh Sanjay,
Shubham Kumar Nigam,
Kripa Ghosh,
Shouvik Kumar Guha,
Arnab Bhattacharya,
Ashutosh Modi
Abstract:
An automated system that could assist a judge in predicting the outcome of a case would help expedite the judicial process. For such a system to be practically useful, predictions by the system should be explainable. To promote research in developing such a system, we introduce ILDC (Indian Legal Documents Corpus). ILDC is a large corpus of 35k Indian Supreme Court cases annotated with original co…
▽ More
An automated system that could assist a judge in predicting the outcome of a case would help expedite the judicial process. For such a system to be practically useful, predictions by the system should be explainable. To promote research in developing such a system, we introduce ILDC (Indian Legal Documents Corpus). ILDC is a large corpus of 35k Indian Supreme Court cases annotated with original court decisions. A portion of the corpus (a separate test set) is annotated with gold standard explanations by legal experts. Based on ILDC, we propose the task of Court Judgment Prediction and Explanation (CJPE). The task requires an automated system to predict an explainable outcome of a case. We experiment with a battery of baseline models for case predictions and propose a hierarchical occlusion based model for explainability. Our best prediction model has an accuracy of 78% versus 94% for human legal experts, pointing towards the complexity of the prediction task. The analysis of explanations by the proposed algorithm reveals a significant difference in the point of view of the algorithm and legal experts for explaining the judgments, pointing towards scope for future research.
△ Less
Submitted 31 May, 2021; v1 submitted 27 May, 2021;
originally announced May 2021.
-
NLP for Climate Policy: Creating a Knowledge Platform for Holistic and Effective Climate Action
Authors:
Pradip Swarnakar,
Ashutosh Modi
Abstract:
Climate change is a burning issue of our time, with the Sustainable Development Goal (SDG) 13 of the United Nations demanding global climate action. Realizing the urgency, in 2015 in Paris, world leaders signed an agreement committing to taking voluntary action to reduce carbon emissions. However, the scale, magnitude, and climate action processes vary globally, especially between developed and de…
▽ More
Climate change is a burning issue of our time, with the Sustainable Development Goal (SDG) 13 of the United Nations demanding global climate action. Realizing the urgency, in 2015 in Paris, world leaders signed an agreement committing to taking voluntary action to reduce carbon emissions. However, the scale, magnitude, and climate action processes vary globally, especially between developed and developing countries. Therefore, from parliament to social media, the debates and discussions on climate change gather data from wide-ranging sources essential to the policy design and implementation. The downside is that we do not currently have the mechanisms to pool the worldwide dispersed knowledge emerging from the structured and unstructured data sources.
The paper thematically discusses how NLP techniques could be employed in climate policy research and contribute to society's good at large. In particular, we exemplify symbiosis of NLP and Climate Policy Research via four methodologies. The first one deals with the major topics related to climate policy using automated content analysis. We investigate the opinions (sentiments) of major actors' narratives towards climate policy in the second methodology. The third technique explores the climate actors' beliefs towards pro or anti-climate orientation. Finally, we discuss developing a Climate Knowledge Graph.
The present theme paper further argues that creating a knowledge platform would help in the formulation of a holistic climate policy and effective climate action. Such a knowledge platform would integrate the policy actors' varied opinions from different social sectors like government, business, civil society, and the scientific community. The research outcome will add value to effective climate action because policymakers can make informed decisions by looking at the diverse public opinion on a comprehensive platform.
△ Less
Submitted 12 May, 2021;
originally announced May 2021.
-
BreakingBERT@IITK at SemEval-2021 Task 9 : Statement Verification and Evidence Finding with Tables
Authors:
Aditya Jindal,
Ankur Gupta,
Jaya Srivastava,
Preeti Menghwani,
Vijit Malik,
Vishesh Kaushik,
Ashutosh Modi
Abstract:
Recently, there has been an interest in factual verification and prediction over structured data like tables and graphs. To circumvent any false news incident, it is necessary to not only model and predict over structured data efficiently but also to explain those predictions. In this paper, as part of the SemEval-2021 Task 9, we tackle the problem of fact verification and evidence finding over ta…
▽ More
Recently, there has been an interest in factual verification and prediction over structured data like tables and graphs. To circumvent any false news incident, it is necessary to not only model and predict over structured data efficiently but also to explain those predictions. In this paper, as part of the SemEval-2021 Task 9, we tackle the problem of fact verification and evidence finding over tabular data. There are two subtasks. Given a table and a statement/fact, subtask A determines whether the statement is inferred from the tabular data, and subtask B determines which cells in the table provide evidence for the former subtask. We make a comparison of the baselines and state-of-the-art approaches over the given SemTabFact dataset. We also propose a novel approach CellBERT to solve evidence finding as a form of the Natural Language Inference task. We obtain a 3-way F1 score of 0.69 on subtask A and an F1 score of 0.65 on subtask B.
△ Less
Submitted 10 April, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
KnowGraph@IITK at SemEval-2021 Task 11: Building KnowledgeGraph for NLP Research
Authors:
Shashank Shailabh,
Sajal Chaurasia,
Ashutosh Modi
Abstract:
Research in Natural Language Processing is making rapid advances, resulting in the publication of a large number of research papers. Finding relevant research papers and their contribution to the domain is a challenging problem. In this paper, we address this challenge via the SemEval 2021 Task 11: NLPContributionGraph, by developing a system for a research paper contributions-focused knowledge gr…
▽ More
Research in Natural Language Processing is making rapid advances, resulting in the publication of a large number of research papers. Finding relevant research papers and their contribution to the domain is a challenging problem. In this paper, we address this challenge via the SemEval 2021 Task 11: NLPContributionGraph, by developing a system for a research paper contributions-focused knowledge graph over Natural Language Processing literature. The task is divided into three sub-tasks: extracting contribution sentences that show important contributions in the research article, extracting phrases from the contribution sentences, and predicting the information units in the research article together with triplet formation from the phrases. The proposed system is agnostic to the subject domain and can be applied for building a knowledge graph for any area. We found that transformer-based language models can significantly improve existing techniques and utilized the SciBERT-based model. Our first sub-task uses Bidirectional LSTM (BiLSTM) stacked on top of SciBERT model layers, while the second sub-task uses Conditional Random Field (CRF) on top of SciBERT with BiLSTM. The third sub-task uses a combined SciBERT based neural approach with heuristics for information unit prediction and triplet formation from the phrases. Our system achieved F1 score of 0.38, 0.63 and 0.76 in end-to-end pipeline testing, phrase extraction testing and triplet extraction testing respectively.
△ Less
Submitted 4 April, 2021;
originally announced April 2021.
-
MCL@IITK at SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation using Augmented Data, Signals, and Transformers
Authors:
Rohan Gupta,
Jay Mundra,
Deepak Mahajan,
Ashutosh Modi
Abstract:
In this work, we present our approach for solving the SemEval 2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC). The task is a sentence pair classification problem where the goal is to detect whether a given word common to both the sentences evokes the same meaning. We submit systems for both the settings - Multilingual (the pair's sentences belong to the same la…
▽ More
In this work, we present our approach for solving the SemEval 2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC). The task is a sentence pair classification problem where the goal is to detect whether a given word common to both the sentences evokes the same meaning. We submit systems for both the settings - Multilingual (the pair's sentences belong to the same language) and Cross-Lingual (the pair's sentences belong to different languages). The training data is provided only in English. Consequently, we employ cross-lingual transfer techniques. Our approach employs fine-tuning pre-trained transformer-based language models, like ELECTRA and ALBERT, for the English task and XLM-R for all other tasks. To improve these systems' performance, we propose adding a signal to the word to be disambiguated and augmenting our data by sentence pair reversal. We further augment the dataset provided to us with WiC, XL-WiC and SemCor 3.0. Using ensembles, we achieve strong performance in the Multilingual task, placing first in the EN-EN and FR-FR sub-tasks. For the Cross-Lingual setting, we employed translate-test methods and a zero-shot method, using our multilingual models, with the latter performing slightly better.
△ Less
Submitted 4 April, 2021;
originally announced April 2021.
-
IITK@Detox at SemEval-2021 Task 5: Semi-Supervised Learning and Dice Loss for Toxic Spans Detection
Authors:
Archit Bansal,
Abhay Kaushik,
Ashutosh Modi
Abstract:
In this work, we present our approach and findings for SemEval-2021 Task 5 - Toxic Spans Detection. The task's main aim was to identify spans to which a given text's toxicity could be attributed. The task is challenging mainly due to two constraints: the small training dataset and imbalanced class distribution. Our paper investigates two techniques, semi-supervised learning and learning with Self-…
▽ More
In this work, we present our approach and findings for SemEval-2021 Task 5 - Toxic Spans Detection. The task's main aim was to identify spans to which a given text's toxicity could be attributed. The task is challenging mainly due to two constraints: the small training dataset and imbalanced class distribution. Our paper investigates two techniques, semi-supervised learning and learning with Self-Adjusting Dice Loss, for tackling these challenges. Our submitted system (ranked ninth on the leader board) consisted of an ensemble of various pre-trained Transformer Language Models trained using either of the above-proposed techniques.
△ Less
Submitted 4 April, 2021;
originally announced April 2021.
-
ReCAM@IITK at SemEval-2021 Task 4: BERT and ALBERT based Ensemble for Abstract Word Prediction
Authors:
Abhishek Mittal,
Ashutosh Modi
Abstract:
This paper describes our system for Task 4 of SemEval-2021: Reading Comprehension of Abstract Meaning (ReCAM). We participated in all subtasks where the main goal was to predict an abstract word missing from a statement. We fine-tuned the pre-trained masked language models namely BERT and ALBERT and used an Ensemble of these as our submitted system on Subtask 1 (ReCAM-Imperceptibility) and Subtask…
▽ More
This paper describes our system for Task 4 of SemEval-2021: Reading Comprehension of Abstract Meaning (ReCAM). We participated in all subtasks where the main goal was to predict an abstract word missing from a statement. We fine-tuned the pre-trained masked language models namely BERT and ALBERT and used an Ensemble of these as our submitted system on Subtask 1 (ReCAM-Imperceptibility) and Subtask 2 (ReCAM-Nonspecificity). For Subtask 3 (ReCAM-Intersection), we submitted the ALBERT model as it gives the best results. We tried multiple approaches and found that Masked Language Modeling(MLM) based approach works the best.
△ Less
Submitted 4 April, 2021;
originally announced April 2021.
-
Counts@IITK at SemEval-2021 Task 8: SciBERT Based Entity And Semantic Relation Extraction For Scientific Data
Authors:
Akash Gangwar,
Sabhay Jain,
Shubham Sourav,
Ashutosh Modi
Abstract:
This paper presents the system for SemEval 2021 Task 8 (MeasEval). MeasEval is a novel span extraction, classification, and relation extraction task focused on finding quantities, attributes of these quantities, and additional information, including the related measured entities, properties, and measurement contexts. Our submitted system, which placed fifth (team rank) on the leaderboard, consiste…
▽ More
This paper presents the system for SemEval 2021 Task 8 (MeasEval). MeasEval is a novel span extraction, classification, and relation extraction task focused on finding quantities, attributes of these quantities, and additional information, including the related measured entities, properties, and measurement contexts. Our submitted system, which placed fifth (team rank) on the leaderboard, consisted of SciBERT with [CLS] token embedding and CRF layer on top. We were also placed first in Quantity (tied) and Unit subtasks, second in MeasuredEntity, Modifier and Qualifies subtasks, and third in Qualifier subtask.
△ Less
Submitted 3 April, 2021;
originally announced April 2021.
-
IITK@LCP at SemEval 2021 Task 1: Classification for Lexical Complexity Regression Task
Authors:
Neil Rajiv Shirude,
Sagnik Mukherjee,
Tushar Shandhilya,
Ananta Mukherjee,
Ashutosh Modi
Abstract:
This paper describes our contribution to SemEval 2021 Task 1: Lexical Complexity Prediction. In our approach, we leverage the ELECTRA model and attempt to mirror the data annotation scheme. Although the task is a regression task, we show that we can treat it as an aggregation of several classification and regression models. This somewhat counter-intuitive approach achieved an MAE score of 0.0654 f…
▽ More
This paper describes our contribution to SemEval 2021 Task 1: Lexical Complexity Prediction. In our approach, we leverage the ELECTRA model and attempt to mirror the data annotation scheme. Although the task is a regression task, we show that we can treat it as an aggregation of several classification and regression models. This somewhat counter-intuitive approach achieved an MAE score of 0.0654 for Sub-Task 1 and MAE of 0.0811 on Sub-Task 2. Additionally, we used the concept of weak supervision signals from Gloss-BERT in our work, and it significantly improved the MAE score in Sub-Task 1.
△ Less
Submitted 2 April, 2021;
originally announced April 2021.
-
Humor@IITK at SemEval-2021 Task 7: Large Language Models for Quantifying Humor and Offensiveness
Authors:
Aishwarya Gupta,
Avik Pal,
Bholeshwar Khurana,
Lakshay Tyagi,
Ashutosh Modi
Abstract:
Humor and Offense are highly subjective due to multiple word senses, cultural knowledge, and pragmatic competence. Hence, accurately detecting humorous and offensive texts has several compelling use cases in Recommendation Systems and Personalized Content Moderation. However, due to the lack of an extensive labeled dataset, most prior works in this domain haven't explored large neural models for s…
▽ More
Humor and Offense are highly subjective due to multiple word senses, cultural knowledge, and pragmatic competence. Hence, accurately detecting humorous and offensive texts has several compelling use cases in Recommendation Systems and Personalized Content Moderation. However, due to the lack of an extensive labeled dataset, most prior works in this domain haven't explored large neural models for subjective humor understanding. This paper explores whether large neural models and their ensembles can capture the intricacies associated with humor/offense detection and rating. Our experiments on the SemEval-2021 Task 7: HaHackathon show that we can develop reasonable humor and offense detection systems with such models. Our models are ranked third in subtask 1b and consistently ranked around the top 33% of the leaderboard for the remaining subtasks.
△ Less
Submitted 2 April, 2021;
originally announced April 2021.
-
An End-to-End Network for Emotion-Cause Pair Extraction
Authors:
Aaditya Singh,
Shreeshail Hingane,
Saim Wani,
Ashutosh Modi
Abstract:
The task of Emotion-Cause Pair Extraction (ECPE) aims to extract all potential clause-pairs of emotions and their corresponding causes in a document. Unlike the more well-studied task of Emotion Cause Extraction (ECE), ECPE does not require the emotion clauses to be provided as annotations. Previous works on ECPE have either followed a multi-stage approach where emotion extraction, cause extractio…
▽ More
The task of Emotion-Cause Pair Extraction (ECPE) aims to extract all potential clause-pairs of emotions and their corresponding causes in a document. Unlike the more well-studied task of Emotion Cause Extraction (ECE), ECPE does not require the emotion clauses to be provided as annotations. Previous works on ECPE have either followed a multi-stage approach where emotion extraction, cause extraction, and pairing are done independently or use complex architectures to resolve its limitations. In this paper, we propose an end-to-end model for the ECPE task. Due to the unavailability of an English language ECPE corpus, we adapt the NTCIR-13 ECE corpus and establish a baseline for the ECPE task on this dataset. On this dataset, the proposed method produces significant performance improvements (~6.5 increase in F1 score) over the multi-stage approach and achieves comparable performance to the state-of-the-art methods.
△ Less
Submitted 3 March, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
Model-free Representation Learning and Exploration in Low-rank MDPs
Authors:
Aditya Modi,
Jinglin Chen,
Akshay Krishnamurthy,
Nan Jiang,
Alekh Agarwal
Abstract:
The low rank MDP has emerged as an important model for studying representation learning and exploration in reinforcement learning. With a known representation, several model-free exploration strategies exist. In contrast, all algorithms for the unknown representation setting are model-based, thereby requiring the ability to model the full dynamics. In this work, we present the first model-free rep…
▽ More
The low rank MDP has emerged as an important model for studying representation learning and exploration in reinforcement learning. With a known representation, several model-free exploration strategies exist. In contrast, all algorithms for the unknown representation setting are model-based, thereby requiring the ability to model the full dynamics. In this work, we present the first model-free representation learning algorithms for low rank MDPs. The key algorithmic contribution is a new minimax representation learning objective, for which we provide variants with differing tradeoffs in their statistical and computational properties. We interleave this representation learning step with an exploration strategy to cover the state space in a reward-free manner. The resulting algorithms are provably sample efficient and can accommodate general function approximation to scale to complex environments.
△ Less
Submitted 21 June, 2022; v1 submitted 13 February, 2021;
originally announced February 2021.
-
Adv-OLM: Generating Textual Adversaries via OLM
Authors:
Vijit Malik,
Ashwani Bhat,
Ashutosh Modi
Abstract:
Deep learning models are susceptible to adversarial examples that have imperceptible perturbations in the original input, resulting in adversarial attacks against these models. Analysis of these attacks on the state of the art transformers in NLP can help improve the robustness of these models against such adversarial inputs. In this paper, we present Adv-OLM, a black-box attack method that adapts…
▽ More
Deep learning models are susceptible to adversarial examples that have imperceptible perturbations in the original input, resulting in adversarial attacks against these models. Analysis of these attacks on the state of the art transformers in NLP can help improve the robustness of these models against such adversarial inputs. In this paper, we present Adv-OLM, a black-box attack method that adapts the idea of Occlusion and Language Models (OLM) to the current state of the art attack methods. OLM is used to rank words of a sentence, which are later substituted using word replacement strategies. We experimentally show that our approach outperforms other attack methods for several text classification tasks.
△ Less
Submitted 21 January, 2021;
originally announced January 2021.
-
Adapting a Language Model for Controlled Affective Text Generation
Authors:
Ishika Singh,
Ahsan Barkati,
Tushar Goswamy,
Ashutosh Modi
Abstract:
Human use language not just to convey information but also to express their inner feelings and mental states. In this work, we adapt the state-of-the-art language generation models to generate affective (emotional) text. We posit a model capable of generating affect-driven and topic-focused sentences without losing grammatical correctness as the affect intensity increases. We propose to incorporat…
▽ More
Human use language not just to convey information but also to express their inner feelings and mental states. In this work, we adapt the state-of-the-art language generation models to generate affective (emotional) text. We posit a model capable of generating affect-driven and topic-focused sentences without losing grammatical correctness as the affect intensity increases. We propose to incorporate emotion as prior for the probabilistic state-of-the-art text generation model such as GPT-2. The model gives a user the flexibility to control the category and intensity of emotion as well as the topic of the generated text. Previous attempts at modelling fine-grained emotions fall out on grammatical correctness at extreme intensities, but our model is resilient to this and delivers robust results at all intensities. We conduct automated evaluations and human studies to test the performance of our model and provide a detailed comparison of the results with other models. In all evaluations, our model outperforms existing affective text generation models.
△ Less
Submitted 8 November, 2020;
originally announced November 2020.
-
AI-based Monitoring and Response System for Hospital Preparedness towards COVID-19 in Southeast Asia
Authors:
Tushar Goswamy,
Naishadh Parmar,
Ayush Gupta,
Raunak Shah,
Vatsalya Tandon,
Varun Goyal,
Sanyog Gupta,
Karishma Laud,
Shivam Gupta,
Sudhanshu Mishra,
Ashutosh Modi
Abstract:
This research paper proposes a COVID-19 monitoring and response system to identify the surge in the volume of patients at hospitals and shortage of critical equipment like ventilators in South-east Asian countries, to understand the burden on health facilities. This can help authorities in these regions with resource planning measures to redirect resources to the regions identified by the model. D…
▽ More
This research paper proposes a COVID-19 monitoring and response system to identify the surge in the volume of patients at hospitals and shortage of critical equipment like ventilators in South-east Asian countries, to understand the burden on health facilities. This can help authorities in these regions with resource planning measures to redirect resources to the regions identified by the model. Due to the lack of publicly available data on the influx of patients in hospitals, or the shortage of equipment, ICU units or hospital beds that regions in these countries might be facing, we leverage Twitter data for gleaning this information. The approach has yielded accurate results for states in India, and we are working on validating the model for the remaining countries so that it can serve as a reliable tool for authorities to monitor the burden on hospitals.
△ Less
Submitted 5 September, 2022; v1 submitted 30 July, 2020;
originally announced July 2020.
-
Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies
Authors:
Shengpu Tang,
Aditya Modi,
Michael W. Sjoding,
Jenna Wiens
Abstract:
Standard reinforcement learning (RL) aims to find an optimal policy that identifies the best action for each state. However, in healthcare settings, many actions may be near-equivalent with respect to the reward (e.g., survival). We consider an alternative objective -- learning set-valued policies to capture near-equivalent actions that lead to similar cumulative rewards. We propose a model-free a…
▽ More
Standard reinforcement learning (RL) aims to find an optimal policy that identifies the best action for each state. However, in healthcare settings, many actions may be near-equivalent with respect to the reward (e.g., survival). We consider an alternative objective -- learning set-valued policies to capture near-equivalent actions that lead to similar cumulative rewards. We propose a model-free algorithm based on temporal difference learning and a near-greedy heuristic for action selection. We analyze the theoretical properties of the proposed algorithm, providing optimality guarantees and demonstrate our approach on simulated environments and a real clinical task. Empirically, the proposed algorithm exhibits good convergence properties and discovers meaningful near-equivalent actions. Our work provides theoretical, as well as practical, foundations for clinician/human-in-the-loop decision making, in which humans (e.g., clinicians, patients) can incorporate additional knowledge (e.g., side effects, patient preference) when selecting among near-equivalent actions.
△ Less
Submitted 24 July, 2020;
originally announced July 2020.