-
CXR-Agent: Vision-language models for chest X-ray interpretation with uncertainty aware radiology reporting
Authors:
Naman Sharma
Abstract:
Recently large vision-language models have shown potential when interpreting complex images and generating natural language descriptions using advanced reasoning. Medicine's inherently multimodal nature incorporating scans and text-based medical histories to write reports makes it conducive to benefit from these leaps in AI capabilities. We evaluate the publicly available, state of the art, founda…
▽ More
Recently large vision-language models have shown potential when interpreting complex images and generating natural language descriptions using advanced reasoning. Medicine's inherently multimodal nature incorporating scans and text-based medical histories to write reports makes it conducive to benefit from these leaps in AI capabilities. We evaluate the publicly available, state of the art, foundational vision-language models for chest X-ray interpretation across several datasets and benchmarks. We use linear probes to evaluate the performance of various components including CheXagent's vision transformer and Q-former, which outperform the industry-standard Torch X-ray Vision models across many different datasets showing robust generalisation capabilities. Importantly, we find that vision-language models often hallucinate with confident language, which slows down clinical interpretation. Based on these findings, we develop an agent-based vision-language approach for report generation using CheXagent's linear probes and BioViL-T's phrase grounding tools to generate uncertainty-aware radiology reports with pathologies localised and described based on their likelihood. We thoroughly evaluate our vision-language agents using NLP metrics, chest X-ray benchmarks and clinical evaluations by developing an evaluation platform to perform a user study with respiratory specialists. Our results show considerable improvements in accuracy, interpretability and safety of the AI-generated reports. We stress the importance of analysing results for normal and abnormal scans separately. Finally, we emphasise the need for larger paired (scan and report) datasets alongside data augmentation to tackle overfitting seen in these large vision-language models.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
-
Deciphering Assamese Vowel Harmony with Featural InfoWaveGAN
Authors:
Sneha Ray Barman,
Shakuntala Mahanta,
Neeraj Kumar Sharma
Abstract:
Traditional approaches for understanding phonological learning have predominantly relied on curated text data. Although insightful, such approaches limit the knowledge captured in textual representations of the spoken language. To overcome this limitation, we investigate the potential of the Featural InfoWaveGAN model to learn iterative long-distance vowel harmony using raw speech data. We focus o…
▽ More
Traditional approaches for understanding phonological learning have predominantly relied on curated text data. Although insightful, such approaches limit the knowledge captured in textual representations of the spoken language. To overcome this limitation, we investigate the potential of the Featural InfoWaveGAN model to learn iterative long-distance vowel harmony using raw speech data. We focus on Assamese, a language known for its phonologically regressive and word-bound vowel harmony. We demonstrate that the model is adept at grasping the intricacies of Assamese phonotactics, particularly iterative long-distance harmony with regressive directionality. It also produced non-iterative illicit forms resembling speech errors during human language acquisition. Our statistical analysis reveals a preference for a specific [+high,+ATR] vowel as a trigger across novel items, indicative of feature learning. More data and control could improve model proficiency, contrasting the universality of learning.
△ Less
Submitted 9 July, 2024;
originally announced July 2024.
-
Generation and De-Identification of Indian Clinical Discharge Summaries using LLMs
Authors:
Sanjeet Singh,
Shreya Gupta,
Niralee Gupta,
Naimish Sharma,
Lokesh Srivastava,
Vibhu Agarwal,
Ashutosh Modi
Abstract:
The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially significant for healthcare organizations in India that are managing rapid digitization while still establishing data governance procedures that align with the lett…
▽ More
The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially significant for healthcare organizations in India that are managing rapid digitization while still establishing data governance procedures that align with the letter and spirit of the law. Computer-based systems for de-identification of personal information are vulnerable to data drift, often rendering them ineffective in cross-institution settings. Therefore, a rigorous assessment of existing de-identification against local health datasets is imperative to support the safe adoption of digital health initiatives in India. Using a small set of de-identified patient discharge summaries provided by an Indian healthcare institution, in this paper, we report the nominal performance of de-identification algorithms (based on language models) trained on publicly available non-Indian datasets, pointing towards a lack of cross-institutional generalization. Similarly, experimentation with off-the-shelf de-identification systems reveals potential risks associated with the approach. To overcome data scarcity, we explore generating synthetic clinical reports (using publicly available and Indian summaries) by performing in-context learning over Large Language Models (LLMs). Our experiments demonstrate the use of generated reports as an effective strategy for creating high-performing de-identification systems with good generalization capabilities.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models
Authors:
Nikhil Sharma,
Kenton Murray,
Ziang Xiao
Abstract:
With Retrieval Augmented Generation (RAG), Large Language Models (LLMs) are playing a pivotal role in information search and are being adopted globally. Although the multilingual capability of LLMs offers new opportunities to bridge the language barrier, do these capabilities translate into real-life scenarios where linguistic divide and knowledge conflicts between multilingual sources are known o…
▽ More
With Retrieval Augmented Generation (RAG), Large Language Models (LLMs) are playing a pivotal role in information search and are being adopted globally. Although the multilingual capability of LLMs offers new opportunities to bridge the language barrier, do these capabilities translate into real-life scenarios where linguistic divide and knowledge conflicts between multilingual sources are known occurrences? In this paper, we studied LLM's linguistic preference in a RAG-based information search setting. We found that LLMs displayed systemic bias towards information in the same language as the query language in both information retrieval and answer generation. Furthermore, in scenarios where there is little information in the language of the query, LLMs prefer documents in high-resource languages, reinforcing the dominant views. Such bias exists for both factual and opinion-based queries. Our results highlight the linguistic divide within multilingual LLMs in information search systems. The seemingly beneficial multilingual capability of LLMs may backfire on information parity by reinforcing language-specific information cocoons or filter bubbles further marginalizing low-resource views.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
Context is Important in Depressive Language: A Study of the Interaction Between the Sentiments and Linguistic Markers in Reddit Discussions
Authors:
Neha Sharma,
Kairit Sirts
Abstract:
Research exploring linguistic markers in individuals with depression has demonstrated that language usage can serve as an indicator of mental health. This study investigates the impact of discussion topic as context on linguistic markers and emotional expression in depression, using a Reddit dataset to explore interaction effects. Contrary to common findings, our sentiment analysis revealed a broa…
▽ More
Research exploring linguistic markers in individuals with depression has demonstrated that language usage can serve as an indicator of mental health. This study investigates the impact of discussion topic as context on linguistic markers and emotional expression in depression, using a Reddit dataset to explore interaction effects. Contrary to common findings, our sentiment analysis revealed a broader range of emotional intensity in depressed individuals, with both higher negative and positive sentiments than controls. This pattern was driven by posts containing no emotion words, revealing the limitations of the lexicon based approaches in capturing the full emotional context. We observed several interesting results demonstrating the importance of contextual analyses. For instance, the use of 1st person singular pronouns and words related to anger and sadness correlated with increased positive sentiments, whereas a higher rate of present-focused words was associated with more negative sentiments. Our findings highlight the importance of discussion contexts while interpreting the language used in depression, revealing that the emotional intensity and meaning of linguistic markers can vary based on the topic of discussion.
△ Less
Submitted 3 July, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Sketch-guided Image Inpainting with Partial Discrete Diffusion Process
Authors:
Nakul Sharma,
Aditay Tripathi,
Anirban Chakraborty,
Anand Mishra
Abstract:
In this work, we study the task of sketch-guided image inpainting. Unlike the well-explored natural language-guided image inpainting, which excels in capturing semantic details, the relatively less-studied sketch-guided inpainting offers greater user control in specifying the object's shape and pose to be inpainted. As one of the early solutions to this task, we introduce a novel partial discrete…
▽ More
In this work, we study the task of sketch-guided image inpainting. Unlike the well-explored natural language-guided image inpainting, which excels in capturing semantic details, the relatively less-studied sketch-guided inpainting offers greater user control in specifying the object's shape and pose to be inpainted. As one of the early solutions to this task, we introduce a novel partial discrete diffusion process (PDDP). The forward pass of the PDDP corrupts the masked regions of the image and the backward pass reconstructs these masked regions conditioned on hand-drawn sketches using our proposed sketch-guided bi-directional transformer. The proposed novel transformer module accepts two inputs -- the image containing the masked region to be inpainted and the query sketch to model the reverse diffusion process. This strategy effectively addresses the domain gap between sketches and natural images, thereby, enhancing the quality of inpainting results. In the absence of a large-scale dataset specific to this task, we synthesize a dataset from the MS-COCO to train and extensively evaluate our proposed framework against various competent approaches in the literature. The qualitative and quantitative results and user studies establish that the proposed method inpaints realistic objects that fit the context in terms of the visual appearance of the provided sketch. To aid further research, we have made our code publicly available at https://github.com/vl2g/Sketch-Inpainting .
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Community Needs and Assets: A Computational Analysis of Community Conversations
Authors:
Md Towhidul Absar Chowdhury,
Naveen Sharma,
Ashiqur R. KhudaBukhsh
Abstract:
A community needs assessment is a tool used by non-profits and government agencies to quantify the strengths and issues of a community, allowing them to allocate their resources better. Such approaches are transitioning towards leveraging social media conversations to analyze the needs of communities and the assets already present within them. However, manual analysis of exponentially increasing s…
▽ More
A community needs assessment is a tool used by non-profits and government agencies to quantify the strengths and issues of a community, allowing them to allocate their resources better. Such approaches are transitioning towards leveraging social media conversations to analyze the needs of communities and the assets already present within them. However, manual analysis of exponentially increasing social media conversations is challenging. There is a gap in the present literature in computationally analyzing how community members discuss the strengths and needs of the community. To address this gap, we introduce the task of identifying, extracting, and categorizing community needs and assets from conversational data using sophisticated natural language processing methods. To facilitate this task, we introduce the first dataset about community needs and assets consisting of 3,511 conversations from Reddit, annotated using crowdsourced workers. Using this dataset, we evaluate an utterance-level classification model compared to sentiment classification and a popular large language model (in a zero-shot setting), where we find that our model outperforms both baselines at an F1 score of 94% compared to 49% and 61% respectively. Furthermore, we observe through our study that conversations about needs have negative sentiments and emotions, while conversations about assets focus on location and entities. The dataset is available at https://github.com/towhidabsar/CommunityNeeds.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization
Authors:
Ratnadira Widyasari,
Jia Wei Ang,
Truong Giang Nguyen,
Neil Sharma,
David Lo
Abstract:
Fault localization is a critical process that involves identifying specific program elements responsible for program failures. Manually pinpointing these elements, such as classes, methods, or statements, which are associated with a fault is laborious and time-consuming. To overcome this challenge, various fault localization tools have been developed. These tools typically generate a ranked list o…
▽ More
Fault localization is a critical process that involves identifying specific program elements responsible for program failures. Manually pinpointing these elements, such as classes, methods, or statements, which are associated with a fault is laborious and time-consuming. To overcome this challenge, various fault localization tools have been developed. These tools typically generate a ranked list of suspicious program elements. However, this information alone is insufficient. A prior study emphasized that automated fault localization should offer a rationale.
In this study, we investigate the step-by-step reasoning for explainable fault localization. We explore the potential of Large Language Models (LLM) in assisting developers in reasoning about code. We proposed FuseFL that utilizes several combinations of information to enhance the LLM results which are spectrum-based fault localization results, test case execution outcomes, and code description (i.e., explanation of what the given code is intended to do). We conducted our investigation using faulty code from Refactory dataset. First, we evaluate the performance of the automated fault localization. Our results demonstrate a more than 30% increase in the number of successfully localized faults at Top-1 compared to the baseline. To evaluate the explanations generated by FuseFL, we create a dataset of human explanations that provide step-by-step reasoning as to why specific lines of code are considered faulty. This dataset consists of 324 faulty code files, along with explanations for 600 faulty lines. Furthermore, we also conducted human studies to evaluate the explanations. We found that for 22 out of the 30 randomly sampled cases, FuseFL generated correct explanations.
△ Less
Submitted 15 March, 2024;
originally announced March 2024.
-
RADS : Restricted Anisotropic Diffusion Spectrum model for Axonal Health quantification in Multiple Sclerosis
Authors:
Nand Sharma
Abstract:
Axonal damage is the primary pathological correlate of long-term impairment in multiple sclerosis (MS). Our previous work using our method - diffusion basis spectrum imaging (DBSI) - demonstrated a strong, quantitative relationship between axial diffusivity and axonal damage. In the present work, we develop an extension of DBSI which can be used to quantify the fraction of diseased and healthy axo…
▽ More
Axonal damage is the primary pathological correlate of long-term impairment in multiple sclerosis (MS). Our previous work using our method - diffusion basis spectrum imaging (DBSI) - demonstrated a strong, quantitative relationship between axial diffusivity and axonal damage. In the present work, we develop an extension of DBSI which can be used to quantify the fraction of diseased and healthy axons in MS. In this method, we model the MRI signal with the axial diffusion (AD) spectrum for each fiber orientation. We use two component restricted anisotropic diffusion spectrum (RADS) to model the anisotropic component of the diffusion-weighted MRI signal. Diffusion coefficients and signal fractions are computed for the optimal model with the lowest Bayesian information criterion (BIC) score. This gives us the fractions of diseased and healthy axons based on the axial diffusivities of the diseased and healthy axons. We test our method using Monte-Carlo (MC) simulations with the MC simulation package developed as part of this work. First we test and validate our MC simulations for the basic RADS model. It accurately recovers the fiber and cell fractions simulated as well as the simulated diffusivities. For testing and validating RADS to quantify axonal loss, we simulate different fractions of diseased and healthy axons. Our method produces highly accurate quantification of diseased and healthy axons with Pearson's correlation (predicted vs true proportion) of $ r = 0.99 $ (p-value = 0.001); the one Sample t-test for proportion error gives the mean error of 2\% (p-value = 0.034). Furthermore, the method finds the axial diffusivities of the diseased and healthy axons very accurately with mean error of 4\% (p-value = 0.001). RADS modeling of the diffusion-weighted MRI signal has the potential to be used for Axonal Health quantification in Multiple Sclerosis.
△ Less
Submitted 10 March, 2024;
originally announced March 2024.
-
Infrastructure Ombudsman: Mining Future Failure Concerns from Structural Disaster Response
Authors:
Md Towhidul Absar Chowdhury,
Soumyajit Datta,
Naveen Sharma,
Ashiqur R. KhudaBukhsh
Abstract:
Current research concentrates on studying discussions on social media related to structural failures to improve disaster response strategies. However, detecting social web posts discussing concerns about anticipatory failures is under-explored. If such concerns are channeled to the appropriate authorities, it can aid in the prevention and mitigation of potential infrastructural failures. In this p…
▽ More
Current research concentrates on studying discussions on social media related to structural failures to improve disaster response strategies. However, detecting social web posts discussing concerns about anticipatory failures is under-explored. If such concerns are channeled to the appropriate authorities, it can aid in the prevention and mitigation of potential infrastructural failures. In this paper, we develop an infrastructure ombudsman -- that automatically detects specific infrastructure concerns. Our work considers several recent structural failures in the US. We present a first-of-its-kind dataset of 2,662 social web instances for this novel task mined from Reddit and YouTube.
△ Less
Submitted 21 February, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking
Authors:
Nikhil Sharma,
Q. Vera Liao,
Ziang Xiao
Abstract:
Large language models (LLMs) powered conversational search systems have already been used by hundreds of millions of people, and are believed to bring many benefits over conventional search. However, while decades of research and public discourse interrogated the risk of search systems in increasing selective exposure and creating echo chambers -- limiting exposure to diverse opinions and leading…
▽ More
Large language models (LLMs) powered conversational search systems have already been used by hundreds of millions of people, and are believed to bring many benefits over conventional search. However, while decades of research and public discourse interrogated the risk of search systems in increasing selective exposure and creating echo chambers -- limiting exposure to diverse opinions and leading to opinion polarization, little is known about such a risk of LLM-powered conversational search. We conduct two experiments to investigate: 1) whether and how LLM-powered conversational search increases selective exposure compared to conventional search; 2) whether and how LLMs with opinion biases that either reinforce or challenge the user's view change the effect. Overall, we found that participants engaged in more biased information querying with LLM-powered conversational search, and an opinionated LLM reinforcing their views exacerbated this bias. These results present critical implications for the development of LLMs and conversational search systems, and the policy governing these technologies.
△ Less
Submitted 10 February, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Digits micro-model for accurate and secure transactions
Authors:
Chirag Chhablani,
Nikhita Sharma,
Jordan Hosier,
Vijay K. Gurbani
Abstract:
Automatic Speech Recognition (ASR) systems are used in the financial domain to enhance the caller experience by enabling natural language understanding and facilitating efficient and intuitive interactions. Increasing use of ASR systems requires that such systems exhibit very low error rates. The predominant ASR models to collect numeric data are large, general-purpose commercial models -- Google…
▽ More
Automatic Speech Recognition (ASR) systems are used in the financial domain to enhance the caller experience by enabling natural language understanding and facilitating efficient and intuitive interactions. Increasing use of ASR systems requires that such systems exhibit very low error rates. The predominant ASR models to collect numeric data are large, general-purpose commercial models -- Google Speech-to-text (STT), or Amazon Transcribe -- or open source (OpenAI's Whisper). Such ASR models are trained on hundreds of thousands of hours of audio data and require considerable resources to run. Despite recent progress large speech recognition models, we highlight the potential of smaller, specialized "micro" models. Such light models can be trained perform well on number recognition specific tasks, competing with general models like Whisper or Google STT while using less than 80 minutes of training time and occupying at least an order of less memory resources. Also, unlike larger speech recognition models, micro-models are trained on carefully selected and curated datasets, which makes them highly accurate, agile, and easy to retrain, while using low compute resources. We present our work on creating micro models for multi-digit number recognition that handle diverse speaking styles reflecting real-world pronunciation patterns. Our work contributes to domain-specific ASR models, improving digit recognition accuracy, and privacy of data. An added advantage, their low resource consumption allows them to be hosted on-premise, keeping private data local instead uploading to an external cloud. Our results indicate that our micro-model makes less errors than the best-of-breed commercial or open-source ASRs in recognizing digits (1.8% error rate of our best micro-model versus 5.8% error rate of Whisper), and has a low memory footprint (0.66 GB VRAM for our model versus 11 GB VRAM for Whisper).
△ Less
Submitted 2 February, 2024;
originally announced February 2024.
-
Applications of Machine Learning to Optimizing Polyolefin Manufacturing
Authors:
Niket Sharma,
Y. A. Liu
Abstract:
This chapter is a preprint from our book by , focusing on leveraging machine learning (ML) in chemical and polyolefin manufacturing optimization. It's crafted for both novices and seasoned professionals keen on the latest ML applications in chemical processes. We trace the evolution of AI and ML in chemical industries, delineate core ML components, and provide resources for ML beginners. A detaile…
▽ More
This chapter is a preprint from our book by , focusing on leveraging machine learning (ML) in chemical and polyolefin manufacturing optimization. It's crafted for both novices and seasoned professionals keen on the latest ML applications in chemical processes. We trace the evolution of AI and ML in chemical industries, delineate core ML components, and provide resources for ML beginners. A detailed discussion on various ML methods is presented, covering regression, classification, and unsupervised learning techniques, with performance metrics and examples. Ensemble methods, deep learning networks, including MLP, DNNs, RNNs, CNNs, and transformers, are explored for their growing role in chemical applications. Practical workshops guide readers through predictive modeling using advanced ML algorithms. The chapter culminates with insights into science-guided ML, advocating for a hybrid approach that enhances model accuracy. The extensive bibliography offers resources for further research and practical implementation. This chapter aims to be a thorough primer on ML's practical application in chemical engineering, particularly for polyolefin production, and sets the stage for continued learning in subsequent chapters. Please cite the original work [169,170] when referencing.
△ Less
Submitted 18 January, 2024;
originally announced January 2024.
-
The Quantum Cryptography Approach: Unleashing the Potential of Quantum Key Reconciliation Protocol for Secure Communication
Authors:
Neha Sharma,
Vikas Saxena
Abstract:
Quantum cryptography is the study of delivering secret communications across a quantum channel. Recently, Quantum Key Distribution (QKD) has been recognized as the most important breakthrough in quantum cryptography. This process facilitates two distant parties to share secure communications based on physical laws. The BB84 protocol was developed in 1984 and remains the most widely used among BB92…
▽ More
Quantum cryptography is the study of delivering secret communications across a quantum channel. Recently, Quantum Key Distribution (QKD) has been recognized as the most important breakthrough in quantum cryptography. This process facilitates two distant parties to share secure communications based on physical laws. The BB84 protocol was developed in 1984 and remains the most widely used among BB92, Ekert91, COW, and SARG04 protocols. However the practical security of QKD with imperfect devices have been widely discussed, and there are many ways to guarantee that generated key by QKD still provides unconditional security. This paper proposed a novel method that allows users to communicate while generating the secure keys as well as securing the transmission without any leakage of the data. In this approach sender will never reveal her basis, hence neither the receiver nor the intruder will get knowledge of the fundamental basis.Further to detect Eve, polynomial interpolation is also used as a key verification technique. In order to fully utilize the quantum computing capabilities provided by IBM quantum computers, the protocol is executed using the Qiskit backend for 45 qubits. This article discusses a plot of % error against alpha (strength of eavesdropping). As a result, different types of noise have been included, and the success probability of the desired key bits has been determined. Furthermore, the success probability under depolarizing noise is explained for different qubit counts.Last but not least, even when the applied noise is increased to maximum capacity, a 50% probability of successful key generation is still observed in an experiment.
△ Less
Submitted 17 January, 2024;
originally announced January 2024.
-
A Dynamic Agent Based Model of the Real Economy with Monopolistic Competition, Perfect Product Differentiation, Heterogeneous Agents, Increasing Returns to Scale and Trade in Disequilibrium
Authors:
Subhamon Supantha,
Naresh Kumar Sharma
Abstract:
We have used agent-based modeling as our numerical method to artificially simulate a dynamic real economy where agents are rational maximizers of an objective function of Cobb-Douglas type. The economy is characterised by heterogeneous agents, acting out of local or imperfect information, monopolistic competition, perfect product differentiation, allowance for increasing returns to scale technolog…
▽ More
We have used agent-based modeling as our numerical method to artificially simulate a dynamic real economy where agents are rational maximizers of an objective function of Cobb-Douglas type. The economy is characterised by heterogeneous agents, acting out of local or imperfect information, monopolistic competition, perfect product differentiation, allowance for increasing returns to scale technology and trade in disequilibrium. An algorithm for economic activity in each period is devised and a general purpose open source agent-based model is developed which allows for counterfactual inquiries, testing out treatments, analysing causality of various economic processes, outcomes and studying emergent properties. 10,000 simulations, with 10 firms and 80 consumers are run with varying parameters and the results show that from only a few initial conditions the economy reaches equilibrium while in most of the other cases it remains in perpetual disequilibrium. It also shows that from a few initial conditions the economy reaches a disaster where all the consumer wealth falls to zero or only a single producer remains. Furthermore, from some initial conditions, an ideal economy with high wage rate, high consumer utility and no unemployment is also reached. It was also observed that starting from an equal endowment of wealth in consumers and in producers, inequality emerged in the economy. In majority of the cases most of the firms(6-7) shut down because they were not profitable enough and only a few firms remained. Our results highlight that all these varying outcomes are possible for a decentralized market economy with rational optimizing agents.
△ Less
Submitted 13 January, 2024;
originally announced January 2024.
-
On a Foundation Model for Operating Systems
Authors:
Divyanshu Saxena,
Nihal Sharma,
Donghyun Kim,
Rohit Dwivedula,
Jiayi Chen,
Chenxi Yang,
Sriram Ravula,
Zichao Hu,
Aditya Akella,
Sebastian Angel,
Joydeep Biswas,
Swarat Chaudhuri,
Isil Dillig,
Alex Dimakis,
P. Brighten Godfrey,
Daehyeok Kim,
Chris Rossbach,
Gang Wang
Abstract:
This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in…
▽ More
This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and creating the next generation of operating systems for the evolving computing landscape.
△ Less
Submitted 12 December, 2023;
originally announced December 2023.
-
Learning from Label Proportions: Bootstrapping Supervised Learners via Belief Propagation
Authors:
Shreyas Havaldar,
Navodita Sharma,
Shubhi Sareen,
Karthikeyan Shanmugam,
Aravindan Raghuveer
Abstract:
Learning from Label Proportions (LLP) is a learning problem where only aggregate level labels are available for groups of instances, called bags, during training, and the aim is to get the best performance at the instance-level on the test data. This setting arises in domains like advertising and medicine due to privacy considerations. We propose a novel algorithmic framework for this problem that…
▽ More
Learning from Label Proportions (LLP) is a learning problem where only aggregate level labels are available for groups of instances, called bags, during training, and the aim is to get the best performance at the instance-level on the test data. This setting arises in domains like advertising and medicine due to privacy considerations. We propose a novel algorithmic framework for this problem that iteratively performs two main steps. For the first step (Pseudo Labeling) in every iteration, we define a Gibbs distribution over binary instance labels that incorporates a) covariate information through the constraint that instances with similar covariates should have similar labels and b) the bag level aggregated label. We then use Belief Propagation (BP) to marginalize the Gibbs distribution to obtain pseudo labels. In the second step (Embedding Refinement), we use the pseudo labels to provide supervision for a learner that yields a better embedding. Further, we iterate on the two steps again by using the second step's embeddings as new covariates for the next iteration. In the final iteration, a classifier is trained using the pseudo labels. Our algorithm displays strong gains against several SOTA baselines (up to 15%) for the LLP Binary Classification problem on various dataset types - tabular and Image. We achieve these improvements with minimal computational overhead above standard supervised learning due to Belief Propagation, for large bag sizes, even for a million samples.
△ Less
Submitted 20 March, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Improved Inference of Human Intent by Combining Plan Recognition and Language Feedback
Authors:
Ifrah Idrees,
Tian Yun,
Naveen Sharma,
Yunxin Deng,
Nakul Gopalan,
George Konidaris,
Stefanie Tellex
Abstract:
Conversational assistive robots can aid people, especially those with cognitive impairments, to accomplish various tasks such as cooking meals, performing exercises, or operating machines. However, to interact with people effectively, robots must recognize human plans and goals from noisy observations of human actions, even when the user acts sub-optimally. Previous works on Plan and Goal Recognit…
▽ More
Conversational assistive robots can aid people, especially those with cognitive impairments, to accomplish various tasks such as cooking meals, performing exercises, or operating machines. However, to interact with people effectively, robots must recognize human plans and goals from noisy observations of human actions, even when the user acts sub-optimally. Previous works on Plan and Goal Recognition (PGR) as planning have used hierarchical task networks (HTN) to model the actor/human. However, these techniques are insufficient as they do not have user engagement via natural modes of interaction such as language. Moreover, they have no mechanisms to let users, especially those with cognitive impairments, know of a deviation from their original plan or about any sub-optimal actions taken towards their goal. We propose a novel framework for plan and goal recognition in partially observable domains -- Dialogue for Goal Recognition (D4GR) enabling a robot to rectify its belief in human progress by asking clarification questions about noisy sensor data and sub-optimal human actions. We evaluate the performance of D4GR over two simulated domains -- kitchen and blocks domain. With language feedback and the world state information in a hierarchical task model, we show that D4GR framework for the highest sensor noise performs 1% better than HTN in goal accuracy in both domains. For plan accuracy, D4GR outperforms by 4% in the kitchen domain and 2% in the blocks domain in comparison to HTN. The ALWAYS-ASK oracle outperforms our policy by 3% in goal recognition and 7%in plan recognition. D4GR does so by asking 68% fewer questions than an oracle baseline. We also demonstrate a real-world robot scenario in the kitchen domain, validating the improved plan and goal recognition of D4GR in a realistic setting.
△ Less
Submitted 3 October, 2023;
originally announced October 2023.
-
Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection
Authors:
Debarpan Bhattacharya,
Neeraj Kumar Sharma,
Debottam Dutta,
Srikanth Raj Chetupalli,
Pravin Mote,
Sriram Ganapathy,
Chandrakiran C,
Sahiti Nori,
Suhail K K,
Sadhana Gonuguntla,
Murali Alagesan
Abstract:
This paper presents the Coswara dataset, a dataset containing diverse set of respiratory sounds and rich meta-data, recorded between April-2020 and February-2022 from 2635 individuals (1819 SARS-CoV-2 negative, 674 positive, and 142 recovered subjects). The respiratory sounds contained nine sound categories associated with variants of breathing, cough and speech. The rich metadata contained demogr…
▽ More
This paper presents the Coswara dataset, a dataset containing diverse set of respiratory sounds and rich meta-data, recorded between April-2020 and February-2022 from 2635 individuals (1819 SARS-CoV-2 negative, 674 positive, and 142 recovered subjects). The respiratory sounds contained nine sound categories associated with variants of breathing, cough and speech. The rich metadata contained demographic information associated with age, gender and geographic location, as well as the health information relating to the symptoms, pre-existing respiratory ailments, comorbidity and SARS-CoV-2 test status. Our study is the first of its kind to manually annotate the audio quality of the entire dataset (amounting to 65~hours) through manual listening. The paper summarizes the data collection procedure, demographic, symptoms and audio data information. A COVID-19 classifier based on bi-directional long short-term (BLSTM) architecture, is trained and evaluated on the different population sub-groups contained in the dataset to understand the bias/fairness of the model. This enabled the analysis of the impact of gender, geographic location, date of recording, and language proficiency on the COVID-19 detection performance.
△ Less
Submitted 22 May, 2023;
originally announced May 2023.
-
mlpack 4: a fast, header-only C++ machine learning library
Authors:
Ryan R. Curtin,
Marcus Edel,
Omar Shrit,
Shubham Agrawal,
Suryoday Basak,
James J. Balamuta,
Ryan Birmingham,
Kartik Dutt,
Dirk Eddelbuettel,
Rishabh Garg,
Shikhar Jaiswal,
Aakash Kaushik,
Sangyeon Kim,
Anjishnu Mukherjee,
Nanubala Gnana Sai,
Nippun Sharma,
Yashwant Singh Parihar,
Roshan Swain,
Conrad Sanderson
Abstract:
For over 15 years, the mlpack machine learning library has served as a "swiss army knife" for C++-based machine learning. Its efficient implementations of common and cutting-edge machine learning algorithms have been used in a wide variety of scientific and industrial applications. This paper overviews mlpack 4, a significant upgrade over its predecessor. The library has been significantly refacto…
▽ More
For over 15 years, the mlpack machine learning library has served as a "swiss army knife" for C++-based machine learning. Its efficient implementations of common and cutting-edge machine learning algorithms have been used in a wide variety of scientific and industrial applications. This paper overviews mlpack 4, a significant upgrade over its predecessor. The library has been significantly refactored and redesigned to facilitate an easier prototyping-to-deployment pipeline, including bindings to other languages (Python, Julia, R, Go, and the command line) that allow prototyping to be seamlessly performed in environments other than C++. mlpack is open-source software, distributed under the permissive 3-clause BSD license; it can be obtained at https://mlpack.org
△ Less
Submitted 1 February, 2023;
originally announced February 2023.
-
Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification
Authors:
Nakul Sharma,
Abhirama S. Penamakuri,
Anand Mishra
Abstract:
In this paper, we study the problem of identifying logos of business brands in natural scenes in an open-set one-shot setting. This problem setup is significantly more challenging than traditionally-studied 'closed-set' and 'large-scale training samples per category' logo recognition settings. We propose a novel multi-view textual-visual encoding framework that encodes text appearing in the logos…
▽ More
In this paper, we study the problem of identifying logos of business brands in natural scenes in an open-set one-shot setting. This problem setup is significantly more challenging than traditionally-studied 'closed-set' and 'large-scale training samples per category' logo recognition settings. We propose a novel multi-view textual-visual encoding framework that encodes text appearing in the logos as well as the graphical design of the logos to learn robust contrastive representations. These representations are jointly learned for multiple views of logos over a batch and thereby they generalize well to unseen logos. We evaluate our proposed framework for cropped logo verification, cropped logo identification, and end-to-end logo identification in natural scene tasks; and compare it against state-of-the-art methods. Further, the literature lacks a 'very-large-scale' collection of reference logo images that can facilitate the study of one-hundred thousand-scale logo identification. To fill this gap in the literature, we introduce Wikidata Reference Logo Dataset (WiRLD), containing logos for 100K business brands harvested from Wikidata. Our proposed framework that achieves an area under the ROC curve of 91.3% on the QMUL-OpenLogo dataset for the verification task, outperforms state-of-the-art methods by 9.1% and 2.6% on the one-shot logo identification task on the Toplogos-10 and the FlickrLogos32 datasets, respectively. Further, we show that our method is more stable compared to other baselines even when the number of candidate logos is on a 100K scale.
△ Less
Submitted 23 November, 2022;
originally announced November 2022.
-
Community Learning: Understanding A Community Through NLP for Positive Impact
Authors:
Md Towhidul Absar Chowdhury,
Naveen Sharma
Abstract:
A post-pandemic world resulted in economic upheaval, particularly for the cities' communities. While significant work in NLP4PI focuses on national and international events, there is a gap in bringing such state-of-the-art methods into the community development field. In order to help with community development, we must learn about the communities we develop. To that end, we propose the task of co…
▽ More
A post-pandemic world resulted in economic upheaval, particularly for the cities' communities. While significant work in NLP4PI focuses on national and international events, there is a gap in bringing such state-of-the-art methods into the community development field. In order to help with community development, we must learn about the communities we develop. To that end, we propose the task of community learning as a computational task of extracting natural language data about the community, transforming and loading it into a suitable knowledge graph structure for further downstream applications. We study two particular cases of homelessness and education in showing the visualization capabilities of a knowledge graph, and also discuss other usefulness such a model can provide.
△ Less
Submitted 10 October, 2022; v1 submitted 2 October, 2022;
originally announced October 2022.
-
Collisionless Pattern Discovery in Robot Swarms Using Deep Reinforcement Learning
Authors:
Nelson Sharma,
Aswini Ghosh,
Rajiv Misra,
Supratik Mukhopadhyay,
Gokarna Sharma
Abstract:
We present a deep reinforcement learning-based framework for automatically discovering patterns available in any given initial configuration of fat robot swarms. In particular, we model the problem of collision-less gathering and mutual visibility in fat robot swarms and discover patterns for solving them using our framework. We show that by shaping reward signals based on certain constraints like…
▽ More
We present a deep reinforcement learning-based framework for automatically discovering patterns available in any given initial configuration of fat robot swarms. In particular, we model the problem of collision-less gathering and mutual visibility in fat robot swarms and discover patterns for solving them using our framework. We show that by shaping reward signals based on certain constraints like mutual visibility and safe proximity, the robots can discover collision-less trajectories leading to well-formed gathering and visibility patterns.
△ Less
Submitted 20 September, 2022;
originally announced September 2022.
-
CausNet : Generational orderings based search for optimal Bayesian networks via dynamic programming with parent set constraints
Authors:
Nand Sharma,
Joshua Millstein
Abstract:
Finding a globally optimal Bayesian Network using exhaustive search is a problem with super-exponential complexity, which severely restricts the number of variables that it can work for. We implement a dynamic programming based algorithm with built-in dimensionality reduction and parent set identification. This reduces the search space drastically and can be applied to large-dimensional data. We u…
▽ More
Finding a globally optimal Bayesian Network using exhaustive search is a problem with super-exponential complexity, which severely restricts the number of variables that it can work for. We implement a dynamic programming based algorithm with built-in dimensionality reduction and parent set identification. This reduces the search space drastically and can be applied to large-dimensional data. We use what we call generational orderings based search for optimal networks, which is a novel way to efficiently search the space of possible networks given the possible parent sets. The algorithm supports both continuous and categorical data, and categorical as well as survival outcomes. We demonstrate the efficacy of our algorithm on both synthetic and real data. In simulations, our algorithm performs better than three state-of-art algorithms that are currently used extensively. We then apply it to an Ovarian Cancer gene expression dataset with 513 genes and a survival outcome. Our algorithm is able to find an optimal network describing the disease pathway consisting of 6 genes leading to the outcome node in a few minutes on a basic computer. Our generational orderings based search for optimal networks, is both efficient and highly scalable approach to finding optimal Bayesian Networks, that can be applied to 1000s of variables. Using specifiable parameters - correlation, FDR cutoffs, and in-degree - one can increase or decrease the number of nodes and density of the networks. Availability of two scoring option-BIC and Bge-and implementation of survival outcomes and mixed data types makes our algorithm very suitable for many types of high dimensional biomedical data to find disease pathways.
△ Less
Submitted 17 July, 2022;
originally announced July 2022.
-
Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals
Authors:
Debarpan Bhattacharya,
Debottam Dutta,
Neeraj Kumar Sharma,
Srikanth Raj Chetupalli,
Pravin Mote,
Sriram Ganapathy,
Chandrakiran C,
Sahiti Nori,
Suhail K K,
Sadhana Gonuguntla,
Murali Alagesan
Abstract:
The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants. Studies have reported differential impact of the variants on respiratory health of patients. We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable acoustic patterns suggesting a possibility to predict the underlying virus…
▽ More
The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants. Studies have reported differential impact of the variants on respiratory health of patients. We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable acoustic patterns suggesting a possibility to predict the underlying virus variant. We analyze the Coswara dataset which is collected from three subject pools, namely, i) healthy, ii) COVID-19 subjects recorded during the delta variant dominant period, and iii) data from COVID-19 subjects recorded during the omicron surge. Our findings suggest that multiple sound categories, such as cough, breathing, and speech, indicate significant acoustic feature differences when comparing COVID-19 subjects with omicron and delta variants. The classification areas-under-the-curve are significantly above chance for differentiating subjects infected by omicron from those infected by delta. Using a score fusion from multiple sound categories, we obtained an area-under-the-curve of 89% and 52.4% sensitivity at 95% specificity. Additionally, a hierarchical three class approach was used to classify the acoustic data into healthy and COVID-19 positive, and further COVID-19 subjects into delta and omicron variants providing high level of 3-class classification accuracy. These results suggest new ways for designing sound based COVID-19 diagnosis approaches.
△ Less
Submitted 24 June, 2022;
originally announced June 2022.
-
Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms
Authors:
Debarpan Bhattacharya,
Debottam Dutta,
Neeraj Kumar Sharma,
Srikanth Raj Chetupalli,
Pravin Mote,
Sriram Ganapathy,
Chandrakiran C,
Sahiti Nori,
Suhail K K,
Sadhana Gonuguntla,
Murali Alagesan
Abstract:
The COVID-19 pandemic has accelerated research on design of alternative, quick and effective COVID-19 diagnosis approaches. In this paper, we describe the Coswara tool, a website application designed to enable COVID-19 detection by analysing respiratory sound samples and health symptoms. A user using this service can log into a website using any device connected to the internet, provide there curr…
▽ More
The COVID-19 pandemic has accelerated research on design of alternative, quick and effective COVID-19 diagnosis approaches. In this paper, we describe the Coswara tool, a website application designed to enable COVID-19 detection by analysing respiratory sound samples and health symptoms. A user using this service can log into a website using any device connected to the internet, provide there current health symptom information and record few sound sampled corresponding to breathing, cough, and speech. Within a minute of analysis of this information on a cloud server the website tool will output a COVID-19 probability score to the user. As the COVID-19 pandemic continues to demand massive and scalable population level testing, we hypothesize that the proposed tool provides a potential solution towards this.
△ Less
Submitted 9 June, 2022;
originally announced June 2022.
-
LotRec: A Recommender for Urban Vacant Lot Conversion
Authors:
Md Towhidul A Chowdhury,
Naveen Sharma
Abstract:
Vacant lots are neglected properties in a city that lead to environmental hazards and poor standard of living for the community. Thus, reclaiming vacant lots and putting them to productive use is an important consideration for many cities. Given a large number of vacant lots and resource constraints for conversion, two key questions for a city are (1) whether to convert a vacant lot or not; and (2…
▽ More
Vacant lots are neglected properties in a city that lead to environmental hazards and poor standard of living for the community. Thus, reclaiming vacant lots and putting them to productive use is an important consideration for many cities. Given a large number of vacant lots and resource constraints for conversion, two key questions for a city are (1) whether to convert a vacant lot or not; and (2) what to convert a vacant lot as. We seek to provide computational support to answer these questions. To this end, we identify the determinants of a vacant lot conversion and build a recommender based on those determinants. We evaluate our models on real-world vacant lot datasets from the US cities of Philadelphia,PA and Baltimore, MD. Our results indicate that our recommender yields mean F-measures of (1) 90% in predicting whether a vacant lot should be converted or not within a single city, (2) 91% in predicting what a vacant lot should be converted to, within a single city and, (3) 85% in predicting whether a vacant lot should be converted or not across two cities.
△ Less
Submitted 4 February, 2022;
originally announced February 2022.
-
Ergonomics Integrated Design Methodology using Parameter Optimization, Computer-Aided Design, and Digital Human Modelling: A Case Study of a Cleaning Equipment
Authors:
Neelesh Kr. Sharma,
Mayank Tiwari,
Atul Thakur,
Anindya K. Ganguli
Abstract:
Challenges of enhancing productivity by amplifying efficiency and man-machine compatibility of equipment can be achieved by adopting advanced technologies. This study aims to present and exemplify methodology for incorporating ergonomics pro-actively into the design using computer-aided design and digital human modeling-based analysis. The cleaning equipment is parametrized to detect the critical…
▽ More
Challenges of enhancing productivity by amplifying efficiency and man-machine compatibility of equipment can be achieved by adopting advanced technologies. This study aims to present and exemplify methodology for incorporating ergonomics pro-actively into the design using computer-aided design and digital human modeling-based analysis. The cleaning equipment is parametrized to detect the critical variables. The relations are then constrained through the 3DSSPP software-based biomechanical and experimental analysis using a prototype. MATLAB and Minitab software is used for optimizing efficiency while satisfying the established constraints. The experiment showed nearly 67%, 120%, and 241% successive improvement in the mechanical advantage in comparison to their immediate predecessors. A significant (6 point) reduction in rapid entire body assessment score has been observed in the final posture while working with the manipulator. 3DSSPP suggested that the joint forces during the actuation of the manipulator were acceptable to 99% of the working population. The study demonstrated the potential of the methodology in revamping the equipment for improved ergonomic design.
△ Less
Submitted 5 April, 2022; v1 submitted 19 January, 2022;
originally announced January 2022.
-
A Hybrid Science-Guided Machine Learning Approach for Modeling and Optimizing Chemical Processes
Authors:
Niket Sharma,
Y. A. Liu
Abstract:
This study presents a broad perspective of hybrid process modeling and optimization combining the scientific knowledge and data analytics in bioprocessing and chemical engineering with a science-guided machine learning (SGML) approach. We divide the approach into two major categories. The first refers to the case where a data-based ML model compliments and makes the first-principle science-based m…
▽ More
This study presents a broad perspective of hybrid process modeling and optimization combining the scientific knowledge and data analytics in bioprocessing and chemical engineering with a science-guided machine learning (SGML) approach. We divide the approach into two major categories. The first refers to the case where a data-based ML model compliments and makes the first-principle science-based model more accurate in prediction, and the second corresponds to the case where scientific knowledge helps make the ML model more scientifically consistent. We present a detailed review of scientific and engineering literature relating to the hybrid SGML approach, and propose a systematic classification of hybrid SGML models. For applying ML to improve science-based models, we present expositions of the sub-categories of direct serial and parallel hybrid modeling and their combinations, inverse modeling, reduced-order modeling, quantifying uncertainty in the process and even discovering governing equations of the process model. For applying scientific principles to improve ML models, we discuss the sub-categories of science-guided design, learning and refinement. For each sub-category, we identify its requirements, advantages and limitations, together with their published and potential areas of applications in bioprocessing and chemical engineering.We also present several examples to illustrate different hybrid SGML methodologies for modeling polymer processes.
△ Less
Submitted 24 January, 2022; v1 submitted 2 December, 2021;
originally announced December 2021.
-
SACRIFICE: A Secure Road Condition Monitoring Scheme over Fog-based VANETs
Authors:
Nishttha Sharma,
Jayasree Sengupta,
Sipra Das Bit
Abstract:
With the rapid growth of Vehicular Ad-Hoc Networks (VANETs), huge amounts of road condition data are constantly being generated and sent to the cloud for processing. However, this introduces a significant load on the network bandwidth causing delay in the network and for a time-critical application like VANET such delay may have severe impact on real-time traffic management. This delay maybe reduc…
▽ More
With the rapid growth of Vehicular Ad-Hoc Networks (VANETs), huge amounts of road condition data are constantly being generated and sent to the cloud for processing. However, this introduces a significant load on the network bandwidth causing delay in the network and for a time-critical application like VANET such delay may have severe impact on real-time traffic management. This delay maybe reduced by offloading some computational tasks to devices as close as possible to the vehicles. Further, security and privacy of vehicles is another important concern in such applications. Thus, this paper proposes a secure road condition monitoring scheme for fog-based vehicular networks. It considers Road-Side Units (RSUs) with computational capabilities, which act as intermediate fog nodes in between vehicles and cloud to reduce delay in decision making and thereby increases scalability. Apart from fulfilling basic security features, our scheme also supports advanced features like unlinkability and untraceability. A detailed security analysis proves that the proposed scheme can handle both internal and external adversaries. We also justify the efficiency of our scheme through theoretical overhead analysis. Finally, the scheme is simulated using an integrated SUMO and NS-3 platform to show its feasibility for practical implementations while not compromising on the network performance. The results show an average 80.96\% and 37.5\% improvement in execution time and end-to-end delay respectively while maintaining at par results for packet delivery ratio over a state-of-the-art scheme.
△ Less
Submitted 23 November, 2021;
originally announced November 2021.
-
3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation
Authors:
Naman Sharma,
Hocksoon Lim
Abstract:
3D object detection using LiDAR data remains a key task for applications like autonomous driving and robotics. Unlike in the case of 2D images, LiDAR data is almost always collected over a period of time. However, most work in this area has focused on performing detection independent of the temporal domain. In this paper we present 3D-FCT, a Siamese network architecture that utilizes temporal info…
▽ More
3D object detection using LiDAR data remains a key task for applications like autonomous driving and robotics. Unlike in the case of 2D images, LiDAR data is almost always collected over a period of time. However, most work in this area has focused on performing detection independent of the temporal domain. In this paper we present 3D-FCT, a Siamese network architecture that utilizes temporal information to simultaneously perform the related tasks of 3D object detection and tracking. The network is trained to predict the movement of an object based on the correlation features of extracted keypoints across time. Calculating correlation across keypoints only allows for real-time object detection. We further extend the multi-task objective to include a tracking regression loss. Finally, we produce high accuracy detections by linking short-term object tracklets into long term tracks based on the predicted tracks. Our proposed method is evaluated on the KITTI tracking dataset where it is shown to provide an improvement of 5.57% mAP over a state-of-the-art approach.
△ Less
Submitted 6 October, 2021;
originally announced October 2021.
-
The Second DiCOVA Challenge: Dataset and performance analysis for COVID-19 diagnosis using acoustics
Authors:
Neeraj Kumar Sharma,
Srikanth Raj Chetupalli,
Debarpan Bhattacharya,
Debottam Dutta,
Pravin Mote,
Sriram Ganapathy
Abstract:
The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating the research in acoustics based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough…
▽ More
The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating the research in acoustics based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough and speech signals. This data was collected from individuals with and without COVID-19 infection, and the task in the challenge was a two-class classification. The development set audio recordings were collected from 965 (172 COVID-19 positive) individuals, while the evaluation set contained data from 471 individuals (71 COVID-19 positive). The challenge featured four tracks, one associated with each sound category of cough, speech and breathing, and a fourth fusion track. A baseline system was also released to benchmark the participants. In this paper, we present an overview of the challenge, the rationale for the data collection and the baseline system. Further, a performance analysis for the systems submitted by the $16$ participating teams in the leaderboard is also presented.
△ Less
Submitted 11 October, 2021; v1 submitted 4 October, 2021;
originally announced October 2021.
-
Interactive GIS Web-Atlas for Twelve Pacific Islands Countries
Authors:
Fabrice Lartigou,
Michael Govorov,
Tofiga Aisake,
Pankajeshwara N. Sharma
Abstract:
This article deals with the development of an interactive up-to-date Pacific Islands Web GIS Atlas. It focuses on the compilation of spatial data from the twelve member countries of the University of the South Pacific (Cook Islands, Fiji Islands, Kiribati Islands, Marshall Islands, Nauru, Niue, Tonga, Tuvalu, Tokelau, Solomon Islands, Vanuatu, and Western Samoa). A previous bitmap web Atlas was cr…
▽ More
This article deals with the development of an interactive up-to-date Pacific Islands Web GIS Atlas. It focuses on the compilation of spatial data from the twelve member countries of the University of the South Pacific (Cook Islands, Fiji Islands, Kiribati Islands, Marshall Islands, Nauru, Niue, Tonga, Tuvalu, Tokelau, Solomon Islands, Vanuatu, and Western Samoa). A previous bitmap web Atlas was created in 1996, and was a pilot activity investigating the potential for using Geographical Information Systems (GIS) in the South Pacific. The objective of the new atlas is to provide sets of spatial and attributive data and maps for use of educators, students, researchers, policy makers and other relevant user groups and the public. GIS is a highly flexible and dynamic technology that allows the construction and analysis of maps and data sets from a variety of sources and formats. Nowadays, GIS application has moved from local and client-server applications to a three-tier architecture: Client (Web Browser) -- Application Web Map Server -- Spatial Data Warehouses. The objective of this project is to produce an Atlas that will include interactive maps and data on an Application Web Map Server. Intergraph products such as GeoMedia Professional, Web Map and Web Publisher have been selected for the web atlas production and design. In an interactive environment, an atlas will be composed from a series of maps and data profiles, which will be based on legend entries, queries, hot spots and cartographic tools. Only the first stage of development of the atlas and related technological solutions are outlined in this article.
△ Less
Submitted 15 July, 2021;
originally announced July 2021.
-
Episodic Bandits with Stochastic Experts
Authors:
Nihal Sharma,
Soumya Basu,
Karthikeyan Shanmugam,
Sanjay Shakkottai
Abstract:
We study a version of the contextual bandit problem where an agent can intervene through a set of stochastic expert policies. The agent interacts with the environment over episodes, with each episode having different context distributions; this results in the `best expert' changing across episodes. Our goal is to develop an agent that tracks the best expert over episodes. We introduce the Empirica…
▽ More
We study a version of the contextual bandit problem where an agent can intervene through a set of stochastic expert policies. The agent interacts with the environment over episodes, with each episode having different context distributions; this results in the `best expert' changing across episodes. Our goal is to develop an agent that tracks the best expert over episodes. We introduce the Empirical Divergence-based UCB (ED-UCB) algorithm in this setting where the agent does not have any knowledge of the expert policies or changes in context distributions. With mild assumptions, we show that bootstrapping from $\mathcal{O}(N\log(NT^2\sqrt{E}))$ samples results in a regret of $\mathcal{O}(E(N+1) + \frac{N\sqrt{E}}{T^2})$ for $N$ experts over $E$ episodes, each of length $T$. If the expert policies are known to the agent a priori, then we can improve the regret to $\mathcal{O}(EN)$ without requiring any bootstrapping. Our analysis also tightens pre-existing logarithmic regret bounds to a problem-dependent constant in the non-episodic setting when expert policies are known. We finally empirically validate our findings through simulations.
△ Less
Submitted 26 October, 2021; v1 submitted 7 July, 2021;
originally announced July 2021.
-
Mentorship Network Structure: How Relationships Emerge Online and What They Mean for Amateur Creators
Authors:
Ruby Davis,
Jenna Frens,
Niharika Sharma,
Meena Devii Muralikumar,
Cecilia Aragon,
Sarah Evans
Abstract:
Relationships form the core of connected learning. In this study, we apply and extend social network analysis methods to uncover the layered network structure of relationships among Fanfiction.net authors and reviewers. Fanfiction.net, one of the world's largest fanfiction communities, is a space where millions of young people engage with written media, connect over shared interests, and receive s…
▽ More
Relationships form the core of connected learning. In this study, we apply and extend social network analysis methods to uncover the layered network structure of relationships among Fanfiction.net authors and reviewers. Fanfiction.net, one of the world's largest fanfiction communities, is a space where millions of young people engage with written media, connect over shared interests, and receive support and mentoring from a distributed audience. Does an affinity space such as Fanfiction.net have the same structure as social networks Facebook and Twitter? We applied k-means clustering on millions of relationships to determine that Fanfiction.net has 2 to 3 layers, in contrast with the 4-layer structure of Facebook and 5-layer structure of Twitter. In addition, we conducted a large-scale machine classification of Fanfiction.net reviews to reveal the types of mentoring exchanged in each layer. Our findings show that the relationships where reviews are exchanged most frequently are most likely to contain substantive reviews. We discuss implications of these findings for the theory of distributed mentoring as well as the design of online affinity networks.
△ Less
Submitted 26 June, 2021;
originally announced June 2021.
-
Towards sound based testing of COVID-19 -- Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge
Authors:
Neeraj Kumar Sharma,
Ananya Muguli,
Prashant Krishnan,
Rohit Kumar,
Srikanth Raj Chetupalli,
Sriram Ganapathy
Abstract:
The technology development for point-of-care tests (POCTs) targeting respiratory diseases has witnessed a growing demand in the recent past. Investigating the presence of acoustic biomarkers in modalities such as cough, breathing and speech sounds, and using them for building POCTs can offer fast, contactless and inexpensive testing. In view of this, over the past year, we launched the ``Coswara''…
▽ More
The technology development for point-of-care tests (POCTs) targeting respiratory diseases has witnessed a growing demand in the recent past. Investigating the presence of acoustic biomarkers in modalities such as cough, breathing and speech sounds, and using them for building POCTs can offer fast, contactless and inexpensive testing. In view of this, over the past year, we launched the ``Coswara'' project to collect cough, breathing and speech sound recordings via worldwide crowdsourcing. With this data, a call for development of diagnostic tools was announced in the Interspeech 2021 as a special session titled ``Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge''. The goal was to bring together researchers and practitioners interested in developing acoustics-based COVID-19 POCTs by enabling them to work on the same set of development and test datasets. As part of the challenge, datasets with breathing, cough, and speech sound samples from COVID-19 and non-COVID-19 individuals were released to the participants. The challenge consisted of two tracks. The Track-1 focused only on cough sounds, and participants competed in a leaderboard setting. In Track-2, breathing and speech samples were provided for the participants, without a competitive leaderboard. The challenge attracted 85 plus registrations with 29 final submissions for Track-1. This paper describes the challenge (datasets, tasks, baseline system), and presents a focused summary of the various systems submitted by the participating teams. An analysis of the results from the top four teams showed that a fusion of the scores from these teams yields an area-under-the-curve of 95.1% on the blind test data. By summarizing the lessons learned, we foresee the challenge overview in this paper to help accelerate technology for acoustic-based POCTs.
△ Less
Submitted 21 June, 2021;
originally announced June 2021.
-
Influence of Roles in Decision-Making during OSS Development -- A Study of Python
Authors:
Pankajeshwara Nand Sharma,
Bastin Tony Roy Savarimuthu,
Nigel Stanger
Abstract:
Governance has been highlighted as a key factor in the success of an Open Source Software (OSS) project. It is generally seen that in a mixed meritocracy and autocracy governance model, the decision-making (DM) responsibility regarding what features are included in the OSS is shared among members from select roles; prominently the project leader. However, less examination has been made whether mem…
▽ More
Governance has been highlighted as a key factor in the success of an Open Source Software (OSS) project. It is generally seen that in a mixed meritocracy and autocracy governance model, the decision-making (DM) responsibility regarding what features are included in the OSS is shared among members from select roles; prominently the project leader. However, less examination has been made whether members from these roles are also prominent in DM discussions and how decisions are made, to show they play an integral role in the success of the project. We believe that to establish their influence, it is necessary to examine not only discussions of proposals in which the project leader makes the decisions, but also those where others make the decisions. Therefore, in this study, we examine the prominence of members performing different roles in: (i) making decisions, (ii) performing certain social roles in DM discussions (e.g., discussion starters), (iii) contributing to the OSS development social network through DM discussions, and (iv) how decisions are made under both scenarios. We examine these aspects in the evolution of the well-known Python project. We carried out a data-driven longitudinal study of their email communication spanning 20 years, comprising about 1.5 million emails. These emails contain decisions for 466 Python Enhancement Proposals (PEPs) that document the language's evolution. Our findings make the influence of different roles transparent to future (new) members, other stakeholders, and more broadly, to the OSS research community.
△ Less
Submitted 3 June, 2021;
originally announced June 2021.
-
Multi-modal Point-of-Care Diagnostics for COVID-19 Based On Acoustics and Symptoms
Authors:
Srikanth Raj Chetupalli,
Prashant Krishnan,
Neeraj Sharma,
Ananya Muguli,
Rohit Kumar,
Viral Nanda,
Lancelot Mark Pinto,
Prasanta Kumar Ghosh,
Sriram Ganapathy
Abstract:
The research direction of identifying acoustic bio-markers of respiratory diseases has received renewed interest following the onset of COVID-19 pandemic. In this paper, we design an approach to COVID-19 diagnostic using crowd-sourced multi-modal data. The data resource, consisting of acoustic signals like cough, breathing, and speech signals, along with the data of symptoms, are recorded using a…
▽ More
The research direction of identifying acoustic bio-markers of respiratory diseases has received renewed interest following the onset of COVID-19 pandemic. In this paper, we design an approach to COVID-19 diagnostic using crowd-sourced multi-modal data. The data resource, consisting of acoustic signals like cough, breathing, and speech signals, along with the data of symptoms, are recorded using a web-application over a period of ten months. We investigate the use of statistical descriptors of simple time-frequency features for acoustic signals and binary features for the presence of symptoms. Unlike previous works, we primarily focus on the application of simple linear classifiers like logistic regression and support vector machines for acoustic data while decision tree models are employed on the symptoms data. We show that a multi-modal integration of acoustics and symptoms classifiers achieves an area-under-curve (AUC) of 92.40, a significant improvement over any individual modality. Several ablation experiments are also provided which highlight the acoustic and symptom dimensions that are important for the task of COVID-19 diagnostics.
△ Less
Submitted 5 June, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
A digital score of tumour-associated stroma infiltrating lymphocytes predicts survival in head and neck squamous cell carcinoma
Authors:
Muhammad Shaban,
Shan E Ahmed Raza,
Mariam Hassan,
Arif Jamshed,
Sajid Mushtaq,
Asif Loya,
Nikolaos Batis,
Jill Brooks,
Paul Nankivell,
Neil Sharma,
Max Robinson,
Hisham Mehanna,
Syed Ali Khurram,
Nasir Rajpoot
Abstract:
The infiltration of T-lymphocytes in the stroma and tumour is an indication of an effective immune response against the tumour, resulting in better survival. In this study, our aim is to explore the prognostic significance of tumour-associated stroma infiltrating lymphocytes (TASILs) in head and neck squamous cell carcinoma (HNSCC) through an AI based automated method. A deep learning based automa…
▽ More
The infiltration of T-lymphocytes in the stroma and tumour is an indication of an effective immune response against the tumour, resulting in better survival. In this study, our aim is to explore the prognostic significance of tumour-associated stroma infiltrating lymphocytes (TASILs) in head and neck squamous cell carcinoma (HNSCC) through an AI based automated method. A deep learning based automated method was employed to segment tumour, stroma and lymphocytes in digitally scanned whole slide images of HNSCC tissue slides. The spatial patterns of lymphocytes and tumour-associated stroma were digitally quantified to compute the TASIL-score. Finally, prognostic significance of the TASIL-score for disease-specific and disease-free survival was investigated with the Cox proportional hazard analysis. Three different cohorts of Haematoxylin & Eosin (H&E) stained tissue slides of HNSCC cases (n=537 in total) were studied, including publicly available TCGA head and neck cancer cases. The TASIL-score carries prognostic significance (p=0.002) for disease-specific survival of HNSCC patients. The TASIL-score also shows a better separation between low- and high-risk patients as compared to the manual TIL scoring by pathologists for both disease-specific and disease-free survival. A positive correlation of TASIL-score with molecular estimates of CD8+ T cells was also found, which is in line with existing findings. To the best of our knowledge, this is the first study to automate the quantification of TASIL from routine H&E slides of head and neck cancer. Our TASIL-score based findings are aligned with the clinical knowledge with the added advantages of objectivity, reproducibility and strong prognostic value. A comprehensive evaluation on large multicentric cohorts is required before the proposed digital score can be adopted in clinical practice.
△ Less
Submitted 16 April, 2021;
originally announced April 2021.
-
DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics
Authors:
Ananya Muguli,
Lancelot Pinto,
Nirmala R.,
Neeraj Sharma,
Prashant Krishnan,
Prasanta Kumar Ghosh,
Rohit Kumar,
Shrirama Bhat,
Srikanth Raj Chetupalli,
Sriram Ganapathy,
Shreyas Ramoji,
Viral Nanda
Abstract:
The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning. This challenge is an open call for researchers to analyze a dataset of sound recordings collected from COVID-19 infected and non-COVID-19 individuals for a two-class classification. These…
▽ More
The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning. This challenge is an open call for researchers to analyze a dataset of sound recordings collected from COVID-19 infected and non-COVID-19 individuals for a two-class classification. These recordings were collected via crowdsourcing from multiple countries, through a website application. The challenge features two tracks, one focusing on cough sounds, and the other on using a collection of breath, sustained vowel phonation, and number counting speech recordings. In this paper, we introduce the challenge and provide a detailed description of the task, and present a baseline system for the task.
△ Less
Submitted 17 June, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Extracting Rationale for Open Source Software Development Decisions -- A Study of Python Email Archives
Authors:
Pankajeshwara Nand Sharma,
Bastin Tony Roy Savarimuthu,
Nigel Stanger
Abstract:
A sound Decision-Making (DM) process is key to the successful governance of software projects. In many Open Source Software Development (OSSD) communities, DM processes lie buried amongst vast amounts of publicly available data. Hidden within this data lie the rationale for decisions that led to the evolution and maintenance of software products. While there have been some efforts to extract DM pr…
▽ More
A sound Decision-Making (DM) process is key to the successful governance of software projects. In many Open Source Software Development (OSSD) communities, DM processes lie buried amongst vast amounts of publicly available data. Hidden within this data lie the rationale for decisions that led to the evolution and maintenance of software products. While there have been some efforts to extract DM processes from publicly available data, the rationale behind how the decisions are made have seldom been explored. Extracting the rationale for these decisions can facilitate transparency (by making them known), and also promote accountability on the part of decision-makers. This work bridges this gap by means of a large-scale study that unearths the rationale behind decisions from Python development email archives comprising about 1.5 million emails. This paper makes two main contributions. First, it makes a knowledge contribution by unearthing and presenting the rationale behind decisions made. Second, it makes a methodological contribution by presenting a heuristics-based rationale extraction system called Rationale Miner that employs multiple heuristics, and follows a data-driven, bottom-up approach to infer the rationale behind specific decisions (e.g., whether a new module is implemented based on core developer consensus or benevolent dictator's pronouncement). Our approach can be applied to extract rationale in other OSSD communities that have similar governance structures.
△ Less
Submitted 9 February, 2021;
originally announced February 2021.
-
Role of Attentive History Selection in Conversational Information Seeking
Authors:
Somil Gupta,
Neeraj Sharma
Abstract:
The rise of intelligent assistant systems like Siri and Alexa have led to the emergence of Conversational Search, a research track of Information Retrieval (IR) that involves interactive and iterative information-seeking user-system dialog. Recently released OR-QuAC and TCAsT19 datasets narrow their research focus on the retrieval aspect of conversational search i.e. fetching the relevant document…
▽ More
The rise of intelligent assistant systems like Siri and Alexa have led to the emergence of Conversational Search, a research track of Information Retrieval (IR) that involves interactive and iterative information-seeking user-system dialog. Recently released OR-QuAC and TCAsT19 datasets narrow their research focus on the retrieval aspect of conversational search i.e. fetching the relevant documents (passages) from a large collection using the conversational search history. Currently proposed models for these datasets incorporate history in retrieval by appending the last N turns to the current question before encoding. We propose to use another history selection approach that dynamically selects and weighs history turns using the attention mechanism for question embedding. The novelty of our approach lies in experimenting with soft attention-based history selection approach in an open-retrieval setting.
△ Less
Submitted 7 February, 2021;
originally announced February 2021.
-
Survey2Survey: A deep learning generative model approach for cross-survey image mapping
Authors:
Brandon Buncher,
Awshesh Nath Sharma,
Matias Carrasco Kind
Abstract:
During the last decade, there has been an explosive growth in survey data and deep learning techniques, both of which have enabled great advances for astronomy. The amount of data from various surveys from multiple epochs with a wide range of wavelengths, albeit with varying brightness and quality, is overwhelming, and leveraging information from overlapping observations from different surveys has…
▽ More
During the last decade, there has been an explosive growth in survey data and deep learning techniques, both of which have enabled great advances for astronomy. The amount of data from various surveys from multiple epochs with a wide range of wavelengths, albeit with varying brightness and quality, is overwhelming, and leveraging information from overlapping observations from different surveys has limitless potential in understanding galaxy formation and evolution. Synthetic galaxy image generation using physical models has been an important tool for survey data analysis, while deep learning generative models show great promise. In this paper, we present a novel approach for robustly expanding and improving survey data through cross survey feature translation. We trained two types of neural networks to map images from the Sloan Digital Sky Survey (SDSS) to corresponding images from the Dark Energy Survey (DES). This map was used to generate false DES representations of SDSS images, increasing the brightness and S/N while retaining important morphological information. We substantiate the robustness of our method by generating DES representations of SDSS images from outside the overlapping region, showing that the brightness and quality are improved even when the source images are of lower quality than the training images. Finally, we highlight several images in which the reconstruction process appears to have removed large artifacts from SDSS images. While only an initial application, our method shows promise as a method for robustly expanding and improving the quality of optical survey data and provides a potential avenue for cross-band reconstruction.
△ Less
Submitted 5 February, 2021; v1 submitted 13 November, 2020;
originally announced November 2020.
-
Anonymous proof-of-asset transactions using designated blind signatures
Authors:
Neetu Sharma,
Rajeev Anand Sahu,
Vishal Saraswat,
Joaquin Garcia-Alfaro
Abstract:
We propose a scheme to preserve the anonymity of users in proof-of-asset transactions. We assume bitcoin-like cryptocurrency systems in which a user must prove the strength of its assets (i.e., solvency), prior conducting further transactions. The traditional way of addressing such a problem is the use of blind signatures, i.e., a kind of digital signature whose properties satisfy the anonymity of…
▽ More
We propose a scheme to preserve the anonymity of users in proof-of-asset transactions. We assume bitcoin-like cryptocurrency systems in which a user must prove the strength of its assets (i.e., solvency), prior conducting further transactions. The traditional way of addressing such a problem is the use of blind signatures, i.e., a kind of digital signature whose properties satisfy the anonymity of the signer. Our work focuses on the use of a designated verifier signature scheme that limits to only a single authorized party (within a group of signature requesters) to verify the correctness of the transaction.
△ Less
Submitted 26 October, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.
-
CLARINET: A RISC-V Based Framework for Posit Arithmetic Empiricism
Authors:
Niraj Sharma,
Riya Jain,
Madhumita Mohan,
Sachin Patkar,
Rainer Leupers,
Nikhil Rishiyur,
Farhad Merchant
Abstract:
Many engineering and scientific applications require high precision arithmetic. IEEE~754-2008 compliant (floating-point) arithmetic is the de facto standard for performing these computations. Recently, posit arithmetic has been proposed as a drop-in replacement for floating-point arithmetic. The posit\texttrademark data representation and arithmetic claim several absolute advantages over the float…
▽ More
Many engineering and scientific applications require high precision arithmetic. IEEE~754-2008 compliant (floating-point) arithmetic is the de facto standard for performing these computations. Recently, posit arithmetic has been proposed as a drop-in replacement for floating-point arithmetic. The posit\texttrademark data representation and arithmetic claim several absolute advantages over the floating-point format and arithmetic, including higher dynamic range, better accuracy, and superior performance-area trade-offs. However, there does not exist any accessible, holistic framework that facilitates the validation of these claims of posit arithmetic, especially when the claims involve long accumulations (quire).
In this paper, we present a consolidated general-purpose processor-based framework to support posit arithmetic empiricism. The end-users of the framework have the liberty to seamlessly experiment with their applications using posit and floating-point arithmetic since the framework is designed for the two number systems to coexist. Melodica is a posit arithmetic core that implements parametric fused operations that uniquely involve the quire data type. Clarinet is a Melodica-enabled processor based on the RISC-V ISA. To the best of our knowledge, this is the first-ever integration of quire with a RISC-V core. To show the effectiveness of the Clarinet platform, we perform an extensive application study and benchmark some of the common linear algebra and computer vision kernels. We emulate Clarinet on a Xilinx FPGA and present utilization and timing data. Clarinet and Melodica remain actively under development and is available in open-source for posit arithmetic empiricism.
△ Less
Submitted 27 October, 2021; v1 submitted 30 May, 2020;
originally announced June 2020.
-
Coswara -- A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis
Authors:
Neeraj Sharma,
Prashant Krishnan,
Rohit Kumar,
Shreyas Ramoji,
Srikanth Raj Chetupalli,
Nirmala R.,
Prasanta Kumar Ghosh,
Sriram Ganapathy
Abstract:
The COVID-19 pandemic presents global challenges transcending boundaries of country, race, religion, and economy. The current gold standard method for COVID-19 detection is the reverse transcription polymerase chain reaction (RT-PCR) testing. However, this method is expensive, time-consuming, and violates social distancing. Also, as the pandemic is expected to stay for a while, there is a need for…
▽ More
The COVID-19 pandemic presents global challenges transcending boundaries of country, race, religion, and economy. The current gold standard method for COVID-19 detection is the reverse transcription polymerase chain reaction (RT-PCR) testing. However, this method is expensive, time-consuming, and violates social distancing. Also, as the pandemic is expected to stay for a while, there is a need for an alternate diagnosis tool which overcomes these limitations, and is deployable at a large scale. The prominent symptoms of COVID-19 include cough and breathing difficulties. We foresee that respiratory sounds, when analyzed using machine learning techniques, can provide useful insights, enabling the design of a diagnostic tool. Towards this, the paper presents an early effort in creating (and analyzing) a database, called Coswara, of respiratory sounds, namely, cough, breath, and voice. The sound samples are collected via worldwide crowdsourcing using a website application. The curated dataset is released as open access. As the pandemic is evolving, the data collection and analysis is a work in progress. We believe that insights from analysis of Coswara can be effective in enabling sound based technology solutions for point-of-care diagnosis of respiratory infection, and in the near future this can help to diagnose COVID-19.
△ Less
Submitted 11 August, 2020; v1 submitted 21 May, 2020;
originally announced May 2020.
-
Temporal Attribute Prediction via Joint Modeling of Multi-Relational Structure Evolution
Authors:
Sankalp Garg,
Navodita Sharma,
Woojeong Jin,
Xiang Ren
Abstract:
Time series prediction is an important problem in machine learning. Previous methods for time series prediction did not involve additional information. With a lot of dynamic knowledge graphs available, we can use this additional information to predict the time series better. Recently, there has been a focus on the application of deep representation learning on dynamic graphs. These methods predict…
▽ More
Time series prediction is an important problem in machine learning. Previous methods for time series prediction did not involve additional information. With a lot of dynamic knowledge graphs available, we can use this additional information to predict the time series better. Recently, there has been a focus on the application of deep representation learning on dynamic graphs. These methods predict the structure of the graph by reasoning over the interactions in the graph at previous time steps. In this paper, we propose a new framework to incorporate the information from dynamic knowledge graphs for time series prediction. We show that if the information contained in the graph and the time series data are closely related, then this inter-dependence can be used to predict the time series with improved accuracy. Our framework, DArtNet, learns a static embedding for every node in the graph as well as a dynamic embedding which is dependent on the dynamic attribute value (time-series). Then it captures the information from the neighborhood by taking a relation specific mean and encodes the history information using RNN. We jointly train the model link prediction and attribute prediction. We evaluate our method on five specially curated datasets for this problem and show a consistent improvement in time series prediction results. We release the data and code of model DArtNet for future research at https://github.com/INK-USC/DArtNet .
△ Less
Submitted 13 July, 2020; v1 submitted 9 March, 2020;
originally announced March 2020.
-
Dual link failure survivability with recovery time constraint: A Parallel cross connection backup route recovery strategy
Authors:
Dinesh Kumar,
Rajiv Kumar,
Neeru Sharma
Abstract:
In this paper, we proposed a fast recovery strategy for a dual link failure in elastic optical network. The elastic optical network is a promising solution to meet the next generation higher bandwidth demand. The survivability of high speed network is very crucial. As the network size increases the probability of the dual link failure and node failure also increases. Here, we proposed a parallel c…
▽ More
In this paper, we proposed a fast recovery strategy for a dual link failure in elastic optical network. The elastic optical network is a promising solution to meet the next generation higher bandwidth demand. The survivability of high speed network is very crucial. As the network size increases the probability of the dual link failure and node failure also increases. Here, we proposed a parallel cross connection backup recovery strategy for dual link failure in the network. Proposed strategy shows lower bandwidth blocking probability (BBP), fast connection recovery, and bandwidth provisioning ratio when compared with the existing shared path protection (SPP) and dedicated path protection (DPP) approaches. Simulation is performed on ARPANET and COST239 topology networks.
△ Less
Submitted 5 March, 2020;
originally announced March 2020.
-
On Under-exploration in Bandits with Mean Bounds from Confounded Data
Authors:
Nihal Sharma,
Soumya Basu,
Karthikeyan Shanmugam,
Sanjay Shakkottai
Abstract:
We study a variant of the multi-armed bandit problem where side information in the form of bounds on the mean of each arm is provided. We develop the novel non-optimistic Global Under-Explore (GLUE) algorithm which uses the provided mean bounds (across all the arms) to infer pseudo-variances for each arm, which in turn decide the rate of exploration for the arms. We analyze the regret of GLUE and…
▽ More
We study a variant of the multi-armed bandit problem where side information in the form of bounds on the mean of each arm is provided. We develop the novel non-optimistic Global Under-Explore (GLUE) algorithm which uses the provided mean bounds (across all the arms) to infer pseudo-variances for each arm, which in turn decide the rate of exploration for the arms. We analyze the regret of GLUE and prove regret upper bounds that are never worse than that of the standard UCB algorithm. Furthermore, we show that GLUE improves upon regret guarantees that exists in literature for structured bandit problems (both theoretically and empirically). Finally, we study the practical setting of learning adaptive interventions using prior data that has been confounded by unrecorded variables that affect rewards. We show that mean bounds can be inferred naturally from such logs and can thus be used to improve the learning process. We validate our findings through semi-synthetic experiments on data derived from real data sets.
△ Less
Submitted 10 June, 2021; v1 submitted 19 February, 2020;
originally announced February 2020.
-
Duet: An Expressive Higher-order Language and Linear Type System for Statically Enforcing Differential Privacy
Authors:
Joseph P. Near,
David Darais,
Chike Abuah,
Tim Stevens,
Pranav Gaddamadugu,
Lun Wang,
Neel Somani,
Mu Zhang,
Nikhil Sharma,
Alex Shan,
Dawn Song
Abstract:
During the past decade, differential privacy has become the gold standard for protecting the privacy of individuals. However, verifying that a particular program provides differential privacy often remains a manual task to be completed by an expert in the field. Language-based techniques have been proposed for fully automating proofs of differential privacy via type system design, however these re…
▽ More
During the past decade, differential privacy has become the gold standard for protecting the privacy of individuals. However, verifying that a particular program provides differential privacy often remains a manual task to be completed by an expert in the field. Language-based techniques have been proposed for fully automating proofs of differential privacy via type system design, however these results have lagged behind advances in differentially-private algorithms, leaving a noticeable gap in programs which can be automatically verified while also providing state-of-the-art bounds on privacy.
We propose Duet, an expressive higher-order language, linear type system and tool for automatically verifying differential privacy of general-purpose higher-order programs. In addition to general purpose programming, Duet supports encoding machine learning algorithms such as stochastic gradient descent, as well as common auxiliary data analysis tasks such as clipping, normalization and hyperparameter tuning - each of which are particularly challenging to encode in a statically verified differential privacy framework.
We present a core design of the Duet language and linear type system, and complete key proofs about privacy for well-typed programs. We then show how to extend Duet to support realistic machine learning applications and recent variants of differential privacy which result in improved accuracy for many practical differentially private algorithms. Finally, we implement several differentially private machine learning algorithms in Duet which have never before been automatically verified by a language-based tool, and we present experimental results which demonstrate the benefits of Duet's language design in terms of accuracy of trained machine learning models.
△ Less
Submitted 5 September, 2019;
originally announced September 2019.