Showing 1–50 of 52 results for author: Sharma, T

  1. arXiv:2407.04147  [pdf, other]

    cs.SE

    ALPINE: An adaptive language-agnostic pruning method for language models for code

    Authors: Mootez Saad, José Antonio Hernández López, Boqi Chen, Dániel Varró, Tushar Sharma

    Abstract: Language models of code have demonstrated state-of-the-art performance across various software engineering and source code analysis tasks. However, their demanding computational resource requirements and consequential environmental footprint remain as significant challenges. This work introduces ALPINE, an adaptive programming language-agnostic pruning technique designed to substantially reduce th…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2406.10461  [pdf, ps, other]

    cs.HC

    Exploring Parent-Child Perceptions on Safety in Generative AI: Concerns, Mitigation Strategies, and Design Implications

    Authors: Yaman Yu, Tanusree Sharma, Melinda Hu, Justin Wang, Yang Wang

    Abstract: The widespread use of Generative Artificial Intelligence (GAI) among teenagers has led to significant misuse and safety concerns. To identify risks and understand parental controls challenges, we conducted a content analysis on Reddit and interviewed 20 participants (seven teenagers and 13 parents). Our study reveals a significant gap in parental awareness of the extensive ways children use GAI, s…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 13 pages

  3. arXiv:2405.11138  [pdf, other]

    cs.NI cs.CY

    Spatial Models for Crowdsourced Internet Access Network Performance Measurements

    Authors: Taveesh Sharma, Paul Schmitt, Francesco Bronzino, Nick Feamster, Nicole Marwell

    Abstract: Despite significant investments in access network infrastructure, universal access to high-quality Internet connectivity remains a challenge. Policymakers often rely on large-scale, crowdsourced measurement datasets to assess the distribution of access network performance across geographic areas. These decisions typically rest on the assumption that Internet performance is uniformly distributed wi…

    Submitted 21 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 13 pages

  4. arXiv:2405.02790  [pdf, other]

    cs.CR cs.LG

    Confidential and Protected Disease Classifier using Fully Homomorphic Encryption

    Authors: Aditya Malik, Nalini Ratha, Bharat Yalavarthi, Tilak Sharma, Arjun Kaushik, Charanjit Jutla

    Abstract: With the rapid surge in the prevalence of Large Language Models (LLMs), individuals are increasingly turning to conversational AI for initial insights across various domains, including health-related inquiries such as disease diagnosis. Many users seek potential causes on platforms like ChatGPT or Bard before consulting a medical professional for their ailment. These platforms offer valuable benef…

    Submitted 4 May, 2024; originally announced May 2024.

  5. arXiv:2402.01841  [pdf, other]

    cs.SE cs.AI cs.CL

    COMET: Generating Commit Messages using Delta Graph Context Representation

    Authors: Abhinav Reddy Mandli, Saurabhsingh Rajput, Tushar Sharma

    Abstract: Commit messages explain code changes in a commit and facilitate collaboration among developers. Several commit message generation approaches have been proposed; however, they exhibit limited success in capturing the context of code changes. We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures context of code changes using a graph-based representation and lever…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 22 Pages, 7 Figures

  6. arXiv:2401.17967  [pdf, other]

    cs.SE cs.LG

    CONCORD: Towards a DSL for Configurable Graph Code Representation

    Authors: Mootez Saad, Tushar Sharma

    Abstract: Deep learning is widely used to uncover hidden patterns in large code corpora. To achieve this, constructing a format that captures the relevant characteristics and features of source code is essential. Graph-based representations have gained attention for their ability to model structural and semantic information. However, existing tools lack flexibility in constructing graphs across different pr…

    Submitted 31 January, 2024; originally announced January 2024.

  7. arXiv:2401.07930  [pdf, other]

    cs.SE

    On Inter-dataset Code Duplication and Data Leakage in Large Language Models

    Authors: José Antonio Hernández López, Boqi Chen, Tushar Sharma, Dániel Varró

    Abstract: Motivation. Large language models (LLMs) have exhibited remarkable proficiency in diverse software engineering (SE) tasks. Handling such tasks typically involves acquiring foundational coding knowledge on large, general-purpose datasets during a pre-training phase, and subsequently refining on smaller, task-specific datasets as part of a fine-tuning phase. Problem statement. Data leakage is a we…

    Submitted 15 January, 2024; originally announced January 2024.

  8. arXiv:2312.15896  [pdf, other]

    cs.AR cs.DC cs.LG

    WWW: What, When, Where to Compute-in-Memory

    Authors: Tanvi Sharma, Mustafa Ali, Indranil Chakraborty, Kaushik Roy

    Abstract: Compute-in-memory (CiM) has emerged as a highly energy efficient solution for performing matrix multiplication during Machine Learning (ML) inference. However, integrating compute in memory poses key questions, such as 1) What type of CiM to use: Given a multitude of CiM design characteristics, determining their suitability from architecture perspective is needed. 2) When to use CiM: ML inference…

    Submitted 20 June, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

    Comments: updated methodology

  9. arXiv:2311.13508  [pdf, other]

    cs.SE cs.LG

    Naturalness of Attention: Revisiting Attention in Code Language Models

    Authors: Mootez Saad, Tushar Sharma

    Abstract: Language models for code such as CodeBERT offer the capability to learn advanced source code representation, but their opacity poses barriers to understanding of captured properties. Recent attention analysis studies provide initial interpretability insights by focusing solely on attention weights rather than considering the wider context modeling of Transformers. This study aims to shed some ligh…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: Accepted at ICSE-NIER (2024) track

  10. arXiv:2310.02859  [pdf, other]

    cs.SI

    Tight Sampling in Unbounded Networks

    Authors: Kshitijaa Jaglan, Meher Chaitanya, Triansh Sharma, Abhijeeth Singam, Nidhi Goyal, Ponnurangam Kumaraguru, Ulrik Brandes

    Abstract: The default approach to deal with the enormous size and limited accessibility of many Web and social media networks is to sample one or more subnetworks from a conceptually unbounded unknown network. Clearly, the extracted subnetworks will crucially depend on the sampling scheme. Motivated by studies of homophily and opinion formation, we propose a variant of snowball sampling designed to prioriti…

    Submitted 5 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: The first two authors contributed equally

  11. Using ChatGPT in HCI Research -- A Trioethnography

    Authors: Smit Desai, Tanusree Sharma, Pratyasha Saha

    Abstract: This paper explores the lived experience of using ChatGPT in HCI research through a month-long trioethnography. Our approach combines the expertise of three HCI researchers with diverse research interests to reflect on our daily experience of living and working with ChatGPT. Our findings are presented as three provocations grounded in our collective experiences and HCI theories. Specifically, we e…

    Submitted 21 September, 2023; originally announced September 2023.

  12. arXiv:2308.12264  [pdf, other]

    cs.LG cs.AI cs.PF cs.SE

    Enhancing Energy-Awareness in Deep Learning through Fine-Grained Energy Measurement

    Authors: Saurabhsingh Rajput, Tim Widmayer, Ziyuan Shang, Maria Kechagia, Federica Sarro, Tushar Sharma

    Abstract: With the increasing usage, scale, and complexity of Deep Learning (DL) models, their rapidly growing energy consumption has become a critical concern. Promoting green development and energy awareness at different granularities is the need of the hour to limit carbon emissions of DL systems. However, the lack of standard and repeatable tools to accurately measure and optimize energy consumption at…

    Submitted 1 February, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

  13. arXiv:2308.12199  [pdf, other]

    cs.CV

    Towards Real-Time Analysis of Broadcast Badminton Videos

    Authors: Nitin Nilesh, Tushar Sharma, Anurag Ghosh, C. V. Jawahar

    Abstract: Analysis of player movements is a crucial subset of sports analysis. Existing player movement analysis methods use recorded videos after the match is over. In this work, we propose an end-to-end framework for player movement analysis for badminton matches on live broadcast match videos. We only use the visual inputs from the match and, unlike other approaches which use multi-modal sensor data, our…

    Submitted 23 August, 2023; originally announced August 2023.

  14. arXiv:2308.06882  [pdf, other]

    q-fin.ST cs.LG q-fin.CP stat.AP

    Quantifying Outlierness of Funds from their Categories using Supervised Similarity

    Authors: Dhruv Desai, Ashmita Dhiman, Tushar Sharma, Deepika Sharma, Dhagash Mehta, Stefano Pasquali

    Abstract: Mutual fund categorization has become a standard tool for the investment management industry and is extensively used by allocators for portfolio construction and manager selection, as well as by fund managers for peer analysis and competitive positioning. As a result, an (unintended) miscategorization or lack of precision can significantly impact allocation decisions and investment fund managers. H…

    Submitted 13 August, 2023; originally announced August 2023.

    Comments: 8 pages, 5 tables, 8 figures

  15. arXiv:2307.08652  [pdf, other]

    cs.GR

    Search Me Knot, Render Me Knot: Embedding Search and Differentiable Rendering of Knots in 3D

    Authors: Aalok Gangopadhyay, Paras Gupta, Tarun Sharma, Prajwal Singh, Shanmuganathan Raman

    Abstract: We introduce the problem of knot-based inverse perceptual art. Given multiple target images and their corresponding viewing configurations, the objective is to find a 3D knot-based tubular structure whose appearance resembles the target images when viewed from the specified viewing configurations. To solve this problem, we first design a differentiable rendering algorithm for rendering tubular kno…

    Submitted 19 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

  16. arXiv:2306.06261  [pdf, other]

    cs.CR cs.HC

    Iterative Design of An Accessible Crypto Wallet for Blind Users

    Authors: Zhixuan Zhou, Tanusree Sharma, Luke Emano, Sauvik Das, Yang Wang

    Abstract: Crypto wallets are a key touch-point for cryptocurrency use. People use crypto wallets to make transactions, manage crypto assets, and interact with decentralized apps (dApps). However, as is often the case with emergent technologies, little attention has been paid to understanding and improving accessibility barriers in crypto wallet software. We present a series of user studies that explored how…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 19th Symposium on Usable Privacy and Security

  17. Estimating WebRTC Video QoE Metrics Without Using Application Headers

    Authors: Taveesh Sharma, Tarun Mangla, Arpit Gupta, Junchen Jiang, Nick Feamster

    Abstract: The increased use of video conferencing applications (VCAs) has made it critical to understand and support end-user quality of experience (QoE) by all stakeholders in the VCA ecosystem, especially network operators, who typically do not have direct access to client software. Existing VCA QoE estimation methods use passive measurements of application-level Real-time Transport Protocol (RTP) headers…

    Submitted 9 November, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 16 pages

  18. arXiv:2304.09822  [pdf, other]

    cs.CY cs.SI

    Unpacking How Decentralized Autonomous Organizations (DAOs) Work in Practice

    Authors: Tanusree Sharma, Yujin Kwon, Kornrapat Pongmala, Henry Wang, Andrew Miller, Dawn Song, Yang Wang

    Abstract: Decentralized Autonomous Organizations (DAOs) have emerged as a novel way to coordinate a group of (pseudonymous) entities towards a shared vision (e.g., promoting sustainability), utilizing self-executing smart contracts on blockchains to support decentralized governance and decision-making. In just a few years, over 4,000 DAOs have been launched in various domains, such as investment, education,…

    Submitted 16 April, 2023; originally announced April 2023.

  19. arXiv:2304.07598  [pdf, other]

    cs.CR

    Understanding Rug Pulls: An In-Depth Behavioral Analysis of Fraudulent NFT Creators

    Authors: Trishie Sharma, Rachit Agarwal, Sandeep Kumar Shukla

    Abstract: The explosive growth of non-fungible tokens (NFTs) on Web3 has created a new frontier for digital art and collectibles, but also an emerging space for fraudulent activities. This study provides an in-depth analysis of NFT rug pulls, which are fraudulent schemes aimed at stealing investors' funds. Using data from 758 rug pulls across 10 NFT marketplaces, we examine the structural and behavioral pro…

    Submitted 15 April, 2023; originally announced April 2023.

  20. arXiv:2303.08729  [pdf, other]

    cs.SE cs.AI cs.LG cs.PL

    DACOS-A Manually Annotated Dataset of Code Smells

    Authors: Himesh Nandani, Mootez Saad, Tushar Sharma

    Abstract: Researchers apply machine-learning techniques for code smell detection to counter the subjectivity of many code smells. Such approaches need a large, manually annotated dataset for training and benchmarking. Existing literature offers a few datasets; however, they are small in size and, more importantly, do not focus on the subjective code snippets. In this paper, we present DACOS, a manually anno…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: 4 pages

  21. arXiv:2301.02211  [pdf, other]

    cs.CY cs.CV

    Teaching Computer Vision for Ecology

    Authors: Elijah Cole, Suzanne Stathatos, Björn Lütjens, Tarun Sharma, Justin Kay, Jason Parham, Benjamin Kellenberger, Sara Beery

    Abstract: Computer vision can accelerate ecology research by automating the analysis of raw imagery from sensors like camera traps, drones, and satellites. However, computer vision is an emerging discipline that is rarely taught to ecologists. This work discusses our experience teaching a diverse group of ecologists to prototype and evaluate computer vision systems in the context of an intensive hands-on su…

    Submitted 5 January, 2023; originally announced January 2023.

  22. arXiv:2211.06254  [pdf, other]

    cs.NE cs.CV cs.LG

    Re-visiting Reservoir Computing architectures optimized by Evolutionary Algorithms

    Authors: Sebastián Basterrech, Tarun Kumar Sharma

    Abstract: For many years, Evolutionary Algorithms (EAs) have been applied to improve Neural Networks (NNs) architectures. They have been used for solving different problems, such as training the networks (adjusting the weights), designing network topology, optimizing global parameters, and selecting features. Here, we provide a systematic brief survey about applications of the EAs on the specific domain of…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted manuscript to the 14th World Congress on Nature and Biologically Inspired Computing (NaBIC), Seattle, WA, United States, December 14-16, 2022. A revised manuscript will be published in the conference proceedings by Springer in the Lecture Notes in Networks and Systems

  23. arXiv:2209.02438  [pdf]

    cs.CV

    Threat Detection In Self-Driving Vehicles Using Computer Vision

    Authors: Umang Goenka, Aaryan Jagetia, Param Patil, Akshay Singh, Taresh Sharma, Poonam Saini

    Abstract: On-road obstacle detection is an important field of research that falls in the scope of intelligent transportation infrastructure systems. The use of vision-based approaches results in an accurate and cost-effective solution to such systems. In this research paper, we propose a threat detection mechanism for autonomous self-driving cars using dashcam videos to ensure the presence of any unwanted o…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: Presented in 3rd International Conference on Machine Learning, Image Processing, Network Security and Data Sciences MIND-2021

  24. arXiv:2206.13910  [pdf, other]

    q-bio.PE cs.LG math.OC physics.soc-ph

    Epidemic Control Modeling using Parsimonious Models and Markov Decision Processes

    Authors: Edilson F. Arruda, Tarun Sharma, Rodrigo e A. Alexandre, Sinnu Susan Thomas

    Abstract: Many countries have experienced at least two waves of the COVID-19 pandemic. The second wave is far more dangerous as distinct strains appear more harmful to human health, but it stems from the complacency about the first wave. This paper introduces a parsimonious yet representative stochastic epidemic model that simulates the uncertain spread of the disease regardless of the latency and recovery…

    Submitted 23 June, 2022; originally announced June 2022.

  25. arXiv:2204.11193  [pdf, other]

    cs.CR cs.HC cs.SE

    Exploring Security Practices of Smart Contract Developers

    Authors: Tanusree Sharma, Zhixuan Zhou, Andrew Miller, Yang Wang

    Abstract: Smart contracts are self-executing programs that run on blockchains (e.g., Ethereum). 680 million US dollars worth of digital assets controlled by smart contracts have been hacked or stolen due to various security vulnerabilities in 2021. Although security is a fundamental concern for smart contracts, it is unclear how smart contract developers approach security. To help fill this research gap, we…

    Submitted 24 April, 2022; originally announced April 2022.

  26. Empirical Standards for Repository Mining

    Authors: Preetha Chatterjee, Tushar Sharma, Paul Ralph

    Abstract: The purpose of scholarly peer review is to evaluate the quality of scientific manuscripts. However, study after study demonstrates that peer review neither effectively nor reliably assesses research quality. Empirical standards attempt to address this problem by modelling a scientific community's expectations for each kind of empirical study conducted in that community. This should enhance not onl…

    Submitted 29 March, 2022; originally announced March 2022.

  27. arXiv:2201.13233  [pdf, other]

    cs.CY cs.HC

    "It's A Blessing and A Curse": Unpacking Creators' Practices with Non-Fungible Tokens (NFTs) and Their Communities

    Authors: Tanusree Sharma, Zhixuan Zhou, Yun Huang, Yang Wang

    Abstract: NFTs (Non-Fungible Tokens) are blockchain-based cryptographic tokens to represent ownership of unique content such as images, videos, or 3D objects. Despite NFTs' increasing popularity and skyrocketing trading prices, little is known about people's perceptions of and experiences with NFTs. In this work, we focus on NFT creators and present results of an exploratory qualitative study in which we in…

    Submitted 15 January, 2022; originally announced January 2022.

  28. BBM92 quantum key distribution over a free space dusty channel of 200 meters

    Authors: Sarika Mishra, Ayan Biswas, Satyajeet Patil, Pooja Chandravanshi, Vardaan Mongia, Tanya Sharma, Anju Rani, Shashi Prabhakar, S. Ramachandran, Ravindra P. Singh

    Abstract: Free space quantum communication assumes importance as it is a precursor for satellite-based quantum communication needed for secure key distribution over longer distances. Prepare and measure protocols like BB84 consider the satellite as a trusted device, which is fraught with security threat looking at the current trend for satellite-based optical communication. Therefore, entanglement-based pro…

    Submitted 9 January, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

    Comments: 7 pages, 6 figures, 2 tables

    Journal ref: Journal of Optics 24, 074002 (2022)

  29. arXiv:2110.09610  [pdf, other]

    cs.SE cs.LG

    A Survey on Machine Learning Techniques for Source Code Analysis

    Authors: Tushar Sharma, Maria Kechagia, Stefanos Georgiou, Rohit Tiwari, Indira Vats, Hadi Moazen, Federica Sarro

    Abstract: The advancements in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis, such as testing and vulnerability detection. Such a large number of studies hinders the community from understanding the current research landscape. This paper aims to summarize the current knowledge in applied machine learni…

    Submitted 13 September, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

  30. arXiv:2104.10466  [pdf, other]

    cs.SE eess.SY

    HDR-Fuzz: Detecting Buffer Overruns using AddressSanitizer Instrumentation and Fuzzing

    Authors: Raveendra Kumar Medicherla, Malathy Nagalakshmi, Tanya Sharma, Raghavan Komondoor

    Abstract: Buffer-overruns are a prevalent vulnerability in software libraries and applications. Fuzz testing is one of the effective techniques to detect vulnerabilities in general. Greybox fuzzers such as AFL automatically generate a sequence of test inputs for a given program using a fitness-guided search process. A recently proposed approach in the literature introduced a buffer-overrun specific fitness…

    Submitted 21 April, 2021; originally announced April 2021.

    ACM Class: D.2.5; K.6.5

  31. arXiv:2012.12324  [pdf, other]

    cs.SE

    Do We Need Improved Code Quality Metrics?

    Authors: Tushar Sharma, Diomidis Spinellis

    Abstract: The software development community has been using code quality metrics for the last five decades. Despite their wide adoption, code quality metrics have attracted a fair share of criticism. In this paper, first, we carry out a qualitative exploration by surveying software developers to gauge their opinions about current practices and potential gaps with the present set of metrics. We identify defi…

    Submitted 22 December, 2020; originally announced December 2020.

  32. arXiv:2008.04114  [pdf, other]

    cs.CV

    Improved Adaptive Type-2 Fuzzy Filter with Exclusively Two Fuzzy Membership Function for Filtering Salt and Pepper Noise

    Authors: Vikas Singh, Pooja Agrawal, Teena Sharma, Nishchal K. Verma

    Abstract: Image denoising is one of the preliminary steps in image processing methods in which the presence of noise can deteriorate the image quality. To overcome this limitation, in this paper an improved two-stage fuzzy filter is proposed for filtering salt and pepper noise from the images. In the first stage, the pixels in the image are categorized as good or noisy based on adaptive thresholding using ty…

    Submitted 10 August, 2020; originally announced August 2020.

  33. Computer and Network Security

    Authors: Jaydip Sen, Sidra Mehtab, Michael Ekonde Sone, Veeramreddy Jyothsna, Koneti Munivara Prasad, Rajeev Singh, Teek Parval Sharma, Anton Noskov, Ignacio Velasquez, Angelica Caro, Alfonco Rodriguez, Tamer S. A. Fatayer, Altaf O. Mulani, Pradeep B. Mane, Roshan Chitrakar, Roshan Bhusal, Prajwol Maharjan

    Abstract: In the era of Internet of Things and with the explosive worldwide growth of electronic data volume, and associated need of processing, analysis and storage of such humongous volume of data, several new challenges are faced in protecting privacy of sensitive data and securing systems by designing novel schemes for secure authentication, integrity protection, encryption and non-repudiation. Lightwei…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: 175 pages, 87 figures and 44 Tables

  34. arXiv:2007.13737  [pdf, other]

    cs.OH cs.HC q-bio.QM

    BIDEAL: A Toolbox for Bicluster Analysis -- Generation, Visualization and Validation

    Authors: Nishchal K. Verma, T. Sharma, S. Dixit, P. Agrawal, S. Sengupta, V. Singh

    Abstract: This paper introduces a novel toolbox named BIDEAL for the generation of biclusters, their analysis, visualization, and validation. The objective is to help researchers use forefront biclustering algorithms embedded on a single platform. A single toolbox comprising various biclustering algorithms plays a vital role to extract meaningful patterns from the data for detecting diseases, biomar…

    Submitted 26 July, 2020; originally announced July 2020.

  35. arXiv:2007.04444  [pdf, other]

    cs.CR cs.HC cs.SE

    Are PETs (Privacy Enhancing Technologies) Giving Protection for Smartphones? -- A Case Study

    Authors: Tanusree Sharma, Masooda Bashir

    Abstract: While smartphone technology has enhanced the way we interact with the world around us, it has also paved the way for easier access to our private and personal information. This has been amplified by the existence of numerous embedded sensors utilized by millions of apps to users. While mobile apps have positively transformed many aspects of our lives with new functionalities, many of these appli…

    Submitted 8 July, 2020; originally announced July 2020.

  36. arXiv:2004.14165  [pdf, ps, other]

    cs.CL

    Classification of Cuisines from Sequentially Structured Recipes

    Authors: Tript Sharma, Utkarsh Upadhyay, Ganesh Bagler

    Abstract: Cultures across the world are distinguished by the idiosyncratic patterns in their cuisines. These cuisines are characterized in terms of their substructures such as ingredients, cooking processes and utensils. A complex fusion of these substructures intrinsic to a region defines the identity of a cuisine. Accurate classification of cuisines based on their culinary features is an outstanding probl…

    Submitted 26 April, 2020; originally announced April 2020.

    Comments: 36th IEEE International Conference on Data Engineering (ICDE 2020), DECOR Workshop; 4 pages, 4 tables

  37. arXiv:2004.12283  [pdf, other]

    cs.SI physics.soc-ph

    Hierarchical Clustering of World Cuisines

    Authors: Tript Sharma, Utkarsh Upadhyay, Jushaan Kalra, Sakshi Arora, Saad Ahmad, Bhavay Aggarwal, Ganesh Bagler

    Abstract: Cultures across the world have evolved to have unique patterns despite shared ingredients and cooking techniques. Using data obtained from RecipeDB, an online resource for recipes, we extract patterns in 26 world cuisines and further probe for their inter-relatedness. By application of frequent itemset mining and ingredient authenticity we characterize the quintessential patterns in the cuisines a…

    Submitted 25 April, 2020; originally announced April 2020.

    Comments: 36th IEEE International Conference on Data Engineering (ICDE 2020), DECOR Workshop; 6 pages, 6 figures, 1 table

  38. arXiv:2003.07666  [pdf, other]

    physics.app-ph cs.LG

    Inverse Design of Potential Singlet Fission Molecules using a Transfer Learning Based Approach

    Authors: Akshay Subramanian, Utkarsh Saha, Tejasvini Sharma, Naveen K. Tailor, Soumitra Satapathi

    Abstract: Singlet fission has emerged as one of the most exciting phenomena known to improve the efficiencies of different types of solar cells and has found uses in diverse optoelectronic applications. The range of available singlet fission molecules is, however, limited as to undergo singlet fission, molecules have to satisfy certain energy conditions. Recent advances in material search using inverse desi…

    Submitted 17 March, 2020; originally announced March 2020.

    Comments: 15 pages, 4 figures. The first two authors contributed equally

  39. arXiv:2001.10832  [pdf, other]

    eess.AS cs.LG cs.MM cs.SD eess.IV

    Audio-Visual Decision Fusion for WFST-based and seq2seq Models

    Authors: Rohith Aralikatti, Sharad Roy, Abhinav Thanda, Dilip Kumar Margam, Pujitha Appan Kandala, Tanay Sharma, Shankar M Venkatesan

    Abstract: Under noisy conditions, speech recognition systems suffer from high Word Error Rates (WER). In such cases, information from the visual modality comprising the speaker lip movements can help improve the performance. In this work, we propose novel methods to fuse information from audio and visual modalities at inference time. This enables us to train the acoustic and visual models independently. Fir…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: Submitted for review to ICASSP 2020 on October 21st, 2019

  40. arXiv:1906.12170  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS eess.IV

    LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models

    Authors: Dilip Kumar Margam, Rohith Aralikatti, Tanay Sharma, Abhinav Thanda, Pujitha A K, Sharad Roy, Shankar M Venkatesan

    Abstract: In recent years, deep learning based machine lipreading has gained prominence. To this end, several architectures such as LipNet, LCANet and others have been proposed which perform extremely well compared to traditional lipreading DNN-HMM hybrid systems trained on DCT features. In this work, we propose a simpler architecture of 3D-2D-CNN-BLSTM network with a bottleneck layer. We also present analy…

    Submitted 25 June, 2019; originally announced June 2019.

    Comments: Submitted to Interspeech 2019

  41. arXiv:1906.11803  [pdf, ps, other]

    cs.CY

    Data Consortia

    Authors: Eric Bax, John Donald, Melissa Gerber, Lisa Giaffo, Tanisha Sharma, Nikki Thompson, Kimberly Williams

    Abstract: Today, web-based companies use user data to provide and enhance services to users, both individually and collectively. Some also analyze user data for other purposes, for example to select advertisements or price offers for users. Some even use or allow the data to be used to evaluate investments in financial markets. Users' concerns about how their data is or may be used has prompted legislative…

    Submitted 27 June, 2019; originally announced June 2019.

  42. On the Feasibility of Transfer-learning Code Smells using Deep Learning

    Authors: Tushar Sharma, Vasiliki Efstathiou, Panos Louridas, Diomidis Spinellis

    Abstract: Context: A substantial amount of work has been done to detect smells in source code using metrics-based and heuristics-based methods. Machine learning methods have been recently applied to detect source code smells; however, the current practices are considered far from mature. Objective: First, explore the feasibility of applying deep learning models to detect smells without extensive feature eng…

    Submitted 16 September, 2019; v1 submitted 5 April, 2019; originally announced April 2019.

  43. arXiv:1808.02861  [pdf, other]

    cs.CV

    Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance

    Authors: Ramprasaath R. Selvaraju, Prithvijit Chattopadhyay, Mohamed Elhoseiny, Tilak Sharma, Dhruv Batra, Devi Parikh, Stefan Lee

    Abstract: Individual neurons in convolutional neural networks supervised for image-level classification tasks have been shown to implicitly learn semantically meaningful concepts ranging from simple textures and shapes to whole or partial objects - forming a "dictionary" of concepts acquired through the learning process. In this work we introduce a simple, efficient zero-shot learning approach based on this…

    Submitted 8 August, 2018; originally announced August 2018.

    Comments: In Proceedings of ECCV 2018

  44. arXiv:1804.04353  [pdf, other]

    eess.AS cs.AI eess.SP stat.ML

    Global SNR Estimation of Speech Signals using Entropy and Uncertainty Estimates from Dropout Networks

    Authors: Rohith Aralikatti, Dilip Margam, Tanay Sharma, Thanda Abhinav, Shankar M Venkatesan

    Abstract: This paper demonstrates two novel methods to estimate the global SNR of speech signals. In both methods, Deep Neural Network-Hidden Markov Model (DNN-HMM) acoustic model used in speech recognition systems is leveraged for the additional task of SNR estimation. In the first method, the entropy of the DNN-HMM output is computed. Recent work on bayesian deep learning has shown that a DNN-HMM trained…

    Submitted 12 April, 2018; originally announced April 2018.

  45. arXiv:1701.02704  [pdf, other]

    cs.CV

    What are the visual features underlying human versus machine vision?

    Authors: Drew Linsley, Sven Eberhardt, Tarun Sharma, Pankaj Gupta, Thomas Serre

    Abstract: Although Deep Convolutional Networks (DCNs) are approaching the accuracy of human observers at object recognition, it is unknown whether they leverage similar visual representations to achieve this performance. To address this, we introduce Clicktionary, a web-based game for identifying visual features used by human observers during object recognition. Importance maps derived from the game are con…

    Submitted 7 November, 2017; v1 submitted 10 January, 2017; originally announced January 2017.

    Comments: 9 pages, 7 figures

  46. arXiv:1605.01802  [pdf]

    cs.DC

    Multiple K Means++ Clustering of Satellite Image Using Hadoop MapReduce and Spark

    Authors: Tapan Sharma, Dr. Vinod Shokeen, Dr. Sunil Mathur

    Abstract: Clustering of image is one of the important steps of mining satellite images. In our experiment we have simultaneously run multiple K-means algorithms with different initial centroids and values of k in the same iteration of MapReduce jobs. For initialization of initial centroids we have implemented Scalable K-Means++ MapReduce (MR) job [1]. We have also run a validation algorithm of Simplified Si…

    Submitted 5 May, 2016; originally announced May 2016.

    Comments: 9 Pages, Distributed Computing, Satellite Images, Clustering, Published in International Journal of Advanced Studies in Computer Science and Engineering, IJASCSE volume 5 issue 4, 2016

  47. arXiv:1604.08379  [pdf, other]

    cs.GT

    Balanced Ranking Mechanisms

    Authors: Debasis Mishra, Tridib Sharma

    Abstract: In the private values single object auction model, we construct a satisfactory mechanism - a symmetric, dominant strategy incentive compatible, and budget-balanced mechanism. Our mechanism allocates the object to the highest valued agent with more than 99% probability provided there are at least 14 agents. It is also ex-post individually rational. We show that our mechanism is optimal in a restric…

    Submitted 28 April, 2016; originally announced April 2016.

  48. Private Data Transfer over a Broadcast Channel

    Authors: Manoj Mishra, Tanmay Sharma, Bikash K. Dey, Vinod M. Prabhakaran

    Abstract: We study the following private data transfer problem: Alice has a database of files. Bob and Cathy want to access a file each from this database (which may or may not be the same file), but each of them wants to ensure that their choices of file do not get revealed even if Alice colludes with the other user. Alice, on the other hand, wants to make sure that each of Bob and Cathy does not learn any…

    Submitted 16 April, 2015; v1 submitted 5 April, 2015; originally announced April 2015.

    Comments: To be presented at IEEE International Symposium on Information Theory (ISIT 2015), Hong Kong

  49. arXiv:1405.0787  [pdf]

    cs.CR

    Analysis of Email Fraud detection using WEKA Tool

    Authors: Tarushi Sharma, AmanPreet Kaur

    Abstract: Data mining is also being useful to give solutions for invasion finding and auditing. While data mining has several applications in protection, there are also serious privacy fears. Because of email mining, even inexperienced users can connect data and make responsive associations. Therefore we must implement the privacy of persons while working on practical data mining.

    Submitted 5 May, 2014; originally announced May 2014.

  50. arXiv:1302.0965  [pdf]

    cs.NI

    Adaptive Energy Aware Data Aggregation Tree for Wireless Sensor Networks

    Authors: Deepali Virmani, Tanu Sharma, Ritu Sharma

    Abstract: To meet the demands of wireless sensor networks (WSNs) where data are usually aggregated at a single source prior to transmitting to any distant user, there is a need to establish a tree structure inside to aggregate data. In this paper, an adaptive energy aware data aggregation tree (AEDT) is proposed. The proposed tree uses the maximum energy available node as the data aggregator node. The tree…

    Submitted 5 February, 2013; originally announced February 2013.

    Comments: 12 pages, 8 figures, International Journal of Hybrid Information Technology Vol. 6, No. 1, January, 2013