doi 10.1145/3661455.3669867

Reconfiguring Participatory Design to Resist AI Realism

Abstract: The growing trend of artificial intelligence (AI) as a solution to social and technical problems reinforces AI Realism -- the belief that AI is an inevitable and natural order. In response, this paper argues that participatory design (PD), with its focus on democratic values and processes, can play a role in questioning and resisting AI Realism. I examine three concerning aspects of AI Realism: the facade of democratization that lacks true empowerment, demands for human adaptability in contrast to AI systems' inflexibility, and the obfuscation of essential human labor enabling the AI system. I propose resisting AI Realism by reconfiguring PD to continue engaging with value-centered visions, increasing its exploration of non-AI alternatives, and making the essential human labor underpinning AI systems visible. I position PD as a means to generate friction against AI Realism and open space for alternative futures centered on human needs and values.

Submitted 8 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

Comments: 6 pages, 1 table

Journal ref: Participatory Design Conference 2024

arXiv:2405.15669 [pdf, other]

doi 10.1145/3643834.3660730

Enhancing Reentry Support Programs Through Digital Literacy Integration

Authors: Aakash Gautam, Khushboo Gandhi, Jessica Eileen Sendejo

Abstract: Challenges faced by formerly incarcerated individuals in the United States raise questions about our society's ability to truly provide second chances. This paper presents the outcomes of our ongoing collaboration with a non-profit organization dedicated to reentry support. We highlight the multifaceted challenges individuals face during their reentry journey, including support programs that prioritize supervision over service, unresponsive support systems, limited access to resources, financial struggles exacerbated by restricted employment opportunities, and technological barriers. In the face of such complex social challenges, our work aims to facilitate our partner organization's ongoing efforts to promote digital literacy through a web application that is integrated into their existing processes. We share initial feedback from the stakeholders, draw out four implications: supporting continuity of care, promoting reflection through slow technology, building in flexibility, and reconfiguring toward existing infrastructure, and conclude with a reflection on our role as partners on the side.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 15 pages, 1 table, 3 figures

Journal ref: Designing Interactive Systems Conference 2024

arXiv:2404.07775 [pdf, other]

Discourse-Aware In-Context Learning for Temporal Expression Normalization

Authors: Akash Kumar Gautam, Lukas Lange, Jannik Strötgen

Abstract: Temporal expression (TE) normalization is a well-studied problem. However, the predominantly used rule-based systems are highly restricted to specific settings, and upcoming machine learning approaches suffer from a lack of labeled data. In this work, we explore the feasibility of proprietary and open-source large language models (LLMs) for TE normalization using in-context learning to inject task, document, and example information into the model. We explore various sample selection strategies to retrieve the most relevant set of examples. By using a window-based prompt design approach, we can perform TE normalization across sentences, while leveraging the LLM knowledge without training the model. Our experiments show results competitive with models designed for this task. In particular, our method achieves large performance improvements for non-standard settings by dynamically including relevant examples during inference.

Submitted 11 April, 2024; originally announced April 2024.

Comments: Accepted at NAACL 2024

arXiv:2403.08773 [pdf, other]

Veagle: Advancements in Multimodal Representation Learning

Authors: Rajat Chawla, Arkajit Datta, Tushar Verma, Adarsh Jha, Anmol Gautam, Ayush Vatsal, Sukrit Chaterjee, Mukunda NS, Ishaan Bhola

Abstract: Lately, researchers in artificial intelligence have been really interested in how language and vision come together, giving rise to the development of multimodal models that aim to seamlessly integrate textual and visual information. Multimodal models, an extension of Large Language Models (LLMs), have exhibited remarkable capabilities in addressing a diverse array of tasks, ranging from image captioning and visual question answering (VQA) to visual grounding. While these models have showcased significant advancements, challenges persist in accurately interpreting images and answering the question, a common occurrence in real-world scenarios. This paper introduces a novel approach to enhance the multimodal capabilities of existing models. In response to the limitations observed in current Vision Language Models (VLMs) and Multimodal Large Language Models (MLLMs), our proposed model, Veagle, incorporates a unique mechanism inspired by the successes and insights of previous works. Veagle leverages a dynamic mechanism to project encoded visual information directly into the language model. This dynamic approach allows for a more nuanced understanding of intricate details present in visual contexts. To validate the effectiveness of Veagle, we conduct comprehensive experiments on benchmark datasets, emphasizing tasks such as visual question answering and image understanding. Our results indicate an improvement of 5-6% in performance, with Veagle outperforming existing models by a notable margin. The outcomes underscore the model's versatility and applicability beyond traditional benchmarks.

Submitted 18 January, 2024; originally announced March 2024.

arXiv:2402.07173 [pdf, other]

INSITE: labelling medical images using submodular functions and semi-supervised data programming

Authors: Akshat Gautam, Anurag Shandilya, Akshit Srivastava, Venkatapathy Subramanian, Ganesh Ramakrishnan, Kshitij Jadhav

Abstract: The necessity of large amounts of labeled data to train deep models, especially in medical imaging, creates an implementation bottleneck in resource-constrained settings. In Insite (labelINg medical imageS usIng submodular funcTions and sEmi-supervised data programming) we apply informed subset selection to identify a small number of the most representative or diverse images from a huge pool of unlabelled data, which are subsequently annotated by a domain expert. The newly annotated images are then used as exemplars to develop several data programming-driven labelling functions. These labelling functions output a predicted label and a similarity score when given an unlabelled image as an input. A consensus is brought amongst the outputs of these labelling functions by using a label aggregator function to assign the final predicted label to each unlabelled data point. We demonstrate that informed subset selection followed by semi-supervised data programming methods using these images as exemplars perform better than other state-of-the-art semi-supervised methods. Further, for the first time we demonstrate that this can be achieved through a small set of images used as exemplars.

Submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.06159 [pdf, other]

Passwords Are Meant to Be Secret: A Practical Secure Password Entry Channel for Web Browsers

Authors: Anuj Gautam, Tarun Kumar Yadav, Kent Seamons, Scott Ruoti

Abstract: Password-based authentication faces various security and usability issues. Password managers help alleviate some of these issues by enabling users to manage their passwords effectively. However, malicious client-side scripts and browser extensions can steal passwords after they have been autofilled by the manager into the web page. In this paper, we explore what role the password manager can take in preventing the theft of autofilled credentials without requiring a change to user behavior. To this end, we identify a threat model for password exfiltration and then use this threat model to explore the design space for secure password entry implemented using a password manager. We identify five potential designs that address this issue, each with varying security and deployability tradeoffs. Our analysis shows the design that best balances security and usability is for the manager to autofill a fake password and then rely on the browser to replace the fake password with the actual password immediately before the web request is handed over to the operating system to be transmitted over the network. This removes the ability for malicious client-side scripts or browser extensions to access and exfiltrate the real password. We implement our design in the Firefox browser and conduct experiments, which show that it successfully thwarts malicious scripts and extensions on 97% of the Alexa top 1000 websites, while also maintaining the capability to revert to default behavior on the remaining websites, avoiding functionality regressions. Most importantly, this design is transparent to users, requiring no change to user behavior.

Submitted 8 February, 2024; originally announced February 2024.

arXiv:2402.03255 [pdf, other]

Security Advice for Parents and Children About Content Filtering and Circumvention as Found on YouTube and TikTok

Authors: Ran Elgedawy, John Sadik, Anuj Gautam, Trinity Bissahoyo, Christopher Childress, Jacob Leonard, Clay Shubert, Scott Ruoti

Abstract: In today's digital age, concerns about online security and privacy have become paramount. However, addressing these issues can be difficult, especially within the context of family relationships, wherein parents and children may have conflicting interests. In this environment, parents and children may turn to online security advice to determine how to proceed. In this paper, we examine the advice available to parents and children regarding content filtering and circumvention as found on YouTube and TikTok. In an analysis of 839 videos returned from queries on these topics, we found that half (n=399) provide relevant advice. Our results show that of these videos, roughly three-quarters are accurate, with the remaining one-fourth containing factually incorrect advice. We find that videos targeting children are more likely to be both incorrect and actionable than videos targeting parents, leaving children at increased risk of taking harmful action. Moreover, we find that while advice videos targeting parents will occasionally discuss the ethics of content filtering and device monitoring (including recommendations to respect children's autonomy), no such discussion of the ethics or risks of circumventing content filtering is given to children, leaving them unaware of any risks that may be involved with doing so. Ultimately, our research indicates that video-based social media sites are already effective sources of security advice propagation and that the public would benefit from security researchers and practitioners engaging more with these platforms, both for the creation of content and of tools designed to help with more effective filtering.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 15 pages, 5 figures, 8 tables

arXiv:2402.00689 [pdf, other]

Ocassionally Secure: A Comparative Analysis of Code Generation Assistants

Authors: Ran Elgedawy, John Sadik, Senjuti Dutta, Anuj Gautam, Konstantinos Georgiou, Farzin Gholamrezae, Fujiao Ji, Kyungchan Lim, Qian Liu, Scott Ruoti

Abstract: Large Language Models (LLMs) are being increasingly utilized in various applications, with code generation being a notable example. While previous research has shown that LLMs have the capability to generate both secure and insecure code, the literature does not take into account what factors help generate secure and effective code. Therefore, in this paper we focus on identifying and understanding the conditions and contexts in which LLMs can be effectively and safely deployed in real-world scenarios to generate quality code. We conducted a comparative analysis of four advanced LLMs--GPT-3.5 and GPT-4 using ChatGPT and Bard and Gemini from Google--using 9 separate tasks to assess each model's code generation capabilities. We contextualized our study to represent the typical use cases of a real-life developer employing LLMs for everyday tasks at work. Additionally, we place an emphasis on security awareness, which is represented through the use of two distinct versions of our developer persona. In total, we collected 61 code outputs and analyzed them across several aspects: functionality, security, performance, complexity, and reliability. These insights are crucial for understanding the models' capabilities and limitations, guiding future development and practical applications in the field of automated code generation.

Submitted 1 February, 2024; originally announced February 2024.

Comments: 12 pages, 2 figures

arXiv:2401.01285 [pdf, ps, other]

doi 10.1145/3626252.3630926

Socially Responsible Computing in an Introductory Course

Authors: Aakash Gautam, Anagha Kulkarni, Sarah Hug, Jane Lehr, Ilmi Yoon

Abstract: Given the potential for technology to inflict harm and injustice on society, it is imperative that we cultivate a sense of social responsibility among our students as they progress through the Computer Science (CS) curriculum. Our students need to be able to examine the social complexities in which technology development and use are situated. Also, aligning students' personal goals and their ability to achieve them in their field of study is important for promoting motivation and a sense of belonging. Promoting communal goals while learning computing can help broaden participation, particularly among groups who have been historically marginalized in computing. Keeping these considerations in mind, we piloted an introductory Java programming course in which activities engaging students in ethical and socially responsible considerations were integrated across modules. Rather than adding social on top of the technical content, our curricular approach seeks to weave them together. The data from the class suggests that the students found the inclusion of the social context in the technical assignments to be more motivating and expressed greater agency in realizing social change. We share our approach to designing this new introductory socially responsible computing course and the students' reflections. We also highlight seven considerations for educators seeking to incorporate socially responsible computing.

Submitted 2 January, 2024; originally announced January 2024.

Journal ref: Proceedings of the 55th ACM Technical Symposium on Computer Science Education (SIGCSE 2024)

arXiv:2311.14786 [pdf, other]

GPT-4V Takes the Wheel: Promises and Challenges for Pedestrian Behavior Prediction

Authors: Jia Huang, Peng Jiang, Alvika Gautam, Srikanth Saripalli

Abstract: Predicting pedestrian behavior is key to ensuring the safety and reliability of autonomous vehicles. While deep learning methods have been promising by learning from annotated video frame sequences, they often fail to fully grasp the dynamic interactions between pedestrians and traffic, crucial for accurate predictions. These models also lack nuanced common sense reasoning. Moreover, the manual annotation of datasets for these models is expensive and challenging to adapt to new situations. The advent of Vision Language Models (VLMs) introduces promising alternatives to these issues, thanks to their advanced visual and causal reasoning skills. To our knowledge, this research is the first to conduct both quantitative and qualitative evaluations of VLMs in the context of pedestrian behavior prediction for autonomous driving. We evaluate GPT-4V(ision) on publicly available pedestrian datasets: JAAD and WiDEVIEW. Our quantitative analysis focuses on GPT-4V's ability to predict pedestrian behavior in current and future frames. The model achieves 57% accuracy in a zero-shot manner, which, while impressive, is still behind the state-of-the-art domain-specific models (70%) in predicting pedestrian crossing actions. Qualitatively, GPT-4V shows an impressive ability to process and interpret complex traffic scenarios, differentiate between various pedestrian behaviors, and detect and analyze groups. However, it faces challenges, such as difficulty in detecting smaller pedestrians and assessing the relative motion between pedestrians and the ego vehicle.

Submitted 25 January, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

arXiv:2310.10290 [pdf, other]

Autonomous Mapping and Navigation using Fiducial Markers and Pan-Tilt Camera for Assisting Indoor Mobility of Blind and Visually Impaired People

Authors: Dharmateja Adapa, Virendra Singh Shekhawat, Avinash Gautam, Sudeept Mohan

Abstract: Large indoor spaces have complex layouts, making them difficult to navigate. Indoor spaces in hospitals, universities, shopping complexes, etc., carry multi-modal information in the form of text and symbols. Hence, it is difficult for Blind and Visually Impaired (BVI) people to independently navigate such spaces. Indoor environments are usually GPS-denied; therefore, Bluetooth-based, WiFi-based, or Range-based methods are used for localization. These methods have high setup costs, lower accuracy, and sometimes need special sensing equipment. We propose a Visual Assist (VA) system for the indoor navigation of BVI individuals using visual Fiducial markers for localization. State-of-the-art (SOTA) approaches for visual localization using Fiducial markers use fixed cameras having a narrow field of view. These approaches stop tracking the markers when they are out of sight. We employ a Pan-Tilt turret-mounted camera which enhances the field of view to 360° for enhanced marker tracking. We, therefore, need fewer markers for mapping and navigation. The efficacy of the proposed VA system is measured on three metrics, i.e., RMSE (Root Mean Square Error), ADNN (Average Distance to Nearest Neighbours), and ATE (Absolute Trajectory Error). Our system outperforms Hector-SLAM, ORB-SLAM3, and UcoSLAM. The proposed system achieves localization accuracy within $\pm 8$ cm compared to $\pm 12$ cm and $\pm 10$ cm for ORB-SLAM3 and UcoSLAM, respectively.

Submitted 16 October, 2023; originally announced October 2023.

ACM Class: I.3.5; H.5.2

arXiv:2309.16057 [pdf, other]

WiDEVIEW: An UltraWideBand and Vision Dataset for Deciphering Pedestrian-Vehicle Interactions

Authors: Jia Huang, Alvika Gautam, Junghun Choi, Srikanth Saripalli

Abstract: Robust and accurate tracking and localization of road users like pedestrians and cyclists is crucial to ensure safe and effective navigation of Autonomous Vehicles (AVs), particularly so in urban driving scenarios with complex vehicle-pedestrian interactions. Existing datasets that are useful to investigate vehicle-pedestrian interactions are mostly image-centric and thus vulnerable to vision failures. In this paper, we investigate Ultra-wideband (UWB) as an additional modality for road users' localization to enable a better understanding of vehicle-pedestrian interactions. We present WiDEVIEW, the first multimodal dataset that integrates LiDAR, three RGB cameras, GPS/IMU, and UWB sensors for capturing vehicle-pedestrian interactions in an urban autonomous driving scenario. Ground truth image annotations are provided in the form of 2D bounding boxes and the dataset is evaluated on standard 2D object detection and tracking algorithms. The feasibility of UWB is evaluated for typical traffic scenarios in both line-of-sight and non-line-of-sight conditions using LiDAR as ground truth. We establish that UWB range data has comparable accuracy with LiDAR with an error of 0.19 meters and reliable anchor-tag range data for up to 40 meters in line-of-sight conditions. UWB performance for non-line-of-sight conditions is subject to the nature of the obstruction (trees vs. buildings). Further, we provide a qualitative analysis of UWB performance for scenarios susceptible to intermittent vision failures. The dataset can be downloaded via https://github.com/unmannedlab/UWB_Dataset.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2305.15441 [pdf, other]

Improving few-shot learning-based protein engineering with evolutionary sampling

Authors: M. Zaki Jawaid, Robin W. Yeo, Aayushma Gautam, T. Blair Gainous, Daniel O. Hart, Timothy P. Daley

Abstract: Designing novel functional proteins remains a slow and expensive process due to a variety of protein engineering challenges; in particular, the number of protein variants that can be experimentally tested in a given assay pales in comparison to the vastness of the overall sequence space, resulting in low hit rates and expensive wet lab testing cycles. In this paper, we propose a few-shot learning approach to novel protein design that aims to accelerate the expensive wet lab testing cycle and is capable of leveraging a training dataset that is both small and skewed ($\approx 10^5$ datapoints, $< 1\%$ positive hits). Our approach is composed of two parts: a semi-supervised transfer learning approach to generate a discrete fitness landscape for a desired protein function and a novel evolutionary Monte Carlo Markov Chain sampling algorithm to more efficiently explore the fitness landscape. We demonstrate the performance of our approach by experimentally screening predicted high fitness gene activators, resulting in a dramatically improved hit rate compared to existing methods. Our method can be easily adapted to other protein engineering and design problems, particularly where the cost associated with obtaining labeled data is significantly high. We have provided open source code for our method at https://github.com/SuperSecretBioTech/evolutionary_monte_carlo_search.

Submitted 25 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

arXiv:2305.13051 [pdf, other]

Learning Pedestrian Actions to Ensure Safe Autonomous Driving

Authors: Jia Huang, Alvika Gautam, Srikanth Saripalli

Abstract: To ensure safe autonomous driving in urban environments with complex vehicle-pedestrian interactions, it is critical for Autonomous Vehicles (AVs) to have the ability to predict pedestrians' short-term and immediate actions in real-time. In recent years, various methods have been developed to study estimating pedestrian behaviors for autonomous driving scenarios, but there is a lack of clear definitions for pedestrian behaviors. In this work, the literature gaps are investigated and a taxonomy is presented for pedestrian behavior characterization. Further, a novel multi-task sequence to sequence Transformer encoders-decoders (TF-ed) architecture is proposed for pedestrian action and trajectory prediction using only ego vehicle camera observations as inputs. The proposed approach is compared against an existing LSTM encoders decoders (LSTM-ed) architecture for action and trajectory prediction. The performance of both models is evaluated on the publicly available Joint Attention Autonomous Driving (JAAD) dataset, CARLA simulation data, as well as real-time self-driving shuttle data collected on a university campus. Evaluation results illustrate that the proposed method reaches an accuracy of 81% on the action prediction task on JAAD testing data and outperforms the LSTM-ed by 7.4%, while the LSTM counterpart performs much better on the trajectory prediction task for a prediction sequence length of 25 frames.

Submitted 22 May, 2023; originally announced May 2023.

Comments: 8 pages, 9 figures

arXiv:2209.07861 [pdf, other]

Machine Learning Decoder for 5G NR PUCCH Format 0

Authors: Anil Kumar Yerrapragada, Jeeva Keshav S, Ankit Gautam, Radha Krishna Ganti

Abstract: 5G cellular systems depend on the timely exchange of feedback control information between the user equipment and the base station. Proper decoding of this control information is necessary to set up and sustain high throughput radio links. This paper makes the first attempt at using Machine Learning techniques to improve the decoding performance of the Physical Uplink Control Channel Format 0. We use fully connected neural networks to classify the received samples based on the uplink control information content embedded within them. The trained neural network, tested on real-time wireless captures, shows significant improvement in accuracy over conventional DFT-based decoders, even at low SNR. The obtained accuracy results also demonstrate conformance with 3GPP requirements.

Submitted 26 August, 2022; originally announced September 2022.

Comments: Submitted to NCC conference

arXiv:2207.09126 [pdf, other]

doi 10.1145/3536169.3537781

Empowering Participation Within Structures of Dependency

Authors: Aakash Gautam, Deborah Tatar

Abstract: Participatory Design (PD) seeks political change to support people's democratic control over processes, solutions, and, in general, matters of concern to them. A particular challenge remains in supporting vulnerable groups to gain power and control when they are dependent on organizations and external structures. We reflect on our five-year engagement with survivors of sex trafficking in Nepal and an anti-trafficking organization that supports the survivors. Arguing that the prevalence of a deficit perspective in the setting promotes dependency and robs the survivors of agency, we sought to bring change by exploring possibilities based on the survivors' existing assets. Three configurations illuminate how our design decisions and collective exploration operate to empower participation while attending to the substantial power implicitly and explicitly manifest in existing structures. We highlight the challenges we faced, uncovering actions that PD practitioners can take, including an emphasis on collaborative entanglements, attending to contingent factors, and encouraging provisional collectives.

Submitted 19 July, 2022; originally announced July 2022.

Comments: 12 pages, 3 figures, 1 table. In Participatory Design Conference 2022: Volume 1 (pp. 75-86)

Journal ref: In Participatory Design Conference 2022: Volume 1 (pp. 75-86) 2022

arXiv:2104.10017 [pdf, other]

doi 10.1145/3485832.3485884

The Emperor's New Autofill Framework: A Security Analysis of Autofill on iOS and Android

Authors: Sean Oesch, Anuj Gautam, Scott Ruoti

Abstract: Password managers help users more effectively manage their passwords, encouraging them to adopt stronger passwords across their many accounts. In contrast to desktop systems where password managers receive no system-level support, mobile operating systems provide autofill frameworks designed to integrate with password managers to provide secure and usable autofill for browsers and other apps installed on mobile devices. In this paper, we evaluate mobile autofill frameworks on iOS and Android, examining whether they achieve substantive benefits over the ad-hoc desktop environment or become a problematic single point of failure. Our results find that while the frameworks address several common issues, they also enforce insecure behavior and fail to provide password managers sufficient information to override the frameworks' insecure behavior, resulting in mobile managers being less secure than their desktop counterparts overall. We also demonstrate how these frameworks act as a confused deputy in manager-assisted credential phishing attacks. Our results demonstrate the need for significant improvements to mobile autofill frameworks. We conclude the paper with recommendations for the design and implementation of secure autofill frameworks.

Submitted 28 September, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

Comments: 12 pages, 3 pages appendix, published at ACSAC 2021

arXiv:2101.11425 [pdf, other]

Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task

Authors: Akansha Gautam, Venktesh V, Sarah Masud

Abstract: With the ease of access to information, and its rapid dissemination over the internet (both velocity and volume), it has become challenging to filter out truthful information from fake ones. The research community is now faced with the task of automatic detection of fake news, which carries real-world socio-political impact. One such research contribution came in the form of the Constraint@AAAI2021 Shared Task on COVID19 Fake News Detection in English. In this paper, we shed light on a novel method we proposed as a part of this shared task. Our team introduced an approach to combine topical distributions from Latent Dirichlet Allocation (LDA) with contextualized representations from XLNet. We also compared our method with existing baselines to show that XLNet + Topic Distributions outperforms other approaches by attaining an F1-score of 0.967.

Submitted 12 January, 2021; originally announced January 2021.

Comments: Accepted at CONSTRAINT@AAAI2021 Shared Task for the CONSTRAINT workshop, collocated with AAAI 2021

arXiv:2008.06854 [pdf, other]

SGG: Spinbot, Grammarly and GloVe based Fake News Detection

Authors: Akansha Gautam, Koteswar Rao Jerripothula

Abstract: Recently, news consumption using online news portals has increased exponentially due to several reasons, such as low cost and easy accessibility. However, such online platforms inadvertently also become the cause of spreading false information across the web. They are being misused quite frequently as a medium to disseminate misinformation and hoaxes. Such malpractices call for a robust automatic fake news detection system that can keep such misinformation and hoaxes at bay. We propose a robust yet simple fake news detection system, leveraging the tools for paraphrasing, grammar-checking, and word-embedding. In this paper, we try to explore the potential of these tools in jointly unearthing the authenticity of a news article. Notably, we leverage Spinbot (for paraphrasing), Grammarly (for grammar-checking), and GloVe (for word-embedding) tools for this purpose. Using these tools, we were able to extract novel features that could yield state-of-the-art results on the Fake News AMT dataset and comparable results on Celebrity datasets when combined with some of the essential features. More importantly, the proposed method is found to be more robust empirically than the existing ones, as revealed in our cross-domain analysis and multi-domain analysis.

Submitted 16 August, 2020; originally announced August 2020.

Comments: 9 pages, 7 figures, Accepted by IEEE International Conference on Multimedia Big Data (BigMM), 2020

arXiv:2005.03534 [pdf, other]

p for political: Participation Without Agency Is Not Enough

Authors: Aakash Gautam, Deborah Tatar

Abstract: Participatory Design's vision of democratic participation assumes participants' feelings of agency in envisioning a collective future. But this assumption may be leaky when dealing with vulnerable populations. We reflect on the results of a series of activities aimed at supporting agentic-future-envisionment with a group of sex-trafficking survivors in Nepal. We observed a growing sense among the survivors that they could play a role in bringing about change in their families. They also became aware of how they could interact with available institutional resources. Reflecting on the observations, we argue that building participant agency on the small and personal interactions is necessary before demanding larger Political participation. In particular, a value of PD, especially for vulnerable populations, can lie in the process itself if it helps participants position themselves as actors in the larger world.

Submitted 7 May, 2020; originally announced May 2020.

Comments: 5 pages, 1 figure. Accepted in the 16th Participatory Design Conference (PDC'20)

ACM Class: K.4

arXiv:2005.01459 [pdf, other]

doi 10.1145/3313831.3376647

Crafting, Communality, and Computing: Building on Existing Strengths To Support a Vulnerable Population

Authors: Aakash Gautam, Deborah Tatar, Steve Harrison

Abstract: In Nepal, sex-trafficking survivors and the organizations that support them have limited resources to assist the survivors in their on-going journey towards reintegration. We take an asset-based approach wherein we identify and build on the strengths possessed by such groups. In this work, we present reflections from introducing a voice-annotated web application to a group of survivors. The web application tapped into and built upon two elements of pre-existing strengths possessed by the survivors -- the social bond between them and knowledge of crafting as taught to them by the organization. Our findings provide insight into the array of factors influencing how the survivors act in relation to one another as they created novel use practices and adapted the technology. Experience with the application seemed to open knowledge of computing as a potential source of strength. Finally, we articulate three design desiderata that could help promote communal spaces: make activity perceptible to the group, create appropriable steps, and build in fun choices.

Submitted 4 May, 2020; originally announced May 2020.

Comments: 14 pages, 1 figure. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI'20)

ACM Class: K.4

arXiv:2004.07359 [pdf, other]

Usable, Acceptable, Appropriable: Towards Practicable Privacy

Authors: Aakash Gautam

Abstract: A majority of the work on digital privacy and security has focused on users from developed countries who account for only around 20% of the global population. Moreover, the privacy needs of populations that are already marginalized and vulnerable differ from those of users who have the privilege of access to a greater social support system. We reflect on our experiences of introducing computers and the Internet to a group of sex-trafficking survivors in Nepal and highlight a few socio-political factors that have influenced the design space around digital privacy. These factors include the population's limited digital and text literacy skills and the fear of stigma against trafficked persons widely prevalent in Nepali society. We underscore the need to widen our perspective by focusing on practicable privacy, that is, privacy practices that are (1) usable, (2) acceptable, and (3) appropriable.

Submitted 15 April, 2020; originally announced April 2020.

Comments: 6 pages, position paper submitted to the CHI 2020 workshop on Networked Privacy

ACM Class: H.0

arXiv:2004.02168 [pdf, other]

Comparative Analysis of Multiple Deep CNN Models for Waste Classification

Authors: Dipesh Gyawali, Alok Regmi, Aatish Shakya, Ashish Gautam, Surendra Shrestha

Abstract: Waste is wealth in the wrong place. Our research focuses on analyzing possibilities for automatic waste sorting and collecting in a way that supports the further recycling process. Various approaches to managing waste are being practiced, but they are not efficient and require human intervention. Automatic waste segregation would fill this gap. The project tested well-known Deep Learning Network architectures for waste classification with a dataset combined from our own endeavors and Trash Net. A convolutional neural network is used for image classification. Hardware built in the form of a dustbin is used to segregate the waste into different compartments. By removing the human effort in segregating waste products, the study would save precious time and introduce automation in the area of waste management. Municipal solid waste is a huge, renewable source of energy. The situation is a win-win for government, society, and industrialists. With fine-tuning of the ResNet18 network, the best validation accuracy was found to be 87.8%.

Submitted 14 August, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

Comments: 6 pages, 13 figures

Journal ref: 5th International Conference on Advanced Engineering and ICT-Convergence 2020

arXiv:2003.00826 [pdf]

Realistic River Image Synthesis using Deep Generative Adversarial Networks

Authors: Akshat Gautam, Muhammed Sit, Ibrahim Demir

Abstract: In this paper, we demonstrated a practical application of realistic river image generation using deep learning. Specifically, we explored a generative adversarial network (GAN) model capable of generating high-resolution and realistic river images that can be used to support modeling and analysis in surface water estimation, river meandering, wetland loss, and other hydrological research studies. First, we have created an extensive repository of overhead river images to be used in training. Second, we incorporated the Progressive Growing GAN (PGGAN), a network architecture that iteratively trains smaller-resolution GANs to gradually build up to a very high resolution to generate high quality (i.e., 1024x1024) synthetic river imagery. With simpler GAN architectures, difficulties arose in terms of exponential increase of training time and vanishing/exploding gradient issues, which the PGGAN implementation seemed to significantly reduce. The results presented in this study show great promise in generating high-quality images and capturing the details of river structure and flow to support hydrological research, which often requires extensive imagery for model performance.

Submitted 27 July, 2021; v1 submitted 14 February, 2020; originally announced March 2020.

arXiv:2001.09215 [pdf, other]

An Iterative Approach for Identifying Complaint Based Tweets in Social Media Platforms

Authors: Gyanesh Anand, Akash Gautam, Puneet Mathur, Debanjan Mahata, Rajiv Ratn Shah, Ramit Sawhney

Abstract: Twitter is a social media platform where users express opinions over a variety of issues. Posts offering grievances or complaints can be utilized by private/public organizations to improve their service and promptly gauge a low-cost assessment. In this paper, we propose an iterative methodology which aims to identify complaint based posts pertaining to the transport domain. We perform comprehensive evaluations along with releasing a novel dataset for research purposes.

Submitted 17 June, 2020; v1 submitted 24 January, 2020; originally announced January 2020.

Comments: Preprint of paper accepted at AAAI, student abstract 2020

arXiv:1912.06927 [pdf, other]

#MeTooMA: Multi-Aspect Annotations of Tweets Related to the MeToo Movement

Authors: Akash Gautam, Puneet Mathur, Rakesh Gosangi, Debanjan Mahata, Ramit Sawhney, Rajiv Ratn Shah

Abstract: In this paper, we present a dataset containing 9,973 tweets related to the MeToo movement that were manually annotated for five different linguistic aspects: relevance, stance, hate speech, sarcasm, and dialogue acts. We present a detailed account of the data collection and annotation processes. The annotations have a very high inter-annotator agreement (0.79 to 0.93 k-alpha) due to the domain expertise of the annotators and clear annotation instructions. We analyze the data in terms of geographical distribution, label correlations, and keywords. Lastly, we present some potential use cases of this dataset. We expect this dataset would be of great interest to psycholinguists, socio-linguists, and computational linguists to study the discursive space of digitally mobilized social movements on sensitive issues like sexual harassment.

Submitted 20 April, 2020; v1 submitted 14 December, 2019; originally announced December 2019.

Comments: Preprint of paper accepted at ICWSM 2020

arXiv:1802.01787 [pdf, other]

A Distributed Hybrid Hardware-In-the-Loop Simulation framework for Infrastructure Enabled Autonomy

Authors: Abhishek Nayak, Kenny Chour, Tyler Marr, Deepika Ravipati, Sheelabhadra Dey, Alvika Gautam, Swaminathan Gopalswamy, Sivakumar Rathinam

Abstract: Infrastructure Enabled Autonomy (IEA) is a new paradigm that employs a distributed intelligence architecture for connected autonomous vehicles by offloading core functionalities to the infrastructure. In this paper, we develop a simulation framework that can be used to study the concept. A key challenge for such a simulation is the rapid increase in the scale of the computations with the size of the infrastructure to be considered. Our simulation framework is designed to be distributed and scales proportionally with the infrastructure. By integrally using both the hardware controllers and communication devices as part of the simulation framework, we achieve an optimal balance between modeling of the dynamics and sensors, and reusing real hardware for simulation of proprietary or complex communication methods. Multiple cameras on the infrastructure are simulated. The simulation of the camera image processing is done in distributed hardware and the resultant position information is transmitted wirelessly to the computer simulating the autonomous vehicle. We demonstrate closed loop control of a single vehicle following given waypoints using information from multiple cameras located on Road-Side-Units.

Submitted 5 February, 2018; originally announced February 2018.

Comments: Submitted to the IEEE IV 2018 conference

Showing 1–27 of 27 results for author: Gautam, A