arXiv:2407.02978 [pdf, other]

Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text

Authors: Jainit Sushil Bafna, Hardik Mittal, Suyash Sethia, Manish Shrivastava, Radhika Mamidi

Abstract: Large Language Models (LLMs) have showcased impressive abilities in generating fluent responses to diverse user queries. However, concerns regarding the potential misuse of such texts in journalistic, educational, and academic contexts have surfaced. SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we (i) propose a RoBERTa-BiLSTM based classifier designed to classify text into two categories, AI-generated or human, and (ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. This paper contributes to the advancement of automatic text detection systems in addressing the challenges posed by machine-generated text misuse. Our architecture ranked 46th among 125 on the official leaderboard with an accuracy of 80.83.

Submitted 3 July, 2024; originally announced July 2024.

Comments: SemEval-2024

arXiv:2405.20305 [pdf, other]

Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models

Authors: Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo, Kwonjoon Lee

Abstract: We introduce PlausiVL, a large video-language model for anticipating action sequences that are plausible in the real world. While significant efforts have been made towards anticipating future actions, prior approaches do not take into account the aspect of plausibility in an action sequence. To address this limitation, we explore the generative capability of a large video-language model and further develop the understanding of plausibility in an action sequence by introducing two objective functions, a counterfactual-based plausible action sequence learning loss and a long-horizon action repetition loss. We utilize temporal logical constraints as well as verb-noun action pair logical constraints to create implausible/counterfactual action sequences and use them to train the model with the plausible action sequence learning loss. This loss helps the model to differentiate between plausible and implausible action sequences and also helps the model to learn implicit temporal cues crucial for the task of action anticipation. The long-horizon action repetition loss puts a higher penalty on the actions that are more prone to repetition over a longer temporal window. With this penalization, the model is able to generate diverse, plausible action sequences. We evaluate our approach on two large-scale datasets, Ego4D and EPIC-Kitchens-100, and show improvements on the task of action anticipation.

Submitted 30 May, 2024; originally announced May 2024.

Comments: CVPR 2024

arXiv:2404.02088 [pdf, other]

LastResort at SemEval-2024 Task 3: Exploring Multimodal Emotion Cause Pair Extraction as Sequence Labelling Task

Authors: Suyash Vardhan Mathur, Akshett Rai Jindal, Hardik Mittal, Manish Shrivastava

Abstract: Conversation is the most natural form of human communication, where each utterance can range over a variety of possible emotions. While significant work has been done towards the detection of emotions in text, relatively little work has been done towards finding the cause of the said emotions, especially in multimodal settings. SemEval 2024 introduces the task of Multimodal Emotion Cause Analysis in Conversations, which aims to extract emotions reflected in individual utterances in a conversation involving multiple modalities (textual, audio, and visual modalities) along with the corresponding utterances that were the cause for the emotion. In this paper, we propose models that tackle this task as an utterance labeling and a sequence labeling problem and perform a comparative study of these models, involving baselines using different encoders, using BiLSTM for adding contextual information of the conversation, and finally adding a CRF layer to try to model the inter-dependencies between adjacent utterances more effectively. In the official leaderboard for the task, our architecture was ranked 8th, achieving an F1-score of 0.1759.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2310.00600 [pdf, other]

On the Complexity of the Eigenvalue Deletion Problem

Authors: Neeldhara Misra, Harshil Mittal, Saket Saurabh, Dhara Thakkar

Abstract: For any fixed positive integer $r$ and a given budget $k$, the $r$-Eigenvalue Vertex Deletion ($r$-EVD) problem asks if a graph $G$ admits a subset $S$ of at most $k$ vertices such that the adjacency matrix of $G\setminus S$ has at most $r$ distinct eigenvalues. The edge deletion, edge addition, and edge editing variants are defined analogously. For $r = 1$, $r$-EVD is equivalent to the Vertex Cover problem. For $r = 2$, it turns out that $r$-EVD amounts to removing a subset $S$ of at most $k$ vertices so that $G\setminus S$ is a cluster graph where all connected components have the same size. We show that $r$-EVD is NP-complete even on bipartite graphs with maximum degree four for every fixed $r > 2$, and FPT when parameterized by the solution size and the maximum degree of the graph. We also establish several results for the special case when $r = 2$. For the vertex deletion variant, we show that $2$-EVD is NP-complete even on triangle-free and $3d$-regular graphs for any $d\geq 2$, and also NP-complete on $d$-regular graphs for any $d\geq 8$. The edge deletion, addition, and editing variants are all NP-complete for $r = 2$. The edge deletion problem admits a polynomial time algorithm if the input is a cluster graph, while the edge addition variant is hard even when the input is a cluster graph. We show that the edge addition variant has a quadratic kernel. The edge deletion and vertex deletion variants are FPT when parameterized by the solution size alone.
Our main contribution is to develop the complexity landscape for the problem of modifying a graph with the aim of reducing the number of distinct eigenvalues in the spectrum of its adjacency matrix. It turns out that this captures, apart from Vertex Cover, also a natural variation of the problem of modifying to a cluster graph as a special case, which we believe may be of independent interest.

Submitted 1 October, 2023; originally announced October 2023.

Comments: 27 pages; this is the full version of a paper accepted for presentation at the 34th International Symposium on Algorithms and Computation (ISAAC 2023)

ACM Class: F.2
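
The $r = 2$ characterization in the abstract above (the remaining graph must be a cluster graph whose components all have the same size) can be checked directly without computing eigenvalues. The following is a minimal sketch, not the authors' algorithm: the function names are hypothetical, and the search is an exponential brute force intended only for tiny instances.

```python
from itertools import combinations

def is_equal_clique_union(vertices, edges):
    """True if the induced graph is a disjoint union of equal-size cliques."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, w in edges:
        if u in vertices and w in vertices:
            adj[u].add(w)
            adj[w].add(u)
    seen, sizes = set(), set()
    for v in vertices:
        if v in seen:
            continue
        comp = adj[v] | {v}
        # In a cluster graph, every member of a component has the same
        # closed neighbourhood, namely the component itself.
        if any(adj[u] | {u} != comp for u in comp):
            return False
        seen |= comp
        sizes.add(len(comp))
    return len(sizes) <= 1

def min_2evd_deletion(vertices, edges):
    """Brute force: smallest vertex set whose removal leaves equal-size cliques."""
    vertices = list(vertices)
    for k in range(len(vertices) + 1):
        for S in combinations(vertices, k):
            if is_equal_clique_union(set(vertices) - set(S), edges):
                return k, set(S)

# A triangle plus an isolated vertex: a cluster graph, but with unequal sizes.
triangle_plus_one = ([1, 2, 3, 4], [(1, 2), (2, 3), (1, 3)])
budget, removed = min_2evd_deletion(*triangle_plus_one)
```

In this example the isolated vertex must be removed, which illustrates why the equal-size requirement makes the problem differ from ordinary cluster vertex deletion.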

arXiv:2309.03517 [pdf, ps, other]

Parameterized Aspects of Distinct Kemeny Rank Aggregation

Authors: Koustav De, Harshil Mittal, Palash Dey, Neeldhara Misra

Abstract: The Kemeny method is one of the popular tools for rank aggregation. However, computing an optimal Kemeny ranking is NP-hard. Consequently, the computational task of finding a Kemeny ranking has been studied under the lens of parameterized complexity with respect to many parameters. We first present a comprehensive relationship, both theoretical and empirical, among these parameters. Further, we study the problem of computing all distinct Kemeny rankings under the lens of parameterized complexity. We consider the target Kemeny score, number of candidates, average distance of input rankings, maximum range of any candidate, and unanimity width as our parameters. For all these parameters, we already have FPT algorithms. We find that any desirable number of Kemeny rankings can also be found without substantial increase in running time. We also present FPT approximation algorithms for Kemeny rank aggregation with respect to these parameters.

Submitted 7 September, 2023; originally announced September 2023.
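
To make the notion of "all distinct Kemeny rankings" concrete, here is a minimal brute-force sketch. It is not one of the paper's FPT algorithms: it enumerates every permutation, scores each by its total Kendall tau distance to the votes (the standard definition of the Kemeny score), and keeps every optimum. The function names and the example votes are hypothetical.

```python
from itertools import combinations, permutations

def kendall_tau(a, b):
    """Number of candidate pairs that rankings a and b order differently."""
    pos_a = {c: i for i, c in enumerate(a)}
    pos_b = {c: i for i, c in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )

def all_kemeny_rankings(votes):
    """Return the optimal Kemeny score and every ranking that attains it."""
    candidates = sorted(votes[0])
    best_score, best = None, []
    for perm in permutations(candidates):
        score = sum(kendall_tau(perm, v) for v in votes)
        if best_score is None or score < best_score:
            best_score, best = score, [perm]
        elif score == best_score:
            best.append(perm)
    return best_score, best

# A Condorcet cycle: three votes that are cyclic rotations of one another.
votes = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
score, rankings = all_kemeny_rankings(votes)
```

The cyclic profile is chosen because it has several optimal rankings, which is exactly the situation in which enumerating all distinct optima matters.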

arXiv:2306.11417 [pdf, other]

PyRCA: A Library for Metric-based Root Cause Analysis

Authors: Chenghao Liu, Wenzhuo Yang, Himanshu Mittal, Manpreet Singh, Doyen Sahoo, Steven C. H. Hoi

Abstract: We introduce PyRCA, an open-source Python machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps). It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents. It offers a unified interface for multiple commonly used RCA models, encompassing both graph construction and scoring tasks. This library aims to provide IT operations staff, data scientists, and researchers a one-step solution to rapid model development, model evaluation and deployment to online applications. In particular, our library includes various causal discovery methods to support causal graph construction, and multiple types of root cause scoring methods inspired by Bayesian analysis, graph analysis and causal analysis, etc. Our GUI dashboard offers practitioners an intuitive point-and-click interface, empowering them to easily inject expert knowledge through human interaction. With the ability to visualize causal graphs and the root cause of incidents, practitioners can quickly gain insights and improve their workflow efficiency. This technical report introduces PyRCA's architecture and major functionalities, while also presenting benchmark performance numbers in comparison to various baseline models. Additionally, we demonstrate PyRCA's capabilities through several example use cases.

Submitted 20 June, 2023; originally announced June 2023.

Comments: GitHub repo: https://github.com/salesforce/PyRCA

arXiv:2306.00379 [pdf, other]

Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes

Authors: Anant Khandelwal, Happy Mittal, Shreyas Sunil Kulkarni, Deepak Gupta

Abstract: E-commerce websites (e.g. Amazon) have a plethora of structured and unstructured information (text and images) present on the product pages. Sellers often either don't label or mislabel values of the attributes (e.g. color, size etc.) for their products. Automatically identifying these attribute values from an eCommerce product page that contains both text and images is a challenging task, especially when the attribute value is not explicitly mentioned in the catalog. In this paper, we present a scalable solution for this problem where we pose attribute extraction problem as a question-answering task, which we solve using MXT, consisting of three key components: (i) MAG (Multimodal Adaptation Gate), (ii) Xception network, and (iii) T5 encoder-decoder. Our system consists of a generative model that generates attribute-values for a given product by using both textual and visual characteristics (e.g. images) of the product. We show that our system is capable of handling zero-shot attribute prediction (when attribute value is not seen in training data) and value-absent prediction (when attribute value is not mentioned in the text) which are missing in traditional classification-based and NER-based models respectively. We have trained our models using distant supervision, removing dependency on human labeling, thus making them practical for real-world applications.
With this framework, we are able to train a single model for 1000s of (product-type, attribute) pairs, thus reducing the overhead of training and maintaining separate models. Extensive experiments on two real world datasets show that our framework improves the absolute recall@90P by 10.16% and 6.9% from the existing state of the art models. In a popular e-commerce store, we have deployed our models for 1000s of (product-type, attribute) pairs.

Submitted 1 June, 2023; originally announced June 2023.

Comments: ACL 2023 Industry Track, 8 Pages

arXiv:2209.13583 [pdf, other]

Learning State-Aware Visual Representations from Audible Interactions

Authors: Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta

Abstract: We propose a self-supervised algorithm to learn representations from egocentric video data. Recently, significant efforts have been made to capture humans interacting with their own environments as they go about their daily activities. As a result, several large egocentric datasets of interaction-rich multi-modal data have emerged. However, learning representations from videos can be challenging. First, given the uncurated nature of long-form continuous videos, learning effective representations requires focusing on moments in time when interactions take place. Second, visual representations of daily activities should be sensitive to changes in the state of the environment. However, current successful multi-modal learning frameworks encourage representation invariance over time. To address these challenges, we leverage audio signals to identify moments of likely interactions which are conducive to better learning. We also propose a novel self-supervised objective that learns from audible state changes caused by interactions. We validate these contributions extensively on two large-scale egocentric datasets, EPIC-Kitchens-100 and the recently released Ego4D, and show improvements on several downstream tasks, including action recognition, long-term action anticipation, and object state change classification.

Submitted 27 September, 2022; originally announced September 2022.

Comments: NeurIPS 2022. Code available at https://github.com/HimangiM/RepLAI

arXiv:2203.14864 [pdf, other]

Chess is hard even for a single player

Authors: N. R. Aravind, Neeldhara Misra, Harshil Mittal

Abstract: We introduce a generalization of "Solo Chess", a single-player variant of the game that can be played on chess.com. The standard version of the game is played on a regular 8 x 8 chessboard by a single player, with only white pieces, using the following rules: every move must capture a piece, no piece may capture more than 2 times, and if there is a King on the board, it must be the final piece. The goal is to clear the board, i.e., make a sequence of captures after which only one piece is left. We generalize this game to unbounded boards with $n$ pieces, each of which has a given number of captures that it is permitted to make. We show that Generalized Solo Chess is NP-complete, even when it is played by only rooks that have at most two captures remaining. It also turns out to be NP-complete even when every piece is a queen with exactly two captures remaining in the initial configuration. In contrast, we show that solvable instances of Generalized Solo Chess can be completely characterized when the game is: a) played by rooks on a one-dimensional board, and b) played by pawns with two captures left on a 2D board. Inspired by Generalized Solo Chess, we also introduce the Graph Capture Game, which involves clearing a graph of tokens via captures along edges. This game subsumes Generalized Solo Chess played by knights. We show that the Graph Capture Game is NP-complete for undirected graphs and DAGs.

Submitted 30 March, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

Comments: 22 pages, a slightly shorter version to appear in FUN 2022

arXiv:2203.05835 [pdf, other]

MLRM: A Multiple Linear Regression based Model for Average Temperature Prediction of A Day

Authors: Ishu Gupta, Harsh Mittal, Deepak Rikhari, Ashutosh Kumar Singh

Abstract: Weather is a phenomenon that affects everything and everyone around us on a daily basis. Weather prediction has been an important point of study for decades as researchers have tried to predict the weather and climatic changes using traditional meteorological techniques. With the advent of modern technologies and computing power, we can do so with the help of machine learning techniques. We aim to predict the weather of an area using past meteorological data and features using the Multiple Linear Regression Model. The performance of the model is evaluated and a conclusion is drawn. The model is successfully able to predict the average temperature of a day with an error of 2.8 degrees Celsius.

Submitted 11 March, 2022; originally announced March 2022.
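
As an illustration of the technique named in the abstract above, the following is a minimal sketch of multiple linear regression fitted by ordinary least squares through the normal equations. It is not the authors' model: the two input features, the toy data, and the use of mean absolute error as the error measure are all assumptions made for the example.

```python
def fit_linear_regression(X, y):
    """Least-squares weights [intercept, w1, w2, ...] from (X^T X) w = X^T y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

def predict(w, x):
    """Predicted target for one feature vector x."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def mean_absolute_error(w, X, y):
    """Average absolute gap between predictions and targets."""
    return sum(abs(predict(w, x) - t) for x, t in zip(X, y)) / len(y)

# Hypothetical features: (previous-day temperature in Celsius, humidity in %).
X = [[20, 60], [25, 50], [30, 40], [22, 70], [28, 55]]
# Toy targets generated exactly from 2 + 0.5 * temp - 0.1 * humidity.
y = [2 + 0.5 * a - 0.1 * b for a, b in X]
weights = fit_linear_regression(X, y)
```

Because the toy targets are an exact linear function of the features, the fitted weights recover the generating coefficients; on real meteorological data the fit would leave a residual error.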

arXiv:2111.10701 [pdf, other]

Self-Supervised Point Cloud Completion via Inpainting

Authors: Himangi Mittal, Brian Okorn, Arpit Jangid, David Held

Abstract: When navigating in urban environments, many of the objects that need to be tracked and avoided are heavily occluded. Planning and tracking using these partial scans can be challenging. The aim of this work is to learn to complete these partial point clouds, giving us a full understanding of the object's geometry using only partial observations. Previous methods achieve this with the help of complete, ground-truth annotations of the target objects, which are available only for simulated datasets. However, such ground truth is unavailable for real-world LiDAR data. In this work, we present a self-supervised point cloud completion algorithm, PointPnCNet, which is trained only on partial scans without assuming access to complete, ground-truth annotations. Our method achieves this via inpainting. We remove a portion of the input data and train the network to complete the missing region. As it is difficult to determine which regions were occluded in the initial cloud and which were synthetically removed, our network learns to complete the full cloud, including the missing regions in the initial partial cloud. We show that our method outperforms previous unsupervised and weakly-supervised methods on both the synthetic dataset, ShapeNet, and real-world LiDAR dataset, Semantic KITTI.

Submitted 20 November, 2021; originally announced November 2021.

Comments: BMVC 2021 (Oral)

arXiv:2108.10125 [pdf, other]

A study on Machine Learning Approaches for Player Performance and Match Results Prediction

Authors: Harsh Mittal, Deepak Rikhari, Jitendra Kumar, Ashutosh Kumar Singh

Abstract: Cricket is unarguably one of the most popular sports in the world. Predicting the outcome of a cricket match has become a fundamental problem as we are advancing in the field of machine learning. Multiple researchers have tried to predict the outcome of a cricket match or a tournament, or to predict the performance of players during a match, or to predict the players who should be selected as per their current performance, form, morale, etc. using machine learning and artificial intelligence techniques keeping in mind extensive detailing, features, and parameters. We discuss some of these techniques along with a brief comparison among these techniques.

Submitted 23 August, 2021; originally announced August 2021.

arXiv:2104.02947 [pdf, other]

Distantly Supervised Transformers For E-Commerce Product QA

Authors: Happy Mittal, Aniket Chakrabarti, Belhassen Bayar, Animesh Anant Sharma, Nikhil Rasiwasia

Abstract: We propose a practical instant question answering (QA) system on product pages of ecommerce services, where for each user query, relevant community question answer (CQA) pairs are retrieved. User queries and CQA pairs differ significantly in language characteristics making relevance learning difficult. Our proposed transformer-based model learns a robust relevance function by jointly learning unified syntactic and semantic representations without the need for human labeled data. This is achieved by distantly supervising our model by distilling from predictions of a syntactic matching system on user queries and simultaneously training with CQA pairs. Training with CQA pairs helps our model learn semantic QA relevance and distant supervision enables learning of syntactic features as well as the nuances of user querying language. Additionally, our model encodes queries and candidate responses independently allowing offline candidate embedding generation thereby minimizing the need for real-time transformer model execution. Consequently, our framework is able to scale to large e-commerce QA traffic. Extensive evaluation on user queries shows that our framework significantly outperforms both syntactic and semantic baselines in offline as well as large scale online A/B setups of a popular e-commerce service.

Submitted 7 April, 2021; originally announced April 2021.

Comments: NAACL 2021 (10 pages)

arXiv:2005.06046 [pdf, other]

Red-Blue Point Separation for Points on a Circle

Authors: Neeldhara Misra, Harshil Mittal, Aditi Sethia

Abstract: Given a set R of red points and a set B of blue points in the plane, the Red-Blue point separation problem asks if there are at most k lines that separate R from B, that is, each cell induced by the lines of the solution is either empty or monochromatic (containing points of only one color). A common variant of the problem is when the lines are required to be axis-parallel. The problem is known to be NP-complete for both scenarios, and W[1]-hard parameterized by k in the former setting and FPT in the latter. We demonstrate a polynomial-time algorithm for the special case when the points lie on a circle. Further, we also demonstrate the W-hardness of a related problem in the axis-parallel setting, where the question is if there are p horizontal and q vertical lines that separate R from B. The hardness here is shown in the parameter p.

Submitted 12 May, 2020; originally announced May 2020.

arXiv:2005.03800 [pdf, other]

Imbalance Parameterized by Twin Cover Revisited

Authors: Neeldhara Misra, Harshil Mittal

Abstract: We study the problem of Imbalance parameterized by the twin cover of a graph. We show that Imbalance is XP parameterized by twin cover, and FPT when parameterized by the twin cover and the size of the largest clique outside the twin cover. In contrast, we introduce a notion of succinct representations of graphs in terms of their twin cover and demonstrate that Imbalance is NP-hard in the setting of succinct representations, even for graphs that have a twin cover of size one.

Submitted 7 May, 2020; originally announced May 2020.

arXiv:1912.00501 [pdf, other]

Interpreting Context of Images using Scene Graphs

Authors: Himangi Mittal, Ajith Abraham, Anuja Arora

Abstract: Understanding a visual scene incorporates objects, relationships, and context. Traditional methods working on an image mostly focus on object detection and fail to capture the relationship between the objects. Relationships can give rich semantic information about the objects in a scene. The context can be conducive to comprehending an image since it will help us to perceive the relation between the objects and thus give us a deeper insight into the image. Through this idea, our project delivers a model that focuses on finding the context present in an image by representing the image as a graph, where the nodes will be the objects and the edges will be the relations between them. The context is found using the visual and semantic cues which are further concatenated and given to the Support Vector Machines (SVM) to detect the relation between two objects. This presents us with the context of the image which can be further used in applications such as similar image retrieval, image captioning, or story generation.

Submitted 1 December, 2019; originally announced December 2019.

Comments: To appear in International Conference on Big Data Analytics (BDA2019) (Accepted)

arXiv:1912.00497 [pdf, other]

Just Go with the Flow: Self-Supervised Scene Flow Estimation

Authors: Himangi Mittal, Brian Okorn, David Held

Abstract: When interacting with highly dynamic environments, scene flow allows autonomous systems to reason about the non-rigid motion of multiple independent objects. This is of particular interest in the field of autonomous driving, in which many cars, people, bicycles, and other objects need to be accurately tracked. Current state-of-the-art methods require annotated scene flow data from autonomous driving scenes to train scene flow networks with supervised learning. As an alternative, we present a method of training scene flow that uses two self-supervised losses, based on nearest neighbors and cycle consistency. These self-supervised losses allow us to train our method on large unlabeled autonomous driving datasets; the resulting method matches current state-of-the-art supervised performance using no real world annotations and exceeds state-of-the-art performance when combining our self-supervised approach with supervised learning on a smaller labeled dataset.

Submitted 13 April, 2020; v1 submitted 1 December, 2019; originally announced December 2019.

Comments: Accepted at CVPR 2020 (Oral)
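
The two self-supervised signals named in the abstract above, nearest neighbors and cycle consistency, can be sketched in a few lines. This is only an illustration of the general idea on tiny point lists, not the paper's exact formulation: the function names, the squared-distance choice, and the toy clouds are assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def add(p, f):
    """Translate point p by flow vector f."""
    return tuple(a + b for a, b in zip(p, f))

def nearest_neighbor_loss(cloud_t, flow, cloud_t1):
    """Mean squared distance from each flowed point to its nearest point at t+1."""
    moved = [add(p, f) for p, f in zip(cloud_t, flow)]
    return sum(min(sq_dist(m, q) for q in cloud_t1) for m in moved) / len(moved)

def cycle_consistency_loss(cloud_t, forward_flow, backward_flow):
    """Mean squared gap between each point and its forward-then-backward image."""
    returned = [add(add(p, f), b) for p, f, b in zip(cloud_t, forward_flow, backward_flow)]
    return sum(sq_dist(r, p) for r, p in zip(returned, cloud_t)) / len(cloud_t)

# Toy scene: the whole cloud translates by 0.5 along x between the two frames.
cloud_t = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_t1 = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]
true_flow = [(0.5, 0.0, 0.0)] * 2
reverse_flow = [(-0.5, 0.0, 0.0)] * 2
zero_flow = [(0.0, 0.0, 0.0)] * 2

nn_true = nearest_neighbor_loss(cloud_t, true_flow, cloud_t1)
nn_zero = nearest_neighbor_loss(cloud_t, zero_flow, cloud_t1)
cycle = cycle_consistency_loss(cloud_t, true_flow, reverse_flow)
```

Neither loss needs ground-truth flow annotations: the first only compares against the observed next frame, and the second only compares a point with itself after a round trip.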

arXiv:1908.00860 [pdf, ps, other]

Advances in Symmetry Breaking for SAT Modulo Theories

Authors: Saket Dingliwal, Ronak Agarwal, Happy Mittal, Parag Singla

Abstract: Symmetry breaking is a popular technique to reduce the search space for SAT solving by exploiting the underlying symmetry over variables and clauses in a formula. The key idea is to first identify sets of assignments which fall in the same symmetry class, and then impose ordering constraints, called Symmetry Breaking Predicates (SBPs), such that only one (or a small subset) of these assignments is allowed to be a solution of the original SAT formula. While this technique has been exploited extensively in the SAT literature, there is little work on using symmetry breaking for SAT Modulo Theories (SMT). In SMT, logical constraints in SAT theories are combined with another set of theory operations defined over non-Boolean variables such as integers, reals, etc. SMT solvers typically use a combination of SAT solving techniques augmented with calls to the theory solver. In this work, we take up the advances in SAT symmetry breaking and apply them to the domain of SMT. Our key technical contribution is the formulation of symmetry breaking over the Boolean skeleton variables, which are placeholders for actual theory operations in SMT solving. These SBPs are then applied over the SAT solving part of the SMT solver. We implement our SBP ideas on top of CVC4, which is a state-of-the-art SMT solver. Our approach can result in significantly faster solutions on several benchmark problems compared to the state-of-the-art.
Our final solver is a hybrid of the original CVC4 solver, and an SBP based solver, and can solve up to 3.8% and 3.1% more problems in the QF_NIA category of 2018 and 2019 SMT benchmarks, respectively, compared to CVC4, the top performer in this category.

Submitted 16 January, 2020; v1 submitted 2 August, 2019; originally announced August 2019.

Comments: SMT 2019, SMT, CVC4, Symmetry-breaking, starAI

arXiv:1901.09517 [pdf, other]

ICLR Reproducibility Challenge Report (Padam : Closing The Generalization Gap Of Adaptive Gradient Methods in Training Deep Neural Networks)

Authors: Harshal Mittal, Kartikey Pandey, Yash Kant

Abstract: This work is part of the ICLR Reproducibility Challenge 2019; we try to reproduce the results in the conference submission PADAM: Closing The Generalization Gap of Adaptive Gradient Methods In Training Deep Neural Networks. Adaptive gradient methods proposed in the past demonstrate degraded generalization performance compared to stochastic gradient descent (SGD) with momentum. The authors try to address this problem by designing a new optimization algorithm that bridges the gap between the space of adaptive gradient algorithms and SGD with momentum. This method introduces a new tunable hyperparameter, the partially adaptive parameter p, which takes values in [0, 0.5]. We build the proposed optimizer and use it to mirror the experiments performed by the authors. We review and comment on the empirical analysis performed by the authors. Finally, we also propose a future direction for further study of Padam. Our code is available at: https://github.com/yashkant/Padam-Tensorflow

Submitted 28 January, 2019; originally announced January 2019.

Comments: ICLR Reproducibility Challenge 2019 Report for Padam (11 pages, 30 figures)
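
The role of the partially adaptive parameter p mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the commonly described Padam update, in which the momentum term is divided by an AMSGrad-style running maximum of the second moment raised to the power p. The function names, default hyperparameters, and the placement of eps are illustrative assumptions, not taken from the report or its repository. Under these assumptions, p = 0 makes the denominator equal to 1 (a momentum-only step), while p = 0.5 gives an AMSGrad-like step.

```python
def init_state(n):
    """Zero-initialised (momentum, second moment, running max) buffers."""
    return ([0.0] * n, [0.0] * n, [0.0] * n)


def padam_step(theta, grad, state, lr=0.1, beta1=0.9, beta2=0.999, p=0.125, eps=1e-8):
    """One partially adaptive update on a list of floats (hypothetical helper).

    p in [0, 0.5] interpolates between a momentum-only step (p = 0)
    and an AMSGrad-like step (p = 0.5). No bias correction is applied.
    """
    m, v, v_hat = state
    new_theta, new_m, new_v, new_v_hat = [], [], [], []
    for th, g, m_i, v_i, vh_i in zip(theta, grad, m, v, v_hat):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # second-moment estimate
        vh_i = max(vh_i, v_i)                        # AMSGrad-style running maximum
        th = th - lr * m_i / (vh_i + eps) ** p       # partially adaptive step (eps placement assumed)
        new_theta.append(th)
        new_m.append(m_i)
        new_v.append(v_i)
        new_v_hat.append(vh_i)
    return new_theta, (new_m, new_v, new_v_hat)


if __name__ == "__main__":
    # Toy usage: a few steps on f(x) = x^2, whose gradient is 2x.
    x, state = [1.0], init_state(1)
    for _ in range(5):
        x, state = padam_step(x, [2.0 * x[0]], state, p=0.125)
    print(x)
```

Sweeping p over a small grid in [0, 0.5] on a toy problem like this one is a quick way to see how the step size shifts from momentum-like to fully adaptive behaviour.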

arXiv:1807.01082 [pdf, other]

Domain Aware Markov Logic Networks

Authors: Happy Mittal, Ayush Bhardwaj, Vibhav Gogate, Parag Singla

Abstract: Combining logic and probability has been a long standing goal of AI research. Markov Logic Networks (MLNs) achieve this by attaching weights to formulas in first-order logic, and can be seen as templates for constructing features for ground Markov networks. Most techniques for learning weights of MLNs are domain-size agnostic, i.e., the size of the domain is not explicitly taken into account while learning the parameters of the model. This often results in extreme probabilities when testing on domain sizes different from those seen during training. In this paper, we propose Domain Aware Markov Logic Networks (DA-MLNs) which present a principled solution to this problem. While defining the ground network distribution, DA-MLNs divide the ground feature weight by a scaling factor which is a function of the number of connections the ground atoms appearing in the feature are involved in. We show that standard MLNs fall out as a special case of our formalism when this function evaluates to a constant equal to 1. Experiments on the benchmark Friends & Smokers domain show that our approach results in significantly higher accuracies compared to existing methods when testing on domains whose sizes differ from those seen during training.

Submitted 7 July, 2018; v1 submitted 3 July, 2018; originally announced July 2018.

Comments: 2 pages Position Paper accepted in StarAI workshop 2018

arXiv:1807.00589 [pdf, other]

Lifted Marginal MAP Inference

Authors: Vishal Sharma, Noman Ahmed Sheikh, Happy Mittal, Vibhav Gogate, Parag Singla

Abstract: Lifted inference reduces the complexity of inference in relational probabilistic models by identifying groups of constants (or atoms) which behave symmetrically to each other. A number of techniques have been proposed in the literature for lifting marginal as well as MAP inference. We present the first application of lifting rules for marginal-MAP (MMAP), an important inference problem in models having latent (random) variables. Our main contribution is twofold: (1) we define a new equivalence class of (logical) variables, called Single Occurrence for MAX (SOM), and show that the solution lies at an extreme with respect to the SOM variables, i.e., predicate groundings differing only in the instantiation of the SOM variables take the same truth value; (2) we define a sub-class SOM-R (SOM Reduce) and exploit properties of extreme assignments to show that MMAP inference can be performed by reducing the domain of SOM-R variables to a single constant. We refer to our lifting technique as the SOM-R rule for lifted MMAP. Combined with existing rules such as decomposer and binomial, this results in a powerful framework for lifted MMAP. Experiments on three benchmark domains show significant gains in both time and memory compared to ground inference as well as lifted approaches not using SOM-R.

Submitted 8 July, 2018; v1 submitted 2 July, 2018; originally announced July 2018.

Comments: Accepted in UAI-18. Corrected some typos

arXiv:1711.04150 [pdf, other]

doi 10.1145/3152494.3152512

STWalk: Learning Trajectory Representations in Temporal Graphs

Authors: Supriya Pandhre, Himangi Mittal, Manish Gupta, Vineeth N Balasubramanian

Abstract: Analyzing the temporal behavior of nodes in time-varying graphs is useful for many applications such as targeted advertising, community evolution and outlier detection. In this paper, we present a novel approach, STWalk, for learning trajectory representations of nodes in temporal graphs. The proposed framework makes use of structural properties of graphs at current and previous time-steps to learn effective node trajectory representations. STWalk performs random walks on a graph at a given time step (called space-walk) as well as on graphs from past time-steps (called time-walk) to capture the spatio-temporal behavior of nodes. We propose two variants of STWalk to learn trajectory representations. In one algorithm, we perform space-walk and time-walk as part of a single step. In the other variant, we perform space-walk and time-walk separately and combine the learned representations to get the final trajectory embedding. Extensive experiments on three real-world temporal graph datasets validate the effectiveness of the learned representations when compared to three baseline methods. We also show the goodness of the learned trajectory embeddings for change point detection, as well as demonstrate that arithmetic operations on these trajectory representations yield interesting and interpretable results.

Submitted 11 November, 2017; originally announced November 2017.

Comments: 10 pages, 5 figures, 2 tables

arXiv:1605.02442 [pdf]

doi 10.5121/ijnlc.2016.5201

Machine Learning Techniques with Ontology for Subjective Answer Evaluation

Authors: M. Syamala Devi, Himani Mittal

Abstract: Computerized evaluation of English essays is performed using machine learning techniques like Latent Semantic Analysis (LSA), Generalized LSA, Bilingual Evaluation Understudy and Maximum Entropy. Ontology, a concept map of domain knowledge, can enhance the performance of these techniques. Use of Ontology makes the evaluation process holistic, as the presence of keywords, synonyms, the right word combination and coverage of concepts can be checked. In this paper, the above-mentioned techniques are implemented both with and without Ontology and tested on common input data consisting of technical answers in Computer Science. A domain Ontology of Computer Graphics is designed and developed. The software used for implementation includes the Java programming language and tools such as MATLAB, Protégé, etc. Ten questions from Computer Graphics with sixty answers for each question are used for testing. The results are analyzed and it is concluded that the results are more accurate with the use of Ontology.

Submitted 9 May, 2016; originally announced May 2016.

Comments: 11 pages, 5 figures, journal, http://airccse.org/journal/ijnlc/current.html 2016

ACM Class: I.2.7

Showing 1–23 of 23 results for author: Mittal, H