R4: rapid reproducible robotics research open hardware control system

Authors: Chris Waltham, Andy Perrett, Rakshit Soni, Charles Fox

Abstract: A key component of any robot is the interface between ROS2 software and physical motors. New robots often use arbitrary, messy mixtures of closed and open motor drivers and error-prone physical mountings, wiring, and connectors to interface them. There is a need for a standardizing OSH component to abstract this complexity, as Arduino did for interfacing to smaller components. We present an OSH printed circuit board to solve this problem once and for all. On the high-level side, it interfaces to Arduino Giga, acting as an unusually large and robust shield, and thus to existing open source ROS software stacks. On the lower-level side, it interfaces to existing emerging standard open hardware including OSH motor drivers and relays, which can already be used to drive fully open hardware wheeled and arm robots. This enables the creation of a family of standardized, fully open hardware, fully reproducible research platforms.

Submitted 13 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

arXiv:2311.01574 [pdf]

Improving Lesion Segmentation in FDG-18 Whole-Body PET/CT scans using Multilabel approach: AutoPET II challenge

Authors: Gowtham Krishnan Murugesan, Diana McCrumb, Eric Brunner, Jithendra Kumar, Rahul Soni, Vasily Grigorash, Stephen Moore, Jeff Van Oss

Abstract: Automatic segmentation of lesions in FDG-18 Whole Body (WB) PET/CT scans using deep learning models is instrumental for determining treatment response, optimizing dosimetry, and advancing theranostic applications in oncology. However, organs with elevated radiotracer uptake, such as the liver, spleen, brain, and bladder, pose a challenge, as deep learning models often misidentify these regions as lesions. To address this issue, we propose a novel approach of segmenting both organs and lesions, aiming to enhance the performance of automatic lesion segmentation methods. In this study, we assessed the effectiveness of our proposed method using the AutoPET II challenge dataset, which comprises 1014 subjects. We evaluated the impact of including additional labels and data on the segmentation performance of the model. In addition to the expert-annotated lesion labels, we introduced eight additional labels for organs, including the liver, kidneys, urinary bladder, spleen, lung, brain, heart, and stomach. These labels were integrated into the dataset, and a 3D UNET model was trained within the nnUNet framework. Our results demonstrate that our method achieved the top ranking in the held-out test dataset, underscoring the potential of this approach to significantly improve lesion segmentation accuracy in FDG-18 Whole-Body PET/CT scans, ultimately benefiting cancer patients and advancing clinical practice.

Submitted 2 November, 2023; originally announced November 2023.

Comments: AutoPET II challenge paper

arXiv:2307.16676 [pdf, other]

doi 10.1109/IROS55552.2023.10342187

End-to-End Reinforcement Learning for Torque Based Variable Height Hopping

Authors: Raghav Soni, Daniel Harnack, Hauke Isermann, Sotaro Fushimi, Shivesh Kumar, Frank Kirchner

Abstract: Legged locomotion is arguably the most suited and versatile mode to deal with natural or unstructured terrains. Intensive research into dynamic walking and running controllers has recently yielded great advances, both in the optimal control and reinforcement learning (RL) literature. Hopping is a challenging dynamic task involving a flight phase and has the potential to increase the traversability of legged robots. Model-based control for hopping typically relies on accurate detection of different jump phases, such as lift-off or touch down, and using different controllers for each phase. In this paper, we present an end-to-end RL-based torque controller that learns to implicitly detect the relevant jump phases, removing the need to provide manual heuristics for state detection. We also extend a method for simulation-to-reality transfer of the learned controller to contact-rich dynamic tasks, resulting in successful deployment on the robot after training without parameter tuning.

Submitted 18 December, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

Comments: Update publication info. Cite as: R. Soni, D. Harnack, H. Isermann, S. Fushimi, S. Kumar and F. Kirchner, "End-to-End Reinforcement Learning for Torque Based Variable Height Hopping," 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 2023, pp. 7531-7538, doi: 10.1109/IROS55552.2023.10342187

Journal ref: End-to-End Reinforcement Learning for Torque Based Variable Height Hopping, 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 2023, pp. 7531-7538

arXiv:2207.07739 [pdf, other]

Adversarial Focal Loss: Asking Your Discriminator for Hard Examples

Authors: Chen Liu, Xiaomeng Dong, Michael Potter, Hsi-Ming Chang, Ravi Soni

Abstract: Focal Loss has reached incredible popularity as it uses a simple technique to identify and utilize hard examples to achieve better performance on classification. However, this method does not easily generalize outside of classification tasks, such as in keypoint detection. In this paper, we propose a novel adaptation of Focal Loss for keypoint detection tasks, called Adversarial Focal Loss (AFL). AFL not only is semantically analogous to Focal Loss, but also works as a plug-and-chug upgrade for arbitrary loss functions. While Focal Loss requires output from a classifier, AFL leverages a separate adversarial network to produce a difficulty score for each input. This difficulty score can then be used to dynamically prioritize learning on hard examples, even in absence of a classifier. In this work, we show AFL's effectiveness in enhancing existing methods in keypoint detection and verify its capability to re-weigh examples based on difficulty.

Submitted 15 July, 2022; originally announced July 2022.
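
The "simple technique" that the abstract above attributes to Focal Loss can be illustrated with a minimal sketch. It assumes the standard classification form FL(p) = -(1 - p)^gamma * log(p), where p is the predicted probability of the true class, and it leaves out the optional class-balancing weight. This is not the paper's Adversarial Focal Loss, and the function names and the example probabilities are hypothetical, chosen only for illustration.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Plain cross-entropy for one example, given the probability of the true class."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -math.log(p)


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Standard focal loss for one example (no class-balancing weight).

    The factor (1 - p)^gamma shrinks the loss of well-classified (easy)
    examples, so hard examples dominate the total. gamma = 0 recovers
    plain cross-entropy.
    """
    p = min(max(p_true, 1e-12), 1.0)
    return -((1.0 - p) ** gamma) * math.log(p)


if __name__ == "__main__":
    # Hypothetical probabilities for an easy and a hard example.
    for name, p in (("easy", 0.95), ("hard", 0.30)):
        print(f"{name}: CE={cross_entropy(p):.4f}  FL={focal_loss(p):.4f}")
```

Relative to cross-entropy, the easy example's loss is scaled by (1 - 0.95)^2 = 0.0025, while the hard example's is scaled by (1 - 0.30)^2 = 0.49, which is how the weighting shifts emphasis toward hard examples.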

arXiv:2011.05186 [pdf, other]

Pristine annotations-based multi-modal trained artificial intelligence solution to triage chest X-ray for COVID-19

Authors: Tao Tan, Bipul Das, Ravi Soni, Mate Fejes, Sohan Ranjan, Daniel Attila Szabo, Vikram Melapudi, K S Shriram, Utkarsh Agrawal, Laszlo Rusko, Zita Herczeg, Barbara Darazs, Pal Tegzes, Lehel Ferenczi, Rakesh Mullick, Gopal Avinash

Abstract: The COVID-19 pandemic continues to spread and impact the well-being of the global population. The front-line modalities including computed tomography (CT) and X-ray play an important role for triaging COVID patients. Considering the limited access to resources (both hardware and trained personnel) and decontamination considerations, CT may not be ideal for triaging suspected subjects. Artificial intelligence (AI) assisted X-ray based applications for triaging and monitoring, which require experienced radiologists to identify COVID patients in a timely manner and to further delineate the disease region boundary, are seen as a promising solution. Our proposed solution differs from existing solutions by industry and academic communities, and demonstrates a functional AI model to triage by inferencing using a single X-ray image, while the deep-learning model is trained using both X-ray and CT data. We report on how such multi-modal training improves the solution compared to X-ray only training. The multi-modal solution increases the AUC (area under the receiver operating characteristic curve) from 0.89 to 0.93 and also positively impacts the Dice coefficient (0.59 to 0.62) for localizing the pathology. To the best of our knowledge, it is the first X-ray solution to leverage multi-modal information in its development.

Submitted 10 November, 2020; originally announced November 2020.

arXiv:2005.04588 [pdf]

Transformer Based Language Models for Similar Text Retrieval and Ranking

Authors: Javed Qadrud-Din, Ashraf Bah Rabiou, Ryan Walker, Ravi Soni, Martin Gajek, Gabriel Pack, Akhil Rangaraj

Abstract: Most approaches for similar text retrieval and ranking with long natural language queries rely at some level on queries and responses having words in common with each other. Recent applications of transformer-based neural language models to text retrieval and ranking problems have been very promising, but still involve a two-step process in which result candidates are first obtained through bag-of-words-based approaches, and then reranked by a neural transformer. In this paper, we introduce novel approaches for effectively applying neural transformer models to similar text retrieval and ranking without an initial bag-of-words-based step. By eliminating the bag-of-words-based step, our approach is able to accurately retrieve and rank results even when they have no non-stopwords in common with the query. We accomplish this by using bidirectional encoder representations from transformers (BERT) to create vectorized representations of sentence-length texts, along with a vector nearest neighbor search index. We demonstrate both supervised and unsupervised means of using BERT to accomplish this task.

Submitted 21 May, 2020; v1 submitted 10 May, 2020; originally announced May 2020.

Comments: 5 pages, 2 figures

arXiv:2004.06378 [pdf]

Various Secure Routing Schemes for MANETs: A Survey

Authors: Priya R. Soni, Charmi A. Joshi, Dhwani R. Bhadra, Nikita P. Vyas, Rutvij H. Jhaveri

Abstract: A MANET is an infrastructure-less, self-configuring network consisting of mobile nodes that communicate with each other over a radio medium. Its distinctive properties, such as dynamic topology, decentralization, and a wireless medium, set a MANET apart from traditional networks and make security a major challenge. In this paper, we survey various security approaches for Mobile Ad hoc Networks and provide a comprehensive study of them. We focus on three approaches: Bayesian watchdog, trust-based systems, and ant colony optimization. From a wireless perspective, security is a crucial concern, and it therefore has to be addressed in any work on Mobile Ad hoc Networks.

Submitted 14 April, 2020; originally announced April 2020.

arXiv:2002.04205 [pdf, other]

Fine-grained Uncertainty Modeling in Neural Networks

Authors: Rahul Soni, Naresh Shah, Jimmy D. Moore

Abstract: Existing uncertainty modeling approaches try to detect an out-of-distribution point from the in-distribution dataset. We extend this argument to detect finer-grained uncertainty that distinguishes between (a) certain points, (b) uncertain points but within the data distribution, and (c) out-of-distribution points. Our method corrects overconfident NN decisions, detects outlier points and learns to say "I don't know" when uncertain about a critical point between the top two predictions. In addition, we provide a mechanism to quantify class distribution overlap in the decision manifold and investigate its implications in model interpretability. Our method is two-step: in the first step, the proposed method builds a class distribution using Kernel Activation Vectors (kav) extracted from the Network. In the second step, the algorithm determines the confidence of a test point by a hierarchical decision rule based on the chi-squared distribution of squared Mahalanobis distances. Our method sits on top of a given Neural Network, requires a single scan of training data to estimate class distribution statistics, and is highly scalable to deep networks and wider pre-softmax layers. As a positive side effect, our method helps to prevent adversarial attacks without requiring any additional training. It is directly achieved when the Softmax layer is substituted by our robust uncertainty layer at the evaluation phase.

Submitted 11 February, 2020; originally announced February 2020.
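
The statistical idea behind the second step in the abstract above, comparing a squared Mahalanobis distance against a chi-squared quantile, can be sketched as follows. This is a heavily simplified illustration and not the authors' method: it uses a single threshold instead of their hierarchical decision rule, and toy 2-D points instead of Kernel Activation Vectors. It relies on two standard facts: for Gaussian data in d dimensions the squared Mahalanobis distance follows a chi-squared distribution with d degrees of freedom, and for d = 2 the chi-squared CDF is 1 - exp(-x/2). All function names and data are hypothetical.

```python
import math


def mean_and_cov_2d(points):
    """Single pass over 2-D training points: sample mean and covariance (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def squared_mahalanobis_2d(x, mean, cov):
    """Squared Mahalanobis distance, using the closed-form inverse of a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def chi2_quantile_2dof(prob):
    """Quantile of the chi-squared distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - prob)


def is_in_distribution(x, mean, cov, prob=0.99):
    """Accept x if its squared distance lies within the chi-squared quantile for prob."""
    return squared_mahalanobis_2d(x, mean, cov) <= chi2_quantile_2dof(prob)


if __name__ == "__main__":
    # Toy class cluster (hypothetical data).
    train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    mean, cov = mean_and_cov_2d(train)
    for point in ((0.6, 0.4), (5.0, 5.0)):
        print(point, "in-distribution" if is_in_distribution(point, mean, cov) else "outlier")
```

In higher dimensions the same test would need a general matrix inverse and a chi-squared quantile for d degrees of freedom, typically taken from a numerical library.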

arXiv:2002.03549 [pdf, other]

Adversarial TCAV -- Robust and Effective Interpretation of Intermediate Layers in Neural Networks

Authors: Rahul Soni, Naresh Shah, Chua Tat Seng, Jimmy D. Moore

Abstract: Interpreting neural network decisions and the information learned in intermediate layers is still a challenge due to the opaque internal state and shared non-linear interactions. Although (Kim et al, 2017) proposed to interpret intermediate layers by quantifying their ability to distinguish a user-defined concept (from random examples), the questions of robustness (variation against the choice of random examples) and effectiveness (retrieval rate of concept images) remain. We investigate these two properties and propose improvements to make concept activations reliable for practical use. Effectiveness: If the intermediate layer has effectively learned a user-defined concept, it should be able to recall, at the testing step, most of the images containing the proposed concept. For instance, we observed that the recall rate of Tiger shark and Great white shark from the ImageNet dataset with "Fins" as a user-defined concept was only 18.35% for VGG16. To increase the effectiveness of concept learning, we propose A-CAV, the Adversarial Concept Activation Vector, which results in larger margins between user concepts and (negative) random examples. This approach improves the aforesaid recall to 76.83% for VGG16. Robustness: We define it as the ability of an intermediate layer to be consistent in its recall rate (the effectiveness) for different random seeds. We observed that TCAV has a large variance in recalling a concept across different random seeds. For example, the recall of cat images (from a layer learning the concept of tail) varies from 18% to 86% with 20.85% standard deviation on VGG16. We propose a simple and scalable modification that employs a Gram-Schmidt process to sample random noise from concepts and learn an average "concept classifier". This approach improves the aforesaid standard deviation from 20.85% to 6.4%.

Submitted 26 February, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

arXiv:1509.02437 [pdf]

Improved Twitter Sentiment Prediction through Cluster-then-Predict Model

Authors: Rishabh Soni, K. James Mathai

Abstract: Over the past decade humans have experienced exponential growth in the use of online resources, in particular social media and microblogging websites such as Facebook, Twitter, YouTube and also mobile applications such as WhatsApp, Line, etc. Many companies have identified these resources as a rich mine of marketing knowledge. This knowledge provides valuable feedback which allows them to further develop the next generation of their product. In this paper, sentiment analysis of a product is performed by extracting tweets about that product and classifying the tweets as showing positive or negative sentiment. The authors propose a hybrid approach which combines unsupervised learning, in the form of K-means clustering to cluster the tweets, with supervised learning methods such as Decision Trees and Support Vector Machines for classification.

Submitted 8 September, 2015; originally announced September 2015.

Comments: 5 pages

Showing 1–10 of 10 results for author: Soni, R