Towards Practical and Useful Automated Program Repair for Debugging

Authors: Qi Xin, Haojun Wu, Steven P. Reiss, Jifeng Xuan

Abstract: Current automated program repair (APR) techniques are far from being practical and useful enough to be considered for realistic debugging. They rely on unrealistic assumptions including the requirement of a comprehensive suite of test cases as the correctness criterion and frequent program re-execution for patch validation; they are not fast; and their ability of repairing the commonly arising complex bugs by fixing multiple locations of the program is very limited. We hope to substantially improve APR's practicality, effectiveness, and usefulness to help people debug. Towards this goal, we envision PracAPR, an interactive repair system that works in an Integrated Development Environment (IDE) to provide effective repair suggestions for debugging. PracAPR does not require a test suite or program re-execution. It assumes that the developer uses an IDE debugger and the program has suspended at a location where a problem is observed. It interacts with the developer to obtain a problem specification. Based on the specification, it performs test-free, flow-analysis-based fault localization, patch generation that combines large language model-based local repair and tailored strategy-driven global repair, and program re-execution-free patch validation based on simulated trace comparison to suggest repairs. By having PracAPR, we hope to take a significant step towards making APR useful and an everyday part of debugging.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.05844 [pdf, other]

Anatomy-guided Pathology Segmentation

Authors: Alexander Jaus, Constantin Seibold, Simon Reiß, Lukas Heine, Anton Schily, Moon Kim, Fin Hendrik Bahnsen, Ken Herrmann, Rainer Stiefelhagen, Jens Kleesiek

Abstract: Pathological structures in medical images are typically deviations from the expected anatomy of a patient. While clinicians consider this interplay between anatomy and pathology, recent deep learning algorithms specialize in recognizing either one of the two, rarely considering the patient's body from such a joint perspective. In this paper, we develop a generalist segmentation model that combines anatomical and pathological information, aiming to enhance the segmentation accuracy of pathological features. Our Anatomy-Pathology Exchange (APEx) training utilizes a query-based segmentation transformer which decodes a joint feature space into query-representations for human anatomy and interleaves them via a mixing strategy into the pathology-decoder for anatomy-informed pathology predictions. In doing so, we are able to report the best results across the board on FDG-PET-CT and Chest X-Ray pathology segmentation tasks with a margin of up to 3.3% as compared to strong baseline methods. Code and models will be publicly available at github.com/alexanderjaus/APEx.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2406.10421 [pdf, other]

SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading

Authors: Tu Anh Dinh, Carlos Mullov, Leonard Bärmann, Zhaolin Li, Danni Liu, Simon Reiß, Jueun Lee, Nathan Lerzer, Fabian Ternava, Jianfeng Gao, Tobias Röddiger, Alexander Waibel, Tamim Asfour, Michael Beigl, Rainer Stiefelhagen, Carsten Dachsbacher, Klemens Böhm, Jan Niehues

Abstract: With the rapid development of Large Language Models (LLMs), it is crucial to have benchmarks which can evaluate the ability of LLMs on different domains. One common use of LLMs is performing tasks on scientific topics, such as writing algorithms, querying databases or giving mathematical proofs. Inspired by the way university students are evaluated on such tasks, in this paper, we propose SciEx - a benchmark consisting of university computer science exam questions, to evaluate LLMs' ability on solving scientific tasks. SciEx is (1) multilingual, containing both English and German exams, and (2) multi-modal, containing questions that involve images, and (3) contains various types of freeform questions with different difficulty levels, due to the nature of university exams. We evaluate the performance of various state-of-the-art LLMs on our new benchmark. Since SciEx questions are freeform, it is not straightforward to evaluate LLM performance. Therefore, we provide human expert grading of the LLM outputs on SciEx. We show that the free-form exams in SciEx remain challenging for the current LLMs, where the best LLM only achieves 59.4% exam grade on average. We also provide detailed comparisons between LLM performance and student performance on SciEx. To enable future evaluation of new LLMs, we propose using LLM-as-a-judge to grade the LLM answers on SciEx. Our experiments show that, although they do not perform perfectly on solving the exams, LLMs are decent as graders, achieving 0.948 Pearson correlation with expert grading.

Submitted 12 July, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

ACM Class: I.2.7

arXiv:2306.03934 [pdf, other]

Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling

Authors: Constantin Seibold, Alexander Jaus, Matthias A. Fink, Moon Kim, Simon Reiß, Ken Herrmann, Jens Kleesiek, Rainer Stiefelhagen

Abstract: Purpose: Interpreting chest radiographs (CXR) remains challenging due to the ambiguity of overlapping structures such as the lungs, heart, and bones. To address this issue, we propose a novel method for extracting fine-grained anatomical structures in CXR using pseudo-labeling of three-dimensional computed tomography (CT) scans. Methods: We created a large-scale dataset of 10,021 thoracic CTs with 157 labels and applied an ensemble of 3D anatomy segmentation models to extract anatomical pseudo-labels. These labels were projected onto a two-dimensional plane, similar to the CXR, allowing the training of detailed semantic segmentation models for CXR without any manual annotation effort. Results: Our resulting segmentation models demonstrated remarkable performance on CXR, with a high average model-annotator agreement between two radiologists with mIoU scores of 0.93 and 0.85 for frontal and lateral anatomy, while inter-annotator agreement remained at 0.95 and 0.83 mIoU. Our anatomical segmentations allowed for the accurate extraction of relevant explainable medical features such as the cardio-thoracic-ratio. Conclusion: Our method of volumetric pseudo-labeling paired with CT projection offers a promising approach for detailed anatomical segmentation of CXR with a high agreement with human annotators. This technique may have important clinical implications, particularly in the analysis of various thoracic pathologies.

Submitted 6 June, 2023; originally announced June 2023.

Comments: 28 pages, 1 table, 10 figures

ACM Class: I.4.6; I.4.7; I.4.8

arXiv:2303.11910 [pdf, other]

360BEV: Panoramic Semantic Mapping for Indoor Bird's-Eye View

Authors: Zhifeng Teng, Jiaming Zhang, Kailun Yang, Kunyu Peng, Hao Shi, Simon Reiß, Ke Cao, Rainer Stiefelhagen

Abstract: Seeing only a tiny part of the whole is not knowing the full circumstance. Bird's-eye-view (BEV) perception, a process of obtaining allocentric maps from egocentric views, is restricted when using a narrow Field of View (FoV) alone. In this work, mapping from 360° panoramas to BEV semantics, the 360BEV task, is established for the first time to achieve holistic representations of indoor scenes in a top-down view. Instead of relying on narrow-FoV image sequences, a panoramic image with depth information is sufficient to generate a holistic BEV semantic map. To benchmark 360BEV, we present two indoor datasets, 360BEV-Matterport and 360BEV-Stanford, both of which include egocentric panoramic images and semantic segmentation labels, as well as allocentric semantic maps. Besides delving deep into different mapping paradigms, we propose a dedicated solution for panoramic semantic mapping, namely 360Mapper. Through extensive experiments, our methods achieve 44.32% and 45.78% in mIoU on both datasets respectively, surpassing previous counterparts with gains of +7.60% and +9.70% in mIoU. Code and datasets are available at the project page: https://jamycheung.github.io/360BEV.html.

Submitted 4 September, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

Comments: Code and datasets are available at the project page: https://jamycheung.github.io/360BEV.html. Accepted to WACV 2024

arXiv:2303.07126 [pdf, ps, other]

Mirror U-Net: Marrying Multimodal Fission with Multi-task Learning for Semantic Segmentation in Medical Imaging

Authors: Zdravko Marinov, Simon Reiß, David Kersting, Jens Kleesiek, Rainer Stiefelhagen

Abstract: Positron Emission Tomography (PET) and Computed Tomography (CT) are routinely used together to detect tumors. PET/CT segmentation models can automate tumor delineation, however, current multimodal models do not fully exploit the complementary information in each modality, as they either concatenate PET and CT data or fuse them at the decision level. To combat this, we propose Mirror U-Net, which replaces traditional fusion methods with multimodal fission by factorizing the multimodal representation into modality-specific branches and an auxiliary multimodal decoder. At these branches, Mirror U-Net assigns a task tailored to each modality to reinforce unimodal features while preserving multimodal features in the shared representation. In contrast to previous methods that use either fission or multi-task learning, Mirror U-Net combines both paradigms in a unified framework. We explore various task combinations and examine which parameters to share in the model. We evaluate Mirror U-Net on the AutoPET PET/CT and on the multimodal MSD BrainTumor datasets, demonstrating its effectiveness in multimodal segmentation and achieving state-of-the-art performance on both datasets. Our code will be made publicly available.

Submitted 13 March, 2023; originally announced March 2023.

Comments: 8 pages; 8 figures; 5 tables

arXiv:2303.01480 [pdf, other]

Delivering Arbitrary-Modal Semantic Segmentation

Authors: Jiaming Zhang, Ruiping Liu, Hao Shi, Kailun Yang, Simon Reiß, Kunyu Peng, Haodong Fu, Kaiwei Wang, Rainer Stiefelhagen

Abstract: Multimodal fusion can make semantic segmentation more robust. However, fusing an arbitrary number of modalities remains underexplored. To delve into this problem, we create the DeLiVER arbitrary-modal segmentation benchmark, covering Depth, LiDAR, multiple Views, Events, and RGB. Aside from this, we provide this dataset in four severe weather conditions as well as five sensor failure cases to exploit modal complementarity and resolve partial outages. To make this possible, we present the arbitrary cross-modal segmentation model CMNeXt. It encompasses a Self-Query Hub (SQ-Hub) designed to extract effective information from any modality for subsequent fusion with the RGB representation and adds only negligible amounts of parameters (~0.01M) per additional modality. On top, to efficiently and flexibly harvest discriminative cues from the auxiliary modalities, we introduce the simple Parallel Pooling Mixer (PPX). With extensive experiments on a total of six benchmarks, our CMNeXt achieves state-of-the-art performance on the DeLiVER, KITTI-360, MFNet, NYU Depth V2, UrbanLF, and MCubeS datasets, allowing to scale from 1 to 81 modalities. On the freshly collected DeLiVER, the quad-modal CMNeXt reaches up to 66.30% in mIoU with a +9.10% gain as compared to the mono-modal baseline. The DeLiVER dataset and our code are at: https://jamycheung.github.io/DELIVER.html.

Submitted 2 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR 2023. Dataset and our code are at: https://jamycheung.github.io/DELIVER.html

arXiv:2210.03416 [pdf, other]

Detailed Annotations of Chest X-Rays via CT Projection for Report Understanding

Authors: Constantin Seibold, Simon Reiß, Saquib Sarfraz, Matthias A. Fink, Victoria Mayer, Jan Sellner, Moon Sung Kim, Klaus H. Maier-Hein, Jens Kleesiek, Rainer Stiefelhagen

Abstract: In clinical radiology reports, doctors capture important information about the patient's health status. They convey their observations from raw medical imaging data about the inner structures of a patient. As such, formulating reports requires medical experts to possess wide-ranging knowledge about anatomical regions with their normal, healthy appearance as well as the ability to recognize abnormalities. This explicit grasp on both the patient's anatomy and their appearance is missing in current medical image-processing systems as annotations are especially difficult to gather. This renders the models to be narrow experts e.g. for identifying specific diseases. In this work, we recover this missing link by adding human anatomy into the mix and enable the association of content in medical reports to their occurrence in associated imagery (medical phrase grounding). To exploit anatomical structures in this scenario, we present a sophisticated automatic pipeline to gather and integrate human bodily structures from computed tomography datasets, which we incorporate in our PAXRay: A Projected dataset for the segmentation of Anatomical structures in X-Ray data. Our evaluation shows that methods that take advantage of anatomical information benefit heavily in visually grounding radiologists' findings, as our anatomical segmentations allow for up to absolute 50% better grounding results on the OpenI dataset as compared to commonly used region proposals. The PAXRay dataset is available at https://constantinseibold.github.io/paxray/.

Submitted 7 October, 2022; originally announced October 2022.

Comments: 33rd British Machine Vision Conference (BMVC 2022)

ACM Class: I.4.6; I.4.8; I.4.9

arXiv:2209.09804 [pdf, other]

Assisted Specification of Code Using Search

Authors: Steven P. Reiss

Abstract: We describe an intelligent assistant based on mining existing software repositories to help the developer interactively create checkable specifications of code. To be most useful we apply this at the subsystem level, that is chunks of code of 1000-10000 lines that can be standalone or integrated into an existing application to provide additional functionality or capabilities. The resultant specifications include both a syntactic description of what should be written and a semantic specification of what it should do, initially in the form of test cases. The generated specification is designed to be used for automatic code generation using various technologies that have been proposed including machine learning, code search, and program synthesis. Our research goal is to enable these technologies to be used effectively for creating subsystems without requiring the developer to write detailed specifications from scratch.

Submitted 20 September, 2022; originally announced September 2022.

ACM Class: D.2.2

arXiv:2207.11860 [pdf, other]

Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

Authors: Jiaming Zhang, Kailun Yang, Hao Shi, Simon Reiß, Kunyu Peng, Chaoxiang Ma, Haodong Fu, Philip H. S. Torr, Kaiwei Wang, Rainer Stiefelhagen

Abstract: In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the 360° imagery. To tackle these problems, first, we propose the upgraded Transformer for Panoramic Semantic Segmentation, i.e., Trans4PASS+, equipped with Deformable Patch Embedding (DPE) and Deformable MLP (DMLPv2) modules for handling object deformations and image distortions whenever (before or after adaptation) and wherever (shallow or deep levels). Second, we enhance the Mutual Prototypical Adaptation (MPA) strategy via pseudo-label rectification for unsupervised domain adaptive panoramic segmentation. Third, aside from Pinhole-to-Panoramic (Pin2Pan) adaptation, we create a new dataset (SynPASS) with 9,080 panoramic images, facilitating Synthetic-to-Real (Syn2Real) adaptation scheme in 360° imagery. Extensive experiments are conducted, which cover indoor and outdoor scenarios, and each of them is investigated with Pin2Pan and Syn2Real regimens. Trans4PASS+ achieves state-of-the-art performances on four domain adaptive panoramic semantic segmentation benchmarks. Code is available at https://github.com/jamycheung/Trans4PASS.

Submitted 31 May, 2024; v1 submitted 24 July, 2022; originally announced July 2022.

Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Extended version of CVPR 2022 paper arXiv:2203.01452. Code is available at https://github.com/jamycheung/Trans4PASS

arXiv:2205.07139 [pdf, other]

doi 10.1007/978-3-031-16443-9_66

Breaking with Fixed Set Pathology Recognition through Report-Guided Contrastive Training

Authors: Constantin Seibold, Simon Reiß, M. Saquib Sarfraz, Rainer Stiefelhagen, Jens Kleesiek

Abstract: When reading images, radiologists generate text reports describing the findings therein. Current state-of-the-art computer-aided diagnosis tools utilize a fixed set of predefined categories automatically extracted from these medical reports for training. This form of supervision limits the potential usage of models as they are unable to pick up on anomalies outside of their predefined set, thus, making it a necessity to retrain the classifier with additional data when faced with novel classes. In contrast, we investigate direct text supervision to break away from this closed set assumption. By doing so, we avoid noisy label extraction via text classifiers and incorporate more contextual information. We employ a contrastive global-local dual-encoder architecture to learn concepts directly from unstructured medical reports while maintaining its ability to perform free form classification. We investigate relevant properties of open set recognition for radiological data and propose a method to employ currently weakly annotated data into training. We evaluate our approach on the large-scale chest X-Ray datasets MIMIC-CXR, CheXpert, and ChestX-Ray14 for disease classification. We show that despite using unstructured medical report supervision, we perform on par with direct label supervision through a sophisticated inference setting.

Submitted 14 May, 2022; originally announced May 2022.

Comments: Provisionally Accepted at MICCAI2022

arXiv:2203.01452 [pdf, other]

Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation

Authors: Jiaming Zhang, Kailun Yang, Chaoxiang Ma, Simon Reiß, Kunyu Peng, Rainer Stiefelhagen

Abstract: Panoramic images with their 360-degree directional view encompass exhaustive information about the surrounding space, providing a rich foundation for scene understanding. To unfold this potential in the form of robust panoramic segmentation models, large quantities of expensive, pixel-wise annotations are crucial for success. Such annotations are available, but predominantly for narrow-angle, pinhole-camera images which, off the shelf, serve as sub-optimal resources for training panoramic models. Distortions and the distinct image-feature distribution in 360-degree panoramas impede the transfer from the annotation-rich pinhole domain and therefore come with a big dent in performance. To get around this domain difference and bring together semantic annotations from pinhole- and 360-degree surround-visuals, we propose to learn object deformations and panoramic image distortions in the Deformable Patch Embedding (DPE) and Deformable MLP (DMLP) components which blend into our Transformer for PAnoramic Semantic Segmentation (Trans4PASS) model. Finally, we tie together shared semantics in pinhole- and panoramic feature embeddings by generating multi-scale prototype features and aligning them in our Mutual Prototypical Adaptation (MPA) for unsupervised domain adaptation. On the indoor Stanford2D3D dataset, our Trans4PASS with MPA maintains comparable performance to fully-supervised state-of-the-arts, cutting the need for over 1,400 labeled panoramas. On the outdoor DensePASS dataset, we break state-of-the-art by 14.39% mIoU and set the new bar at 56.38%. Code will be made publicly available at https://github.com/jamycheung/Trans4PASS.

Submitted 17 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: Accepted to CVPR2022. Code will be made publicly available at https://github.com/jamycheung/Trans4PASS

arXiv:2202.05577 [pdf, other]

A Quick Repair Facility for Debugging

Authors: Steven P. Reiss, Qi Xin

Abstract: Modern development environments provide a widely used auto-correction facility for quickly repairing syntactic errors. Auto-correction cannot deal with semantic errors, which are much more difficult to repair. Automated program repair techniques, designed for repairing semantic errors, are not well-suited for interactive use while debugging, as they typically assume the existence of a high-quality test suite and take considerable time. To bridge the gap, we developed ROSE, a tool to suggest quick-yet-effective repairs of semantic errors during debugging. ROSE does not rely on a test suite. Instead, it assumes a debugger stopping point where a problem is observed. It asks the developer to quickly describe what is wrong, performs a light-weight fault localization to identify potential responsible locations, and uses a generate-and-validate strategy to produce and validate repairs. Finally, it presents the results so the developer can choose and make the appropriate repair. To assess its utility, we implemented a prototype of ROSE that works in the Eclipse IDE and applied it to two benchmarks, QuixBugs and Defects4J, for repair. ROSE was able to suggest correct repairs for 17 QuixBugs and 16 Defects4J errors in seconds.

Submitted 11 February, 2022; originally announced February 2022.

arXiv:2112.00735 [pdf, other]

doi 10.1609/aaai.v36i2.20114

Reference-guided Pseudo-Label Generation for Medical Semantic Segmentation

Authors: Constantin Seibold, Simon Reiß, Jens Kleesiek, Rainer Stiefelhagen

Abstract: Producing densely annotated data is a difficult and tedious task for medical imaging applications. To address this problem, we propose a novel approach to generate supervision for semi-supervised semantic segmentation. We argue that visually similar regions between labeled and unlabeled images likely contain the same semantics and therefore should share their label. Following this thought, we use a small number of labeled images as reference material and match pixels in an unlabeled image to the semantics of the best fitting pixel in a reference set. This way, we avoid pitfalls such as confirmation bias, common in purely prediction-based pseudo-labeling. Since our method does not require any architectural changes or accompanying networks, one can easily insert it into existing frameworks. We achieve the same performance as a standard fully supervised model on X-ray anatomy segmentation, albeit 95% fewer labeled images. Aside from an in-depth analysis of different aspects of our proposed method, we further demonstrate the effectiveness of our reference-guided learning paradigm by comparing our approach against existing methods for retinal fluid segmentation with competitive performance as we improve upon recent work by up to 15% mean IoU.

Submitted 1 December, 2021; originally announced December 2021.

Comments: 36th AAAI Conference on Artificial Intelligence 2022

MSC Class: 68T07; 68T45

ACM Class: I.5.4

arXiv:2110.13061 [pdf, other]

Where were my keys? -- Aggregating Spatial-Temporal Instances of Objects for Efficient Retrieval over Long Periods of Time

Authors: Ifrah Idrees, Zahid Hasan, Steven P. Reiss, Stefanie Tellex

Abstract: Robots equipped with situational awareness can help humans efficiently find their lost objects by leveraging spatial and temporal structure. Existing approaches to video and image retrieval do not take into account the unique constraints imposed by a moving camera with a partial view of the environment. We present a Detection-based 3-level hierarchical Association approach, D3A, to create an efficient query-able spatial-temporal representation of unique object instances in an environment. D3A performs online incremental and hierarchical learning to identify keyframes that best represent the unique objects in the environment. These keyframes are learned based on both spatial and temporal features and once identified their corresponding spatial-temporal information is organized in a key-value database. D3A allows for a variety of query patterns such as querying for objects with/without the following: 1) specific attributes, 2) spatial relationships with other objects, and 3) time slices. For a given set of 150 queries, D3A returns a small set of candidate keyframes (which occupy only 0.17% of the total sensory data) with 81.98% mean accuracy in 11.7 ms. This is 47x faster and 33% more accurate than a baseline that naively stores the object matches (detections) in the database without associating spatial-temporal information.

Submitted 25 October, 2021; originally announced October 2021.

Comments: Presented at AI-HRI symposium as part of AAAI-FSS 2021 (arXiv:2109.10836)

Report number: AIHRI/2021/19

arXiv:2107.05617 [pdf, other]

Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games

Authors: Alina Roitberg, David Schneider, Aulia Djamal, Constantin Seibold, Simon Reiß, Rainer Stiefelhagen

Abstract: Recognizing Activities of Daily Living (ADL) is a vital process for intelligent assistive robots, but collecting large annotated datasets requires time-consuming temporal labeling and raises privacy concerns, e.g., if the data is collected in a real household. In this work, we explore the concept of constructing training examples for ADL recognition by playing life simulation video games and introduce the SIMS4ACTION dataset created with the popular commercial game THE SIMS 4. We build Sims4Action by specifically executing actions-of-interest in a "top-down" manner, while the gaming circumstances allow us to freely switch between environments, camera angles and subject appearances. While ADL recognition on gaming data is interesting from the theoretical perspective, the key challenge arises from transferring it to the real-world applications, such as smart-homes or assistive robotics. To meet this requirement, Sims4Action is accompanied with a GamingToReal benchmark, where the models are evaluated on real videos derived from an existing ADL dataset. We integrate two modern algorithms for video-based activity recognition in our framework, revealing the value of life simulation video games as an inexpensive and far less intrusive source of training data. However, our results also indicate that tasks involving a mixture of gaming and real data are challenging, opening a new research direction. We will make our dataset publicly available at https://github.com/aroitberg/sims4action.

Submitted 12 July, 2021; originally announced July 2021.

arXiv:2104.13243 [pdf, other]

Every Annotation Counts: Multi-label Deep Supervision for Medical Image Segmentation

Authors: Simon Reiß, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen

Abstract: Pixel-wise segmentation is one of the most data and annotation hungry tasks in our field. Providing representative and accurate annotations is often mission-critical especially for challenging medical applications. In this paper, we propose a semi-weakly supervised segmentation algorithm to overcome this barrier. Our approach is based on a new formulation of deep supervision and student-teacher model and allows for easy integration of different supervision signals. In contrast to previous work, we show that care has to be taken how deep supervision is integrated in lower layers and we present multi-label deep supervision as the most important secret ingredient for success. With our novel training regime for segmentation that flexibly makes use of images that are either fully labeled, marked with bounding boxes, just global labels, or not at all, we are able to cut the requirement for expensive labels by 94.22% - narrowing the gap to the best fully supervised baseline to only 5% mean IoU. Our approach is validated by extensive experiments on retinal fluid segmentation and we provide an in-depth analysis of the anticipated effect each annotation type can have in boosting segmentation performance.

Submitted 27 April, 2021; originally announced April 2021.

Comments: Accepted at CVPR 2021

ACM Class: I.4.6; I.2.6; I.5.4; I.5.1

arXiv:2103.05687 [pdf, other]

Capturing Omni-Range Context for Omnidirectional Segmentation

Authors: Kailun Yang, Jiaming Zhang, Simon Reiß, Xinxin Hu, Rainer Stiefelhagen

Abstract: Convolutional Networks (ConvNets) excel at semantic segmentation and have become a vital component for perception in autonomous driving. Enabling an all-encompassing view of street-scenes, omnidirectional cameras present themselves as a perfect fit in such systems. Most segmentation models for parsing urban environments operate on common, narrow Field of View (FoV) images. Transferring these models from the domain they were designed for to 360-degree perception, their performance drops dramatically, e.g., by an absolute 30.0% (mIoU) on established test-beds. To bridge the gap in terms of FoV and structural distribution between the imaging domains, we introduce Efficient Concurrent Attention Networks (ECANets), directly capturing the inherent long-range dependencies in omnidirectional imagery. In addition to the learned attention-based contextual priors that can stretch across 360-degree images, we upgrade model training by leveraging multi-source and omni-supervised learning, taking advantage of both: Densely labeled and unlabeled data originating from multiple datasets. To foster progress in panoramic image segmentation, we put forward and extensively evaluate models on Wild PAnoramic Semantic Segmentation (WildPASS), a dataset designed to capture diverse scenes from all around the globe. Our novel model, training regimen and multi-source prediction fusion elevate the performance (mIoU) to new state-of-the-art results on the public PASS (60.2%) and the fresh WildPASS (69.0%) benchmarks.

Submitted 9 March, 2021; originally announced March 2021.

Comments: Accepted to CVPR2021

arXiv:2003.10553 [pdf, other]

RoboMem: Giving Long Term Memory to Robots

Authors: Ifrah Idrees, Steven P. Reiss, Stefanie Tellex

Abstract: Robots have the potential to improve health monitoring outcomes for the elderly by providing doctors, and caregivers with information about the person's behavior, health activities and their surrounding environment. Over the years, less work has been done to enable robots to preserve information for longer periods of time, on the order of months and years of data, and use this contextual information to answer queries. Time complexity to process this massive sensor data in a timely fashion, inability to anticipate the future queries in advance and imprecision involved in the results have been the main impediments in making progress in this area. We make a contribution by introducing RoboMem, a query answering system for health-care assistance of elderly over long term; continuous data feeds that intends to overcome the challenges of giving long term memory to robots. The design for our framework preprocesses the sensor data and stores this preprocessed data into the database. This data is updated in the database by going through successive refinements, improving its accuracy for responding to queries. If data in the database is not enough to answer a query, a small set of relevant frames (also obtained from the database) will be reprocessed to obtain the answer. [Our initial prototype of RoboMem stores 3.5MB of data in the database as compared to 535.8MB of actual video frames and with minimal data in the database it is able to fetch information fundamental to respond to queries in 0.0002 seconds on average].

Submitted 23 March, 2020; originally announced March 2020.

Comments: Poster Paper accepted in ICRA 2019 workshop - MoRobAE - Mobile Robot Assistants for the Elderly

arXiv:1909.13683 [pdf]

Continuous Flow Analysis to Detect Security Problems

Authors: Steven P. Reiss

Abstract: We introduce a tool that supports continuous flow analysis in order to detect security problems as the user edits. The tool uses abstract interpretation over both byte codes and abstract syntax trees to trace the flow of both type annotations and system states from their sources to security problems. The flow analysis achieves a balance between performance and accuracy in order to detect security vulnerabilities within seconds, and uses incremental update to provide immediate feedback to the programmer. Resource files are used to specify the specific security constraints of an application and to tune the analysis. The system can also provide detailed information to the programmer as to why it flagged a particular problem. The tool is integrated into the Code Bubbles development environment.

Submitted 30 September, 2019; originally announced September 2019.

arXiv:1903.04583 [pdf, other]

Revisiting ssFix for Better Program Repair

Authors: Qi Xin, Steven P. Reiss

Abstract: A branch of automated program repair (APR) techniques look at finding and reusing existing code for bug repair. ssFix is one of such techniques that is syntactic search-based: it searches a code database for code fragments that are syntactically similar to the bug context and reuses such retrieved code fragments to produce patches. Using such a syntactic approach, ssFix is relatively lightweight and was shown to outperform many other APR techniques. In this paper, to investigate the true effectiveness of ssFix, we conducted multiple experiments to validate ssFix's built-upon assumption (i.e., to see whether it is often possible to reuse existing code for bug repair) and evaluate its code search and code reuse approaches. Our results show that while the basic idea of ssFix, i.e., reusing existing code for bug repair, is promising, the approaches ssFix uses are not the best and can be significantly improved. We proposed a new repair technique sharpFix which follows ssFix's basic idea but differs in the code search and reuse approaches used. We evaluated sharpFix and ssFix on two bug datasets: Defects4J and Bugs.jar-ELIXIR. The results confirm that sharpFix is an improvement over ssFix. For the Defects4J dataset, sharpFix successfully repaired a total of 36 bugs and outperformed many existing repair techniques in repairing more bugs. For the Bugs.jar-ELIXIR dataset, we compared sharpFix, ssFix, and four other APR techniques, and found that sharpFix has the best repair performance. In essence, the paper shows how effective a syntactic search-based approach can be and what techniques should be used for such an approach.

Submitted 11 March, 2019; originally announced March 2019.

Comments: 12 pages

arXiv:1707.06737 [pdf]

Learning Program Component Order

Authors: Steven P. Reiss, Qi Xin

Abstract: Successful programs are written to be maintained. One aspect to this is that programmers order the components in the code files in a particular way. This is part of programming style. While the conventions for ordering are sometimes given as part of a style guideline, such guidelines are often incomplete and programmers tend to have their own more comprehensive orderings in mind. This paper defines a model for ordering program components and shows how this model can be learned from sample code. Such a model is a useful tool for a programming environment in that it can be used to find the proper location for inserting new components or for reordering files to better meet the needs of the programmer. The model is designed so that it can be fine-tuned by the programmer. The learning framework is evaluated both by looking at code with known style guidelines and by testing whether it inserts existing components into a file correctly.

Submitted 20 July, 2017; originally announced July 2017.

arXiv:1608.07745 [pdf, other]

Type-Directed Code Reuse using Integer Linear Programming

Authors: Yuepeng Wang, Yu Feng, Ruben Martins, Arati Kaushik, Isil Dillig, Steven P. Reiss

Abstract: In many common scenarios, programmers need to implement functionality that is already provided by some third party library. This paper presents a tool called Hunter that facilitates code reuse by finding relevant methods in large code bases and automatically synthesizing any necessary wrapper code. The key technical idea underlying our approach is to use types to both improve search results and guide synthesis. Specifically, our method computes similarity metrics between types and uses this information to solve an integer linear programming (ILP) problem in which the objective is to minimize the cost of synthesis. We have implemented Hunter as an Eclipse plug-in and evaluate it by (a) comparing it against S6, a state-of-the-art code reuse tool, and (b) performing a user study. Our evaluation shows that Hunter compares favorably with S6 and significantly increases programmer productivity.

Submitted 27 August, 2016; originally announced August 2016.

arXiv:1412.7977 [pdf, other]

doi 10.1209/0295-5075/104/48001

Sculplexity: Sculptures of Complexity using 3D printing

Authors: D. S. Reiss, J. J. Price, T. S. Evans

Abstract: We show how to convert models of complex systems such as 2D cellular automata into a 3D printed object. Our method takes into account the limitations inherent to 3D printing processes and materials. Our approach automates the greater part of this task, bypassing the use of CAD software and the need for manual design. As a proof of concept, a physical object representing a modified forest fire model was successfully printed. Automated conversion methods similar to the ones developed here can be used to create objects for research, for demonstration and teaching, for outreach, or simply for aesthetic pleasure. As our outputs can be touched, they may be particularly useful for those with visual disabilities.

Submitted 8 December, 2014; originally announced December 2014.

Comments: Free access to article on European Physics Letters

Report number: Imperial/TP/13/TSE/1

Journal ref: European Physics Letters 2013, 104, 48001

arXiv:cs/0310040 [pdf, ps, other]

Automated Fault Localization Using Potential Invariants

Authors: Brock Pytlik, Manos Renieris, Shriram Krishnamurthi, Steven P. Reiss

Abstract: We present a general method for fault localization based on abstracting over program traces, and a tool that implements the method using Ernst's notion of potential invariants. Our experiments so far have been unsatisfactory, suggesting that further research is needed before invariants can be used to locate faults.

Submitted 18 October, 2003; originally announced October 2003.

Comments: In M. Ronsse, K. De Bosschere (eds), proceedings of the Fifth International Workshop on Automated Debugging (AADEBUG 2003), September 2003, Ghent. cs.SE/0309027

ACM Class: D.2.5

Showing 1–25 of 25 results for author: Reiß, S