-
AirDnD -- Asynchronous In-Range Dynamic and Distributed Network Orchestration Framework
Authors:
Malsha Ashani Mahawatta Dona,
Christian Berger,
Yinan Yu
Abstract:
The increasing usage of IoT devices has generated an extensive volume of data, which has resulted in the establishment of data centers with well-structured computing infrastructure. Underutilized resources in such data centers can be reduced by monitoring tasks and offloading them across various compute units. This approach can also be used in mini mobile data ponds generated by edge devices and smart vehicles. This research aims to improve the utilization of computing resources in distributed edge devices by forming a dynamic mesh network. The nodes in the mesh network shall share their computing tasks with another node that possesses unused computing resources. The proposed method ensures that data transfer between entities is minimized. The proposed AirDnD vision will be applied to a practical scenario in which an autonomous vehicle approaches an intersection, commonly known as "looking around the corner" in related literature, and collects essential computational results from nearby vehicles to enhance its perception. The proposed solution consists of three models that transform growing amounts of geographically distributed edge devices into a living organism.
Submitted 15 July, 2024;
originally announced July 2024.
-
Implementing a hybrid approach in a knowledge engineering process to manage technical advice relating to feedback from the operation of complex sensitive equipment
Authors:
Alain Claude Hervé Berger,
Sébastien Boblet,
Thierry Cartié,
Jean-Pierre Cotton,
François Vexler
Abstract:
How can technical advice on operating experience feedback be managed efficiently in an organization that has never used knowledge engineering techniques and methods? This article explains how an industrial company in the nuclear and defense sectors adopted such an approach, adapted to its "TA KM" organizational context and falling within the ISO 30401 framework, to build a complete system with a "SARBACANES" application to support its business processes and perpetuate its know-how and expertise in a knowledge base. Over and above the classic transfer of knowledge between experts and business specialists, SARBACANES also reveals the ability of this type of engineering to deliver multi-functional operation. Modeling was accelerated by the use of a tool adapted to this type of operation: the Ardans Knowledge Maker platform.
Submitted 8 July, 2024;
originally announced July 2024.
-
QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge
Authors:
Hongwei Bran Li,
Fernando Navarro,
Ivan Ezhov,
Amirhossein Bayat,
Dhritiman Das,
Florian Kofler,
Suprosanna Shit,
Diana Waldmannstetter,
Johannes C. Paetzold,
Xiaobin Hu,
Benedikt Wiestler,
Lucas Zimmer,
Tamaz Amiranashvili,
Chinmay Prabhakar,
Christoph Berger,
Jonas Weidner,
Michelle Alonso-Basant,
Arif Rashid,
Ujjwal Baid,
Wesam Adel,
Deniz Ali,
Bhakti Baheti,
Yingbin Bai,
Ishaan Bhatt,
Sabri Can Cetindag, et al. (55 additional authors not shown)
Abstract:
Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020 and 2021. The challenge focuses on the uncertainty quantification of medical image segmentation, which considers the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations features various modalities such as MRI and CT; various organs such as the brain, prostate, kidney, and pancreas; and different image dimensions (2D vs. 3D). A total of 24 teams submitted different solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble model techniques. The obtained results indicate the importance of ensemble models, as well as the need for further research to develop efficient methods for uncertainty quantification in 3D segmentation tasks.
Submitted 24 June, 2024; v1 submitted 19 March, 2024;
originally announced May 2024.
-
Predicting and Analyzing Pedestrian Crossing Behavior at Unsignalized Crossings
Authors:
Chi Zhang,
Janis Sprenger,
Zhongjun Ni,
Christian Berger
Abstract:
Understanding and predicting pedestrian crossing behavior is essential for enhancing automated driving and improving driving safety. Predicting gap selection behavior and the use of zebra crossing enables driving systems to proactively respond and prevent potential conflicts. This task is particularly challenging at unsignalized crossings due to the ambiguous right of way, requiring pedestrians to constantly interact with vehicles and other pedestrians. This study addresses these challenges by utilizing simulator data to investigate scenarios involving multiple vehicles and pedestrians. We propose and evaluate machine learning models to predict gap selection in non-zebra scenarios and zebra crossing usage in zebra scenarios. We investigate and discuss how pedestrians' behaviors are influenced by various factors, including pedestrian waiting time, walking speed, the number of unused gaps, the largest missed gap, and the influence of other pedestrians. This research contributes to the evolution of intelligent vehicles by providing predictive models and valuable insights into pedestrian crossing behavior.
Submitted 15 April, 2024;
originally announced April 2024.
-
Engineering Safety Requirements for Autonomous Driving with Large Language Models
Authors:
Ali Nouri,
Beatriz Cabrero-Daniel,
Fredrik Törner,
Håkan Sivencrona,
Christian Berger
Abstract:
Changes and updates in the requirement artifacts, which can be frequent in the automotive domain, are a challenge for SafetyOps. Large Language Models (LLMs), with their impressive natural language understanding and generating capabilities, can play a key role in automatically refining and decomposing requirements after each update. In this study, we propose a prototype of a pipeline of prompts and LLMs that receives an item definition and outputs solutions in the form of safety requirements. This pipeline also performs a review of the requirement dataset and identifies redundant or contradictory requirements. We first identified the necessary characteristics for performing HARA and then defined tests to assess an LLM's capability in meeting these criteria. We used design science with multiple iterations and let experts from different companies evaluate each cycle quantitatively and qualitatively. Finally, the prototype was implemented at a case company and the responsible team evaluated its efficiency.
Submitted 24 March, 2024;
originally announced March 2024.
-
Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models
Authors:
Ali Nouri,
Beatriz Cabrero-Daniel,
Fredrik Törner,
Håkan Sivencrona,
Christian Berger
Abstract:
DevOps is a necessity in many industries, including the development of Autonomous Vehicles. In those settings, there are iterative activities that reduce the speed of SafetyOps cycles. One of these activities is "Hazard Analysis & Risk Assessment" (HARA), which is an essential step to start the safety requirements specification. As a potential approach to increase the speed of this step in SafetyOps, we have delved into the capabilities of Large Language Models (LLMs).
Our objective is to systematically assess their potential for application in the field of safety engineering. To that end, we propose a framework to support a higher degree of automation of HARA with LLMs. Despite our endeavors to automate as much of the process as possible, expert review remains crucial to ensure the validity and correctness of the analysis results, with necessary modifications made accordingly.
Submitted 14 March, 2024;
originally announced March 2024.
-
On STPA for Distributed Development of Safe Autonomous Driving: An Interview Study
Authors:
Ali Nouri,
Christian Berger,
Fredrik Törner
Abstract:
Safety analysis is used to identify hazards and build knowledge during the design phase of safety-relevant functions. This is especially true for complex AI-enabled and software-intensive systems such as Autonomous Drive (AD). System-Theoretic Process Analysis (STPA) is a novel method applied in safety-related fields like defense and aerospace, which is also becoming popular in the automotive industry. However, STPA assumes prerequisites that are not fully valid in automotive system engineering with distributed system development and multi-abstraction design levels. This would inhibit software developers from using STPA to analyze their software as part of a bigger system, resulting in a lack of traceability. This can be seen as a maintainability challenge in continuous development and deployment (DevOps). In this paper, STPA's different guidelines for the automotive industry, e.g. J3187/ISO21448/STPA handbook, are first compared to assess their applicability to the distributed development of complex AI-enabled systems like AD. Further, an approach to overcome the challenges of using STPA in a multi-level design context is proposed. By conducting an interview study with automotive industry experts for the development of AD, the challenges are validated and the effectiveness of the proposed approach is evaluated.
Submitted 14 March, 2024;
originally announced March 2024.
-
An Industrial Experience Report about Challenges from Continuous Monitoring, Improvement, and Deployment for Autonomous Driving Features
Authors:
Ali Nouri,
Christian Berger,
Fredrik Törner
Abstract:
Using continuous development, deployment, and monitoring (CDDM) to understand and improve applications in a customer's context is widely used for non-safety applications such as smartphone apps or web applications to enable rapid and innovative feature improvements. Having demonstrated its potential in such domains, it may have the potential to also improve the software development for automotive functions, as some OEMs described on a high level in their financial company communiqués. However, the application of a CDDM strategy also faces challenges from a process adherence and documentation perspective as required by safety-related products such as autonomous driving systems (ADS) and guided by industry standards such as ISO-26262 and ISO21448. There are publications on CDDM in safety-relevant contexts that focus on safety-critical functions on a rather generic level and thus not specifically on ADS or automotive, or that concentrate only on software and hence miss the particular context of an automotive OEM: well-established legacy processes and the need for their adaptation, and aspects originating from the role of being a system integrator for software/software, hardware/hardware, and hardware/software. In this paper, particular challenges from the automotive domain to better adopt CDDM are identified and discussed to shed light on research gaps to enhance CDDM, especially for the software development of safe ADS. The challenges are identified from today's well-established industrial ways of working by conducting interviews with domain experts and complemented by a literature study.
Submitted 14 March, 2024;
originally announced March 2024.
-
Evaluation of Out-of-Distribution Detection Performance on Autonomous Driving Datasets
Authors:
Jens Henriksson,
Christian Berger,
Stig Ursing,
Markus Borg
Abstract:
Safety measures need to be systematically investigated to determine the extent to which they evaluate the intended performance of Deep Neural Networks (DNNs) for critical applications. Due to a lack of verification methods for high-dimensional DNNs, a trade-off is needed between accepted performance and handling of out-of-distribution (OOD) samples.
This work evaluates rejecting outputs from semantic segmentation DNNs by applying a Mahalanobis distance (MD) based on the most probable class-conditional Gaussian distribution for the predicted class as an OOD score. The evaluation follows three DNNs trained on the Cityscapes dataset and tested on four automotive datasets and finds that classification risk can drastically be reduced at the cost of pixel coverage, even when applied on unseen datasets. The applicability of our findings will support legitimizing safety measures and motivate their usage when arguing for safe usage of DNNs in automotive perception.
Submitted 30 January, 2024;
originally announced January 2024.
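For readers unfamiliar with the OOD score named in the abstract above, the standard definition of the Mahalanobis distance can be written out as follows. This is a sketch of the textbook formula only; the feature vector $x$, the per-class mean $\mu_{\hat{c}}$ and covariance $\Sigma_{\hat{c}}$, and the rejection threshold $\tau$ are notational assumptions, and the paper may estimate or share these parameters differently.

```latex
% Mahalanobis distance of a feature vector x to the Gaussian fitted for the predicted class \hat{c}
D_M(x) = \sqrt{(x - \mu_{\hat{c}})^{\top} \, \Sigma_{\hat{c}}^{-1} \, (x - \mu_{\hat{c}})}
% Assumed rejection rule: treat the prediction as OOD when the distance exceeds a threshold
\text{reject}(x) \iff D_M(x) > \tau
```

Raising $\tau$ keeps more pixels but admits more risky predictions, which corresponds to the trade-off between classification risk and pixel coverage that the abstract reports.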
-
Prompt Smells: An Omen for Undesirable Generative AI Outputs
Authors:
Krishna Ronanki,
Beatriz Cabrero-Daniel,
Christian Berger
Abstract:
Recent Generative Artificial Intelligence (GenAI) trends focus on various applications, including creating stories, illustrations, poems, articles, computer code, music compositions, and videos. Extrinsic hallucinations are a critical limitation of such GenAI, which can lead to significant challenges in achieving and maintaining the trustworthiness of GenAI. In this paper, we propose two new concepts that we believe will aid the research community in addressing limitations associated with the application of GenAI models. First, we propose a definition for the "desirability" of GenAI outputs and three factors which are observed to influence it. Second, drawing inspiration from Martin Fowler's code smells, we propose the concept of "prompt smells" and the adverse effects they are observed to have on the desirability of GenAI outputs. We expect our work will contribute to the ongoing conversation about the desirability of GenAI outputs and help advance the field in a meaningful way.
Submitted 23 January, 2024;
originally announced January 2024.
-
Systematic Evaluation of Applying Space-Filling Curves to Automotive Maneuver Detection
Authors:
Christian Berger,
Beatriz Cabrero-Daniel,
M. Cagri Kaya,
Maryam Esmaeili Darestani,
Hannah Shiels
Abstract:
Identifying driving maneuvers plays an essential role on board vehicles to monitor driving and driver states, as well as off board, for example to train and evaluate machine learning algorithms for automated driving. Maneuvers can be characterized by vehicle kinematics or data from its surroundings including other traffic participants. Extracting relevant maneuvers therefore requires analyzing time-series of (i) structured, multi-dimensional kinematic data, and (ii) unstructured, large data samples for video, radar, or LiDAR sensors. However, such data analysis requires scalable and computationally efficient approaches, especially for non-annotated data. In this paper, we present a maneuver detection approach that does not use GPS data and is based on two variants of space-filling curves (Z-order and Hilbert) to detect maneuvers when passing roundabouts. We systematically evaluate their respective performance by including permutations of selections of kinematic signals at varying frequencies and compare them with two alternative baselines: all manually identified roundabouts, and roundabouts that are marked by geofences. We find that encoding just longitudinal and lateral accelerations sampled at 10Hz using a Hilbert space-filling curve already successfully identifies roundabout maneuvers, which makes it possible to avoid the use of potentially sensitive signals such as GPS locations to comply with data protection and privacy regulations like GDPR.
Submitted 23 October, 2023;
originally announced November 2023.
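To illustrate how a space-filling curve can turn two kinematic signals into a single value, here is a minimal sketch of the Z-order (Morton) variant, the simpler of the two curves named in the abstract above. It is not the authors' implementation: the function names, the acceleration range of ±10 m/s², and the 8-bit quantization are assumptions chosen only for illustration.

```python
def quantize(value, lo, hi, bits):
    """Map a float in [lo, hi] to an integer in [0, 2**bits - 1], clamping outliers."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def morton_encode(x, y, bits):
    """Interleave the low `bits` bits of x and y: x fills even positions, y odd positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def encode_sample(accel_lon, accel_lat, limit=10.0, bits=8):
    """Encode one (longitudinal, lateral) acceleration pair as a single Z-order index.

    `limit` (assumed +/-10 m/s^2) and `bits` (assumed 8) are illustrative parameters.
    """
    qx = quantize(accel_lon, -limit, limit, bits)
    qy = quantize(accel_lat, -limit, limit, bits)
    return morton_encode(qx, qy, bits)


if __name__ == "__main__":
    # A short, made-up series of samples becomes a one-dimensional series of indices.
    samples = [(0.2, 0.1), (0.4, 1.5), (0.1, 2.8)]
    print([encode_sample(lon, lat) for lon, lat in samples])
```

A time series of such indices is one-dimensional, so it can be searched for maneuver patterns with simple range or sequence comparisons. A Hilbert curve would replace `morton_encode` with a different index mapping while keeping the same quantization step.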
-
Requirements Engineering using Generative AI: Prompts and Prompting Patterns
Authors:
Krishna Ronanki,
Beatriz Cabrero-Daniel,
Jennifer Horkoff,
Christian Berger
Abstract:
[Context]: Companies are increasingly recognizing the importance of automating Requirements Engineering (RE) tasks due to their resource-intensive nature. The advent of GenAI has made these tasks more amenable to automation, thanks to its ability to understand and interpret context effectively. [Problem]: However, in the context of GenAI, prompt engineering is a critical factor for success. Despite this, we currently lack tools and methods to systematically assess and determine the most effective prompt patterns to employ for a particular RE task. [Method]: Two tasks related to requirements, specifically requirement classification and tracing, were automated using the GPT-3.5 turbo API. The performance evaluation involved assessing various prompts created using 5 prompt patterns and implemented programmatically to perform the selected RE tasks, focusing on metrics such as precision, recall, accuracy, and F-Score. [Results]: This paper evaluates the effectiveness of the 5 prompt patterns' ability to make GPT-3.5 turbo perform the selected RE tasks and offers recommendations on which prompt pattern to use for a specific RE task. Additionally, it also provides an evaluation framework as a reference for researchers and practitioners who want to evaluate different prompt patterns for different RE tasks.
Submitted 7 November, 2023;
originally announced November 2023.
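The metrics named in the abstract above (precision, recall, accuracy, and F-Score) can be computed for a binary classification task as in the following sketch. The function name, the label strings, and the choice of the F1 variant of the F-Score are assumptions for illustration; the paper's own evaluation framework is not reproduced here.

```python
def classification_metrics(y_true, y_pred, positive):
    """Compute precision, recall, accuracy, and F1 for one positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    # Guard every division so empty or one-sided inputs yield 0.0 instead of an error.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels: "F" = functional requirement, "NF" = non-functional.
    truth = ["F", "F", "NF", "NF"]
    predicted = ["F", "NF", "NF", "F"]
    print(classification_metrics(truth, predicted, positive="F"))
```

Running the same function over the outputs produced by each prompt pattern would give one row of metrics per pattern, which is the kind of comparison the abstract describes.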
-
Semantic Modelling of Organizational Knowledge as a Basis for Enterprise Data Governance 4.0 -- Application to a Unified Clinical Data Model
Authors:
Miguel AP Oliveira,
Stephane Manara,
Bruno Molé,
Thomas Muller,
Aurélien Guillouche,
Lysann Hesske,
Bruce Jordan,
Gilles Hubert,
Chinmay Kulkarni,
Pralipta Jagdev,
Cedric R. Berger
Abstract:
Individuals and organizations cope with an always-growing amount of data, which is heterogeneous in its contents and formats. An adequate data management process yielding data quality and control over its lifecycle is a prerequisite to getting value out of this data and minimizing inherent risks related to multiple usages. Common data governance frameworks rely on people, policies, and processes that fall short of the overwhelming complexity of data. Yet, harnessing this complexity is necessary to achieve high-quality standards. The latter will condition any downstream data usage outcome, including generative artificial intelligence trained on this data. In this paper, we report our concrete experience establishing a simple, cost-efficient framework that enables metadata-driven, agile and (semi-)automated data governance (i.e. Data Governance 4.0). We explain how we implement and use this framework to integrate 25 years of clinical study data at an enterprise scale in a fully productive environment. The framework encompasses both methodologies and technologies leveraging semantic web principles. We built a knowledge graph describing avatars of data assets in their business context, including governance principles. Multiple ontologies articulated by an enterprise upper ontology enable key governance actions such as FAIRification, lifecycle management, definition of roles and responsibilities, lineage across transformations and provenance from source systems. This metadata model is the keystone to data governance 4.0: a semi-automatised data management process that considers the business context in an agile manner to adapt governance constraints to each use case and dynamically tune it based on business changes.
Submitted 23 November, 2023; v1 submitted 20 October, 2023;
originally announced November 2023.
-
Textiverse: A Scalable Visual Analytics System for Exploring Geotagged and Timestamped Text Corpora
Authors:
Caroline Berger,
Hanjun Xian,
Krishna Madhavan,
Niklas Elmqvist
Abstract:
We propose Textiverse, a big data approach for mining geotagged timestamped textual data on a map, such as for Twitter feeds, crime reports, or restaurant reviews. We use a scalable data management pipeline that extracts keyphrases from online databases in parallel. We speed up this time-consuming step so that it outpaces the content creation rate of popular social media. The result is presented in a web-based interface that integrates with Google Maps to visualize textual content of massive scale. The visual design is based on aggregating spatial regions into discrete sites and rendering each such site as a circular tag cloud. To demonstrate the intended use of our technique, we first show how it can be used to characterize the U.S. National Science Foundation funding status based on all 489,151 awards. We then apply the same technique to visually represent a more spatially scattered and linguistically informal dataset: 1.2 million Twitter posts about the Android mobile operating system.
Submitted 11 October, 2023;
originally announced October 2023.
-
Scalable Performance Evaluation of Byzantine Fault-Tolerant Systems Using Network Simulation
Authors:
Christian Berger,
Sadok Ben Toumia,
Hans P. Reiser
Abstract:
Recent Byzantine fault-tolerant (BFT) state machine replication (SMR) protocols increasingly focus on scalability to meet the requirements of distributed ledger technology (DLT). Validating the performance of scalable BFT protocol implementations requires careful evaluation. Our solution uses network simulations to forecast the performance of BFT protocols while experimentally scaling the environment. Our method seamlessly plug-and-plays existing BFT implementations into the simulation without requiring code modification or re-implementation, which is often time-consuming and error-prone. Furthermore, our approach is also significantly cheaper than experiments with real large-scale cloud deployments. In this paper, we first explain our simulation architecture, which enables scalable performance evaluations of BFT systems through high performance network simulations. We validate the accuracy of these simulations for predicting the performance of BFT systems by comparing simulation results with measurements of real systems deployed on cloud infrastructures. We found that simulation results display a reasonable approximation at a larger system scale, because the network eventually becomes the dominating factor limiting system performance. In the second part of our paper, we use our simulation method to evaluate the performance of PBFT and BFT protocols from the blockchain generation, such as HotStuff and Kauri, in large-scale and realistic wide-area network scenarios, as well as under induced faults.
Submitted 29 September, 2023;
originally announced September 2023.
-
Investigating ChatGPT's Potential to Assist in Requirements Elicitation Processes
Authors:
Krishna Ronanki,
Christian Berger,
Jennifer Horkoff
Abstract:
Natural Language Processing (NLP) for Requirements Engineering (RE) (NLP4RE) seeks to apply NLP tools, techniques, and resources to the RE process to increase the quality of the requirements. There is little research involving the utilization of Generative AI-based NLP tools and techniques for requirements elicitation. In recent times, Large Language Models (LLM) like ChatGPT have gained significant recognition due to their notably improved performance in NLP tasks. To explore the potential of ChatGPT to assist in requirements elicitation processes, we formulated six questions to elicit requirements using ChatGPT. Using the same six questions, we conducted interview-based surveys with five RE experts from academia and industry and collected 30 responses containing requirements. The quality of these 36 responses (human-formulated + ChatGPT-generated) was evaluated over seven different requirements quality attributes by another five RE experts through a second round of interview-based surveys. In comparing the quality of requirements generated by ChatGPT with those formulated by human experts, we found that ChatGPT-generated requirements are highly Abstract, Atomic, Consistent, Correct, and Understandable. Based on these results, we present the most pressing issues related to LLMs and what future research should focus on to leverage the emergent behaviour of LLMs more effectively in natural language-based RE activities.
Submitted 14 July, 2023;
originally announced July 2023.
-
ChatGPT as a tool for User Story Quality Evaluation: Trustworthy Out of the Box?
Authors:
Krishna Ronanki,
Beatriz Cabrero-Daniel,
Christian Berger
Abstract:
In Agile software development, user stories play a vital role in capturing and conveying end-user needs, prioritizing features, and facilitating communication and collaboration within development teams. However, automated methods for evaluating user stories require training in NLP tools and can be time-consuming to develop and integrate. This study explores using ChatGPT for user story quality evaluation and compares its performance with an existing benchmark. Our study shows that ChatGPT's evaluation aligns well with human evaluation, and we propose a "best of three" strategy to improve its output stability. We also discuss the concept of trustworthiness in AI and its implications for non-experts using ChatGPT's unprocessed outputs. Our research contributes to understanding the reliability and applicability of AI in user story evaluation and offers recommendations for future research.
Submitted 21 June, 2023;
originally announced June 2023.
-
RE-centric Recommendations for the Development of Trustworthy(er) Autonomous Systems
Authors:
Krishna Ronanki,
Beatriz Cabrero-Daniel,
Jennifer Horkoff,
Christian Berger
Abstract:
Complying with the EU AI Act (AIA) guidelines while developing and implementing AI systems will soon be mandatory within the EU. However, practitioners lack actionable instructions to operationalise ethics during AI systems development. A literature review of different ethical guidelines revealed inconsistencies in the principles addressed and the terminology used to describe them. Furthermore, requirements engineering (RE), which is identified to foster trustworthiness in the AI development process from the early stages, was observed to be absent in a lot of frameworks that support the development of ethical and trustworthy AI. This incongruous phrasing, combined with a lack of concrete development practices, makes trustworthy AI development harder. To address this concern, we formulated a comparison table for the terminology used and the coverage of the ethical AI principles in major ethical AI guidelines. We then examined the applicability of ethical AI development frameworks for performing effective RE during the development of trustworthy AI systems. A tertiary review and meta-analysis of literature discussing ethical AI frameworks revealed their limitations when developing trustworthy AI. Based on our findings, we propose recommendations to address such limitations during the development of trustworthy AI.
Submitted 5 January, 2024; v1 submitted 29 May, 2023;
originally announced June 2023.
-
Chasing the Speed of Light: Low-Latency Planetary-Scale Adaptive Byzantine Consensus
Authors:
Christian Berger,
Lívio Rodrigues,
Hans P. Reiser,
Vinicius Cogo,
Alysson Bessani
Abstract:
Blockchain technology has sparked renewed interest in planetary-scale Byzantine fault-tolerant (BFT) state machine replication (SMR). While recent works have mainly focused on improving the scalability and throughput of these protocols, few have addressed latency. We present FlashConsensus, a novel transformation for optimizing the latency of quorum-based BFT consensus protocols. FlashConsensus uses an adaptive resilience threshold that enables faster transaction ordering when the system contains few faulty replicas. Our construction exploits adaptive weighted replication to automatically assign high voting power to the fastest replicas, forming small quorums that significantly speed up consensus. Even when using such quorums with a smaller resilience threshold, FlashConsensus still satisfies the standard SMR safety and liveness guarantees with optimal resilience, thanks to the judicious integration of abortable SMR and BFT forensics techniques. Our experiments with tens of replicas spread across all continents show that FlashConsensus can order transactions with finality in less than 0.4s, half the time of a PBFT-like protocol (with optimal consensus latency) in the same network, and matching the latency of this protocol running on the theoretically best possible internet links (transmitting at 67% of the speed of light).
Submitted 24 May, 2023;
originally announced May 2023.
-
ZEBRA: Z-order Curve-based Event Retrieval Approach to Efficiently Explore Automotive Data
Authors:
Christian Berger,
Lukas Birkemeyer
Abstract:
Evaluating the performance of software for automated vehicles is predominantly driven by data collected from the real world. While professional test drivers are supported with technical means to semi-automatically annotate driving maneuvers to allow better event identification, simple data loggers in large vehicle fleets typically lack automatic and detailed event classification and hence, extra effort is needed when post-processing such data. Yet, the data quality from professional test drivers is apparently higher than the one from large fleets where labels are missing, but the non-annotated data set from large vehicle fleets is much more representative for typical, realistic driving scenarios to be handled by automated vehicles. However, while growing the data from large fleets is relatively simple, adding valuable annotations during post-processing has become increasingly expensive. In this paper, we leverage Z-order space-filling curves to systematically reduce data dimensionality while preserving domain-specific data properties, which allows us to explore even large-scale field data sets to spot interesting events orders of magnitude faster than processing time-series data directly. Furthermore, the proposed concept is based on an analytical approach, which preserves explainability for the identified events.
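As a rough illustration of the Z-order idea mentioned in the abstract, the sketch below maps two quantized signal values onto a single Morton code by interleaving their bits, so that points close in both dimensions tend to receive nearby one-dimensional keys. It assumes two non-negative integer inputs that are already quantized; the function names, the 16-bit width, and the choice of two dimensions are illustrative assumptions, not details taken from the paper.

```python
def morton_encode_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the lowest `bits` bits of x and y into one Z-order key.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    if not (0 <= x < (1 << bits) and 0 <= y < (1 << bits)):
        raise ValueError("inputs must fit into the given bit width")
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def morton_decode_2d(code: int, bits: int = 16) -> tuple:
    """Invert morton_encode_2d by de-interleaving the bits."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y
```

With such keys, a range query over a one-dimensional sorted index can stand in for a scan over the multi-dimensional time series; which signals are combined and how they are quantized is left open here.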
Submitted 20 April, 2023;
originally announced April 2023.
-
Cross or Wait? Predicting Pedestrian Interaction Outcomes at Unsignalized Crossings
Authors:
Chi Zhang,
Amir Hossein Kalantari,
Yue Yang,
Zhongjun Ni,
Gustav Markkula,
Natasha Merat,
Christian Berger
Abstract:
Predicting pedestrian behavior when interacting with vehicles is one of the most critical challenges in the field of automated driving. Pedestrian crossing behavior is influenced by various interaction factors, including time to arrival, pedestrian waiting time, the presence of zebra crossing, and the properties and personality traits of both pedestrians and drivers. However, these factors have not been fully explored for use in predicting interaction outcomes. In this paper, we use machine learning to predict pedestrian crossing behavior including pedestrian crossing decision, crossing initiation time (CIT), and crossing duration (CD) when interacting with vehicles at unsignalized crossings. Distributed simulator data are utilized for predicting and analyzing the interaction factors. Compared with the logistic regression baseline model, our proposed neural network model improves the prediction accuracy and F1 score by 4.46% and 3.23%, respectively. Our model also reduces the root mean squared error (RMSE) for CIT and CD by 21.56% and 30.14% compared with the linear regression model. Additionally, we have analyzed the importance of interaction factors, and present the results of models using fewer factors. This provides information for model selection in different scenarios with limited input features.
Submitted 19 March, 2024; v1 submitted 17 April, 2023;
originally announced April 2023.
-
SoK: Scalability Techniques for BFT Consensus
Authors:
Christian Berger,
Signe Schwarz-Rüsch,
Arne Vogel,
Kai Bleeke,
Leander Jehl,
Hans P. Reiser,
Rüdiger Kapitza
Abstract:
With the advancement of blockchain systems, many recent research works have proposed distributed ledger technology (DLT) that employs Byzantine fault-tolerant (BFT) consensus protocols to decide which block to append next to the ledger. Notably, BFT consensus can offer high performance, energy efficiency, and provable correctness properties, and it is thus considered a promising building block for creating highly resilient and performant blockchain infrastructures. Yet, a major ongoing challenge is to make BFT consensus applicable to large-scale environments. A large body of recent work addresses this challenge by developing novel ideas to improve the scalability of BFT consensus, thus opening the path for a new generation of BFT protocols tailored to the needs of blockchain. In this survey, we create a systematization of knowledge about the novel scalability-enhancing techniques that state-of-the-art BFT consensus protocols use. For our comparison, we closely analyze the efforts, assumptions, and trade-offs these protocols make.
Submitted 20 March, 2023;
originally announced March 2023.
-
Simulating BFT Protocol Implementations at Scale
Authors:
Christian Berger,
Sadok Ben Toumia,
Hans P. Reiser
Abstract:
The novel blockchain generation of Byzantine fault-tolerant (BFT) state machine replication (SMR) protocols focuses on scalability and performance to meet requirements of distributed ledger technology (DLT), e.g., decentralization and geographic dispersion. Validating scalability and performance of BFT protocol implementations requires careful evaluation. While experiments with real protocol deployments usually offer the best realism, they are costly and time-consuming. In this paper, we explore simulation of unmodified BFT protocol implementations as a method for cheap and rapid protocol evaluation: We can accurately forecast the performance of a BFT protocol while experimentally scaling its environment, i.e., by varying the number of nodes or geographic dispersion. Our approach is resource-friendly and preserves application-realism, since existing BFT frameworks can be simply plugged into the simulation engine without requiring code modifications or re-implementation.
Submitted 6 September, 2022; v1 submitted 31 August, 2022;
originally announced August 2022.
-
Automatic Integration of BFT State-Machine Replication into IoT Systems
Authors:
Christian Berger,
Hans P. Reiser,
Franz J. Hauck,
Florian Held,
Jörg Domaschka
Abstract:
Byzantine fault tolerance (BFT) can preserve the availability and integrity of IoT systems where single components may suffer from random data corruption or attacks that can expose them to malicious behavior. While state-of-the-art BFT state-machine replication (SMR) libraries are often tailored to fit a standard request-response interaction model with dedicated client-server roles, in our design, we employ an IoT-fit interaction model that assumes a loosely-coupled, event-driven interaction between arbitrarily wired IoT components. In this paper, we explore the possibility of automating and streamlining the complete process of integrating BFT SMR into a component-based IoT execution environment. Our main goal is providing simplicity for the developer: We strive to decouple the specification of a logical application architecture from the difficulty of incorporating BFT replication mechanisms into it. Thus, our contributions address the automated configuration, re-wiring and deployment of IoT components, and their replicas, within a component-based, event-driven IoT platform.
Submitted 6 July, 2022; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Digital Sovereignty and Software Engineering for the IoT-laden, AI/ML-driven Era
Authors:
Christian Berger
Abstract:
Today's software engineering already needs to deal with challenges originating from the multidisciplinarity that is required to realize IoT products: Many variants consist of sensor/actuator-powered systems that already today use AI/ML systems to better cope with the unstructuredness of their intended operational design domain (ODD), while, at the same time, such systems need to be monitored, diagnosed, maintained, and evolved using cloud-powered dashboards and data analytics pipelines that process, aggregate, and analyze countless data points preferably in real-time. This position paper discusses selected aspects related to Digital Sovereignty from a software engineering's perspective for the IoT-laden, AI/ML-driven era: While we can undeniably expect more and more benefits from such solutions, a specific light shall be shed in particular on challenges and responsibilities at design- and operation-time that, at minimum, prepare for and enable or, even better, preserve and extend digital sovereignty from a software engineering's perspective.
Submitted 27 May, 2022;
originally announced May 2022.
-
Understanding the Impact of Edge Cases from Occluded Pedestrians for ML Systems
Authors:
Jens Henriksson,
Christian Berger,
Stig Ursing
Abstract:
Machine learning (ML)-enabled approaches are considered a substantial support technique for the detection and classification of obstacles of traffic participants in self-driving vehicles. Major breakthroughs have been demonstrated in the past few years, even covering the complete end-to-end data processing chain from sensory inputs through perception and planning to vehicle control of acceleration, braking, and steering. YOLO (you-only-look-once) is a state-of-the-art perception neural network (NN) architecture providing object detection and classification through bounding box estimations on camera images. As the NN is trained on well-annotated images, in this paper we study the variations of confidence levels from the NN when tested on hand-crafted occlusion added to a test set. We compare regular pedestrian detection to upper and lower body detection. Our findings show that the two NNs using only partial information perform similarly well to the NN for the full body when the full body NN's performance is 0.75 or better. Furthermore, and as expected, the network that is trained only on the lower half of the body is least prone to disturbances from occlusions of the upper half, and vice versa.
Submitted 26 April, 2022;
originally announced April 2022.
-
Performance Analysis of Out-of-Distribution Detection on Trained Neural Networks
Authors:
Jens Henriksson,
Christian Berger,
Markus Borg,
Lars Tornberg,
Sankar Raman Sathyamoorthy,
Cristofer Englund
Abstract:
Several areas have been improved with Deep Learning during the past years. Implementing Deep Neural Networks (DNN) for non-safety-related applications has shown remarkable achievements over the past years; however, for using DNNs in safety-critical applications, we are missing approaches for verifying the robustness of such models. A common challenge for DNNs occurs when exposed to out-of-distribution samples that are outside of the scope of a DNN, but which result in high confidence outputs despite no prior knowledge of such input.
In this paper, we analyze three methods that separate in- and out-of-distribution data, called supervisors, on four well-known DNN architectures. We find that the outlier detection performance improves with the quality of the model. We also analyse the performance of the particular supervisors during the training procedure by applying the supervisor at a predefined interval to investigate its performance as the training proceeds. We observe that understanding the relationship between training results and supervisor performance is crucial to improve the model's robustness and to indicate which input samples require further measures to improve the robustness of a DNN. In addition, our work paves the road towards an instrument for safety argumentation for safety-critical applications. This paper is an extended version of our previous work presented at 2019 SEAA (cf. [1]); here, we elaborate on the used metrics, add an additional supervisor and test them on two additional datasets.
Submitted 26 April, 2022;
originally announced April 2022.
-
Learning the Pedestrian-Vehicle Interaction for Pedestrian Trajectory Prediction
Authors:
Chi Zhang,
Christian Berger
Abstract:
In this paper, we study the interaction between pedestrians and vehicles and propose a novel neural network structure called the Pedestrian-Vehicle Interaction (PVI) extractor for learning the pedestrian-vehicle interaction. We implement the proposed PVI extractor on both sequential approaches (long short-term memory (LSTM) models) and non-sequential approaches (convolutional models). We use the Waymo Open Dataset that contains real-world urban traffic scenes with both pedestrian and vehicle annotations. For the LSTM-based models, our proposed model is compared with Social-LSTM and Social-GAN, and using our proposed PVI extractor reduces the average displacement error (ADE) and the final displacement error (FDE) by 7.46% and 5.24%, respectively. For the convolutional-based models, our proposed model is compared with Social-STGCNN and Social-IWSTCNN, and using our proposed PVI extractor reduces the ADE and FDE by 2.10% and 1.27%, respectively. The results show that the pedestrian-vehicle interaction influences pedestrian behavior, and the models using the proposed PVI extractor can capture the interaction between pedestrians and vehicles, and thereby outperform the compared methods.
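For readers unfamiliar with the two metrics, the following sketch computes ADE and FDE for a single predicted trajectory under their usual definitions in the trajectory-prediction literature: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is the distance at the final step. The function name and the 2D tuple representation are assumptions for illustration; the paper's exact evaluation protocol (horizon, units, averaging over agents) is not stated in the abstract.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory.

    pred and truth are equally long, non-empty sequences of (x, y)
    positions, one per predicted time step.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # average over all time steps
    fde = dists[-1]                # error at the final time step only
    return ade, fde
```

A relative reduction such as those reported above would then be computed as `(baseline - proposed) / baseline` on the averaged metric.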
Submitted 25 April, 2022; v1 submitted 10 February, 2022;
originally announced February 2022.
-
Evaluating Blockchain Application Requirements and their Satisfaction in Hyperledger Fabric
Authors:
Sadok Ben Toumia,
Christian Berger,
Hans P. Reiser
Abstract:
Blockchain applications may offer better fault-tolerance, integrity, traceability and transparency compared to centralized solutions. Despite these benefits, few businesses switch to blockchain-based applications. Industries worry that the current blockchain implementations do not meet their requirements, e.g., when it comes to scalability, throughput or latency. Hyperledger Fabric (HLF) is a permissioned blockchain infrastructure that aims to meet enterprise needs and provides a highly modular and well-conceived architecture. In this paper, we survey and analyse requirements of blockchain applications with respect to their underlying infrastructure by focusing mainly on performance and resilience characteristics. Subsequently, we discuss to what extent Fabric's current design allows it to meet these requirements. We further evaluate the performance of Hyperledger Fabric 2.2 simulating different use case scenarios by comparing single with multi ordering service performance and conducting an evaluation with mixed workloads.
Submitted 30 November, 2021;
originally announced November 2021.
-
Polygon Area Decomposition Using a Compactness Metric
Authors:
Mariusz Wzorek,
Cyrille Berger,
Patrick Doherty
Abstract:
In this paper, we consider the problem of partitioning a polygon into a set of connected disjoint sub-polygons, each of which covers an area of a specific size. The work is motivated by terrain covering applications in robotics, where the goal is to find a set of efficient plans for a team of heterogeneous robots to cover a given area. Within this application, solving a polygon partitioning problem is an essential stepping stone. Unlike previous work, the problem formulation proposed in this paper also considers a compactness metric of the generated sub-polygons, in addition to the area size constraints. Maximizing the compactness of sub-polygons directly influences the optimality of any generated motion plans. Consequently, this increases the efficiency with which robotic tasks can be performed within each sub-region. The proposed problem representation is based on grid cell decomposition and a potential field model that allows for the use of standard optimization techniques. A new algorithm, the AreaDecompose algorithm, is proposed to solve this problem. The algorithm includes a number of existing and new optimization techniques combined with two post-processing methods. The approach has been evaluated on a set of randomly generated polygons which are then divided using different criteria and the results have been compared with a state-of-the-art algorithm. Results show that the proposed algorithm can efficiently divide polygon regions maximizing compactness of the resulting partitions, where the sub-polygon regions are on average up to 73% more compact in comparison to existing techniques.
Submitted 8 October, 2021;
originally announced October 2021.
-
Are we ready for beyond-application high-volume data? The Reeds robot perception benchmark dataset
Authors:
Ola Benderius,
Christian Berger,
Krister Blanch
Abstract:
This paper presents a dataset, called Reeds, for research on robot perception algorithms. The dataset aims to provide demanding benchmark opportunities for algorithms, rather than providing an environment for testing application-specific solutions. A boat was selected as a logging platform in order to provide highly dynamic kinematics. The sensor package includes six high-performance vision sensors, two long-range lidars, radar, as well as GNSS and an IMU. The spatiotemporal resolution of the sensors was maximized in order to provide large variations and flexibility in the data, offering evaluation at a large number of different resolution presets based on the resolution found in other datasets. Reeds also provides means of a fair and reproducible comparison of algorithms, by running all evaluations on a common server backend. As the dataset contains massive-scale data, the evaluation principle also serves as a way to avoid moving data unnecessarily.
It was also found that naive evaluation of algorithms, where each evaluation is computed sequentially, was not practical as the fetch and decode task of each frame would not scale well. Instead, each frame is only decoded once and then fed to all algorithms in parallel, including for GPU-based algorithms.
Submitted 16 September, 2021;
originally announced September 2021.
-
A Survey on Resilience in the IoT: Taxonomy, Classification and Discussion of Resilience Mechanisms
Authors:
Christian Berger,
Philipp Eichhammer,
Hans P. Reiser,
Jörg Domaschka,
Franz J. Hauck,
Gerhard Habiger
Abstract:
Internet-of-Things (IoT) ecosystems tend to grow both in scale and complexity as they consist of a variety of heterogeneous devices, which span over multiple architectural IoT layers (e.g., cloud, edge, sensors). Further, IoT systems increasingly demand the resilient operability of services as they become part of critical infrastructures. This leads to a broad variety of research works that aim to increase the resilience of these systems. In this paper, we create a systematization of knowledge about existing scientific efforts of making IoT systems resilient. In particular, we first discuss the taxonomy and classification of resilience and resilience mechanisms and subsequently survey state-of-the-art resilience mechanisms that have been proposed by research work and are applicable to IoT. As part of the survey, we also discuss questions that focus on the practical aspects of resilience, e.g., which constraints resilience mechanisms impose on developers when designing resilient systems by incorporating a specific mechanism into IoT systems.
Submitted 6 September, 2021;
originally announced September 2021.
-
Making Reads in BFT State Machine Replication Fast, Linearizable, and Live
Authors:
Christian Berger,
Hans P. Reiser,
Alysson Bessani
Abstract:
Practical Byzantine Fault Tolerance (PBFT) is a seminal state machine replication protocol that achieves a performance comparable to non-replicated systems in realistic environments. A reason for such high performance is the set of optimizations introduced in the protocol. One of these optimizations is read-only requests, a particular type of client request which avoids running the three-step agreement protocol and allows replicas to respond directly, thus reducing the latency of reads from five to two communication steps. Given PBFT's broad influence, its design and optimizations influenced many BFT protocols and systems that followed, e.g., BFT-SMaRt. We show, for the first time, that the read-only request optimization introduced in PBFT more than 20 years ago can violate its liveness. Notably, the problem affects not only the optimized read-only operations but also standard, totally-ordered operations. We show this weakness by presenting an attack in which a malicious leader blocks correct clients and present two solutions for patching the protocol, making read-only operations fast and correct. The two solutions were implemented on BFT-SMaRt and evaluated in different scenarios, showing their effectiveness in preventing the identified attack.
Submitted 23 July, 2021;
originally announced July 2021.
-
Confidence-based Out-of-Distribution Detection: A Comparative Study and Analysis
Authors:
Christoph Berger,
Magdalini Paschali,
Ben Glocker,
Konstantinos Kamnitsas
Abstract:
Image classification models deployed in the real world may receive inputs outside the intended data distribution. For critical applications such as clinical decision making, it is important that a model can detect such out-of-distribution (OOD) inputs and express its uncertainty. In this work, we assess the capability of various state-of-the-art approaches for confidence-based OOD detection through a comparative study and in-depth analysis. First, we leverage a computer vision benchmark to reproduce and compare multiple OOD detection methods. We then evaluate their capabilities on the challenging task of disease classification using chest X-rays. Our study shows that high performance in a computer vision task does not directly translate to accuracy in a medical imaging task. We analyse factors that affect performance of the methods between the two tasks. Our results provide useful insights for developing the next generation of OOD detection methods.
Submitted 6 July, 2021;
originally announced July 2021.
-
A Structured Analysis of the Video Degradation Effects on the Performance of a Machine Learning-enabled Pedestrian Detector
Authors:
Christian Berger
Abstract:
ML-enabled software systems have been incorporated in many public demonstrations for automated driving (AD) systems. Such solutions have also been considered as a crucial approach to aim at SAE Level 5 systems, where the passengers in such vehicles do not have to interact with the system at all anymore. Already in 2016, Nvidia demonstrated a complete end-to-end approach for training the complete software stack covering perception, planning and decision making, and the actual vehicle control. While such approaches show the great potential of such ML-enabled systems, there have also been demonstrations where already changes to single pixels in a video frame can potentially lead to completely different decisions with dangerous consequences. In this paper, a structured analysis has been conducted to explore video degradation effects on the performance of an ML-enabled pedestrian detector. Firstly, a baseline of applying YOLO to 1,026 frames with pedestrian annotations in the KITTI Vision Benchmark Suite has been established. Next, video degradation candidates for each of these frames were generated using the leading video codecs libx264, libx265, Nvidia HEVC, and AV1: 52 frames for the various compression presets for color and gray-scale frames resulting in 104 degradation candidates per original KITTI frame and 426,816 images in total. YOLO was applied to each image to compute the intersection-over-union (IoU) metric to compare the performance with the original baseline. While aggressively lossy compression settings result in significant performance drops as expected, it was also observed that some configurations actually result in slightly better IoU results compared to the baseline. The findings show that carefully chosen lossy video configurations preserve a decent performance of particular ML-enabled systems while allowing for substantial savings when storing or transmitting data.
Submitted 30 June, 2021;
originally announced June 2021.
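The intersection-over-union (IoU) metric and the image count quoted in the abstract above can be illustrated with a minimal sketch. The box format `(x_min, y_min, x_max, y_max)` and the function name `iou` are assumptions made for illustration and are not taken from the paper. The count check likewise assumes that the 104 degradation candidates per frame apply to each of the four codecs, which is the reading under which the quoted total of 426,816 works out.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption
    for illustration, not the paper's data format.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Consistency check of the figures quoted in the abstract, assuming
# 104 candidates per frame for each of the four codecs.
FRAMES = 1026
CANDIDATES_PER_FRAME = 104
CODECS = 4
total_images = FRAMES * CANDIDATES_PER_FRAME * CODECS
```

Comparing a detection on a degraded frame with the detection on the original frame would then amount to calling `iou(baseline_box, degraded_box)` per pedestrian; how detections are matched across frames is not specified in the abstract.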
-
Social-IWSTCNN: A Social Interaction-Weighted Spatio-Temporal Convolutional Neural Network for Pedestrian Trajectory Prediction in Urban Traffic Scenarios
Authors:
Chi Zhang,
Christian Berger,
Marco Dozza
Abstract:
Pedestrian trajectory prediction in urban scenarios is essential for automated driving. This task is challenging because the behavior of pedestrians is influenced by both their own history paths and the interactions with others. Previous research modeled these interactions with pooling mechanisms or aggregating with hand-crafted attention weights. In this paper, we present the Social Interaction-Weighted Spatio-Temporal Convolutional Neural Network (Social-IWSTCNN), which includes both the spatial and the temporal features. We propose a novel design, namely the Social Interaction Extractor, to learn the spatial and social interaction features of pedestrians. Most previous works used ETH and UCY datasets which include five scenes but do not cover urban traffic scenarios extensively for training and evaluation. In this paper, we use the recently released large-scale Waymo Open Dataset in urban traffic scenarios, which includes 374 urban training scenes and 76 urban testing scenes to analyze the performance of our proposed algorithm in comparison to the state-of-the-art (SOTA) models. The results show that our algorithm outperforms SOTA algorithms such as Social-LSTM, Social-GAN, and Social-STGCNN on both Average Displacement Error (ADE) and Final Displacement Error (FDE). Furthermore, our Social-IWSTCNN is 54.8 times faster in data pre-processing speed, and 4.7 times faster in total test speed than the current best SOTA algorithm Social-STGCNN.
Submitted 26 May, 2021;
originally announced May 2021.
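The two error measures named in the abstract above, Average Displacement Error (ADE) and Final Displacement Error (FDE), follow their usual definitions: the mean Euclidean distance over all predicted time steps, and the distance at the last step. The sketch below is a minimal illustration for a single pedestrian; the function name `displacement_errors` and the list-of-`(x, y)` trajectory format are assumptions and do not come from the paper.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory.

    Both arguments are equal-length, non-empty sequences of (x, y)
    positions. This input format is an assumption for illustration.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")

    # Euclidean distance between prediction and ground truth per time step.
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(predicted, ground_truth)
    ]
    ade = sum(dists) / len(dists)  # mean over all time steps
    fde = dists[-1]                # error at the final time step
    return ade, fde


pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

Benchmark figures are typically averaged over all pedestrians and scenes; that aggregation is left out here.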
-
Performance Analysis of Out-of-Distribution Detection on Various Trained Neural Networks
Authors:
Jens Henriksson,
Christian Berger,
Markus Borg,
Lars Tornberg,
Sankar Raman Sathyamoorthy,
Cristofer Englund
Abstract:
Several areas have been improved with Deep Learning during the past years. For non-safety related products adoption of AI and ML is not an issue, whereas in safety critical applications, robustness of such approaches is still an issue. A common challenge for Deep Neural Networks (DNNs) occurs when they are exposed to out-of-distribution samples that are previously unseen, where DNNs can yield high confidence predictions despite no prior knowledge of the input.
In this paper we analyse two supervisors on two well-known DNNs with varied setups of training and find that the outlier detection performance improves with the quality of the training procedure. We analyse the performance of the supervisor after each epoch during the training cycle, to investigate supervisor performance as the accuracy converges. Understanding the relationship between training results and supervisor performance is valuable to improve robustness of the model and indicates where more work has to be done to create generalized models for safety critical applications.
Submitted 29 March, 2021;
originally announced March 2021.
-
Hastily Formed Knowledge Networks and Distributed Situation Awareness for Collaborative Robotics
Authors:
Cyrille Berger,
Patrick Doherty,
Piotr Rudol,
Mariusz Wzorek
Abstract:
In the context of collaborative robotics, distributed situation awareness is essential for supporting collective intelligence in teams of robots and human agents where it can be used for both individual and collective decision support. This is particularly important in applications pertaining to emergency rescue and crisis management. During operational missions, data and knowledge are gathered incrementally and in different ways by heterogeneous robots and humans. We describe this as the creation of Hastily Formed Knowledge Networks (HFKNs). The focus of this paper is the specification and prototyping of a general distributed system architecture that supports the creation of HFKNs by teams of robots and humans. The information collected ranges from low-level sensor data to high-level semantic knowledge, the latter represented in part as RDF Graphs. The framework includes a synchronization protocol and associated algorithms that allow for the automatic distribution and sharing of data and knowledge between agents. This is done through the distributed synchronization of RDF Graphs shared between agents. High-level semantic queries specified in SPARQL can be used by robots and humans alike to acquire both knowledge and data content from team members. The system is empirically validated and complexity results of the proposed algorithms are provided. Additionally, a field robotics case study is described, where a 3D mapping mission has been executed using several UAVs in a collaborative emergency rescue scenario while using the full HFKN Framework.
Submitted 25 March, 2021;
originally announced March 2021.
-
Are we using appropriate segmentation metrics? Identifying correlates of human expert perception for CNN training beyond rolling the DICE coefficient
Authors:
Florian Kofler,
Ivan Ezhov,
Fabian Isensee,
Fabian Balsiger,
Christoph Berger,
Maximilian Koerner,
Beatrice Demiray,
Julia Rackerseder,
Johannes Paetzold,
Hongwei Li,
Suprosanna Shit,
Richard McKinley,
Marie Piraud,
Spyridon Bakas,
Claus Zimmer,
Nassir Navab,
Jan Kirschke,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Metrics optimized in complex machine learning tasks are often selected in an ad-hoc manner. It is unknown how they align with human expert perception. We explore the correlations between established quantitative segmentation quality metrics and qualitative evaluations by professionally trained human raters. Therefore, we conduct psychophysical experiments for two complex biomedical semantic segmentation problems. We discover that current standard metrics and loss functions correlate only moderately with the segmentation quality assessment of experts. Importantly, this effect is particularly pronounced for clinically relevant structures, such as the enhancing tumor compartment of glioma in brain magnetic resonance and grey matter in ultrasound imaging. It is often unclear how to optimize abstract metrics, such as human expert perception, in convolutional neural network (CNN) training. To cope with this challenge, we propose a novel strategy employing techniques of classical statistics to create complementary compound loss functions to better approximate human expert perception. Across all rating experiments, human experts consistently scored computer-generated segmentations better than the human-curated reference labels. Our results, therefore, strongly question many current practices in medical image segmentation and provide meaningful cues for future research.
Submitted 2 May, 2023; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Open-Source Concealed EEG Data Collection for Brain-Computer-Interfaces -- Real-World Neural Observation Through OpenBCI Amplifiers with Around-the-Ear cEEGrid Electrodes
Authors:
Michael Thomas Knierim,
Christoph Berger,
Pierluigi Reali
Abstract:
Observing brain activity in real-world settings offers exciting possibilities like the support of physical health, mental well-being, and thought-controlled interaction modalities. The development of such applications is, however, strongly impeded by poor accessibility to research-grade neural data and by a lack of easy-to-use and comfortable sensors. This work presents the cost-effective adaptation of concealed around-the-ear EEG electrodes (cEEGrids) to the open-source OpenBCI EEG signal acquisition platform to provide a promising new toolkit. An integrated system design is described, that combines publicly available electronics components with newly designed 3D-printed parts to form an easily replicable, versatile, single-unit around-the-ear EEG recording system for prolonged use and easy application development. To demonstrate the system's feasibility, observations of experimentally induced changes in visual stimulation and mental workload are presented. Lastly, as there have been no applications of the cEEGrids to HCI contexts, a novel application area for the system is investigated, namely the observation of flow experiences through observation of temporal Alpha power changes. Support for a link between temporal Alpha power and flow is found, which indicates an efficient engagement of verbal-analytic reasoning with intensified flow experiences, and specifically intensified task absorption.
Submitted 31 January, 2021;
originally announced February 2021.
-
HPM-Frame: A Decision Framework for Executing Software on Heterogeneous Platforms
Authors:
Hugo Andrade,
Ola Benderius,
Christian Berger,
Ivica Crnkovic,
Jan Bosch
Abstract:
Heterogeneous computing is one of the most important computational solutions to meet rapidly increasing demands on system performance. It typically allows the main flow of applications to be executed on a CPU while the most computationally intensive tasks are assigned to one or more accelerators, such as GPUs and FPGAs. The refactoring of systems for execution on such platforms is highly desired but also difficult to perform, mainly due to the inherent increase in software complexity. After exploration, we have identified a current need for a systematic approach that supports engineers in the refactoring process -- from CPU-centric applications to software that is executed on heterogeneous platforms. In this paper, we introduce a decision framework that assists engineers in the task of refactoring software to incorporate heterogeneous platforms. It covers the software engineering lifecycle through five steps, consisting of questions to be answered in order to successfully address aspects that are relevant for the refactoring procedure. We evaluate the feasibility of the framework in two ways. First, we capture the practitioners' impressions, concerns and suggestions through a questionnaire. Then, we conduct a case study showing the step-by-step application of the framework using a computer vision application in the automotive domain.
Submitted 10 December, 2020; v1 submitted 1 December, 2020;
originally announced December 2020.
-
AWARE: Adaptive Wide-Area Replication for Fast and Resilient Byzantine Consensus
Authors:
Christian Berger,
Hans P. Reiser,
João Sousa,
Alysson Bessani
Abstract:
With upcoming blockchain infrastructures, world-spanning Byzantine consensus is getting practical and necessary. In geographically distributed systems, the pace at which consensus is achieved is limited by the heterogeneous latencies of connections between replicas. If deployed on a wide-area network, consensus-based systems benefit from weighted replication, an approach that utilizes extra replicas and assigns higher voting power to well-connected replicas. This enables more choice in quorum formation and replicas can leverage proportionally smaller quorums to advance, thus decreasing consensus latency. However, the system needs a solution to autonomously adjust to its environment if network conditions change or faults occur. We present Adaptive Wide-Area REplication (AWARE), a mechanism which improves the geographical scalability of consensus with nodes being widely spread across the world. Essentially, AWARE is an automated and dynamic voting weight tuning and leader positioning scheme, which supports the emergence of fast quorums in the system. It employs a reliable self-monitoring process and provides a prediction model seeking to minimize the system's consensus latency. In experiments using several AWS EC2 regions, AWARE dynamically optimizes consensus latency by self-reliantly finding a fast weight configuration yielding latency gains observed by clients located across the globe.
Submitted 3 November, 2020;
originally announced November 2020.
-
Traction Adaptive Motion Planning at the Limits of Handling
Authors:
Lars Svensson,
Monimoy Bujarbaruah,
Arpit Karsolia,
Christian Berger,
Martin Törngren
Abstract:
In this paper, we address the problem of motion planning and control at the limits of handling, under locally varying traction conditions. We propose a novel solution method where traction variations over the prediction horizon are represented by time-varying tire force constraints, derived from a predictive friction estimate. A constrained finite time optimal control problem is solved in a receding horizon fashion, imposing these time-varying constraints. Furthermore, our method features an integrated sampling augmentation procedure that addresses the problems of infeasibility and sensitivity to local minima that arise at abrupt constraint alterations, e.g., due to sudden friction changes.
We validate the proposed algorithm on a Volvo FH16 heavy-duty vehicle, in a range of critical scenarios. Experimental results indicate that traction adaptive motion planning and control improves the vehicle's capacity to avoid accidents, both when adapting to low local traction, by ensuring dynamic feasibility of the planned motion, and when adapting to high local traction, by realizing high traction utilization.
Submitted 18 November, 2021; v1 submitted 9 September, 2020;
originally announced September 2020.
-
The Automotive Take on Continuous Experimentation: A Multiple Case Study
Authors:
Federico Giaimo,
Hugo Andrade,
Christian Berger
Abstract:
Recently, a growing number of companies are focusing on achieving self-driving systems towards SAE level 3 and higher. Such systems will have much more complex capabilities than today's advanced driver assistance systems (ADAS) like adaptive cruise control and lane-keeping assistance. For complex software systems in the Web-application domain, the logical successor for Continuous Integration and Deployment (CI/CD) is known as Continuous Experimentation (CE), where product owners jointly with engineers systematically run A/B experiments on possible new features to get quantifiable data about a feature's adoption from the users. While this methodology is increasingly adopted in software-intensive companies, our study sets out to explore advantages and challenges when applying CE during the development and roll-out of functionalities required for self-driving vehicles. This paper reports on the design and results from a multiple case study that was conducted at four companies including two automotive OEMs with a long history of developing vehicles, a Tier-1 supplier, and a start-up company within the area of automated driving systems. Unanimously, all expect higher quality and fast roll-out cycles to the fleet; as major challenges, however, safety concerns next to organizational structures are mentioned.
Submitted 9 March, 2020;
originally announced March 2020.
-
Continuous Experimentation and the Cyber-Physical Systems challenge: An overview of the literature and the industrial perspective
Authors:
Federico Giaimo,
Hugo Andrade,
Christian Berger
Abstract:
Context: New software development patterns are emerging aiming at accelerating the process of delivering value. One is Continuous Experimentation, which allows instrumented software variants to be systematically deployed and run during the development phase in order to collect data from the field of application. While currently this practice is used on a daily basis on web-based systems, technical difficulties challenge its adoption in fields where computational resources are constrained, e.g., cyber-physical systems and the automotive industry. Objective: This paper aims at providing an overview of the engagement on the Continuous Experimentation practice in the context of cyber-physical systems. Method: A systematic literature review has been conducted to investigate the link between the practice and the field of application. Additionally, an industrial multiple case study is reported. Results: The study presents the current state-of-the-art regarding Continuous Experimentation in the field of cyber-physical systems. The current perspective of Continuous Experimentation in industry is also reported. Conclusions: The field has not reached maturity yet. More conceptual analyses are found than solution proposals and the state-of-practice is yet to be achieved. However, it is expected that in time an increasing number of solutions will be proposed and validated.
Submitted 11 September, 2020; v1 submitted 8 March, 2020;
originally announced March 2020.
-
Continuous Experimentation for Automotive Software on the Example of a Heavy Commercial Vehicle in Daily Operation
Authors:
Federico Giaimo,
Christian Berger
Abstract:
As the automotive industry focuses its attention more and more towards the software functionality of vehicles, techniques to deliver new software value at a fast pace are needed. Continuous Experimentation, a practice coming from the web-based systems world, is one such technique. It enables researchers and developers to use real-world data to verify their hypotheses and steer the software evolution based on performance and user preferences, reducing the reliance on simulations and guesswork. Several challenges prevent the verbatim adoption of this practice on automotive cyber-physical systems, e.g., safety concerns and limitations from computational resources; nonetheless, the automotive field is starting to take interest in this technique. This work aims at demonstrating and evaluating a prototypical Continuous Experimentation infrastructure, implemented on a distributed computational system housed in a commercial truck tractor that is used in daily operations by a logistics company on public roads. The system comprises computing units and sensors, and software deployment and data retrieval are only possible remotely via a mobile data connection due to the commercial interests of the logistics company. This study shows that the proposed experimentation process resulted in the development team being able to base software development choices on the real-world data collected during the experimental procedure. Additionally, a set of previously identified design criteria to enable Continuous Experimentation on automotive systems was discussed and their validity confirmed in the light of the presented work.
Submitted 11 September, 2020; v1 submitted 8 March, 2020;
originally announced March 2020.
-
ModelHub.AI: Dissemination Platform for Deep Learning Models
Authors:
Ahmed Hosny,
Michael Schwier,
Christoph Berger,
Evin P Örnek,
Mehmet Turan,
Phi V Tran,
Leon Weninger,
Fabian Isensee,
Klaus H Maier-Hein,
Richard McKinley,
Michael T Lu,
Udo Hoffmann,
Bjoern Menze,
Spyridon Bakas,
Andriy Fedorov,
Hugo JWL Aerts
Abstract:
Recent advances in artificial intelligence research have led to a profusion of studies that apply deep learning to problems in image analysis and natural language processing among others. Additionally, the availability of open-source computational frameworks has lowered the barriers to implementing state-of-the-art methods across multiple domains. Albeit leading to major performance breakthroughs in some tasks, effective dissemination of deep learning algorithms remains challenging, inhibiting reproducibility and benchmarking studies, impeding further validation, and ultimately hindering their effectiveness in the cumulative scientific progress. In developing a platform for sharing research outputs, we present ModelHub.AI (www.modelhub.ai), a community-driven container-based software engine and platform for the structured dissemination of deep learning models. For contributors, the engine controls data flow throughout the inference cycle, while the contributor-facing standard template exposes model-specific functions including inference, as well as pre- and post-processing. Python and RESTful application programming interfaces (APIs) enable users to interact with models hosted on ModelHub.AI and allow both researchers and developers to utilize models out-of-the-box. ModelHub.AI is domain-, data-, and framework-agnostic, catering to different workflows and contributors' preferences.
Submitted 26 November, 2019;
originally announced November 2019.
-
Towards Structured Evaluation of Deep Neural Network Supervisors
Authors:
Jens Henriksson,
Christian Berger,
Markus Borg,
Lars Tornberg,
Cristofer Englund,
Sankar Raman Sathyamoorthy,
Stig Ursing
Abstract:
Deep Neural Networks (DNNs) have improved the quality of several non-safety related products in the past years. However, before DNNs should be deployed to safety-critical applications, their robustness needs to be systematically analyzed. A common challenge for DNNs occurs when input is dissimilar to the training set, which might lead to high confidence predictions despite a lack of proper knowledge of the input. Several previous studies have proposed to complement DNNs with a supervisor that detects when inputs are outside the scope of the network. Most of these supervisors, however, are developed and tested for a selected scenario using a specific performance metric. In this work, we emphasize the need to assess and compare the performance of supervisors in a structured way. We present a framework constituted by four datasets organized in six test cases combined with seven evaluation metrics. The test cases provide varying complexity and include data from publicly available sources as well as a novel dataset consisting of images from simulated driving scenarios. The latter we plan to make publicly available. Our framework can be used to support DNN supervisor evaluation, which in turn could be used to motivate development, validation, and deployment of DNNs in safety-critical applications.
Submitted 7 March, 2019; v1 submitted 4 March, 2019;
originally announced March 2019.
-
Microservice Architectures for Advanced Driver Assistance Systems: A Case-Study
Authors:
Jannik Lotz,
Andreas Vogelsang,
Ola Benderius,
Christian Berger
Abstract:
The technological advancements of recent years have steadily increased the complexity of vehicle-internal software systems, and the ongoing development towards autonomous driving will further aggravate this situation. This is leading to a level of complexity that is pushing the limits of existing vehicle software architectures and system designs. By changing the software structure to a service-based architecture, companies in other domains successfully managed the rising complexity and created a more agile and future-oriented development process. This paper presents a case-study investigating the feasibility and possible effects of changing the software architecture for a complex driver assistance function to a microservice architecture. The complete procedure is described, starting with the description of the software-environment and the corresponding requirements, followed by the implementation, and the final testing. In addition, this paper provides a high-level evaluation of the microservice architecture for the automotive use-case. The results show that microservice architectures can reduce complexity and time-consuming process steps and prepare automotive software systems for upcoming challenges, as long as the principles of microservice architectures are carefully followed.
Submitted 25 February, 2019;
originally announced February 2019.
-
Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge
Authors:
Spyridon Bakas,
Mauricio Reyes,
Andras Jakab,
Stefan Bauer,
Markus Rempfler,
Alessandro Crimi,
Russell Takeshi Shinohara,
Christoph Berger,
Sung Min Ha,
Martin Rozycki,
Marcel Prastawa,
Esther Alberts,
Jana Lipkova,
John Freymann,
Justin Kirby,
Michel Bilello,
Hassan Fathallah-Shaykh,
Roland Wiest,
Jan Kirschke,
Benedikt Wiestler,
Rivka Colen,
Aikaterini Kotrotsou,
Pamela Lamontagne,
Daniel Marcus,
Mikhail Milchenko
, et al. (402 additional authors not shown)
Abstract:
Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.
Submitted 23 April, 2019; v1 submitted 5 November, 2018;
originally announced November 2018.