Large language models for generating rules, yay or nay?

Authors: Shangeetha Sivasothy, Scott Barnett, Rena Logothetis, Mohamed Abdelrazek, Zafaryab Rasool, Srikanth Thudumu, Zac Brannelly

Abstract: Engineering safety-critical systems such as medical devices and digital health intervention systems is complex, where long-term engagement with subject-matter experts (SMEs) is needed to capture the systems' expected behaviour. In this paper, we present a novel approach that leverages Large Language Models (LLMs), such as GPT-3.5 and GPT-4, as a potential world model to accelerate the engineering of software systems. This approach involves using LLMs to generate logic rules, which can then be reviewed and informed by SMEs before deployment. We evaluate our approach using a medical rule set, created from the pandemic intervention monitoring system in collaboration with medical professionals during COVID-19. Our experiments show that 1) LLMs have a world model that bootstraps implementation, 2) LLMs generated less number of rules compared to experts, and 3) LLMs do not have the capacity to generate thresholds for each rule. Our work shows how LLMs augment the requirements' elicitation process by providing access to a world model for domains.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 5 pages, 1 figure

arXiv:2403.05033 [pdf, other]

Quantifying Manifolds: Do the manifolds learned by Generative Adversarial Networks converge to the real data manifold

Authors: Anupam Chaudhuri, Anj Simmons, Mohamed Abdelrazek

Abstract: This paper presents our experiments to quantify the manifolds learned by ML models (in our experiment, we use a GAN model) as they train. We compare the manifolds learned at each epoch to the real manifolds representing the real data. To quantify a manifold, we study the intrinsic dimensions and topological features of the manifold learned by the ML model, how these metrics change as we continue to train the model, and whether these metrics convergence over the course of training to the metrics of the real data manifold.

Submitted 7 March, 2024; originally announced March 2024.

Comments: arXiv admin note: text overlap with arXiv:2311.13102

arXiv:2401.08138 [pdf, other]

LLMs for Test Input Generation for Semantic Caches

Authors: Zafaryab Rasool, Scott Barnett, David Willie, Stefanus Kurniawan, Sherwin Balugo, Srikanth Thudumu, Mohamed Abdelrazek

Abstract: Large language models (LLMs) enable state-of-the-art semantic capabilities to be added to software systems such as semantic search of unstructured documents and text generation. However, these models are computationally expensive. At scale, the cost of serving thousands of users increases massively affecting also user experience. To address this problem, semantic caches are used to check for answers to similar queries (that may have been phrased differently) without hitting the LLM service. Due to the nature of these semantic cache techniques that rely on query embeddings, there is a high chance of errors impacting user confidence in the system. Adopting semantic cache techniques usually requires testing the effectiveness of a semantic cache (accurate cache hits and misses) which requires a labelled test set of similar queries and responses which is often unavailable. In this paper, we present VaryGen, an approach for using LLMs for test input generation that produces similar questions from unstructured text documents. Our novel approach uses the reasoning capabilities of LLMs to 1) adapt queries to the domain, 2) synthesise subtle variations to queries, and 3) evaluate the synthesised test dataset. We evaluated our approach in the domain of a student question and answer system by qualitatively analysing 100 generated queries and result pairs, and conducting an empirical case study with an open source semantic cache. Our results show that query pairs satisfy human expectations of similarity and our generated data demonstrates failure cases of a semantic cache. Additionally, we also evaluate our approach on Qasper dataset. This work is an important first step into test input generation for semantic applications and presents considerations for practitioners when calibrating a semantic cache.

Submitted 16 January, 2024; originally announced January 2024.

Comments: Accepted in International Conference on AI Engineering Software Engineering (CAIN 2024)

arXiv:2401.06513 [pdf, other]

ML-On-Rails: Safeguarding Machine Learning Models in Software Systems A Case Study

Authors: Hala Abdelkader, Mohamed Abdelrazek, Scott Barnett, Jean-Guy Schneider, Priya Rani, Rajesh Vasa

Abstract: Machine learning (ML), especially with the emergence of large language models (LLMs), has significantly transformed various industries. However, the transition from ML model prototyping to production use within software systems presents several challenges. These challenges primarily revolve around ensuring safety, security, and transparency, subsequently influencing the overall robustness and trustworthiness of ML models. In this paper, we introduce ML-On-Rails, a protocol designed to safeguard ML models, establish a well-defined endpoint interface for different ML tasks, and clear communication between ML providers and ML consumers (software engineers). ML-On-Rails enhances the robustness of ML models via incorporating detection capabilities to identify unique challenges specific to production ML. We evaluated the ML-On-Rails protocol through a real-world case study of the MoveReminder application. Through this evaluation, we emphasize the importance of safeguarding ML models in production.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2401.05856 [pdf, other]

Seven Failure Points When Engineering a Retrieval Augmented Generation System

Authors: Scott Barnett, Stefanus Kurniawan, Srikanth Thudumu, Zach Brannelly, Mohamed Abdelrazek

Abstract: Software engineers are increasingly adding semantic search capabilities to applications using a strategy known as Retrieval Augmented Generation (RAG). A RAG system involves finding documents that semantically match a query and then passing the documents to a large language model (LLM) such as ChatGPT to extract the right answer using an LLM. RAG systems aim to: a) reduce the problem of hallucinated responses from LLMs, b) link sources/references to generated responses, and c) remove the need for annotating documents with meta-data. However, RAG systems suffer from limitations inherent to information retrieval systems and from reliance on LLMs. In this paper, we present an experience report on the failure points of RAG systems from three case studies from separate domains: research, education, and biomedical. We share the lessons learned and present 7 failure points to consider when designing a RAG system. The two key takeaways arising from our work are: 1) validation of a RAG system is only feasible during operation, and 2) the robustness of a RAG system evolves rather than designed in at the start. We conclude with a list of potential research directions on RAG systems for the software engineering community.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2311.03657 [pdf, other]

6DVF: Data Visualisation Framework for mHealth Apps

Authors: Yasmeen Anjeer Alshehhi, Khlood Ahmad, Mohamed Abdelrazek, Alessio Bonti

Abstract: The widespread of data visualisation tools on smartphones has provided end users an easy way to track their health data, leading designers to put more effort into delivering suitable visualisations. Both academia and industry have developed several frameworks to guide the creation of informative and well-designed charts, such as the visualisation and design framework and Google Material Design. Despite the typical focus on design and chart types in these existing frameworks, our study highlights the need to incorporate additional components when developing data visualisations. The needs of non-expert users, the nature of the data being represented, and the mobile environment are often not prioritised in these frameworks, leading to visualisations that do not meet user needs and expectations. To address these issues, we propose our Six-Dimensions Data Visualisation Framework (6DVF) to assist in the design and evaluation of visualisations on mobile devices. Finally, we present our initial findings from a designer evaluation experiment.

Submitted 6 November, 2023; originally announced November 2023.

Comments: 6 pages

arXiv:2310.13976 [pdf, other]

Advancing Requirements Engineering through Generative AI: Assessing the Role of LLMs

Authors: Chetan Arora, John Grundy, Mohamed Abdelrazek

Abstract: Requirements Engineering (RE) is a critical phase in software development including the elicitation, analysis, specification, and validation of software requirements. Despite the importance of RE, it remains a challenging process due to the complexities of communication, uncertainty in the early stages and inadequate automation support. In recent years, large-language models (LLMs) have shown significant promise in diverse domains, including natural language processing, code generation, and program understanding. This chapter explores the potential of LLMs in driving RE processes, aiming to improve the efficiency and accuracy of requirements-related tasks. We propose key directions and SWOT analysis for research and development in using LLMs for RE, focusing on the potential for requirements elicitation, analysis, specification, and validation. We further present the results from a preliminary evaluation, in this context.

Submitted 1 November, 2023; v1 submitted 21 October, 2023; originally announced October 2023.

arXiv:2310.11257 [pdf, other]

An empirical study of automatic wildlife detection using drone thermal imaging and object detection

Authors: Miao Chang, Tan Vuong, Manas Palaparthi, Lachlan Howell, Alessio Bonti, Mohamed Abdelrazek, Duc Thanh Nguyen

Abstract: Artificial intelligence has the potential to make valuable contributions to wildlife management through cost-effective methods for the collection and interpretation of wildlife data. Recent advances in remotely piloted aircraft systems (RPAS or "drones") and thermal imaging technology have created new approaches to collect wildlife data. These emerging technologies could provide promising alternatives to standard labourious field techniques as well as cover much larger areas. In this study, we conduct a comprehensive review and empirical study of drone-based wildlife detection. Specifically, we collect a realistic dataset of drone-derived wildlife thermal detections. Wildlife detections, including arboreal (for instance, koalas, phascolarctos cinereus) and ground dwelling species in our collected data are annotated via bounding boxes by experts. We then benchmark state-of-the-art object detection algorithms on our collected dataset. We use these experimental results to identify issues and discuss future directions in automatic animal monitoring using drones.

Submitted 17 October, 2023; originally announced October 2023.

arXiv:2303.02920 [pdf, other]

Requirements Engineering Framework for Human-centered Artificial Intelligence Software Systems

Authors: Khlood Ahmad, Mohamed Abdelrazek, Chetan Arora, Arbind Agrahari Baniya, Muneera Bano, John Grundy

Abstract: [Context] Artificial intelligence (AI) components used in building software solutions have substantially increased in recent years. However, many of these solutions focus on technical aspects and ignore critical human-centered aspects. [Objective] Including human-centered aspects during requirements engineering (RE) when building AI-based software can help achieve more responsible, unbiased, and inclusive AI-based software solutions. [Method] In this paper, we present a new framework developed based on human-centered AI guidelines and a user survey to aid in collecting requirements for human-centered AI-based software. We provide a catalog to elicit these requirements and a conceptual model to present them visually. [Results] The framework is applied to a case study to elicit and model requirements for enhancing the quality of 360 degree videos intended for virtual reality (VR) users. [Conclusion] We found that our proposed approach helped the project team fully understand the human-centered needs of the project to deliver. Furthermore, the framework helped to understand what requirements need to be captured at the initial stages against later stages in the engineering process of AI-based software.

Submitted 18 May, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

arXiv:2302.06034 [pdf, other]

Requirements Elicitation and Modelling of Artificial Intelligence Systems: An Empirical Study

Authors: Khlood Ahmad, Mohamed Abdelrazek, Chetan Arora, John Grundy, Muneera Bano

Abstract: Artificial Intelligence (AI) systems have gained significant traction in the recent past, creating new challenges in requirements engineering (RE) when building AI software systems. RE for AI practices have not been studied much and have scarce empirical studies. Additionally, many AI software solutions tend to focus on the technical aspects and ignore human-centered values. In this paper, we report on a case study for eliciting and modeling requirements using our framework and a supporting tool for human-centred RE for AI systems. Our case study is a mobile health application for encouraging type-2 diabetic people to reduce their sedentary behavior. We conducted our study with three experts from the app team -- a software engineer, a project manager and a data scientist. We found in our study that most human-centered aspects were not originally considered when developing the first version of the application. We also report on other insights and challenges faced in RE for the health application, e.g., frequently changing requirements.

Submitted 12 February, 2023; originally announced February 2023.

arXiv:2301.10404 [pdf, other]

Requirements Practices and Gaps When Engineering Human-Centered Artificial Intelligence Systems

Authors: Khlood Ahmad, Mohamed Abdelrazek, Chetan Arora, Muneera Bano, John Grundy

Abstract: [Context] Engineering Artificial Intelligence (AI) software is a relatively new area with many challenges, unknowns, and limited proven best practices. Big companies such as Google, Microsoft, and Apple have provided a suite of recent guidelines to assist engineering teams in building human-centered AI systems. [Objective] The practices currently adopted by practitioners for developing such systems, especially during Requirements Engineering (RE), are little studied and reported to date. [Method] This paper presents the results of a survey conducted to understand current industry practices in RE for AI (RE4AI) and to determine which key human-centered AI guidelines should be followed. Our survey is based on mapping existing industrial guidelines, best practices, and efforts in the literature. [Results] We surveyed 29 professionals and found most participants agreed that all the human-centered aspects we mapped should be addressed in RE. Further, we found that most participants were using UML or Microsoft Office to present requirements. [Conclusion] We identify that most of the tools currently used are not equipped to manage AI-based software, and the use of UML and Office may pose issues to the quality of requirements captured for AI. Also, all human-centered practices mapped from the guidelines should be included in RE.

Submitted 24 January, 2023; originally announced January 2023.

arXiv:2212.10693 [pdf, other]

Requirements Engineering for Artificial Intelligence Systems: A Systematic Mapping Study

Authors: Khlood Ahmad, Mohamed Abdelrazek, Chetan Arora, Muneera Bano, John Grundy

Abstract: [Context] In traditional software systems, Requirements Engineering (RE) activities are well-established and researched. However, building Artificial Intelligence (AI) based software with limited or no insight into the system's inner workings poses significant new challenges to RE. Existing literature has focused on using AI to manage RE activities, with limited research on RE for AI (RE4AI). [Objective] This paper investigates current approaches for specifying requirements for AI systems, identifies available frameworks, methodologies, tools, and techniques used to model requirements, and finds existing challenges and limitations. [Method] We performed a systematic mapping study to find papers on current RE4AI approaches. We identified 43 primary studies and analysed the existing methodologies, models, tools, and techniques used to specify and model requirements in real-world scenarios. [Results] We found several challenges and limitations of existing RE4AI practices. The findings highlighted that current RE applications were not adequately adaptable for building AI systems and emphasised the need to provide new techniques and tools to support RE4AI. [Conclusion] Our results showed that most of the empirical studies on RE4AI focused on autonomous, self-driving vehicles and managing data requirements, and areas such as ethics, trust, and explainability need further research.

Submitted 20 December, 2022; originally announced December 2022.

arXiv:2209.00838 [pdf, other]

Needs and Challenges of Personal Data Visualisations in Mobile Health Apps: User Survey

Authors: Yasmeen Anjeer Alshehhi, Mohamed Abdelrazek, Alessio Bonti

Abstract: Personal data visualisations are becoming a critical contributor toward the successful adoption of mobile health (m-health) apps. Thus, understanding user needs and challenges when using mobile personal data visualisation is essential to ensuring the adoption of these apps. This paper presents the results of a user survey to understand users' demographics, tasks, needs, and challenges of using mobile personal data visualisations. We had 56 complete responses. The survey's key findings are: 1) 51% of the users use multiple health tracking apps to achieve their goals/needs; 2) bar charts and pie charts are the most favourable charts to view health data; 3) users prefer to visualise their data using a mix of text and charts - explanation is essential. Furthermore, the top three challenges reported by the participants are: too much data displayed, overlapping text, and visualisations are not helpful in information exploration. On the other hand, users' top three encouragement factors are easy-to-read presented data, easy to navigate, and quality data are shown in the chart. Furthermore, fun and curiosity are the primary drivers of m-health tracking apps. Finally, based on survey results, we propose data visualisation designing and developing guidelines that should avoid the reported challenges and ensure user satisfaction. In future work, we plan to contextualise our study and investigate the pain and gain of data visualised in the following m-health domains: sports activities, heart monitoring, blood pressure, sleeping pattern, and eating habits.

Submitted 2 September, 2022; originally announced September 2022.

Comments: 16 pages

arXiv:2203.01374 [pdf, other]

Personal Data Visualisation on Mobile Devices: A Systematic Literature Review

Authors: Yasmeen Anjeer Alshehhi, Mohamed Abdelrazek, Alessio Bonti

Abstract: Personal data cover multiple aspects of our daily life and activities, including health, finance, social, Internet, Etc. Personal data visualisations aim to improve the user experience when exploring these large amounts of personal data and potentially provide insights to assist individuals in their decision making and achieving goals. People with different backgrounds, gender and ages usually need to access their data on their mobile devices. Although there are many personal tracking apps, the user experience when using these apps and visualisations is not evaluated yet. There are publications on personal data visualisation in the literature. Still, no systematic literature review investigated the gaps in this area to assist in developing new personal data visualisation techniques focusing on user experience. In this systematic literature review, we considered studies published between 2010 and 2020 in three online databases. We screened 195 studies and identified 29 papers that met our inclusion criteria. Our key findings are various types of personal data, and users have been addressed well in the found papers, including health, sport, diet, Driving habits, lifelogging, productivity, Etc. The user types range from naive users to expert and developers users based on the experiment's target. However, mobile device capabilities and limitations regarding data visualisation tasks have not been well addressed. There are no studies on the best practices of personal data visualisation on mobile devices, assessment frameworks for data visualisation, or design frameworks for personal data visualisations.

Submitted 17 February, 2022; originally announced March 2022.

Comments: 15 pages

arXiv:2202.10620 [pdf]

Analysis of Personal Data Visualization Reviews On mHealth Apps (short paper)

Authors: Mohamed Abdelrazek, Yasmeen Anjeer Alshehhi, Alessio Bonti

Abstract: Mobile devices, specifically, smartphones proved easy and quick access to data visualisations throughout various tracking apps. Mobile health (mHealth) apps have given non-expert users access to data visualisation to track their activities and health-related issues such as heart tracking and medication. However, no work is done on user experience or perception of data visualisations in mHealth apps. App reviews offer an indirect anchor for researchers to examine how non-expert users perceive and interact with data visualisations and identify the key challenges and recommendations. This paper introduces an analysis of app reviews on data visualisations reported on a dataset of 250 mHealth apps on the Google Play Store. We identified 8,406 comments related to data visualisations. 919 neutral comments, 1,557 negative comments and 5,930 positive comments. From analysing the user reviews, functional requirements turned out to be the most common problem across these app reviews, followed by the look and feel and then data problems. A complete set of data visualisations seem to be the most well-received capability of mHealth apps. We used these comments to develop classification and data visualisation guidelines when developing mobile data visualisations.

Submitted 21 February, 2022; originally announced February 2022.

Comments: 9 pages

arXiv:2103.01779 [pdf]

COVID-19 vs Social Media Apps: Does Privacy Really Matter?

Authors: Omar Haggag, Sherif Haggag, John Grundy, Mohamed Abdelrazek

Abstract: Many people around the world are worried about using or even downloading COVID-19 contact tracing mobile apps. The main reported concerns are centered around privacy and ethical issues. At the same time, people are voluntarily using Social Media apps at a significantly higher rate during the pandemic without similar privacy concerns compared with COVID-19 apps. To better understand these seemingly anomalous behaviours, we analysed the privacy policies, terms & conditions and data use agreements of the most commonly used COVID-19, Social Media & Productivity apps. We also developed a tool to extract and analyse nearly 2 million user reviews for these apps. Our results show that Social Media & Productivity apps actually have substantially higher privacy and ethical issues compared with the majority of COVID-19 apps. Surprisingly, lots of people indicated in their user reviews that they feel more secure as their privacy are better handled in COVID-19 apps than in Social Media apps. On the other hand, most of the COVID-19 apps are less accessible and stable compared to most Social Media apps, which negatively impacted their store ratings and led users to uninstall COVID-19 apps more frequently. Our findings suggest that in order to effectively fight this pandemic, health officials and technologists will need to better raise awareness among people about COVID-19 app behaviour and trustworthiness. This will allow people to better understand COVID-19 apps and encourage them to download and use these apps. Moreover, COVID-19 apps need many accessibility enhancements to allow a wider range of users from different societies and cultures to access to these apps.

Submitted 28 February, 2021; originally announced March 2021.

arXiv:2012.13728 [pdf, other]

doi 10.1109/TSE.2020.3047088

Requirements of API Documentation: A Case Study into Computer Vision Services

Authors: Alex Cummaudo, Rajesh Vasa, John Grundy, Mohamed Abdelrazek

Abstract: Using cloud-based computer vision services is gaining traction, where developers access AI-powered components through familiar RESTful APIs, not needing to orchestrate large training and inference infrastructures or curate/label training datasets. However, while these APIs seem familiar to use, their non-deterministic run-time behaviour and evolution is not adequately communicated to developers. Therefore, improving these services' API documentation is paramount-more extensive documentation facilitates the development process of intelligent software. In a prior study, we extracted 34 API documentation artefacts from 21 seminal works, devising a taxonomy of five key requirements to produce quality API documentation. We extend this study in two ways. Firstly, by surveying 104 developers of varying experience to understand what API documentation artefacts are of most value to practitioners. Secondly, identifying which of these highly-valued artefacts are or are not well-documented through a case study in the emerging computer vision service domain. We identify: (i) several gaps in the software engineering literature, where aspects of API documentation understanding is/is not extensively investigated; and (ii) where industry vendors (in contrast) document artefacts to better serve their end-developers. We provide a set of recommendations to enhance intelligent software documentation for both vendors and the wider research community.

Submitted 26 December, 2020; originally announced December 2020.

Comments: Early Access preprint for an upcoming issue of the IEEE Transactions on Software Engineering

arXiv:2012.03754 [pdf]

Deep Learning Methods for Credit Card Fraud Detection

Authors: Thanh Thi Nguyen, Hammad Tahir, Mohamed Abdelrazek, Ali Babar

Abstract: Credit card frauds are occurring at an ever-increasing rate and have become a major problem in the financial sector. Because of these frauds, card users are hesitant to make purchases and both merchants and financial institutions bear heavy losses. Some major challenges in credit card fraud detection involve the availability of public data, high class imbalance in data, the changing nature of frauds and the high number of false alarms. Machine learning techniques have been used to detect credit card frauds but no fraud detection systems have been able to offer great efficiency to date. Recent developments in deep learning have been applied to solve complex problems in various areas. This paper presents a thorough study of deep learning methods for the credit card fraud detection problem and compares their performance with various machine learning algorithms on three different financial datasets. Experimental results show great performance of the proposed deep learning methods against traditional machine learning models and imply that the proposed approaches can be implemented effectively for real-world credit card fraud detection systems.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2009.14683 [pdf, other]

RCM: Requirement Capturing Model for Automated Requirements Formalisation

Authors: Aya Zaki-Ismail, Mohamed Osama, Mohamed Abdelrazek, John Grundy, Amani Ibrahim

Abstract: Most existing automated requirements formalisation techniques require system engineers to (re)write their requirements using a set of predefined requirement templates with a fixed structure and known semantics to simplify the formalisation process. However, these techniques require understanding and memorising requirement templates, which usually have a fixed format, limit the requirements captured, and do not allow capture of more diverse requirements. To address these limitations, we need a reference model that captures key requirement details regardless of their structure, format or order. Then, using NLP techniques we can transform textual requirements into the reference model. Finally, using a suite of transformation rules we can then convert these requirements into formal notations. In this paper, we introduce the first and key step in this process, a Requirement Capturing Model (RCM), as a reference model, to model the key elements of a system requirement regardless of their format or order. We evaluated the robustness of the RCM model compared to 15 existing requirements representation approaches and a benchmark of 162 requirements. Our evaluation shows that RCM breakdowns support a wider range of requirements formats compared to the existing approaches. We also implemented a suite of transformation rules that transforms RCM-based requirements into temporal logic(s). In the future, we will develop an NLP-based RCM extraction technique to provide an end-to-end solution.

Submitted 30 September, 2020; originally announced September 2020.

arXiv:2005.13186 [pdf, other]

Beware the evolving 'intelligent' web service! An integration architecture tactic to guard AI-first components

Authors: Alex Cummaudo, Scott Barnett, Rajesh Vasa, John Grundy, Mohamed Abdelrazek

Abstract: Intelligent services provide the power of AI to developers via simple RESTful API endpoints, abstracting away many complexities of machine learning. However, most of these intelligent services, such as computer vision, continually learn with time. When the internals within the abstracted 'black box' become hidden and evolve, pitfalls emerge in the robustness of applications that depend on these evolving services. Without adapting the way developers plan and construct projects reliant on intelligent services, significant gaps and risks result in both project planning and development. Therefore, how can software engineers best mitigate software evolution risk moving forward, thereby ensuring that their own applications maintain quality? Our proposal is an architectural tactic designed to improve intelligent service-dependent software robustness. The tactic involves creating an application-specific benchmark dataset baselined against an intelligent service, enabling evolutionary behaviour changes to be mitigated. A technical evaluation of our implementation of this architecture demonstrates how the tactic can identify 1,054 cases of substantial confidence evolution and 2,461 cases of substantial changes to response label sets using a dataset consisting of 331 images that evolve when sent to a service.

Submitted 27 May, 2020; originally announced May 2020.

arXiv:2001.10130 [pdf, other]

Interpreting Cloud Computer Vision Pain-Points: A Mining Study of Stack Overflow

Authors: Alex Cummaudo, Rajesh Vasa, Scott Barnett, John Grundy, Mohamed Abdelrazek

Abstract: Intelligent services are becoming increasingly pervasive; application developers want to leverage the latest advances in areas such as computer vision to provide new services and products to users, and large technology firms enable this via RESTful APIs. While such APIs promise easy-to-integrate, on-demand machine intelligence, their current design, documentation and developer interface hide much of the underlying machine learning techniques that power them. Such APIs look and feel like conventional APIs but abstract away data-driven probabilistic behaviour; the implications of a developer treating these APIs in the same way as other, traditional cloud services, such as cloud storage, are of concern. The objective of this study is to determine the various pain-points developers face when implementing systems that rely on the most mature of these intelligent services, specifically those that provide computer vision. We use Stack Overflow to mine indications of the frustrations that developers appear to face when using computer vision services, classifying their questions against two recent classification taxonomies (documentation-related and general questions). We find that, unlike mature fields like mobile development, there is a contrast in the types of questions asked by developers. These indicate a shallow understanding of the underlying technology that empowers such systems. We discuss several implications of these findings via the lens of learning taxonomies to suggest how the software engineering community can improve these services and comment on the nature by which developers use them.

Submitted 27 January, 2020; originally announced January 2020.

arXiv:1908.10661 [pdf, other]

Method and System for Image Analysis to Detect Cancer

Authors: Waleed A. Yousef, Ahmed A. Abouelkahire, Deyaaeldeen Almahallawi, Omar S. Marzouk, Sameh K. Mohamed, Waleed A. Mustafa, Omar M. Osama, Ali A. Saleh, Naglaa M. Abdelrazek

Abstract: Breast cancer is the most common cancer and is the leading cause of cancer death among women worldwide. Detection of breast cancer, while it is still small and confined to the breast, provides the best chance of effective treatment. Computer Aided Detection (CAD) systems that detect cancer from mammograms will help in reducing the human errors that lead to missing breast carcinoma. The literature is rich in scientific papers on methods of CAD design, yet with no complete system architecture to deploy those methods. On the other hand, commercial CADs are developed and deployed only to vendors' mammography machines with no availability to public access. This paper presents a complete CAD; it is complete since it combines, on the one hand, the rigor of algorithm design and assessment (method), and, on the other hand, the implementation and deployment of a system architecture for public accessibility (system). (1) We develop a novel algorithm for image enhancement so that mammograms acquired from any digital mammography machine look qualitatively of the same clarity to radiologists' inspection, and are quantitatively standardized for the detection algorithms. (2) We develop novel algorithms for masses and microcalcifications detection with accuracy superior to both literature results and the majority of approved commercial systems. (3) We design, implement, and deploy a system architecture that is computationally effective to allow for deploying these algorithms to the cloud for public access.

Submitted 26 August, 2019; originally announced August 2019.

arXiv:1907.11580 [pdf, other]

Edge User Allocation with Dynamic Quality of Service

Authors: Phu Lai, Qiang He, Guangming Cui, Xiaoyu Xia, Mohamed Abdelrazek, Feifei Chen, John Hosking, John Grundy, Yun Yang

Abstract: In edge computing, edge servers are placed in close proximity to end-users. App vendors can deploy their services on edge servers to reduce network latency experienced by their app users. The edge user allocation (EUA) problem challenges service providers with the objective to maximize the number of allocated app users with hired computing resources on edge servers while ensuring their fixed quality of service (QoS), e.g., the amount of computing resources allocated to an app user. In this paper, we take a step forward to consider dynamic QoS levels for app users, which generalizes but further complicates the EUA problem, turning it into a dynamic QoS EUA problem. This enables flexible levels of quality of experience (QoE) for app users. We propose an optimal approach for finding a solution that maximizes app users' overall QoE. We also propose a heuristic approach for quickly finding sub-optimal solutions to large-scale instances of the dynamic QoS EUA problem. Experiments are conducted on a real-world dataset to demonstrate the effectiveness and efficiency of our approaches against a baseline approach and the state of the art.

Submitted 26 July, 2019; originally announced July 2019.

Comments: This manuscript has been accepted for publication at the 17th International Conference on Service-Oriented Computing and may be published in the book series Lecture Notes in Computer Science. All copyrights reserved to Springer Nature Switzerland AG, Gewerbestrasse 11, 6330 Cham, Switzerland

arXiv:1907.07748 [pdf, other]

End-to-end sensor modeling for LiDAR Point Cloud

Authors: Khaled Elmadawi, Moemen Abdelrazek, Mohamed Elsobky, Hesham M. Eraqi, Mohamed Zahran

Abstract: Advanced sensors are key to enabling self-driving car technology. Laser scanner sensors (LiDAR, Light Detection And Ranging) have become a fundamental choice due to their long range and robustness to low-light driving conditions. Designing control software for self-driving cars is a complex task to formulate explicitly in rule-based systems, thus recent approaches rely on machine learning that can learn those rules from data. The major problem with such approaches is that the amount of training data required for generalizing a machine learning model is large, and on the other hand LiDAR data annotation is very costly compared to other car sensors. An accurate LiDAR sensor model can cope with this problem. Moreover, its value goes beyond this because existing LiDAR development, validation, and evaluation platforms and processes are very costly, and virtual testing and development environments are still immature in terms of physical properties representation. In this work we propose a novel Deep Learning-based LiDAR sensor model. This method models the sensor echoes, using a Deep Neural Network to model echo pulse widths learned from real data using Polar Grid Maps (PGM). We benchmark our model performance against comprehensive real sensor data, and very promising results are achieved that set a baseline for future works.

Submitted 17 July, 2019; originally announced July 2019.

Comments: Accepted in IEEE Intelligent Transportation Systems Conference - ITSC 2019

arXiv:1906.07328 [pdf, other]

Losing Confidence in Quality: Unspoken Evolution of Computer Vision Services

Authors: Alex Cummaudo, Rajesh Vasa, John Grundy, Mohamed Abdelrazek, Andrew Cain

Abstract: Recent advances in artificial intelligence (AI) and machine learning (ML), such as computer vision, are now available as intelligent services and their accessibility and simplicity are compelling. Multiple vendors now offer this technology as cloud services and developers want to leverage these advances to provide value to end-users. However, there is no firm investigation into the maintenance and evolution risks arising from use of these intelligent services; in particular, their behavioural consistency and transparency of their functionality. We evaluated the responses of three different intelligent services (specifically computer vision) over 11 months using 3 different data sets, verifying responses against the respective documentation and assessing evolution risk. We found that there are: (1) inconsistencies in how these services behave; (2) evolution risk in the responses; and (3) a lack of clear communication that documents these risks and inconsistencies. We propose a set of recommendations to both developers and intelligent service providers to inform risk and assist maintainability.

Submitted 30 July, 2019; v1 submitted 17 June, 2019; originally announced June 2019.

arXiv:1904.05553 [pdf, other]

doi 10.1007/978-3-030-03596-9_15

Optimal Edge User Allocation in Edge Computing with Variable Sized Vector Bin Packing

Authors: Phu Lai, Qiang He, Mohamed Abdelrazek, Feifei Chen, John Hosking, John Grundy, Yun Yang

Abstract: In mobile edge computing, edge servers are geographically distributed around base stations placed near end-users to provide highly accessible and efficient computing capacities and services. In the mobile edge computing environment, a service provider can deploy its service on hired edge servers to reduce end-to-end service delays experienced by its end-users allocated to those edge servers. An optimal deployment must maximize the number of allocated end-users and minimize the number of hired edge servers while ensuring the required quality of service for end-users. In this paper, we model the edge user allocation (EUA) problem as a bin packing problem, and introduce a novel, optimal approach to solving the EUA problem based on the Lexicographic Goal Programming technique. We have conducted three series of experiments to evaluate the proposed approach against two representative baseline approaches. Experimental results show that our approach significantly outperforms the other two approaches.

Submitted 11 April, 2019; originally announced April 2019.

Showing 1–26 of 26 results for author: Abdelrazek, M