doi 10.5281/zenodo.10776102

LSTM-Based Text Generation: A Study on Historical Datasets

Authors: Mustafa Abbas Hussein Hussein, Serkan Savaş

Abstract: This paper presents an exploration of Long Short-Term Memory (LSTM) networks in the realm of text generation, focusing on the utilization of historical datasets for Shakespeare and Nietzsche. LSTMs, known for their effectiveness in handling sequential data, are applied here to model complex language patterns and structures inherent in historical texts. The study demonstrates that LSTM-based models, when trained on historical datasets, can not only generate text that is linguistically rich and contextually relevant but also provide insights into the evolution of language patterns over time. The finding presents models that are highly accurate and efficient in predicting text from works of Nietzsche, with low loss values and a training time of 100 iterations. The accuracy of the model is 0.9521, indicating high accuracy. The loss of the model is 0.2518, indicating its effectiveness. The accuracy of the model in predicting text from the work of Shakespeare is 0.9125, indicating a low error rate. The training time of the model is 100, mirroring the efficiency of the Nietzsche dataset. This efficiency demonstrates the effectiveness of the model design and training methodology, especially when handling complex literary texts. This research contributes to the field of natural language processing by showcasing the versatility of LSTM networks in text generation and offering a pathway for future explorations in historical linguistics and beyond.

Submitted 11 March, 2024; originally announced March 2024.

Report number: ISBN: 978-625-6879-50-8

Journal ref: 16th International Istanbul Scientific Research Congress on Life, Engineering, Architecture, and Mathematical Sciences Proceedings Book, Pages: 42-49, 2024

arXiv:2309.15686 [pdf, other]

Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization

Authors: Amir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe, Sanjeev Khudanpur

Abstract: Incorporating longer context has been shown to benefit machine translation, but the inclusion of context in end-to-end speech translation (E2E-ST) remains under-studied. To bridge this gap, we introduce target language context in E2E-ST, enhancing coherence and overcoming memory constraints of extended audio segments. Additionally, we propose context dropout to ensure robustness to the absence of context, and further improve performance by adding speaker information. Our proposed contextual E2E-ST outperforms the isolated utterance-based E2E-ST approach. Lastly, we demonstrate that in conversational speech, contextual information primarily contributes to capturing context style, as well as resolving anaphora and named entities.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.15674 [pdf, other]

Speech collage: code-switched audio generation by collaging monolingual corpora

Authors: Amir Hussein, Dorsa Zeinali, Ondřej Klejch, Matthew Wiesner, Brian Yan, Shammur Chowdhury, Ahmed Ali, Shinji Watanabe, Sanjeev Khudanpur

Abstract: Designing effective automatic speech recognition (ASR) systems for Code-Switching (CS) often depends on the availability of the transcribed CS resources. To address data scarcity, this paper introduces Speech Collage, a method that synthesizes CS data from monolingual corpora by splicing audio segments. We further improve the smoothness quality of audio generation using an overlap-add approach. We investigate the impact of generated data on speech recognition in two scenarios: using in-domain CS text and a zero-shot approach with synthesized CS text. Empirical results highlight up to 34.4% and 16.2% relative reductions in Mixed-Error Rate and Word-Error Rate for in-domain and zero-shot scenarios, respectively. Lastly, we demonstrate that CS augmentation bolsters the model's code-switching inclination and reduces its monolingual bias.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2307.11468 [pdf, other]

doi 10.1109/MCOM.001.2200508

Zero-touch realization of Pervasive Artificial Intelligence-as-a-service in 6G networks

Authors: Emna Baccour, Mhd Saria Allahham, Aiman Erbad, Amr Mohamed, Ahmed Refaey Hussein, Mounir Hamdi

Abstract: The vision of the upcoming 6G technologies, characterized by ultra-dense network, low latency, and fast data rate is to support Pervasive AI (PAI) using zero-touch solutions enabling self-X (e.g., self-configuration, self-monitoring, and self-healing) services. However, the research on 6G is still in its infancy, and only the first steps have been taken to conceptualize its design, investigate its implementation, and plan for use cases. Toward this end, academia and industry communities have gradually shifted from theoretical studies of AI distribution to real-world deployment and standardization. Still, designing an end-to-end framework that systematizes the AI distribution by allowing easier access to the service using a third-party application assisted by a zero-touch service provisioning has not been well explored. In this context, we introduce a novel platform architecture to deploy a zero-touch PAI-as-a-Service (PAIaaS) in 6G networks supported by a blockchain-based smart system. This platform aims to standardize the pervasive AI at all levels of the architecture and unify the interfaces in order to facilitate the service deployment across application and infrastructure domains, relieve the users' worries about cost, security, and resource allocation, and at the same time, respect the 6G stringent performance requirements. As a proof of concept, we present a Federated Learning-as-a-service use case where we evaluate the ability of our proposed system to self-optimize and self-adapt to the dynamics of 6G networks in addition to minimizing the users' perceived costs.

Submitted 21 July, 2023; originally announced July 2023.

Comments: IEEE Communications Magazine

Journal ref: in IEEE Communications Magazine, vol. 61, no. 2, pp. 110-116, 2023

arXiv:2211.16319 [pdf, other]

Benchmarking Evaluation Metrics for Code-Switching Automatic Speech Recognition

Authors: Injy Hamed, Amir Hussein, Oumnia Chellah, Shammur Chowdhury, Hamdy Mubarak, Sunayana Sitaram, Nizar Habash, Ahmed Ali

Abstract: Code-switching poses a number of challenges and opportunities for multilingual automatic speech recognition. In this paper, we focus on the question of robust and fair evaluation metrics. To that end, we develop a reference benchmark data set of code-switching speech recognition hypotheses with human judgments. We define clear guidelines for minimal editing of automatic hypotheses. We validate the guidelines using 4-way inter-annotator agreement. We evaluate a large number of metrics in terms of correlation with human judgments. The metrics we consider vary in terms of representation (orthographic, phonological, semantic), directness (intrinsic vs extrinsic), granularity (e.g. word, character), and similarity computation method. The highest correlation to human judgment is achieved using transliteration followed by text normalization. We release the first corpus for human acceptance of code-switching speech recognition results in dialectal Arabic/English conversation speech.

Submitted 22 November, 2022; originally announced November 2022.

Comments: Accepted to SLT 2022

arXiv:2211.12560 [pdf, other]

Contextually Aware Intelligent Control Agents for Heterogeneous Swarms

Authors: Adam Hepworth, Aya Hussein, Darryn Reid, Hussein Abbass

Abstract: An emerging challenge in swarm shepherding research is to design effective and efficient artificial intelligence algorithms that maintain a low-computational ceiling while increasing the swarm's abilities to operate in diverse contexts. We propose a methodology to design a context-aware swarm-control intelligent agent. The intelligent control agent (shepherd) first uses swarm metrics to recognise the type of swarm it interacts with to then select a suitable parameterisation from its behavioural library for that particular swarm type. The design principle of our methodology is to increase the situation awareness (i.e. information contents) of the control agent without sacrificing the low-computational cost necessary for efficient swarm control. We demonstrate successful shepherding in both homogeneous and heterogeneous swarms.

Submitted 22 November, 2022; originally announced November 2022.

Comments: 37 pages, 3 figures, 11 tables

arXiv:2211.03063 [pdf, other]

Developing Decentralised Resilience to Malicious Influence in Collective Perception Problem

Authors: Chris Wise, Aya Hussein, Heba El-Fiqi

Abstract: In collective decision-making, designing algorithms that use only local information to effect swarm-level behaviour is a non-trivial problem. We used machine learning techniques to teach swarm members to map their local perceptions of the environment to an optimal action. A curriculum inspired by Machine Education approaches was designed to facilitate this learning process and teach the members the skills required for optimal performance in the collective perception problem. We extended upon previous approaches by creating a curriculum that taught agents resilience to malicious influence. The experimental results show that well-designed rules-based algorithms can produce effective agents. When performing opinion fusion, we implemented decentralised resilience by having agents dynamically weight received opinion. We found a non-significant difference between constant and dynamic weights, suggesting that momentum-based opinion fusion is perhaps already a resilience mechanism.

Submitted 6 November, 2022; originally announced November 2022.

Comments: 14 Pages, 14 Figures

arXiv:2208.12386 [pdf, other]

doi 10.1177/10597123221137090

Swarm Analytics: Designing Information Markers to Characterise Swarm Systems in Shepherding Contexts

Authors: Adam Hepworth, Aya Hussein, Darryn Reid, Hussein Abbass

Abstract: Contemporary swarm indicators are often used in isolation, focused on extracting information at the individual or collective levels. Consequently, these are seldom integrated to infer a top-level operating picture of the swarm, its members, and its overall collective dynamics. The primary contribution of this paper is to organise a suite of indicators about swarms into an ontologically-arranged collection of information markers to characterise the swarm from the perspective of an external observer, a recognition agent. Our contribution shows the foundations for a new area of research that we title swarm analytics, whose primary concern is with the design and organisation of collections of swarm markers to understand, detect, recognise, track, and learn a particular insight about a swarm system. We present our designed framework of information markers that offer a new avenue for swarm research, especially for heterogeneous and cognitive swarms that may require more advanced capabilities to detect agencies and categorise agent influences and responses.

Submitted 18 October, 2022; v1 submitted 25 August, 2022; originally announced August 2022.

Comments: 28 pages, 15 tables, 13 figures

arXiv:2206.09790 [pdf, other]

The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition

Authors: Jonathan Mukiibi, Andrew Katumba, Joyce Nakatumba-Nabende, Ali Hussein, Josh Meyer

Abstract: Building a usable radio monitoring automatic speech recognition (ASR) system is a challenging task for under-resourced languages and yet this is paramount in societies where radio is the main medium of public communication and discussions. Initial efforts by the United Nations in Uganda have proved how understanding the perceptions of rural people who are excluded from social media is important in national planning. However, these efforts are being challenged by the absence of transcribed speech datasets. In this paper, The Makerere Artificial Intelligence research lab releases a Luganda radio speech corpus of 155 hours. To our knowledge, this is the first publicly available radio dataset in sub-Saharan Africa. The paper describes the development of the voice corpus and presents baseline Luganda ASR performance results using Coqui STT toolkit, an open source speech recognition toolkit.

Submitted 20 June, 2022; originally announced June 2022.

Comments: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pages 1945 to 1954 Marseille, 20 to 25 June 2022

arXiv:2204.04190 [pdf]

doi 10.1109/ACIT53391.2021.9677407

Internet of Robotic Things: Current Technologies and Applications

Authors: Ghassan Samara, Abla Hussein, Israa Abdullah Matarneh, Mohammed Alrefai, Maram Y Al-Safarini

Abstract: The Internet of Robotic Things (IoRT) is a new domain that aims to link the IoT environment with robotic systems and technologies. IoRT connects robotic systems, connects them to the cloud, and transfers critical information as well as knowledge exchange to conduct complicated and intricate activities that a human cannot readily perform. The pertinent notion of IoRT has been discussed in this paper, along with the issues that this area faces on a daily basis. Furthermore, technological applications have been examined in order to provide a better understanding of IoRT and its current development phenomenon. The study describes three layers of IoRT infrastructure: network and control, physical, and service and application layer. In the next section, IoRT problems have been presented, with a focus on data processing and the security and safety of IoRT technological systems. In addition to discussing the difficulties, appropriate solutions have been offered and recommended. IoRT is regarded as an essential technology with the ability to bring about a plethora of benefits in smart society upon adoption, contributing to the generation and development of smart cities and industries in the near future.

Submitted 30 March, 2022; originally announced April 2022.

Comments: 6 pages

Journal ref: 2021 22nd International Arab Conference on Information Technology (ACIT)

arXiv:2201.02550 [pdf, other]

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

Authors: Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Abstract: The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical structure complexity, and domain mismatch. The most common method for addressing CS is to train an ASR system with the available transcribed CS speech, along with monolingual data. In this work, we propose a zero-shot learning methodology for CS-ASR by augmenting the monolingual data with artificially generated CS text. We based our approach on random lexical replacements and Equivalence Constraint (EC) while exploiting aligned translation pairs to generate random and grammatically valid CS content. Our empirical results show a 65.5% relative reduction in language model perplexity, and 7.7% in ASR WER on two ecologically valid CS test sets. The human evaluation of the generated text using EC suggests that more than 80% is of adequate quality.

Submitted 11 January, 2023; v1 submitted 7 January, 2022; originally announced January 2022.

arXiv:2107.01573 [pdf, other]

Arabic Code-Switching Speech Recognition using Monolingual Data

Authors: Ahmed Ali, Shammur Chowdhury, Amir Hussein, Yasser Hifny

Abstract: Code-switching in automatic speech recognition (ASR) is an important challenge due to globalization. Recent research in multilingual ASR shows potential improvement over monolingual systems. We study key issues related to multilingual modeling for ASR through a series of large-scale ASR experiments. Our innovative framework deploys a multi-graph approach in the weighted finite state transducers (WFST) framework. We compare our WFST decoding strategies with a transformer sequence to sequence system trained on the same data. Given a code-switching scenario between Arabic and English languages, our results show that the WFST decoding approaches were more suitable for the intersentential code-switching datasets. In addition, the transformer system performed better for the intrasentential code-switching task. With this study, we release artificially generated development and test sets, along with an ecological code-switching test set, to benchmark the ASR performance.

Submitted 4 July, 2021; originally announced July 2021.

Comments: Accepted in Interspeech 2021, speech recognition, code-switching, ASR, transformer, WFST, graph approach

arXiv:2106.13000 [pdf, other]

QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus

Authors: Hamdy Mubarak, Amir Hussein, Shammur Absar Chowdhury, Ahmed Ali

Abstract: We introduce the largest transcribed Arabic speech corpus, QASR, collected from the broadcast domain. This multi-dialect speech dataset contains 2,000 hours of speech sampled at 16kHz crawled from Aljazeera news channel. The dataset is released with lightly supervised transcriptions, aligned with the audio segments. Unlike previous datasets, QASR contains linguistically motivated segmentation, punctuation, speaker information among others. QASR is suitable for training and evaluating speech recognition systems, acoustics- and/or linguistics- based Arabic dialect identification, punctuation restoration, speaker identification, speaker linking, and potentially other NLP modules for spoken data. In addition to QASR transcription, we release a dataset of 130M words to aid in designing and training a better language model. We show that end-to-end automatic speech recognition trained on QASR reports a competitive word error rate compared to the previous MGB-2 corpus. We report baseline results for downstream natural language processing tasks such as named entity recognition using speech transcript. We also report the first baseline for Arabic punctuation restoration. We make the corpus available for the research community.

Submitted 24 June, 2021; originally announced June 2021.

Comments: Speech Corpus, Spoken Conversation, ASR, Dialect Identification, Punctuation Restoration, Speaker Verification, NER, Named Entity, Arabic, Speaker gender, Turn-taking Accepted in ACL 2021

arXiv:2106.07256 [pdf, other]

Deterministic Guided LiDAR Depth Map Completion

Authors: Bryan Krauss, Gregory Schroeder, Marko Gustke, Ahmed Hussein

Abstract: Accurate dense depth estimation is crucial for autonomous vehicles to analyze their environment. This paper presents a non-deep learning-based approach to densify a sparse LiDAR-based depth map using a guidance RGB image. To achieve this goal the RGB image is at first cleared from most of the camera-LiDAR misalignment artifacts. Afterward, it is over segmented and a plane for each superpixel is approximated. In the case a superpixel is not well represented by a plane, a plane is approximated for a convex hull of the most inlier. Finally, the pinhole camera model is used for the interpolation process and the remaining areas are interpolated. The evaluation of this work is executed using the KITTI depth completion benchmark, which validates the proposed work and shows that it outperforms the state-of-the-art non-deep learning-based methods, in addition to several deep learning-based methods.

Submitted 14 June, 2021; originally announced June 2021.

Comments: Submitted to 2021 IEEE Intelligent Vehicles Symposium (IV21). This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2106.05885 [pdf, other]

Balanced End-to-End Monolingual pre-training for Low-Resourced Indic Languages Code-Switching Speech Recognition

Authors: Amir Hussein, Shammur Chowdhury, Najim Dehak, Ahmed Ali

Abstract: The success in designing Code-Switching (CS) ASR often depends on the availability of the transcribed CS resources. Such dependency harms the development of ASR in low-resourced languages such as Bengali and Hindi. In this paper, we exploit the transfer learning approach to design End-to-End (E2E) CS ASR systems for the two low-resourced language pairs using different monolingual speech data and a small set of noisy CS data. We trained the CS-ASR, following two steps: (i) building a robust bilingual ASR system using a convolution-augmented transformer (Conformer) based acoustic model and n-gram language model, and (ii) fine-tuning the entire E2E ASR with limited noisy CS data. We then tested the proposed method using noisy CS data released for Hindi-English and Bengali-English pairs in Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages (MUCS 2021) and achieved 3rd place in the CS track. Unlike the leading two systems that benefited from crawling YouTube and learning transliteration pairs, our proposed transfer learning approach focused on using only the limited CS data with no data-cleaning or data re-segmentation. Our approach achieved 14.1% relative gain in word error rate (WER) in Hindi-English and 27.1% in Bengali-English. We provide detailed guidelines on the steps to finetune the self-attention based model for limited data for ASR. Moreover, we release the code and recipe used in this paper.

Submitted 15 February, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

arXiv:2105.14779 [pdf, other]

Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

Authors: Shammur Absar Chowdhury, Amir Hussein, Ahmed Abdelali, Ahmed Ali

Abstract: With the advent of globalization, there is an increasing demand for multilingual automatic speech recognition (ASR), handling language and dialectal variation of spoken content. Recent studies show its efficacy over monolingual systems. In this study, we design a large multilingual end-to-end ASR using self-attention based conformer architecture. We trained the system using Arabic (Ar), English (En) and French (Fr) languages. We evaluate the system performance handling: (i) monolingual (Ar, En and Fr); (ii) multi-dialectal (Modern Standard Arabic, along with dialectal variation such as Egyptian and Moroccan); (iii) code-switching -- cross-lingual (Ar-En/Fr) and dialectal (MSA-Egyptian dialect) test cases, and compare with current state-of-the-art systems. Furthermore, we investigate the influence of different embedding/character representations including character vs word-piece; shared vs distinct input symbol per language. Our findings demonstrate the strength of such a model by outperforming state-of-the-art monolingual dialectal Arabic and code-switching Arabic ASR.

Submitted 5 July, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

Comments: Accepted in INTERSPEECH 2021, Multilingual ASR, Multi-dialectal ASR, Code-Switching ASR, Arabic ASR, Conformer, Transformer, E2E ASR, Speech Recognition, ASR, Arabic, English, French

arXiv:2101.08454 [pdf, other]

Arabic Speech Recognition by End-to-End, Modular Systems and Human

Authors: Amir Hussein, Shinji Watanabe, Ahmed Ali

Abstract: Recent advances in automatic speech recognition (ASR) have achieved accuracy levels comparable to human transcribers, which led researchers to debate if the machine has reached human performance. Previous work focused on the English language and modular hidden Markov model-deep neural network (HMM-DNN) systems. In this paper, we perform a comprehensive benchmarking for end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition (HSR) on the Arabic language and its dialects. For the HSR, we evaluate linguist performance and lay-native speaker performance on a new dataset collected as a part of this study. For ASR the end-to-end work led to 12.5%, 27.5%, 33.8% WER; a new performance milestone for the MGB2, MGB3, and MGB5 challenges respectively. Our results suggest that human performance in the Arabic language is still considerably better than the machine with an absolute WER gap of 3.5% on average.

Submitted 29 June, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

arXiv:2007.15404 [pdf, other]

Regional Rainfall Prediction Using Support Vector Machine Classification of Large-Scale Precipitation Maps

Authors: Eslam A. Hussein, Mehrdad Ghaziasgar, Christopher Thron

Abstract: Rainfall prediction helps planners anticipate potential social and economic impacts produced by too much or too little rain. This research investigates a class-based approach to rainfall prediction from 1-30 days in advance. The study made regional predictions based on sequences of daily rainfall maps of the continental US, with rainfall quantized at 3 levels: light or no rain; moderate; and heavy rain. Three regions were selected, corresponding to three squares from a $5\times5$ grid covering the map area. Rainfall predictions up to 30 days ahead for these three regions were based on a support vector machine (SVM) applied to consecutive sequences of prior daily rainfall map images. The results show that predictions for corner squares in the grid were less accurate than predictions obtained by a simple untrained classifier. However, SVM predictions for a central region outperformed the other two regions, as well as the untrained classifier. We conclude that there is some evidence that SVMs applied to large-scale precipitation maps can under some conditions give useful information for predicting regional rainfall, but care must be taken to avoid pitfall

Submitted 30 July, 2020; originally announced July 2020.

arXiv:2001.00560 [pdf, ps, other]

doi 10.1109/TVT.2021.3131305

Vehicle Platooning Impact on Drag Coefficients and Energy/Fuel Saving Implications

Authors: Ahmed A. Hussein, Hesham A. Rakha

Abstract: In this paper, empirical data from the literature are used to develop general power models that capture the impact of a vehicle position, in a platoon of homogeneous vehicles, and the distance gap to its lead (and following) vehicle on its drag coefficient. These models are developed for light duty vehicles, buses, and heavy duty trucks. The models were fit using a constrained optimization framework to fit a general power function using either direct drag force or fuel measurements. The model is then used to extrapolate the empirical measurements to a wide range of vehicle distance gaps within a platoon. Using these models we estimate the potential fuel reduction associated with homogeneous platoons of light duty vehicles, buses, and heavy duty trucks. The results show a significant reduction in the vehicle fuel consumption when compared with those based on a constant drag coefficient assumption. Specifically, considering a minimum time gap between vehicles of $0.5 \; secs$ (which is typical considering state-of-practice communication and mechanical system latencies) running at a speed of $100 \; km/hr$, the optimum fuel reduction that is achieved is $4.5 \%$, $15.5 \%$, and $7.0 \%$ for light duty vehicle, bus, and heavy duty truck platoons, respectively. For longer time gaps, the bus and heavy duty truck platoons still produce fuel reductions in the order of $9.0 \%$ and $4.5 \%$, whereas light duty vehicles produce negligible fuel savings.

Submitted 2 March, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

Comments: In review in the Journal of Applied Energy. IEEE Transactions on Vehicular Technology, 2021

arXiv:1912.00157 [pdf, other]

Correction Filter for Single Image Super-Resolution: Robustifying Off-the-Shelf Deep Super-Resolvers

Authors: Shady Abu Hussein, Tom Tirer, Raja Giryes

Abstract: The single image super-resolution task is one of the most examined inverse problems in the past decade. In recent years, Deep Neural Networks (DNNs) have shown superior performance over alternative methods when the acquisition process uses a fixed known downsampling kernel, typically a bicubic kernel. However, several recent works have shown that in practical scenarios, where the test data mismatch the training data (e.g. when the downsampling kernel is not the bicubic kernel or is not available at training), the leading DNN methods suffer from a huge performance drop. Inspired by the literature on generalized sampling, in this work we propose a method for improving the performance of DNNs that have been trained with a fixed kernel on observations acquired by other kernels. For a known kernel, we design a closed-form correction filter that modifies the low-resolution image to match one which is obtained by another kernel (e.g. bicubic), and thus improves the results of existing pre-trained DNNs. For an unknown kernel, we extend this idea and propose an algorithm for blind estimation of the required correction filter. We show that our approach outperforms other super-resolution methods, which are designed for general downsampling kernels.

Submitted 24 May, 2020; v1 submitted 30 November, 2019; originally announced December 2019.

Comments: Accepted to CVPR 2020 (Oral). Code is available at https://github.com/shadyabh/Correction-Filter

arXiv:1910.02292 [pdf, other]

Keyword Spotter Model for Crop Pest and Disease Monitoring from Community Radio Data

Authors: Benjamin Akera, Joyce Nakatumba-Nabende, Jonathan Mukiibi, Ali Hussein, Nathan Baleeta, Daniel Ssendiwala, Samiiha Nalwooga

Abstract: In societies with well developed internet infrastructure, social media is the leading medium of communication for various social issues especially for breaking news situations. In rural Uganda however, public community radio is still a dominant means for news dissemination. Community radio gives audience to the general public especially to individuals living in rural areas, and thus plays an important role in giving a voice to those living in the broadcast area. It is an avenue for participatory communication and a tool relevant in both economic and social development. This is supported by the rise to ubiquity of mobile phones providing access to phone-in or text-in talk shows. In this paper, we describe an approach to analysing the readily available community radio data with machine learning-based speech keyword spotting techniques. We identify the keywords of interest related to agriculture and build models to automatically identify these keywords from audio streams. Our contribution through these techniques is a cost-efficient and effective way to monitor food security concerns particularly in rural areas. Through keyword spotting and radio talk show analysis, issues such as crop diseases, pests, drought and famine can be captured and fed into an early warning system for stakeholders and policy makers.

Submitted 5 October, 2019; originally announced October 2019.

Comments: Presented at NeurIPS 2019 Workshop on Machine Learning for the Developing World

arXiv:1907.09455 [pdf]

doi 10.1109/TPEL.2021.3096164

Latent Function Decomposition for Forecasting Li-ion Battery Cells Capacity: A Multi-Output Convolved Gaussian Process Approach

Authors: Abdallah A. Chehade, Ala A. Hussein

Abstract: A latent function decomposition method is proposed for forecasting the capacity of lithium-ion battery cells. The method uses the Multi-Output Convolved Gaussian Process (MCGP), a generative machine learning framework for multi-task and transfer learning. The MCGP decomposes the available capacity trends from multiple battery cells into latent functions. The latent functions are then convolved over kernel smoothers to reconstruct and/or forecast capacity trends of the battery cells. Besides its high prediction accuracy, the proposed method provides uncertainty information for the predictions and captures nontrivial cross-correlations between capacity trends of different battery cells. These two merits make the proposed MCGP a very reliable and practical solution for applications that use battery cell packs. The MCGP is derived and compared to benchmark methods on experimental lithium-ion battery cell data. The results show the effectiveness of the proposed method.

Submitted 19 July, 2019; originally announced July 2019.

arXiv:1906.05284 [pdf, other]

Image-Adaptive GAN based Reconstruction

Authors: Shady Abu Hussein, Tom Tirer, Raja Giryes

Abstract: In recent years, there has been a significant improvement in the quality of samples produced by (deep) generative models such as variational auto-encoders and generative adversarial networks. However, the representation capabilities of these methods still do not capture the full distribution for complex classes of images, such as human faces. This deficiency has been clearly observed in previous works that use pre-trained generative models to solve imaging inverse problems. In this paper, we suggest mitigating the limited representation capabilities of generators by making them image-adaptive and enforcing compliance of the restoration with the observations via back-projections. We empirically demonstrate the advantages of our proposed approach for image super-resolution and compressed sensing.

Submitted 25 November, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

Comments: Accepted to AAAI 2020. Code available at https://github.com/shadyabh/IAGAN

arXiv:1905.11652 [pdf]

Crowdsourced Peer Learning Activity for Internet of Things Education: A Case Study

Authors: Ahmed Hussein, Mahmoud Barhamgi, Massimo Vecchio, Charith Perera

Abstract: Computing devices such as laptops, tablets and mobile phones have become part of our daily lives. End users increasingly know more and more about these devices. Further, more technically savvy end users know how such devices are built and know how to choose one over the others. However, we cannot say the same about Internet of Things (IoT) products. Due to the infancy of the marketplace, end users have very little idea about IoT products. To address this issue, we developed a method, a crowdsourced peer learning activity, supported by an online platform (OLYMPUS) to enable a group of learners to learn the IoT product space better. We conducted two different user studies to validate that our tool enables better IoT education. Our method guides learners to think more deeply about IoT products and their design decisions. The learning platform we developed is open source and available for the community.

Submitted 28 May, 2019; originally announced May 2019.

arXiv:1812.06938 [pdf]

VLC Systems with CGHs

Authors: Safwan Hafeedh Younus, Ahmed Taha Hussein, Mohammed T. Alresheedi, Jaafar M. H. Elmirghani

Abstract: The achievable data rate in indoor wireless systems that employ visible light communication (VLC) can be limited by multipath propagation. Here, we use computer generated holograms (CGHs) in VLC system design to improve the achievable system data rate. The CGHs are utilized to produce a fixed broad beam from the light source, selecting the light source that offers the best performance. The CGHs direct this beam to a specific zone on the room's communication floor where the receiver is located. This reduces the effect of diffuse reflections, consequently decreasing the intersymbol interference (ISI) and enabling the VLC indoor channel to support higher data rates. We consider two settings to examine our proposed VLC system and consider lighting constraints. We evaluate the performance in idealistic and realistic room settings in a diffuse environment with up to second order reflections and also under mobility. The results show that using the CGHs enhances the 3dB bandwidth of the VLC channel and improves the received optical power.

Submitted 12 November, 2018; originally announced December 2018.

arXiv:1808.06211 [pdf, other]

Mixed Initiative Systems for Human-Swarm Interaction: Opportunities and Challenges

Authors: Aya Hussein, Hussein Abbass

Abstract: Human-swarm interaction (HSI) involves a number of human factors impacting human behaviour throughout the interaction. As the technologies used within HSI advance, it is more tempting to increase the level of swarm autonomy within the interaction to reduce the workload on humans. Yet, the prospective negative effects of high levels of autonomy on human situational awareness can hinder this process. Flexible autonomy aims at trading-off these effects by changing the level of autonomy within the interaction when required; with mixed-initiatives combining human preferences and automation's recommendations to select an appropriate level of autonomy at a certain point of time. However, the effective implementation of mixed-initiative systems raises fundamental questions on how to combine human preferences and automation recommendations, how to realise the selected level of autonomy, and what the future impacts on the cognitive states of a human are. We explore open challenges that hamper the process of developing effective flexible autonomy. We then highlight the potential benefits of using system modelling techniques in HSI by illustrating how they provide HSI designers with an opportunity to evaluate different strategies for assessing the state of the mission and for adapting the level of autonomy within the interaction to maximise mission success metrics.

Submitted 19 August, 2018; originally announced August 2018.

Comments: Author version, accepted at the 2018 IEEE Annual Systems Modelling Conference, Canberra, Australia

arXiv:1803.05909 [pdf, other]

Efficient Hardware Realization of Convolutional Neural Networks using Intra-Kernel Regular Pruning

Authors: Maurice Yang, Mahmoud Faraj, Assem Hussein, Vincent Gaudet

Abstract: The recent trend toward increasingly deep convolutional neural networks (CNNs) leads to a higher demand of computational power and memory storage. Consequently, the deployment of CNNs in hardware has become more challenging. In this paper, we propose an Intra-Kernel Regular (IKR) pruning scheme to reduce the size and computational complexity of the CNNs by removing redundant weights at a fine-grained level. Unlike other pruning methods such as Fine-Grained pruning, IKR pruning maintains regular kernel structures that are exploitable in a hardware accelerator. Experimental results demonstrate up to 10x parameter reduction and 7x computational reduction at a cost of less than 1% degradation in accuracy versus the un-pruned case.

Submitted 15 March, 2018; originally announced March 2018.

Comments: 6 pages, 8 figures, ISMVL 2018

arXiv:1803.03093 [pdf, other]

Towards Bi-Directional Communication in Human-Swarm Teaming: A Survey

Authors: Aya Hussein, Leo Ghignone, Tung Nguyen, Nima Salimi, Hung Nguyen, Min Wang, Hussein A. Abbass

Abstract: Swarm systems consist of large numbers of robots that collaborate autonomously. With an appropriate level of human control, swarm systems could be applied in a variety of contexts ranging from search-and-rescue situations to Cyber defence. The two decision making cycles of swarms and humans operate on two different time-scales, where the former is normally orders of magnitude faster than the latter. Closing the loop at the intersection of these two cycles will create fast and adaptive human-swarm teaming networks. This paper brings disparate pieces of the ground work in this research area together to review this multidisciplinary literature. We conclude with a framework to synthesize the findings and summarize the multi-modal indicators needed for closed-loop human-swarm adaptive systems.

Submitted 4 March, 2018; originally announced March 2018.

arXiv:1712.04245 [pdf]

Coordinator Location Effects in AODV Routing Protocol in ZigBee Mesh Network

Authors: Abla Hussein, Ghassan Samara

Abstract: The ZigBee mesh network is a very important research field in computer networks. However, the location of the ZigBee coordinator plays a significant role in design and routing performance. In this paper, an extensive study on the factors that influence the performance of the AODV routing protocol has been performed through the study of battery voltage decay of nodes, neighboring tables, time delay and network topology structure. Simulation results reveal that placing the coordinator at approximately equal distances to all nodes is more appropriate for battery longevity and AODV routing performance.

Submitted 12 December, 2017; originally announced December 2017.

Comments: 7 pages

Journal ref: International Journal of Computer Applications (0975-8887), October 2015 Volume 127 - No.8

arXiv:1712.04186 [pdf]

Mathematical Modeling and Analysis of ZigBee Node Battery Characteristics and Operation

Authors: Abla Hussein, Ghassan Samara

Abstract: ZigBee network technology has been used widely in different commercial, medical and industrial applications, and keeping the network operating for a longer time has been the main objective of ZigBee manufacturers. In this paper, ZigBee battery characteristics and operation have been researched extensively and mathematical modeling has been applied to existing practical data provided by Freescale semiconductors Inc [1], and Farnell [2]. As a result, an optimized mathematical formula has been established which describes battery characteristics and voltage behavior as a function of time, and since ZigBee node batteries have been the core objective of this research work, a decline in battery voltage below 50% of battery capacity could influence and degrade ZigBee network performance.

Submitted 12 December, 2017; originally announced December 2017.

Comments: 8 pages

Report number: Vol.3 (6). PP: 99-106, 2015

Journal ref: MAGNT Research Report, ISSN. 1444-8939, 2015

arXiv:1712.01970 [pdf, other]

What's in my closet?: Image classification using fuzzy logic

Authors: Amina E. Hussein

Abstract: A fuzzy system was created in MATLAB to identify an item of clothing as a dress, shirt, or pair of pants from a series of input images. The system was initialized using a high-contrast vector-image of each item of clothing as the state closest to a direct solution. Nine other user-input images (three of each item) were also used to determine the characteristic function of each item and recognize each pattern. Mamdani inference systems were used for edge location and identification of characteristic regions of interest for each item of clothing. Based on these non-dimensional trends, a second Mamdani fuzzy inference system was used to characterize each image as containing a shirt, a dress, or a pair of pants. An outline of the fuzzy inference system and image processing techniques used for creating an image pattern recognition system are discussed.

Submitted 5 December, 2017; originally announced December 2017.

Comments: 12 pages, 8 Figures

arXiv:1710.03088 [pdf]

doi 10.1142/9789814667364_0028

Finger Based Techniques for Nonvisual Touchscreen Text Entry

Authors: Mohammed Fakrudeen, Sufian Yousef, Mahdi H. Miraz, AbdelRahman Hamza Hussein

Abstract: This research proposes Finger Based Technique (FBT) for non-visual touch screen device interaction designed for blind users. Based on the proposed technique, the blind user can access virtual keys based on finger holding positions. Three different models have been proposed. They are Single Digit Finger-Digit Input (FDI), Double Digit FDI for digital text entry, and Finger-Text Input (FTI) for normal text entry. All the proposed models were implemented with voice feedback while enabling touch as the input gesture. The models were evaluated with 7 blind participants with Samsung Galaxy S2 apparatus. The results show that Single Digit FDI is substantially faster and more accurate than Double Digit FDI and iPhone voice-over. FTI also looks promising for text entry. Our study also reveals 11 accessible regions to place widgets for quick access by blind users in flat touch screen based smartphones. Identification of these accessible regions will promote dynamic interactions for blind users and serve as a usability design framework for touch screen applications.

Submitted 2 October, 2017; originally announced October 2017.

Comments: arXiv admin note: substantial text overlap with arXiv:1708.05073

Journal ref: IAENG Transactions on Engineering Sciences, Chapter 28, pp. 372-386, April 2015

arXiv:1708.08189 [pdf]

An IoT Real-Time Biometric Authentication System Based on ECG Fiducial Extracted Features Using Discrete Cosine Transform

Authors: Ahmed F. Hussein, Abbas K. AlZubaidi, Ali Al-Bayaty, Qais A. Habash

Abstract: Conventional authentication technologies, like RFID tags and authentication cards/badges, suffer from different weaknesses; therefore, they should be promptly replaced with biometric methods of authentication. Biometrics, such as fingerprints, voices, and ECG signals, are unique human characteristics that can be used for authentication processing. In this work, we present an IoT real-time authentication system that uses extracted ECG features to identify unknown persons. The Discrete Cosine Transform (DCT) is used for ECG feature extraction, as it has better characteristics for real-time system implementations. A substantial number of studies report high authentication accuracy, but most of them ignore the real-time capability of authenticating individuals. With an accuracy rate of 97.78% at around 1.21 seconds of processing time, the proposed system is more suitable for use in many applications that require fast and reliable authentication processing.

Submitted 28 August, 2017; originally announced August 2017.

Comments: 6 pages, 8 figures, IoT, Authentication, ECG, DCT

ACM Class: J.3; K.6.5; H.1.2; C.5.2

arXiv:1708.05068 [pdf]

doi 10.13140/2.1.1307.1681

Analysis of QoS of VoIP Traffic through WiFi-UMTS Networks

Authors: Mahdi H. Miraz, Suhail A. Molvi, Maaruf Ali, Muzafar A. Ganie, AbdelRahman H. Hussein

Abstract: Simulations of VoIP (Voice over Internet Protocol) traffic through UMTS (Universal Mobile Telecommunication System) and WiFi (IEEE 802.11x), alone and together, are analysed for Quality of Service (QoS) performance. The average jitter of VoIP transiting the WiFi-UMTS network has been found to be lower than that through either the WiFi or the UMTS network alone. It is normally expected to be higher than traversing through the WiFi network only. Both the MOS (Mean Opinion Score) and the packet end-to-end delay were also found to be much lower than expected through the heterogeneous WiFi-UMTS network.

Submitted 9 August, 2017; originally announced August 2017.

Journal ref: World Congress on Engineering (WCE 2014) held at Imperial College, London, UK, 2-4 July 2014, ISBN-13: 978-988-19252-7-5, Print ISSN: 2078-0958, Online ISSN: 2078-0966, Vol. 1, pp. 684-689

arXiv:1708.01572 [pdf]

doi 10.14569/IJACSA.2017.080732

Simulation and Analysis of Quality of Service (QoS) Parameters of Voice over IP (VoIP) Traffic through Heterogeneous Networks

Authors: Mahdi H. Miraz, Suhail A. Molvi, Muzafar A. Ganie, Maaruf Ali, AbdelRahman H. Hussein

Abstract: Identification of the causes and parameters that affect the Quality of Service (QoS) of Voice-over-Internet Protocol (VoIP) through heterogeneous networks such as WiFi, WiMAX and between them is carried out using the OPNET simulation tool. Optimization of the network for both intra- and intersystem traffic to mitigate the deterioration of the QoS is discussed. The average value of the jitter of the VoIP traffic traversing through the WiFi-WiMAX network was observed to be higher than that of utilizing WiFi alone at some points in time. It is routinely surmised to be less than that of transiting across the WiFi network only and obviously higher than passing through the increased bandwidth network of WiMAX. Moreover, both the values of the packet end-to-end delay and the Mean Opinion Score (MOS) were considerably higher than expected. The consequences of this optimization, leading to a solution, which can ameliorate the QoS over these networks are analyzed and offered as the conclusion of this ongoing research.

Submitted 1 August, 2017; originally announced August 2017.

Comments: Voice over Internet Protocol (VoIP); Quality of Service (QoS); Mean Opinion Score (MOS); simulation

Journal ref: International Journal of Advanced Computer Science and Applications (IJACSA) Online ISSN: 2156-5570, Print ISSN: 2158-107X, Volume 8 No 7 July 2017, pp. 242-248, published by Science and Information (SAI) Organization

arXiv:1706.07886 [pdf, other]

doi 10.1016/j.patrec.2010.09.019

Fundamental Matrix Estimation: A Study of Error Criteria

Authors: Mohammed E. Fathy, Ashraf S. Hussein, Mohammed F. Tolba

Abstract: The fundamental matrix (FM) describes the geometric relations that exist between two images of the same scene. Different error criteria are used for estimating FMs from an input set of correspondences. In this paper, the accuracy and efficiency aspects of the different error criteria were studied. We mathematically and experimentally proved that the most popular error criterion, the symmetric epipolar distance, is biased. It was also shown that despite the similarity between the algebraic expressions of the symmetric epipolar distance and Sampson distance, they have different accuracy properties. In addition, a new error criterion, Kanatani distance, was proposed and was proved to be the most effective for use during the outlier removal phase from accuracy and efficiency perspectives. To thoroughly test the accuracy of the different error criteria, we proposed a randomized algorithm for Reprojection Error-based Correspondence Generation (RE-CG). As input, RE-CG takes an FM and a desired reprojection error value $d$. As output, RE-CG generates a random correspondence having that error value. Mathematical analysis of this algorithm revealed that the success probability for any given trial is $1 - (2/3)^2$ at best and is $1 - (6/7)^2$ at worst while experiments demonstrated that the algorithm often succeeds after only one trial.

Submitted 23 June, 2017; originally announced June 2017.

Comments: 15 pages, 7 figures, Pattern Recognition Letters, 2011
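
The abstract above compares the symmetric epipolar distance and the Sampson distance without stating their formulas. As a minimal sketch, assuming the common textbook (squared) forms of both criteria rather than the paper's exact definitions, the two can be computed as follows. The function names and the example matrix are illustrative only, and the Kanatani distance is not shown because the abstract does not define it.

```python
def _mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def _epipolar_terms(F, p, q):
    """Return the algebraic residual x'^T F x and the squared norms of the
    first two coefficients of the epipolar lines in image 1 and image 2."""
    x = [p[0], p[1], 1.0]            # point in image 1 (homogeneous)
    xp = [q[0], q[1], 1.0]           # matching point in image 2
    line2 = _mat_vec(F, x)           # epipolar line of x in image 2
    Ft = [[F[j][i] for j in range(3)] for i in range(3)]
    line1 = _mat_vec(Ft, xp)         # epipolar line of x' in image 1
    residual = sum(xp[i] * line2[i] for i in range(3))
    n1 = line1[0] ** 2 + line1[1] ** 2
    n2 = line2[0] ** 2 + line2[1] ** 2
    return residual, n1, n2


def symmetric_epipolar_sq(F, p, q):
    """Sum of squared point-to-epipolar-line distances in both images.
    Assumes neither epipolar line is degenerate (n1, n2 nonzero)."""
    r, n1, n2 = _epipolar_terms(F, p, q)
    return r * r * (1.0 / n1 + 1.0 / n2)


def sampson_sq(F, p, q):
    """First-order approximation of the squared reprojection error."""
    r, n1, n2 = _epipolar_terms(F, p, q)
    return r * r / (n1 + n2)


# Hypothetical FM of a rectified pair: the residual reduces to y - y'.
F_RECTIFIED = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
```

Both functions share the same numerator and differ only in how the two line norms enter the denominator, which is the algebraic similarity the abstract refers to.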

arXiv:1001.3496 [pdf]

Spatial Domain Watermarking Scheme for Colored Images Based on Log-average Luminance

Authors: Jamal A. Hussein

Abstract: In this paper a new watermarking scheme is presented based on log-average luminance. A color image is divided into blocks after converting it from RGB to the YCbCr color space. A monochrome image of 1024 bytes is used as the watermark. To embed the watermark, 16 blocks of size 8×8 are selected and used to embed the watermark image into the original image. The selected blocks are chosen spirally (beginning from the center of the image) among the blocks that have a log-average luminance greater than or equal to the log-average luminance of the entire image. Each byte of the monochrome watermark is added by updating the luminance value of a pixel of the image. If the byte of the watermark image represents white (255), a value <alpha> is added to the image pixel's luminance value; if it is black (0), <alpha> is subtracted from the luminance value. To extract the watermark, the blocks are selected as above; if the difference between the luminance value of the watermarked image pixel and that of the original image pixel is greater than 0, the watermark pixel is assumed to be white, otherwise it is assumed to be black. Experimental results show that the proposed scheme is efficient against changing the watermarked image to grayscale, image cropping, and JPEG compression.

Submitted 20 January, 2010; originally announced January 2010.

Journal ref: Journal of Computing, Vol. 2, Issue 1, January 2010
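
The embedding and extraction steps described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's implementation: it works directly on a luminance (Y) plane given as a list of rows, it orders candidate blocks by distance from the image center instead of the spiral traversal the abstract mentions, it uses the usual exp-of-mean-log definition of log-average luminance with a small hypothetical offset `delta`, and it does not clip luminance to a valid range. All function names are made up for the example.

```python
import math

BLOCK = 8  # block side length, as in the abstract


def log_average(values, delta=1e-4):
    """Log-average luminance: exp of the mean of log(delta + value)."""
    return math.exp(sum(math.log(delta + v) for v in values) / len(values))


def block_pixels(y, br, bc):
    """Luminance values of the block at block-row br, block-column bc."""
    return [y[r][c]
            for r in range(br * BLOCK, (br + 1) * BLOCK)
            for c in range(bc * BLOCK, (bc + 1) * BLOCK)]


def select_blocks(y, count):
    """Pick `count` blocks at least as bright (log-average) as the whole
    image, nearest the center first (a stand-in for the spiral order)."""
    rows, cols = len(y) // BLOCK, len(y[0]) // BLOCK
    image_avg = log_average([v for row in y for v in row])
    bright = [(br, bc) for br in range(rows) for bc in range(cols)
              if log_average(block_pixels(y, br, bc)) >= image_avg]
    cr, cc = (rows - 1) / 2, (cols - 1) / 2
    bright.sort(key=lambda b: (b[0] - cr) ** 2 + (b[1] - cc) ** 2)
    return bright[:count]


def pixel_position(blocks, i):
    """Map watermark byte index i to a (row, col) pixel in the image."""
    br, bc = blocks[i // (BLOCK * BLOCK)]
    offset = i % (BLOCK * BLOCK)
    return br * BLOCK + offset // BLOCK, bc * BLOCK + offset % BLOCK


def embed(y, watermark, alpha):
    """Add alpha for white (255) bytes and subtract it for black (0) bytes.
    Assumes len(watermark) is a multiple of 64 and enough blocks qualify."""
    blocks = select_blocks(y, len(watermark) // (BLOCK * BLOCK))
    out = [row[:] for row in y]
    for i, byte in enumerate(watermark):
        r, c = pixel_position(blocks, i)
        out[r][c] += alpha if byte == 255 else -alpha
    return out


def extract(marked, original, n_bytes):
    """Recover the watermark by comparing marked and original luminance."""
    blocks = select_blocks(original, n_bytes // (BLOCK * BLOCK))
    recovered = []
    for i in range(n_bytes):
        r, c = pixel_position(blocks, i)
        recovered.append(255 if marked[r][c] - original[r][c] > 0 else 0)
    return recovered
```

With the abstract's figures, a 1024-byte watermark fills exactly 16 blocks of 64 pixels each. Extraction in this sketch needs the original image, which matches the comparison the abstract describes.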

arXiv:0912.3980 [pdf]

Fair Exchange of Digital Signatures using RSA-based CEMBS and Offline STTP

Authors: Jamal A. Hussein, Mumtaz A. AlMukhtar

Abstract: One of the essential security services needed to safeguard online transactions is fair exchange. In fair exchange protocols, two parties can exchange their signatures in a fair manner, so that either each party gains the other's signature or neither obtains anything useful. This paper examines security solutions for achieving fair exchange. It proposes new security protocols based on the "Certified Encrypted Message Being Signature" (CEMBS) using the RSA signature scheme. These protocols rely on the help of an "off-line Semi-Trusted Third Party" (STTP) to achieve fairness. They keep the exchanged items confidential from the STTP by limiting its role and power. Three different protocols have been proposed. In the first protocol, the two main parties exchange their signatures on a common message. In the second protocol, the signatures are exchanged on two different messages. In the third, the exchange is between confidential data and a signature.

Submitted 20 December, 2009; originally announced December 2009.

Journal ref: Journal of Computing, Volume 1, Issue 1, pp 87-91, December 2009

Showing 1–38 of 38 results for author: Hussein, A