-
Leveraging Knowledge Distillation for Lightweight Skin Cancer Classification: Balancing Accuracy and Computational Efficiency
Authors:
Niful Islam,
Khan Md Hasib,
Fahmida Akter Joti,
Asif Karim,
Sami Azam
Abstract:
Skin cancer is a major concern to public health, accounting for one-third of the reported cancers. If not detected early, the cancer has the potential for severe consequences. Recognizing the critical need for effective skin cancer classification, we address the limitations of existing models, which are often too large to deploy in areas with limited computational resources. In response, we present a knowledge distillation based approach for creating a lightweight yet high-performing classifier. The proposed solution involves fusing three models, namely ResNet152V2, ConvNeXtBase, and ViT Base, to create an effective teacher model. The teacher model is then employed to guide a lightweight student model of size 2.03 MB. This student model is further compressed to 469.77 KB using 16-bit quantization, enabling smooth incorporation into edge devices. With six-stage image preprocessing, data augmentation, and a rigorous ablation study, the model achieves an impressive accuracy of 98.75% on the HAM10000 dataset and 98.94% on the Kaggle dataset in classifying benign and malignant skin cancers. With its high accuracy and compact size, our model appears to be a potential choice for accurate skin cancer classification, particularly in resource-constrained settings.
Submitted 28 June, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
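The teacher-student setup described in this abstract can be illustrated with a generic distillation loss. The sketch below is not the authors' implementation: it assumes the three teacher networks are fused by averaging their logits, and it uses the common temperature-scaled formulation (soft-target KL divergence plus hard-label cross-entropy). The names `distillation_loss`, `temperature`, and `alpha` are hypothetical.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax along the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits_list, labels,
                      temperature=4.0, alpha=0.5):
    """Mean of alpha * cross-entropy + (1 - alpha) * T^2 * KL(teacher || student)."""
    # Assumption: the teacher ensemble is fused by averaging logits.
    teacher_logits = np.mean(np.stack(teacher_logits_list), axis=0)
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    eps = 1e-12
    # Soft-target term: KL divergence between softened distributions.
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student_soft + eps)),
        axis=-1,
    )
    # Hard-label term: cross-entropy at temperature 1.
    p_student = softmax(student_logits)
    labels = np.asarray(labels)
    ce = -np.log(p_student[np.arange(len(labels)), labels] + eps)
    return float(np.mean(alpha * ce + (1 - alpha) * temperature ** 2 * kl))


# Usage: a student that agrees with the teachers gets a lower loss.
teachers = [np.array([[2.0, 0.0], [0.0, 2.0]])] * 3
good_student = np.array([[2.0, 0.0], [0.0, 2.0]])
bad_student = np.array([[0.0, 2.0], [2.0, 0.0]])
loss_good = distillation_loss(good_student, teachers, [0, 1])
loss_bad = distillation_loss(bad_student, teachers, [0, 1])
```

Multiplying the soft-target term by the squared temperature keeps its gradient magnitude comparable to the hard-label term as the temperature changes; the weighting `alpha` is a tunable assumption.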
-
Intent Detection and Slot Filling for Home Assistants: Dataset and Analysis for Bangla and Sylheti
Authors:
Fardin Ahsan Sakib,
A H M Rezaul Karim,
Saadat Hasan Khan,
Md Mushfiqur Rahman
Abstract:
As voice assistants cement their place in our technologically advanced society, there remains a need to cater to the diverse linguistic landscape, including colloquial forms of low-resource languages. Our study introduces the first-ever comprehensive dataset for intent detection and slot filling in formal Bangla, colloquial Bangla, and Sylheti languages, totaling 984 samples across 10 unique intents. Our analysis reveals the robustness of large language models for tackling downstream tasks with inadequate data. The GPT-3.5 model achieves an impressive F1 score of 0.94 in intent detection and 0.51 in slot filling for colloquial Bangla.
Submitted 16 October, 2023;
originally announced October 2023.
-
mmWave Enabled Connected Autonomous Vehicles: A Use Case with V2V Cooperative Perception
Authors:
Muhammad Baqer Mollah,
Honggang Wang,
Mohammad Ataul Karim,
Hua Fang
Abstract:
Connected and autonomous vehicles (CAVs) will revolutionize tomorrow's intelligent transportation systems, being considered promising to improve transportation safety, traffic efficiency, and mobility. In fact, envisioned use cases of CAVs demand very high throughput, lower latency, highly reliable communications, and precise positioning capabilities. The availability of a large spectrum at millimeter-wave (mmWave) band potentially promotes new specifications in spectrum technologies capable of supporting such service requirements. In this article, we specifically focus on how mmWave communications are being approached in vehicular standardization activities, CAVs use cases and deployment challenges in realizing the future fully connected settings. Finally, we also present a detailed performance assessment on mmWave-enabled vehicle-to-vehicle (V2V) cooperative perception as an example case study to show the impact of different configurations.
Submitted 14 October, 2023;
originally announced October 2023.
-
Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks
Authors:
Kousik Rajesh,
Mrigank Raman,
Mohammed Asad Karim,
Pranit Chawla
Abstract:
In recent times there has been a surge of multi-modal architectures based on Large Language Models, which leverage the zero-shot generation capabilities of LLMs and project image embeddings into the text space and then use the auto-regressive capacity to solve tasks such as VQA, captioning, and image retrieval. We name these architectures "bridge-architectures" as they project from the image space to the text space. These models deviate from the traditional recipe of training transformer-based multi-modal models, which involves large-scale pre-training and complex multi-modal interactions through co- or cross-attention. However, the capabilities of bridge-architectures have not been tested on complex visual reasoning tasks that require fine-grained analysis of the image. In this project, we investigate the performance of these bridge-architectures on the NLVR2 dataset and compare it to state-of-the-art transformer-based architectures. We first extend the traditional bridge-architectures for the NLVR2 dataset by adding object-level features to facilitate fine-grained object reasoning. Our analysis shows that adding object-level features to bridge-architectures does not help, and that pre-training on multi-modal data is key for good performance on complex reasoning tasks such as NLVR2. We also demonstrate some initial results on a recent bridge-architecture, LLaVA, in the zero-shot setting and analyze its performance.
Submitted 30 July, 2023;
originally announced July 2023.
-
Extending the Frontier of ChatGPT: Code Generation and Debugging
Authors:
Fardin Ahsan Sakib,
Saadat Hasan Khan,
A. H. M. Rezaul Karim
Abstract:
Large-scale language models (LLMs) have emerged as a groundbreaking innovation in the realm of question-answering and conversational agents. These models, leveraging different deep learning architectures such as Transformers, are trained on vast corpora to predict sentences based on given queries. Among these LLMs, ChatGPT, developed by OpenAI, has ushered in a new era by utilizing artificial intelligence (AI) to tackle diverse problem domains, ranging from composing essays and biographies to solving intricate mathematical integrals. The versatile applications enabled by ChatGPT offer immense value to users. However, assessing the performance of ChatGPT's output poses a challenge, particularly in scenarios where queries lack clear objective criteria for correctness. For instance, evaluating the quality of generated essays becomes arduous and relies heavily on manual labor, in stark contrast to evaluating solutions to well-defined, closed-ended questions such as mathematical problems. This research paper delves into the efficacy of ChatGPT in solving programming problems, examining both the correctness and the efficiency of its solution in terms of time and memory complexity. The research reveals a commendable overall success rate of 71.875%, denoting the proportion of problems for which ChatGPT was able to provide correct solutions that successfully satisfied all the test cases present in Leetcode. It exhibits strengths in structured problems and shows a linear correlation between its success rate and problem acceptance rates. However, it struggles to improve solutions based on feedback, pointing to potential shortcomings in debugging tasks. These findings provide a compact yet insightful glimpse into ChatGPT's capabilities and areas for improvement.
Submitted 17 July, 2023;
originally announced July 2023.
-
Exploring the Vulnerabilities of Machine Learning and Quantum Machine Learning to Adversarial Attacks using a Malware Dataset: A Comparative Analysis
Authors:
Mst Shapna Akter,
Hossain Shahriar,
Iysa Iqbal,
MD Hossain,
M. A. Karim,
Victor Clincy,
Razvan Voicu
Abstract:
The burgeoning fields of machine learning (ML) and quantum machine learning (QML) have shown remarkable potential in tackling complex problems across various domains. However, their susceptibility to adversarial attacks raises concerns when deploying these systems in security-sensitive applications. In this study, we present a comparative analysis of the vulnerability of ML and QML models, specifically conventional neural networks (NN) and quantum neural networks (QNN), to adversarial attacks using a malware dataset. We utilize a software supply chain attack dataset known as ClaMP and develop two distinct models for QNN and NN, employing Pennylane for quantum implementations and TensorFlow and Keras for traditional implementations. Our methodology involves crafting adversarial samples by introducing random noise to a small portion of the dataset and evaluating the impact on the models' performance using accuracy, precision, recall, and F1 score metrics. Based on our observations, both ML and QML models exhibit vulnerability to adversarial attacks. While the QNN's accuracy decreases more significantly than the NN's after the attack, it demonstrates better performance in terms of precision and recall, indicating higher resilience in detecting true positives under adversarial conditions. We also find that adversarial samples crafted for one model type can impair the performance of the other, highlighting the need for robust defense mechanisms. Our study serves as a foundation for future research focused on enhancing the security and resilience of ML and QML models, particularly QNN, given its recent advancements. A more extensive range of experiments will be conducted to better understand the performance and robustness of both models in the face of adversarial attacks.
Submitted 31 May, 2023;
originally announced May 2023.
-
Attaining Class-level Forgetting in Pretrained Model using Few Samples
Authors:
Pravendra Singh,
Pratik Mazumder,
Mohammed Asad Karim
Abstract:
In order to address real-world problems, deep learning models are jointly trained on many classes. However, in the future, some classes may become restricted due to privacy/ethical concerns, and the restricted class knowledge has to be removed from the models that have been trained on them. The available data may also be limited due to privacy/ethical concerns, and re-training the model will not be possible. We propose a novel approach to address this problem without affecting the model's prediction power for the remaining classes. Our approach identifies the model parameters that are highly relevant to the restricted classes and removes the knowledge regarding the restricted classes from them using the limited available training data. Our approach is significantly faster and performs similarly to the model re-trained on the complete data of the remaining classes.
Submitted 19 October, 2022;
originally announced October 2022.
-
RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling
Authors:
Jian Liao,
Adnan Karim,
Shivesh Jadon,
Rubaiat Habib Kazi,
Ryo Suzuki
Abstract:
We present RealityTalk, a system that augments real-time live presentations with speech-driven interactive virtual elements. Augmented presentations leverage embedded visuals and animation for engaging and expressive storytelling. However, existing tools for live presentations often lack interactivity and improvisation, while creating such effects in video editing tools requires significant time and expertise. RealityTalk enables users to create live augmented presentations with real-time speech-driven interactions. The user can interactively prompt, move, and manipulate graphical elements through real-time speech and supporting modalities. Based on our analysis of 177 existing video-edited augmented presentations, we propose a novel set of interaction techniques and incorporate them into RealityTalk. We evaluate our tool from a presenter's perspective to demonstrate the effectiveness of our system.
Submitted 12 August, 2022;
originally announced August 2022.
-
Localization and Classification of Parasitic Eggs in Microscopic Images Using an EfficientDet Detector
Authors:
Nouar AlDahoul,
Hezerul Abdul Karim,
Shaira Limson Kee,
Myles Joshua Toledo Tan
Abstract:
Intestinal parasitic infections (IPIs) caused by protozoan and helminth parasites are among the most common infections in humans in low- and middle-income countries (LMICs). They are regarded as a severe public health concern, as they cause a wide array of potentially detrimental health conditions. Researchers have been developing pattern recognition techniques for the automatic identification of parasite eggs in microscopic images. Existing solutions still need improvements to reduce diagnostic errors and generate fast, efficient, and accurate results. Our paper addresses this and proposes a multi-modal learning detector to localize parasitic eggs and categorize them into 11 categories. The experiments were conducted on the novel Chula-ParasiteEgg-11 dataset, which was used to train both an EfficientDet model with an EfficientNet-v2 backbone and an EfficientNet-B7+SVM model. The dataset has 11,000 microscopic training images from 11 categories. Our results show robust performance with an accuracy of 92% and an F1 score of 93%. Additionally, the IOU distribution illustrates the high localization capability of the detector.
Submitted 3 August, 2022;
originally announced August 2022.
-
Augmented Reality and Robotics: A Survey and Taxonomy for AR-enhanced Human-Robot Interaction and Robotic Interfaces
Authors:
Ryo Suzuki,
Adnan Karim,
Tian Xia,
Hooman Hedayati,
Nicolai Marquardt
Abstract:
This paper contributes to a taxonomy of augmented reality and robotics based on a survey of 460 research papers. Augmented and mixed reality (AR/MR) have emerged as a new way to enhance human-robot interaction (HRI) and robotic interfaces (e.g., actuated and shape-changing interfaces). Recently, an increasing number of studies in HCI, HRI, and robotics have demonstrated how AR enables better interactions between people and robots. However, often research remains focused on individual explorations and key design strategies, and research questions are rarely analyzed systematically. In this paper, we synthesize and categorize this research field in the following dimensions: 1) approaches to augmenting reality; 2) characteristics of robots; 3) purposes and benefits; 4) classification of presented information; 5) design components and strategies for visual augmentation; 6) interaction techniques and modalities; 7) application domains; and 8) evaluation strategies. We formulate key challenges and opportunities to guide and inform future research in AR and robotics.
Submitted 7 March, 2022;
originally announced March 2022.
-
Effective classification of ECG signals using enhanced convolutional neural network in IOT
Authors:
Ahmad M. Karim
Abstract:
In this paper, a novel ECG monitoring approach based on IoT technology is proposed. The paper proposes a routing system for IoT healthcare platforms based on Dynamic Source Routing (DSR) and Routing by Energy and Link Quality (REL). In addition, approaches based on Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), and Convolutional Neural Networks (CNNs) for ECG signal categorization were tested in this study. Deep-ECG employs a deep CNN to extract important characteristics, which are then compared using simple and fast distance functions in order to classify cardiac problems efficiently. This work also proposes algorithms for the categorization of ECG data acquired from mobile watch users in order to identify aberrant data. The Massachusetts Institute of Technology and Beth Israel Hospital (MIT-BIH) Arrhythmia Database has been used for experimental verification of the proposed approaches. The results show that the proposed strategy outperforms others in terms of classification accuracy.
Submitted 8 February, 2022;
originally announced February 2022.
-
A new Sparse Auto-encoder based Framework using Grey Wolf Optimizer for Data Classification Problem
Authors:
Ahmad Mozaffer Karim
Abstract:
One of the most important properties of deep auto-encoders (DAEs) is their capability to extract high-level features from raw data. Hence, auto-encoders have recently become a preferred choice in various classification problems such as image and voice recognition, computer security, and medical data analysis. Despite their popularity and high performance, training auto-encoders is still a challenging task that involves selecting the parameters that let the model approach optimal results. Different training approaches are applied to train sparse auto-encoders. Previous studies and preliminary experiments reveal that those approaches may produce remarkable results in some problems but disappointing results in other, more complex problems. Metaheuristic algorithms have emerged over the last two decades and are becoming an essential part of contemporary optimization techniques. Grey wolf optimization (GWO) is one of the current algorithms of this kind and is applied to train sparse auto-encoders in this study. The model is validated on several popular gene expression databases. Results are compared with previous state-of-the-art methods studied with the same data sets and with other popular metaheuristic algorithms, namely Genetic Algorithms (GA), Particle Swarm Optimization (PSO), and Artificial Bee Colony (ABC). Results reveal that the model trained using GWO outperforms both conventional models and models trained with the most popular metaheuristic algorithms.
Submitted 28 January, 2022;
originally announced January 2022.
-
DILF-EN framework for Class-Incremental Learning
Authors:
Mohammed Asad Karim,
Indu Joshi,
Pratik Mazumder,
Pravendra Singh
Abstract:
Deep learning models suffer from catastrophic forgetting of the classes in the older phases as they get trained on the classes introduced in the new phase in the class-incremental learning setting. In this work, we show that the effect of catastrophic forgetting on the model prediction varies with the change in orientation of the same image, which is a novel finding. Based on this, we propose a novel data-ensemble approach that combines the predictions for the different orientations of the image to help the model retain further information regarding the previously seen classes and thereby reduce the effect of forgetting on the model predictions. However, we cannot directly use the data-ensemble approach if the model is trained using traditional techniques. Therefore, we also propose a novel dual-incremental learning framework that involves jointly training the network with two incremental learning objectives, i.e., the class-incremental learning objective and our proposed data-incremental learning objective. In the dual-incremental learning framework, each image belongs to two classes, i.e., the image class (for class-incremental learning) and the orientation class (for data-incremental learning). In class-incremental learning, each new phase introduces a new set of classes, and the model cannot access the complete training data from the older phases. In our proposed data-incremental learning, the orientation classes remain the same across all the phases, and the data introduced by the new phase in class-incremental learning acts as new training data for these orientation classes. We empirically demonstrate that the dual-incremental learning framework is vital to the data-ensemble approach. We apply our proposed approach to state-of-the-art class-incremental learning methods and empirically show that our framework significantly improves the performance of these methods.
Submitted 23 December, 2021;
originally announced December 2021.
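The data-ensemble idea in this abstract, combining a model's predictions over different orientations of the same image, can be sketched as follows. This is an illustrative assumption rather than the paper's procedure: it uses four rotations in 90-degree steps and a plain average of class probabilities. The function `orientation_ensemble` and the callable `predict_proba` are hypothetical names.

```python
import numpy as np


def orientation_ensemble(predict_proba, image, num_rotations=4):
    """Average class probabilities over 90-degree rotations of one image.

    predict_proba: callable mapping an (H, W[, C]) array to a 1-D probability vector.
    """
    probs = [
        predict_proba(np.rot90(image, k=k, axes=(0, 1)))  # rotate in the image plane
        for k in range(num_rotations)
    ]
    return np.mean(probs, axis=0)


# Usage with a toy "model" whose output depends on the top-left pixel only.
def toy_model(img):
    p = float(img[0, 0])
    return np.array([p, 1.0 - p])


toy_image = np.array([[1.0, 0.0],
                      [0.0, 0.0]])
ensemble_probs = orientation_ensemble(toy_model, toy_image)
```

Averaging is only one way to combine the per-orientation predictions; the abstract does not state which combination rule the authors use.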
-
DeepFakes: Detecting Forged and Synthetic Media Content Using Machine Learning
Authors:
Sm Zobaed,
Md Fazle Rabby,
Md Istiaq Hossain,
Ekram Hossain,
Sazib Hasan,
Asif Karim,
Khan Md. Hasib
Abstract:
The rapid advancement in deep learning makes it unprecedentedly hard to differentiate authentic from manipulated facial images and video clips. The underlying technology for manipulating facial appearances through deep generative approaches, known as DeepFake, has emerged recently and enabled a vast number of malicious face manipulation applications. Consequently, there is an indisputable need for techniques that can assess the integrity of digital visual content in order to reduce the impact of DeepFake creations. The large body of research on DeepFake creation and detection pushes each side beyond its current state. This study presents challenges, research trends, and directions related to DeepFake creation and detection techniques by reviewing the notable research in the DeepFake domain, to facilitate the development of more robust approaches that can deal with more advanced DeepFakes in the future.
Submitted 7 September, 2021;
originally announced September 2021.
-
Knowledge Consolidation based Class Incremental Online Learning with Limited Data
Authors:
Mohammed Asad Karim,
Vinay Kumar Verma,
Pravendra Singh,
Vinay Namboodiri,
Piyush Rai
Abstract:
We propose a novel approach for class incremental online learning in a limited data setting. This problem setting is challenging because of the following constraints: (1) Classes are given incrementally, which necessitates a class incremental learning approach; (2) Data for each class is given in an online fashion, i.e., each training example is seen only once during training; (3) Each class has very few training examples; and (4) We do not use or assume access to any replay/memory to store data from previous classes. Therefore, in this setting, we have to handle the twofold problem of catastrophic forgetting and overfitting. In our approach, we learn robust representations that are generalizable across tasks without suffering from the problems of catastrophic forgetting and overfitting to accommodate future classes with limited samples. Our proposed method leverages the meta-learning framework with knowledge consolidation. The meta-learning framework helps the model learn rapidly when samples appear in an online fashion. Simultaneously, knowledge consolidation helps to learn a representation that is robust against forgetting under online updates, which facilitates future learning. Our approach significantly outperforms other methods on several benchmarks.
Submitted 12 June, 2021;
originally announced June 2021.
-
A Clustering Framework for Lexical Normalization of Roman Urdu
Authors:
Abdul Rafae Khan,
Asim Karim,
Hassan Sajjad,
Faisal Kamiran,
Jia Xu
Abstract:
Roman Urdu is an informal form of the Urdu language written in Roman script, which is widely used in South Asia for online textual content. It lacks standard spelling and hence poses several normalization challenges during automatic language processing. In this article, we present a feature-based clustering framework for the lexical normalization of Roman Urdu corpora, which includes a phonetic algorithm UrduPhone, a string matching component, a feature-based similarity function, and a clustering algorithm Lex-Var. UrduPhone encodes Roman Urdu strings to their pronunciation-based representations. The string matching component handles character-level variations that occur when writing Urdu using Roman script.
Submitted 31 March, 2020;
originally announced April 2020.
-
Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text
Authors:
Muhammad Haroon Shakeel,
Asim Karim
Abstract:
Nowadays, an abundance of short text is being generated that uses nonstandard writing styles influenced by regional languages. Such informal and code-switched content is under-resourced in terms of labeled datasets and language models, even for popular tasks like sentiment classification. In this work, we (1) present a labeled dataset called MultiSenti for sentiment classification of code-switched informal short text, (2) explore the feasibility of adapting resources from a resource-rich language for an informal one, and (3) propose a deep learning-based model for sentiment classification of code-switched informal short text. We aim to achieve this without any lexical normalization, language translation, or code-switching indication. The performance of the proposed model is compared with three existing multilingual sentiment classification models. The results show that the proposed model performs better in general and that adapting character-based embeddings yields equivalent performance while being computationally more efficient than training word-based domain-specific embeddings.
Submitted 4 January, 2020;
originally announced January 2020.
-
A Multi-cascaded Model with Data Augmentation for Enhanced Paraphrase Detection in Short Texts
Authors:
Muhammad Haroon Shakeel,
Asim Karim,
Imdadullah Khan
Abstract:
Paraphrase detection is an important task in text analytics with numerous applications such as plagiarism detection, duplicate question identification, and enhanced customer support helpdesks. Deep models have been proposed for representing and classifying paraphrases. These models, however, require large quantities of human-labeled data, which is expensive to obtain. In this work, we present a data augmentation strategy and a multi-cascaded model for improved paraphrase detection in short texts. Our data augmentation strategy considers the notions of paraphrases and non-paraphrases as binary relations over the set of texts. Subsequently, it uses graph theoretic concepts to efficiently generate additional paraphrase and non-paraphrase pairs in a sound manner. Our multi-cascaded model employs three supervised feature learners (cascades) based on CNN and LSTM networks with and without soft-attention. The learned features, together with hand-crafted linguistic features, are then forwarded to a discriminator network for final classification. Our model is both wide and deep and provides greater robustness across clean and noisy short texts. We evaluate our approach on three benchmark datasets and show that it produces a comparable or state-of-the-art performance on all three.
Submitted 27 December, 2019;
originally announced December 2019.
-
A Multi-cascaded Deep Model for Bilingual SMS Classification
Authors:
Muhammad Haroon Shakeel,
Asim Karim,
Imdadullah Khan
Abstract:
Most studies on text classification are focused on the English language. However, short texts such as SMS are influenced by regional languages. This makes the automatic text classification task challenging due to the multilingual, informal, and noisy nature of language in the text. In this work, we propose a novel multi-cascaded deep learning model called McM for bilingual SMS classification. McM exploits $n$-gram level information as well as long-term dependencies of text for learning. Our approach aims to learn a model without any code-switching indication, lexical normalization, language translation, or language transliteration. The model relies entirely upon the text, as no external knowledge base is utilized for learning. For this purpose, a 12-class bilingual text dataset is developed from citizens' SMS feedback on public services, containing mixed Roman Urdu and English. Our model achieves high accuracy for classification on this dataset and outperforms the previous model for multilingual text classification, highlighting the language independence of McM.
Submitted 29 November, 2019;
originally announced November 2019.
-
Low Cost 3D Printing for Rapid Prototyping and its Application
Authors:
Taha Hasan Masood Siddique,
Iqra Sami,
Malik Zohaib Nisar,
Mashal Naeem,
Abid Karim,
Muhammad Usman
Abstract:
In the recent years of industrial revolution, 3D printing has grown into an expanding field of new applications. Its low-cost solutions and short time to market make it a favorable candidate for use in the dynamic fields of engineering. Additive printing has a vast range of applications in many fields. This study presents the wide range of applications of 3D printers, along with a comparison of additive printing with traditional manufacturing methods. A tutorial is presented explaining the steps involved in prototype printing using Rhinoceros 3D and Simplify 3D software, including the detailed specifications of the end products that were printed using the Delta 3D printer.
Submitted 25 November, 2019;
originally announced November 2019.
-
Toxicity Prediction by Multimodal Deep Learning
Authors:
Abdul Karim,
Jaspreet Singh,
Avinash Mishra,
Abdollah Dehzangi,
M. A. Hakim Newton,
Abdul Sattar
Abstract:
Prediction of toxicity levels of chemical compounds is an important issue in Quantitative Structure-Activity Relationship (QSAR) modeling. Although toxicity prediction has achieved significant progress in recent times through deep learning, prediction accuracy levels obtained by even very recent methods are not yet very high. We propose a multimodal deep learning method using multiple heterogeneous neural network types and data representations. We represent chemical compounds by strings, images, and numerical features. We train fully connected, convolutional, and recurrent neural networks and their ensembles. Each data representation or neural network type has its own strengths and weaknesses. Our motivation is to obtain a collective performance that could go beyond the individual performance of each data representation or each neural network type. On a standard toxicity benchmark, our proposed method obtains significantly better accuracy levels than state-of-the-art toxicity prediction methods.
Submitted 18 July, 2019;
originally announced July 2019.
-
Efficient Toxicity Prediction via Simple Features Using Shallow Neural Networks and Decision Trees
Authors:
Abdul Karim,
Avinash Mishra,
M A Hakim Newton,
Abdul Sattar
Abstract:
Toxicity prediction of chemical compounds is a grand challenge. Lately, it has achieved significant progress in accuracy, but by using a huge set of features, implementing a complex blackbox technique such as a deep neural network, and exploiting enormous computational resources. In this paper, we strongly argue for models and methods that are simple in machine learning characteristics, efficient in computing resource usage, and powerful enough to achieve very high accuracy levels. To demonstrate this, we develop a single task-based chemical toxicity prediction framework using only 2D features that are less compute intensive. We effectively use a decision tree to obtain an optimum number of features from a collection of thousands of them. We use a shallow neural network and jointly optimize it with the decision tree, taking both network parameters and input features into account. Our model needs only a minute on a single CPU for its training, while existing methods using deep neural networks need about 10 min on an NVIDIA Tesla K40 GPU. Nevertheless, we obtain similar or better performance on several toxicity benchmark tasks. We also develop a cumulative feature ranking method which enables us to identify features that can help chemists perform prescreening of toxic compounds effectively.
Submitted 26 January, 2019;
originally announced January 2019.
-
A Study on 3D Surface Graph Representations
Authors:
Long Nguyen,
Abdullah Karim
Abstract:
Surface graphs have been used in many application domains to represent three-dimensional (3D) data. Another approach to representing 3D data is making projections onto two-dimensional (2D) graphs. This approach results in multiple displays, and switching between different screens for a different perspective is time-consuming. In this work, we study the performance of 3D versions of popular 2D visualization techniques for time series: horizon graph, small multiple, and simple line graph. We explore discrimination tasks with respect to each visualization technique that requires simultaneous representations. We demonstrate our study by visualizing the saturated thickness of the Ogallala aquifer, the Southern High Plains Aquifer of Texas, in multiple years. For the evaluation, we design comparison and discrimination tasks and automatically record the results of tasks performed by a group of students at a university. Our results show that 3D small multiples perform well, with stable accuracy over numbers of occurrences. On the other hand, shared-space visualization within a single 3D coordinate system is more efficient with a small number of simultaneous graphs. The 3D horizon graph loses its competence in the 3D coordinate system, with the lowest accuracy compared to other techniques. Our 3D spatial-temporal demonstration is also presented on the Southern High Plains Aquifer of Texas from 2010 to 2016.
Submitted 18 November, 2018;
originally announced November 2018.
-
STOAViz: Visualizing Saturated Thickness of Ogallala Aquifer
Authors:
Tommy Dang,
Long Nguyen,
Abdullah Karim,
Venkatesh Uddameri
Abstract:
In this paper, we introduce STOAViz, a visual analytics tool for analyzing the saturated thickness of the Ogallala aquifer. The saturated thicknesses are monitored by sensors integrated into wells distributed over a vast geographic area. Our analytics application also captures the trends and patterns (such as average/standard deviation over time, sudden increase/decrease of saturated thicknesses) of water in an individual well and in a group of wells based on their geographic locations. To highlight the usefulness and effectiveness of STOAViz, we demonstrate it on the Southern High Plains Aquifer of Texas. The work was developed using feedback from experts at the water resource center at a university. Moreover, our technique can be applied to any geographic area where wells and their measurements are available.
Submitted 18 November, 2018;
originally announced November 2018.
-
Machine Learning Interpretability: A Science rather than a tool
Authors:
Abdul Karim,
Avinash Mishra,
MA Hakim Newton,
Abdul Sattar
Abstract:
The term "interpretability" is often used by machine learning researchers, each with their own intuitive understanding of it. There is no universal, well-agreed-upon definition of interpretability in machine learning. Any scientific discipline is mainly driven by the set of questions it formulates rather than by the different tools it uses; e.g., astrophysics is the discipline that learns the composition of stars, not the discipline that uses spectroscopes. Similarly, we propose that machine learning interpretability should be a discipline that answers specific questions related to interpretability. These questions can be of a statistical, causal, or counterfactual nature. Therefore, there is a need to look into the interpretability problem of machine learning in the context of questions that need to be addressed rather than different tools. We discuss a hypothetical interpretability framework driven by a question-based scientific approach rather than some specific machine learning model. Using a question-based notion of interpretability, we can step towards understanding the science of machine learning rather than its engineering. This notion will also help us understand any specific problem in more depth rather than relying solely on machine learning methods.
Submitted 25 July, 2018; v1 submitted 17 July, 2018;
originally announced July 2018.
-
Improving Text Normalization by Optimizing Nearest Neighbor Matching
Authors:
Salman Ahmad Ansari,
Usman Zafar,
Asim Karim
Abstract:
Text normalization is an essential task in the processing and analysis of social media, which is dominated by informal writing. It aims to map informal words to their intended standard forms. Previously proposed text normalization approaches typically require manual selection of parameters for improved performance. In this paper, we present an automatic optimization-based nearest neighbor matching approach for text normalization. This approach is motivated by the observation that text normalization is essentially a matching problem and nearest neighbor matching with an adaptive similarity function is the most direct procedure for it. Our similarity function incorporates weighted contributions of contextual, string, and phonetic similarity, and the nearest neighbor matching involves a minimum similarity threshold. These four parameters are tuned efficiently using grid search. We evaluate the performance of our approach on two benchmark datasets. The results demonstrate that parameter tuning on small labeled datasets produces state-of-the-art text normalization performance. Thus, this approach allows practically easy construction of evolving domain-specific normalization lexicons.
Submitted 27 December, 2017;
originally announced December 2017.
-
Causal Inference for Social Discrimination Reasoning
Authors:
Bilal Qureshi,
Faisal Kamiran,
Asim Karim,
Salvatore Ruggieri,
Dino Pedreschi
Abstract:
The discovery of discriminatory bias in human or automated decision making is a task of increasing importance and difficulty, exacerbated by the pervasive use of machine learning and data mining. Currently, discrimination discovery largely relies upon correlation analysis of decision records, disregarding the impact of confounding biases. We present a method for causal discrimination discovery based on propensity score analysis, a statistical tool for filtering out the effect of confounding variables. We introduce causal measures of discrimination which quantify the effect of group membership on the decisions, and highlight causal discrimination/favoritism patterns by learning regression trees over the novel measures. We validate our approach on two real-world datasets. Our proposed framework for causal discrimination has the potential to enhance the transparency of machine learning with tools for detecting discriminatory bias both in the training data and in the learning algorithms.
Submitted 4 November, 2019; v1 submitted 12 August, 2016;
originally announced August 2016.
-
Adaptive Beaconing Approaches for Vehicular ad hoc Networks: A Survey
Authors:
Syed Adeel Ali Shah,
Ejaz Ahmed,
Feng Xia,
Ahmad Karim,
Muhammad Shiraz,
Rafidah MD Noor
Abstract:
Vehicular communication requires vehicles to self-organize through the exchange of periodic beacons. Recent analysis on beaconing indicates that the standards for beaconing restrict the desired performance of vehicular applications. This situation can be attributed to the quality of the available transmission medium, persistent change in the traffic situation and the inability of standards to cope with application requirements. To this end, this paper is motivated by the classifications and capability evaluations of existing adaptive beaconing approaches. To begin with, we explore the anatomy and the performance requirements of beaconing. Then, the beaconing design is analyzed to introduce a design-based beaconing taxonomy. A survey of the state-of-the-art is conducted with an emphasis on the salient features of the beaconing approaches. We also evaluate the capabilities of beaconing approaches using several key parameters. A comparison among beaconing approaches is presented, which is based on the architectural and implementation characteristics. The paper concludes by discussing open challenges in the field.
Submitted 24 May, 2016;
originally announced May 2016.
-
X-TREPAN: a multi class regression and adapted extraction of comprehensible decision tree in artificial neural networks
Authors:
Awudu Karim,
Shangbo Zhou
Abstract:
In this work, the TREPAN algorithm is enhanced and extended for extracting decision trees from neural networks. We empirically evaluated the performance of the algorithm on a set of databases from real-world events. This benchmark enhancement was achieved by adapting Single-test TREPAN and C4.5 decision tree induction algorithms to analyze the datasets. The models are then compared with X-TREPAN for comprehensibility and classification accuracy. Furthermore, we validate the experiments by applying statistical methods. Finally, the modified algorithm is extended to work with multi-class regression problems, and the ability to comprehend generalized feed-forward networks is achieved.
Submitted 30 August, 2015;
originally announced August 2015.
-
Real-Time System of Hand Detection And Gesture Recognition In Cyber Presence Interactive System For E-Learning
Authors:
Bousaaid Mourad,
Ayaou Tarik,
Afdel Karim,
Estraillier Pascal
Abstract:
The development of multimedia technologies, together with that of the Internet and the democratization of high-speed access, has made E-learning possible for geographically distributed learners in virtual classes. The quality and quantity of asynchronous and synchronous communications are the key elements for E-learning success. Suitable supervision is important to reduce the feeling of isolation in E-learning. This feeling of isolation is among the main causes of attrition and high dropout rates in E-learning. The research to be conducted in this domain aims to bring converging solutions based on real-time imaging for the capture and recognition of hand gestures. These gestures will be analyzed by the system and transformed into an indicator of participation. This indicator is displayed in the tutor's performance table as a curve over time. If a learner becomes isolated, the indicator of participation turns red and the tutor is informed of learners who have difficulty participating during the learning session.
Submitted 8 December, 2014;
originally announced February 2015.
-
System Interactive Cyber Presence for E learning to Break Down Learner Isolation
Authors:
Bousaaid Mourad,
Ayaou Tarik,
Afdel Karim,
Estraillier Pascal
Abstract:
The development of multimedia technologies, together with that of the Internet and the democratization of high-speed access, has made E-learning possible for geographically distributed learners in virtual classes. One benefit of taking a course online is that the online course structure is typically more student-focused than teacher-centered, encouraging more active participation by students in collaborative learning activities. The quality and quantity of asynchronous and synchronous communications are the key elements for E-learning success. A potential problem that has received little exploration is the student's feeling of isolation. Suitable supervision is important to break down the learner's feeling of isolation in an E-learning environment. This feeling of isolation is among the main causes of attrition and high dropout rates in E-learning. It affects learners' levels of participation, satisfaction, and learning. To overcome this feeling of isolation, we aim, through this research, to provide the trainer and each learner with an environment allowing them to behave as if they were face to face; in other words, to approach the pedagogy of classroom teaching. Our contribution to reducing the feeling of isolation is to ensure the presence of the teacher in the educational tools. These tools aim to establish a real dialogue with learners, compelling them to take an active part in their learning. The tools we offer include the Openmeeting video conference integrated in Moodle, which provides the notions of class and whiteboard; the hand-gesture-based motivation quantification indicator that we developed; and Web 2.0 social networks such as Facebook, YouTube, and Twitter, to promote collaboration, sharing, and communication between learners and their peers.
Submitted 23 February, 2015;
originally announced February 2015.
-
Blind and robust images watermarking based on wavelet and edge insertion
Authors:
Henri Bruno Razafindradina,
Attoumani Mohamed Karim
Abstract:
This paper presents a new watermarking scheme that inserts the mark by adding edges in the HH sub-band of the host image after wavelet decomposition. Contrary to most watermarking algorithms in the wavelet domain, our method is blind, and results show that it is robust against JPEG and GIF compression, histogram and spectrum spreading, noise addition, and small rotation. Its robustness against compression is better than that of other watermarking algorithms reported in the literature. The algorithm is flexible because its capacity or robustness can be improved by modifying some parameters.
Submitted 9 October, 2013;
originally announced October 2013.
-
Speaker Identification using MFCC-Domain Support Vector Machine
Authors:
S. M. Kamruzzaman,
A. N. M. Rezaul Karim,
Md. Saiful Islam,
Md. Emdadul Haque
Abstract:
Speech recognition and speaker identification are important for authentication and verification for security purposes, but they are difficult to achieve. Speaker identification methods can be divided into text-independent and text-dependent. This paper presents a technique for text-dependent speaker identification using an MFCC-domain support vector machine (SVM). In this work, mel-frequency cepstrum coefficients (MFCCs) and their statistical distribution properties are used as features, which will be inputs to the neural network. This work is the first to use the sequential minimal optimization (SMO) learning technique for the SVM, which improves performance over the traditional chunking and Osuna techniques. The cepstrum coefficients representing the speaker characteristics of a speech segment are computed by nonlinear filter bank analysis and discrete cosine transform. The speaker identification ability and convergence speed of the SVMs are investigated for different combinations of features. Extensive experimental results on several samples show the effectiveness of the proposed approach.
Submitted 25 September, 2010;
originally announced September 2010.