doi 10.5220/0012424500003660

Region-Transformer: Self-Attention Region Based Class-Agnostic Point Cloud Segmentation

Authors: Dipesh Gyawali, Jian Zhang, BB Karki

Abstract: Point cloud segmentation, which helps us understand the environment of specific structures and objects, can be performed in class-specific and class-agnostic ways. We propose a novel region-based transformer model called Region-Transformer for performing class-agnostic point cloud segmentation. The model utilizes a region-growth approach and self-attention mechanism to iteratively expand or contract a region by adding or removing points. It is trained on simulated point clouds with instance labels only, avoiding semantic labels. Attention-based networks have succeeded in many previous methods of performing point cloud segmentation. However, a region-growth approach with attention-based networks has yet to be used to explore its performance gain. To our knowledge, we are the first to use a self-attention mechanism in a region-growth approach. With the introduction of self-attention to region-growth that can utilize local contextual information of neighborhood points, our experiments demonstrate that the Region-Transformer model outperforms previous class-agnostic and class-specific methods on indoor datasets regarding clustering metrics. The model generalizes well to large-scale scenes. Key advantages include capturing long-range dependencies through self-attention, avoiding the need for semantic labels during training, and applicability to a variable number of objects. The Region-Transformer model represents a promising approach for flexible point cloud segmentation with applications in robotics, digital twinning, and autonomous vehicles.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 8 pages, 5 figures, 3 tables

Journal ref: 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4 VISAPP: VISAPP, 341-348, 2024, Rome, Italy

arXiv:2309.02521 [pdf, other]

Comparative Analysis of CPU and GPU Profiling for Deep Learning Models

Authors: Dipesh Gyawali

Abstract: Deep Learning(DL) and Machine Learning(ML) applications are rapidly increasing in recent days. Massive amounts of data are being generated over the internet which can derive meaningful results by the use of ML and DL algorithms. Hardware resources and open-source libraries have made it easy to implement these algorithms. Tensorflow and Pytorch are one of the leading frameworks for implementing ML projects. By using those frameworks, we can trace the operations executed on both GPU and CPU to analyze the resource allocations and consumption. This paper presents the time and memory allocation of CPU and GPU while training deep neural networks using Pytorch. This paper analysis shows that GPU has a lower running time as compared to CPU for deep neural networks. For a simpler network, there are not many significant improvements in GPU over the CPU.

Submitted 8 December, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

Comments: 6 pages, 11 figures

arXiv:2309.00819 [pdf, other]

Mixed Reality: The Interface of the Future

Authors: Dipesh Gyawali

Abstract: The world is slowly moving towards everything being simulated digitally and virtually. Mixed Reality (MR) is the amalgam of the real world with virtual stimuli. It has great prospects in the future in terms of various applications additionally with some challenges. This paper focuses on how Mixed Reality could be used in the future along with the challenges that could arise. Several application areas along with the potential benefits are studied in this research. Three research questions are proposed, analyzed, and concluded through the experiments. While the availability of MR devices could introduce a lot of potential, specific challenges need to be scrutinized by the developers and manufacturers. Overall, MR technology has a chance to enhance personalized, supportive, and interactive experiences for human lives.

Submitted 2 September, 2023; originally announced September 2023.

Comments: 6 pages, 8 figures

arXiv:2211.05339 [pdf, ps, other]

Writing summary for the state-of-the-art methods for big data clustering in distributed environment

Authors: Dipesh Gyawali

Abstract: Big Data processing systems handle huge unstructured and structured data to store, process, and analyze through cluster analysis which helps in identifying unseen patterns to find the relationships between them. Clustering analysis over the shared machines in big data technologies helps in deriving the relations and making decisions using data in context. It can handle every form of raw, tabular data along with structured, semi-structured, and unstructured data. The data doesn't have to possess linearity property. It can reflect associative and correlative patterns and groupings. The main contribution and findings of this paper are to gather and summarize the recent big data clustering techniques, and their strengths, and weaknesses in any distributed environment.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 4 Pages

MSC Class: 2.2

arXiv:2104.08585 [pdf, other]

doi 10.1109/ICCCNT49239.2020.9225443

Age Range Estimation using MTCNN and VGG-Face Model

Authors: Dipesh Gyawali, Prashanga Pokharel, Ashutosh Chauhan, Subodh Chandra Shakya

Abstract: The Convolutional Neural Network has amazed us with its usage on several applications. Age range estimation using CNN is emerging due to its application in myriad of areas which makes it a state-of-the-art area for research and improve the estimation accuracy. A deep CNN model is used for identification of people's age range in our proposed work. At first, we extracted only face images from image dataset using MTCNN to remove unnecessary features other than face from the image. Secondly, we used random crop technique for data augmentation to improve the model performance. We have used the concept of transfer learning in our research. A pretrained face recognition model i.e VGG-Face is used to build our model for identification of age range whose performance is evaluated on Adience Benchmark for confirming the efficacy of our work. The performance in test set outperformed existing state-of-the-art by substantial margins.

Submitted 17 April, 2021; originally announced April 2021.

Comments: 6 pages, 10 figures

Journal ref: 11th IEEE International Conference on Computing, Communication and Networking Technologies (ICCCNT), 2020

arXiv:2004.02168 [pdf, other]

Comparative Analysis of Multiple Deep CNN Models for Waste Classification

Authors: Dipesh Gyawali, Alok Regmi, Aatish Shakya, Ashish Gautam, Surendra Shrestha

Abstract: Waste is a wealth in a wrong place. Our research focuses on analyzing possibilities for automatic waste sorting and collecting in such a way that helps it for further recycling process. Various approaches are being practiced managing waste but not efficient and require human intervention. The automatic waste segregation would fit in to fill the gap. The project tested well known Deep Learning Network architectures for waste classification with dataset combined from own endeavors and Trash Net. The convolutional neural network is used for image classification. The hardware built in the form of dustbin is used to segregate those wastes into different compartments. Without the human exercise in segregating those waste products, the study would save the precious time and would introduce the automation in the area of waste management. Municipal solid waste is a huge, renewable source of energy. The situation is win-win for both government, society and industrialists. Because of fine-tuning of the ResNet18 Network, the best validation accuracy was found to be 87.8%.

Submitted 14 August, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

Comments: 6 pages, 13 figures

Journal ref: 5th International Conference on Advanced Engineering and ICT-Convergence 2020

Showing 1–6 of 6 results for author: Gyawali, D