latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction

Authors: Christopher Wewer, Kevin Raj, Eddy Ilg, Bernt Schiele, Jan Eric Lenssen

Abstract: We present latentSplat, a method to predict semantic Gaussians in a 3D latent space that can be splatted and decoded by a light-weight generative 2D architecture. Existing methods for generalizable 3D reconstruction either do not enable fast inference of high resolution novel views due to slow volume rendering, or are limited to interpolation of close input views, even in simpler settings with a single central object, where 360-degree generalization is possible. In this work, we combine a regression-based approach with a generative model, moving towards both of these capabilities within the same method, trained purely on readily available real video data. The core of our method are variational 3D Gaussians, a representation that efficiently encodes varying uncertainty within a latent space consisting of 3D feature Gaussians. From these Gaussians, specific instances can be sampled and rendered via efficient Gaussian splatting and a fast, generative decoder network. We show that latentSplat outperforms previous works in reconstruction quality and generalization, while being fast and scalable to high-resolution data.

Submitted 24 March, 2024; originally announced March 2024.

Comments: Project website: https://geometric-rl.mpi-inf.mpg.de/latentsplat/

arXiv:2312.17748

K-PERM: Personalized Response Generation Using Dynamic Knowledge Retrieval and Persona-Adaptive Queries

Authors: Kanak Raj, Kaushik Roy, Vamshi Bonagiri, Priyanshul Govil, Krishnaprasad Thirunarayanan, Manas Gaur

Abstract: Personalizing conversational agents can enhance the quality of conversations and increase user engagement. However, they often lack external knowledge to appropriately tend to a user's persona. This is particularly crucial for practical applications like mental health support, nutrition planning, culturally sensitive conversations, or reducing toxic behavior in conversational agents. To enhance the relevance and comprehensiveness of personalized responses, we propose using a two-step approach that involves (1) selectively integrating user personas and (2) contextualizing the response with supplementing information from a background knowledge source. We develop K-PERM (Knowledge-guided PErsonalization with Reward Modulation), a dynamic conversational agent that combines these elements. K-PERM achieves state-of-the-art performance on the popular FoCus dataset, containing real-world personalized conversations concerning global landmarks. We show that using responses from K-PERM can improve performance in state-of-the-art LLMs (GPT 3.5) by 10.5%, highlighting the impact of K-PERM for personalizing chatbots.

Submitted 6 February, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

Comments: Accepted at AAAI 2024 Spring Symposium Series

arXiv:2306.01805

Cook-Gen: Robust Generative Modeling of Cooking Actions from Recipes

Authors: Revathy Venkataramanan, Kaushik Roy, Kanak Raj, Renjith Prasad, Yuxin Zi, Vignesh Narayanan, Amit Sheth

Abstract: As people become more aware of their food choices, food computation models have become increasingly popular in assisting people in maintaining healthy eating habits. For example, food recommendation systems analyze recipe instructions to assess nutritional contents and provide recipe recommendations. The recent and remarkable successes of generative AI methods, such as auto-regressive large language models, can lead to robust methods for a more comprehensive understanding of recipes for healthy food recommendations beyond surface-level nutrition content assessments. In this study, we explore the use of generative AI methods to extend current food computation models, primarily involving the analysis of nutrition and ingredients, to also incorporate cooking actions (e.g., add salt, fry the meat, boil the vegetables, etc.). Cooking actions are notoriously hard to model using statistical learning methods due to irregular data patterns - significantly varying natural language descriptions for the same action (e.g., marinate the meat vs. marinate the meat and leave overnight) and infrequently occurring patterns (e.g., add salt occurs far more frequently than marinating the meat). The prototypical approach to handling irregular data patterns is to increase the volume of data that the model ingests by orders of magnitude. Unfortunately, in the cooking domain, these problems are further compounded with larger data volumes presenting a unique challenge that is not easily handled by simply scaling up.
In this work, we propose novel aggregation-based generative AI methods, Cook-Gen, that reliably generate cooking actions from recipes, despite difficulties with irregular data patterns, while also outperforming Large Language Models and other strong baselines.

Submitted 1 June, 2023; originally announced June 2023.

arXiv:2301.00948

Understanding EEG signals for subject-wise Definition of Armoni Activities

Authors: Kislay Raj, Aditya Singh, Abhishek Mandal, Teerath Kumar, Arunabha M. Roy

Abstract: In a growing world of technology, psychological disorders became a challenge to be solved. The methods used for cognitive stimulation are very conventional and based on one-way communication, which only relies on the material or method used for training of an individual. It doesn't use any kind of feedback from the individual to analyze the progress of the training process. We have proposed a closed-loop methodology to improve the cognitive state of a person with ID (Intellectual disability). We have used a platform named 'Armoni', for providing training to the intellectually disabled individuals. The learning is performed in a closed-loop by using feedback in the form of change in affective state. For feedback to the Armoni, an EEG (Electroencephalograph) headband is used. All the changes in EEG are observed and classified against the change in the mean and standard deviation value of all frequency bands of signal. This comparison is being helpful in defining every activity with respect to change in brain signals. In this paper, we have discussed the process of treatment of EEG signal and its definition against the different activities of Armoni. We have tested it on 6 different systems with different age groups and cognitive levels.

Submitted 26 April, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

Comments: Submitted to SN Computer Science journal

arXiv:2209.06977

SQL and NoSQL Databases Software architectures performance analysis and assessments -- A Systematic Literature review

Authors: Wisal Khan, Teerath Kumar, Zhang Cheng, Kislay Raj, Arunabha M Roy, Bin Luo

Abstract: Context: The efficient processing of Big Data is a challenging task for SQL and NoSQL Databases, where competent software architecture plays a vital role. The SQL Databases are designed for structuring data and supporting vertical scalability. In contrast, horizontal scalability is backed by NoSQL Databases and can process sizeable unstructured Data efficiently. One can choose the right paradigm according to the organisation's needs; however, making the correct choice can often be challenging. The SQL and NoSQL Databases follow different architectures. Also, the mixed model is followed by each category of NoSQL Databases. Hence, data movement becomes difficult for cloud consumers across multiple cloud service providers (CSPs). In addition, each cloud platform IaaS, PaaS, SaaS, and DBaaS also monitors various paradigms. Objective: This systematic literature review (SLR) aims to study the related articles associated with SQL and NoSQL Database software architectures and tackle data portability and Interoperability among various cloud platforms. State of the art presented many performance comparison studies of SQL and NoSQL Databases by observing scaling, performance, availability, consistency and sharding characteristics. According to the research studies, NoSQL Database designed structures can be the right choice for big data analytics, while SQL Databases are suitable for OLTP Databases. The researcher proposes numerous approaches associated with data movement in the cloud. Platform-based APIs are developed, which makes users' data movement difficult.
Therefore, data portability and Interoperability issues are noticed during data movement across multiple CSPs. To minimize developer efforts and Interoperability, Unified APIs are demanded to make data movement relatively more accessible among various cloud platforms.

Submitted 14 September, 2022; originally announced September 2022.

Comments: 57 pages systematic literature review, already submitted to Big Data Research; More importantly, we can not add method, result and conclusion section in the abstract here due to characters limitations. Please check pdf file

arXiv:2105.01707

ABET Accreditation: A Way Forward for PDC Education

Authors: Sherif G. Aly, Haidar Harmanani, Rajendra K. Raj, Sanaa Sharafeddine

Abstract: With parallel and distributed computing (PDC) now wide-spread, modern computing programs must incorporate PDC within the curriculum. ACM and IEEE Computer Society's Computer Science curricular guidelines have recommended exposure to PDC concepts since 2013. More recently, a variety of initiatives have made PDC curricular content, lectures, and labs freely available for undergraduate computer science programs. Despite these efforts, progress in ensuring computer science students graduate with sufficient PDC exposure has been uneven. This paper discusses the impact of ABET's revised criteria that have required exposure to PDC to achieve accreditation for computer science programs since 2018. The authors reviewed 20 top ABET-accredited computer science programs and analyzed how they covered the required PDC components in their curricula. Using their own institutions as case studies, the authors examine in detail how three different ABET-accredited computer science programs covered PDC using different approaches, yet meeting the PDC requirements of these ABET criteria. The paper also shows how ACM/IEEE Computer Society curricular guidelines for computer engineering and software engineering programs, along with ABET accreditation criteria, can cover PDC.

Submitted 4 May, 2021; originally announced May 2021.

Journal ref: EduPar-21: 11th NSF/TCPP Workshop on Parallel and Distributed Computing Education, May 2021

Showing 1–6 of 6 results for author: Raj, K