-
REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability
Authors:
Shuang Ao,
Simon Khan,
Haris Aziz,
Flora D. Salim
Abstract:
Understanding the agent's learning process, particularly the factors that contribute to its success or failure post-training, is crucial for comprehending the rationale behind the agent's decision-making process. Prior methods clarify the learning process by creating a structural causal model (SCM) or visually representing the distribution of value functions. Nevertheless, these approaches have co…
▽ More
Understanding the agent's learning process, particularly the factors that contribute to its success or failure post-training, is crucial for comprehending the rationale behind the agent's decision-making process. Prior methods clarify the learning process by creating a structural causal model (SCM) or visually representing the distribution of value functions. Nevertheless, these approaches have constraints as they exclusively function in 2D-environments or with uncomplicated transition dynamics. Understanding the agent's learning process in complicated environments or tasks is more challenging. In this paper, we propose REVEAL-IT, a novel framework for explaining the learning process of an agent in complex environments. Initially, we visualize the policy structure and the agent's learning process for various training tasks. By visualizing these findings, we can understand how much a particular training task or stage affects the agent's performance in test. Then, a GNN-based explainer learns to highlight the most important section of the policy, providing a more clear and robust explanation of the agent's learning process. The experiments demonstrate that explanations derived from this framework can effectively help in the optimization of the training tasks, resulting in improved learning efficiency and final performance.
△ Less
Submitted 16 July, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
ViLCo-Bench: VIdeo Language COntinual learning Benchmark
Authors:
Tianqi Tang,
Shohreh Deldari,
Hao Xue,
Celso De Melo,
Flora D. Salim
Abstract:
Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark…
▽ More
Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark, ViLCo-Bench, designed to evaluate continual learning models across a range of video-text tasks. The dataset comprises ten-minute-long videos and corresponding language queries collected from publicly available datasets. Additionally, we introduce a novel memory-efficient framework that incorporates self-supervised learning and mimics long-term and short-term memory effects. This framework addresses challenges including memory complexity from long video clips, natural language complexity from open queries, and text-video misalignment. We posit that ViLCo-Bench, with greater complexity compared to existing continual learning benchmarks, would serve as a critical tool for exploring the video-language domain, extending beyond conventional class-incremental tasks, and addressing complex and limited annotation issues. The curated data, evaluations, and our novel method are available at https://github.com/cruiseresearchgroup/ViLCo .
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
BTS: Building Timeseries Dataset: Empowering Large-Scale Building Analytics
Authors:
Arian Prabowo,
Xiachong Lin,
Imran Razzak,
Hao Xue,
Emily W. Yap,
Matthew Amos,
Flora D. Salim
Abstract:
Buildings play a crucial role in human well-being, influencing occupant comfort, health, and safety. Additionally, they contribute significantly to global energy consumption, accounting for one-third of total energy usage, and carbon emissions. Optimizing building performance presents a vital opportunity to combat climate change and promote human flourishing. However, research in building analytic…
▽ More
Buildings play a crucial role in human well-being, influencing occupant comfort, health, and safety. Additionally, they contribute significantly to global energy consumption, accounting for one-third of total energy usage, and carbon emissions. Optimizing building performance presents a vital opportunity to combat climate change and promote human flourishing. However, research in building analytics has been hampered by the lack of accessible, available, and comprehensive real-world datasets on multiple building operations. In this paper, we introduce the Building TimeSeries (BTS) dataset. Our dataset covers three buildings over a three-year period, comprising more than ten thousand timeseries data points with hundreds of unique ontologies. Moreover, the metadata is standardized using the Brick schema. To demonstrate the utility of this dataset, we performed benchmarks on two tasks: timeseries ontology classification and zero-shot forecasting. These tasks represent an essential initial step in addressing challenges related to interoperability in building analytics. Access to the dataset and the code used for benchmarking are available here: https://github.com/cruiseresearchgroup/DIEF_BTS .
△ Less
Submitted 18 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning
Authors:
Wei Shao,
Yufan Kang,
Ziyan Peng,
Xiao Xiao,
Lei Wang,
Yuhui Yang,
Flora D Salim
Abstract:
Accuracy and timeliness are indeed often conflicting goals in prediction tasks. Premature predictions may yield a higher rate of false alarms, whereas delaying predictions to gather more information can render them too late to be useful. In applications such as wildfires, crimes, and traffic jams, timely forecasting are vital for safeguarding human life and property. Consequently, finding a balanc…
▽ More
Accuracy and timeliness are indeed often conflicting goals in prediction tasks. Premature predictions may yield a higher rate of false alarms, whereas delaying predictions to gather more information can render them too late to be useful. In applications such as wildfires, crimes, and traffic jams, timely forecasting are vital for safeguarding human life and property. Consequently, finding a balance between accuracy and timeliness is crucial. In this paper, we propose an early spatio-temporal forecasting model based on Multi-Objective reinforcement learning that can either implement an optimal policy given a preference or infer the preference based on a small number of samples. The model addresses two primary challenges: 1) enhancing the accuracy of early forecasting and 2) providing the optimal policy for determining the most suitable prediction time for each area. Our method demonstrates superior performance on three large-scale real-world datasets, surpassing existing methods in early spatio-temporal forecasting tasks.
△ Less
Submitted 18 June, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
ST-DPGAN: A Privacy-preserving Framework for Spatiotemporal Data Generation
Authors:
Wei Shao,
Rongyi Zhu,
Cai Yang,
Chandra Thapa,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Rui Zhang,
DuYong Kim,
Hamid Menouar,
Flora D. Salim
Abstract:
Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this cha…
▽ More
Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this challenge, we propose a Graph-GAN-based model for generating privacy-protected spatiotemporal data. Our approach incorporates spatial and temporal attention blocks in the discriminator and a spatiotemporal deconvolution structure in the generator. These enhancements enable efficient training under Gaussian noise to achieve differential privacy. Extensive experiments conducted on three real-world spatiotemporal datasets validate the efficacy of our model. Our method provides a privacy guarantee while maintaining the data utility. The prediction model trained on our generated data maintains a competitive performance compared to the model trained on the original data.
△ Less
Submitted 4 June, 2024;
originally announced June 2024.
-
CAPRI-FAIR: Integration of Multi-sided Fairness in Contextual POI Recommendation Framework
Authors:
Francis Zac dela Cruz,
Flora D. Salim,
Yonchanok Khaokaew,
Jeffrey Chan
Abstract:
Point-of-interest (POI) recommendation, a form of context-aware recommendation, takes into account spatio-temporal constraints and contexts like distance, peak business hours, and previous user check-ins. Given the ability of these kinds of systems to influence not just the consumer's travel experience, but also the POI's business, it is important to consider fairness from multiple perspectives. U…
▽ More
Point-of-interest (POI) recommendation, a form of context-aware recommendation, takes into account spatio-temporal constraints and contexts like distance, peak business hours, and previous user check-ins. Given the ability of these kinds of systems to influence not just the consumer's travel experience, but also the POI's business, it is important to consider fairness from multiple perspectives. Unfortunately, these systems tend to provide less accurate recommendations to inactive users, and less exposure to unpopular POIs. The goal of this paper is to develop a post-filter methodology that incorporates provider and consumer fairness factors into pre-existing recommendation models, to satisfy fairness metrics like item exposure, and performance metrics like precision and distance, making the system more sustainable to both consumers and providers. Experiments have shown that using a linear scoring model for provider fairness in re-scoring recommended items yields the best tradeoff between performance and long-tail exposure, in some cases without a significant decrease in precision. When attempting to address consumer fairness by recommending more popular POIs to inactive users, the result was an increase in precision for only some recommendation models and datasets. Finally, when considering the tradeoff between both parameters, the combinations that reached the Pareto front of consumer and provider fairness, unfortunately, achieved the lowest precision values. We find that the nature of this tradeoff depends heavily on the model and the dataset.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Promoting Two-sided Fairness in Dynamic Vehicle Routing Problem
Authors:
Yufan Kang,
Rongsheng Zhang,
Wei Shao,
Flora D. Salim,
Jeffrey Chan
Abstract:
Dynamic Vehicle Routing Problem (DVRP), is an extension of the classic Vehicle Routing Problem (VRP), which is a fundamental problem in logistics and transportation. Typically, DVRPs involve two stakeholders: service providers that deliver services to customers and customers who raise requests from different locations. Many real-world applications can be formulated as DVRP such as ridesharing and…
▽ More
Dynamic Vehicle Routing Problem (DVRP), is an extension of the classic Vehicle Routing Problem (VRP), which is a fundamental problem in logistics and transportation. Typically, DVRPs involve two stakeholders: service providers that deliver services to customers and customers who raise requests from different locations. Many real-world applications can be formulated as DVRP such as ridesharing and non-compliance capture. Apart from original objectives like optimising total utility or efficiency, DVRP should also consider fairness for all parties. Unfairness can induce service providers and customers to give up on the systems, leading to negative financial and social impacts. However, most existing DVRP-related applications focus on improving fairness from a single side, and there have been few works considering two-sided fairness and utility optimisation concurrently. To this end, we propose a novel framework, a Two-sided Fairness-aware Genetic Algorithm (named 2FairGA), which expands the genetic algorithm from the original objective solely focusing on utility to multi-objectives that incorporate two-sided fairness. Subsequently, the impact of injecting two fairness definitions into the utility-focused model and the correlation between any pair of the three objectives are explored. Extensive experiments demonstrate the superiority of our proposed framework compared to the state-of-the-art.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
A Gap in Time: The Challenge of Processing Heterogeneous IoT Point Data in Buildings
Authors:
Xiachong Lin,
Arian Prabowo,
Imran Razzak,
Hao Xue,
Matthew Amos,
Sam Behrens,
Stephen White,
Flora D. Salim
Abstract:
The growing need for sustainable energy solutions has driven the integration of digitalized buildings into the power grid, utilizing Internet-of-Things technology to optimize building performance and energy efficiency. However, incorporating IoT point data within deep-learning frameworks for energy management presents a complex challenge, predominantly due to the inherent data heterogeneity. This…
▽ More
The growing need for sustainable energy solutions has driven the integration of digitalized buildings into the power grid, utilizing Internet-of-Things technology to optimize building performance and energy efficiency. However, incorporating IoT point data within deep-learning frameworks for energy management presents a complex challenge, predominantly due to the inherent data heterogeneity. This paper comprehensively analyzes the multifaceted heterogeneity present in real-world building IoT data streams. We meticulously dissect the heterogeneity across multiple dimensions, encompassing ontology, etiology, temporal irregularity, spatial diversity, and their combined effects on the IoT point data distribution. In addition, experiments using state-of-the-art forecasting models are conducted to evaluate their impacts on the performance of deep-learning models for forecasting tasks. By charting the diversity along these dimensions, we illustrate the challenges and delineate pathways for future research to leverage this heterogeneity as a resource rather than a roadblock. This exploration sets the stage for advancing the predictive abilities of deep-learning algorithms and catalyzing the evolution of intelligent energy-efficient buildings.
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
Towards Detecting and Mitigating Cognitive Bias in Spoken Conversational Search
Authors:
Kaixin Ji,
Sachin Pathiyan Cherumanal,
Johanne R. Trippas,
Danula Hettiachchi,
Flora D. Salim,
Falk Scholer,
Damiano Spina
Abstract:
Instruments such as eye-tracking devices have contributed to understanding how users interact with screen-based search engines. However, user-system interactions in audio-only channels -- as is the case for Spoken Conversational Search (SCS) -- are harder to characterize, given the lack of instruments to effectively and precisely capture interactions. Furthermore, in this era of information overlo…
▽ More
Instruments such as eye-tracking devices have contributed to understanding how users interact with screen-based search engines. However, user-system interactions in audio-only channels -- as is the case for Spoken Conversational Search (SCS) -- are harder to characterize, given the lack of instruments to effectively and precisely capture interactions. Furthermore, in this era of information overload, cognitive bias can significantly impact how we seek and consume information -- especially in the context of controversial topics or multiple viewpoints. This paper draws upon insights from multiple disciplines (including information seeking, psychology, cognitive science, and wearable sensors) to provoke novel conversations in the community. To this end, we discuss future opportunities and propose a framework including multimodal instruments and methods for experimental designs and settings. We demonstrate preliminary results as an example. We also outline the challenges and offer suggestions for adopting this multimodal approach, including ethical considerations, to assist future researchers and practitioners in exploring cognitive biases in SCS.
△ Less
Submitted 20 May, 2024;
originally announced May 2024.
-
Characterizing Information Seeking Processes with Multiple Physiological Signals
Authors:
Kaixin Ji,
Danula Hettiachchi,
Flora D. Salim,
Falk Scholer,
Damiano Spina
Abstract:
Information access systems are getting complex, and our understanding of user behavior during information seeking processes is mainly drawn from qualitative methods, such as observational studies or surveys. Leveraging the advances in sensing technologies, our study aims to characterize user behaviors with physiological signals, particularly in relation to cognitive load, affective arousal, and va…
▽ More
Information access systems are getting complex, and our understanding of user behavior during information seeking processes is mainly drawn from qualitative methods, such as observational studies or surveys. Leveraging the advances in sensing technologies, our study aims to characterize user behaviors with physiological signals, particularly in relation to cognitive load, affective arousal, and valence. We conduct a controlled lab study with 26 participants, and collect data including Electrodermal Activities, Photoplethysmogram, Electroencephalogram, and Pupillary Responses. This study examines informational search with four stages: the realization of Information Need (IN), Query Formulation (QF), Query Submission (QS), and Relevance Judgment (RJ). We also include different interaction modalities to represent modern systems, e.g., QS by text-typing or verbalizing, and RJ with text or audio information. We analyze the physiological signals across these stages and report outcomes of pairwise non-parametric repeated-measure statistical tests. The results show that participants experience significantly higher cognitive loads at IN with a subtle increase in alertness, while QF requires higher attention. QS involves demanding cognitive loads than QF. Affective responses are more pronounced at RJ than QS or IN, suggesting greater interest and engagement as knowledge gaps are resolved. To the best of our knowledge, this is the first study that explores user behaviors in a search process employing a more nuanced quantitative analysis of physiological signals. Our findings offer valuable insights into user behavior and emotional responses in information seeking processes. We believe our proposed methodology can inform the characterization of more complex processes, such as conversational information seeking.
△ Less
Submitted 7 May, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
-
Large Language Models for Next Point-of-Interest Recommendation
Authors:
Peibo Li,
Maarten de Rijke,
Hao Xue,
Shuang Ao,
Yang Song,
Flora D. Salim
Abstract:
The next Point of Interest (POI) recommendation task is to predict users' immediate next POI visit given their historical data. Location-Based Social Network (LBSN) data, which is often used for the next POI recommendation task, comes with challenges. One frequently disregarded challenge is how to effectively use the abundant contextual information present in LBSN data. Previous methods are limite…
▽ More
The next Point of Interest (POI) recommendation task is to predict users' immediate next POI visit given their historical data. Location-Based Social Network (LBSN) data, which is often used for the next POI recommendation task, comes with challenges. One frequently disregarded challenge is how to effectively use the abundant contextual information present in LBSN data. Previous methods are limited by their numerical nature and fail to address this challenge. In this paper, we propose a framework that uses pretrained Large Language Models (LLMs) to tackle this challenge. Our framework allows us to preserve heterogeneous LBSN data in its original format, hence avoiding the loss of contextual information. Furthermore, our framework is capable of comprehending the inherent meaning of contextual information due to the inclusion of commonsense knowledge. In experiments, we test our framework on three real-world LBSN datasets. Our results show that the proposed framework outperforms the state-of-the-art models in all three datasets. Our analysis demonstrates the effectiveness of the proposed framework in using contextual information as well as alleviating the commonly encountered cold-start and short trajectory problems.
△ Less
Submitted 19 April, 2024;
originally announced April 2024.
-
Illuminating the Unseen: Investigating the Context-induced Harms in Behavioral Sensing
Authors:
Han Zhang,
Vedant Das Swain,
Leijie Wang,
Nan Gao,
Yilun Sheng,
Xuhai Xu,
Flora D. Salim,
Koustuv Saha,
Anind K. Dey,
Jennifer Mankoff
Abstract:
Behavioral sensing technologies are rapidly evolving across a range of well-being applications. Despite its potential, concerns about the responsible use of such technology are escalating. In response, recent research within the sensing technology has started to address these issues. While promising, they primarily focus on broad demographic categories and overlook more nuanced, context-specific i…
▽ More
Behavioral sensing technologies are rapidly evolving across a range of well-being applications. Despite its potential, concerns about the responsible use of such technology are escalating. In response, recent research within the sensing technology has started to address these issues. While promising, they primarily focus on broad demographic categories and overlook more nuanced, context-specific identities. These approaches lack grounding within domain-specific harms that arise from deploying sensing technology in diverse social, environmental, and technological settings. Additionally, existing frameworks for evaluating harms are designed for a generic ML life cycle, and fail to adapt to the dynamic and longitudinal considerations for behavioral sensing technology. To address these gaps, we introduce a framework specifically designed for evaluating behavioral sensing technologies. This framework emphasizes a comprehensive understanding of context, particularly the situated identities of users and the deployment settings of the sensing technology. It also highlights the necessity for iterative harm mitigation and continuous maintenance to adapt to the evolving nature of technology and its use. We demonstrate the feasibility and generalizability of our framework through post-hoc evaluations on two real-world behavioral sensing studies conducted in different international contexts, involving varied population demographics and machine learning tasks. Our evaluations provide empirical evidence of both situated identity-based harm and more domain-specific harms, and discuss the trade-offs introduced by implementing bias mitigation techniques.
△ Less
Submitted 5 May, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
GustosonicSense: Towards understanding the design of playful gustosonic eating experiences
Authors:
Yan Wang,
Humphrey O. Obie,
Zhuying Li,
Flora D. Salim,
John Grundy,
Florian 'Floyd' Mueller
Abstract:
The pleasure that often comes with eating can be further enhanced with intelligent technology, as the field of human-food interaction suggests. However, knowledge on how to design such pleasure-supporting eating systems is limited. To begin filling this knowledge gap, we designed "GustosonicSense", a novel gustosonic eating system that utilizes wireless earbuds for sensing different eating and dri…
▽ More
The pleasure that often comes with eating can be further enhanced with intelligent technology, as the field of human-food interaction suggests. However, knowledge on how to design such pleasure-supporting eating systems is limited. To begin filling this knowledge gap, we designed "GustosonicSense", a novel gustosonic eating system that utilizes wireless earbuds for sensing different eating and drinking actions with a machine learning algorithm and trigger playful sounds as a way to facilitate pleasurable eating experiences. We present the findings from our design and a study that revealed how we can support the "stimulation", "hedonism", and "reflexivity" for playful human-food interactions. Ultimately, with our work, we aim to support interaction designers in facilitating playful experiences with food.
△ Less
Submitted 16 March, 2024;
originally announced March 2024.
-
Prompt Mining for Language-based Human Mobility Forecasting
Authors:
Hao Xue,
Tianye Tang,
Ali Payani,
Flora D. Salim
Abstract:
With the advancement of large language models, language-based forecasting has recently emerged as an innovative approach for predicting human mobility patterns. The core idea is to use prompts to transform the raw mobility data given as numerical values into natural language sentences so that the language models can be leveraged to generate the description for future observations. However, previou…
▽ More
With the advancement of large language models, language-based forecasting has recently emerged as an innovative approach for predicting human mobility patterns. The core idea is to use prompts to transform the raw mobility data given as numerical values into natural language sentences so that the language models can be leveraged to generate the description for future observations. However, previous studies have only employed fixed and manually designed templates to transform numerical values into sentences. Since the forecasting performance of language models heavily relies on prompts, using fixed templates for prompting may limit the forecasting capability of language models. In this paper, we propose a novel framework for prompt mining in language-based mobility forecasting, aiming to explore diverse prompt design strategies. Specifically, the framework includes a prompt generation stage based on the information entropy of prompts and a prompt refinement stage to integrate mechanisms such as the chain of thought. Experimental results on real-world large-scale data demonstrate the superiority of generated prompts from our prompt mining pipeline. Additionally, the comparison of different prompt variants shows that the proposed prompt refinement process is effective. Our study presents a promising direction for further advancing language-based mobility forecasting.
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
SSTKG: Simple Spatio-Temporal Knowledge Graph for Intepretable and Versatile Dynamic Information Embedding
Authors:
Ruiyi Yang,
Flora D. Salim,
Hao Xue
Abstract:
Knowledge graphs (KGs) have been increasingly employed for link prediction and recommendation using real-world datasets. However, the majority of current methods rely on static data, neglecting the dynamic nature and the hidden spatio-temporal attributes of real-world scenarios. This often results in suboptimal predictions and recommendations. Although there are effective spatio-temporal inference…
▽ More
Knowledge graphs (KGs) have been increasingly employed for link prediction and recommendation using real-world datasets. However, the majority of current methods rely on static data, neglecting the dynamic nature and the hidden spatio-temporal attributes of real-world scenarios. This often results in suboptimal predictions and recommendations. Although there are effective spatio-temporal inference methods, they face challenges such as scalability with large datasets and inadequate semantic understanding, which impede their performance. To address these limitations, this paper introduces a novel framework - Simple Spatio-Temporal Knowledge Graph (SSTKG), for constructing and exploring spatio-temporal KGs. To integrate spatial and temporal data into KGs, our framework exploited through a new 3-step embedding method. Output embeddings can be used for future temporal sequence prediction and spatial information recommendation, providing valuable insights for various applications such as retail sales forecasting and traffic volume prediction. Our framework offers a simple but comprehensive way to understand the underlying patterns and trends in dynamic KG, thereby enhancing the accuracy of predictions and the relevance of recommendations. This work paves the way for more effective utilization of spatio-temporal data in KGs, with potential impacts across a wide range of sectors.
△ Less
Submitted 19 February, 2024;
originally announced February 2024.
-
Measuring Misogyny in Natural Language Generation: Preliminary Results from a Case Study on two Reddit Communities
Authors:
Aaron J. Snoswell,
Lucinda Nelson,
Hao Xue,
Flora D. Salim,
Nicolas Suzor,
Jean Burgess
Abstract:
Generic `toxicity' classifiers continue to be used for evaluating the potential for harm in natural language generation, despite mounting evidence of their shortcomings. We consider the challenge of measuring misogyny in natural language generation, and argue that generic `toxicity' classifiers are inadequate for this task. We use data from two well-characterised `Incel' communities on Reddit that…
▽ More
Generic `toxicity' classifiers continue to be used for evaluating the potential for harm in natural language generation, despite mounting evidence of their shortcomings. We consider the challenge of measuring misogyny in natural language generation, and argue that generic `toxicity' classifiers are inadequate for this task. We use data from two well-characterised `Incel' communities on Reddit that differ primarily in their degrees of misogyny to construct a pair of training corpora which we use to fine-tune two language models. We show that an open source `toxicity' classifier is unable to distinguish meaningfully between generations from these models. We contrast this with a misogyny-specific lexicon recently proposed by feminist subject-matter experts, demonstrating that, despite the limitations of simple lexicon-based approaches, this shows promise as a benchmark to evaluate language models for misogyny, and that it is sensitive enough to reveal the known differences in these Reddit communities. Our preliminary findings highlight the limitations of a generic approach to evaluating harms, and further emphasise the need for careful benchmark design and selection in natural language evaluation.
△ Less
Submitted 6 December, 2023;
originally announced December 2023.
-
Critiquing Self-report Practices for Human Mental and Wellbeing Computing at Ubicomp
Authors:
Nan Gao,
Soundariya Ananthan,
Chun Yu,
Yuntao Wang,
Flora D. Salim
Abstract:
Computing human mental and wellbeing is crucial to various domains, including health, education, and entertainment. However, the reliance on self-reporting in traditional research to establish ground truth often leads to methodological inconsistencies and susceptibility to response biases, thus hindering the effectiveness of modelling. This paper presents the first systematic methodological review…
▽ More
Computing human mental and wellbeing is crucial to various domains, including health, education, and entertainment. However, the reliance on self-reporting in traditional research to establish ground truth often leads to methodological inconsistencies and susceptibility to response biases, thus hindering the effectiveness of modelling. This paper presents the first systematic methodological review of self-reporting practices in Ubicomp within the context of human mental and wellbeing computing. Drawing from existing survey research, we establish guidelines for self-reporting in human wellbeing studies and identify shortcomings in current practices at Ubicomp community. Furthermore, we explore the reliability of self-report as a means of ground truth and propose directions for improving ground truth measurement in this field. Ultimately, we emphasize the urgent need for methodological advancements to enhance human mental and wellbeing computing.
△ Less
Submitted 26 November, 2023;
originally announced November 2023.
-
Automated Mobile Sensing Strategies Generation for Human Behaviour Understanding
Authors:
Nan Gao,
Zhuolei Yu,
Chun Yu,
Yuntao Wang,
Flora D. Salim,
Yuanchun Shi
Abstract:
Mobile sensing plays a crucial role in generating digital traces to understand human daily lives. However, studying behaviours like mood or sleep quality in smartphone users requires carefully designed mobile sensing strategies such as sensor selection and feature construction. This process is time-consuming, burdensome, and requires expertise in multiple domains. Furthermore, the resulting sensin…
▽ More
Mobile sensing plays a crucial role in generating digital traces to understand human daily lives. However, studying behaviours like mood or sleep quality in smartphone users requires carefully designed mobile sensing strategies such as sensor selection and feature construction. This process is time-consuming, burdensome, and requires expertise in multiple domains. Furthermore, the resulting sensing framework lacks generalizability, making it difficult to apply to different scenarios. To address these challenges, we propose an automated mobile sensing strategy for human behaviour understanding. First, we establish a knowledge base and consolidate rules for effective feature construction, data collection, and model selection. Then, we introduce the multi-granular human behaviour representation and design procedures for leveraging large language models to generate strategies. Our approach is validated through blind comparative studies and usability evaluation. Ultimately, our approach holds the potential to revolutionise the field of mobile sensing and its applications.
△ Less
Submitted 9 November, 2023;
originally announced November 2023.
-
Utilizing Language Models for Energy Load Forecasting
Authors:
Hao Xue,
Flora D. Salim
Abstract:
Energy load forecasting plays a crucial role in optimizing resource allocation and managing energy consumption in buildings and cities. In this paper, we propose a novel approach that leverages language models for energy load forecasting. We employ prompting techniques to convert energy consumption data into descriptive sentences, enabling fine-tuning of language models. By adopting an autoregress…
▽ More
Energy load forecasting plays a crucial role in optimizing resource allocation and managing energy consumption in buildings and cities. In this paper, we propose a novel approach that leverages language models for energy load forecasting. We employ prompting techniques to convert energy consumption data into descriptive sentences, enabling fine-tuning of language models. By adopting an autoregressive generating approach, our proposed method enables predictions of various horizons of future energy load consumption. Through extensive experiments on real-world datasets, we demonstrate the effectiveness and accuracy of our proposed method. Our results indicate that utilizing language models for energy load forecasting holds promise for enhancing energy efficiency and facilitating intelligent decision-making in energy systems.
△ Less
Submitted 26 October, 2023;
originally announced October 2023.
-
ZzzGPT: An Interactive GPT Approach to Enhance Sleep Quality
Authors:
Yonchanok Khaokaew,
Kaixin Ji,
Thuc Hanh Nguyen,
Hiruni Kegalle,
Marwah Alaofi,
Hao Xue,
Flora D. Salim
Abstract:
This paper explores the intersection of technology and sleep pattern comprehension, presenting a cutting-edge two-stage framework that harnesses the power of Large Language Models (LLMs). The primary objective is to deliver precise sleep predictions paired with actionable feedback, addressing the limitations of existing solutions. This innovative approach involves leveraging the GLOBEM dataset alo…
▽ More
This paper explores the intersection of technology and sleep pattern comprehension, presenting a cutting-edge two-stage framework that harnesses the power of Large Language Models (LLMs). The primary objective is to deliver precise sleep predictions paired with actionable feedback, addressing the limitations of existing solutions. This innovative approach involves leveraging the GLOBEM dataset alongside synthetic data generated by LLMs. The results highlight significant improvements, underlining the efficacy of merging advanced machine-learning techniques with a user-centric design ethos. Through this exploration, we bridge the gap between technological sophistication and user-friendly design, ensuring that our framework yields accurate predictions and translates them into actionable insights.
△ Less
Submitted 6 May, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
"Living Within Four Walls": Exploring Emotional and Social Dynamics in Mobile Usage During Home Confinement
Authors:
Nan Gao,
Sam Nolan,
Kaixin Ji,
Shakila Khan Rumi,
Judith Simone Heinisch,
Christoph Anderson,
Klaus David,
Flora D. Salim
Abstract:
Home confinement, a situation experienced by individuals for reasons ranging from medical quarantines, rehabilitation needs, disability accommodations, and remote working, is a common yet impactful aspect of modern life. While essential in various scenarios, confinement within the home environment can profoundly influence psychological well-being and digital device usage. In this study, we delve i…
▽ More
Home confinement, a situation experienced by individuals for reasons ranging from medical quarantines, rehabilitation needs, disability accommodations, and remote working, is a common yet impactful aspect of modern life. While essential in various scenarios, confinement within the home environment can profoundly influence psychological well-being and digital device usage. In this study, we delve into these effects, utilising the COVID-19 lockdown as a special case study to draw insights extending to various homebound situations. We conducted an in-situ study with 32 participants living in states affected by COVID-19 lockdowns for three weeks and analysed their emotions, well-being, social roles, and mobile usage behaviours. We extracted user activity from app usage records in an unsupervised manner, and experimental results revealed that app usage behaviours are effective indicators of emotional well-being in confined environments. Our research has great potential for developing supportive strategies and remote programs, not only for people facing similar medical isolation situations, but also for individuals in long-term home confinement, such as those with chronic illnesses, recovering from surgery, or adapting to permanent remote work arrangements.
△ Less
Submitted 8 June, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
-
Human Mobility Question Answering (Vision Paper)
Authors:
Hao Xue,
Flora D. Salim
Abstract:
Question answering (QA) systems have attracted much attention from the artificial intelligence community as they can learn to answer questions based on the given knowledge source (e.g., images in visual question answering). However, the research into question answering systems with human mobility data remains unexplored. Mining human mobility data is crucial for various applications such as smart…
▽ More
Question answering (QA) systems have attracted much attention from the artificial intelligence community as they can learn to answer questions based on the given knowledge source (e.g., images in visual question answering). However, the research into question answering systems with human mobility data remains unexplored. Mining human mobility data is crucial for various applications such as smart city planning, pandemic management, and personalised recommendation system. In this paper, we aim to tackle this gap and introduce a novel task, that is, human mobility question answering (MobQA). The aim of the task is to let the intelligent system learn from mobility data and answer related questions. This task presents a new paradigm change in mobility prediction research and further facilitates the research of human mobility recommendation systems. To better support this novel research topic, this vision paper also proposes an initial design of the dataset and a potential deep learning model framework for the introduced MobQA task. We hope that this paper will provide novel insights and open new directions in human mobility research and question answering research.
△ Less
Submitted 13 October, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
MAPLE: Mobile App Prediction Leveraging Large Language Model Embeddings
Authors:
Yonchanok Khaokaew,
Hao Xue,
Flora D. Salim
Abstract:
In recent years, predicting mobile app usage has become increasingly important for areas like app recommendation, user behaviour analysis, and mobile resource management. Existing models, however, struggle with the heterogeneous nature of contextual data and the user cold start problem. This study introduces a novel prediction model, Mobile App Prediction Leveraging Large Language Model Embeddings…
▽ More
In recent years, predicting mobile app usage has become increasingly important for areas like app recommendation, user behaviour analysis, and mobile resource management. Existing models, however, struggle with the heterogeneous nature of contextual data and the user cold start problem. This study introduces a novel prediction model, Mobile App Prediction Leveraging Large Language Model Embeddings (MAPLE), which employs Large Language Models (LLMs) and installed app similarity to overcome these challenges. MAPLE utilises the power of LLMs to process contextual data and discern intricate relationships within it effectively. Additionally, we explore the use of installed app similarity to address the cold start problem, facilitating the modelling of user preferences and habits, even for new users with limited historical data. In essence, our research presents MAPLE as a novel, potent, and practical approach to app usage prediction, making significant strides in resolving issues faced by existing models. MAPLE stands out as a comprehensive and effective solution, setting a new benchmark for more precise and personalised app usage predictions. In tests on two real-world datasets, MAPLE surpasses contemporary models in both standard and cold start scenarios. These outcomes validate MAPLE's capacity for precise app usage predictions and its resilience against the cold start problem. This enhanced performance stems from the model's proficiency in capturing complex temporal patterns and leveraging contextual information. As a result, MAPLE can potentially improve personalised mobile app usage predictions and user experiences markedly.
△ Less
Submitted 30 January, 2024; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Navigating Out-of-Distribution Electricity Load Forecasting during COVID-19: Benchmarking energy load forecasting models without and with continual learning
Authors:
Arian Prabowo,
Kaixuan Chen,
Hao Xue,
Subbu Sethuvenkatraman,
Flora D. Salim
Abstract:
In traditional deep learning algorithms, one of the key assumptions is that the data distribution remains constant during both training and deployment. However, this assumption becomes problematic when faced with Out-of-Distribution periods, such as the COVID-19 lockdowns, where the data distribution significantly deviates from what the model has seen during training. This paper employs a two-fold…
▽ More
In traditional deep learning algorithms, one of the key assumptions is that the data distribution remains constant during both training and deployment. However, this assumption becomes problematic when faced with Out-of-Distribution periods, such as the COVID-19 lockdowns, where the data distribution significantly deviates from what the model has seen during training. This paper employs a two-fold strategy: utilizing continual learning techniques to update models with new data and harnessing human mobility data collected from privacy-preserving pedestrian counters located outside buildings. In contrast to online learning, which suffers from 'catastrophic forgetting' as newly acquired knowledge often erases prior information, continual learning offers a holistic approach by preserving past insights while integrating new data. This research applies FSNet, a powerful continual learning algorithm, to real-world data from 13 building complexes in Melbourne, Australia, a city which had the second longest total lockdown duration globally during the pandemic. Results underscore the crucial role of continual learning in accurate energy forecasting, particularly during Out-of-Distribution periods. Secondary data such as mobility and temperature provided ancillary support to the primary forecasting model. More importantly, while traditional methods struggled to adapt during lockdowns, models featuring at least online learning demonstrated resilience, with lockdown periods posing fewer challenges once armed with adaptive learning techniques. This study contributes valuable methodologies and insights to the ongoing effort to improve energy load forecasting during future Out-of-Distribution periods.
△ Less
Submitted 3 October, 2023; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse
Authors:
Edward A. Small,
Jeffrey N. Clark,
Christopher J. McWilliams,
Kacper Sokol,
Jeffrey Chan,
Flora D. Salim,
Raul Santos-Rodriguez
Abstract:
Counterfactuals operationalised through algorithmic recourse have become a powerful tool to make artificial intelligence systems explainable. Conceptually, given an individual classified as y -- the factual -- we seek actions such that their prediction becomes the desired class y' -- the counterfactual. This process offers algorithmic recourse that is (1) easy to customise and interpret, and (2) d…
▽ More
Counterfactuals operationalised through algorithmic recourse have become a powerful tool to make artificial intelligence systems explainable. Conceptually, given an individual classified as y -- the factual -- we seek actions such that their prediction becomes the desired class y' -- the counterfactual. This process offers algorithmic recourse that is (1) easy to customise and interpret, and (2) directly aligned with the goals of each individual. However, the properties of a "good" counterfactual are still largely debated; it remains an open challenge to effectively locate a counterfactual along with its corresponding recourse. Some strategies use gradient-driven methods, but these offer no guarantees on the feasibility of the recourse and are open to adversarial attacks on carefully created manifolds. This can lead to unfairness and lack of robustness. Other methods are data-driven, which mostly addresses the feasibility problem at the expense of privacy, security and secrecy as they require access to the entire training data set. Here, we introduce LocalFACE, a model-agnostic technique that composes feasible and actionable counterfactual explanations using locally-acquired information at each step of the algorithmic recourse. Our explainer preserves the privacy of users by only leveraging data that it specifically requires to construct actionable algorithmic recourse, and protects the model by offering transparency solely in the regions deemed necessary for the intervention.
△ Less
Submitted 8 September, 2023;
originally announced September 2023.
-
How Crowd Worker Factors Influence Subjective Annotations: A Study of Tagging Misogynistic Hate Speech in Tweets
Authors:
Danula Hettiachchi,
Indigo Holcombe-James,
Stephanie Livingstone,
Anjalee de Silva,
Matthew Lease,
Flora D. Salim,
Mark Sanderson
Abstract:
Crowdsourced annotation is vital to both collecting labelled data to train and test automated content moderation systems and to support human-in-the-loop review of system decisions. However, annotation tasks such as judging hate speech are subjective and thus highly sensitive to biases stemming from annotator beliefs, characteristics and demographics. We conduct two crowdsourcing studies on Mechan…
▽ More
Crowdsourced annotation is vital to both collecting labelled data to train and test automated content moderation systems and to support human-in-the-loop review of system decisions. However, annotation tasks such as judging hate speech are subjective and thus highly sensitive to biases stemming from annotator beliefs, characteristics and demographics. We conduct two crowdsourcing studies on Mechanical Turk to examine annotator bias in labelling sexist and misogynistic hate speech. Results from 109 annotators show that annotator political inclination, moral integrity, personality traits, and sexist attitudes significantly impact annotation accuracy and the tendency to tag content as hate speech. In addition, semi-structured interviews with nine crowd workers provide further insights regarding the influence of subjectivity on annotations. In exploring how workers interpret a task - shaped by complex negotiations between platform structures, task instructions, subjective motivations, and external contextual factors - we see annotations not only impacted by worker factors but also simultaneously shaped by the structures under which they labour.
△ Less
Submitted 3 September, 2023;
originally announced September 2023.
-
i-Align: an interpretable knowledge graph alignment model
Authors:
Bayu Distiawan Trisedya,
Flora D Salim,
Jeffrey Chan,
Damiano Spina,
Falk Scholer,
Mark Sanderson
Abstract:
Knowledge graphs (KGs) are becoming essential resources for many downstream applications. However, their incompleteness may limit their potential. Thus, continuous curation is needed to mitigate this problem. One of the strategies to address this problem is KG alignment, i.e., forming a more complete KG by merging two or more KGs. This paper proposes i-Align, an interpretable KG alignment model. U…
▽ More
Knowledge graphs (KGs) are becoming essential resources for many downstream applications. However, their incompleteness may limit their potential. Thus, continuous curation is needed to mitigate this problem. One of the strategies to address this problem is KG alignment, i.e., forming a more complete KG by merging two or more KGs. This paper proposes i-Align, an interpretable KG alignment model. Unlike the existing KG alignment models, i-Align provides an explanation for each alignment prediction while maintaining high alignment performance. Experts can use the explanation to check the correctness of the alignment prediction. Thus, the high quality of a KG can be maintained during the curation process (e.g., the merging process of two KGs). To this end, a novel Transformer-based Graph Encoder (Trans-GE) is proposed as a key component of i-Align for aggregating information from entities' neighbors (structures). Trans-GE uses Edge-gated Attention that combines the adjacency matrix and the self-attention matrix to learn a gating mechanism to control the information aggregation from the neighboring entities. It also uses historical embeddings, allowing Trans-GE to be trained over mini-batches, or smaller sub-graphs, to address the scalability issue when encoding a large KG. Another component of i-Align is a Transformer encoder for aggregating entities' attributes. This way, i-Align can generate explanations in the form of a set of the most influential attributes/neighbors based on attention weights. Extensive experiments are conducted to show the power of i-Align. The experiments include several aspects, such as the model's effectiveness for aligning KGs, the quality of the generated explanations, and its practicality for aligning large KGs. The results show the effectiveness of i-Align in these aspects.
△ Less
Submitted 25 August, 2023;
originally announced August 2023.
-
Hypergraph Convolutional Networks for Fine-grained ICU Patient Similarity Analysis and Risk Prediction
Authors:
Yuxi Liu,
Zhenhao Zhang,
Shaowen Qin,
Flora D. Salim,
Antonio Jimeno Yepes,
Jun Shen,
Jiang Bian
Abstract:
The Intensive Care Unit (ICU) is one of the most important parts of a hospital, which admits critically ill patients and provides continuous monitoring and treatment. Various patient outcome prediction methods have been attempted to assist healthcare professionals in clinical decision-making. Existing methods focus on measuring the similarity between patients using deep neural networks to capture…
▽ More
The Intensive Care Unit (ICU) is one of the most important parts of a hospital, which admits critically ill patients and provides continuous monitoring and treatment. Various patient outcome prediction methods have been attempted to assist healthcare professionals in clinical decision-making. Existing methods focus on measuring the similarity between patients using deep neural networks to capture the hidden feature structures. However, the higher-order relationships are ignored, such as patient characteristics (e.g., diagnosis codes) and their causal effects on downstream clinical predictions. In this paper, we propose a novel Hypergraph Convolutional Network that allows the representation of non-pairwise relationships among diagnosis codes in a hypergraph to capture the hidden feature structures so that fine-grained patient similarity can be calculated for personalized mortality risk prediction. Evaluation using a publicly available eICU Collaborative Research Database indicates that our method achieves superior performance over the state-of-the-art models on mortality risk prediction. Moreover, the results of several case studies demonstrated the effectiveness and robustness of the model decisions.
△ Less
Submitted 21 October, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Designing and Evaluating Presentation Strategies for Fact-Checked Content
Authors:
Danula Hettiachchi,
Kaixin Ji,
Jenny Kennedy,
Anthony McCosker,
Flora D. Salim,
Mark Sanderson,
Falk Scholer,
Damiano Spina
Abstract:
With the rapid growth of online misinformation, it is crucial to have reliable fact-checking methods. Recent research on finding check-worthy claims and automated fact-checking have made significant advancements. However, limited guidance exists regarding the presentation of fact-checked content to effectively convey verified information to users. We address this research gap by exploring the crit…
▽ More
With the rapid growth of online misinformation, it is crucial to have reliable fact-checking methods. Recent research on finding check-worthy claims and automated fact-checking have made significant advancements. However, limited guidance exists regarding the presentation of fact-checked content to effectively convey verified information to users. We address this research gap by exploring the critical design elements in fact-checking reports and investigating whether credibility and presentation-based design improvements can enhance users' ability to interpret the report accurately. We co-developed potential content presentation strategies through a workshop involving fact-checking professionals, communication experts, and researchers. The workshop examined the significance and utility of elements such as veracity indicators and explored the feasibility of incorporating interactive components for enhanced information disclosure. Building on the workshop outcomes, we conducted an online experiment involving 76 crowd workers to assess the efficacy of different design strategies. The results indicate that proposed strategies significantly improve users' ability to accurately interpret the verdict of fact-checking articles. Our findings underscore the critical role of effective presentation of fact reports in addressing the spread of misinformation. By adopting appropriate design enhancements, the effectiveness of fact-checking reports can be maximized, enabling users to make informed judgments.
△ Less
Submitted 23 December, 2023; v1 submitted 20 August, 2023;
originally announced August 2023.
-
Contrastive Learning-based Imputation-Prediction Networks for In-hospital Mortality Risk Modeling using EHRs
Authors:
Yuxi Liu,
Zhenhao Zhang,
Shaowen Qin,
Flora D. Salim,
Antonio Jimeno Yepes
Abstract:
Predicting the risk of in-hospital mortality from electronic health records (EHRs) has received considerable attention. Such predictions will provide early warning of a patient's health condition to healthcare professionals so that timely interventions can be taken. This prediction task is challenging since EHR data are intrinsically irregular, with not only many missing values but also varying ti…
▽ More
Predicting the risk of in-hospital mortality from electronic health records (EHRs) has received considerable attention. Such predictions will provide early warning of a patient's health condition to healthcare professionals so that timely interventions can be taken. This prediction task is challenging since EHR data are intrinsically irregular, with not only many missing values but also varying time intervals between medical records. Existing approaches focus on exploiting the variable correlations in patient medical records to impute missing values and establishing time-decay mechanisms to deal with such irregularity. This paper presents a novel contrastive learning-based imputation-prediction network for predicting in-hospital mortality risks using EHR data. Our approach introduces graph analysis-based patient stratification modeling in the imputation process to group similar patients. This allows information of similar patients only to be used, in addition to personal contextual information, for missing value imputation. Moreover, our approach can integrate contrastive learning into the proposed network architecture to enhance patient representation learning and predictive performance on the classification task. Experiments on two real-world EHR datasets show that our approach outperforms the state-of-the-art approaches in both imputation and prediction tasks.
△ Less
Submitted 18 August, 2023;
originally announced August 2023.
-
A System of Monitoring and Analyzing Human Indoor Mobility and Air Quality
Authors:
Kyle K. Qin,
Mohammad S. Rahaman,
Yongli Ren,
Chi-Tsun Cheng,
Ivan Cole,
Flora D. Salim
Abstract:
Human movements in the workspace usually have non-negligible relations with air quality parameters (e.g., CO$_2$, PM2.5, and PM10). We establish a system to monitor indoor human mobility with air quality and assess the interrelationship between these two types of time series data. More specifically, a sensor network was designed in indoor environments to observe air quality parameters continuously…
▽ More
Human movements in the workspace usually have non-negligible relations with air quality parameters (e.g., CO$_2$, PM2.5, and PM10). We establish a system to monitor indoor human mobility with air quality and assess the interrelationship between these two types of time series data. More specifically, a sensor network was designed in indoor environments to observe air quality parameters continuously. Simultaneously, another sensing module detected participants' movements around the study areas. In this module, modern data analysis and machine learning techniques have been applied to reconstruct the trajectories of participants with relevant sensor information. Finally, a further study revealed the correlation between human indoor mobility patterns and indoor air quality parameters. Our experimental results demonstrate that human movements in different environments can significantly impact air quality during busy hours. With the results, we propose recommendations for future studies.
△ Less
Submitted 20 June, 2023;
originally announced June 2023.
-
Continually learning out-of-distribution spatiotemporal data for robust energy forecasting
Authors:
Arian Prabowo,
Kaixuan Chen,
Hao Xue,
Subbu Sethuvenkatraman,
Flora D. Salim
Abstract:
Forecasting building energy usage is essential for promoting sustainability and reducing waste, as it enables building managers to optimize energy consumption and reduce costs. This importance is magnified during anomalous periods, such as the COVID-19 pandemic, which have disrupted occupancy patterns and made accurate forecasting more challenging. Forecasting energy usage during anomalous periods…
▽ More
Forecasting building energy usage is essential for promoting sustainability and reducing waste, as it enables building managers to optimize energy consumption and reduce costs. This importance is magnified during anomalous periods, such as the COVID-19 pandemic, which have disrupted occupancy patterns and made accurate forecasting more challenging. Forecasting energy usage during anomalous periods is difficult due to changes in occupancy patterns and energy usage behavior. One of the primary reasons for this is the shift in distribution of occupancy patterns, with many people working or learning from home. This has created a need for new forecasting methods that can adapt to changing occupancy patterns. Online learning has emerged as a promising solution to this challenge, as it enables building managers to adapt to changes in occupancy patterns and adjust energy usage accordingly. With online learning, models can be updated incrementally with each new data point, allowing them to learn and adapt in real-time. Another solution is to use human mobility data as a proxy for occupancy, leveraging the prevalence of mobile devices to track movement patterns and infer occupancy levels. Human mobility data can be useful in this context as it provides a way to monitor occupancy patterns without relying on traditional sensors or manual data collection methods. We have conducted extensive experiments using data from six buildings to test the efficacy of these approaches. However, deploying these methods in the real world presents several challenges.
△ Less
Submitted 9 September, 2023; v1 submitted 10 June, 2023;
originally announced June 2023.
-
Message Passing Neural Networks for Traffic Forecasting
Authors:
Arian Prabowo,
Hao Xue,
Wei Shao,
Piotr Koniusz,
Flora D. Salim
Abstract:
A road network, in the context of traffic forecasting, is typically modeled as a graph where the nodes are sensors that measure traffic metrics (such as speed) at that location. Traffic forecasting is interesting because it is complex as the future speed of a road is dependent on a number of different factors. Therefore, to properly forecast traffic, we need a model that is capable of capturing al…
▽ More
A road network, in the context of traffic forecasting, is typically modeled as a graph where the nodes are sensors that measure traffic metrics (such as speed) at that location. Traffic forecasting is interesting because it is complex as the future speed of a road is dependent on a number of different factors. Therefore, to properly forecast traffic, we need a model that is capable of capturing all these different factors. A factor that is missing from the existing works is the node interactions factor. Existing works fail to capture the inter-node interactions because none are using the message-passing flavor of GNN, which is the one best suited to capture the node interactions This paper presents a plausible scenario in road traffic where node interactions are important and argued that the most appropriate GNN flavor to capture node interactions is message-passing. Results from real-world data show the superiority of the message-passing flavor for traffic forecasting. An additional experiment using synthetic data shows that the message-passing flavor can capture inter-node interaction better than other flavors.
△ Less
Submitted 9 May, 2023;
originally announced May 2023.
-
Traffic Forecasting on New Roads Using Spatial Contrastive Pre-Training (SCPT)
Authors:
Arian Prabowo,
Hao Xue,
Wei Shao,
Piotr Koniusz,
Flora D. Salim
Abstract:
New roads are being constructed all the time. However, the capabilities of previous deep forecasting models to generalize to new roads not seen in the training data (unseen roads) are rarely explored. In this paper, we introduce a novel setup called a spatio-temporal (ST) split to evaluate the models' capabilities to generalize to unseen roads. In this setup, the models are trained on data from a…
▽ More
New roads are being constructed all the time. However, the capabilities of previous deep forecasting models to generalize to new roads not seen in the training data (unseen roads) are rarely explored. In this paper, we introduce a novel setup called a spatio-temporal (ST) split to evaluate the models' capabilities to generalize to unseen roads. In this setup, the models are trained on data from a sample of roads, but tested on roads not seen in the training data. Moreover, we also present a novel framework called Spatial Contrastive Pre-Training (SCPT) where we introduce a spatial encoder module to extract latent features from unseen roads during inference time. This spatial encoder is pre-trained using contrastive learning. During inference, the spatial encoder only requires two days of traffic data on the new roads and does not require any re-training. We also show that the output from the spatial encoder can be used effectively to infer latent node embeddings on unseen roads during inference time. The SCPT framework also incorporates a new layer, named the spatially gated addition (SGA) layer, to effectively combine the latent features from the output of the spatial encoder to existing backbones. Additionally, since there is limited data on the unseen roads, we argue that it is better to decouple traffic signals to trivial-to-capture periodic signals and difficult-to-capture Markovian signals, and for the spatial encoder to only learn the Markovian signals. Finally, we empirically evaluated SCPT using the ST split setup on four real-world datasets. The results showed that adding SCPT to a backbone consistently improves forecasting performance on unseen roads. More importantly, the improvements are greater when forecasting further into the future. The codes are available on GitHub: https://github.com/cruiseresearchgroup/forecasting-on-new-roads .
△ Less
Submitted 21 September, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Self-supervised Activity Representation Learning with Incremental Data: An Empirical Study
Authors:
Jason Liu,
Shohreh Deldari,
Hao Xue,
Van Nguyen,
Flora D. Salim
Abstract:
In the context of mobile sensing environments, various sensors on mobile devices continually generate a vast amount of data. Analyzing this ever-increasing data presents several challenges, including limited access to annotated data and a constantly changing environment. Recent advancements in self-supervised learning have been utilized as a pre-training step to enhance the performance of conventi…
▽ More
In the context of mobile sensing environments, various sensors on mobile devices continually generate a vast amount of data. Analyzing this ever-increasing data presents several challenges, including limited access to annotated data and a constantly changing environment. Recent advancements in self-supervised learning have been utilized as a pre-training step to enhance the performance of conventional supervised models to address the absence of labelled datasets. This research examines the impact of using a self-supervised representation learning model for time series classification tasks in which data is incrementally available. We proposed and evaluated a workflow in which a model learns to extract informative features using a corpus of unlabeled time series data and then conducts classification on labelled data using features extracted by the model. We analyzed the effect of varying the size, distribution, and source of the unlabeled data on the final classification performance across four public datasets, including various types of sensors in diverse applications.
△ Less
Submitted 30 April, 2023;
originally announced May 2023.
-
Examining the Impact of Uncontrolled Variables on Physiological Signals in User Studies for Information Processing Activities
Authors:
Kaixin Ji,
Damiano Spina,
Danula Hettiachchi,
Flora Dilys Salim,
Falk Scholer
Abstract:
Physiological signals can potentially be applied as objective measures to understand the behavior and engagement of users interacting with information access systems. However, the signals are highly sensitive, and many controls are required in laboratory user studies. To investigate the extent to which controlled or uncontrolled (i.e., confounding) variables such as task sequence or duration influ…
▽ More
Physiological signals can potentially be applied as objective measures to understand the behavior and engagement of users interacting with information access systems. However, the signals are highly sensitive, and many controls are required in laboratory user studies. To investigate the extent to which controlled or uncontrolled (i.e., confounding) variables such as task sequence or duration influence the observed signals, we conducted a pilot study where each participant completed four types of information-processing activities (READ, LISTEN, SPEAK, and WRITE). Meanwhile, we collected data on blood volume pulse, electrodermal activity, and pupil responses. We then used machine learning approaches as a mechanism to examine the influence of controlled and uncontrolled variables that commonly arise in user studies. Task duration was found to have a substantial effect on the model performance, suggesting it represents individual differences rather than giving insight into the target variables. This work contributes to our understanding of such variables in using physiological signals in information retrieval user studies.
△ Less
Submitted 26 April, 2023;
originally announced April 2023.
-
Equalised Odds is not Equal Individual Odds: Post-processing for Group and Individual Fairness
Authors:
Edward A. Small,
Kacper Sokol,
Daniel Manning,
Flora D. Salim,
Jeffrey Chan
Abstract:
Group fairness is achieved by equalising prediction distributions between protected sub-populations; individual fairness requires treating similar individuals alike. These two objectives, however, are incompatible when a scoring model is calibrated through discontinuous probability functions, where individuals can be randomly assigned an outcome determined by a fixed probability. This procedure ma…
▽ More
Group fairness is achieved by equalising prediction distributions between protected sub-populations; individual fairness requires treating similar individuals alike. These two objectives, however, are incompatible when a scoring model is calibrated through discontinuous probability functions, where individuals can be randomly assigned an outcome determined by a fixed probability. This procedure may provide two similar individuals from the same protected group with classification odds that are disparately different -- a clear violation of individual fairness. Assigning unique odds to each protected sub-population may also prevent members of one sub-population from ever receiving equal chances of a positive outcome to another, which we argue is another type of unfairness called individual odds. We reconcile all this by constructing continuous probability functions between group thresholds that are constrained by their Lipschitz constant. Our solution preserves the model's predictive power, individual fairness and robustness while ensuring group fairness.
△ Less
Submitted 19 April, 2024; v1 submitted 19 April, 2023;
originally announced April 2023.
-
Because Every Sensor Is Unique, so Is Every Pair: Handling Dynamicity in Traffic Forecasting
Authors:
Arian Prabowo,
Wei Shao,
Hao Xue,
Piotr Koniusz,
Flora D. Salim
Abstract:
Traffic forecasting is a critical task to extract values from cyber-physical infrastructures, which is the backbone of smart transportation. However owing to external contexts, the dynamics at each sensor are unique. For example, the afternoon peaks at sensors near schools are more likely to occur earlier than those near residential areas. In this paper, we first analyze real-world traffic data to…
▽ More
Traffic forecasting is a critical task to extract values from cyber-physical infrastructures, which is the backbone of smart transportation. However owing to external contexts, the dynamics at each sensor are unique. For example, the afternoon peaks at sensors near schools are more likely to occur earlier than those near residential areas. In this paper, we first analyze real-world traffic data to show that each sensor has a unique dynamic. Further analysis also shows that each pair of sensors also has a unique dynamic. Then, we explore how node embedding learns the unique dynamics at every sensor location. Next, we propose a novel module called Spatial Graph Transformers (SGT) where we use node embedding to leverage the self-attention mechanism to ensure that the information flow between two sensors is adaptive with respect to the unique dynamic of each pair. Finally, we present Graph Self-attention WaveNet (G-SWaN) to address the complex, non-linear spatiotemporal traffic dynamics. Through empirical experiments on four real-world, open datasets, we show that the proposed method achieves superior performance on both traffic speed and flow forecasting. Code is available at: https://github.com/aprbw/G-SWaN
△ Less
Submitted 28 February, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Multiple-level Point Embedding for Solving Human Trajectory Imputation with Prediction
Authors:
Kyle K. Qin,
Yongli Ren,
Wei Shao,
Brennan Lake,
Filippo Privitera,
Flora D. Salim
Abstract:
Sparsity is a common issue in many trajectory datasets, including human mobility data. This issue frequently brings more difficulty to relevant learning tasks, such as trajectory imputation and prediction. Nowadays, little existing work simultaneously deals with imputation and prediction on human trajectories. This work plans to explore whether the learning process of imputation and prediction cou…
▽ More
Sparsity is a common issue in many trajectory datasets, including human mobility data. This issue frequently brings more difficulty to relevant learning tasks, such as trajectory imputation and prediction. Nowadays, little existing work simultaneously deals with imputation and prediction on human trajectories. This work plans to explore whether the learning process of imputation and prediction could benefit from each other to achieve better outcomes. And the question will be answered by studying the coexistence patterns between missing points and observed ones in incomplete trajectories. More specifically, the proposed model develops an imputation component based on the self-attention mechanism to capture the coexistence patterns between observations and missing points among encoder-decoder layers. Meanwhile, a recurrent unit is integrated to extract the sequential embeddings from newly imputed sequences for predicting the following location. Furthermore, a new implementation called Imputation Cycle is introduced to enable gradual imputation with prediction enhancement at multiple levels, which helps to accelerate the speed of convergence. The experimental results on three different real-world mobility datasets show that the proposed approach has significant advantages over the competitive baselines across both imputation and prediction tasks in terms of accuracy and stability.
△ Less
Submitted 12 January, 2023; v1 submitted 11 January, 2023;
originally announced January 2023.
-
Detecting Change Intervals with Isolation Distributional Kernel
Authors:
Yang Cao,
Ye Zhu,
Kai Ming Ting,
Flora D. Salim,
Hong Xian Li,
Luxing Yang,
Gang Li
Abstract:
Detecting abrupt changes in data distribution is one of the most significant tasks in streaming data analysis. Although many unsupervised Change-Point Detection (CPD) methods have been proposed recently to identify those changes, they still suffer from missing subtle changes, poor scalability, or/and sensitivity to outliers. To meet these challenges, we are the first to generalise the CPD problem…
▽ More
Detecting abrupt changes in data distribution is one of the most significant tasks in streaming data analysis. Although many unsupervised Change-Point Detection (CPD) methods have been proposed recently to identify those changes, they still suffer from missing subtle changes, poor scalability, or/and sensitivity to outliers. To meet these challenges, we are the first to generalise the CPD problem as a special case of the Change-Interval Detection (CID) problem. Then we propose a CID method, named iCID, based on a recent Isolation Distributional Kernel (IDK). iCID identifies the change interval if there is a high dissimilarity score between two non-homogeneous temporal adjacent intervals. The data-dependent property and finite feature map of IDK enabled iCID to efficiently identify various types of change-points in data streams with the tolerance of outliers. Moreover, the proposed online and offline versions of iCID have the ability to optimise key parameter settings. The effectiveness and efficiency of iCID have been systematically verified on both synthetic and real-world datasets.
△ Less
Submitted 18 January, 2024; v1 submitted 30 December, 2022;
originally announced December 2022.
-
SeqLink: A Robust Neural-ODE Architecture for Modelling Partially Observed Time Series
Authors:
Futoon M. Abushaqra,
Hao Xue,
Yongli Ren,
Flora D. Salim
Abstract:
Ordinary Differential Equations (ODE) based models have become popular as foundation models for solving many time series problems. Combining neural ODEs with traditional RNN models has provided the best representation for irregular time series. However, ODE-based models typically require the trajectory of hidden states to be defined based on either the initial observed value or the most recent obs…
▽ More
Ordinary Differential Equations (ODE) based models have become popular as foundation models for solving many time series problems. Combining neural ODEs with traditional RNN models has provided the best representation for irregular time series. However, ODE-based models typically require the trajectory of hidden states to be defined based on either the initial observed value or the most recent observation, raising questions about their effectiveness when dealing with longer sequences and extended time intervals. In this article, we explore the behaviour of the ODE models in the context of time series data with varying degrees of sparsity. We introduce SeqLink, an innovative neural architecture designed to enhance the robustness of sequence representation. Unlike traditional approaches that solely rely on the hidden state generated from the last observed value, SeqLink leverages ODE latent representations derived from multiple data samples, enabling it to generate robust data representations regardless of sequence length or data sparsity level. The core concept behind our model is the definition of hidden states for the unobserved values based on the relationships between samples (links between sequences). Through extensive experiments on partially observed synthetic and real-world datasets, we demonstrate that SeqLink improves the modelling of intermittent time series, consistently outperforming state-of-the-art approaches.
△ Less
Submitted 8 July, 2024; v1 submitted 7 December, 2022;
originally announced December 2022.
-
Integrated Convolutional and Recurrent Neural Networks for Health Risk Prediction using Patient Journey Data with Many Missing Values
Authors:
Yuxi Liu,
Shaowen Qin,
Antonio Jimeno Yepes,
Wei Shao,
Zhenhao Zhang,
Flora D. Salim
Abstract:
Predicting the health risks of patients using Electronic Health Records (EHR) has attracted considerable attention in recent years, especially with the development of deep learning techniques. Health risk refers to the probability of the occurrence of a specific health outcome for a specific patient. The predicted risks can be used to support decision-making by healthcare professionals. EHRs are s…
▽ More
Predicting the health risks of patients using Electronic Health Records (EHR) has attracted considerable attention in recent years, especially with the development of deep learning techniques. Health risk refers to the probability of the occurrence of a specific health outcome for a specific patient. The predicted risks can be used to support decision-making by healthcare professionals. EHRs are structured patient journey data. Each patient journey contains a chronological set of clinical events, and within each clinical event, there is a set of clinical/medical activities. Due to variations of patient conditions and treatment needs, EHR patient journey data has an inherently high degree of missingness that contains important information affecting relationships among variables, including time. Existing deep learning-based models generate imputed values for missing values when learning the relationships. However, imputed data in EHR patient journey data may distort the clinical meaning of the original EHR patient journey data, resulting in classification bias. This paper proposes a novel end-to-end approach to modeling EHR patient journey data with Integrated Convolutional and Recurrent Neural Networks. Our model can capture both long- and short-term temporal patterns within each patient journey and effectively handle the high degree of missingness in EHR data without any imputation data generation. Extensive experimental results using the proposed model on two real-world datasets demonstrate robust performance as well as superior prediction accuracy compared to existing state-of-the-art imputation-based prediction methods.
△ Less
Submitted 13 November, 2022; v1 submitted 11 November, 2022;
originally announced November 2022.
-
PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting
Authors:
Hao Xue,
Flora D. Salim
Abstract:
This paper presents a new perspective on time series forecasting. In existing time series forecasting methods, the models take a sequence of numerical values as input and yield numerical values as output. The existing SOTA models are largely based on the Transformer architecture, modified with multiple encoding mechanisms to incorporate the context and semantics around the historical data. Inspire…
▽ More
This paper presents a new perspective on time series forecasting. In existing time series forecasting methods, the models take a sequence of numerical values as input and yield numerical values as output. The existing SOTA models are largely based on the Transformer architecture, modified with multiple encoding mechanisms to incorporate the context and semantics around the historical data. Inspired by the successes of pre-trained language foundation models, we pose a question about whether these models can also be adapted to solve time-series forecasting. Thus, we propose a new forecasting paradigm: prompt-based time series forecasting (PromptCast). In this novel task, the numerical input and output are transformed into prompts and the forecasting task is framed in a sentence-to-sentence manner, making it possible to directly apply language models for forecasting purposes. To support and facilitate the research of this task, we also present a large-scale dataset (PISA) that includes three real-world forecasting scenarios. We evaluate different SOTA numerical-based forecasting methods and language generation models. The benchmark results with various forecasting settings demonstrate the proposed PromptCast with language generation models is a promising research direction. Additionally, in comparison to conventional numerical-based forecasting, PromptCast shows a much better generalization ability under the zero-shot setting.
△ Less
Submitted 10 December, 2023; v1 submitted 20 September, 2022;
originally announced October 2022.
-
Leveraging Language Foundation Models for Human Mobility Forecasting
Authors:
Hao Xue,
Bhanu Prakash Voutharoja,
Flora D. Salim
Abstract:
In this paper, we propose a novel pipeline that leverages language foundation models for temporal sequential pattern mining, such as for human mobility forecasting tasks. For example, in the task of predicting Place-of-Interest (POI) customer flows, typically the number of visits is extracted from historical logs, and only the numerical data are used to predict visitor flows. In this research, we…
▽ More
In this paper, we propose a novel pipeline that leverages language foundation models for temporal sequential pattern mining, such as for human mobility forecasting tasks. For example, in the task of predicting Place-of-Interest (POI) customer flows, typically the number of visits is extracted from historical logs, and only the numerical data are used to predict visitor flows. In this research, we perform the forecasting task directly on the natural language input that includes all kinds of information such as numerical values and contextual semantic information. Specific prompts are introduced to transform numerical temporal sequences into sentences so that existing language models can be directly applied. We design an AuxMobLCast pipeline for predicting the number of visitors in each POI, integrating an auxiliary POI category classification task with the encoder-decoder architecture. This research provides empirical evidence of the effectiveness of the proposed AuxMobLCast pipeline to discover sequential patterns in mobility forecasting tasks. The results, evaluated on three real-world datasets, demonstrate that pre-trained language foundation models also have good performance in forecasting temporal sequences. This study could provide visionary insights and lead to new research directions for predicting human mobility.
△ Less
Submitted 14 September, 2022; v1 submitted 10 September, 2022;
originally announced September 2022.
-
Imagining Future Digital Assistants at Work: A Study of Task Management Needs
Authors:
Yonchanok Khaokaew,
Indigo Holcombe-James,
Mohammad Saiedur Rahaman,
Jonathan Liono,
Johanne R. Trippas,
Damiano Spina,
Nicholas Belkin,
Peter Bailey,
Paul N. Bennett,
Yongli Ren,
Mark Sanderson,
Falk Scholer,
Ryen W. White,
Flora D. Salim
Abstract:
Digital Assistants (DAs) can support workers in the workplace and beyond. However, target user needs are not fully understood, and the functions that workers would ideally want a DA to support require further study. A richer understanding of worker needs could help inform the design of future DAs. We investigate user needs of future workplace DAs using data from a user study of 40 workers over a f…
▽ More
Digital Assistants (DAs) can support workers in the workplace and beyond. However, target user needs are not fully understood, and the functions that workers would ideally want a DA to support require further study. A richer understanding of worker needs could help inform the design of future DAs. We investigate user needs of future workplace DAs using data from a user study of 40 workers over a four-week period. Our qualitative analysis confirms existing research and generates new insight on the role of DAs in managing people's time, tasks, and information. Placing these insights in relation to quantitative analysis of self-reported task data, we highlight how different occupation roles require DAs to take varied approaches to these domains and the effect of task characteristics on the imagined features. Our findings have implications for the design of future DAs in work settings, and we offer some recommendations for reduction to practice.
△ Less
Submitted 6 August, 2022;
originally announced August 2022.
-
COCOA: Cross Modality Contrastive Learning for Sensor Data
Authors:
Shohreh Deldari,
Hao Xue,
Aaqib Saeed,
Daniel V. Smith,
Flora D. Salim
Abstract:
Self-Supervised Learning (SSL) is a new paradigm for learning discriminative representations without labelled data and has reached comparable or even state-of-the-art results in comparison to supervised counterparts. Contrastive Learning (CL) is one of the most well-known approaches in SSL that attempts to learn general, informative representations of data. CL methods have been mostly developed fo…
▽ More
Self-Supervised Learning (SSL) is a new paradigm for learning discriminative representations without labelled data and has reached comparable or even state-of-the-art results in comparison to supervised counterparts. Contrastive Learning (CL) is one of the most well-known approaches in SSL that attempts to learn general, informative representations of data. CL methods have been mostly developed for applications in computer vision and natural language processing where only a single sensor modality is used. A majority of pervasive computing applications, however, exploit data from a range of different sensor modalities. While existing CL methods are limited to learning from one or two data sources, we propose COCOA (Cross mOdality COntrastive leArning), a self-supervised model that employs a novel objective function to learn quality representations from multisensor data by computing the cross-correlation between different data modalities and minimizing the similarity between irrelevant instances. We evaluate the effectiveness of COCOA against eight recently introduced state-of-the-art self-supervised models, and two supervised baselines across five public datasets. We show that COCOA achieves superior classification performance to all other approaches. Also, COCOA is far more label-efficient than the other baselines including the fully supervised model using only one-tenth of available labelled data.
△ Less
Submitted 3 August, 2022; v1 submitted 31 July, 2022;
originally announced August 2022.
-
Modeling Long-term Dependencies and Short-term Correlations in Patient Journey Data with Temporal Attention Networks for Health Prediction
Authors:
Yuxi Liu,
Zhenhao Zhang,
Antonio Jimeno Yepes,
Flora D. Salim
Abstract:
Building models for health prediction based on Electronic Health Records (EHR) has become an active research area. EHR patient journey data consists of patient time-ordered clinical events/visits from patients. Most existing studies focus on modeling long-term dependencies between visits, without explicitly taking short-term correlations between consecutive visits into account, where irregular tim…
▽ More
Building models for health prediction based on Electronic Health Records (EHR) has become an active research area. EHR patient journey data consists of patient time-ordered clinical events/visits from patients. Most existing studies focus on modeling long-term dependencies between visits, without explicitly taking short-term correlations between consecutive visits into account, where irregular time intervals, incorporated as auxiliary information, are fed into health prediction models to capture latent progressive patterns of patient journeys. We present a novel deep neural network with four modules to take into account the contributions of various variables for health prediction: i) the Stacked Attention module strengthens the deep semantics in clinical events within each patient journey and generates visit embeddings, ii) the Short-Term Temporal Attention module models short-term correlations between consecutive visit embeddings while capturing the impact of time intervals within those visit embeddings, iii) the Long-Term Temporal Attention module models long-term dependencies between visit embeddings while capturing the impact of time intervals within those visit embeddings, iv) and finally, the Coupled Attention module adaptively aggregates the outputs of Short-Term Temporal Attention and Long-Term Temporal Attention modules to make health predictions. Experimental results on MIMIC-III demonstrate superior predictive accuracy of our model compared to existing state-of-the-art methods, as well as the interpretability and robustness of this approach. Furthermore, we found that modeling short-term correlations contributes to local priors generation, leading to improved predictive modeling of patient journeys.
△ Less
Submitted 15 July, 2022; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data
Authors:
Shohreh Deldari,
Hao Xue,
Aaqib Saeed,
Jiayuan He,
Daniel V. Smith,
Flora D. Salim
Abstract:
Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in the field of computer vision, speech, natural language processing (NLP), and recently, with other types of modalities, including time series from sensors. The popularity of self-supervised learning is driven by the fact that traditional models typically require a huge amount of well-annotated data for training…
▽ More
Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in the field of computer vision, speech, natural language processing (NLP), and recently, with other types of modalities, including time series from sensors. The popularity of self-supervised learning is driven by the fact that traditional models typically require a huge amount of well-annotated data for training. Acquiring annotated data can be a difficult and costly process. Self-supervised methods have been introduced to improve the efficiency of training data through discriminative pre-training of models using supervisory signals that have been freely obtained from the raw data. Unlike existing reviews of SSRL that have pre-dominately focused upon methods in the fields of CV or NLP for a single modality, we aim to provide the first comprehensive review of multimodal self-supervised learning methods for temporal data. To this end, we 1) provide a comprehensive categorization of existing SSRL methods, 2) introduce a generic pipeline by defining the key components of a SSRL framework, 3) compare existing models in terms of their objective function, network architecture and potential applications, and 4) review existing multimodal techniques in each category and various modalities. Finally, we present existing weaknesses and future opportunities. We believe our work develops a perspective on the requirements of SSRL in domains that utilise multimodal and/or temporal data
△ Less
Submitted 7 June, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Measuring disentangled generative spatio-temporal representation
Authors:
Sichen Zhao,
Wei Shao,
Jeffrey Chan,
Flora D. Salim
Abstract:
Disentangled representation learning offers useful properties such as dimension reduction and interpretability, which are essential to modern deep learning approaches. Although deep learning techniques have been widely applied to spatio-temporal data mining, there has been little attention to further disentangle the latent features and understanding their contribution to the model performance, par…
▽ More
Disentangled representation learning offers useful properties such as dimension reduction and interpretability, which are essential to modern deep learning approaches. Although deep learning techniques have been widely applied to spatio-temporal data mining, there has been little attention to further disentangle the latent features and understanding their contribution to the model performance, particularly their mutual information and correlation across features. In this study, we adopt two state-of-the-art disentangled representation learning methods and apply them to three large-scale public spatio-temporal datasets. To evaluate their performance, we propose an internal evaluation metric focusing on the degree of correlations among latent variables of the learned representations and the prediction performance of the downstream tasks. Empirical results show that our modified method can learn disentangled representations that achieve the same level of performance as existing state-of-the-art ST deep learning methods in a spatio-temporal sequence forecasting problem. Additionally, we find that our methods can be used to discover real-world spatial-temporal semantics to describe the variables in the learned representation.
△ Less
Submitted 8 April, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Individual and Group-wise Classroom Seating Experience: Effects on Student Engagement in Different Courses
Authors:
Nan Gao,
Mohammad Saiedur Rahaman,
Wei Shao,
Kaixin Ji,
Flora D. Salim
Abstract:
Seating location in the classroom can affect student engagement, attention and academic performance by providing better visibility, improved movement, and participation in discussions. Existing studies typically explore how traditional seating arrangements (e.g. grouped tables or traditional rows) influence students' perceived engagement, without considering group seating behaviours under more fle…
▽ More
Seating location in the classroom can affect student engagement, attention and academic performance by providing better visibility, improved movement, and participation in discussions. Existing studies typically explore how traditional seating arrangements (e.g. grouped tables or traditional rows) influence students' perceived engagement, without considering group seating behaviours under more flexible seating arrangements. Furthermore, survey-based measures of student engagement are prone to subjectivity and various response bias. Therefore, in this research, we investigate how individual and group-wise classroom seating experiences affect student engagement using wearable physiological sensors. We conducted a field study at a high school and collected survey and wearable data from 23 students in 10 courses over four weeks. We aim to answer the following research questions: 1. How does the seating proximity between students relate to their perceived learning engagement? 2. How do students' group seating behaviours relate to their physiologically-based measures of engagement (i.e. physiological arousal and physiological synchrony)? Experiment results indicate that the individual and group-wise classroom seating experience is associated with perceived student engagement and physiologically-based engagement measured from electrodermal activity. We also find that students who sit close together are more likely to have similar learning engagement and tend to have high physiological synchrony. This research opens up opportunities to explore the implications of flexible seating arrangements and has great potential to maximize student engagement by suggesting intelligent seating choices in the future.
△ Less
Submitted 23 July, 2022; v1 submitted 22 December, 2021;
originally announced December 2021.