-
Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Model
Authors:
Yuxing Tian,
Yiyan Qi,
Aiwen Jiang,
Qi Huang,
Jian Guo
Abstract:
Continuous-Time Dynamic Graph (CTDG) precisely models evolving real-world relationships, drawing heightened interest in dynamic graph learning across academia and industry. However, existing CTDG models encounter challenges stemming from noise and limited historical data. Graph Data Augmentation (GDA) emerges as a critical solution, yet current approaches primarily focus on static graphs and struggle to effectively address the dynamics inherent in CTDGs. Moreover, these methods often demand substantial domain expertise for parameter tuning and lack theoretical guarantees for augmentation efficacy. To address these issues, we propose Conda, a novel latent diffusion-based GDA method tailored for CTDGs. Conda features a sandwich-like architecture, incorporating a Variational Auto-Encoder (VAE) and a conditional diffusion model, aimed at generating enhanced historical neighbor embeddings for target nodes. Unlike conventional diffusion models trained on entire graphs via pre-training, Conda requires historical neighbor sequence embeddings of target nodes for training, thus facilitating more targeted augmentation. We integrate Conda into the CTDG model and adopt an alternating training strategy to optimize performance. Extensive experimentation across six widely used real-world datasets showcases the consistent performance improvement of our approach, particularly in scenarios with limited historical data.
Submitted 11 July, 2024;
originally announced July 2024.
-
DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution
Authors:
Aiwen Jiang,
Zhi Wei,
Long Peng,
Feiqiang Liu,
Wenbo Li,
Mingwen Wang
Abstract:
Image super-resolution aims to reconstruct a high-fidelity, high-resolution counterpart of a low-resolution image. In recent years, diffusion-based models have garnered significant attention due to their rich prior knowledge. The success of diffusion models based on general text prompts has validated the effectiveness of textual control in the field of text2image. However, given the severe degradation commonly present in low-resolution images, coupled with the inherent randomness of diffusion models, current models struggle to adequately discern semantic and degradation information within severely degraded images. This often leads to problems such as semantic loss, visual artifacts, and visual hallucinations, which pose substantial challenges for practical use. To address these challenges, this paper proposes to leverage degradation-aligned language prompts for accurate, fine-grained, and high-fidelity image restoration. Complementary priors, including semantic content descriptions and degradation prompts, are explored. Specifically, on the one hand, an image-restoration prompt alignment decoder is proposed to automatically discern the degradation degree of LR images, thereby generating beneficial degradation priors for image restoration. On the other hand, richly tailored descriptions from a pretrained multimodal large language model elicit high-level semantic priors closely aligned with human perception, ensuring fidelity control for image restoration. Comprehensive comparisons with state-of-the-art methods have been conducted on several popular synthetic and real-world benchmark datasets. The quantitative and qualitative analyses demonstrate that the proposed method achieves a new state-of-the-art perceptual quality level, especially in real-world cases based on reference-free metrics.
Submitted 24 June, 2024;
originally announced June 2024.
-
Implant-to-Wearable Communication through the Human Body: Exploring the Effects of Encapsulated Capacitive and Galvanic Transmitters
Authors:
Anyu Jiang,
Cassandra Acebal,
Brook Heyd,
Trustin White,
Gurleen Kainth,
Arunashish Datta,
Shreyas Sen,
Adam Khalifa,
Baibhab Chatterjee
Abstract:
Data transfer using human-body communication (HBC) represents an actively explored alternative solution to address the challenges related to energy efficiency, tissue absorption, and security of conventional wireless. Although the use of HBC for wearable-to-wearable communication has been well explored, different configurations of the transmitter (Tx) and receiver (Rx) for implant-to-wearable HBC need further study. This paper substantiates the hypothesis that a fully implanted galvanic Tx is more efficient than a capacitive Tx for interaction with a wearable Rx. Given the practical limitations of implanting an ideal capacitive device, we choose a galvanic device with one electrode encapsulated to model the capacitive scenario. We analyze the lumped circuit model for in-body to out-of-body communication, and perform circuit-based as well as Finite Element Method (FEM) simulations to explore how the encapsulation thickness affects the received signal levels. We demonstrate in-vivo experimental results on live Sprague Dawley rats to validate the hypothesis, and show that compared to the galvanic Tx, the channel loss will be $\approx$ 20 dB higher with each additional mm thickness of capacitive encapsulation, eventually going below the noise floor for an ideal capacitive Tx.
Submitted 18 June, 2024;
originally announced June 2024.
-
AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection
Authors:
Anbai Jiang,
Bing Han,
Zhiqiang Lv,
Yufeng Deng,
Wei-Qiang Zhang,
Xie Chen,
Yanmin Qian,
Jia Liu,
Pingyi Fan
Abstract:
Large pre-trained models have demonstrated dominant performances in multiple areas, where the consistency between pre-training and fine-tuning is the key to success. However, few works reported satisfactory results of pre-trained models for the machine anomalous sound detection (ASD) task. This may be caused by the inconsistency of the pre-trained model and the inductive bias of machine audio, resulting in inconsistency in data and architecture. Thus, we propose AnoPatch which utilizes a ViT backbone pre-trained on AudioSet and fine-tunes it on machine audio. It is believed that machine audio is more related to audio datasets than speech datasets, and modeling it from patch level suits the sparsity of machine audio. As a result, AnoPatch showcases state-of-the-art (SOTA) performances on the DCASE 2020 ASD dataset and the DCASE 2023 ASD dataset. We also compare multiple pre-trained models and empirically demonstrate that better consistency yields considerable improvement.
Submitted 17 June, 2024;
originally announced June 2024.
-
Few-Shot Anomaly Detection via Category-Agnostic Registration Learning
Authors:
Chaoqin Huang,
Haoyan Guan,
Aofan Jiang,
Yanfeng Wang,
Michael Spratling,
Xinchao Wang,
Ya Zhang
Abstract:
Most existing anomaly detection methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, thereby failing to meet the requirements for real-world applications. Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper proposes a novel few-shot anomaly detection (FSAD) framework. Using a training set of normal images from various categories, registration, which aims to align normal images of the same category, is leveraged as the proxy task for self-supervised category-agnostic representation learning. At test time, an image and its corresponding support set, consisting of a few normal images from the same category, are supplied, and anomalies are identified by comparing the registered features of the test image to its corresponding support image features. Such a setup enables the model to generalize to novel test categories. It is, to the best of our knowledge, the first FSAD method that requires no model fine-tuning for novel categories, enabling a single model to be applied to all categories. Extensive experiments demonstrate the effectiveness of the proposed method. In particular, it improves the current state-of-the-art for FSAD by 11.3% and 8.3% on the MVTec and MPDD benchmarks, respectively. The source code is available at https://github.com/Haoyan-Guan/CAReg.
Submitted 13 June, 2024;
originally announced June 2024.
-
Towards a Personal Health Large Language Model
Authors:
Justin Cosentino,
Anastasiya Belyaeva,
Xin Liu,
Nicholas A. Furlotte,
Zhun Yang,
Chace Lee,
Erik Schenck,
Yojan Patel,
Jian Cui,
Logan Douglas Schneider,
Robby Bryant,
Ryan G. Gomes,
Allen Jiang,
Roy Lee,
Yun Liu,
Javier Perez,
Jameson K. Rogers,
Cathy Speed,
Shyam Tailor,
Megan Walker,
Jeffrey Yu,
Tim Althoff,
Conor Heneghan,
John Hernandez,
Mark Malhotra
, et al. (9 additional authors not shown)
Abstract:
In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We created and curated three datasets that test 1) production of personalized insights and recommendations from sleep patterns, physical activity, and physiological responses, 2) expert domain knowledge, and 3) prediction of self-reported sleep outcomes. For the first task we designed 857 case studies in collaboration with domain experts to assess real-world scenarios in sleep and fitness. Through comprehensive evaluation of domain-specific rubrics, we observed that Gemini Ultra 1.0 and PH-LLM are not statistically different from expert performance in fitness and, while experts remain superior for sleep, fine-tuning PH-LLM provided significant improvements in using relevant domain knowledge and personalizing information for sleep insights. We evaluated PH-LLM domain knowledge using multiple choice sleep medicine and fitness examinations. PH-LLM achieved 79% on sleep and 88% on fitness, exceeding average scores from a sample of human experts. Finally, we trained PH-LLM to predict self-reported sleep quality outcomes from textual and multimodal encoding representations of wearable data, and demonstrate that multimodal encoding is required to match performance of specialized discriminative models. Although further development and evaluation are necessary in the safety-critical personal health domain, these results demonstrate both the broad knowledge and capabilities of Gemini models and the benefit of contextualizing physiological data for personal health applications as done with PH-LLM.
Submitted 10 June, 2024;
originally announced June 2024.
-
Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe
Authors:
Alicja Ziarko,
Albert Q. Jiang,
Bartosz Piotrowski,
Wenda Li,
Mateja Jamnik,
Piotr Miłoś
Abstract:
Text embeddings are essential for many tasks, such as document retrieval, clustering, and semantic similarity assessment. In this paper, we study how to contrastively train text embedding models in a compute-optimal fashion, given a suite of pre-trained decoder-only language models. Our innovation is an algorithm that produces optimal configurations of model sizes, data quantities, and fine-tuning methods for text-embedding models at different computational budget levels. The resulting recipe, which we obtain through extensive experiments, can be used by practitioners to make informed design choices for their embedding models. Specifically, our findings suggest that full fine-tuning and low-rank adaptation fine-tuning produce optimal models at lower and higher computational budgets respectively.
Submitted 6 June, 2024;
originally announced June 2024.
-
Anomaly Detection in Electrocardiograms: Advancing Clinical Diagnosis Through Self-Supervised Learning
Authors:
Aofan Jiang,
Chaoqin Huang,
Qing Cao,
Yuchen Xu,
Zi Zeng,
Kang Chen,
Ya Zhang,
Yanfeng Wang
Abstract:
The electrocardiogram (ECG) is an essential tool for diagnosing heart disease, with computer-aided systems improving diagnostic accuracy and reducing healthcare costs. Despite advancements, existing systems often miss rare cardiac anomalies that could be precursors to serious, life-threatening issues or alterations in the cardiac macro/microstructure. We address this gap by focusing on self-supervised anomaly detection (AD), training exclusively on normal ECGs to recognize deviations indicating anomalies. We introduce a novel self-supervised learning framework for ECG AD, utilizing a vast dataset of normal ECGs to autonomously detect and localize cardiac anomalies. It proposes a novel masking and restoration technique alongside a multi-scale cross-attention module, enhancing the model's ability to integrate global and local signal features. The framework emphasizes accurate localization of anomalies within ECG signals, ensuring the method's clinical relevance and reliability. To reduce the impact of individual variability, the approach further incorporates crucial patient-specific information from ECG reports, such as age and gender, thus enabling accurate identification of a broad spectrum of cardiac anomalies, including rare ones. Utilizing an extensive dataset of 478,803 ECG graphic reports from real-world clinical practice, our method has demonstrated exceptional effectiveness in AD across all tested conditions, regardless of their frequency of occurrence, significantly outperforming existing models. It achieved superior performance metrics, including an AUROC of 91.2%, an F1 score of 83.7%, a sensitivity rate of 84.2%, a specificity of 83.0%, and a precision of 75.6% with a fixed recall rate of 90%. It has also demonstrated robust localization capabilities, with an AUROC of 76.5% and a Dice coefficient of 65.3% for anomaly localization.
Submitted 7 April, 2024;
originally announced April 2024.
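The Dice coefficient quoted in the abstract above for anomaly localization is, in its standard form, twice the overlap between the predicted and reference regions divided by their combined size. A minimal sketch on per-sample binary masks follows; the function name, the list-based mask representation, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for two binary masks.

    pred and target are equal-length sequences of 0/1 flags, one per ECG sample
    (an assumed representation). Returns 1.0 when both masks are empty, which is
    a convention chosen here, not one stated in the abstract.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count samples flagged as anomalous in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Combined size of the two flagged regions.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A score of 1.0 means the predicted anomalous samples coincide exactly with the reference annotation, and 0.0 means they do not overlap at all.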
-
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images
Authors:
Chaoqin Huang,
Aofan Jiang,
Jinghao Feng,
Ya Zhang,
Xinchao Wang,
Yanfeng Wang
Abstract:
Recent advancements in large-scale visual-language pre-trained models have led to significant progress in zero-/few-shot anomaly detection within natural image domains. However, the substantial domain divergence between natural and medical images limits the effectiveness of these methodologies in medical anomaly detection. This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection. Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels. This multi-level adaptation is guided by multi-level, pixel-wise visual-language feature alignment loss functions, which recalibrate the model's focus from object semantics in natural imagery to anomaly identification in medical images. The adapted features exhibit improved generalization across various medical data types, even in zero-shot scenarios where the model encounters medical modalities and anatomical regions unseen during training. Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models, with an average AUC improvement of 6.24% and 7.33% for anomaly classification, and 2.03% and 2.37% for anomaly segmentation, under the zero-shot and few-shot settings, respectively. Source code is available at: https://github.com/MediaBrain-SJTU/MVFA-AD
Submitted 19 March, 2024;
originally announced March 2024.
-
Dual-Path Coupled Image Deraining Network via Spatial-Frequency Interaction
Authors:
Yuhong He,
Aiwen Jiang,
Lingfang Jiang,
Zhifeng Wang,
Lu Wang
Abstract:
Transformers have recently emerged as a significant force in the field of image deraining. Existing image deraining methods make extensive use of self-attention. Though showcasing impressive results, they tend to neglect critical frequency information, as self-attention is generally less adept at capturing high-frequency details. To overcome this shortcoming, we have developed an innovative Dual-Path Coupled Deraining Network (DPCNet) that integrates information from both spatial and frequency domains through a Spatial Feature Extraction Block (SFEBlock) and a Frequency Feature Extraction Block (FFEBlock). We have further introduced an effective Adaptive Fusion Module (AFM) for the dual-path feature aggregation. Extensive experiments on six public deraining benchmarks and downstream vision tasks have demonstrated that our proposed method not only outperforms existing state-of-the-art deraining methods but also achieves visually pleasing results with excellent robustness on downstream vision tasks.
Submitted 7 February, 2024;
originally announced February 2024.
-
Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges
Authors:
Aiqi Jiang,
Arkaitz Zubiaga
Abstract:
The growing prevalence and rapid evolution of offensive language in social media amplify the complexities of detection, particularly highlighting the challenges in identifying such content across diverse languages. This survey presents a systematic and comprehensive exploration of Cross-Lingual Transfer Learning (CLTL) techniques in offensive language detection in social media. Our study stands as the first holistic overview to focus exclusively on the cross-lingual scenario in this domain. We analyse 67 relevant papers and categorise these studies across various dimensions, including the characteristics of multilingual datasets used, the cross-lingual resources employed, and the specific CLTL strategies implemented. According to "what to transfer", we also summarise three main CLTL transfer approaches: instance, feature, and parameter transfer. Additionally, we shed light on the current challenges and future research opportunities in this field. Furthermore, we have made our survey resources available online, including two comprehensive tables that provide accessible references to the multilingual datasets and CLTL methods used in the reviewed literature.
Submitted 17 January, 2024;
originally announced January 2024.
-
Mixtral of Experts
Authors:
Albert Q. Jiang,
Alexandre Sablayrolles,
Antoine Roux,
Arthur Mensch,
Blanche Savary,
Chris Bamford,
Devendra Singh Chaplot,
Diego de las Casas,
Emma Bou Hanna,
Florian Bressand,
Gianna Lengyel,
Guillaume Bour,
Guillaume Lample,
Lélio Renard Lavaud,
Lucile Saulnier,
Marie-Anne Lachaux,
Pierre Stock,
Sandeep Subramanian,
Sophia Yang,
Szymon Antoniak,
Teven Le Scao,
Théophile Gervet,
Thibaut Lavril,
Thomas Wang,
Timothée Lacroix
, et al. (1 additional author not shown)
Abstract:
We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combine their outputs. Even though each token only sees two experts, the selected experts can be different at each timestep. As a result, each token has access to 47B parameters, but only uses 13B active parameters during inference. Mixtral was trained with a context size of 32k tokens and it outperforms or matches Llama 2 70B and GPT-3.5 across all evaluated benchmarks. In particular, Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks. We also provide a model fine-tuned to follow instructions, Mixtral 8x7B - Instruct, that surpasses GPT-3.5 Turbo, Claude-2.1, Gemini Pro, and Llama 2 70B - chat model on human benchmarks. Both the base and instruct models are released under the Apache 2.0 license.
Submitted 8 January, 2024;
originally announced January 2024.
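The routing step described in the Mixtral abstract above (a router picks two of the eight experts per token and combines their outputs) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the released implementation: the names `top2_route`, `make_expert`, and `EXPERTS` are hypothetical, the experts are stand-in scaling functions rather than feedforward blocks, and weighting the two outputs by a softmax over the two selected logits is an assumption that the abstract does not spell out.

```python
import math

def top2_route(router_logits, experts, x):
    """Send one token vector x through the two highest-scoring experts."""
    # Indices of the two largest router logits (ties keep the lower index first).
    top2 = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:2]
    # Assumed weighting: softmax over the two selected logits only.
    peak = max(router_logits[i] for i in top2)
    exps = [math.exp(router_logits[i] - peak) for i in top2]
    weights = [e / sum(exps) for e in exps]
    # Weighted sum of the two expert outputs; the other experts are never called.
    out = [0.0] * len(x)
    for w, i in zip(weights, top2):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top2, weights

def make_expert(k):
    """Hypothetical stand-in expert: scales its input by (k + 1)."""
    return lambda v: [(k + 1) * vi for vi in v]

# Eight toy experts, mirroring the "8 feedforward blocks" in the abstract.
EXPERTS = [make_expert(k) for k in range(8)]
```

Because only the two selected experts run for a given token, the compute per token depends on two experts' parameters rather than all eight, which is the idea behind the abstract's distinction between total and active parameters.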
-
Textual Prompt Guided Image Restoration
Authors:
Qiuhai Yan,
Aiwen Jiang,
Kang Chen,
Long Peng,
Qiaosi Yi,
Chunjie Zhang
Abstract:
Image restoration has always been a cutting-edge topic in the academic and industrial fields of computer vision. Since degradation signals are often random and diverse, "all-in-one" models that can perform blind image restoration have attracted attention in recent years. Early works require training specialized heads and tails to handle each degradation of concern, which is manually cumbersome. Recent works focus on learning visual prompts from the data distribution to identify the degradation type. However, the prompts employed in most models are non-textual and place insufficient emphasis on the human-in-the-loop. In this paper, an effective textual prompt guided image restoration model is proposed. In this model, a task-specific BERT is fine-tuned to accurately understand the user's instructions and generate textual prompt guidance. Depth-wise multi-head transposed attention and gated convolution modules are designed to bridge the gap between textual prompts and visual features. The proposed model introduces semantic prompts into the low-level visual domain. It highlights the potential to provide a natural, precise, and controllable way to perform image restoration tasks. Extensive experiments have been conducted on public denoising, dehazing and deraining datasets. The experimental results demonstrate that, compared with popular state-of-the-art methods, the proposed model obtains superior performance, achieving accurate recognition and removal of degradation without increasing the model's complexity. Related source code and data will be publicly available on GitHub at https://github.com/MoTong-AI-studio/TextPromptIR.
Submitted 11 December, 2023;
originally announced December 2023.
-
Multilingual Mathematical Autoformalization
Authors:
Albert Q. Jiang,
Wenda Li,
Mateja Jamnik
Abstract:
Autoformalization is the task of translating natural language materials into machine-verifiable formalisations. Progress in autoformalization research is hindered by the lack of a sizeable dataset consisting of informal-formal pairs expressing the same essence. Existing methods tend to circumvent this challenge by manually curating small corpora or using few-shot learning with large language models. But these methods suffer from data scarcity and formal language acquisition difficulty. In this work, we create $\texttt{MMA}$, a large, flexible, multilingual, and multi-domain dataset of informal-formal pairs, by using a language model to translate in the reverse direction, that is, from formal mathematical statements into corresponding informal ones. Experiments show that language models fine-tuned on $\texttt{MMA}$ produce $16-18\%$ of statements acceptable with minimal corrections on the $\texttt{miniF2F}$ and $\texttt{ProofNet}$ benchmarks, up from $0\%$ with the base model. We demonstrate that fine-tuning on multilingual formal data results in more capable autoformalization models even when deployed on monolingual tasks.
Submitted 9 November, 2023; v1 submitted 7 November, 2023;
originally announced November 2023.
-
Llemma: An Open Language Model For Mathematics
Authors:
Zhangir Azerbayev,
Hailey Schoelkopf,
Keiran Paster,
Marco Dos Santos,
Stephen McAleer,
Albert Q. Jiang,
Jia Deng,
Stella Biderman,
Sean Welleck
Abstract:
We present Llemma, a large language model for mathematics. We continue pretraining Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing mathematics, and mathematical code, yielding Llemma. On the MATH benchmark Llemma outperforms all known open base models, as well as the unreleased Minerva model suite on an equi-parameter basis. Moreover, Llemma is capable of tool use and formal theorem proving without any further finetuning. We openly release all artifacts, including 7 billion and 34 billion parameter models, the Proof-Pile-2, and code to replicate our experiments.
Submitted 15 March, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Mistral 7B
Authors:
Albert Q. Jiang,
Alexandre Sablayrolles,
Arthur Mensch,
Chris Bamford,
Devendra Singh Chaplot,
Diego de las Casas,
Florian Bressand,
Gianna Lengyel,
Guillaume Lample,
Lucile Saulnier,
Lélio Renard Lavaud,
Marie-Anne Lachaux,
Pierre Stock,
Teven Le Scao,
Thibaut Lavril,
Thomas Wang,
Timothée Lacroix,
William El Sayed
Abstract:
We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences of arbitrary length with a reduced inference cost. We also provide a model fine-tuned to follow instructions, Mistral 7B -- Instruct, that surpasses the Llama 2 13B -- Chat model both on human and automated benchmarks. Our models are released under the Apache 2.0 license.
Submitted 10 October, 2023;
originally announced October 2023.
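The sliding window attention (SWA) mentioned in the Mistral 7B abstract above restricts each position to a fixed number of recent positions, which is why the per-token attention cost stops growing with sequence length. A minimal sketch of such a causal mask follows; the function name and the convention that the window counts the current position are assumptions for illustration, since the abstract does not define the window precisely.

```python
def sliding_window_mask(seq_len, window):
    """Build a causal sliding-window attention mask.

    mask[i][j] is True when query position i may attend to key position j.
    Assumed convention: position i sees itself and the (window - 1) positions
    immediately before it, and never any later position.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

With `window` equal to `seq_len` this reduces to an ordinary causal mask; with a smaller window, every row has at most `window` True entries regardless of how long the sequence is.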
-
CODA: Temporal Domain Generalization via Concept Drift Simulator
Authors:
Chia-Yuan Chang,
Yu-Neng Chuang,
Zhimeng Jiang,
Kwei-Herng Lai,
Anxiao Jiang,
Na Zou
Abstract:
In real-world applications, machine learning models often become obsolete due to shifts in the joint distribution arising from underlying temporal trends, a phenomenon known as the "concept drift". Existing works propose model-specific strategies to achieve temporal generalization in the near-future domain. However, the diverse characteristics of real-world datasets necessitate customized prediction model architectures. To this end, there is an urgent demand for a model-agnostic temporal domain generalization approach that maintains generality across diverse data modalities and architectures. In this work, we aim to address the concept drift problem from a data-centric perspective to bypass considering the interaction between data and model. Developing such a framework presents non-trivial challenges: (i) existing generative models struggle to generate out-of-distribution future data, and (ii) precisely capturing the temporal trends of joint distribution along chronological source domains is computationally infeasible. To tackle the challenges, we propose the COncept Drift simulAtor (CODA) framework incorporating a predicted feature correlation matrix to simulate future data for model training. Specifically, CODA leverages feature correlations to represent data characteristics at specific time points, thereby circumventing the daunting computational costs. Experimental results demonstrate that using CODA-generated data as training input effectively achieves temporal domain generalization across different model architectures.
Submitted 2 October, 2023;
originally announced October 2023.
-
Improvements on Scalable Stochastic Bayesian Inference Methods for Multivariate Hawkes Process
Authors:
Alex Ziyu Jiang,
Abel Rodríguez
Abstract:
Multivariate Hawkes Processes (MHPs) are a class of point processes that can account for complex temporal dynamics among event sequences. In this work, we study the accuracy and computational efficiency of three classes of algorithms which, while widely used in the context of Bayesian inference, have rarely been applied in the context of MHPs: stochastic gradient expectation-maximization, stochastic gradient variational inference and stochastic gradient Langevin Monte Carlo. An important contribution of this paper is a novel approximation to the likelihood function that allows us to retain the computational advantages associated with conjugate settings while reducing approximation errors associated with the boundary effects. The comparisons are based on various simulated scenarios as well as an application to the study of the risk dynamics in the Standard & Poor's 500 intraday index prices among its 11 sectors.
Submitted 15 January, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
BART-SIMP: a novel framework for flexible spatial covariate modeling and prediction using Bayesian additive regression trees
Authors:
Alex Ziyu Jiang,
Jon Wakefield
Abstract:
Prediction is a classic challenge in spatial statistics and the inclusion of spatial covariates can greatly improve predictive performance when incorporated into a model with latent spatial effects. It is desirable to develop flexible regression models that allow for nonlinearities and interactions in the covariate structure. Machine learning models have been suggested in the spatial context, allowing for spatial dependence in the residuals, but fail to provide reliable uncertainty estimates. In this paper, we investigate a novel combination of a Gaussian process spatial model and a Bayesian Additive Regression Tree (BART) model. The computational burden of the approach is reduced by combining Markov chain Monte Carlo (MCMC) with the Integrated Nested Laplace Approximation (INLA) technique. We study the performance of the method via simulations and use the model to predict anthropometric responses, collected via household cluster samples in Kenya.
Submitted 23 September, 2023;
originally announced September 2023.
-
Multi-Scale Memory Comparison for Zero-/Few-Shot Anomaly Detection
Authors:
Chaoqin Huang,
Aofan Jiang,
Ya Zhang,
Yanfeng Wang
Abstract:
Anomaly detection has gained considerable attention due to its broad range of applications, particularly in industrial defect detection. To address the challenges of data collection, researchers have introduced zero-/few-shot anomaly detection techniques that require minimal normal images for each category. However, complex industrial scenarios often involve multiple objects, presenting a significant challenge. In light of this, we propose a straightforward yet powerful multi-scale memory comparison framework for zero-/few-shot anomaly detection. Our approach employs a global memory bank to capture features across the entire image, while an individual memory bank focuses on simplified scenes containing a single object. The efficacy of our method is validated by its remarkable achievement of 4th place in the zero-shot track and 2nd place in the few-shot track of the Visual Anomaly and Novelty Detection (VAND) competition.
Submitted 1 January, 2024; v1 submitted 9 August, 2023;
originally announced August 2023.
-
Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection
Authors:
Aofan Jiang,
Chaoqin Huang,
Qing Cao,
Shuang Wu,
Zi Zeng,
Kang Chen,
Ya Zhang,
Yanfeng Wang
Abstract:
Electrocardiogram (ECG) is a widely used diagnostic tool for detecting heart conditions. Rare cardiac diseases may be underdiagnosed using traditional ECG analysis, considering that no training dataset can exhaust all possible cardiac disorders. This paper proposes using anomaly detection to identify any unhealthy status, with normal ECGs solely for training. However, detecting anomalies in ECG can be challenging due to significant inter-individual differences and anomalies present in both global rhythm and local morphology. To address this challenge, this paper introduces a novel multi-scale cross-restoration framework for ECG anomaly detection and localization that considers both local and global ECG characteristics. The proposed framework employs a two-branch autoencoder to facilitate multi-scale feature learning through a masking and restoration process, with one branch focusing on global features from the entire ECG and the other on local features from heartbeat-level details, mimicking the diagnostic process of cardiologists. Anomalies are identified by their high restoration errors. To evaluate the performance on a large number of individuals, this paper introduces a new challenging benchmark with signal point-level ground truths annotated by experienced cardiologists. The proposed method demonstrates state-of-the-art performance on this benchmark and two other well-known ECG datasets. The benchmark dataset and source code are available at: https://github.com/MediaBrain-SJTU/ECGAD
Submitted 3 August, 2023;
originally announced August 2023.
-
Research Protocol for the Google Health Digital Well-being Study
Authors:
Daniel McDuff,
Andrew Barakat,
Ari Winbush,
Allen Jiang,
Felicia Cordeiro,
Ryann Crowley,
Lauren E. Kahn,
John Hernandez,
Nicholas B. Allen
Abstract:
The impact of digital device use on health and well-being is a pressing question to which individuals, families, schools, policy makers, legislators, and digital designers are all demanding answers. However, the scientific literature on this topic to date is marred by small and/or unrepresentative samples, poor measurement of core constructs (e.g., device use, smartphone addiction), and a limited ability to address the psychological and behavioral mechanisms that may underlie the relationships between device use and well-being. A number of recent authoritative reviews have made urgent calls for future research projects to address these limitations. The critical role of research is to identify which patterns of use are associated with benefits versus risks, and who is more vulnerable to harmful versus beneficial outcomes, so that we can pursue evidence-based product design, education, and regulation aimed at maximizing benefits and minimizing risks of smartphones and other digital devices. We describe a protocol for a Digital Well-Being (DWB) study to help answer these questions.
Submitted 11 July, 2023;
originally announced July 2023.
-
Evaluating Language Models for Mathematics through Interactions
Authors:
Katherine M. Collins,
Albert Q. Jiang,
Simon Frieder,
Lionel Wong,
Miri Zilka,
Umang Bhatt,
Thomas Lukasiewicz,
Yuhuai Wu,
Joshua B. Tenenbaum,
William Hart,
Timothy Gowers,
Wenda Li,
Adrian Weller,
Mateja Jamnik
Abstract:
There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs, and is insufficient for making an informed decision about which LLMs can be sensibly used and under which assistive settings. Static assessment fails to account for the essential interactive element in LLM deployment, and therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analysing MathConverse, we derive a taxonomy of human behaviours and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, amongst other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by expert mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and are more interpretable and concise may constitute better assistants. Interactive evaluation is a promising way to navigate the capability of these models; humans should be aware of language models' algebraic fallibility and discern where they are appropriate to use.
Submitted 5 November, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Minkowski Functionals of Large-Scale Structure as a Probe of Modified Gravity
Authors:
Aoxiang Jiang,
Wei Liu,
Baojiu Li,
Cristian Barrera-Hinojosa,
Yufei Zhang,
Wenjuan Fang
Abstract:
In this study, we explore the potential of utilizing the four Minkowski functionals, which can fully describe the morphological properties of the large-scale structures, as a robust tool for investigating modified gravity, particularly on non-linear and quasi-linear scales. With the assistance of the N-body simulation, we employ the Minkowski functionals to probe the Hu-Sawicki f(R) gravity model. The focus is on understanding the morphological properties extracted by the Minkowski functionals and their sensitivity to modified gravity. Our analysis involves a comprehensive examination of the cosmic variance arising from finite simulation volumes. By systematically varying smoothing scales and redshifts, we quantify the information encoded in the Minkowski functionals measured from the dark-matter density field. The goal is to assess the capacity of the Minkowski functionals to constrain the model and explore potential improvements through their combination. Additionally, we investigate the impact of using biased tracers such as dark matter halos and the halo occupation distribution galaxies on the modified gravity signatures within the Minkowski functionals of the LSS. Furthermore, we evaluate the influence of the redshift space distortion on the observed results. In summary, our study suggests that the Minkowski functionals of the large-scale structures hold promise as a stringent tool for constraining modified gravity and offer valuable insights into the morphological features of the cosmic web.
Submitted 19 March, 2024; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Dynamical hotness, star formation quenching and growth of supermassive black holes
Authors:
Hui Hong,
Huiyuan Wang,
H. J. Mo,
Ziwen Zhang,
Guangwen Chen,
Wentao Luo,
Tinggui Wang,
Pengfei Li,
Renjie Li,
Yao yao,
Aoxiang Jiang
Abstract:
A stellar system is dynamically hot when its kinetic energy is dominated by random motion represented by the velocity dispersion $σ_{\rm hot} (M_*)$. We use MaNGA data to obtain the inner and outer dispersion of a galaxy, $σ_{\rm in}$ and $σ_{\rm out}$, to characterize its dynamical status and study its connection with star formation quenching and the growth of supermassive black holes (SMBHs). We divide galaxies into fully quenched (FQGs), partially quenched (PQGs) and fully star-forming (FSGs) populations, and identify quenched central cores (QCCs) in PQGs. The galaxy distribution in the $σ_{\rm in}/σ_{\rm hot}$-$σ_{\rm out}/σ_{\rm hot}$ diagram is L-shaped, consisting of a horizontal sequence ($σ_{\rm out}/σ_{\rm hot}\sim0$) and a vertical sequence ($σ_{\rm in}/σ_{\rm hot}\sim1$). FQGs and QCCs are located at the top of the vertical sequence, $σ_{\rm out}/σ_{\rm hot}\sim1$, therefore they are dynamically hot over their entire bodies. PQGs reside along the vertical sequence, so they have hot centers but cold outskirts. FSGs are diverse and can be found in both sequences. Galaxy structural properties, star formation and AGN activities make a transition along the horizontal sequence at $\log(σ_{\rm in}/σ_{\rm hot})\sim-0.3$, and along the vertical sequence at $\log(σ_{\rm out}/σ_{\rm hot})\sim-0.3$. The fractions of optical AGNs and barred galaxies increase rapidly in the first transition and decline rapidly in the second; radio galaxies are located at the top of the vertical sequence. Our results demonstrate that star formation quenching and SMBH growth are effective only in dynamically hot systems. A simple model along this line can reproduce the observed SMBH scaling relations. We discuss how secular processes and strong interactions can make a system dynamically hot, and lead to SMBH growth and star formation quenching.
Submitted 19 July, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
Q-Kostka polynomials and spin Green polynomials
Authors:
Anguo Jiang,
Naihuan Jing,
Ning Liu
Abstract:
We study the $Q$-Kostka polynomials $L_{λμ}(t)$ by the vertex operator realization of the $Q$-Hall-Littlewood functions $G_λ(x;t)$ and derive new formulae for $L_{λμ}(t)$. In particular, we establish a stability property for the $Q$-Kostka polynomials. We also introduce spin Green polynomials $Y^λ_μ(t)$ as both an analogue of the Green polynomials and a deformation of the spin irreducible characters of $\mathfrak S_n$. Iterative formulas for the spin Green polynomials are given and some favorable properties parallel to the Green polynomials are obtained. Tables of $Y^λ_μ(t)$ are included for $n\leq7$.
Submitted 14 April, 2023;
originally announced April 2023.
-
Unsupervised Anomaly Detection and Localization of Machine Audio: A GAN-based Approach
Authors:
Anbai Jiang,
Wei-Qiang Zhang,
Yufeng Deng,
Pingyi Fan,
Jia Liu
Abstract:
Automatic detection of machine anomalies remains challenging for machine learning. We believe the capabilities of generative adversarial networks (GANs) suit the needs of machine audio anomaly detection, yet rarely has this been investigated by previous work. In this paper, we propose AEGAN-AD, a totally unsupervised approach in which the generator (also an autoencoder) is trained to reconstruct input spectrograms. It is pointed out that the denoising nature of reconstruction deprecates its capacity. Thus, the discriminator is redesigned to aid the generator during both the training and detection stages. The performance of AEGAN-AD on the dataset of DCASE 2022 Challenge Task 2 demonstrates state-of-the-art results on five machine types. A novel anomaly localization method is also investigated. Source code available at: www.github.com/jianganbai/AEGAN-AD
Submitted 31 March, 2023;
originally announced March 2023.
-
Grothendieck Duality via Diagonally Supported Sheaves
Authors:
Andy Jiang
Abstract:
Following a formula found in the paper of Avramov, Iyengar, Lipman, and Nayak (2010) and ideas of Neeman and Khusyairi, we indicate that Grothendieck duality for finite tor-amplitude maps can be developed from scratch via the formula $f^! := δ^*π_1^{\times}f^*$. Our strategy centers on the subcategory $Γ_Δ(\mathrm{QCoh}(X \times X))$ of quasicoherent sheaves on $X \times X$ supported on the diagonal. By exclusively using this subcategory instead of the full category $\mathrm{QCoh}(X \times X)$ we give systematic categorical proofs of results in Grothendieck duality and reprove many formulas found in Neeman (2018). We also relate some results in Grothendieck duality with properties of the sheaf of (derived) Grothendieck differential operators.
Submitted 28 March, 2023;
originally announced March 2023.
-
The Derived Ring of Differential Operators
Authors:
Andy Jiang
Abstract:
By reading a standard formula for the ring of Grothendieck differential operators in a derived way, we construct a derived (sheaf of) ring of Grothendieck differential operators for Noetherian schemes $X$ separated and finite-type over a base $S$, when the map $X \to S$ is finite tor-amplitude. Using this ring of differential operators, we (re-)develop the theory of $D$-modules from scratch and show an equivalence of categories between $D$-modules using our definition and crystals over the infinitesimal site.
Submitted 28 March, 2023;
originally announced March 2023.
-
Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models
Authors:
Cheng Guo,
Leidong Fan,
Ziyu Xue,
Xiuhua Jiang
Abstract:
In the media industry, the demand for SDR-to-HDRTV up-conversion arises when users possess HDR-WCG (high dynamic range-wide color gamut) TVs while most off-the-shelf footage is still in SDR (standard dynamic range). The research community has started tackling this low-level vision task with learning-based approaches. Yet when applied to real SDR, current methods tend to produce dim and desaturated results, making nearly no improvement to the viewing experience. Unlike other network-oriented methods, we attribute this deficiency to the training set (HDR-SDR pairs). Consequently, we propose a new HDRTV dataset (dubbed HDRTV4K) and new HDR-to-SDR degradation models. These are then used to train a luminance-segmented network (LSN) consisting of a global mapping trunk and two Transformer branches for the bright and dark luminance ranges. We also update the assessment criteria with tailored metrics and a subjective experiment. Finally, ablation studies are conducted to prove the effectiveness. Our work is available at: https://github.com/AndreGuo/HDRTVDM.
Submitted 23 March, 2023;
originally announced March 2023.
-
GPT-4 Technical Report
Authors:
OpenAI,
Josh Achiam,
Steven Adler,
Sandhini Agarwal,
Lama Ahmad,
Ilge Akkaya,
Florencia Leoni Aleman,
Diogo Almeida,
Janko Altenschmidt,
Sam Altman,
Shyamal Anadkat,
Red Avila,
Igor Babuschkin,
Suchir Balaji,
Valerie Balcom,
Paul Baltescu,
Haiming Bao,
Mohammad Bavarian,
Jeff Belgum,
Irwan Bello,
Jake Berdine,
Gabriel Bernadett-Shapiro,
Christopher Berner,
Lenny Bogdonoff,
Oleg Boiko
, et al. (256 additional authors not shown)
Abstract:
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
Submitted 4 March, 2024; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Magnushammer: A Transformer-Based Approach to Premise Selection
Authors:
Maciej Mikuła,
Szymon Tworkowski,
Szymon Antoniak,
Bartosz Piotrowski,
Albert Qiaochu Jiang,
Jin Peng Zhou,
Christian Szegedy,
Łukasz Kuciński,
Piotr Miłoś,
Yuhuai Wu
Abstract:
This paper presents a novel approach to premise selection, a crucial reasoning task in automated theorem proving. Traditionally, symbolic methods that rely on extensive domain knowledge and engineering effort are applied to this task. In contrast, this work demonstrates that contrastive training with the transformer architecture can achieve higher-quality retrieval of relevant premises, without the engineering overhead. Our method, Magnushammer, outperforms the most advanced and widely used automation tool in interactive theorem proving called Sledgehammer. On the PISA and miniF2F benchmarks Magnushammer achieves $59.5\%$ (against $38.3\%$) and $34.0\%$ (against $20.9\%$) success rates, respectively. By combining Magnushammer with a language-model-based automated theorem prover, we further improve the state-of-the-art proof success rate from $57.0\%$ to $71.0\%$ on the PISA benchmark using $4$x fewer parameters. Moreover, we develop and open source a novel dataset for premise selection, containing textual representations of (proof state, relevant premise) pairs. To the best of our knowledge, this is the largest available premise selection dataset, and the first one for the Isabelle proof assistant.
Submitted 18 March, 2024; v1 submitted 8 March, 2023;
originally announced March 2023.
-
Probing massive neutrinos with the Minkowski functionals of the galaxy distribution
Authors:
Wei Liu,
Aoxiang Jiang,
Wenjuan Fang
Abstract:
The characteristic signatures of massive neutrinos on large-scale structure (LSS), if fully captured, can be used to put a stringent constraint on their mass sum, $M_ν$. Previous work utilizing N-body simulations has shown the Minkowski functionals (MFs) of LSS can reveal the imprints of massive neutrinos on LSS, provide important complementary information to two-point statistics and significantly improve constraints on $M_ν$. In this work, we take a step forward and apply the statistics to the biased tracers of LSS, i.e. the galaxies, and in redshift space. We perform a Fisher matrix analysis and quantify the constraining power of the MFs by using the Molino mock galaxy catalogs, which are constructed based on the halo occupation distribution (HOD) framework with parameters for the SDSS $M_r < -21.5$ and -22 galaxy samples. We find the MFs give tighter constraints on all of the cosmological parameters that we consider than the power spectrum. The constraints on $Ω_{\mathrm{m}}, Ω_{\mathrm{b}}, h, n_s, σ_8$, and $M_ν$ from the MFs are better by a factor of 1.9, 2.9, 3.7, 4.2, 2.5, and 5.7, respectively, after marginalizing over the HOD parameters. Specifically, for $M_ν$, we obtain a 1$σ$ constraint of 0.059 eV with the MFs alone for a volume of only $\left(1 h^{-1} \mathrm{Gpc}\right)^3$.
Submitted 18 September, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
AnnoBERT: Effectively Representing Multiple Annotators' Label Choices to Improve Hate Speech Detection
Authors:
Wenjie Yin,
Vibhor Agarwal,
Aiqi Jiang,
Arkaitz Zubiaga,
Nishanth Sastry
Abstract:
Supervised approaches generally rely on majority-based labels. However, it is hard to achieve high agreement among annotators in subjective tasks such as hate speech detection. Existing neural network models principally regard labels as categorical variables, while ignoring the semantic information in diverse label texts. In this paper, we propose AnnoBERT, a first-of-its-kind architecture integrating annotator characteristics and label text with a transformer-based model to detect hate speech. It builds unique representations of each annotator's characteristics via Collaborative Topic Regression (CTR) and integrates label text to enrich textual representations. During training, the model associates annotators with their label choices given a piece of text; during evaluation, when label information is not available, the model predicts the aggregated label given by the participating annotators by utilising the learnt association. The proposed approach displayed an advantage in detecting hate speech, especially in the minority class and edge cases with annotator disagreement. Improvement in the overall performance is the largest when the dataset is more label-imbalanced, suggesting its practical value in identifying real-world hate speech, as the volume of hate speech in-the-wild is extremely small on social media, when compared with normal (non-hate) speech. Through ablation studies, we show the relative contributions of annotator embeddings and label text to the model performance, and test a range of alternative annotator embeddings and label text combinations.
Submitted 10 January, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Modeling 100% Electrified Transportation in NYC
Authors:
Jingrong Zhang,
Amber Jiang,
Brian Newborn,
Sara Kou,
Robert Mieth
Abstract:
Envisioning a future 100% electrified transportation sector, this paper uses socio-economic, demographic, and geographic data to assess electric energy demand from commuter traffic. We explore individual mode choices, which allows us to create mode-mix scenarios for the entire population, and quantify the electric energy demand for each scenario using technical specifications of battery and electric drive technology in combination with different charging scenarios. Using data sets for New York City, our results highlight the need for infrastructure investments, the usefulness of flexible charging policies, and the positive impact of incentivizing micromobility and mass-transit options. Our model and results are publicly available as an interactive dashboard.
Submitted 17 February, 2023; v1 submitted 16 November, 2022;
originally announced November 2022.
-
SexWEs: Domain-Aware Word Embeddings via Cross-lingual Semantic Specialisation for Chinese Sexism Detection in Social Media
Authors:
Aiqi Jiang,
Arkaitz Zubiaga
Abstract:
The goal of sexism detection is to mitigate negative online content targeting certain gender groups of people. However, the limited availability of labeled sexism-related datasets makes it problematic to identify online sexism for low-resource languages. In this paper, we address the task of automatic sexism detection in social media for one low-resource language -- Chinese. Rather than collecting new sexism data or building cross-lingual transfer learning models, we develop a cross-lingual domain-aware semantic specialisation system in order to make the most of existing data. Semantic specialisation is a technique for retrofitting pre-trained distributional word vectors by integrating external linguistic knowledge (such as lexico-semantic relations) into the specialised feature space. To do this, we leverage semantic resources for sexism from a high-resource language (English) to specialise pre-trained word vectors in the target language (Chinese) to inject domain knowledge. We demonstrate the benefit of our sexist word embeddings (SexWEs) specialised by our framework via intrinsic evaluation of word similarity and extrinsic evaluation of sexism detection. Compared with other specialisation approaches and Chinese baseline word vectors, our SexWEs show average score improvements of 0.033 and 0.064 in intrinsic and extrinsic evaluations, respectively. The ablative results and visualisation of SexWEs also demonstrate the effectiveness of our framework in retrofitting word vectors in low-resource languages.
Submitted 30 March, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
-
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Authors:
Albert Q. Jiang,
Sean Welleck,
Jin Peng Zhou,
Wenda Li,
Jiacheng Liu,
Mateja Jamnik,
Timothée Lacroix,
Yuhuai Wu,
Guillaume Lample
Abstract:
The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems. We investigate two relevant setups where informal proofs are either written by humans or generated by a language model. Our experiments and ablation studies show that large language models are able to produce well-structured formal sketches that follow the same reasoning steps as the informal proofs. Guiding an automated prover with these sketches enhances its performance from 20.9% to 39.3% on a collection of mathematical competition problems.
Submitted 20 February, 2023; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Tensor Hypercontraction Form of the Perturbative Triples Energy in Coupled-Cluster Theory
Authors:
Andy Jiang,
Justin M. Turney,
Henry F. Schaefer III
Abstract:
We present the working equations for a reduced-scaling method of evaluating the perturbative triples (T) energy in coupled-cluster theory, through the tensor hypercontraction (THC) of the triples amplitudes ($t_{ijk}^{abc}$). Through our method we can reduce the scaling of the (T) energy from the traditional O($N^{7}$) to a more modest O($N^{5}$). We also discuss implementation details to aid future research, development, and software realization of this method. Additionally, we show that this method yields sub-millihartree (mEh) differences from CCSD(T) when evaluating absolute energies, and sub-0.1 kcal/mol energy differences when evaluating relative energies. Finally, we demonstrate that this method converges to the true CCSD(T) energy through the systematic increasing of the rank or eigenvalue tolerance of the orthogonal projector, as well as exhibiting sub-linear to linear error growth with respect to system size.
Submitted 13 October, 2022;
originally announced October 2022.
-
A Holistic Approach to Undesired Content Detection in the Real World
Authors:
Todor Markov,
Chong Zhang,
Sandhini Agarwal,
Tyna Eloundou,
Teddy Lee,
Steven Adler,
Angela Jiang,
Lilian Weng
Abstract:
We present a holistic approach to building a robust and useful natural language classification system for real-world content moderation. The success of such a system relies on a chain of carefully designed and executed steps, including the design of content taxonomies and labeling instructions, data quality control, an active learning pipeline to capture rare events, and a variety of methods to make the model robust and to avoid overfitting. Our moderation system is trained to detect a broad set of categories of undesired content, including sexual content, hateful content, violence, self-harm, and harassment. This approach generalizes to a wide range of different content taxonomies and can be used to create high-quality content classifiers that outperform off-the-shelf models.
Submitted 14 February, 2023; v1 submitted 5 August, 2022;
originally announced August 2022.
-
Finding a Lower Bound for k-Unbounded Hamiltonian Cycles
Authors:
Albert R. Jiang
Abstract:
Methods to determine the existence of Hamiltonian Cycles in graphs have been extensively studied. However, little research has been done following cases when no Hamiltonian Cycle exists. Let a vertex be "unbounded" if it is visited more than once in a path. Furthermore, let a k-Unbounded Hamiltonian Cycle be a path with finite length that visits every vertex, has adjacent start and end vertices, and contains k unbounded vertices. We consider a novel variant of the Hamiltonian Cycle Problem in which the objective is to find an m-Unbounded Hamiltonian Cycle where m is the minimum value of k such that a k-Unbounded Hamiltonian Cycle exists. We first consider the task on well-known non-Hamiltonian graphs. We then provide an exponential-time brute-force algorithm for the determination of an m-Unbounded Hamiltonian Cycle and discuss approaches to solve the variant through transformations to the Hamiltonian Cycle Problem and the Asymmetric Traveling Salesman Problem. Finally, we present a polynomial-time heuristic for the determination of an m-Unbounded Hamiltonian Cycle that is also shown to be an effective heuristic for the original Hamiltonian Cycle Problem.
Submitted 8 August, 2022; v1 submitted 3 August, 2022;
originally announced August 2022.
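The definition in the abstract above can be made concrete with a small brute-force sketch. This is not the paper's algorithm; it is an illustration written only from the abstract's definition, and it assumes a simple undirected graph without self-loops, given as an adjacency list. The function names `min_unbounded` and `_walk_exists` are hypothetical. The idea: try every set of vertices that are allowed to repeat, smallest sets first, and search over (current vertex, visited set) states for a walk that covers every vertex and ends next to its start.

```python
from collections import deque
from itertools import combinations


def _walk_exists(adj, repeatable, full):
    """True if some walk visits every vertex, repeats only vertices in
    `repeatable`, and ends on a vertex adjacent to its start vertex."""
    for start in range(len(adj)):
        seen = {(start, 1 << start)}
        queue = deque(seen)
        while queue:
            cur, visited = queue.popleft()
            # Goal: all vertices covered and the end vertex touches the start.
            if visited == full and start in adj[cur]:
                return True
            for nxt in adj[cur]:
                already_visited = (visited >> nxt) & 1
                if already_visited and nxt not in repeatable:
                    continue  # this vertex may be visited only once
                state = (nxt, visited | (1 << nxt))
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False


def min_unbounded(adj):
    """Smallest k such that a k-Unbounded Hamiltonian Cycle exists,
    or None if no covering walk exists (e.g. a disconnected graph).
    Exponential time: intended for tiny graphs only."""
    n = len(adj)
    full = (1 << n) - 1
    for k in range(n + 1):
        for repeatable in combinations(range(n), k):
            if _walk_exists(adj, set(repeatable), full):
                return k
    return None


cycle4 = [[1, 3], [0, 2], [1, 3], [0, 2]]   # ordinary Hamiltonian cycle exists
path3 = [[1], [0, 2], [1]]                  # walk 0,1,2,1 repeats vertex 1
star = [[1, 2, 3], [0], [0], [0]]           # only the centre must repeat
print(min_unbounded(cycle4), min_unbounded(path3), min_unbounded(star))
```

Because the subsets are tried in order of increasing size, the first size that succeeds is the minimum m described in the abstract. The polynomial-time heuristic mentioned there is a different method and is not reproduced here.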
-
Registration based Few-Shot Anomaly Detection
Authors:
Chaoqin Huang,
Haoyan Guan,
Aofan Jiang,
Ya Zhang,
Michael Spratling,
Yan-Feng Wang
Abstract:
This paper considers few-shot anomaly detection (FSAD), a practical yet under-studied setting for anomaly detection (AD), where only a limited number of normal images are provided for each category at training. So far, existing FSAD studies follow the one-model-per-category learning paradigm used for standard AD, and the inter-category commonality has not been explored. Inspired by how humans detect anomalies, i.e., comparing an image in question to normal images, we here leverage registration, an image alignment task that is inherently generalizable across categories, as the proxy task, to train a category-agnostic anomaly detection model. During testing, the anomalies are identified by comparing the registered features of the test image and its corresponding support (normal) images. As far as we know, this is the first FSAD method that trains a single generalizable model and requires no re-training or parameter fine-tuning for new categories. Experimental results have shown that the proposed method outperforms the state-of-the-art FSAD methods by 3%-8% in AUC on the MVTec and MPDD benchmarks.
Submitted 15 July, 2022;
originally announced July 2022.
-
FedSSO: A Federated Server-Side Second-Order Optimization Algorithm
Authors:
Xin Ma,
Renyi Bao,
Jinpeng Jiang,
Yang Liu,
Arthur Jiang,
Jun Yan,
Xin Liu,
Zhisong Pan
Abstract:
In this work, we propose FedSSO, a server-side second-order optimization method for federated learning (FL). In contrast to previous works in this direction, we employ a server-side approximation for the Quasi-Newton method without requiring any training data from the clients. In this way, we not only shift the computation burden from clients to server, but also entirely eliminate the additional communication for second-order updates between clients and server. We provide a theoretical guarantee of convergence for our novel method, and empirically demonstrate its fast convergence and communication savings in both convex and non-convex settings.
Submitted 22 August, 2022; v1 submitted 20 June, 2022;
originally announced June 2022.
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza
, et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
A Trade-off-centered Framework of Content Moderation
Authors:
Jialun Aaron Jiang,
Peipei Nie,
Jed R. Brubaker,
Casey Fiesler
Abstract:
Content moderation research typically prioritizes representing and addressing challenges for one group of stakeholders or communities in one type of context. While taking a focused approach is reasonable or even favorable for empirical case studies, it does not address how content moderation works in multiple contexts. Through a systematic literature review of 86 content moderation papers that document empirical studies, we seek to uncover patterns and tensions within past content moderation research. We find that content moderation can be characterized as a series of trade-offs around moderation actions, styles, philosophies, and values. We discuss how facilitating cooperation and preventing abuse, two key elements in Grimmelmann's definition of moderation, are inherently dialectical in practice. We close by showing how researchers, designers, and moderators can use our framework of trade-offs in their own work, and arguing that trade-offs should be of central importance in investigating and designing content moderation.
Submitted 7 June, 2022;
originally announced June 2022.
-
Autoformalization with Large Language Models
Authors:
Yuhuai Wu,
Albert Q. Jiang,
Wenda Li,
Markus N. Rabe,
Charles Staats,
Mateja Jamnik,
Christian Szegedy
Abstract:
Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects towards this goal. We make the surprising observation that LLMs can correctly translate a significant portion ($25.3\%$) of mathematical competition problems perfectly to formal specifications in Isabelle/HOL. We demonstrate the usefulness of this process by improving a previously introduced neural theorem prover via training on these autoformalized theorems. Our methodology results in a new state-of-the-art result on the MiniF2F theorem proving benchmark, improving the proof rate from $29.6\%$ to $35.2\%$.
Submitted 25 May, 2022;
originally announced May 2022.
-
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
Authors:
Albert Q. Jiang,
Wenda Li,
Szymon Tworkowski,
Konrad Czechowski,
Tomasz Odrzygóźdź,
Piotr Miłoś,
Yuhuai Wu,
Mateja Jamnik
Abstract:
In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and automated theorem provers to overcome this difficulty. In Thor, a class of methods called hammers that leverage the power of automated theorem provers are used for premise selection, while all other tasks are designated to language models. Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8.2\%$ of problems neither language models nor automated theorem provers are able to solve on their own. Furthermore, with a significantly smaller computational budget, Thor can achieve a success rate on the MiniF2F dataset that is on par with the best existing methods. Thor can be instantiated for the majority of popular interactive theorem provers via a straightforward protocol we provide.
Submitted 22 May, 2022;
originally announced May 2022.
-
The Effects of Dynamic Learning and the Forgetting Process on an Optimizing Modelling for Full-Service Repair Pricing Contracts for Medical Devices
Authors:
Aiping Jiang,
Lin Li,
Xuemin Xu,
David Y. C. Huang
Abstract:
In order to improve the profitability and customer service management of original equipment manufacturers (OEMs) in a market where full-service (FS) and on-call service (OS) co-exist, this article extends the optimization model for pricing FS repair contracts with the effects of dynamic learning and forgetting. Along with considering autonomous learning in maintenance practice, this study also analyses how induced learning and the forgetting process in the workplace affect the pricing optimization model of FS contracts in a portfolio of FS and OS. A numerical analysis based on real data from the medical industry shows that the enhanced FS pricing model discussed here has two main advantages: (1) it could markedly improve repair efficiency, and (2) it helps OEMs gain better profits compared to the original FS model and the sole OS maintenance. Sensitivity analysis shows that if the internal failure rate increases, the optimized FS price rises gradually until reaching the maximum value, and profitability to the OEM increases overall; if the frequency of induced learning goes up, the optimal FS price rises after a short-term downward trend, with stable profitability to the OEM.
Submitted 12 April, 2022;
originally announced April 2022.
-
Probing massive neutrinos with the Minkowski functionals of large-scale structure
Authors:
Wei Liu,
Aoxiang Jiang,
Wenjuan Fang
Abstract:
Massive neutrinos suppress the growth of structure under their free-streaming scales. The effect is most prominent on small scales where the widely-used two-point statistics can no longer capture the full information. In this work, we study the signatures massive neutrinos leave on large-scale structure (LSS) as revealed by its morphological properties, which are fully described by $4$ Minkowski functionals (MFs), and quantify the constraints on the summed neutrino mass $M_ν$ from the MFs, by using publicly available N-body simulations. We find the MFs provide important complementary information, and give tighter constraints on $M_ν$ than the power spectrum. Specifically, depending on whether massive neutrinos are included in the density field (the `m' field) or not (the `cb' field), we find the constraint on $M_ν$ from the MFs with a smoothing scale of $R_G=5 h^{-1}$Mpc is $48$ or $4$ times better than that from the power spectrum. When the MFs are combined with the power spectrum, they can improve the constraint on $M_ν$ from the latter by a factor of 63 for the `m' field and 5 for the `cb' field. Notably, when the `m' field is used, the constraint on $M_ν$ from the MFs can reach $0.0177$eV with a volume of $1(h^{-1}\rm Gpc)^3$, while the combination of the MFs and power spectrum can tighten this constraint to be $0.0133$eV, a $4.5σ$ significance on detecting the minimum sum of the neutrino masses. For the `m' field, we also find the $σ_8$ and $M_ν$ degeneracy is broken with the MFs, leading to stronger constraints on all 6 cosmological parameters considered in this work than the power spectrum.
Submitted 15 June, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
-
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence
Authors:
Liangliang Xu,
Daoming Lyu,
Yangchen Pan,
Aiwen Jiang,
Bo Liu
Abstract:
It remains challenging to deploy existing risk-averse approaches to real-world applications. The reasons are multi-fold, including the lack of a global optimality guarantee and the necessity of learning from long-term consecutive trajectories. Long-term consecutive trajectories are prone to visiting hazardous states, which is a major concern in the risk-averse setting. This paper proposes Short-Term VOlatility-controlled Policy Search (STOPS), a novel algorithm that solves risk-averse problems by learning from short-term trajectories instead of long-term trajectories. Short-term trajectories are more flexible to generate, and can avoid the danger of hazardous state visitations. By using an actor-critic scheme with an overparameterized two-layer neural network, our algorithm finds a globally optimal policy at a sublinear rate with proximal policy optimization and natural policy gradient, with effectiveness comparable to the state-of-the-art convergence rate of risk-neutral policy-search methods. The algorithm is evaluated on challenging MuJoCo robot simulation tasks under the mean-variance evaluation metric. Both theoretical analysis and experimental results demonstrate a state-of-the-art level of STOPS' performance among existing risk-averse policy search methods.
Submitted 22 July, 2022; v1 submitted 24 January, 2022;
originally announced January 2022.
-
A Framework of Severity for Harmful Content Online
Authors:
Morgan Klaus Scheuerman,
Jialun Aaron Jiang,
Casey Fiesler,
Jed R. Brubaker
Abstract:
The proliferation of harmful content on online social media platforms has necessitated empirical understandings of experiences of harm online and the development of practices for harm mitigation. Both understandings of harm and approaches to mitigating that harm, often through content moderation, have implicitly embedded frameworks of prioritization - what forms of harm should be researched, how policy on harmful content should be implemented, and how harmful content should be moderated. To aid efforts of better understanding the variety of online harms, how they relate to one another, and how to prioritize harms relevant to research, policy, and practice, we present a theoretical framework of severity for harmful online content. By employing a grounded theory approach, we developed a framework of severity based on interviews and card-sorting activities conducted with 52 participants over the course of ten months. Through our analysis, we identified four Types of Harm (physical, emotional, relational, and financial) and eight Dimensions along which the severity of harm can be understood (perspectives, intent, agency, experience, scale, urgency, vulnerability, sphere). We describe how our framework can be applied to both research and policy settings towards deeper understandings of specific forms of harm (e.g., harassment) and prioritization frameworks when implementing policies encompassing many forms of harm.
Submitted 17 September, 2021; v1 submitted 9 August, 2021;
originally announced August 2021.