Showing 1–50 of 94 results for author: Jiang, A

  1. arXiv:2407.08500  [pdf, other]

    cs.LG cs.AI

    Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Model

    Authors: Yuxing Tian, Yiyan Qi, Aiwen Jiang, Qi Huang, Jian Guo

    Abstract: Continuous-Time Dynamic Graph (CTDG) precisely models evolving real-world relationships, drawing heightened interest in dynamic graph learning across academia and industry. However, existing CTDG models encounter challenges stemming from noise and limited historical data. Graph Data Augmentation (GDA) emerges as a critical solution, yet current approaches primarily focus on static graphs and strug…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024

  2. arXiv:2406.16477  [pdf, other]

    cs.CV cs.CL

    DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution

    Authors: Aiwen Jiang, Zhi Wei, Long Peng, Feiqiang Liu, Wenbo Li, Mingwen Wang

    Abstract: Image super-resolution pursuits reconstructing high-fidelity high-resolution counterpart for low-resolution image. In recent years, diffusion-based models have garnered significant attention due to their capabilities with rich prior knowledge. The success of diffusion models based on general text prompts has validated the effectiveness of textual control in the field of text2image. However, given…

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.13141  [pdf, other]

    q-bio.TO

    Implant-to-Wearable Communication through the Human Body: Exploring the Effects of Encapsulated Capacitive and Galvanic Transmitters

    Authors: Anyu Jiang, Cassandra Acebal, Brook Heyd, Trustin White, Gurleen Kainth, Arunashish Datta, Shreyas Sen, Adam Khalifa, Baibhab Chatterjee

    Abstract: Data transfer using human-body communication (HBC) represents an actively explored alternative solution to address the challenges related to energy-efficiency, tissue absorption, and security of conventional wireless. Although the use of HBC for wearable-to-wearable communication has been well-explored, different configurations for the transmitter (Tx) and receiver (Rx) for implant-to-wearable HBC…

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.11364  [pdf, other]

    cs.SD eess.AS

    AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

    Authors: Anbai Jiang, Bing Han, Zhiqiang Lv, Yufeng Deng, Wei-Qiang Zhang, Xie Chen, Yanmin Qian, Jia Liu, Pingyi Fan

    Abstract: Large pre-trained models have demonstrated dominant performances in multiple areas, where the consistency between pre-training and fine-tuning is the key to success. However, few works reported satisfactory results of pre-trained models for the machine anomalous sound detection (ASD) task. This may be caused by the inconsistency of the pre-trained model and the inductive bias of machine audio, res…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  5. arXiv:2406.08810  [pdf, other]

    cs.CV

    Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

    Authors: Chaoqin Huang, Haoyan Guan, Aofan Jiang, Yanfeng Wang, Michael Spratling, Xinchao Wang, Ya Zhang

    Abstract: Most existing anomaly detection methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, thereby failing to meet the requirements for real-world applications. Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper proposes a novel few-shot anomaly detection (FSAD)…

    Submitted 13 June, 2024; originally announced June 2024.

  6. arXiv:2406.06474  [pdf, other]

    cs.AI cs.CL

    Towards a Personal Health Large Language Model

    Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

    Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 72 pages

  7. arXiv:2406.04165  [pdf, other]

    cs.LG

    Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe

    Authors: Alicja Ziarko, Albert Q. Jiang, Bartosz Piotrowski, Wenda Li, Mateja Jamnik, Piotr Miłoś

    Abstract: Text embeddings are essential for many tasks, such as document retrieval, clustering, and semantic similarity assessment. In this paper, we study how to contrastively train text embedding models in a compute-optimal fashion, given a suite of pre-trained decoder-only language models. Our innovation is an algorithm that produces optimal configurations of model sizes, data quantities, and fine-tuning…

    Submitted 6 June, 2024; originally announced June 2024.

  8. arXiv:2404.04935  [pdf, other]

    cs.CV

    Anomaly Detection in Electrocardiograms: Advancing Clinical Diagnosis Through Self-Supervised Learning

    Authors: Aofan Jiang, Chaoqin Huang, Qing Cao, Yuchen Xu, Zi Zeng, Kang Chen, Ya Zhang, Yanfeng Wang

    Abstract: The electrocardiogram (ECG) is an essential tool for diagnosing heart disease, with computer-aided systems improving diagnostic accuracy and reducing healthcare costs. Despite advancements, existing systems often miss rare cardiac anomalies that could be precursors to serious, life-threatening issues or alterations in the cardiac macro/microstructure. We address this gap by focusing on self-superv…

    Submitted 7 April, 2024; originally announced April 2024.

  9. arXiv:2403.12570  [pdf, other]

    cs.CV

    Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images

    Authors: Chaoqin Huang, Aofan Jiang, Jinghao Feng, Ya Zhang, Xinchao Wang, Yanfeng Wang

    Abstract: Recent advancements in large-scale visual-language pre-trained models have led to significant progress in zero-/few-shot anomaly detection within natural image domains. However, the substantial domain divergence between natural and medical images limits the effectiveness of these methodologies in medical anomaly detection. This paper introduces a novel lightweight multi-level adaptation and compar…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  10. arXiv:2402.04855  [pdf, other]

    cs.CV

    Dual-Path Coupled Image Deraining Network via Spatial-Frequency Interaction

    Authors: Yuhong He, Aiwen Jiang, Lingfang Jiang, Zhifeng Wang, Lu Wang

    Abstract: Transformers have recently emerged as a significant force in the field of image deraining. Existing image deraining methods utilize extensive research on self-attention. Though showcasing impressive results, they tend to neglect critical frequency information, as self-attention is generally less adept at capturing high-frequency details. To overcome this shortcoming, we have developed an innovativ…

    Submitted 7 February, 2024; originally announced February 2024.

  11. arXiv:2401.09244  [pdf, other]

    cs.CL

    Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges

    Authors: Aiqi Jiang, Arkaitz Zubiaga

    Abstract: The growing prevalence and rapid evolution of offensive language in social media amplify the complexities of detection, particularly highlighting the challenges in identifying such content across diverse languages. This survey presents a systematic and comprehensive exploration of Cross-Lingual Transfer Learning (CLTL) techniques in offensive language detection in social media. Our study stands as…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 35 pages, 7 figures

  12. arXiv:2401.04088  [pdf, other]

    cs.LG cs.CL

    Mixtral of Experts

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, et al. (1 additional author not shown)

    Abstract: We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combine their outputs. Even though each token only sees two experts, the selected e…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: See more details at https://mistral.ai/news/mixtral-of-experts/

  13. arXiv:2312.06162  [pdf, other]

    cs.CV

    Textual Prompt Guided Image Restoration

    Authors: Qiuhai Yan, Aiwen Jiang, Kang Chen, Long Peng, Qiaosi Yi, Chunjie Zhang

    Abstract: Image restoration has always been a cutting-edge topic in the academic and industrial fields of computer vision. Since degradation signals are often random and diverse, "all-in-one" models that can do blind image restoration have been concerned in recent years. Early works require training specialized headers and tails to handle each degradation of concern, which are manually cumbersome. Recent wo…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 12 pages, 10 figures

  14. arXiv:2311.03755  [pdf, other]

    cs.CL cs.LG

    Multilingual Mathematical Autoformalization

    Authors: Albert Q. Jiang, Wenda Li, Mateja Jamnik

    Abstract: Autoformalization is the task of translating natural language materials into machine-verifiable formalisations. Progress in autoformalization research is hindered by the lack of a sizeable dataset consisting of informal-formal pairs expressing the same essence. Existing methods tend to circumvent this challenge by manually curating small corpora or using few-shot learning with large language model…

    Submitted 9 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

  15. arXiv:2310.10631  [pdf, other]

    cs.CL cs.AI cs.LO

    Llemma: An Open Language Model For Mathematics

    Authors: Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck

    Abstract: We present Llemma, a large language model for mathematics. We continue pretraining Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing mathematics, and mathematical code, yielding Llemma. On the MATH benchmark Llemma outperforms all known open base models, as well as the unreleased Minerva model suite on an equi-parameter basis. Moreover, Llemma is capable of tool u…

    Submitted 15 March, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Updated references; corrected description of COPRA search budget

  16. arXiv:2310.06825  [pdf, other]

    cs.CL cs.AI cs.LG

    Mistral 7B

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed

    Abstract: We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences o…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Models and code are available at https://mistral.ai/news/announcing-mistral-7b/

  17. arXiv:2310.01508  [pdf, other]

    cs.LG stat.ML

    CODA: Temporal Domain Generalization via Concept Drift Simulator

    Authors: Chia-Yuan Chang, Yu-Neng Chuang, Zhimeng Jiang, Kwei-Herng Lai, Anxiao Jiang, Na Zou

    Abstract: In real-world applications, machine learning models often become obsolete due to shifts in the joint distribution arising from underlying temporal trends, a phenomenon known as the "concept drift". Existing works propose model-specific strategies to achieve temporal generalization in the near-future domain. However, the diverse characteristics of real-world datasets necessitate customized predicti…

    Submitted 2 October, 2023; originally announced October 2023.

  18. arXiv:2309.14658  [pdf, other]

    stat.CO stat.ME

    Improvements on Scalable Stochastic Bayesian Inference Methods for Multivariate Hawkes Process

    Authors: Alex Ziyu Jiang, Abel Rodríguez

    Abstract: Multivariate Hawkes Processes (MHPs) are a class of point processes that can account for complex temporal dynamics among event sequences. In this work, we study the accuracy and computational efficiency of three classes of algorithms which, while widely used in the context of Bayesian inference, have rarely been applied in the context of MHPs: stochastic gradient expectation-maximization, stochast…

    Submitted 15 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

  19. arXiv:2309.13270  [pdf, other]

    stat.ME stat.ML

    BART-SIMP: a novel framework for flexible spatial covariate modeling and prediction using Bayesian additive regression trees

    Authors: Alex Ziyu Jiang, Jon Wakefield

    Abstract: Prediction is a classic challenge in spatial statistics and the inclusion of spatial covariates can greatly improve predictive performance when incorporated into a model with latent spatial effects. It is desirable to develop flexible regression models that allow for nonlinearities and interactions in the covariate structure. Machine learning models have been suggested in the spatial context, allo…

    Submitted 23 September, 2023; originally announced September 2023.

  20. arXiv:2308.04789  [pdf, other]

    cs.CV

    Multi-Scale Memory Comparison for Zero-/Few-Shot Anomaly Detection

    Authors: Chaoqin Huang, Aofan Jiang, Ya Zhang, Yanfeng Wang

    Abstract: Anomaly detection has gained considerable attention due to its broad range of applications, particularly in industrial defect detection. To address the challenges of data collection, researchers have introduced zero-/few-shot anomaly detection techniques that require minimal normal images for each category. However, complex industrial scenarios often involve multiple objects, presenting a signific…

    Submitted 1 January, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

    Comments: VAND Runner-up Winner in CVPR 2023

  21. arXiv:2308.01639  [pdf, other]

    cs.CV

    Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection

    Authors: Aofan Jiang, Chaoqin Huang, Qing Cao, Shuang Wu, Zi Zeng, Kang Chen, Ya Zhang, Yanfeng Wang

    Abstract: Electrocardiogram (ECG) is a widely used diagnostic tool for detecting heart conditions. Rare cardiac diseases may be underdiagnosed using traditional ECG analysis, considering that no training dataset can exhaust all possible cardiac disorders. This paper proposes using anomaly detection to identify any unhealthy status, with normal ECGs solely for training. However, detecting anomalies in ECG ca…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: MICCAI 2023 Early Accept

  22. arXiv:2307.05795  [pdf]

    cs.HC

    Research Protocol for the Google Health Digital Well-being Study

    Authors: Daniel McDuff, Andrew Barakat, Ari Winbush, Allen Jiang, Felicia Cordeiro, Ryann Crowley, Lauren E. Kahn, John Hernandez, Nicholas B. Allen

    Abstract: The impact of digital device use on health and well-being is a pressing question to which individuals, families, schools, policy makers, legislators, and digital designers are all demanding answers. However, the scientific literature on this topic to date is marred by small and/or unrepresentative samples, poor measurement of core constructs (e.g., device use, smartphone addiction), and a limited…

    Submitted 11 July, 2023; originally announced July 2023.

  23. arXiv:2306.01694  [pdf, other]

    cs.LG cs.HC

    Evaluating Language Models for Mathematics through Interactions

    Authors: Katherine M. Collins, Albert Q. Jiang, Simon Frieder, Lionel Wong, Miri Zilka, Umang Bhatt, Thomas Lukasiewicz, Yuhuai Wu, Joshua B. Tenenbaum, William Hart, Timothy Gowers, Wenda Li, Adrian Weller, Mateja Jamnik

    Abstract: There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs, and is insufficient for making an informed decision about which LLMs and under which assistive settings can they be sensibly used. Static assessment fails to a…

    Submitted 5 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  24. arXiv:2305.04520  [pdf, other]

    astro-ph.CO

    Minkowski Functionals of Large-Scale Structure as a Probe of Modified Gravity

    Authors: Aoxiang Jiang, Wei Liu, Baojiu Li, Cristian Barrera-Hinojosa, Yufei Zhang, Wenjuan Fang

    Abstract: In this study, we explore the potential of utilizing the four Minkowski functionals, which can fully describe the morphological properties of the large-scale structures, as a robust tool for investigating the modified gravity, particularly on non-linear and quasi-linear scales. With the assistance of the N-body simulation, we employ the Minkowski functionals to probe the Hu-Sawicki f(R) gravity mo…

    Submitted 19 March, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: 21 pages, 12 figures, accepted by PRD

  25. arXiv:2305.02910  [pdf, other]

    astro-ph.GA

    Dynamical hotness, star formation quenching and growth of supermassive black holes

    Authors: Hui Hong, Huiyuan Wang, H. J. Mo, Ziwen Zhang, Guangwen Chen, Wentao Luo, Tinggui Wang, Pengfei Li, Renjie Li, Yao yao, Aoxiang Jiang

    Abstract: A stellar system is dynamically hot when its kinetic energy is dominated by random motion represented by the velocity dispersion $σ_{\rm hot} (M_*)$. We use MaNGA data to obtain inner and outer dispersion of a galaxy, $σ_{\rm in}$ and $σ_{\rm out}$, to characterize its dynamical status and study its connection with star formation quenching and the growth of supermassive black hole (SMBH). We divid…

    Submitted 19 July, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: 24 pages, 19 figures, 1 table, accepted by ApJ

  26. Q-Kostka polynomials and spin Green polynomials

    Authors: Anguo Jiang, Naihuan Jing, Ning Liu

    Abstract: We study the $Q$-Kostka polynomials $L_{λμ}(t)$ by the vertex operator realization of the $Q$-Hall-Littlewood functions $G_λ(x;t)$ and derive new formulae for $L_{λμ}(t)$. In particular, we have established stability property for the Q-Kostka polynomials. We also introduce spin Green polynomials $Y^λ_μ(t)$ as both an analogue of the Green polynomials and deformation of the spin irreducible charact…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: 5 tables

    MSC Class: Primary: 05E05; Secondary: 17B69; 05E10

    Journal ref: Monatsh. Math. 201 (2023), no. 1, 109-125

  27. arXiv:2303.17949  [pdf, other]

    cs.SD cs.LG eess.AS

    Unsupervised Anomaly Detection and Localization of Machine Audio: A GAN-based Approach

    Authors: Anbai Jiang, Wei-Qiang Zhang, Yufeng Deng, Pingyi Fan, Jia Liu

    Abstract: Automatic detection of machine anomaly remains challenging for machine learning. We believe the capability of generative adversarial network (GAN) suits the need of machine audio anomaly detection, yet rarely has this been investigated by previous work. In this paper, we propose AEGAN-AD, a totally unsupervised approach in which the generator (also an autoencoder) is trained to reconstruct input s…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  28. arXiv:2303.16086  [pdf, other]

    math.AG

    Grothendieck Duality via Diagonally Supported Sheaves

    Authors: Andy Jiang

    Abstract: Following a formula found in the paper of Avramov, Iyengar, Lipman, and Nayak (2010) and ideas of Neeman and Khusyairi, we indicate that Grothendieck duality for finite tor-amplitude maps can be developed from scratch via the formula $f^! := δ^*π_1^{\times}f^*$. Our strategy centers on the subcategory $Γ_Δ(\mathrm{QCoh}(X \times X))$ of quasicoherent sheaves on $X \times X$ supported on the diagon…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: 27 pages

  29. arXiv:2303.16083  [pdf, ps, other]

    math.AG

    The Derived Ring of Differential Operators

    Authors: Andy Jiang

    Abstract: By reading a standard formula for the ring of Grothendieck differential operators in a derived way, we construct a derived (sheaf of) ring of Grothendieck differential operators for Noetherian schemes $X$ separated and finite-type over a base $S$, when the map $X \to S$ is finite tor-amplitude. Using this ring of differential operators, we (re-)develop the theory of $D$-modules from scratch and sh…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: 46 pages

  30. Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models

    Authors: Cheng Guo, Leidong Fan, Ziyu Xue, and Xiuhua Jiang

    Abstract: In media industry, the demand of SDR-to-HDRTV up-conversion arises when users possess HDR-WCG (high dynamic range-wide color gamut) TVs while most off-the-shelf footage is still in SDR (standard dynamic range). The research community has started tackling this low-level vision task by learning-based approaches. When applied to real SDR, yet, current methods tend to produce dim and desaturated resul…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR2023

  31. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  32. arXiv:2303.04488  [pdf, other]

    cs.LG cs.AI cs.LO

    Magnushammer: A Transformer-Based Approach to Premise Selection

    Authors: Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak, Bartosz Piotrowski, Albert Qiaochu Jiang, Jin Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

    Abstract: This paper presents a novel approach to premise selection, a crucial reasoning task in automated theorem proving. Traditionally, symbolic methods that rely on extensive domain knowledge and engineering effort are applied to this task. In contrast, this work demonstrates that contrastive training with the transformer architecture can achieve higher-quality retrieval of relevant premises, without th…

    Submitted 18 March, 2024; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: ICLR 2024

  33. Probing massive neutrinos with the Minkowski functionals of the galaxy distribution

    Authors: Wei Liu, Aoxiang Jiang, Wenjuan Fang

    Abstract: The characteristic signatures of massive neutrinos on large-scale structure (LSS), if fully captured, can be used to put a stringent constraint on their mass sum, $M_ν$. Previous work utilizing N-body simulations has shown the Minkowski functionals (MFs) of LSS can reveal the imprints of massive neutrinos on LSS, provide important complementary information to two-point statistics and significantly…

    Submitted 18 September, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: 38 pages, 10 figures, 5 tables. Accepted for publication in JCAP. This is the second in our series of work, the first is arXiv:2204.02945

  34. arXiv:2212.10405  [pdf, other]

    cs.CL cs.SI

    AnnoBERT: Effectively Representing Multiple Annotators' Label Choices to Improve Hate Speech Detection

    Authors: Wenjie Yin, Vibhor Agarwal, Aiqi Jiang, Arkaitz Zubiaga, Nishanth Sastry

    Abstract: Supervised approaches generally rely on majority-based labels. However, it is hard to achieve high agreement among annotators in subjective tasks such as hate speech detection. Existing neural network models principally regard labels as categorical variables, while ignoring the semantic information in diverse label texts. In this paper, we propose AnnoBERT, a first-of-its-kind architecture integra…

    Submitted 10 January, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: accepted at ICWSM 2023

    Journal ref: 17th International AAAI Conference on Web and Social Media (ICWSM 2023). Please cite accordingly

  35. arXiv:2211.11581  [pdf, other]

    cs.CY eess.SY

    Modeling 100% Electrified Transportation in NYC

    Authors: Jingrong Zhang, Amber Jiang, Brian Newborn, Sara Kou, Robert Mieth

    Abstract: Envisioning a future 100% electrified transportation sector, this paper uses socio-economic, demographic, and geographic data to assess electric energy demand from commuter traffic. We explore the individual mode choices, which allows to create mode-mix scenarios for the entire population, and quantify the electric energy demand for each scenario using technical specifications of battery and elect…

    Submitted 17 February, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Accepted for publication at the 2023 IEEE PES General Meeting

  36. arXiv:2211.08447  [pdf, other]

    cs.CL cs.SI

    SexWEs: Domain-Aware Word Embeddings via Cross-lingual Semantic Specialisation for Chinese Sexism Detection in Social Media

    Authors: Aiqi Jiang, Arkaitz Zubiaga

    Abstract: The goal of sexism detection is to mitigate negative online content targeting certain gender groups of people. However, the limited availability of labeled sexism-related datasets makes it problematic to identify online sexism for low-resource languages. In this paper, we address the task of automatic sexism detection in social media for one low-resource language -- Chinese. Rather than collecting…

    Submitted 30 March, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: accepted at ICWSM 2023

  37. arXiv:2210.12283  [pdf, other]

    cs.AI cs.LG

    Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

    Authors: Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample

    Abstract: The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we…

    Submitted 20 February, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  38. Tensor Hypercontraction Form of the Perturbative Triples Energy in Coupled-Cluster Theory

    Authors: Andy Jiang, Justin M. Turney, Henry F. Schaefer III

    Abstract: We present the working equations for a reduced-scaling method of evaluating the perturbative triples (T) energy in coupled-cluster theory, through the tensor hypercontraction (THC) of the triples amplitudes ($t_{ijk}^{abc}$). Through our method we can reduce the scaling of the (T) energy from the traditional O($N^{7}$) to a more modest O($N^{5}$). We also discuss implementation details to aid futu…

    Submitted 13 October, 2022; originally announced October 2022.

    Journal ref: Journal of Chemical Theory and Computation 2023

  39. arXiv:2208.03274  [pdf, other]

    cs.CL cs.LG

    A Holistic Approach to Undesired Content Detection in the Real World

    Authors: Todor Markov, Chong Zhang, Sandhini Agarwal, Tyna Eloundou, Teddy Lee, Steven Adler, Angela Jiang, Lilian Weng

    Abstract: We present a holistic approach to building a robust and useful natural language classification system for real-world content moderation. The success of such a system relies on a chain of carefully designed and executed steps, including the design of content taxonomies and labeling instructions, data quality control, an active learning pipeline to capture rare events, and a variety of methods to ma…

    Submitted 14 February, 2023; v1 submitted 5 August, 2022; originally announced August 2022.

    Comments: Oral presentation at AAAI-23

  40. arXiv:2208.02153  [pdf, ps, other]

    cs.DM

    Finding a Lower Bound for k-Unbounded Hamiltonian Cycles

    Authors: Albert R. Jiang

    Abstract: Methods to determine the existence of Hamiltonian Cycles in graphs have been extensively studied. However, little research has been done following cases when no Hamiltonian Cycle exists. Let a vertex be "unbounded" if it is visited more than once in a path. Furthermore, let a k-Unbounded Hamiltonian Cycle be a path with finite length that visits every vertex, has adjacent start and end vertices, a…

    Submitted 8 August, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

    Comments: 26 pages, 14 figures

  41. arXiv:2207.07361  [pdf, other]

    cs.CV

    Registration based Few-Shot Anomaly Detection

    Authors: Chaoqin Huang, Haoyan Guan, Aofan Jiang, Ya Zhang, Michael Spratling, Yan-Feng Wang

    Abstract: This paper considers few-shot anomaly detection (FSAD), a practical yet under-studied setting for anomaly detection (AD), where only a limited number of normal images are provided for each category at training. So far, existing FSAD studies follow the one-model-per-category learning paradigm used for standard AD, and the inter-category commonality has not been explored. Inspired by how humans dete…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: ECCV 2022 Oral; Code is available at https://github.com/MediaBrain-SJTU/RegAD

  42. arXiv:2206.09576  [pdf, other]

    cs.LG cs.AI math.OC

    FedSSO: A Federated Server-Side Second-Order Optimization Algorithm

    Authors: Xin Ma, Renyi Bao, Jinpeng Jiang, Yang Liu, Arthur Jiang, Jun Yan, Xin Liu, Zhisong Pan

    Abstract: In this work, we propose FedSSO, a server-side second-order optimization method for federated learning (FL). In contrast to previous works in this direction, we employ a server-side approximation for the Quasi-Newton method without requiring any training data from the clients. In this way, we not only shift the computation burden from clients to server, but also eliminate the additional communicat…

    Submitted 22 August, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

  43. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  44. arXiv:2206.03450  [pdf, other]

    cs.HC cs.CY

    A Trade-off-centered Framework of Content Moderation

    Authors: Jialun Aaron Jiang, Peipei Nie, Jed R. Brubaker, Casey Fiesler

    Abstract: Content moderation research typically prioritizes representing and addressing challenges for one group of stakeholders or communities in one type of context. While taking a focused approach is reasonable or even favorable for empirical case studies, it does not address how content moderation works in multiple contexts. Through a systematic literature review of 86 content moderation papers that doc…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: To appear in ACM TOCHI

    ACM Class: J.4; K.4.2

  45. arXiv:2205.12615  [pdf, ps, other]

    cs.LG cs.AI cs.LO cs.SE

    Autoformalization with Large Language Models

    Authors: Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy

    Abstract: Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects to…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: 44 pages

  46. arXiv:2205.10893  [pdf, other]

    cs.AI

    Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

    Authors: Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

    Abstract: In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and…

    Submitted 22 May, 2022; originally announced May 2022.

  47. arXiv:2204.05552  [pdf]

    stat.AP

    The Effects of Dynamic Learning and the Forgetting Process on an Optimizing Modelling for Full-Service Repair Pricing Contracts for Medical Devices

    Authors: Aiping Jiang, Lin Li, Xuemin Xu, David Y. C. Huang

    Abstract: In order to improve the profitability and customer service management of original equipment manufacturers (OEMs) in a market where full-service (FS) and on-call service (OS) co-exist, this article extends the optimizing modelling for pricing FS repair contracts with the effects of dynamic learning and forgetting. Along with considering autonomous learning in maintenance practice, this study also a…

    Submitted 12 April, 2022; originally announced April 2022.

  48. Probing massive neutrinos with the Minkowski functionals of large-scale structure

    Authors: Wei Liu, Aoxiang Jiang, Wenjuan Fang

    Abstract: Massive neutrinos suppress the growth of structure under their free-streaming scales. The effect is most prominent on small scales where the widely-used two-point statistics can no longer capture the full information. In this work, we study the signatures massive neutrinos leave on large-scale structure (LSS) as revealed by its morphological properties, which are fully described by $4$ Minkowski f…

    Submitted 15 June, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

    Comments: Accepted for publication in JCAP. Changes from the first version: add figure 10, and minor text revisions. Matches accepted version. 33 pages, 10 figures, 2 tables

  49. arXiv:2201.09857  [pdf, other]

    cs.LG

    STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence

    Authors: Liangliang Xu, Daoming Lyu, Yangchen Pan, Aiwen Jiang, Bo Liu

    Abstract: It remains challenging to deploy existing risk-averse approaches to real-world applications. The reasons are multi-fold, including the lack of global optimality guarantee and the necessity of learning from long-term consecutive trajectories. Long-term consecutive trajectories are prone to involving visiting hazardous states, which is a major concern in the risk-averse setting. This paper proposes…

    Submitted 22 July, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

  50. A Framework of Severity for Harmful Content Online

    Authors: Morgan Klaus Scheuerman, Jialun Aaron Jiang, Casey Fiesler, Jed R. Brubaker

    Abstract: The proliferation of harmful content on online social media platforms has necessitated empirical understandings of experiences of harm online and the development of practices for harm mitigation. Both understandings of harm and approaches to mitigating that harm, often through content moderation, have implicitly embedded frameworks of prioritization - what forms of harm should be researched, how p…

    Submitted 17 September, 2021; v1 submitted 9 August, 2021; originally announced August 2021.

    Comments: CSCW 2021; 33 pages

    Journal ref: Proc. ACM Hum.-Comput. Interact. 5, CSCW2, Article 368 (October 2021), 33 pages