Showing 1–29 of 29 results for author: Jiang, E

  1. arXiv:2406.18532

    cs.CL cs.AI cs.LG

    Symbolic Learning Enables Self-Evolving Agents

    Authors: Wangchunshu Zhou, Yixin Ou, Shengwei Ding, Long Li, Jialong Wu, Tiannan Wang, Jiamin Chen, Shuai Wang, Xiaohua Xu, Ningyu Zhang, Huajun Chen, Yuchen Eleanor Jiang

    Abstract: The AI community has been exploring a pathway to artificial general intelligence (AGI) by developing "language agents", which are complex large language models (LLMs) pipelines involving both prompting techniques and tool usage methods. While language agents have demonstrated impressive capabilities for many real-world tasks, a fundamental limitation of current language agents research is that the…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Code available at https://github.com/aiwaves-cn/agents

  2. arXiv:2404.13496

    math.NA cs.AI

    ODE-DPS: ODE-based Diffusion Posterior Sampling for Inverse Problems in Partial Differential Equation

    Authors: Enze Jiang, Jishen Peng, Zheng Ma, Xiong-Bin Yan

    Abstract: In recent years we have witnessed a growth in mathematics for deep learning, which has been used to solve inverse problems of partial differential equations (PDEs). However, most deep learning-based inversion methods either require paired data or necessitate retraining neural networks for modifications in the conditions of the inverse problem, significantly reducing the efficiency of inversion and…

    Submitted 20 April, 2024; originally announced April 2024.

  3. arXiv:2402.06827

    cs.LG cs.CR

    RAMP: Boosting Adversarial Robustness Against Multiple $l_p$ Perturbations

    Authors: Enyi Jiang, Gagandeep Singh

    Abstract: There is considerable work on improving robustness against adversarial attacks bounded by a single $l_p$ norm using adversarial training (AT). However, the multiple-norm robustness (union accuracy) of AT models is still low. We observe that simultaneously obtaining good union and clean accuracy is hard since there are tradeoffs between robustness against multiple $l_p$ perturbations, and accuracy/…

    Submitted 9 February, 2024; originally announced February 2024.

  4. arXiv:2401.17268

    cs.CL cs.AI cs.LG

    Weaver: Foundation Models for Creative Writing

    Authors: Tiannan Wang, Jiamin Chen, Qingrui Jia, Shuai Wang, Ruoyu Fang, Huilin Wang, Zhaowei Gao, Chunzhao Xie, Chuou Xu, Jihong Dai, Yibin Liu, Jialong Wu, Shengwei Ding, Long Li, Zhiwei Huang, Xinle Deng, Teng Yu, Gangan Ma, Han Xiao, Zixin Chen, Danjun Xiang, Yunxia Wang, Yuanyuan Zhu, Yi Xiao, Jing Wang, et al. (21 additional authors not shown)

    Abstract: This work introduces Weaver, our first family of large language models (LLMs) dedicated to content creation. Weaver is pre-trained on a carefully selected corpus that focuses on improving the writing capabilities of large language models. We then fine-tune Weaver for creative and professional writing purposes and align it to the preference of professional writers using a suite of novel methods for…

    Submitted 30 January, 2024; originally announced January 2024.

  5. arXiv:2401.05268

    cs.CL cs.AI cs.HC cs.LG cs.MA

    AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning

    Authors: Shuofei Qiao, Ningyu Zhang, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen

    Abstract: Language agents have achieved considerable performance on various complex question-answering tasks by planning with external tools. Despite the incessant exploration in this field, existing language agent systems still struggle with costly, non-reproducible data reliance and face the challenge of compelling a single model for multiple functions. To this end, we introduce AutoAct, an automatic agen…

    Submitted 26 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: ACL 2024

  6. arXiv:2312.03216

    cs.LG cs.AI

    SDSRA: A Skill-Driven Skill-Recombination Algorithm for Efficient Policy Learning

    Authors: Eric H. Jiang, Andrew Lizarraga

    Abstract: In this paper, we introduce a novel algorithm - the Skill-Driven Skill Recombination Algorithm (SDSRA) - an innovative framework that significantly enhances the efficiency of achieving maximum entropy in reinforcement learning tasks. We find that SDSRA achieves faster convergence compared to the traditional Soft Actor-Critic (SAC) algorithm and produces improved policies. By integrating skill-base…

    Submitted 5 December, 2023; originally announced December 2023.

  7. arXiv:2309.07870

    cs.CL

    Agents: An Open-source Framework for Autonomous Language Agents

    Authors: Wangchunshu Zhou, Yuchen Eleanor Jiang, Long Li, Jialong Wu, Tiannan Wang, Shi Qiu, Jintian Zhang, Jing Chen, Ruipu Wu, Shuai Wang, Shiding Zhu, Jiyu Chen, Wentao Zhang, Xiangru Tang, Ningyu Zhang, Huajun Chen, Peng Cui, Mrinmaya Sachan

    Abstract: Recent advances on large language models (LLMs) enable researchers and developers to build autonomous language agents that can automatically solve various tasks and interact with environments, humans, and other agents using natural language interfaces. We consider language agents as a promising direction towards artificial general intelligence and release Agents, an open-source library with the go…

    Submitted 11 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Code available at https://github.com/aiwaves-cn/agents

  8. arXiv:2308.05282

    cs.CR

    Decentralized Finance (DeFi): A Survey

    Authors: Erya Jiang, Bo Qin, Qin Wang, Zhipeng Wang, Qianhong Wu, Jian Weng, Xinyu Li, Chenyang Wang, Yuhang Ding, Yanran Zhang

    Abstract: Decentralized Finance (DeFi) is a new paradigm in the creation, distribution, and utilization of financial services via the integration of blockchain technology. Our research conducts a comprehensive introduction and meticulous classification of various DeFi applications. Beyond that, we thoroughly analyze these risks from both technical and economic perspectives, spanning multiple layers. We poin…

    Submitted 30 November, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

  9. arXiv:2305.15067

    cs.CL

    Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References

    Authors: Tianyi Tang, Hongyuan Lu, Yuchen Eleanor Jiang, Haoyang Huang, Dongdong Zhang, Wayne Xin Zhao, Tom Kocmi, Furu Wei

    Abstract: Most research about natural language generation (NLG) relies on evaluation benchmarks with limited references for a sample, which may result in poor correlations with human judgements. The underlying reason is that one semantic meaning can actually be expressed in different forms, and the evaluation with a single or few references may not accurately reflect the quality of the model's hypotheses. T…

    Submitted 24 May, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted by NAACL 2024

  10. arXiv:2305.13304

    cs.CL cs.LG

    RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

    Authors: Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan

    Abstract: The fixed-size context of Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the Long Short-Term Memory mechanism in an LSTM. At each timestep, RecurrentGPT g…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Under review

  11. arXiv:2305.11170

    cs.CL cs.AI cs.LG

    Efficient Prompting via Dynamic In-Context Learning

    Authors: Wangchunshu Zhou, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

    Abstract: The primary way of building AI applications is shifting from training specialist models to prompting generalist models. A common practice for prompting generalist models, often referred to as in-context learning, is to append a few examples (demonstrations) to the prompt to help the model better understand the task. While effective, in-context learning can be inefficient because it makes the input…

    Submitted 18 May, 2023; originally announced May 2023.

  12. arXiv:2305.11142

    cs.CL

    Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus

    Authors: Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell

    Abstract: Several recent papers claim human parity at sentence-level Machine Translation (MT), especially in high-resource languages. Thus, in response, the MT community has, in part, shifted its focus to document-level translation. Translating documents requires a deeper understanding of the structure and meaning of text, which is often captured by various kinds of discourse phenomena such as consistency,…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 9 pages. arXiv admin note: substantial text overlap with arXiv:2210.14667

    Journal ref: ACL 2023

  13. arXiv:2304.14293

    cs.CL cs.AI cs.LG

    Controlled Text Generation with Natural Language Instructions

    Authors: Wangchunshu Zhou, Yuchen Eleanor Jiang, Ethan Wilcox, Ryan Cotterell, Mrinmaya Sachan

    Abstract: Large language models generate fluent texts and can follow natural language instructions to solve a wide range of tasks without task-specific training. Nevertheless, it is notoriously difficult to control their generation to satisfy the various constraints required by different applications. In this work, we present InstructCTG, a controlled text generation framework that incorporates different co…

    Submitted 8 June, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: ICML 2023

  14. arXiv:2302.05049

    cs.LG

    Principled Federated Domain Adaptation: Gradient Projection and Auto-Weighting

    Authors: Enyi Jiang, Yibo Jacky Zhang, Sanmi Koyejo

    Abstract: Federated Domain Adaptation (FDA) describes the federated learning (FL) setting where source clients and a server work collaboratively to improve the performance of a target client where limited data is available. The domain shift between the source and target domains, coupled with limited data of the target client, makes FDA a challenging problem, e.g., common techniques such as federated averagi…

    Submitted 24 March, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: ICLR 2024

  15. arXiv:2301.09008

    cs.CL

    Poor Man's Quality Estimation: Predicting Reference-Based MT Metrics Without the Reference

    Authors: Vilém Zouhar, Shehzaad Dhuliawala, Wangchunshu Zhou, Nico Daheim, Tom Kocmi, Yuchen Eleanor Jiang, Mrinmaya Sachan

    Abstract: Machine translation quality estimation (QE) predicts human judgements of a translation hypothesis without seeing the reference. State-of-the-art QE systems based on pretrained language models have been achieving remarkable correlations with human judgements yet they are computationally heavy and require human annotations, which are slow and expensive to create. To address these limitations, we def…

    Submitted 25 April, 2023; v1 submitted 21 January, 2023; originally announced January 2023.

    Comments: Accepted at EACL23 (main)

  16. BDTS: Blockchain-based Data Trading System

    Authors: Erya Jiang, Bo Qin, Qin Wang, Qianhong Wu, Sanxi Li, Wenchang Shi, Yingxin Bi, Wenyi Tang

    Abstract: Trading data through blockchain platforms is hard to achieve fair exchange. Reasons come from two folds: Firstly, guaranteeing fairness between sellers and consumers is a challenging task as the deception of any participating parties is risk-free. This leads to the second issue where judging the behavior of data executors (such as cloud service providers) among distrustful parties is impr…

    Submitted 31 October, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: ICICS 2023 (Best Paper Award)

    Journal ref: International Conference on Information and Communications Security, pp. 645-664. Singapore: Springer Nature Singapore, 2023

  17. arXiv:2210.14678

    cs.CL

    Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems

    Authors: Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

    Abstract: Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis of the structure of discourse. According to the theory, local coherence of discourse arises from the manner and extent to which successive utterances make reference to the same entities. In this paper, we investigate the connection between centering theory and modern coreference resolution systems. We provide an operationaliz…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: 11 pages

  18. arXiv:2210.14667

    cs.CL

    A Bilingual Parallel Corpus with Discourse Annotations

    Authors: Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell

    Abstract: Machine translation (MT) has almost achieved human parity at sentence-level translation. In response, the MT community has, in part, shifted its focus to document-level translation. However, the development of document-level MT systems is hampered by the lack of parallel document corpora. This paper describes BWB, a large parallel corpus first introduced in Jiang et al. (2022), along with an annot…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: 4 pages

  19. arXiv:2209.13647

    eess.SP cs.LG

    Deep learning based sferics recognition for AMT data processing in the dead band

    Authors: Enhua Jiang, Rujun Chen, Xinming Wu, Jianxin Liu, Debin Zhu, Weiqiang Liu

    Abstract: In the audio magnetotellurics (AMT) sounding data processing, the absence of sferic signals in some time ranges typically results in a lack of energy in the AMT dead band, which may cause unreliable resistivity estimate. We propose a deep convolutional neural network (CNN) to automatically recognize sferic signals from redundantly recorded data in a long time range and use them to compensate for t…

    Submitted 21 September, 2022; originally announced September 2022.

  20. A Structured Span Selector

    Authors: Tianyu Liu, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

    Abstract: Many natural language processing tasks, e.g., coreference resolution and semantic role labeling, require selecting text spans and making decisions about them. A typical approach to such tasks is to score all possible spans and greedily select spans for task-specific downstream processing. This approach, however, does not incorporate any inductive bias about what sort of spans ought to be selected,…

    Submitted 23 August, 2023; v1 submitted 8 May, 2022; originally announced May 2022.

    Comments: NAACL 2022 camera-ready

  21. arXiv:2203.06566

    cs.HC

    PromptChainer: Chaining Large Language Model Prompts through Visual Programming

    Authors: Tongshuang Wu, Ellen Jiang, Aaron Donsbach, Jeff Gray, Alejandra Molina, Michael Terry, Carrie J Cai

    Abstract: While LLMs can effectively help prototype single ML functionalities, many real-world applications involve complex tasks that cannot be easily handled via a single run of an LLM. Recent work has found that chaining multiple LLM runs together (with the output of one step being the input to the next) can help users accomplish these more complex tasks, and in a way that is perceived to be more transpa…

    Submitted 12 March, 2022; originally announced March 2022.

    Comments: CHI LBW 2022

  22. arXiv:2108.07732

    cs.PL cs.LG

    Program Synthesis with Large Language Models

    Authors: Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton

    Abstract: This paper explores the limits of the current generation of large language models for program synthesis in general purpose programming languages. We evaluate a collection of such models (with between 244M and 137B parameters) on two new benchmarks, MBPP and MathQA-Python, in both the few-shot and fine-tuning regimes. Our benchmarks are designed to measure the ability of these models to synthesize…

    Submitted 15 August, 2021; originally announced August 2021.

    Comments: Jacob and Augustus contributed equally

  23. arXiv:2108.04974

    cs.CR cs.LG

    SoK: How Robust is Image Classification Deep Neural Network Watermarking? (Extended Version)

    Authors: Nils Lukas, Edward Jiang, Xinda Li, Florian Kerschbaum

    Abstract: Deep Neural Network (DNN) watermarking is a method for provenance verification of DNN models. Watermarking should be robust against watermark removal attacks that derive a surrogate model that evades provenance verification. Many watermarking schemes that claim robustness have been proposed, but their robustness is only validated in isolation against a relatively small set of attacks. There is no…

    Submitted 10 August, 2021; originally announced August 2021.

  24. arXiv:2103.11878

    cs.CL cs.AI

    BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation

    Authors: Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Jian Yang, Haoyang Huang, Rico Sennrich, Ryan Cotterell, Mrinmaya Sachan, Ming Zhou

    Abstract: Standard automatic metrics, e.g. BLEU, are not reliable for document-level MT evaluation. They can neither distinguish document-level improvements in translation quality from sentence-level ones, nor identify the discourse phenomena that cause context-agnostic translations. This paper introduces a novel automatic metric BlonDe to widen the scope of automatic MT evaluation from sentence to document…

    Submitted 5 July, 2022; v1 submitted 22 March, 2021; originally announced March 2021.

    Comments: 9 pages, accepted to NAACL 2022

  25. arXiv:2101.00110

    cs.IT

    Task Offloading and Resource Allocation with Multiple CAPs and Selfish Users

    Authors: Eric Jiang, Meng-Hsi Chen, Ben Liang, Min Dong

    Abstract: In this work, we consider a multi-user mobile edge computing system with multiple computing access points (CAPs). Each mobile user has multiple dependent tasks that must be processed in a round-by-round schedule. In every round, a user may process their individual task locally, or choose to offload their task to one of the $M$ CAPs or the remote cloud server, in order to possibly reduce their proc…

    Submitted 2 March, 2021; v1 submitted 31 December, 2020; originally announced January 2021.

    Comments: 13 pages, 11 figures

  26. arXiv:2009.08026

    cs.GR cs.CV cs.LG

    ShapeAssembly: Learning to Generate Programs for 3D Shape Structure Synthesis

    Authors: R. Kenny Jones, Theresa Barton, Xianghao Xu, Kai Wang, Ellen Jiang, Paul Guerrero, Niloy J. Mitra, Daniel Ritchie

    Abstract: Manually authoring 3D shapes is difficult and time consuming; generative models of 3D shapes offer compelling alternatives. Procedural representations are one such possibility: they offer high-quality and editable results but are difficult to author and often produce outputs with limited diversity. On the other extreme are deep generative models: given enough data, they can learn to generate any c…

    Submitted 16 September, 2020; originally announced September 2020.

    Comments: Accepted to Siggraph Asia 2020; project page: https://rkjones4.github.io/shapeAssembly.html

  27. arXiv:2008.05122

    cs.CL

    The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models

    Authors: Ian Tenney, James Wexler, Jasmijn Bastings, Tolga Bolukbasi, Andy Coenen, Sebastian Gehrmann, Ellen Jiang, Mahima Pushkarna, Carey Radebaugh, Emily Reif, Ann Yuan

    Abstract: We present the Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models. We focus on core questions about model behavior: Why did my model make this prediction? When does it perform poorly? What happens under a controlled change in the input? LIT integrates local explanations, aggregate analysis, and counterfactual generation into a streamline…

    Submitted 12 August, 2020; originally announced August 2020.

  28. arXiv:1812.06237

    astro-ph.IM astro-ph.HE cs.HC

    Walking Through an Exploded Star: Rendering Supernova Remnant Cassiopeia A into Virtual Reality

    Authors: Kimberly K. Arcand, Elaine Jiang, Sara Price, Megan Watzke, Tom Sgouros, Peter Edmonds

    Abstract: NASA and other astrophysical data of the Cassiopeia A supernova remnant have been rendered into a three-dimensional virtual reality (VR) and augmented reality (AR) program, the first of its kind. This data-driven experience of a supernova remnant allows viewers to walk inside the leftovers from the explosion of a massive star, select the parts of the supernova remnant to engage with, and access de…

    Submitted 15 December, 2018; originally announced December 2018.

    Comments: 20 pages, 6 figures

    Journal ref: Communicating Astronomy with the Public Journal Volume 1, Issue 24 (2018): 17

  29. arXiv:1811.04968

    quant-ph cs.ET cs.LG physics.comp-ph

    PennyLane: Automatic differentiation of hybrid quantum-classical computations

    Authors: Ville Bergholm, Josh Izaac, Maria Schuld, Christian Gogolin, Shahnawaz Ahmed, Vishnu Ajith, M. Sohaib Alam, Guillermo Alonso-Linaje, B. AkashNarayanan, Ali Asadi, Juan Miguel Arrazola, Utkarsh Azad, Sam Banning, Carsten Blank, Thomas R Bromley, Benjamin A. Cordier, Jack Ceroni, Alain Delgado, Olivia Di Matteo, Amintor Dusko, Tanya Garg, Diego Guala, Anthony Hayes, Ryan Hill, Aroosa Ijaz, et al. (43 additional authors not shown)

    Abstract: PennyLane is a Python 3 software framework for differentiable programming of quantum computers. The library provides a unified architecture for near-term quantum computing devices, supporting both qubit and continuous-variable paradigms. PennyLane's core feature is the ability to compute gradients of variational quantum circuits in a way that is compatible with classical techniques such as backpro…

    Submitted 29 July, 2022; v1 submitted 12 November, 2018; originally announced November 2018.

    Comments: Code available at https://github.com/XanaduAI/pennylane/. Significant contributions to the code (new features, new plugins, etc.) will be recognized by the opportunity to be a co-author on this paper