Showing 1–15 of 15 results for author: Liao, I

  1. arXiv:2406.07814  [pdf, other]

    cs.AI cs.CL cs.HC

    Collective Constitutional AI: Aligning a Language Model with Public Input

    Authors: Saffron Huang, Divya Siddarth, Liane Lovitt, Thomas I. Liao, Esin Durmus, Alex Tamkin, Deep Ganguli

    Abstract: There is growing consensus that language model (LM) developers should not be the sole deciders of LM behavior, creating a need for methods that enable the broader public to collectively shape the behavior of LM systems that affect them. To address this need, we present Collective Constitutional AI (CCAI): a multi-stage process for sourcing and integrating public input into LMs, from identifying a t…

    Submitted 11 June, 2024; originally announced June 2024.

    ACM Class: I.2.7; K.4.2

    Journal ref: Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. 1395-1417

  2. arXiv:2405.14860  [pdf, other]

    cs.LG

    Not All Language Model Features Are Linear

    Authors: Joshua Engels, Isaac Liao, Eric J. Michaud, Wes Gurnee, Max Tegmark

    Abstract: Recent work has proposed the linear representation hypothesis: that language models perform computation by manipulating one-dimensional representations of concepts ("features") in activation space. In contrast, we explore whether some language model representations may be inherently multi-dimensional. We begin by developing a rigorous definition of irreducible multi-dimensional features based on w…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Code and data at https://github.com/JoshEngels/MultiDimensionalFeatures

  3. arXiv:2404.08237  [pdf, other]

    cs.CV cs.AI

    IFViT: Interpretable Fixed-Length Representation for Fingerprint Matching via Vision Transformer

    Authors: Yuhang Qiu, Honghui Chen, Xingbo Dong, Zheng Lin, Iman Yi Liao, Massimo Tistarelli, Zhe Jin

    Abstract: Determining dense feature points on fingerprints used in constructing deep fixed-length representations for accurate matching, particularly at the pixel level, is of significant interest. To explore the interpretability of fingerprint matching, we propose a multi-stage interpretable fingerprint matching network, namely Interpretable Fixed-length Representation for Fingerprint Matching via Vision T…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: ready to submit to IEEE Transactions on Information Forensics and Security (TIFS)

  4. arXiv:2402.05110  [pdf, other]

    cs.LG

    Opening the AI black box: program synthesis via mechanistic interpretability

    Authors: Eric J. Michaud, Isaac Liao, Vedang Lad, Ziming Liu, Anish Mudide, Chloe Loughridge, Zifan Carl Guo, Tara Rezaei Kheirkhah, Mateja Vukelić, Max Tegmark

    Abstract: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. We test MIPS on a benchmark of 62 algorithmic tasks that can be learned by an RNN and find it highly complementary to GPT-4: MIPS solves 32 of them, including 13 that are not solved by G…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 24 pages

  5. arXiv:2312.03051  [pdf, other]

    cs.LG cs.AI cs.NE

    Generating Interpretable Networks using Hypernetworks

    Authors: Isaac Liao, Ziming Liu, Max Tegmark

    Abstract: An essential goal in mechanistic interpretability is to decode a network, i.e., to convert a neural network's raw weights to an interpretable algorithm. Given the difficulty of the decoding problem, progress has been made to understand the easier encoding problem, i.e., to convert an interpretable algorithm into network weights. Previous works focus on encoding existing algorithms into networks, whic…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 15 pages, 7 figures

    MSC Class: 68T07; ACM Class: I.2.6

  6. arXiv:2310.15129  [pdf, other]

    cs.CL cs.LG

    Location-Aware Visual Question Generation with Lightweight Models

    Authors: Nicholas Collin Suwono, Justin Chih-Yao Chen, Tun Min Hung, Ting-Hao Kenneth Huang, I-Bin Liao, Yung-Hui Li, Lun-Wei Ku, Shao-Hua Sun

    Abstract: This work introduces a novel task, location-aware visual question generation (LocaVQG), which aims to generate engaging questions from data relevant to a particular geographical location. Specifically, we represent such location-aware information with surrounding images and a GPS coordinate. To tackle this task, we present a dataset generation pipeline that leverages GPT-4 to produce diverse and s…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  7. arXiv:2310.13798  [pdf, other]

    cs.CL cs.AI

    Specific versus General Principles for Constitutional AI

    Authors: Sandipan Kundu, Yuntao Bai, Saurav Kadavath, Amanda Askell, Andrew Callahan, Anna Chen, Anna Goldie, Avital Balwit, Azalia Mirhoseini, Brayden McLean, Catherine Olsson, Cassie Evraets, Eli Tran-Johnson, Esin Durmus, Ethan Perez, Jackson Kernion, Jamie Kerr, Kamal Ndousse, Karina Nguyen, Nelson Elhage, Newton Cheng, Nicholas Schiefer, Nova DasSarma, Oliver Rausch, Robin Larson, et al. (11 additional authors not shown)

    Abstract: Human feedback can prevent overtly harmful utterances in conversational models, but may not automatically mitigate subtle problematic behaviors such as a stated desire for self-preservation or power. Constitutional AI offers an alternative, replacing human feedback with feedback from AI models conditioned only on a list of written principles. We find this approach effectively prevents the expressi…

    Submitted 20 October, 2023; originally announced October 2023.

  8. arXiv:2306.16388  [pdf, other]

    cs.CL cs.AI

    Towards Measuring the Representation of Subjective Global Opinions in Language Models

    Authors: Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli

    Abstract: Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues. In this paper, we develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to. We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across dif…

    Submitted 11 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

  9. arXiv:2303.15772  [pdf, other]

    cs.LG cs.AI cs.CY

    Ecosystem Graphs: The Social Footprint of Foundation Models

    Authors: Rishi Bommasani, Dilara Soylu, Thomas I. Liao, Kathleen A. Creel, Percy Liang

    Abstract: Foundation models (e.g. ChatGPT, StableDiffusion) pervasively influence society, warranting immediate social attention. While the models themselves garner much attention, to accurately characterize their impact, we must consider the broader sociotechnical ecosystem. We propose Ecosystem Graphs as a documentation framework to transparently centralize knowledge of this ecosystem. Ecosystem Graphs is…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Ecosystem Graphs available at https://crfm.stanford.edu/ecosystem-graphs/

  10. A Simple 2-Approximation for Maximum-Leaf Spanning Tree

    Authors: I-Cheng Liao, Hsueh-I Lu

    Abstract: For an $m$-edge connected simple graph $G$, finding a spanning tree of $G$ with the maximum number of leaves is MAXSNP-complete. The problem remains NP-complete even if $G$ is planar and the maximal degree of $G$ is at most four. Lu and Ravi gave the first known polynomial-time approximation algorithms with approximation factors $5$ and $3$. Later, they obtained a $3$-approximation algorithm that…

    Submitted 1 April, 2024; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: 10 pages, 4 figures, fixing typos of Equation (3)

    MSC Class: 05C38; 05C10; 05C85; 68P05

    Journal ref: International Journal of Foundations of Computer Science, 2023

  11. arXiv:2302.07459  [pdf, other]

    cs.CL

    The Capacity for Moral Self-Correction in Large Language Models

    Authors: Deep Ganguli, Amanda Askell, Nicholas Schiefer, Thomas I. Liao, Kamilė Lukošiūtė, Anna Chen, Anna Goldie, Azalia Mirhoseini, Catherine Olsson, Danny Hernandez, Dawn Drain, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jackson Kernion, Jamie Kerr, Jared Mueller, Joshua Landau, Kamal Ndousse, Karina Nguyen, Liane Lovitt, Michael Sellitto, Nelson Elhage, Noemi Mercado, Nova DasSarma, et al. (24 additional authors not shown)

    Abstract: We test the hypothesis that language models trained with reinforcement learning from human feedback (RLHF) have the capability to "morally self-correct" -- to avoid producing harmful outputs -- if instructed to do so. We find strong evidence in support of this hypothesis across three different experiments, each of which reveals different facets of moral self-correction. We find that the capability…

    Submitted 18 February, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

  12. arXiv:2210.06171  [pdf, other]

    cs.LG

    Learning to Optimize Quasi-Newton Methods

    Authors: Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić

    Abstract: Fast gradient-based optimization algorithms have become increasingly essential for the computationally efficient training of machine learning models. One technique is to multiply the gradient by a preconditioner matrix to produce a step, but it is unclear what the best preconditioner matrix is. This paper introduces a novel machine learning optimizer called LODO, which tries to online meta-learn t…

    Submitted 11 September, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    ACM Class: I.2.6

  13. arXiv:2104.12040  [pdf, ps, other]

    cs.LG

    Balancing Accuracy and Latency in Multipath Neural Networks

    Authors: Mohammed Amer, Tomás Maul, Iman Yi Liao

    Abstract: The growing capacity of neural networks has strongly contributed to their success at complex machine learning tasks and the computational demand of such large models has, in turn, stimulated a significant improvement in the hardware necessary to accelerate their computations. However, models with high latency aren't suitable for limited-resource environments such as hand-held and IoT devices. Henc…

    Submitted 24 April, 2021; originally announced April 2021.

  14. arXiv:2012.15025  [pdf]

    eess.IV cs.CV

    A Review of Machine Learning Techniques for Applied Eye Fundus and Tongue Digital Image Processing with Diabetes Management System

    Authors: Wei Xiang Lim, Zhiyuan Chen, Amr Ahmed, Tissa Chandesa, Iman Liao

    Abstract: Diabetes is a global epidemic and it is increasing at an alarming rate. The International Diabetes Federation (IDF) projected that the total number of people with diabetes globally may increase by 48%, from 425 million (year 2017) to 629 million (year 2045). Moreover, diabetes has caused millions of deaths and the number is increasing drastically. Therefore, this paper addresses the background of…

    Submitted 29 December, 2020; originally announced December 2020.

    Comments: This paper is published in The International Conference on Digital Image and Signal Processing (DISP 2019), Oxford, United Kingdom

  15. arXiv:1907.03698  [pdf, other]

    cs.LG cs.CV cs.MM stat.ML

    TrackNet: A Deep Learning Network for Tracking High-speed and Tiny Objects in Sports Applications

    Authors: Yu-Chuan Huang, I-No Liao, Ching-Hsuan Chen, Tsì-Uí İk, Wen-Chih Peng

    Abstract: Ball trajectory data are among the most fundamental and useful pieces of information in the evaluation of players' performance and analysis of game strategies. Although vision-based object tracking techniques have been developed to analyze sport competition videos, it is still challenging to recognize and position a high-speed and tiny ball accurately. In this paper, we develop a deep learning network, cal…

    Submitted 8 July, 2019; originally announced July 2019.