Showing 1–16 of 16 results for author: Qu, T

  1. Null Compliance: NYC Local Law 144 and the Challenges of Algorithm Accountability

    Authors: Lucas Wright, Roxana Mike Muenster, Briana Vecchione, Tianyao Qu, Pika, Cai, COMM/INFO 2450 Student Investigators, Jacob Metcalf, J. Nathan Matias

    Abstract: In July 2023, New York City became the first jurisdiction globally to mandate bias audits for commercial algorithmic systems, specifically for automated employment decisions systems (AEDTs) used in hiring and promotion. Local Law 144 (LL 144) requires AEDTs to be independently audited annually for race and gender bias, and the audit report must be publicly posted. Additionally, employers are oblig…

    Submitted 3 June, 2024; originally announced June 2024.

  2. arXiv:2403.09377  [pdf, other]

    cs.CV

    Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks

    Authors: Tingyu Qu, Tinne Tuytelaars, Marie-Francine Moens

    Abstract: Mainstream parameter-efficient fine-tuning (PEFT) methods, such as LoRA or Adapter, project a model's hidden states to a lower dimension, allowing pre-trained models to adapt to new data through this low-rank bottleneck. However, PEFT tasks involving multiple modalities, like vision-language (VL) tasks, require not only adaptation to new data but also learning the relationship between different mo…

    Submitted 12 July, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted at ECCV 2024

  3. arXiv:2312.17240  [pdf, other]

    cs.CV

    LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model

    Authors: Senqiao Yang, Tianyuan Qu, Xin Lai, Zhuotao Tian, Bohao Peng, Shu Liu, Jiaya Jia

    Abstract: While LISA effectively bridges the gap between segmentation and large language models to enable reasoning segmentation, it poses certain limitations: unable to distinguish different instances of the target region, and constrained by the pre-defined textual response formats. In this work, we introduce LISA++, an update to the existing LISA model, focusing on improving core functionalities while kee…

    Submitted 22 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Typo fixed

  4. arXiv:2312.17051  [pdf, other]

    cs.CV

    FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models

    Authors: Wan Xu, Tianyu Huang, Tianyu Qu, Guanglei Yang, Yiwen Guo, Wangmeng Zuo

    Abstract: Few-shot class-incremental learning (FSCIL) aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data. While the Contrastive Vision-Language Pre-Training (CLIP) model has been effective in addressing 2D few/zero-shot learning tasks, its direct application to 3D FSCIL faces limitations. These limitations arise from feature space misalignment and signif…

    Submitted 28 December, 2023; originally announced December 2023.

  5. arXiv:2311.10764  [pdf, other]

    cs.IR cs.AI

    Deep Group Interest Modeling of Full Lifelong User Behaviors for CTR Prediction

    Authors: Qi Liu, Xuyang Hou, Haoran Jin, Jin Chen, Zhe Wang, Defu Lian, Tan Qu, Jia Cheng, Jun Lei

    Abstract: Extracting users' interests from their lifelong behavior sequence is crucial for predicting Click-Through Rate (CTR). Most current methods employ a two-stage process for efficiency: they first select historical behaviors related to the candidate item and then deduce the user's interest from this narrowed-down behavior sub-sequence. This two-stage paradigm, though effective, leads to information lo…

    Submitted 15 November, 2023; originally announced November 2023.

  6. arXiv:2310.00029   

    cs.AI cs.GT cs.LG cs.RO

    Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

    Authors: Zhen Liu, Hang Gao, Hao Ma, Shuo Cai, Yunfeng Hu, Ting Qu, Hong Chen, Xun Gong

    Abstract: Autonomous vehicle (AV) evaluation has been the subject of increased interest in recent years both in industry and in academia. This paper focuses on the development of a novel framework for generating adversarial driving behavior of background vehicle interfering against the AV to expose effective and rational risky events. Specifically, the adversarial behavior is learned by a reinforcement lear…

    Submitted 14 October, 2023; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: We found an expression error in Section III.A. A corrected edition will be provided

  7. arXiv:2308.08325  [pdf, other]

    cs.CV

    Visually-Aware Context Modeling for News Image Captioning

    Authors: Tingyu Qu, Tinne Tuytelaars, Marie-Francine Moens

    Abstract: News Image Captioning aims to create captions from news articles and images, emphasizing the connection between textual context and visual elements. Recognizing the significance of human faces in news images and the face-name co-occurrence pattern in existing datasets, we propose a face-naming module for learning better name embeddings. Apart from names, which can be directly linked to an image ar…

    Submitted 21 March, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: Accepted at NAACL 2024 Main Conference

  8. arXiv:2308.06037  [pdf, other]

    cs.IR cs.AI

    Deep Context Interest Network for Click-Through Rate Prediction

    Authors: Xuyang Hou, Zhe Wang, Qi Liu, Tan Qu, Jia Cheng, Jun Lei

    Abstract: Click-Through Rate (CTR) prediction, estimating the probability of a user clicking on an item, is essential in industrial applications, such as online advertising. Many works focus on user behavior modeling to improve CTR prediction performance. However, most of those methods only model users' positive interests from users' click items while ignoring the context information, which is the display i…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM 2023

  9. arXiv:2306.17162  [pdf, other]

    cs.RO

    Can Machines Garden? Systematically Comparing the AlphaGarden vs. Professional Horticulturalists

    Authors: Simeon Adebola, Rishi Parikh, Mark Presten, Satvik Sharma, Shrey Aeron, Ananth Rao, Sandeep Mukherjee, Tomson Qu, Christina Wistrom, Eugen Solowjow, Ken Goldberg

    Abstract: The AlphaGarden is an automated testbed for indoor polyculture farming which combines a first-order plant simulator, a gantry robot, a seed planting algorithm, plant phenotyping and tracking algorithms, irrigation sensors and algorithms, and custom pruning tools and algorithms. In this paper, we systematically compare the performance of the AlphaGarden to professional horticulturalists on the staf…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: International Conference on Robotics and Automation (ICRA) 2023 Oral

  10. arXiv:2305.15583  [pdf, other]

    cs.CV

    Alleviating Exposure Bias in Diffusion Models through Sampling with Shifted Time Steps

    Authors: Mingxiao Li, Tingyu Qu, Ruicong Yao, Wei Sun, Marie-Francine Moens

    Abstract: Diffusion Probabilistic Models (DPM) have shown remarkable efficacy in the synthesis of high-quality images. However, their inference process characteristically requires numerous, potentially hundreds, of iterative steps, which could exaggerate the problem of exposure bias due to the training and inference discrepancy. Previous work has attempted to mitigate this issue by perturbing inputs during…

    Submitted 16 June, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted at International Conference on Learning Representations (ICLR 2024); typo correction

  11. arXiv:2210.16849  [pdf, other]

    cs.SD eess.AS

    TT-Net: Dual-path transformer based sound field translation in the spherical harmonic domain

    Authors: Yiwen Wang, Zijian Lan, Xihong Wu, Tianshu Qu

    Abstract: In the current method for the sound field translation tasks based on spherical harmonic (SH) analysis, the solution based on the additive theorem usually faces the problem of singular values caused by large matrix condition numbers. The influence of different distances and frequencies of the spherical radial function on the stability of the translation matrix will affect the accuracy of the SH coe…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  12. arXiv:2210.08957  [pdf, other]

    cs.CV

    Weakly Supervised Face Naming with Symmetry-Enhanced Contrastive Loss

    Authors: Tingyu Qu, Tinne Tuytelaars, Marie-Francine Moens

    Abstract: We revisit the weakly supervised cross-modal face-name alignment task; that is, given an image and a caption, we label the faces in the image with the names occurring in the caption. Whereas past approaches have learned the latent alignment between names and faces by uncertainty reasoning over a set of images and their respective captions, in this paper, we rely on appropriate loss functions to le…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted at IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

  13. arXiv:2207.10478  [pdf, other]

    cs.SD eess.AS

    Room geometry blind inference based on the localization of real sound source and first order reflections

    Authors: Shan Gao, Xihong Wu, Tianshu Qu

    Abstract: The conventional room geometry blind inference techniques with acoustic signals are conducted based on the prior knowledge of the environment, such as the room impulse response (RIR) or the sound source position, which will limit its application under unknown scenarios. To solve this problem, we have proposed a room geometry reconstruction method in this paper by using the geometric relation betwe…

    Submitted 22 July, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

  14. arXiv:2110.04850  [pdf, other]

    eess.AS cs.SD eess.SP

    Direct source and early reflections localization using deep deconvolution network under reverberant environment

    Authors: Shan Gao, Xihong Wu, Tianshu Qu

    Abstract: This paper proposes a deconvolution-based network (DCNN) model for DOA estimation of direct source and early reflections under reverberant scenarios. Considering that the first-order reflections of the sound source also contain spatial directivity like the direct source, we treat both of them as the sources in the learning process. We use the covariance matrix of high order Ambisonics (HOA) signal…

    Submitted 22 October, 2021; v1 submitted 10 October, 2021; originally announced October 2021.

  15. arXiv:2109.06094  [pdf, other]

    cs.CV cs.LG eess.IV

    Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data

    Authors: Yi Yang, Daoye Zhu, Tengteng Qu, Qiangyu Wang, Fuhu Ren, Chengqi Cheng

    Abstract: In this paper, we propose an efficient and generalizable framework based on deep convolutional neural network (CNN) for multi-source remote sensing data joint classification. While recent methods are mostly based on multi-stream architectures, we use group convolution to construct equivalent network architectures efficiently within a single-stream network. We further adopt and improve dynamic grou…

    Submitted 6 February, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

  16. arXiv:1910.09484  [pdf, other]

    eess.AS cs.SD eess.SP

    Modeling of Individual HRTFs based on Spatial Principal Component Analysis

    Authors: Mengfan Zhang, Zhongshu Ge, Tiejun Liu, Xihong Wu, Tianshu Qu

    Abstract: Head-related transfer function (HRTF) plays an important role in the construction of 3D auditory display. This paper presents an individual HRTF modeling method using deep neural networks based on spatial principal component analysis. The HRTFs are represented by a small set of spatial principal components combined with frequency and individual-dependent weights. By estimating the spatial principa…

    Submitted 5 February, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: 12 pages with 18 figures. This paper was published in IEEE/ACM Transactions on Audio, Speech and Language Processing. Copyright 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media

    Journal ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 28, No. 1, December 2020