Showing 1–6 of 6 results for author: Sudalairaj, S

  1. arXiv:2403.01081

    cs.CL cs.LG

    LAB: Large-Scale Alignment for ChatBots

    Authors: Shivchander Sudalairaj, Abhishek Bhandwaldar, Aldo Pareja, Kai Xu, David D. Cox, Akash Srivastava

    Abstract: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. Leveraging a taxonomy-guided synthetic data generation process and a multi-phase tuning framework, LAB significantly reduces reliance on expensive human annotations and proprietary models like GPT-…

    Submitted 29 April, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: Corresponding Author: Akash Srivastava. Equal Contribution: Shivchander Sudalairaj, Abhishek Bhandwaldar, Aldo Pareja, Akash Srivastava. Code: https://github.com/instructlab

  2. arXiv:2305.15538

    cs.LG cs.CR cs.DB cs.IT

    Post-processing Private Synthetic Data for Improving Utility on Selected Measures

    Authors: Hao Wang, Shivchander Sudalairaj, John Henning, Kristjan Greenewald, Akash Srivastava

    Abstract: Existing private synthetic data generation algorithms are agnostic to downstream tasks. However, end users may have specific requirements that the synthetic data must satisfy. Failure to meet these requirements could significantly reduce the utility of the data for downstream use. We introduce a post-processing technique that improves the utility of the synthetic data with respect to measures sele…

    Submitted 18 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  3. arXiv:2304.00601

    cs.CV cs.LG

    Constructive Assimilation: Boosting Contrastive Learning Performance through View Generation Strategies

    Authors: Ligong Han, Seungwook Han, Shivchander Sudalairaj, Charlotte Loh, Rumen Dangovski, Fei Deng, Pulkit Agrawal, Dimitris Metaxas, Leonid Karlinsky, Tsui-Wei Weng, Akash Srivastava

    Abstract: Transformations based on domain expertise (expert transformations), such as random-resized-crop and color-jitter, have proven critical to the success of contrastive learning techniques such as SimCLR. Recently, several attempts have been made to replace such domain-specific, human-designed transformations with generated views that are learned. However, for imagery data, so far none of these view-ge…

    Submitted 8 April, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: Accepted at Generative Models for Computer Vision Workshop 2023

  4. arXiv:2303.02484

    cs.LG cs.AI cs.CV

    Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries

    Authors: Charlotte Loh, Seungwook Han, Shivchander Sudalairaj, Rumen Dangovski, Kai Xu, Florian Wenzel, Marin Soljacic, Akash Srivastava

    Abstract: Deep ensembles (DE) have been successful in improving model performance by learning diverse members via the stochasticity of random initialization. While recent works have attempted to promote further diversity in DE via hyperparameters or regularizing loss functions, these methods primarily still rely on a stochastic approach to explore the hypothesis space. In this work, we present Multi-Symmetr…

    Submitted 19 June, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: Camera Ready Revision. ICML 2023

  5. arXiv:2210.15943

    cs.CV

    Grafting Vision Transformers

    Authors: Jongwoo Park, Kumara Kahatapitiya, Donghyun Kim, Shivchander Sudalairaj, Quanfu Fan, Michael S. Ryoo

    Abstract: Vision Transformers (ViTs) have recently become the state-of-the-art across many computer vision tasks. In contrast to convolutional networks (CNNs), ViTs enable global information sharing even within shallow layers of a network, i.e., among high-resolution features. However, this perk was later overlooked with the success of pyramid architectures such as Swin Transformer, which show better perfor…

    Submitted 3 April, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

  6. arXiv:2210.04783

    cs.LG cs.CV physics.app-ph

    On the Importance of Calibration in Semi-supervised Learning

    Authors: Charlotte Loh, Rumen Dangovski, Shivchander Sudalairaj, Seungwook Han, Ligong Han, Leonid Karlinsky, Marin Soljacic, Akash Srivastava

    Abstract: State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been highly successful in leveraging a mix of labeled and unlabeled data by combining techniques of consistency regularization and pseudo-labeling. During pseudo-labeling, the model's predictions on unlabeled data are used for training and thus, model calibration is important in mitigating confirmation bias. Yet, many SOTA methods…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: 24 pages