Showing 1–21 of 21 results for author: Ghosh, G

  1. arXiv:2405.01582  [pdf, other]

    cs.CL cs.AI cs.LG

    Text Quality-Based Pruning for Efficient Training of Language Models

    Authors: Vasu Sharma, Karthik Padthe, Newsha Ardalani, Kushal Tirumala, Russell Howes, Hu Xu, Po-Yao Huang, Shang-Wen Li, Armen Aghajanyan, Gargi Ghosh, Luke Zettlemoyer

    Abstract: In recent times, training Language Models (LMs) has relied on computationally heavy training over massive datasets, which makes this training process extremely laborious. In this paper we propose a novel method for numerically evaluating text quality in large unlabelled NLP datasets in a model-agnostic manner to assign the text instances a "quality score". By proposing the text quality metric, th…

    Submitted 10 May, 2024; v1 submitted 26 April, 2024; originally announced May 2024.

  2. arXiv:2402.09757  [pdf, ps, other]

    cs.IT math.CO

    Construction of CCC and ZCCS Through Additive Characters Over Galois Field

    Authors: Gobinda Ghosh, Sudhan Majhi, Subhabrata Paul

    Abstract: With the rapid progression in wireless communication technologies, especially in multicarrier code-division multiple access (MC-CDMA), there is a need for advanced code construction methods. Traditional approaches, mainly based on generalized Boolean functions, have limitations in code length versatility. This paper introduces a novel approach to constructing complete complementary codes (CCC) and Z-com…

    Submitted 18 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  3. arXiv:2309.16671  [pdf, other]

    cs.CV cs.CL

    Demystifying CLIP Data

    Authors: Hu Xu, Saining Xie, Xiaoqing Ellen Tan, Po-Yao Huang, Russell Howes, Vasu Sharma, Shang-Wen Li, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer

    Abstract: Contrastive Language-Image Pre-training (CLIP) is an approach that has advanced research and applications in computer vision, fueling modern recognition systems and generative models. We believe that the main ingredient to the success of CLIP is its data and not the model architecture or pre-training objective. However, CLIP only provides very limited information about its data and how it has been…

    Submitted 7 April, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 17 pages. arXiv admin note: text overlap with arXiv:2103.00020 by other authors

  4. arXiv:2309.02591  [pdf, other]

    cs.LG cs.CL cs.CV

    Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

    Authors: Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, et al. (2 additional authors not shown)

    Abstract: We present CM3Leon (pronounced "Chameleon"), a retrieval-augmented, token-based, decoder-only multi-modal language model capable of generating and infilling both text and images. CM3Leon uses the CM3 multi-modal architecture but additionally shows the extreme benefits of scaling up and tuning on more diverse instruction-style data. It is the first multi-modal model trained with a recipe adapted fr…

    Submitted 5 September, 2023; originally announced September 2023.

  5. arXiv:2305.11206  [pdf, other]

    cs.CL cs.AI cs.LG

    LIMA: Less Is More for Alignment

    Authors: Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy

    Abstract: Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervis…

    Submitted 18 May, 2023; originally announced May 2023.

  6. arXiv:2301.03294  [pdf, ps, other]

    cs.IT

    Construction of Optimal Binary Z-Complementary Code Sets with New Lengths

    Authors: Gobinda Ghosh, Sudhan Majhi, Shubabrata Paul

    Abstract: Z-complementary code sets (ZCCSs) are used in multicarrier code-division multiple access (MC-CDMA) systems, for interference-free communication over multiuser and quasi-asynchronous environments. In this letter, we propose three new constructions of optimal binary $\left(R2^{k+1},2^{k+1}, R\gamma,\gamma\right)$-ZCCS, $\left(R2^{k+1},2^{k+1}, R2^{m_{2}},2^{m_{2}}\right)$-ZCCS and…

    Submitted 22 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

  7. arXiv:2301.02400  [pdf, ps, other]

    cs.IT

    A Direct Construction of Optimal 2D-ZCACS with Flexible Array Size and Large Set Size

    Authors: Gobinda Ghosh, Sudhan Majhi, Shubhabrata Paul

    Abstract: In this paper, we propose a direct construction of optimal two-dimensional Z-complementary array code sets (2D-ZCACS) using multivariable functions (MVFs). In contrast to earlier works, the proposed construction allows for a flexible array size and a large set size. Additionally, the proposed design can be transformed into a one-dimensional Z-complementary code set (1D-ZCCS). Many of the 1D-ZCCS d…

    Submitted 6 January, 2023; originally announced January 2023.

  8. arXiv:2301.02241  [pdf, other]

    cs.CV cs.CL

    CiT: Curation in Training for Effective Vision-Language Data

    Authors: Hu Xu, Saining Xie, Po-Yao Huang, Licheng Yu, Russell Howes, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer

    Abstract: Large vision-language models are generally applicable to many downstream tasks, but come at an exorbitant training cost that only large institutions can afford. This paper trades generality for efficiency and presents Curation in Training (CiT), a simple and efficient vision-text learning algorithm that couples a data objective into training. CiT automatically yields quality data to speed-up contr…

    Submitted 5 January, 2023; originally announced January 2023.

    Comments: Technical Report

  9. arXiv:2212.08286  [pdf, other]

    cs.CL

    ALERT: Adapting Language Models to Reasoning Tasks

    Authors: Ping Yu, Tianlu Wang, Olga Golovneva, Badr AlKhamissi, Siddharth Verma, Zhijing Jin, Gargi Ghosh, Mona Diab, Asli Celikyilmaz

    Abstract: Current large language models can perform reasonably well on complex tasks that require step-by-step reasoning with few-shot learning. Are these models applying reasoning skills they have learnt during pre-training and reason outside of their training context, or are they simply memorizing their training corpus at finer granularity and have learnt to better understand their context? To tease apart…

    Submitted 7 July, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  10. arXiv:2212.08071  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    MAViL: Masked Audio-Video Learners

    Authors: Po-Yao Huang, Vasu Sharma, Hu Xu, Chaitanya Ryali, Haoqi Fan, Yanghao Li, Shang-Wen Li, Gargi Ghosh, Jitendra Malik, Christoph Feichtenhofer

    Abstract: We present Masked Audio-Video Learners (MAViL) to train audio-visual representations. Our approach learns with three complementary forms of self-supervision: (1) reconstruction of masked audio and video input data, (2) intra- and inter-modal contrastive learning with masking, and (3) self-training by reconstructing joint audio-video contextualized features learned from the first two objectives. Pr…

    Submitted 17 July, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: Technical report

  11. arXiv:2207.13395  [pdf, ps, other]

    cs.IT

    A Direct Construction of 2D-CCC with Arbitrary Array Size and Flexible Set Size Using Multivariable Function

    Authors: Gobinda Ghosh, Sudhan Majhi

    Abstract: Recently, two-dimensional (2D) array codes have been found to have applications in wireless communication. In this paper, we propose a direct construction of 2D complete complementary codes (2D-CCCs) with arbitrary array size and flexible set size using multivariable functions (MVF). The peak-to-mean envelope power ratio (PMEPR) properties of row and column sequences of the constructed 2D-CCC arrays…

    Submitted 28 February, 2024; v1 submitted 27 July, 2022; originally announced July 2022.

  12. arXiv:2201.07520  [pdf, other]

    cs.CL

    CM3: A Causal Masked Multimodal Model of the Internet

    Authors: Armen Aghajanyan, Bernie Huang, Candace Ross, Vladimir Karpukhin, Hu Xu, Naman Goyal, Dmytro Okhonko, Mandar Joshi, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer

    Abstract: We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens. Our new causally masked approach generates tokens left to right while also masking out a small number of long token spans that are generated at the end of the string, instead of their original positions. The causal masking obje…

    Submitted 19 January, 2022; originally announced January 2022.

  13. arXiv:2109.14084  [pdf, other]

    cs.CV cs.CL

    VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

    Authors: Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer

    Abstract: We present VideoCLIP, a contrastive approach to pre-train a unified model for zero-shot video and text understanding, without using any labels on downstream tasks. VideoCLIP trains a transformer for video and text by contrasting temporally overlapping positive video-text pairs with hard negatives from nearest neighbor retrieval. Our experiments on a diverse series of downstream tasks, including se…

    Submitted 1 October, 2021; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  14. Direct Construction of Optimal Z-Complementary Code Sets for all Possible Even Length by Using Pseudo-Boolean Functions

    Authors: Gobinda Ghosh, Sudhan Majhi, Palash Sarkar, Ashish Kumar Upadhyay

    Abstract: Z-complementary code sets (ZCCSs) are well known to be used in multicarrier code-division multiple access (MC-CDMA) systems to provide an interference-free environment. Based on the existing literature, direct constructions of optimal ZCCSs are limited in length. In this paper, we are interested in constructing optimal ZCCSs of all possible even lengths using Pseudo-Boolean functions. The maximu…

    Submitted 5 August, 2021; originally announced August 2021.

  15. arXiv:2107.06955  [pdf, ps, other]

    cs.CL cs.LG

    HTLM: Hyper-Text Pre-Training and Prompting of Language Models

    Authors: Armen Aghajanyan, Dmytro Okhonko, Mike Lewis, Mandar Joshi, Hu Xu, Gargi Ghosh, Luke Zettlemoyer

    Abstract: We introduce HTLM, a hyper-text language model trained on a large-scale web crawl. Modeling hyper-text has a number of advantages: (1) it is easily gathered at scale, (2) it provides rich document-level and end-task-adjacent supervision (e.g. class and id attributes often encode document category information), and (3) it allows for new structured prompting that follows the established semantics of…

    Submitted 14 July, 2021; originally announced July 2021.

  16. arXiv:2105.09996  [pdf, other]

    cs.CV cs.CL

    VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding

    Authors: Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer

    Abstract: We present a simplified, task-agnostic multi-modal pre-training approach that can accept either video or text input, or both, for a variety of end tasks. Existing pre-training approaches are task-specific, adopting either a single cross-modal encoder that requires both modalities, limiting their use for retrieval-style end tasks, or more complex multitask learning with two unimodal encoders, limiting early c…

    Submitted 30 September, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: 9 pages, ACL Findings 2021

  17. arXiv:2101.00117  [pdf, other]

    cs.CL

    Multi-task Retrieval for Knowledge-Intensive Tasks

    Authors: Jean Maillard, Vladimir Karpukhin, Fabio Petroni, Wen-tau Yih, Barlas Oğuz, Veselin Stoyanov, Gargi Ghosh

    Abstract: Retrieving relevant contexts from a large corpus is a crucial step for tasks such as open-domain question answering and fact checking. Although neural retrieval outperforms traditional methods like tf-idf and BM25, its performance degrades considerably when applied to out-of-domain data. Driven by the question of whether a neural retrieval model can be universal and perform robustly on a wide va…

    Submitted 31 December, 2020; originally announced January 2021.

  18. arXiv:2008.09015  [pdf, other]

    cs.CE math.NA

    A stabilized finite element method for delamination analysis of composites using cohesive elements

    Authors: Gourab Ghosh, Ravindra Duddu, Chandrasekhar Annavarapu

    Abstract: We demonstrate the ability of a stabilized finite element method, inspired by the weighted Nitsche approach, to alleviate spurious traction oscillations at interlaminar interfaces in multi-ply multi-directional composite laminates. In contrast with the standard (penalty-like) method, the stabilized method allows the use of arbitrarily large values of cohesive stiffness and obviates the need for en…

    Submitted 20 August, 2020; originally announced August 2020.

    Comments: 32 pages, 20 figures, submitted to computational mechanics journal

  19. arXiv:2006.15020  [pdf, other]

    cs.CL cs.LG stat.ML

    Pre-training via Paraphrasing

    Authors: Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer

    Abstract: We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual multi-document paraphrasing objective. MARGE provides an alternative to the dominant masked language modeling paradigm, where we self-supervise the reconstruction of target text by retrieving a set of related texts (in many languages) and conditioning on them to maximize the likelihood of genera…

    Submitted 26 June, 2020; originally announced June 2020.

  20. arXiv:1804.04410  [pdf, other]

    cs.IR

    Optimizing Query Evaluations using Reinforcement Learning for Web Search

    Authors: Corby Rosset, Damien Jose, Gargi Ghosh, Bhaskar Mitra, Saurabh Tiwary

    Abstract: In web search, typically a candidate generation step selects a small set of documents---from collections containing as many as billions of web pages---that are subsequently ranked and pruned before being presented to the user. In Bing, the candidate generation involves scanning the index using statically designed match plans that prescribe sequences of different match criteria and stopping conditi…

    Submitted 18 August, 2018; v1 submitted 12 April, 2018; originally announced April 2018.

    Comments: ACM SIGIR 2018 short paper (pre-print)

  21. arXiv:1709.04033  [pdf, other]

    cs.SI physics.soc-ph

    Local Community Detection in Dynamic Networks

    Authors: Daniel J. DiTursi, Gaurav Ghosh, Petko Bogdanov

    Abstract: Given a time-evolving network, how can we detect communities over periods of high internal and low external interactions? To address this question we generalize traditional local community detection in graphs to the setting of dynamic networks. Adopting existing static-network approaches in an "aggregated" graph of all temporal interactions is not appropriate for the problem as dynamic communities…

    Submitted 12 September, 2017; originally announced September 2017.

    Comments: extended version of paper in ICDM 2017