Showing 1–15 of 15 results for author: Saravanan, K

  1. arXiv:2406.17159  [pdf, other]

    eess.AS cs.MM cs.SD

    Exploring compressibility of transformer based text-to-music (TTM) models

    Authors: Vasileios Moschopoulos, Thanasis Kotsiopoulos, Pablo Peso Parada, Konstantinos Nikiforidis, Alexandros Stergiadis, Gerasimos Papakostas, Md Asif Jalal, Jisi Zhang, Anastasios Drosou, Karthikeyan Saravanan

    Abstract: State-of-the-art Text-To-Music (TTM) generative AI models are large and require desktop or server class compute, making them infeasible for deployment on mobile phones. This paper presents an analysis of trade-offs between model compression and generation performance of TTM models. We study compression through knowledge distillation and specific modifications that enable applicability over the var…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Proceedings of INTERSPEECH 2024

  2. arXiv:2405.06368  [pdf, other]

    cs.LG cs.CR cs.DC

    DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation

    Authors: Jie Xu, Karthikeyan Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, Mete Ozay

    Abstract: Federated learning (FL) allows clients in an Internet of Things (IoT) system to collaboratively train a global model without sharing their local data with a server. However, clients' contributions to the server can still leak sensitive information. Differential privacy (DP) addresses such leakage by providing formal privacy guarantees, with mechanisms that add randomness to the clients' contributi…

    Submitted 28 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

    Comments: 16 pages, 10 figures, 5 tables

  3. arXiv:2401.13146  [pdf, other]

    eess.AS cs.CL cs.SD

    Locality enhanced dynamic biasing and sampling strategies for contextual ASR

    Authors: Md Asif Jalal, Pablo Peso Parada, George Pavlidis, Vasileios Moschopoulos, Karthikeyan Saravanan, Chrysovalantis-Giorgos Kontoulis, Jisi Zhang, Anastasios Drosou, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: Automatic Speech Recognition (ASR) still faces challenges when recognizing time-variant rare phrases. Contextual biasing (CB) modules bias the ASR model towards such contextually relevant phrases. During training, a list of biasing phrases is selected from a large pool of phrases following a sampling strategy. In this work we firstly analyse different sampling strategies to provide insights into the t…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted for IEEE ASRU 2023

  4. arXiv:2401.12085  [pdf, other]

    eess.AS cs.SD

    Consistency Based Unsupervised Self-training For ASR Personalisation

    Authors: Jisi Zhang, Vandana Rajan, Haaris Mehmood, David Tuckey, Pablo Peso Parada, Md Asif Jalal, Karthikeyan Saravanan, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: On-device Automatic Speech Recognition (ASR) models trained on speech data of a large population might underperform for individuals unseen during training. This is due to a domain shift between user data and the original training data, which differ in the user's speaking characteristics and environmental acoustic conditions. ASR personalisation is a solution that aims to exploit user data to improve model…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted for IEEE ASRU 2023

  5. BEAVIS: Balloon Enabled Aerial Vehicle for IoT and Sensing

    Authors: Suryansh Sharma, Ashutosh Simha, R. Venkatesha Prasad, Shubham Deshmukh, Kavin B. Saravanan, Ravi Ramesh, Luca Mottola

    Abstract: UAVs are becoming versatile and valuable platforms for various applications. However, the main limitation is their flying time. We present BEAVIS, a novel aerial robotic platform striking an unparalleled trade-off between the manoeuvrability of drones and the long lasting capacity of blimps. BEAVIS scores highly in applications where drones enjoy unconstrained mobility yet suffer from limited life…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: To be published in the 29th Annual International Conference on Mobile Computing and Networking (ACM MobiCom 23), October 2-6, 2023, Madrid, Spain. ACM, New York, NY, USA, 15 pages

  6. arXiv:2307.13343  [pdf, other]

    eess.AS cs.CR cs.SD

    On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer

    Authors: Md Asif Jalal, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Mete Ozay, Myoungji Han, Jung In Lee, Seokyeong Jung

    Abstract: Smart devices serviced by large-scale AI models necessitate user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task, Automatic Speech Recognition…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Proceedings of INTERSPEECH 2023

  7. arXiv:2207.04949  [pdf, ps, other]

    eess.AS cs.SD

    pMCT: Patched Multi-Condition Training for Robust Speech Recognition

    Authors: Pablo Peso Parada, Agnieszka Dobrowolska, Karthikeyan Saravanan, Mete Ozay

    Abstract: We propose a novel Patched Multi-Condition Training (pMCT) method for robust Automatic Speech Recognition (ASR). pMCT employs Multi-condition Audio Modification and Patching (MAMP) via mixing patches of the same utterance extracted from clean and distorted speech. Training using patch-modified signals improves robustness of models in noisy reverberant scenarios. Our proposed pMCT is evaluate…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Accepted at Interspeech 2022

  8. arXiv:2206.02797  [pdf, ps, other]

    eess.AS cs.AI cs.CL cs.CV cs.DC cs.LG

    FedNST: Federated Noisy Student Training for Automatic Speech Recognition

    Authors: Haaris Mehmood, Agnieszka Dobrowolska, Karthikeyan Saravanan, Mete Ozay

    Abstract: Federated Learning (FL) enables training state-of-the-art Automatic Speech Recognition (ASR) models on user devices (clients) in distributed systems, hence preventing transmission of raw user data to a central server. A key challenge facing practical adoption of FL for ASR is obtaining ground-truth labels on the clients. Existing approaches rely on clients to manually transcribe their speech, whic…

    Submitted 12 July, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted at Interspeech 2022

    ACM Class: I.2.11

  9. Semantic Annotation and Search for Educational Resources Supporting Distance Learning

    Authors: C. Nithya, K. Saravanan

    Abstract: Multimedia educational resources play an important role in education, particularly for distance learning environments. With the rapid growth of the multimedia web, large numbers of education articles video resources are increasingly being created by several different organizations. It is crucial to explore, share, reuse, and link these educational resources for better e-learning experiences. Most…

    Submitted 1 March, 2014; originally announced March 2014.

    Comments: Linked Data, Semantic search, Cloud Applications, Web services, Semantic annotation, Ontology

    Journal ref: IJETT V8(6),277-285 February 2014. ISSN:2231-5381

  10. arXiv:1402.2509  [pdf]

    cs.DC cs.IR

    Achieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services

    Authors: M. Subha, K. Saravanan

    Abstract: Building high quality cloud applications becomes an urgently required research problem. Nonfunctional performance of cloud services is usually described by quality-of-service (QoS). In cloud applications, cloud services are invoked remotely by internet connections. The QoS Ranking of cloud services for a user cannot be transferred directly to another user, since the locations of the cloud applicat…

    Submitted 11 February, 2014; originally announced February 2014.

    Comments: 6 pages, 10 figures, Published with International Journal of Engineering Trends and Technology (IJETT)

    Journal ref: International Journal of Engineering Trends and Technology (IJETT) 6(6):307-312, December 2013

  11. arXiv:1402.2491  [pdf]

    cs.DC

    Optimizing the Cost for Resource Subscription Policy in IaaS Cloud

    Authors: M. Uthaya Banu, K. Saravanan

    Abstract: Cloud computing allows users to efficiently and dynamically provision computing resources to meet their IT needs. Cloud providers offer two subscription plans to the customer, namely reservation and on-demand. The reservation plan is typically cheaper than the on-demand plan. If the actual computing demand is known in advance, reserving the resource would be straightforward. The challenge is how to mak…

    Submitted 11 February, 2014; originally announced February 2014.

    Comments: 6 pages, 8 figures, "Published with International Journal of Engineering Trends and Technology (IJETT)". M. Uthaya Banu, K. Saravanan. Article: Optimizing the Cost for Resource Subscription Policy in IaaS Cloud

    Journal ref: International Journal of Engineering Trends and Technology (IJETT) 6(6):296-301, December 2013

  12. arXiv:1210.2977  [pdf]

    cs.CR

    An Effective Fusion Technique of Cloud Computing and Networking Series

    Authors: K. Saravanan, S. Akshaya, R. Pavithra, K. Pushpavalli

    Abstract: Cloud computing is making it possible to separate the process of building an infrastructure for service provisioning from the business of providing end user services. Today, such infrastructures are normally provided in large data centres and the applications are executed remotely from the users. One reason for this is that cloud computing requires a reasonably stable infrastructure and networking…

    Submitted 10 October, 2012; originally announced October 2012.

  13. arXiv:1210.2971  [pdf]

    cs.CR

    A new application of Multi modal Biometrics in home and office security system

    Authors: K. Saravanan, C. Saranya, M. Saranya

    Abstract: Biometric door lock security systems are used in places that hold important information and belongings. Such places use multibiometric electronic door lock security systems based on fingerprint and iris recognition. Multibiometric door lock security systems are used to prevent door-related burglaries such as break-ins, which occur in different forms, so this is the best met…

    Submitted 10 October, 2012; originally announced October 2012.

  14. arXiv:1203.4649  [pdf]

    cs.CR

    A Novel Bluetooth Man-In-The-Middle Attack Based On SSP using OOB Association model

    Authors: K. Saravanan, L. Vijayanand, R. K. Negesh

    Abstract: As an interconnection technology, Bluetooth has to address all traditional security problems, well known from the distributed networks. Moreover, as Bluetooth networks are formed by the radio links, there are also additional security aspects whose impact is yet not well understood. In this paper, we propose a novel Man-In-The-Middle (MITM) attack against Bluetooth enabled mobile phone that support…

    Submitted 21 March, 2012; originally announced March 2012.

    Report number: EMICS12

  15. arXiv:1202.2024  [pdf]

    cs.NI

    Packet Score based network security and Traffic Optimization

    Authors: K. Saravanan, S. Karthik

    Abstract: One of the critical threats to internet security is Distributed Denial of Service (DDoS). By introducing automated online attack classification and attack packet discarding, this paper helps to resolve the network security issue to a certain extent. The incoming packets are assigned scores based on the priority associated with the attributes and on comparison with probability distribution of ar…

    Submitted 9 February, 2012; originally announced February 2012.