subscribe to arXiv mailings

Semantic In-Domain Product Identification for Search Queries

Authors: Sanat Sharma, Jayant Kumar, Twisha Naik, Zhaoyu Lu, Arvind Srikantan, Tracy Holloway King

Abstract: Accurate explicit and implicit product identification in search queries is critical for enhancing user experiences, especially at a company like Adobe which has over 50 products and covers queries across hundreds of tools. In this work, we present a novel approach to training a product classifier from user behavioral data. Our semantic model led to >25% relative improvement in CTR (click through r… ▽ More Accurate explicit and implicit product identification in search queries is critical for enhancing user experiences, especially at a company like Adobe which has over 50 products and covers queries across hundreds of tools. In this work, we present a novel approach to training a product classifier from user behavioral data. Our semantic model led to >25% relative improvement in CTR (click through rate) across the deployed surfaces; a >50% decrease in null rate; a 2x increase in the app cards surfaced, which helps drive product visibility. △ Less

Submitted 29 May, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

arXiv:2012.06238 [pdf, other]

Query Understanding for Natural Language Enterprise Search

Authors: Francisco Borges, Georgios Balikas, Marc Brette, Guillaume Kempf, Arvind Srikantan, Matthieu Landos, Darya Brazouskaya, Qianqian Shi

Abstract: Natural Language Search (NLS) extends the capabilities of search engines that perform keyword search allowing users to issue queries in a more "natural" language. The engine tries to understand the meaning of the queries and to map the query words to the symbols it supports like Persons, Organizations, Time Expressions etc.. It, then, retrieves the information that satisfies the user's need in dif… ▽ More Natural Language Search (NLS) extends the capabilities of search engines that perform keyword search allowing users to issue queries in a more "natural" language. The engine tries to understand the meaning of the queries and to map the query words to the symbols it supports like Persons, Organizations, Time Expressions etc.. It, then, retrieves the information that satisfies the user's need in different forms like an answer, a record or a list of records. We present an NLS system we implemented as part of the Search service of a major CRM platform. The system is currently in production serving thousands of customers. Our user studies showed that creating dynamic reports with NLS saved more than 50% of our user's time compared to achieving the same result with navigational search. We describe the architecture of the system, the particularities of the CRM domain as well as how they have influenced our design decisions. Among several submodules of the system we detail the role of a Deep Learning Named Entity Recognizer. The paper concludes with discussion over the lessons learned while developing this product. △ Less

Submitted 11 December, 2020; originally announced December 2020.

Comments: accepted at DeepNLP @ SIGIR 2020

arXiv:2010.09927 [pdf, other]

ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries

Authors: Karthik Radhakrishnan, Arvind Srikantan, Xi Victoria Lin

Abstract: Translating natural language utterances to executable queries is a helpful technique in making the vast amount of data stored in relational databases accessible to a wider range of non-tech-savvy end users. Prior work in this area has largely focused on textual input that is linguistically correct and semantically unambiguous. However, real-world user queries are often succinct, colloquial, and no… ▽ More Translating natural language utterances to executable queries is a helpful technique in making the vast amount of data stored in relational databases accessible to a wider range of non-tech-savvy end users. Prior work in this area has largely focused on textual input that is linguistically correct and semantically unambiguous. However, real-world user queries are often succinct, colloquial, and noisy, resembling the input of a search engine. In this work, we introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL) to achieve robust text-to-SQL modeling over natural language search (NLS) questions. Due to the lack of evaluation data, we curate a new dataset of NLS questions and demonstrate the efficacy of our approach. ColloQL's superior performance extends to well-formed text, achieving 84.9% (logical) and 90.7% (execution) accuracy on the WikiSQL dataset, making it, to the best of our knowledge, the highest performing model that does not use execution guided decoding. △ Less

Submitted 19 October, 2020; originally announced October 2020.

Comments: IntEx-SemPar Workshop at EMNLP 2020, 12 pages, 3 figures

Showing 1–3 of 3 results for author: Srikantan, A