Showing 1–1 of 1 results for author: Dardouri, T

  1. arXiv:2407.01558  [pdf, other]

    cs.HC cs.AI

    Graphical user interface agents optimization for visual instruction grounding using multi-modal artificial intelligence systems

    Authors: Tassnim Dardouri, Laura Minkova, Jessica López Espejel, Walid Dahhane, El Hassane Ettifouri

    Abstract: Most instance perception and image understanding solutions focus mainly on natural images. However, applications for synthetic images, and more specifically, images of Graphical User Interfaces (GUI) remain limited. This hinders the development of autonomous computer-vision-powered Artificial Intelligence (AI) agents. In this work, we present Search Instruction Coordinates or SIC, a multi-modal so…

    Submitted 5 May, 2024; originally announced July 2024.

    Comments: Preprint submitted to Engineering Applications of Artificial Intelligence journal