Showing 1–13 of 13 results for author: Pavel, A

  1. Making Short-Form Videos Accessible with Hierarchical Video Summaries

    Authors: Tess Van Daele, Akhil Iyer, Yuning Zhang, Jalyn C. Derry, Mina Huh, Amy Pavel

    Abstract: Short videos on platforms such as TikTok, Instagram Reels, and YouTube Shorts (i.e. short-form videos) have become a primary source of information and entertainment. Many short-form videos are inaccessible to blind and low vision (BLV) viewers due to their rapid visual changes, on-screen text, and music or meme-audio overlays. In our formative study, 7 BLV viewers who regularly watched short-form…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: To appear at CHI 2024

  2. arXiv:2310.07057  [pdf, other]

    cs.HC

    Exploring Community-Driven Descriptions for Making Livestreams Accessible

    Authors: Daniel Killough, Amy Pavel

    Abstract: People watch livestreams to connect with others and learn about their hobbies. Livestreams feature multiple visual streams including the main video, webcams, on-screen overlays, and chat, all of which are inaccessible to livestream viewers with visual impairments. While prior work explores creating audio descriptions for recorded videos, live videos present new challenges: authoring descriptions i…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: 13 pages; to appear in The 25th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2023)

  3. arXiv:2307.07589  [pdf, other]

    cs.HC

    GenAssist: Making Image Generation Accessible

    Authors: Mina Huh, Yi-Hao Peng, Amy Pavel

    Abstract: Blind and low vision (BLV) creators use images to communicate with sighted audiences. However, creating or retrieving images is challenging for BLV creators as it is difficult to use authoring tools or assess image search results. Thus, creators limit the types of images they create or recruit sighted collaborators. While text-to-image generation models let creators generate high-fidelity images b…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: For accessibility tagged pdf, please refer to the ancillary file

  4. arXiv:2303.05325  [pdf, other]

    cs.CV

    BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset

    Authors: Md. Istiak Hossain Shihab, Md. Rakibul Hasan, Mahfuzur Rahman Emon, Syed Mobassir Hossen, Md. Nazmuddoha Ansary, Intesur Ahmed, Fazle Rabbi Rakib, Shahriar Elahi Dhruvo, Souhardya Saha Dip, Akib Hasan Pavel, Marsia Haque Meghla, Md. Rezwanul Haque, Sayma Sultana Chowdhury, Farig Sadeque, Tahsin Reasat, Ahmed Imtiaz Humayun, Asif Shahriyar Sushmit

    Abstract: While strides have been made in deep learning based Bengali Optical Character Recognition (OCR) in the past decade, the absence of large Document Layout Analysis (DLA) datasets has hindered the application of OCR in document transcription, e.g., transcribing historical documents and newspapers. Moreover, rule-based DLA systems that are currently being employed in practice are not robust to domain…

    Submitted 5 May, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

  5. AVscript: Accessible Video Editing with Audio-Visual Scripts

    Authors: Mina Huh, Saelyne Yang, Yi-Hao Peng, Xiang 'Anthony' Chen, Young-Ho Kim, Amy Pavel

    Abstract: Sighted and blind and low vision (BLV) creators alike use videos to communicate with broad audiences. Yet, video editing remains inaccessible to BLV creators. Our formative study revealed that current video editing tools make it difficult to access the visual content, assess the visual quality, and efficiently navigate the timeline. We present AVscript, an accessible text-based video editor. AVscr…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: CHI 2023

  6. CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding

    Authors: Xingyu "Bruce" Liu, Ruolin Wang, Dingzeyu Li, Xiang 'Anthony' Chen, Amy Pavel

    Abstract: Authors make their videos visually accessible by adding audio descriptions (AD), and auditorily accessible by adding closed captions (CC). However, creating AD and CC is challenging and tedious, especially for non-professional describers and captioners, due to the difficulty of identifying accessibility problems in videos. A video author will have to watch the video through and manually check for…

    Submitted 23 August, 2022; originally announced August 2022.

  7. arXiv:2103.14491  [pdf]

    cs.HC

    Say It All: Feedback for Improving Non-Visual Presentation Accessibility

    Authors: Yi-Hao Peng, JiWoong Jang, Jeffrey P. Bigham, Amy Pavel

    Abstract: Presenters commonly use slides as visual aids for informative talks. When presenters fail to verbally describe the content on their slides, blind and visually impaired audience members lose access to necessary content, making the presentation difficult to follow. Our analysis of 90 presentation videos revealed that 72% of 610 visual elements (e.g., images, text) were insufficiently described. To h…

    Submitted 26 March, 2021; originally announced March 2021.

  8. Making Mobile Augmented Reality Applications Accessible

    Authors: Jaylin Herskovitz, Jason Wu, Samuel White, Amy Pavel, Gabriel Reyes, Anhong Guo, Jeffrey P. Bigham

    Abstract: Augmented Reality (AR) technology creates new immersive experiences in entertainment, games, education, retail, and social media. AR content is often primarily visual and it is challenging to enable access to it non-visually due to the mix of virtual and real-world content. In this paper, we identify common constituent tasks in AR by analyzing existing mobile AR applications for iOS, and character…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: 14 pages. 6 figures. Published in The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20)

  9. arXiv:2010.03667  [pdf]

    cs.HC

    Rescribe: Authoring and Automatically Editing Audio Descriptions

    Authors: Amy Pavel, Gabriel Reyes, Jeffrey P. Bigham

    Abstract: Audio descriptions make videos accessible to those who cannot see them by describing visual content in audio. Producing audio descriptions is challenging due to the synchronous nature of the audio description that must fit into gaps of other video content. An experienced audio description author will produce content that fits narration necessary to understand, enjoy, or experience the video conten…

    Submitted 7 October, 2020; originally announced October 2020.

  10. arXiv:2008.09075  [pdf]

    cs.CL cs.AI

    Controlling Dialogue Generation with Semantic Exemplars

    Authors: Prakhar Gupta, Jeffrey P. Bigham, Yulia Tsvetkov, Amy Pavel

    Abstract: Dialogue systems pretrained with large language models generate locally coherent responses, but lack the fine-grained control over responses necessary to achieve specific goals. A promising method to control response generation is exemplar-based generation, in which models edit exemplar responses that are retrieved from training data, or hand-written to strategically address discourse-level goals,…

    Submitted 25 March, 2021; v1 submitted 20 August, 2020; originally announced August 2020.

    Comments: Accepted at NAACL 2021

  11. arXiv:2007.07151  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Extracting Structured Data from Physician-Patient Conversations By Predicting Noteworthy Utterances

    Authors: Kundan Krishna, Amy Pavel, Benjamin Schloss, Jeffrey P. Bigham, Zachary C. Lipton

    Abstract: Despite diverse efforts to mine various modalities of medical data, the conversations between physicians and patients at the time of care remain an untapped source of insights. In this paper, we leverage this data to extract structured information that might assist physicians with post-visit documentation in electronic health records, potentially lightening the clerical burden. In this exploratory…

    Submitted 14 July, 2020; originally announced July 2020.

  12. arXiv:1907.10568  [pdf, other]

    cs.CL

    Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References

    Authors: Prakhar Gupta, Shikib Mehri, Tiancheng Zhao, Amy Pavel, Maxine Eskenazi, Jeffrey P. Bigham

    Abstract: The aim of this paper is to mitigate the shortcomings of automatic evaluation of open-domain dialog systems through multi-reference evaluation. Existing metrics have been shown to correlate poorly with human judgement, particularly in open-domain dialog. One alternative is to collect human annotations for evaluation, which can be expensive and time consuming. To demonstrate the effectiveness of mu…

    Submitted 8 September, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

    Comments: SIGDIAL 2019

  13. arXiv:1612.04335  [pdf, other]

    cs.CV

    How do people explore virtual environments?

    Authors: Vincent Sitzmann, Ana Serrano, Amy Pavel, Maneesh Agrawala, Diego Gutierrez, Belen Masia, Gordon Wetzstein

    Abstract: Understanding how people explore immersive virtual environments is crucial for many applications, such as designing virtual reality (VR) content, developing new compression algorithms, or learning computational models of saliency or visual attention. Whereas a body of recent work has focused on modeling saliency in desktop viewing conditions, VR is very different from these conditions in that view…

    Submitted 19 September, 2017; v1 submitted 13 December, 2016; originally announced December 2016.

    Comments: First two authors contributed equally