-
Changing Answer Order Can Decrease MMLU Accuracy
Authors:
Vipul Gupta,
David Pantoja,
Candace Ross,
Adina Williams,
Megan Ung
Abstract:
As large language models (LLMs) have grown in prevalence, particular benchmarks have become essential for the evaluation of these models and for understanding model capabilities. Most commonly, we use test accuracy averaged across multiple subtasks in order to rank models on leaderboards, to determine which model is best for our purposes. In this paper, we investigate the robustness of the accuracy measurement on a widely used multiple choice question answering dataset, MMLU. When shuffling the answer label contents, we find that all explored models decrease in accuracy on MMLU, but not every model is equally sensitive. These findings suggest a possible adjustment to the standard practice of leaderboard testing, where we additionally consider the percentage of examples each model answers correctly by random chance.
Submitted 27 June, 2024;
originally announced June 2024.
-
Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance
Authors:
Reyhane Askari Hemmat,
Melissa Hall,
Alicia Sun,
Candace Ross,
Michal Drozdzal,
Adriana Romero-Soriano
Abstract:
With the growing popularity of text-to-image generative models, there has been increasing focus on understanding their risks and biases. Recent work has found that state-of-the-art models struggle to depict everyday objects with the true diversity of the real world and have notable gaps between geographic regions. In this work, we aim to increase the diversity of generated images of common objects such that per-region variations are representative of the real world. We introduce an inference time intervention, contextualized Vendi Score Guidance (c-VSG), that guides the backwards steps of latent diffusion models to increase the diversity of a sample as compared to a "memory bank" of previously generated images while constraining the amount of variation within that of an exemplar set of real-world contextualizing images. We evaluate c-VSG with two geographically representative datasets and find that it substantially increases the diversity of generated images, both for the worst performing regions and on average, while simultaneously maintaining or improving image quality and consistency. Additionally, qualitative analyses reveal that diversity of generated images is significantly improved, including along the lines of reductive region portrayals present in the original model. We hope that this work is a step towards text-to-image generative models that reflect the true geographic diversity of the world.
Submitted 6 June, 2024;
originally announced June 2024.
-
An Introduction to Vision-Language Modeling
Authors:
Florian Bordes,
Richard Yuanzhe Pang,
Anurag Ajay,
Alexander C. Li,
Adrien Bardes,
Suzanne Petryk,
Oscar Mañas,
Zhiqiu Lin,
Anas Mahmoud,
Bargav Jayaraman,
Mark Ibrahim,
Melissa Hall,
Yunyang Xiong,
Jonathan Lebensold,
Candace Ross,
Srihari Jayakumar,
Chuan Guo,
Diane Bouchacourt,
Haider Al-Tahan,
Karthik Padthe,
Vasu Sharma,
Hu Xu,
Xiaoqing Ellen Tan,
Megan Richards,
Samuel Lavoie
, et al. (16 additional authors not shown)
Abstract:
Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technology. However, there are many challenges that need to be addressed to improve the reliability of those models. While language is discrete, vision evolves in a much higher dimensional space in which concepts cannot always be easily discretized. To better understand the mechanics behind mapping vision to language, we present this introduction to VLMs which we hope will help anyone who would like to enter the field. First, we introduce what VLMs are, how they work, and how to train them. Then, we present and discuss approaches to evaluate VLMs. Although this work primarily focuses on mapping images to language, we also discuss extending VLMs to videos.
Submitted 27 May, 2024;
originally announced May 2024.
-
GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases
Authors:
Zhizheng Wang,
Qiao Jin,
Chih-Hsuan Wei,
Shubo Tian,
Po-Ting Lai,
Qingqing Zhu,
Chi-Ping Day,
Christina Ross,
Zhiyong Lu
Abstract:
Gene set knowledge discovery is essential for advancing human functional genomics. Recent studies have shown promising performance by harnessing the power of Large Language Models (LLMs) on this task. Nonetheless, their results are subject to several limitations common in LLMs such as hallucinations. In response, we present GeneAgent, a first-of-its-kind language agent featuring self-verification capability. It autonomously interacts with various biological databases and leverages relevant domain knowledge to improve accuracy and reduce hallucination occurrences. Benchmarking on 1,106 gene sets from different sources, GeneAgent consistently outperforms standard GPT-4 by a significant margin. Moreover, a detailed manual review confirms the effectiveness of the self-verification module in minimizing hallucinations and generating more reliable analytical narratives. To demonstrate its practical utility, we apply GeneAgent to seven novel gene sets derived from mouse B2905 melanoma cell lines, with expert evaluations showing that GeneAgent offers novel insights into gene functions and subsequently expedites knowledge discovery.
Submitted 25 May, 2024;
originally announced May 2024.
-
Towards Geographic Inclusion in the Evaluation of Text-to-Image Models
Authors:
Melissa Hall,
Samuel J. Bell,
Candace Ross,
Adina Williams,
Michal Drozdzal,
Adriana Romero Soriano
Abstract:
Rapid progress in text-to-image generative models coupled with their deployment for visual content creation has magnified the importance of thoroughly evaluating their performance and identifying potential biases. In pursuit of models that generate images that are realistic, diverse, visually appealing, and consistent with the given prompt, researchers and practitioners often turn to automated metrics to facilitate scalable and cost-effective performance profiling. However, commonly-used metrics often fail to account for the full diversity of human preference; often even in-depth human evaluations face challenges with subjectivity, especially as interpretations of evaluation criteria vary across regions and cultures. In this work, we conduct a large, cross-cultural study to examine how much annotators in Africa, Europe, and Southeast Asia vary in their perception of geographic representation, visual appeal, and consistency in real and generated images from state-of-the-art public APIs. We collect over 65,000 image annotations and 20 survey responses. We contrast human annotations with common automated metrics, finding that human preferences vary notably across geographic location and that current metrics do not fully account for this diversity. For example, annotators in different locations often disagree on whether exaggerated, stereotypical depictions of a region are considered geographically representative. In addition, the utility of automatic evaluations is dependent on assumptions about their set-up, such as the alignment of feature extractors with human perception of object similarity or the definition of "appeal" captured in reference datasets used to ground evaluations. We recommend steps for improved automatic and human evaluations.
Submitted 7 May, 2024;
originally announced May 2024.
-
[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
Authors:
Leshem Choshen,
Ryan Cotterell,
Michael Y. Hu,
Tal Linzen,
Aaron Mueller,
Candace Ross,
Alex Warstadt,
Ethan Wilcox,
Adina Williams,
Chengxu Zhuang
Abstract:
After last year's successful BabyLM Challenge, the competition will be hosted again in 2024/2025. The overarching goals of the challenge remain the same; however, some of the competition rules will be different. The big changes for this year's competition are as follows: First, we replace the loose track with a paper track, which allows (for example) non-model-based submissions, novel cognitively-inspired benchmarks, or analysis techniques. Second, we are relaxing the rules around pretraining data, and will now allow participants to construct their own datasets provided they stay within the 100M-word or 10M-word budget. Third, we introduce a multimodal vision-and-language track, and will release a corpus of 50% text-only and 50% image-text multimodal data as a starting point for LM model training. The purpose of this CfP is to provide rules for this year's challenge, explain these rule changes and their rationale in greater detail, give a timeline of this year's competition, and provide answers to frequently asked questions from last year's challenge.
Submitted 9 April, 2024;
originally announced April 2024.
-
Heterogeneous Peridynamic Neural Operators: Discover Biotissue Constitutive Law and Microstructure From Digital Image Correlation Measurements
Authors:
Siavash Jafarzadeh,
Stewart Silling,
Lu Zhang,
Colton Ross,
Chung-Hao Lee,
S. M. Rakibur Rahman,
Shuodao Wang,
Yue Yu
Abstract:
Human tissues are highly organized structures with specific collagen fiber arrangements varying from point to point. The effects of such heterogeneity play an important role for tissue function, and hence it is critical to discover and understand the distribution of such fiber orientations from experimental measurements, such as the digital image correlation data. To this end, we introduce the heterogeneous peridynamic neural operator (HeteroPNO) approach, for data-driven constitutive modeling of heterogeneous anisotropic materials. The goal is to learn both a nonlocal constitutive law together with the material microstructure, in the form of a heterogeneous fiber orientation field, from loading field-displacement field measurements. To this end, we propose a two-phase learning approach. Firstly, we learn a homogeneous constitutive law in the form of a neural network-based kernel function and a nonlocal bond force, to capture complex homogeneous material responses from data. Then, in the second phase we reinitialize the learnt bond force and the kernel function, and train them together with a fiber orientation field for each material point. Owing to the state-based peridynamic skeleton, our HeteroPNO-learned material models are objective and have the balance of linear and angular momentum guaranteed. Moreover, the effects from heterogeneity and nonlinear constitutive relationship are captured by the kernel function and the bond force respectively, enabling physical interpretability. As a result, our HeteroPNO architecture can learn a constitutive model for a biological tissue with anisotropic heterogeneous response undergoing large deformation regime. Moreover, the framework is capable of providing displacement and stress field predictions for new and unseen loading instances.
Submitted 27 March, 2024;
originally announced March 2024.
-
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Authors:
Oscar Mañas,
Pietro Astolfi,
Melissa Hall,
Candace Ross,
Jack Urbanek,
Adina Williams,
Aishwarya Agrawal,
Adriana Romero-Soriano,
Michal Drozdzal
Abstract:
Impressive advances in text-to-image (T2I) generative models have yielded a plethora of high performing models which are able to generate aesthetically appealing, photorealistic images. Despite the progress, these models still struggle to produce images that are consistent with the input prompt, oftentimes failing to capture object quantities, relations and attributes properly. Existing solutions to improve prompt-image consistency suffer from the following challenges: (1) they oftentimes require model fine-tuning, (2) they only focus on nearby prompt samples, and (3) they are affected by unfavorable trade-offs among image quality, representation diversity, and prompt-image consistency. In this paper, we address these challenges and introduce a T2I optimization-by-prompting framework, OPT2I, which leverages a large language model (LLM) to improve prompt-image consistency in T2I models. Our framework starts from a user prompt and iteratively generates revised prompts with the goal of maximizing a consistency score. Our extensive validation on two datasets, MSCOCO and PartiPrompts, shows that OPT2I can boost the initial consistency score by up to 24.9% in terms of DSG score while preserving the FID and increasing the recall between generated and real data. Our work paves the way toward building more reliable and robust T2I systems by harnessing the power of LLMs.
Submitted 26 March, 2024;
originally announced March 2024.
-
Leveraging Diffusion Perturbations for Measuring Fairness in Computer Vision
Authors:
Nicholas Lui,
Bryan Chia,
William Berrios,
Candace Ross,
Douwe Kiela
Abstract:
Computer vision models have been known to encode harmful biases, leading to the potentially unfair treatment of historically marginalized groups, such as people of color. However, there remains a lack of datasets balanced along demographic traits that can be used to evaluate the downstream fairness of these models. In this work, we demonstrate that diffusion models can be leveraged to create such a dataset. We first use a diffusion model to generate a large set of images depicting various occupations. Subsequently, each image is edited using inpainting to generate multiple variants, where each variant refers to a different perceived race. Using this dataset, we benchmark several vision-language models on a multi-class occupation classification task. We find that images generated with non-Caucasian labels have a significantly higher occupation misclassification rate than images generated with Caucasian labels, and that several misclassifications are suggestive of racial biases. We measure a model's downstream fairness by computing the standard deviation in the probability of predicting the true occupation label across the different perceived identity groups. Using this fairness metric, we find significant disparities between the evaluated vision-and-language models. We hope that our work demonstrates the potential value of diffusion methods for fairness evaluations.
Submitted 11 February, 2024; v1 submitted 25 November, 2023;
originally announced November 2023.
-
Domain-wall skyrmion chain and domain-wall bimerons in chiral magnets
Authors:
Yuki Amari,
Calum Ross,
Muneto Nitta
Abstract:
We construct domain-wall skyrmion chains and domain-wall bimerons in chiral magnets with an out-of-plane easy-axis anisotropy and without a Zeeman term coupling to a magnetic field. Domain-wall skyrmions are skyrmions trapped inside a domain wall; they are present in the ferromagnetic (FM) phase of a chiral magnet with an out-of-plane easy-axis anisotropy. In this paper, we explore the stability of domain-wall skyrmions in the FM phase and in a chiral soliton lattice (CSL) or spiral phase, which is a periodic array of domain walls and anti-domain walls arranged in an alternating manner. In the FM phase, the worldline of a domain-wall skyrmion is bent to form a cusp at the position of the skyrmion. We describe such a cusp using both an analytic method and numerical solutions, and find a good agreement between them for small DM interactions. We show that the cusp grows toward the phase boundary with the CSL, and eventually diverges at the boundary. Second, if we put one skyrmion trapped inside a domain wall in a CSL, it decays into a pair of merons by a reconnection of the domain wall and its adjacent anti-domain wall. Third, if we put skyrmions and anti-skyrmions alternately in domain walls and anti-domain walls, respectively, such a chain is stable.
Submitted 9 November, 2023;
originally announced November 2023.
-
Role of isospin composition in low energy nuclear fusion
Authors:
Richard Gumbel,
Christian Ross,
A. S. Umar
Abstract:
We employ a microscopic approach that examines the impact of isospin dynamics on the process of low energy nuclear fusion along an isotope chain and dependence on deformation. Our method utilizes the density constrained time-dependent Hartree-Fock theory (DC-TDHF), where isoscalar and isovector characteristics of the energy density functional (EDF) are examined in turn. This approach is applied to a series of fusion interactions of $^{176}$Yb with increasingly neutron rich isotopes of Calcium. By evaluating the contributions from the isoscalar and isovector components of the EDF, we look to quantify the influence of isospin composition on the conditions under which fusion is most likely to take place. Our findings reveal that, in non-symmetric systems, the isovector dynamics play a significant role. Its typical effect is a reduction in the potential barrier, which turns into enhancement for neutron-rich systems.
Submitted 7 October, 2023;
originally announced October 2023.
-
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
Authors:
Lili Yu,
Bowen Shi,
Ramakanth Pasunuru,
Benjamin Muller,
Olga Golovneva,
Tianlu Wang,
Arun Babu,
Binh Tang,
Brian Karrer,
Shelly Sheynin,
Candace Ross,
Adam Polyak,
Russell Howes,
Vasu Sharma,
Puxin Xu,
Hovhannes Tamoyan,
Oron Ashual,
Uriel Singer,
Shang-Wen Li,
Susan Zhang,
Richard James,
Gargi Ghosh,
Yaniv Taigman,
Maryam Fazel-Zarandi,
Asli Celikyilmaz
, et al. (2 additional authors not shown)
Abstract:
We present CM3Leon (pronounced "Chameleon"), a retrieval-augmented, token-based, decoder-only multi-modal language model capable of generating and infilling both text and images. CM3Leon uses the CM3 multi-modal architecture but additionally shows the extreme benefits of scaling up and tuning on more diverse instruction-style data. It is the first multi-modal model trained with a recipe adapted from text-only language models, including a large-scale retrieval-augmented pre-training stage and a second multi-task supervised fine-tuning (SFT) stage. It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs. Extensive experiments demonstrate that this recipe is highly effective for multi-modal models. CM3Leon achieves state-of-the-art performance in text-to-image generation with 5x less training compute than comparable methods (zero-shot MS-COCO FID of 4.88). After SFT, CM3Leon can also demonstrate unprecedented levels of controllability in tasks ranging from language-guided image editing to image-controlled generation and segmentation.
Submitted 5 September, 2023;
originally announced September 2023.
-
FACET: Fairness in Computer Vision Evaluation Benchmark
Authors:
Laura Gustafson,
Chloe Rolland,
Nikhila Ravi,
Quentin Duval,
Aaron Adcock,
Cheng-Yang Fu,
Melissa Hall,
Candace Ross
Abstract:
Computer vision models have known performance disparities across attributes such as gender and skin tone. This means during tasks such as classification and detection, model performance differs for certain classes based on the demographics of the people in the image. These disparities have been shown to exist, but until now there has not been a unified approach to measure these differences for common use-cases of computer vision models. We present a new benchmark named FACET (FAirness in Computer Vision EvaluaTion), a large, publicly available evaluation set of 32k images for some of the most common vision tasks - image classification, object detection and segmentation. For every image in FACET, we hired expert reviewers to manually annotate person-related attributes such as perceived skin tone and hair type, manually draw bounding boxes and label fine-grained person-related classes such as disk jockey or guitarist. In addition, we use FACET to benchmark state-of-the-art vision models and present a deeper understanding of potential performance disparities and challenges across sensitive demographic attributes. With the exhaustive annotations collected, we probe models using single demographic attributes as well as multiple attributes using an intersectional approach (e.g. hair color and perceived skin tone). Our results show that classification, detection, segmentation, and visual grounding models exhibit performance disparities across demographic attributes and intersections of attributes. These harms suggest that not all people represented in datasets receive fair and equitable treatment in these vision tasks. We hope current and future results using our benchmark will contribute to fairer, more robust vision models. FACET is available publicly at https://facet.metademolab.com/
Submitted 31 August, 2023;
originally announced September 2023.
-
Temperature Evolution of Magnon Propagation Length in Tm$_3$Fe$_5$O$_{12}$ Thin Films: Roles of Magnetic Anisotropy and Gilbert Damping
Authors:
Amit Chanda,
Christian Holzmann,
Noah Schulz,
Aladin Ullrich,
Manfred Albrecht,
Miela J. Gross,
Caroline A. Ross,
Dario A. Arena,
Manh-Huong Phan,
Hariharan Srikanth
Abstract:
The magnon propagation length ($\langle\xi\rangle$) of a ferro/ferrimagnet (FM) is one of the key factors that controls the generation and propagation of thermally-driven spin current in FM/heavy metal (HM) bilayer based spincaloritronic devices. Theory predicts that for the FM layer, $\langle\xi\rangle$ is inversely proportional to the Gilbert damping ($\alpha$) and the square root of the effective magnetic anisotropy constant ($K_{\rm eff}$). However, direct experimental evidence of this relationship is lacking. To experimentally confirm this prediction, we employ a combination of longitudinal spin Seebeck effect (LSSE), transverse susceptibility, and ferromagnetic resonance experiments to investigate the temperature evolution of $\langle\xi\rangle$ and establish its correlation with the effective magnetic anisotropy field, $H_K^{\rm eff}$ ($\propto K_{\rm eff}$) and $\alpha$ in Tm$_3$Fe$_5$O$_{12}$ (TmIG)/Pt bilayers. We observe concurrent drops in the LSSE voltage and $\langle\xi\rangle$ below 200 K in TmIG/Pt bilayers regardless of TmIG film thickness and substrate choice and attribute it to the noticeable increases in $H_K^{\rm eff}$ and $\alpha$ that occur within the same temperature range. From the TmIG thickness dependence of the LSSE voltage, we determined the temperature dependence of $\langle\xi\rangle$ and highlighted its correlation with the temperature-dependent $H_K^{\rm eff}$ and $\alpha$ in TmIG/Pt bilayers, which will be beneficial for the development of rare-earth iron garnet-based efficient spincaloritronic nanodevices.
Submitted 13 February, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity
Authors:
Melissa Hall,
Candace Ross,
Adina Williams,
Nicolas Carion,
Michal Drozdzal,
Adriana Romero Soriano
Abstract:
The unprecedented photorealistic results achieved by recent text-to-image generative systems and their increasing use as plug-and-play content creation solutions make it crucial to understand their potential biases. In this work, we introduce three indicators to evaluate the realism, diversity and prompt-generation consistency of text-to-image generative systems when prompted to generate objects from across the world. Our indicators complement qualitative analysis of the broader impact of such systems by enabling automatic and efficient benchmarking of geographic disparities, an important step towards building responsible visual content creation systems. We use our proposed indicators to analyze potential geographic biases in state-of-the-art visual content creation systems and find that: (1) models have less realism and diversity of generations when prompting for Africa and West Asia than Europe, (2) prompting with geographic information comes at a cost to prompt-consistency and diversity of generated images, and (3) models exhibit more region-level disparities for some objects than others. Perhaps most interestingly, our indicators suggest that progress in image generation quality has come at the cost of real-world geographic representation. Our comprehensive evaluation constitutes a crucial step towards ensuring a positive experience of visual content creation for everyone.
Submitted 18 March, 2024; v1 submitted 11 August, 2023;
originally announced August 2023.
-
Multipliers and equivalence of functions, spaces, and operators
Authors:
Cristina Camara,
Carlos Carteiro,
William T. Ross
Abstract:
This paper offers a unified approach to determining when two generalized Toeplitz operators on $L^2$ are equivalent. This will be done through multipliers between closed subspaces of $L^2$. Our discussion will include Toeplitz operators (and their duals) on the Hardy space, Hankel operators, asymmetric truncated Toeplitz operators, and dual asymmetric truncated Toeplitz operators. Along the way, there will be a discussion of equivalence of functions and kernels of generalized Toeplitz operators and a generalization of the Brown-Halmos theorem for this class of operators.
Submitted 11 July, 2023;
originally announced July 2023.
-
GPT-4 Technical Report
Authors:
OpenAI,
Josh Achiam,
Steven Adler,
Sandhini Agarwal,
Lama Ahmad,
Ilge Akkaya,
Florencia Leoni Aleman,
Diogo Almeida,
Janko Altenschmidt,
Sam Altman,
Shyamal Anadkat,
Red Avila,
Igor Babuschkin,
Suchir Balaji,
Valerie Balcom,
Paul Baltescu,
Haiming Bao,
Mohammad Bavarian,
Jeff Belgum,
Irwan Bello,
Jake Berdine,
Gabriel Bernadett-Shapiro,
Christopher Berner,
Lenny Bogdonoff,
Oleg Boiko
, et al. (256 additional authors not shown)
Abstract:
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
Submitted 4 March, 2024; v1 submitted 15 March, 2023;
originally announced March 2023.
-
The Asymptotic Structure of the Centred Hyperbolic 2-Monopole Moduli Space
Authors:
Guido Franchetti,
Calum Ross
Abstract:
We construct an asymptotic metric on the moduli space of two centred hyperbolic monopoles by working in the point particle approximation, that is treating well-separated monopoles as point particles with an electric, magnetic and scalar charge and re-interpreting the dynamics of the 2-particle system as geodesic motion with respect to some metric. The corresponding analysis in the Euclidean case famously yields the negative mass Taub-NUT metric, which asymptotically approximates the $L^2$ metric on the moduli space of two Euclidean monopoles, the Atiyah-Hitchin metric. An important difference with the Euclidean case is that, due to the absence of Galilean symmetry, in the hyperbolic case it is not possible to factor out the centre of mass motion. Nevertheless we show that we can consistently restrict to a 3-dimensional configuration space by considering antipodal configurations. In complete parallel with the Euclidean case, the metric that we obtain is then the hyperbolic analogue of negative mass Taub-NUT. We also show how the metric obtained is related to the asymptotic form of a hyperbolic analogue of the Atiyah-Hitchin metric constructed by Hitchin.
Submitted 4 July, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
First-principles based Monte Carlo modeling of oxygen deficient Fe-substituted SrTiO$_3$ experimental magnetization
Authors:
Juan M. Florez,
Miguel A. Solis Miquio,
Emilio A. Cortés Estay,
Eric Suárez Morell,
Caroline A. Ross
Abstract:
Ferroics based on transition-metal (TM) substituted SrTiO$_{3}$ have attracted much attention, as magnetism and/or ferroelectricity can be tuned through cation substitution and defects, strain and/or oxygen deficiency. C. A. Ross et al. [Phys. Rev. Applied 7, 024006 (2017)] demonstrated the SrTi$_{1-x}$Fe$_{x}$O$_{3-δ}$ (STF) magnetization behavior for different deposition oxygen pressures, substrates and magnetic fields. The relation between oxygen deficiency and ferroic orders is not yet well understood, so the full potential of oxygen-stoichiometry engineered materials remains an open question. Here, we use hybrid-DFT to calculate different oxygen vacancy ($v_{o}$) states in STF with a variety of TM distributions. The resulting cations' magnetic states and alignments associated with the $v_{o}$ ground states for $x=\{0.125,0.25\}$ are used within a Monte Carlo scheme for collinear magnetism to simulate the spontaneous magnetization. Our model captures several experimental STF features, namely a maximum of the magnetization at an intermediate number of vacancies, a monotonic quenching from $\sim{0.35}μ{_{B}}$ for small $δ$, and a slower decrease of this saturation for a larger number of vacancies. Moreover, our approach gives further insight into the relations between defect stabilization and magnetization, vacancy density and the oxygen pressure required to maximize such ferroic order, and sets guidelines for future machine-learning-based computational synthesis of multiferroic oxides.
Submitted 23 February, 2023;
originally announced February 2023.
-
Towards Reliable Assessments of Demographic Disparities in Multi-Label Image Classifiers
Authors:
Melissa Hall,
Bobbie Chern,
Laura Gustafson,
Denisse Ventura,
Harshad Kulkarni,
Candace Ross,
Nicolas Usunier
Abstract:
Disaggregated performance metrics across demographic groups are a hallmark of fairness assessments in computer vision. These metrics successfully incentivized performance improvements on person-centric tasks such as face analysis and are used to understand risks of modern models. However, there is a lack of discussion on the vulnerabilities of these measurements for more complex computer vision tasks. In this paper, we consider multi-label image classification and, specifically, object categorization tasks. First, we highlight design choices and trade-offs for measurement that involve more nuance than discussed in prior computer vision literature. These challenges are related to the necessary scale of data, definition of groups for images, choice of metric, and dataset imbalances. Next, through two case studies using modern vision models, we demonstrate that naive implementations of these assessments are brittle. We identify several design choices that look merely like implementation details but significantly impact the conclusions of assessments, both in terms of magnitude and direction (on which group the classifiers work best) of disparities. Based on ablation studies, we propose some recommendations to increase the reliability of these assessments. Finally, through a qualitative analysis we find that concepts with large disparities tend to have varying definitions and representations between groups, with inconsistencies across datasets and annotators. While this result suggests avenues for mitigation through more consistent data collection, it also highlights that ambiguous label definitions remain a challenge when performing model assessments. Vision models are expanding and becoming more ubiquitous; it is even more important that our disparity assessments accurately reflect the true performance of models.
Submitted 16 February, 2023;
originally announced February 2023.
-
Vision-Language Models Performing Zero-Shot Tasks Exhibit Gender-based Disparities
Authors:
Melissa Hall,
Laura Gustafson,
Aaron Adcock,
Ishan Misra,
Candace Ross
Abstract:
We explore the extent to which zero-shot vision-language models exhibit gender bias for different vision tasks. Vision models traditionally required task-specific labels for representing concepts, as well as finetuning; zero-shot models like CLIP instead perform tasks with an open-vocabulary, meaning they do not need a fixed set of labels, by using text embeddings to represent concepts. With these capabilities in mind, we ask: Do vision-language models exhibit gender bias when performing zero-shot image classification, object detection and semantic segmentation? We evaluate different vision-language models with multiple datasets across a set of concepts and find (i) all models evaluated show distinct performance differences based on the perceived gender of the person co-occurring with a given concept in the image and that aggregating analyses over all concepts can mask these concerns; (ii) model calibration (i.e. the relationship between accuracy and confidence) also differs distinctly by perceived gender, even when evaluating on similar representations of concepts; and (iii) these observed disparities align with existing gender biases in word embeddings from language models. These findings suggest that, while language greatly expands the capability of vision tasks, it can also contribute to social biases in zero-shot vision settings. Furthermore, biases can further propagate when foundational models like CLIP are used by other models to enable zero-shot capabilities.
Submitted 26 January, 2023;
originally announced January 2023.
-
Coherent magnon-induced domain wall motion in a magnetic insulator channel
Authors:
Yabin Fan,
Miela J. Gross,
Takian Fakhrul,
Joseph Finley,
Justin T. Hou,
Luqiao Liu,
Caroline A. Ross
Abstract:
Advancing the development of spin-wave devices requires high-quality low-damping magnetic materials where magnon spin currents can propagate efficiently and interact effectively with local magnetic textures. We show that magnetic domain walls (DW) can modulate spin-wave transport in perpendicularly magnetized channels of Bi-doped yttrium-iron-garnet (BiYIG). Conversely, we demonstrate that the magnon spin current can drive DW motion in the BiYIG channel device by means of magnon spin-transfer torque. The DW can be reliably moved over 15 μm distances at zero applied magnetic field by a magnon spin current excited by an RF pulse as short as 1 ns. The required energy for driving DW motion is orders of magnitude smaller than those reported for metallic systems. These results facilitate low-switching-energy magnonic devices and circuits where magnetic domains can be efficiently reconfigured by magnon spin currents flowing within magnetic channels.
Submitted 2 December, 2022;
originally announced December 2022.
-
The POLARBEAR-2 and Simons Array Focal Plane Fabrication Status
Authors:
B. Westbrook,
P. A. R. Ade,
M. Aguilar,
Y. Akiba,
K. Arnold,
C. Baccigalupi,
D. Barron,
D. Beck,
S. Beckman,
A. N. Bender,
F. Bianchini,
D. Boettger,
J. Borrill,
S. Chapman,
Y. Chinone,
G. Coppi,
K. Crowley,
A. Cukierman,
T. de,
R. Dünner,
M. Dobbs,
T. Elleflot,
J. Errard,
G. Fabbian,
S. M. Feeney
, et al. (68 additional authors not shown)
Abstract:
We report on the status of POLARBEAR-2 A (PB2-A) focal plane fabrication. PB2-A is the first of three telescopes in the Simons Array (SA), which is an array of three cosmic microwave background (CMB) polarization sensitive telescopes located at the POLARBEAR (PB) site in Northern Chile. As successors to the PB experiment, the telescope and receiver combinations are named PB2-A, PB2-B, and PB2-C. PB2-A and -B will have nearly identical receivers operating at 90 and 150 GHz while PB2-C will house a receiver operating at 220 and 270 GHz. Each receiver contains a focal plane consisting of seven close-hex packed lenslet coupled sinuous antenna transition edge sensor bolometer arrays. Each array contains 271 dichroic optical pixels, each of which has four TES bolometers, for a total of 7588 detectors per receiver. We have produced a set of two types of candidate arrays for PB2-A. The first, which we call Version 11 (V11), uses silicon oxide (SiOx) for the transmission lines and a cross-over process for orthogonal polarizations. The second, which we call Version 13 (V13), uses silicon nitride (SiNx) for the transmission lines and a cross-under process for orthogonal polarizations. We have produced enough of each type of array to fully populate the focal plane of the PB2-A receiver. The average wirebond yield for V11 and V13 arrays is 93.2% and 95.6% respectively. The V11 arrays had a superconducting transition temperature (Tc) of 452 +/- 15 mK, a normal resistance (Rn) of 1.25 +/- 0.20 Ohms, and saturation powers of 5.2 +/- 1.0 pW and 13 +/- 1.2 pW for the 90 and 150 GHz bands respectively. The V13 arrays had a superconducting transition temperature (Tc) of 456 +/- 6 mK, a normal resistance (Rn) of 1.1 +/- 0.2 Ohms, and saturation powers of 10.8 +/- 1.8 pW and 22.9 +/- 2.6 pW for the 90 and 150 GHz bands respectively.
Submitted 8 October, 2022;
originally announced October 2022.
-
Effect of intense x-ray free-electron laser transient gratings on the magnetic domain structure of Tm:YIG
Authors:
Victor Ukleev,
Max Burian,
Sebastian Gliga,
C. A. F. Vaz,
Benedikt Rösner,
Danny Fainozzi,
Gediminas Seniutinas,
Adam Kubec,
Roman Mankowsky,
Henrik T. Lemke,
Ethan R. Rosenberg,
Caroline A. Ross,
Elisabeth Müller,
Christian David,
Cristian Svetina,
Urs Staub
Abstract:
Magnetic patterns can be controlled globally using fields or spin polarized currents. In contrast, the local control of the magnetization on the nanometer length scale remains challenging. Here, we demonstrate how magnetic domain patterns in a Tm-doped yttrium iron garnet (Tm:YIG) thin film with perpendicular magnetic anisotropy can be permanently and locally imprinted by high intensity photon pulses of a hard x-ray transient grating (XTG). Micromagnetic simulations provide a qualitative understanding of the observed changes in the orientation of magnetic domains in Tm:YIG and XTG-induced changes. The presented results offer a route for the local manipulation of the magnetic state using hard XTG.
Submitted 3 March, 2023; v1 submitted 8 August, 2022;
originally announced August 2022.
-
AG2U -- Autonomous Grading Under Uncertainties
Authors:
Yakov Miron,
Yuval Goldfracht,
Chana Ross,
Dotan Di Castro,
Itzik Klein
Abstract:
Surface grading, the process of leveling an uneven area containing pre-dumped sand piles, is an important task in the construction site pipeline. This labour-intensive process is often carried out by a dozer, a key machinery tool at any construction site. Current attempts to automate surface grading assume perfect localization. However, in real-world scenarios, this assumption fails, as agents are presented with imperfect perception, which leads to degraded performance. In this work, we address the problem of autonomous grading under uncertainties. First, we implement a simulation and a scaled real-world prototype environment to enable rapid policy exploration and evaluation in this setting. Second, we formalize the problem as a partially observable Markov decision process and train an agent capable of handling such uncertainties. We show, through rigorous experiments, that an agent trained under perfect localization will suffer degraded performance when presented with localization uncertainties. However, an agent trained using our method will develop a more robust policy for addressing such errors and, consequently, exhibit better grading performance.
Submitted 4 August, 2022;
originally announced August 2022.
-
Building ideal paraxial optical skyrmions using rational maps
Authors:
C. Cisowski,
S. Franke-Arnold,
C. Ross
Abstract:
We introduce a simple mathematical expression based on rational maps to construct ideal paraxial optical skyrmion fields, including Neel-type and Bloch-type skyrmions, anti-skyrmions, bimerons and multi-skyrmions, as well as skyrmion lattices. We review the rules that fully polarized paraxial light fields must obey to be considered optical skyrmions. This work provides guidelines for the experimental generation of general skyrmion fields, beyond conventional single skyrmion beams. This lays the foundation for the exploration of nucleation and annihilation mechanisms in multi-skyrmion fields.
Submitted 26 July, 2022;
originally announced July 2022.
-
Calorons and constituent monopoles
Authors:
Lorenzo Foscolo,
Calum Ross
Abstract:
We study anti-self-dual Yang-Mills instantons on $\mathbb{R}^{3}\times S^{1}$, also known as calorons, and their behaviour under collapse of the circle factor. In this limit, we make explicit the decomposition of calorons in terms of constituent pieces which are essentially charge $1$ monopoles. We give a gluing construction of calorons in terms of the constituents and use it to compute the dimension of the moduli space. The construction works uniformly for structure group an arbitrary compact semi-simple Lie group.
Submitted 16 August, 2023; v1 submitted 18 July, 2022;
originally announced July 2022.
-
Towards Autonomous Grading In The Real World
Authors:
Yakov Miron,
Chana Ross,
Yuval Goldfracht,
Chen Tessler,
Dotan Di Castro
Abstract:
In this work, we aim to tackle the problem of autonomous grading, where a dozer is required to flatten an uneven area. In addition, we explore methods for bridging the gap between a simulated environment and real scenarios. We design both a realistic physical simulation and a scaled real prototype environment mimicking the real dozer dynamics and sensory information. We establish heuristics and learning strategies in order to solve the problem. Through extensive experimentation, we show that although heuristics are capable of tackling the problem in a clean and noise-free simulated environment, they fail catastrophically when facing real world scenarios. As the heuristics are capable of successfully solving the task in the simulated environment, we show they can be leveraged to guide a learning agent which can generalize and solve the task both in simulation and in a scaled prototype environment.
Submitted 25 July, 2022; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Perturbation Augmentation for Fairer NLP
Authors:
Rebecca Qian,
Candace Ross,
Jude Fernandes,
Eric Smith,
Douwe Kiela,
Adina Williams
Abstract:
Unwanted and often harmful social biases are becoming ever more salient in NLP research, affecting both models and datasets. In this work, we ask whether training on demographically perturbed data leads to fairer language models. We collect a large dataset of human annotated text perturbations and train a neural perturbation model, which we show outperforms heuristic alternatives. We find that (i) language models (LMs) pre-trained on demographically perturbed corpora are typically more fair, (ii) LMs finetuned on perturbed GLUE datasets exhibit less demographic bias on downstream tasks, and (iii) fairness improvements do not come at the expense of performance on downstream tasks. Lastly, we discuss outstanding questions about how best to evaluate the (un)fairness of large language models. We hope that this exploration of neural demographic perturbation will help drive more improvement towards fairer NLP.
Submitted 12 October, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
-
Domain Wall Skyrmions in Chiral Magnets
Authors:
Calum Ross,
Muneto Nitta
Abstract:
Domain wall skyrmions are skyrmions trapped inside a domain wall. We investigate domain wall skyrmions in chiral magnets using a fully analytic approach. Treating the Dzyaloshinskii-Moriya (DM) interaction perturbatively, we construct the low-energy effective theory of a magnetic domain wall in an $O(3)$ sigma model with the DM interaction and an easy-axis potential term, yielding a sine-Gordon model. We then construct domain wall skyrmions as sine-Gordon solitons along the domain wall. We also construct domain wall skyrmions on top of a pair consisting of a domain wall and an anti-domain wall. One characteristic feature of domain wall skyrmions is that both skyrmions and anti-skyrmions are equally stable inside the domain wall, unlike in the bulk, where only one of them is stable.
Submitted 23 May, 2022;
originally announced May 2022.
-
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
Authors:
Tristan Thrush,
Ryan Jiang,
Max Bartolo,
Amanpreet Singh,
Adina Williams,
Douwe Kiela,
Candace Ross
Abstract:
We present a novel task and dataset for evaluating the ability of vision and language models to conduct visio-linguistic compositional reasoning, which we call Winoground. Given two images and two captions, the goal is to match them correctly - but crucially, both captions contain a completely identical set of words, only in a different order. The dataset was carefully hand-curated by expert annotators and is labeled with a rich set of fine-grained tags to assist in analyzing model performance. We probe a diverse range of state-of-the-art vision and language models and find that, surprisingly, none of them do much better than chance. Evidently, these models are not as skilled at visio-linguistic compositional reasoning as we might have hoped. We perform an extensive analysis to obtain insights into how future work might try to mitigate these models' shortcomings. We aim for Winoground to serve as a useful evaluation set for advancing the state of the art and driving further progress in the field. The dataset is available at https://huggingface.co/datasets/facebook/winoground.
Submitted 22 April, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
-
A Physics-Guided Neural Operator Learning Approach to Model Biological Tissues from Digital Image Correlation Measurements
Authors:
Huaiqian You,
Quinn Zhang,
Colton J. Ross,
Chung-Hao Lee,
Ming-Chen Hsu,
Yue Yu
Abstract:
We present a data-driven workflow for biological tissue modeling, which aims to predict the displacement field based on digital image correlation (DIC) measurements under unseen loading scenarios, without postulating a specific constitutive model form or requiring knowledge of the material microstructure. To this end, a material database is constructed from the DIC displacement tracking measurements of multiple biaxial stretching protocols on a porcine tricuspid valve anterior leaflet, with which we build a neural operator learning model. The material response is modeled as a solution operator from the loading to the resultant displacement field, with the material microstructure properties learned implicitly from the data and naturally embedded in the network parameters. Using various combinations of loading protocols, we compare the predictivity of this framework with finite element analysis based on the phenomenological Fung-type model. In in-distribution tests, our approach generalizes well to different loading conditions and outperforms conventional constitutive modeling by approximately one order of magnitude. When tested on out-of-distribution loading ratios, the neural operator learning approach becomes less effective. To improve the generalizability of our framework, we propose a physics-guided neural operator learning model that imposes partial physics knowledge. This method is shown to improve the model's extrapolative performance in the small-deformation regime. Our results demonstrate that with sufficient data coverage and/or guidance from partial physics constraints, the data-driven approach can be a more effective method for modeling biological materials than traditional constitutive modeling.
Submitted 1 April, 2022;
originally announced April 2022.
-
AI Gone Astray: Technical Supplement
Authors:
Janice Yang,
Ludvig Karstens,
Casey Ross,
Adam Yala
Abstract:
This study is a technical supplement to "AI gone astray: How subtle shifts in patient data send popular algorithms reeling, undermining patient safety." from STAT News, which investigates the effect of time drift on clinically deployed machine learning models. We use MIMIC-IV, a publicly available dataset, to train models that replicate commercial approaches by Dascena and Epic to predict the onset of sepsis, a deadly and yet treatable condition. We observe that some of these models degrade over time; most notably, an RNN built on Epic features degrades from a 0.729 AUC to a 0.525 AUC over a decade, leading us to investigate technical and clinical drift as root causes of this performance drop.
Submitted 28 February, 2022;
originally announced March 2022.
-
Learning Deep Implicit Fourier Neural Operators (IFNOs) with Applications to Heterogeneous Material Modeling
Authors:
Huaiqian You,
Quinn Zhang,
Colton J. Ross,
Chung-Hao Lee,
Yue Yu
Abstract:
Constitutive modeling based on continuum mechanics theory has been a classical approach for modeling the mechanical responses of materials. However, when constitutive laws are unknown or when defects and/or high degrees of heterogeneity are present, these classical models may become inaccurate. In this work, we propose to use data-driven modeling, which directly utilizes high-fidelity simulation and/or experimental measurements to predict a material's response without using conventional constitutive models. Specifically, the material response is modeled by learning the implicit mappings between loading conditions and the resultant displacement and/or damage fields, with the neural network serving as a surrogate for a solution operator. To model the complex responses due to material heterogeneity and defects, we develop a novel deep neural operator architecture, which we coin as the Implicit Fourier Neural Operator (IFNO). In the IFNO, the increment between layers is modeled as an integral operator to capture the long-range dependencies in the feature space. As the network gets deeper, the limit of IFNO becomes a fixed point equation that yields an implicit neural operator and naturally mimics the displacement/damage fields solving procedure in material modeling problems. We demonstrate the performance of our proposed method for a number of examples, including hyperelastic, anisotropic and brittle materials. As an application, we further employ the proposed approach to learn the material models directly from digital image correlation (DIC) tracking measurements, and show that the learned solution operators substantially outperform the conventional constitutive models in predicting displacement fields.
Submitted 15 March, 2022;
originally announced March 2022.
-
Higgs Factory Considerations
Authors:
J. A. Bagger,
B. C. Barish,
S. Belomestnykh,
P. C. Bhat,
J. E. Brau,
M. Demarteau,
D. Denisov,
S. C. Eno,
C. G. R. Geddes,
P. D. Grannis,
A. Hutton,
A. J. Lankford,
M. U. Liepe,
D. B. MacFarlane,
T. Markiewicz,
H. E. Montgomery,
J. R. Patterson,
M. Perelstein,
M. E. Peskin,
M. C. Ross,
J. Strube,
A. P. White,
G. W. Wilson
Abstract:
We discuss considerations that can be used to formulate recommendations for initiating a lepton collider project that would provide precision studies of the Higgs boson and related electroweak phenomena.
Submitted 17 March, 2022; v1 submitted 11 March, 2022;
originally announced March 2022.
-
Improved upper limit on degree-scale CMB B-mode polarization power from the 670 square-degree POLARBEAR survey
Authors:
The POLARBEAR Collaboration,
S. Adachi,
T. Adkins,
M. A. O. Aguilar Faúndez,
K. S. Arnold,
C. Baccigalupi,
D. Barron,
S. Chapman,
K. Cheung,
Y. Chinone,
K. T. Crowley,
T. Elleflot,
J. Errard,
G. Fabbian,
C. Feng,
T. Fujino,
N. Galitzki,
N. W. Halverson,
M. Hasegawa,
M. Hazumi,
H. Hirose,
L. Howe,
J. Ito,
O. Jeong,
D. Kaneko
, et al. (29 additional authors not shown)
Abstract:
We report an improved measurement of the degree-scale cosmic microwave background $B$-mode angular-power spectrum over 670 square-degree sky area at 150 GHz with POLARBEAR. In the original analysis of the data, errors in the angle measurement of the continuously rotating half-wave plate, a polarization modulator, caused significant data loss. By introducing an angle-correction algorithm, the data volume is increased by a factor of 1.8. We report a new analysis using the larger data set. We find the measured $B$-mode spectrum is consistent with the $Λ$CDM model with Galactic dust foregrounds. We estimate the contamination of the foreground by cross-correlating our data and Planck 143, 217, and 353 GHz measurements, where its spectrum is modeled as a power law in angular scale and a modified blackbody in frequency. We place an upper limit on the tensor-to-scalar ratio $r$ < 0.33 at 95% confidence level after marginalizing over the foreground parameters.
Submitted 15 June, 2022; v1 submitted 4 March, 2022;
originally announced March 2022.
-
CM3: A Causal Masked Multimodal Model of the Internet
Authors:
Armen Aghajanyan,
Bernie Huang,
Candace Ross,
Vladimir Karpukhin,
Hu Xu,
Naman Goyal,
Dmytro Okhonko,
Mandar Joshi,
Gargi Ghosh,
Mike Lewis,
Luke Zettlemoyer
Abstract:
We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens. Our new causally masked approach generates tokens left to right while also masking out a small number of long token spans that are generated at the end of the string, instead of their original positions. The causal masking objective provides a type of hybrid of the more common causal and masked language models, by enabling full generative modeling while also providing bidirectional context when generating the masked spans. We train causally masked language-image models on large-scale web and Wikipedia articles, where each document contains all of the text, hypertext markup, hyperlinks, and image tokens (from a VQVAE-GAN), provided in the order they appear in the original HTML source (before masking). The resulting CM3 models can generate rich structured, multi-modal outputs while conditioning on arbitrary masked document contexts, and thereby implicitly learn a wide range of text, image, and cross modal tasks. They can be prompted to recover, in a zero-shot fashion, the functionality of models such as DALL-E, GENRE, and HTLM. We set the new state-of-the-art in zero-shot summarization, entity linking, and entity disambiguation while maintaining competitive performance in the fine-tuning setting. We can generate images unconditionally, conditioned on text (like DALL-E) and do captioning all in a zero-shot setting with a single model.
Submitted 19 January, 2022;
originally announced January 2022.
-
Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data
Authors:
Caterina De Bacco,
Martina Contisciani,
Jonathan Cardoso-Silva,
Hadiseh Safdari,
Diego Baptista,
Gabriela L. Borges,
Tracy Sweet,
Jean-Gabriel Young,
Jeremy Koster,
Cody T. Ross,
Richard McElreath,
Daniel Redhead,
Eleanor A. Power
Abstract:
Social network data are often constructed by incorporating reports from multiple individuals. However, it is not obvious how to reconcile discordant responses from individuals. There may be particular risks with multiply-reported data if people's responses reflect normative expectations -- such as an expectation of balanced, reciprocal relationships. Here, we propose a probabilistic model that incorporates ties reported by multiple individuals to estimate the unobserved network structure. In addition to estimating a parameter for each reporter that is related to their tendency of over- or under-reporting relationships, the model explicitly incorporates a term for ``mutuality,'' the tendency to report ties in both directions involving the same alter. Our model's algorithmic implementation is based on variational inference, which makes it efficient and scalable to large systems. We apply our model to data from 75 Indian villages collected with a name-generator design, and a Nicaraguan community collected with a roster-based design. We observe strong evidence of ``mutuality'' in both datasets, and find that this value varies by relationship type. Consequently, our model estimates networks with reciprocity values that are substantially different than those resulting from standard deterministic aggregation approaches, demonstrating the need to consider such issues when gathering, constructing, and analysing survey-based network data.
Submitted 12 December, 2022; v1 submitted 21 December, 2021;
originally announced December 2021.
-
AGPNet -- Autonomous Grading Policy Network
Authors:
Chana Ross,
Yakov Miron,
Yuval Goldfracht,
Dotan Di Castro
Abstract:
In this work, we establish heuristics and learning strategies for the autonomous control of a dozer grading an uneven area studded with sand piles. We formalize the problem as a Markov Decision Process, design a simulation which demonstrates agent-environment interactions and finally compare our simulator to a real dozer prototype. We use methods from reinforcement learning, behavior cloning and contrastive learning to train a hybrid policy. Our trained agent, AGPNet, reaches human-level performance and outperforms current state-of-the-art machine learning methods for the autonomous grading task. In addition, our agent is capable of generalizing from random scenarios to unseen real world problems.
Submitted 20 December, 2021;
originally announced December 2021.
-
Cartan Connections and Integrable Vortex Equations
Authors:
Calum Ross
Abstract:
We demonstrate that integrable abelian vortex equations on constant curvature Riemann surfaces can be reinterpreted as flat non-abelian Cartan connections. By lifting to three dimensional group manifolds we find higher dimensional analogues of vortices. These vortex configurations are also encoded in a Cartan connection. We give examples of different types of vortex that can be interpreted this way, and compare and contrast this Cartan representation of a vortex with the symmetric instanton representation.
Submitted 15 December, 2021;
originally announced December 2021.
-
Origins of transverse voltages generated by applied thermal gradients and applied electric fields in ferrimagnetic-insulator/heavy-metal bilayers
Authors:
Arnab Bose,
Rakshit Jain,
Jackson J. Bauer,
Robert A. Buhrman,
Caroline A. Ross,
Daniel C. Ralph
Abstract:
We compare thermal-gradient-driven transverse voltages in ferrimagnetic-insulator/heavy-metal bilayers (Tm3Fe5O12/W and Tm3Fe5O12/Pt) to corresponding electrically-driven transverse resistances at and above room temperature. We find for Tm3Fe5O12/W that the thermal and electrical effects can be explained by a common spin-current detection mechanism, the physics underlying spin Hall magnetoresistance (SMR). However, for Tm3Fe5O12/Pt the ratio of the electrically-driven transverse voltages (planar Hall signal/anomalous Hall signal) is much larger than the ratio of corresponding thermal-gradient signals, a result which is very different from expectations for a SMR-based mechanism alone. We ascribe this difference to a proximity-induced magnetic layer at the Tm3Fe5O12/Pt interface.
Submitted 22 February, 2022; v1 submitted 13 December, 2021;
originally announced December 2021.
-
Antisite defects stabilized by antiphase boundaries in YFeO$_3$ thin films
Authors:
Abinash Kumar,
Konstantin Klyukin,
Shuai Ning,
Cigdem Ozsoy-Keskinbora,
Mikhail Ovsyanko,
Felix van Uden,
Ruud Krijnen,
Bilge Yildiz,
Caroline A. Ross,
James M. LeBeau
Abstract:
YFeO$_3$ thin films are a recent addition to the family of multiferroic orthoferrites where Y$_{\mathrm{Fe}}$ antisite defects and strain have been shown to introduce polar displacements while retaining magnetic properties. Complete control of the multiferroic properties, however, necessitates knowledge of the defects present and their potential role in modifying behavior. Here, we report the structure and chemistry of antiphase boundaries in multiferroic YFeO$_3$ thin films using aberration corrected scanning transmission electron microscopy combined with atomic resolution energy dispersive X-ray spectroscopy. We find that Fe$_{\mathrm{Y}}$ antisites, which are not stable in the film bulk, periodically arrange along antiphase boundaries due to changes in the local environment. Using density functional theory, we show that the antiphase boundaries are polar and bi-stable, where the presence of Fe$_{\mathrm{Y}}$ antisites significantly decreases the switching barrier. These results highlight how planar defects, such as antiphase boundaries, can stabilize point defects that would otherwise not be expected to form within the structure.
Submitted 19 July, 2021;
originally announced July 2021.
-
First-Generation Inference Accelerator Deployment at Facebook
Authors:
Michael Anderson,
Benny Chen,
Stephen Chen,
Summer Deng,
Jordan Fix,
Michael Gschwind,
Aravind Kalaiah,
Changkyu Kim,
Jaewon Lee,
Jason Liang,
Haixin Liu,
Yinghai Lu,
Jack Montgomery,
Arun Moorthy,
Satish Nadathur,
Sam Naghshineh,
Avinash Nayak,
Jongsoo Park,
Chris Petersen,
Martin Schatz,
Narayanan Sundaram,
Bangsheng Tang,
Peter Tang,
Amy Yang,
Jiecao Yu
, et al. (90 additional authors not shown)
Abstract:
In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the inference accelerator platform ecosystem we developed and deployed at Facebook: both hardware, through Open Compute Platform (OCP), and software framework and tooling, through PyTorch/Caffe2/Glow. A characteristic of this ecosystem from the start is its openness to enable a variety of AI accelerators from different vendors. This platform, with six low-power accelerator cards alongside a single-socket host CPU, allows us to serve models of high complexity that cannot be easily or efficiently run on CPUs. We describe various performance optimizations, at both platform and accelerator level, which enable this platform to serve production traffic at Facebook. We also share deployment challenges and lessons learned during performance optimization, and provide guidance for future inference hardware co-design.
Submitted 4 August, 2021; v1 submitted 8 July, 2021;
originally announced July 2021.
-
Histogram of Cell Types: Deep Learning for Automated Bone Marrow Cytology
Authors:
Rohollah Moosavi Tayebi,
Youqing Mu,
Taher Dehkharghanian,
Catherine Ross,
Monalisa Sur,
Ronan Foley,
Hamid R. Tizhoosh,
Clinton JV Campbell
Abstract:
Bone marrow cytology is required to make a hematological diagnosis, influencing critical clinical decision points in hematology. However, bone marrow cytology is tedious, limited to experienced reference centers and associated with high inter-observer variability. This may lead to a delayed or incorrect diagnosis, leaving an unmet need for innovative supporting technologies. We have developed the first ever end-to-end deep learning-based technology for automated bone marrow cytology. Starting with a bone marrow aspirate digital whole slide image, our technology rapidly and automatically detects suitable regions for cytology, and subsequently identifies and classifies all bone marrow cells in each region. This collective cytomorphological information is captured in a novel representation called Histogram of Cell Types (HCT) quantifying bone marrow cell class probability distribution and acting as a cytological "patient fingerprint". The approach achieves high accuracy in region detection (0.97 accuracy and 0.99 ROC AUC), and cell detection and cell classification (0.75 mAP, 0.78 F1-score, Log-average miss rate of 0.31). HCT has potential to revolutionize hematopathology diagnostic workflows, leading to more cost-effective, accurate diagnosis and opening the door to precision medicine.
Submitted 8 July, 2021; v1 submitted 5 July, 2021;
originally announced July 2021.
-
Magnetic Impurities, Integrable Vortices and the Toda Equation
Authors:
Sven Bjarke Gudnason,
Calum Ross
Abstract:
The five integrable vortex equations, recently studied by Manton, are generalized to include magnetic impurities of the Tong-Wong type. Under certain conditions these generalizations remain integrable. We further set up a gauge theory with a product gauge group, two complex scalar fields and a general charge matrix. The second species of vortices, when frozen, are interpreted as the magnetic impurity for all five vortex equations. We then give a geometric compatibility condition, which enables us to remove the constant term in all the equations. This is similar to the reduction from the Taubes equation to the Liouville equation. We further find a family of charge matrices that turn the five vortex equations into either the Toda equation or the Toda equation with the opposite sign. We find exact analytic solutions in all cases and the solution with the opposite sign appears to be new.
Submitted 24 July, 2021; v1 submitted 4 May, 2021;
originally announced May 2021.
-
SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems
Authors:
Joel Oren,
Chana Ross,
Maksym Lefarov,
Felix Richter,
Ayal Taitler,
Zohar Feldman,
Christian Daniel,
Dotan Di Castro
Abstract:
We study combinatorial problems with real world applications such as machine scheduling, routing, and assignment. We propose a method that combines Reinforcement Learning (RL) and planning. This method can equally be applied to both the offline, as well as online, variants of the combinatorial problem, in which the problem components (e.g., jobs in scheduling problems) are not known in advance, but rather arrive during the decision-making process. Our solution is quite generic, scalable, and leverages distributional knowledge of the problem parameters. We frame the solution process as an MDP, and take a Deep Q-Learning approach wherein states are represented as graphs, thereby allowing our trained policies to deal with arbitrary changes in a principled manner. Though learned policies work well in expectation, small deviations can have substantial negative effects in combinatorial settings. We mitigate these drawbacks by employing our graph-convolutional policies as non-optimal heuristics in a compatible search algorithm, Monte Carlo Tree Search, to significantly improve overall performance. We demonstrate our method on two problems: Machine Scheduling and Capacitated Vehicle Routing. We show that our method outperforms custom-tailored mathematical solvers, state of the art learning-based algorithms, and common heuristics, both in computation time and performance.
Submitted 18 May, 2021; v1 submitted 4 April, 2021;
originally announced April 2021.
-
An antisite defect mechanism for room temperature ferroelectricity in orthoferrites
Authors:
Shuai Ning,
Abinash Kumar,
Konstantin Klyukin,
Jong Heon Kim,
Tingyu Su,
Hyun-Suk Kim,
James M. LeBeau,
Bilge Yildiz,
Caroline A. Ross
Abstract:
Single-phase multiferroic materials that allow the coexistence of ferroelectric and magnetic ordering above room temperature are highly desirable, motivating an ongoing search for mechanisms for unconventional ferroelectricity in magnetic oxides. Here, we report an antisite defect mechanism for room temperature ferroelectricity in epitaxial thin films of yttrium orthoferrite, YFeO3, a perovskite-structured canted antiferromagnet. A combination of piezoresponse force microscopy, atomically resolved elemental mapping with aberration corrected scanning transmission electron microscopy and density functional theory calculations reveals that the presence of YFe antisite defects facilitates a non-centrosymmetric distortion promoting ferroelectricity. This mechanism is predicted to work analogously for other rare earth orthoferrites, with a dependence of the polarization on the radius of the rare earth cation. Furthermore, a vertically aligned nanocomposite consisting of pillars of a magnetoelastic oxide CoFe2O4 embedded epitaxially in the YFeO3 matrix exhibits both robust ferroelectricity and ferrimagnetism at room temperature, as well as a noticeable strain-mediated magnetoelectric coupling effect. Our work uncovers the distinctive role of antisite defects in providing a novel mechanism for ferroelectricity in a range of magnetic orthoferrites and further augments the functionality of this family of complex oxides for multiferroic applications.
Submitted 18 March, 2021;
originally announced March 2021.
-
Actinium-225 Production with an Electron Accelerator
Authors:
W. T. Diamond,
C. K. Ross
Abstract:
There has been growing clinical evidence of the value of targeted alpha therapy for treatment of several cancers. The work has been slowed by the lack of availability of the key alpha emitting isotopes, especially Ac-225. Until this time, most of the supply has been from three Th-229 generators that are milked to produce hundreds of mCi of Ac-225 every month. There has been a growing effort to produce new sources of Ac-225 from several different accelerator-based routes. It can be produced with medical-isotope cyclotrons with a proton energy of at least 16 MeV using the reaction Ra-226(p,2n)Ac-225. It can also be produced by using high-energy protons (150 to 800 MeV) for spallation of a thorium target. Significant experimental work has been applied to both processes. It can also be produced by the photonuclear reaction, Ra-226(γ,n)Ra-225. The Ra-225 decays via beta decay to Ac-225 with a half-life of 14.9 days. The photons are produced by an intense beam of electrons with an energy of about 25 to 30 MeV. This paper will provide a technical description of radium targets and a target chamber that would be capable of producing a yield of four curies of Ra-225 from a 10-day irradiation of one gram of radium segmented into two to four separate encapsulated targets, at a beam power of 20 kW. These targets could be milked at least three times, yielding nearly four curies of Ac-225. There is also a description of a method to reduce production of Ac-227 to values less than a few parts per million of the yield of Ac-225. The Monte Carlo code Fluka has been used to model the yields of Ra-225 and support the design concept to reduce the production of Ac-227. It has also been used to model the experimental results by Maslov et al. [https://doi.org/10.1134/S1066362206020184] to provide reasonable confidence in the cross-section value used by the code.
Submitted 1 January, 2021;
originally announced January 2021.
-
Exact Ground States and Domain Walls in One Dimensional Chiral Magnets
Authors:
Calum Ross,
Norisuke Sakai,
Muneto Nitta
Abstract:
We determine exactly the phase structure of a chiral magnet in one spatial dimension with the Dzyaloshinskii-Moriya (DM) interaction and a potential that is a function of the third component of the magnetization vector, $n_3$, with a Zeeman (linear with the coefficient $B$) term and an anisotropy (quadratic with the coefficient $A$) term, constrained so that $2A\leq \vert B\vert$. For large values of potential parameters $A$ and $B$, the system is in one of the ferromagnetic phases, whereas it is in the spiral phase for small values. In the spiral phase we find a continuum of spiral solutions, which are one-dimensionally modulated solutions with various periods. The ground state is determined as the spiral solution with the lowest average energy density. As the phase boundary approaches, the period of the lowest energy spiral solution diverges, and the spiral solutions become domain wall solutions with zero energy at the boundary. The energy of the domain wall solutions is positive in the homogeneous phase region, but is negative in the spiral phase region, signaling the instability of the homogeneous (ferromagnetic) state. The order of the phase transition between spiral and homogeneous phases and between polarized ($n_3=\pm 1$) and canted ($n_3\not=\pm 1$) ferromagnetic phases is found to be second order.
Submitted 6 January, 2022; v1 submitted 16 December, 2020;
originally announced December 2020.
-
"Thought I'd Share First" and Other Conspiracy Theory Tweets from the COVID-19 Infodemic: Exploratory Study
Authors:
Dax Gerts,
Courtney D. Shelley,
Nidhi Parikh,
Travis Pitts,
Chrysm Watson Ross,
Geoffrey Fairchild,
Nidia Yadria Vaquera Chavez,
Ashlynn R. Daughton
Abstract:
Background: The COVID-19 outbreak has left many people isolated within their homes; these people are turning to social media for news and social connection, which leaves them vulnerable to believing and sharing misinformation. Health-related misinformation threatens adherence to public health messaging, and monitoring its spread on social media is critical to understanding the evolution of ideas that have potentially negative public health impacts. Results: Analysis using model-labeled data was beneficial for increasing the proportion of data matching misinformation indicators. Random forest classifier metrics varied across the four conspiracy theories considered (F1 scores between 0.347 and 0.857); this performance increased as the given conspiracy theory was more narrowly defined. We showed that misinformation tweets demonstrate more negative sentiment when compared to nonmisinformation tweets and that theories evolve over time, incorporating details from unrelated conspiracy theories as well as real-world events. Conclusions: Although we focus here on health-related misinformation, this combination of approaches is not specific to public health and is valuable for characterizing misinformation in general, which is an important first step in creating targeted messaging to counteract its spread. Initial messaging should aim to preempt generalized misinformation before it becomes widespread, while later messaging will need to target evolving conspiracy theories and the new facets of each as they become incorporated.
Submitted 15 April, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.