Showing 1–20 of 20 results for author: Hoffman, S

  1. arXiv:2405.04912  [pdf, other]

    q-bio.BM cs.LG physics.chem-ph

    GP-MoLFormer: A Foundation Model For Molecular Generation

    Authors: Jerret Ross, Brian Belgodere, Samuel C. Hoffman, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das

    Abstract: Transformer-based models trained on large and general purpose datasets consisting of molecular strings have recently emerged as a powerful tool for successfully modeling various structure-property relations. Inspired by this success, we extend the paradigm of training chemical language transformers on large-scale chemical datasets to generative tasks in this work. Specifically, we propose GP-MoLFo…

    Submitted 4 April, 2024; originally announced May 2024.

  2. arXiv:2302.09190  [pdf, other]

    cs.LG cs.CY

    Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions

    Authors: Manish Nagireddy, Moninder Singh, Samuel C. Hoffman, Evaline Ju, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Ensuring trustworthiness in machine learning (ML) models is a multi-dimensional task. In addition to the traditional notion of predictive performance, other notions such as privacy, fairness, robustness to distribution shift, adversarial robustness, interpretability, explainability, and uncertainty quantification are important considerations to evaluate and improve (if deficient). However, these s…

    Submitted 17 February, 2023; originally announced February 2023.

  3. arXiv:2210.05594  [pdf, other]

    cs.LG cs.CY

    Navigating Ensemble Configurations for Algorithmic Fairness

    Authors: Michael Feffer, Martin Hirzel, Samuel C. Hoffman, Kiran Kate, Parikshit Ram, Avraham Shinnar

    Abstract: Bias mitigators can improve algorithmic fairness in machine learning models, but their effect on fairness is often not stable across data splits. A popular approach to train more stable models is ensemble learning, but unfortunately, it is unclear how to combine ensembles with mitigators to best navigate trade-offs between fairness and predictive performance. To that end, we built an open-source l…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: arXiv admin note: text overlap with arXiv:2202.00751

  4. arXiv:2207.07174  [pdf, other]

    cs.LG stat.ML

    Causal Graphs Underlying Generative Models: Path to Learning with Limited Data

    Authors: Samuel C. Hoffman, Kahini Wadhawan, Payel Das, Prasanna Sattigeri, Karthikeyan Shanmugam

    Abstract: Training generative models that capture rich semantics of the data and interpreting the latent representations encoded by such models are very important problems in unsupervised learning. In this work, we provide a simple algorithm that relies on perturbation experiments on latent codes of a pre-trained generative autoencoder to uncover a causal graph that is implied by the generative model. We le…

    Submitted 14 July, 2022; originally announced July 2022.

  5. Accelerating Material Design with the Generative Toolkit for Scientific Discovery

    Authors: Matteo Manica, Jannis Born, Joris Cadow, Dimitrios Christofidellis, Ashish Dave, Dean Clarke, Yves Gaetan Nana Teukam, Giorgio Giannone, Samuel C. Hoffman, Matthew Buchan, Vijil Chenthamarakshan, Timothy Donovan, Hsiang Han Hsu, Federico Zipoli, Oliver Schilter, Akihiro Kishimoto, Lisa Hamada, Inkit Padhi, Karl Wehden, Lauren McHugh, Alexy Khrabrov, Payel Das, Seiji Takeda, John R. Smith

    Abstract: With the growing availability of data within various scientific domains, generative models hold enormous potential to accelerate scientific discovery. They harness powerful representations learned from datasets to speed up the formulation of novel hypotheses with the potential to impact material discovery broadly. We present the Generative Toolkit for Scientific Discovery (GT4SD). This extensible…

    Submitted 31 January, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: 15 pages, 2 figures

    Journal ref: Nature Partner Journals (npj) Computational Materials 9, 69 (2023)

  6. arXiv:2204.09042  [pdf, other]

    q-bio.QM cs.LG q-bio.BM stat.ML

    Accelerating Inhibitor Discovery With A Deep Generative Foundation Model: Validation for SARS-CoV-2 Drug Targets

    Authors: Vijil Chenthamarakshan, Samuel C. Hoffman, C. David Owen, Petra Lukacik, Claire Strain-Damerell, Daren Fearon, Tika R. Malla, Anthony Tumber, Christopher J. Schofield, Helen M. E. Duyvesteyn, Wanwisa Dejnirattisai, Loic Carrique, Thomas S. Walter, Gavin R. Screaton, Tetiana Matviiuk, Aleksandra Mojsilovic, Jason Crain, Martin A. Walsh, David I. Stuart, Payel Das

    Abstract: The discovery of novel inhibitor molecules for emerging drug-target proteins is widely acknowledged as a challenging inverse design problem: Exhaustive exploration of the vast chemical search space is impractical, especially when the target structure or active molecules are unknown. Here we validate experimentally the broad utility of a deep generative framework trained at-scale on protein sequenc…

    Submitted 14 October, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: Revised title, abstract, and text; additional figures

  7. arXiv:2202.00751  [pdf, other]

    cs.LG cs.CY

    An Empirical Study of Modular Bias Mitigators and Ensembles

    Authors: Michael Feffer, Martin Hirzel, Samuel C. Hoffman, Kiran Kate, Parikshit Ram, Avraham Shinnar

    Abstract: There are several bias mitigators that can reduce algorithmic bias in machine learning models but, unfortunately, the effect of mitigators on fairness is often not stable when measured across different data splits. A popular approach to train more stable models is ensemble learning. Ensembles, such as bagging, boosting, voting, or stacking, have been successful at making predictive performance mor…

    Submitted 1 February, 2022; originally announced February 2022.

  8. arXiv:2112.01625  [pdf, other]

    cs.LG physics.chem-ph

    Sample-Efficient Generation of Novel Photo-acid Generator Molecules using a Deep Generative Model

    Authors: Samuel C. Hoffman, Vijil Chenthamarakshan, Dmitry Yu. Zubarev, Daniel P. Sanders, Payel Das

    Abstract: Photo-acid generators (PAGs) are compounds that release acids ($H^+$ ions) when exposed to light. These compounds are critical components of the photolithography processes that are used in the manufacture of semiconductor logic and memory chips. The exponential increase in the demand for semiconductors has highlighted the need for discovering novel photo-acid generators. While de novo molecule des…

    Submitted 2 December, 2021; originally announced December 2021.

  9. arXiv:2109.12151  [pdf, other]

    cs.LG cs.AI

    AI Explainability 360: Impact and Design

    Authors: Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilovic, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang

    Abstract: As artificial intelligence and machine learning algorithms become increasingly prevalent in society, multiple stakeholders are calling for these algorithms to provide explanations. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, have different explanation needs. To address these needs, in 2019, we created AI Expl…

    Submitted 24 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: text overlap with arXiv:1909.03012

    Journal ref: IAAI 2022

  10. arXiv:2106.04464  [pdf, other]

    physics.chem-ph cs.LG math.AT

    Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations

    Authors: Yair Schiff, Vijil Chenthamarakshan, Samuel Hoffman, Karthikeyan Natesan Ramamurthy, Payel Das

    Abstract: Deep generative models have emerged as a powerful tool for learning useful molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. However, most existing deep generative models are restricted due to lack of spatial information. Here we propose augmentation of deep generative models with topological data analysis (TDA…

    Submitted 15 February, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted to ICASSP, 2022

  11. arXiv:2104.00038  [pdf, other]

    cs.LG cs.HC

    Smartphone Camera Oximetry in an Induced Hypoxemia Study

    Authors: Jason S. Hoffman, Varun Viswanath, Xinyi Ding, Matthew J. Thompson, Eric C. Larson, Shwetak N. Patel, Edward Wang

    Abstract: Hypoxemia, a medical condition that occurs when the blood is not carrying enough oxygen to adequately supply the tissues, is a leading indicator for dangerous complications of respiratory diseases like asthma, COPD, and COVID-19. While purpose-built pulse oximeters can provide accurate blood-oxygen saturation (SpO$_2$) readings that allow for diagnosis of hypoxemia, enabling this capability in unm…

    Submitted 31 March, 2021; originally announced April 2021.

    Comments: 26 pages, 8 figures

  12. Optimizing Molecules using Efficient Queries from Property Evaluations

    Authors: Samuel Hoffman, Vijil Chenthamarakshan, Kahini Wadhawan, Pin-Yu Chen, Payel Das

    Abstract: Machine learning based methods have shown potential for optimizing existing molecules with more desirable properties, a critical step towards accelerating new chemical discovery. Here we propose QMO, a generic query-based molecule optimization framework that exploits latent embeddings from a molecule autoencoder. QMO improves the desired properties of an input molecule based on efficient queries,…

    Submitted 18 October, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: Preprint version to be published at Nature Machine Intelligence; Github: https://github.com/IBM/QMO

    Journal ref: Nat Mach Intell 4, 21-31 (2022)

  13. arXiv:2007.10970  [pdf]

    cs.CY astro-ph.IM

    Recommendations for Planning Inclusive Astronomy Conferences

    Authors: Inclusive Astronomy 2 Local Organizing Committee: Brian Brooks, Keira Brooks, Lea Hagen, Nimish Hathi, Samantha Hoffman, James Paranilam, Laura Prichard

    Abstract: The Inclusive Astronomy (IA) conference series aims to create a safe space where community members can listen to the experiences of marginalized individuals in astronomy, discuss actions being taken to address inequities, and give recommendations to the community for how to improve diversity, equity, and inclusion in astronomy. The first IA was held in Nashville, TN, USA, 17-19 June, 2015. The Inc…

    Submitted 21 July, 2020; originally announced July 2020.

    Comments: 41 pages. An editable version of the document and contact information available here: https://outerspace.stsci.edu/display/IA2/LOC+Recommendations

  14. arXiv:2006.03963  [pdf, other]

    cs.LG stat.ML

    Combinatorial Black-Box Optimization with Expert Advice

    Authors: Hamid Dadkhahi, Karthikeyan Shanmugam, Jesus Rios, Payel Das, Samuel Hoffman, Troy David Loeffler, Subramanian Sankaranarayanan

    Abstract: We consider the problem of black-box function optimization over the boolean hypercube. Despite the vast literature on black-box function optimization over continuous domains, not much attention has been paid to learning models for optimization over combinatorial domains until recently. However, the computational complexity of the recently devised algorithms is prohibitive even for moderate number…

    Submitted 13 October, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

    Journal ref: KDD 2020

  15. A Mosquito Pick-and-Place System for PfSPZ-based Malaria Vaccine Production

    Authors: Henry Phalen, Prasad Vagdargi, Mariah L. Schrum, Sumana Chakravarty, Amanda Canezin, Michael Pozin, Suat Coemert, Iulian Iordachita, Stephen L. Hoffman, Gregory S. Chirikjian, Russell H. Taylor

    Abstract: The treatment of malaria is a global health challenge that stands to benefit from the widespread introduction of a vaccine for the disease. A method has been developed to create a live organism vaccine using the sporozoites (SPZ) of the parasite Plasmodium falciparum (Pf), which are concentrated in the salivary glands of infected mosquitoes. Current manual dissection methods to obtain these PfSPZ…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: 12 pages, 11 figures, Manuscript submitted for Special Issue of IEEE CASE 2019 for IEEE T-ASE

  16. arXiv:2004.01215  [pdf, other]

    cs.LG q-bio.QM stat.ML

    CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

    Authors: Vijil Chenthamarakshan, Payel Das, Samuel C. Hoffman, Hendrik Strobelt, Inkit Padhi, Kar Wai Lim, Benjamin Hoover, Matteo Manica, Jannis Born, Teodoro Laino, Aleksandra Mojsilovic

    Abstract: The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Au…

    Submitted 23 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

  17. arXiv:1909.03012  [pdf, other]

    cs.AI cs.CV cs.HC stat.ML

    One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques

    Authors: Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilović, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang

    Abstract: As artificial intelligence and machine learning algorithms make further inroads into society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, present different requirements for explanations. Toward addressing these need…

    Submitted 14 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

  18. arXiv:1903.02532  [pdf]

    q-bio.QM cs.RO

    An Efficient Production Process for Extracting Salivary Glands from Mosquitoes

    Authors: Mariah Schrum, Amanda Canezin, Sumana Chakravarty, Michelle Laskowski, Suat Comert, Yunuscan Sevimli, Gregory S. Chirikjian, Stephen L. Hoffman, Russell H. Taylor

    Abstract: Malaria is one of the leading causes of morbidity and mortality in many developing countries. The development of a highly effective and readily deployable vaccine represents a major goal for world health. There has been recent progress in developing a clinically effective vaccine manufactured using Plasmodium falciparum sporozoites (PfSPZ) extracted from the salivary glands of Anopheles sp. Mo…

    Submitted 5 March, 2019; originally announced March 2019.

    Comments: 5 pages, 5 figures

  19. arXiv:1810.01943  [pdf, other]

    cs.AI

    AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias

    Authors: Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, Yunfeng Zhang

    Abstract: Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This paper introduces a new open source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license (https://github.com/ibm/aif360). The main objectives of this…

    Submitted 3 October, 2018; originally announced October 2018.

    Comments: 20 pages

  20. arXiv:1805.09910  [pdf, other]

    stat.ML cs.CY cs.LG

    Fairness GAN

    Authors: Prasanna Sattigeri, Samuel C. Hoffman, Vijil Chenthamarakshan, Kush R. Varshney

    Abstract: In this paper, we introduce the Fairness GAN, an approach for generating a dataset that is plausibly similar to a given multimedia dataset, but is more fair with respect to protected attributes in allocative decision making. We propose a novel auxiliary classifier GAN that strives for demographic parity or equality of opportunity and show empirical results on several datasets, including the CelebF…

    Submitted 24 May, 2018; originally announced May 2018.