Showing 1–14 of 14 results for author: Kesselman, C

  1. arXiv:2407.01608  [pdf, other]

    cs.LG cs.AI cs.DB cs.HC cs.SE

    Deriva-ML: A Continuous FAIRness Approach to Reproducible Machine Learning Models

    Authors: Zhiwei Li, Carl Kesselman, Mike D'Arcy, Michael Pazzani, Benjamin Yizing Xu

    Abstract: Increasingly, artificial intelligence (AI) and machine learning (ML) are used in eScience applications [9]. While these approaches have great potential, the literature has shown that ML-based approaches frequently suffer from results that are either incorrect or unreproducible due to mismanagement or misuse of data used for training and validating the models [12, 15]. Recognition of the necessity…

    Submitted 27 June, 2024; originally announced July 2024.

  2. The History of the Grid

    Authors: Ian Foster, Carl Kesselman

    Abstract: With the widespread availability of high-speed networks, it becomes feasible to outsource computing to remote providers and to federate resources from many locations. Such observations motivated the development, from the mid-1990s onwards, of a range of innovative Grid technologies, applications, and infrastructures. We review the history, current status, and future prospects for Grid computing.

    Submitted 8 April, 2022; originally announced April 2022.

    Journal ref: High Performance Computing: From Grids and Clouds to Exascale, IOS Press, pages 3-30, 2011

  3. CUF-Links: Continuous and Ubiquitous FAIRness Linkages for reproducible research

    Authors: Ian Foster, Carl Kesselman

    Abstract: Despite much creative work on methods and tools, reproducibility -- the ability to repeat the computational steps used to obtain a research result -- remains elusive. One reason for these difficulties is that extant tools for capturing research processes do not align well with the rich working practices of scientists. We advocate here for simple mechanisms that can be integrated easily with curren…

    Submitted 20 January, 2022; originally announced January 2022.

    Journal ref: Computer, vol. 55, no. 8, pp. 20-30, Aug. 2022

  4. Sharing Begins at Home

    Authors: William Dempsey, Ian Foster, Scott Fraser, Carl Kesselman

    Abstract: The broad sharing of research data is widely viewed as of critical importance for the speed, quality, accessibility, and integrity of science. Despite increasing efforts to encourage data sharing, both the quality of shared data, and the frequency of data reuse, remain stubbornly low. We argue here that a major reason for this unfortunate state of affairs is that the organization of research resul…

    Submitted 8 July, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

    Journal ref: Harvard Data Science Review, Volume 4, Issue 3, 2022

  5. arXiv:2110.01781  [pdf, other]

    cs.HC

    Model-Adaptive Interface Generation for Data-Driven Discovery

    Authors: Hongsuda Tangmunarunkit, Aref Shafaeibejestan, Joshua Chudy, Karl Czajkowski, Robert Schuler, Carl Kesselman

    Abstract: Discovery of new knowledge is increasingly data-driven, predicated on a team's ability to collaboratively create, find, analyze, retrieve, and share pertinent datasets over the duration of an investigation. This is especially true in the domain of scientific discovery where generation, analysis, and interpretation of data are the fundamental mechanisms by which research teams collaborate to achiev…

    Submitted 4 October, 2021; originally announced October 2021.

  6. arXiv:2008.09591  [pdf, other]

    cs.DC

    Translating the Grid: How a Translational Approach Shaped the Development of Grid Computing

    Authors: Ian Foster, Carl Kesselman

    Abstract: A growing gap between progress in biological knowledge and improved health outcomes inspired the new discipline of translational medicine, in which the application of new knowledge is an explicit part of a research plan. Abramson and Parashar argue that a similar gap between complex computational technologies and ever-more-challenging applications demands an analogous discipline of translational c…

    Submitted 21 August, 2020; originally announced August 2020.

  7. arXiv:1610.06044  [pdf, other]

    cs.DB cs.DC cs.DL cs.HC

    ERMrest: an entity-relationship data storage service for web-based, data-oriented collaboration

    Authors: Karl Czajkowski, Carl Kesselman, Robert Schuler, Hongsuda Tangmunarunkit

    Abstract: Scientific discovery is increasingly dependent on a scientist's ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. While the details vary from domain to domain, these data often consist of diverse digital assets (e.g. image files, sequence data, or simulation outputs) that are organized with complex relationships and context which may evolve over the c…

    Submitted 19 October, 2016; originally announced October 2016.

  8. arXiv:1005.4454  [pdf, other]

    astro-ph.IM cs.DC cs.SE

    Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking

    Authors: Joseph C. Jacob, Daniel S. Katz, G. Bruce Berriman, John Good, Anastasia C. Laity, Ewa Deelman, Carl Kesselman, Gurmeet Singh, Mei-Hui Su, Thomas A. Prince, Roy Williams

    Abstract: Montage is a portable software toolkit for constructing custom, science-grade mosaics by composing multiple astronomical images. The mosaics constructed by Montage preserve the astrometry (position) and photometry (intensity) of the sources in the input images. The mosaic to be constructed is specified by the user in terms of a set of parameters, including dataset and wavelength to be used, locati…

    Submitted 24 May, 2010; originally announced May 2010.

    Comments: 16 pages, 11 figures

    Journal ref: Int. J. Computational Science and Engineering. 2009

  9. arXiv:0712.2262  [pdf]

    cs.CE cs.DC cs.NI

    The Earth System Grid: Supporting the Next Generation of Climate Modeling Research

    Authors: David Bernholdt, Shishir Bharathi, David Brown, Kasidit Chanchio, Meili Chen, Ann Chervenak, Luca Cinquini, Bob Drach, Ian Foster, Peter Fox, Jose Garcia, Carl Kesselman, Rob Markel, Don Middleton, Veronika Nefedova, Line Pouchard, Arie Shoshani, Alex Sim, Gary Strand, Dean Williams

    Abstract: Understanding the earth's climate system and how it might be changing is a preeminent scientific challenge. Global climate models are used to simulate past, present, and future climates, and experiments are executed continuously on an array of distributed supercomputers. The resulting data archive, spread over several sites, currently contains upwards of 100 TB of simulation data and is growing…

    Submitted 13 December, 2007; originally announced December 2007.

  10. arXiv:cs/0306129  [pdf]

    cs.CR cs.DC

    Security for Grid Services

    Authors: Von Welch, Frank Siebenlist, Ian Foster, John Bresnahan, Karl Czajkowski, Jarek Gawor, Carl Kesselman, Sam Meder, Laura Pearlman, Steven Tuecke

    Abstract: Grid computing is concerned with the sharing and coordinated use of diverse resources in distributed "virtual organizations." The dynamic and multi-institutional nature of these environments introduces challenging security issues that demand new technical approaches. In particular, one must deal with diverse local mechanisms, support dynamic creation of services, and enable dynamic creation of t…

    Submitted 24 June, 2003; originally announced June 2003.

    Comments: 10 pages; 4 figures

    Report number: Preprint ANL/MCS-P1024-0203 ACM Class: C.2.4

  11. arXiv:cs/0306082  [pdf]

    cs.SE

    The Community Authorization Service: Status and Future

    Authors: L. Pearlman, V. Welch, I. Foster, C. Kesselman, S. Tuecke

    Abstract: Virtual organizations (VOs) are communities of resource providers and users distributed over multiple policy domains. These VOs often wish to define and enforce consistent policies in addition to the policies of their underlying domains. This is challenging, not only because of the problems in distributing the policy to the domains, but also because of the fact that those domains may each have d…

    Submitted 13 June, 2003; originally announced June 2003.

    Comments: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003. 9 Pages, PDF

    ACM Class: C.2.4

  12. arXiv:cs/0306053  [pdf]

    cs.DC cs.CR

    A Community Authorization Service for Group Collaboration

    Authors: Laura Pearlman, Von Welch, Ian Foster, Carl Kesselman, Steven Tuecke

    Abstract: In "Grids" and "collaboratories," we find distributed communities of resource providers and resource consumers, within which often complex and dynamic policies govern who can use which resources for which purpose. We propose a new approach to the representation, maintenance, and enforcement of such policies that provides a scalable mechanism for specifying and enforcing these policies. Our appro…

    Submitted 12 June, 2003; originally announced June 2003.

    Comments: 10 pages, 2 figures

    Report number: Preprint ANL/MCS-P1042-0502 ACM Class: C.2.4

  13. arXiv:cs/0103025  [pdf]

    cs.AR cs.DC

    The Anatomy of the Grid - Enabling Scalable Virtual Organizations

    Authors: Ian Foster, Carl Kesselman, Steven Tuecke

    Abstract: "Grid" computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. In this article, we define this new field. First, we review the "Grid problem," which we define as flexible, secure, coordinated resource sharing among dynamic collect…

    Submitted 29 March, 2001; originally announced March 2001.

    Comments: 24 pages, 5 figures

    Report number: ANL/MCS-P870-0201 ACM Class: C.1.4; C.2.4

  14. arXiv:cs/0103022  [pdf]

    cs.DC cs.DB

    Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing

    Authors: Bill Allcock, Joe Bester, John Bresnahan, Ann L. Chervenak, Ian Foster, Carl Kesselman, Sam Meder, Veronika Nefedova, Darcy Quesnel, Steven Tuecke

    Abstract: An emerging class of data-intensive applications involve the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in experimental physics, where the data in question is generated by accelerators, and in simulation science, where the data is generated by supercomputers. So-called Data…

    Submitted 28 March, 2001; originally announced March 2001.

    Comments: 15 pages

    Report number: ANL/MCS-P871-0201 ACM Class: C.1.4; E.1