-
ACR: A Benchmark for Automatic Cohort Retrieval
Authors:
Dung Ngoc Thai,
Victor Ardulov,
Jose Ulises Mena,
Simran Tiwari,
Gleb Erofeev,
Ramy Eskander,
Karim Tarabishy,
Ravi B Parikh,
Wael Salloum
Abstract:
Identifying patient cohorts is fundamental to numerous healthcare tasks, including clinical trial recruitment and retrospective studies. Current cohort retrieval methods in healthcare organizations rely on automated queries of structured data combined with manual curation, which are time-consuming, labor-intensive, and often yield low-quality results. Recent advancements in large language models (LLMs) and information retrieval (IR) offer promising avenues to revolutionize these systems. Major challenges include managing extensive eligibility criteria and handling the longitudinal nature of unstructured Electronic Medical Records (EMRs) while ensuring that the solution remains cost-effective for real-world application. This paper introduces a new task, Automatic Cohort Retrieval (ACR), and evaluates the performance of LLMs and commercial, domain-specific neuro-symbolic approaches. We provide a benchmark task, a query dataset, an EMR dataset, and an evaluation framework. Our findings underscore the necessity for efficient, high-quality ACR systems capable of longitudinal reasoning across extensive patient databases.
Submitted 1 July, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Authors:
Yinjun Wu,
Mayank Keoliya,
Kan Chen,
Neelay Velingker,
Ziyang Li,
Emily J Getzen,
Qi Long,
Mayur Naik,
Ravi B Parikh,
Eric Wong
Abstract:
Designing faithful yet accurate AI models is challenging, particularly in the field of individual treatment effect estimation (ITE). ITE prediction models deployed in critical settings such as healthcare should ideally be (i) accurate, and (ii) provide faithful explanations. However, current solutions are inadequate: state-of-the-art black-box models do not supply explanations, post-hoc explainers for black-box models lack faithfulness guarantees, and self-interpretable models greatly compromise accuracy. To address these issues, we propose DISCRET, a self-interpretable ITE framework that synthesizes faithful, rule-based explanations for each sample. A key insight behind DISCRET is that explanations can serve dually as database queries to identify similar subgroups of samples. We provide a novel RL algorithm to efficiently synthesize these explanations from a large search space. We evaluate DISCRET on diverse tasks involving tabular, image, and text data. DISCRET outperforms the best self-interpretable models and has accuracy comparable to the best black-box models while providing faithful explanations. DISCRET is available at https://github.com/wuyinjun-1993/DISCRET-ICML2024.
Submitted 2 June, 2024;
originally announced June 2024.
-
Towards Trustworthy Artificial Intelligence for Equitable Global Health
Authors:
Hong Qin,
Jude Kong,
Wandi Ding,
Ramneek Ahluwalia,
Christo El Morr,
Zeynep Engin,
Jake Okechukwu Effoduh,
Rebecca Hwa,
Serena Jingchuan Guo,
Laleh Seyyed-Kalantari,
Sylvia Kiwuwa Muyingo,
Candace Makeda Moore,
Ravi Parikh,
Reva Schwartz,
Dongxiao Zhu,
Xiaoqian Wang,
Yiye Zhang
Abstract:
Artificial intelligence (AI) can potentially transform global health, but algorithmic bias can exacerbate social inequities and disparity. Trustworthy AI entails the intentional design to ensure equity and mitigate potential biases. To advance trustworthy AI in global health, we convened a workshop on Fairness in Machine Intelligence for Global Health (FairMI4GH). The event brought together a global mix of experts from various disciplines, community health practitioners, policymakers, and more. Topics covered included managing AI bias in socio-technical systems, AI's potential impacts on global health, and balancing data privacy with transparency. Panel discussions examined the cultural, political, and ethical dimensions of AI in global health. FairMI4GH aimed to stimulate dialogue, facilitate knowledge transfer, and spark innovative solutions. Drawing from NIST's AI Risk Management Framework, it provided suggestions for handling AI risks and biases. The need to mitigate data biases from the research design stage, adopt a human-centered approach, and advocate for AI transparency was recognized. Challenges such as updating legal frameworks, managing cross-border data sharing, and motivating developers to reduce bias were acknowledged. The event emphasized the necessity of diverse viewpoints and multi-dimensional dialogue for creating a fair and ethical AI framework for equitable global health.
Submitted 10 September, 2023;
originally announced September 2023.
-
The Busboy Problem: Efficient Tableware Decluttering Using Consolidation and Multi-Object Grasps
Authors:
Kishore Srinivas,
Shreya Ganti,
Rishi Parikh,
Ayah Ahmad,
Wisdom Agboh,
Mehmet Dogar,
Ken Goldberg
Abstract:
We present the "Busboy Problem": automating an efficient decluttering of cups, bowls, and silverware from a planar surface. As grasping and transporting individual items is highly inefficient, we propose policies to generate grasps for multiple items. We introduce the metric of Objects per Trip (OpT) carried by the robot to the collection bin to analyze the improvement seen as a result of our policies. In physical experiments with singulated items, we find that consolidation and multi-object grasps resulted in a 1.8x improvement in OpT, compared to methods without multi-object grasps. See https://sites.google.com/berkeley.edu/busboyproblem for code and supplemental materials.
Submitted 7 July, 2023;
originally announced July 2023.
-
Can Machines Garden? Systematically Comparing the AlphaGarden vs. Professional Horticulturalists
Authors:
Simeon Adebola,
Rishi Parikh,
Mark Presten,
Satvik Sharma,
Shrey Aeron,
Ananth Rao,
Sandeep Mukherjee,
Tomson Qu,
Christina Wistrom,
Eugen Solowjow,
Ken Goldberg
Abstract:
The AlphaGarden is an automated testbed for indoor polyculture farming which combines a first-order plant simulator, a gantry robot, a seed planting algorithm, plant phenotyping and tracking algorithms, irrigation sensors and algorithms, and custom pruning tools and algorithms. In this paper, we systematically compare the performance of the AlphaGarden to professional horticulturalists on the staff of the UC Berkeley Oxford Tract Greenhouse. The humans and the machine tend side-by-side polyculture gardens with the same seed arrangement. We compare performance in terms of canopy coverage, plant diversity, and water consumption. Results from two 60-day cycles suggest that the automated AlphaGarden performs comparably to professional horticulturalists in terms of coverage and diversity, and reduces water consumption by as much as 44%. Code, videos, and datasets are available at https://sites.google.com/berkeley.edu/systematiccomparison.
Submitted 29 June, 2023;
originally announced June 2023.
-
Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning
Authors:
Mustafa Safa Ozdayi,
Charith Peris,
Jack FitzGerald,
Christophe Dupuy,
Jimit Majmudar,
Haidar Khan,
Rahil Parikh,
Rahul Gupta
Abstract:
Large Language Models (LLMs) are known to memorize significant portions of their training data. Parts of this memorized content have been shown to be extractable by simply querying the model, which poses a privacy risk. We present a novel approach which uses prompt-tuning to control the extraction rates of memorized content in LLMs. We present two prompt training strategies to increase and decrease extraction rates, which correspond to an attack and a defense, respectively. We demonstrate the effectiveness of our techniques by using models from the GPT-Neo family on a public benchmark. For the 1.3B parameter GPT-Neo model, our attack yields a 9.3 percentage point increase in extraction rate compared to our baseline. Our defense can be tuned to achieve different privacy-utility trade-offs by a user-specified hyperparameter. We achieve an extraction rate reduction of up to 97.7% relative to our baseline, with a perplexity increase of 16.9%.
Submitted 19 May, 2023;
originally announced May 2023.
-
Automated Pruning of Polyculture Plants
Authors:
Mark Presten,
Rishi Parikh,
Shrey Aeron,
Sandeep Mukherjee,
Simeon Adebola,
Satvik Sharma,
Mark Theis,
Walter Teitelbaum,
Ken Goldberg
Abstract:
Polyculture farming has environmental advantages but requires substantially more pruning than monoculture farming. We present novel hardware and algorithms for automated pruning. Using an overhead camera to collect data from a physical scale garden testbed, the autonomous system utilizes a learned Plant Phenotyping convolutional neural network and a Bounding Disk Tracking algorithm to evaluate the individual plant distribution and estimate the state of the garden each day. From this garden state, AlphaGardenSim selects plants to autonomously prune. A trained neural network detects and targets specific prune points on the plant. Two custom-designed pruning tools, compatible with a FarmBot gantry system, are experimentally evaluated and execute autonomous cuts through controlled algorithms. We present results for four 60-day garden cycles. Results suggest the system can autonomously achieve 0.94 normalized plant diversity with pruning shears while maintaining an average canopy coverage of 0.84 by the end of the cycles. For code, videos, and datasets, see https://sites.google.com/berkeley.edu/pruningpolyculture.
Submitted 22 August, 2022;
originally announced August 2022.
-
Learning Switching Criteria for Sim2Real Transfer of Robotic Fabric Manipulation Policies
Authors:
Satvik Sharma,
Ellen Novoseller,
Vainavi Viswanath,
Zaynah Javed,
Rishi Parikh,
Ryan Hoque,
Ashwin Balakrishna,
Daniel S. Brown,
Ken Goldberg
Abstract:
Simulation-to-reality transfer has emerged as a popular and highly successful method to train robotic control policies for a wide variety of tasks. However, it is often challenging to determine when policies trained in simulation are ready to be transferred to the physical world. Deploying policies that have been trained with very little simulation data can result in unreliable and dangerous behaviors on physical hardware. On the other hand, excessive training in simulation can cause policies to overfit to the visual appearance and dynamics of the simulator. In this work, we study strategies to automatically determine when policies trained in simulation can be reliably transferred to a physical robot. We specifically study these ideas in the context of robotic fabric manipulation, in which successful sim2real transfer is especially challenging due to the difficulties of precisely modeling the dynamics and visual appearance of fabric. Results in a fabric smoothing task suggest that our switching criteria correlate well with performance in real. In particular, our confidence-based switching criteria achieve average final fabric coverage of 87.2-93.7% within 55-60% of the total training budget. See https://tinyurl.com/lsc-case for code and supplemental materials.
Submitted 2 July, 2022;
originally announced July 2022.
-
Impact of Acoustic Event Tagging on Scene Classification in a Multi-Task Learning Framework
Authors:
Rahil Parikh,
Harshavardhan Sundar,
Ming Sun,
Chao Wang,
Spyros Matsoukas
Abstract:
Acoustic events are sounds with well-defined spectro-temporal characteristics which can be associated with the physical objects generating them. Acoustic scenes are collections of such acoustic events in no specific temporal order. Given this natural linkage between events and scenes, a common belief is that the ability to classify events must help in the classification of scenes. This has led to several efforts attempting to do well on Acoustic Event Tagging (AET) and Acoustic Scene Classification (ASC) using a multi-task network. However, in these efforts, improvement in one task does not guarantee an improvement in the other, suggesting a tension between ASC and AET. It is unclear if improvements in AET translate to improvements in ASC. We explore this conundrum through an extensive empirical study and show that under certain conditions, using AET as an auxiliary task in the multi-task network consistently improves ASC performance. Additionally, ASC performance further improves with the AET data-set size and is not sensitive to the choice of events or the number of events in the AET data-set. We conclude that this improvement in ASC performance comes from the regularization effect of using AET and not from the network's improved ability to discern between acoustic events.
Submitted 27 June, 2022;
originally announced June 2022.
-
An Empirical Analysis on the Vulnerabilities of End-to-End Speech Segregation Models
Authors:
Rahil Parikh,
Gaspar Rochette,
Carol Espy-Wilson,
Shihab Shamma
Abstract:
End-to-end learning models have demonstrated a remarkable capability in performing speech segregation. Despite their wide scope of real-world applications, little is known about the mechanisms they employ to group and consequently segregate individual speakers. Knowing that harmonicity is a critical cue for these networks to group sources, in this work, we perform a thorough investigation on ConvTasnet and DPT-Net to analyze how they perform a harmonic analysis of the input mixture. We perform ablation studies where we apply low-pass, high-pass, and band-stop filters of varying pass-bands to empirically analyze the harmonics most critical for segregation. We also investigate how these networks decide which output channel to assign to an estimated source by introducing discontinuities in synthetic mixtures. We find that end-to-end networks are highly unstable, and perform poorly when confronted with deformations which are imperceptible to humans. Replacing the encoder in these networks with a spectrogram leads to lower overall performance, but much higher stability. This work helps us to understand what information these networks rely on for speech segregation, and exposes two sources of generalization errors. It also pinpoints the encoder as the part of the network responsible for these errors, allowing for a redesign with expert knowledge or transfer learning.
Submitted 19 June, 2022;
originally announced June 2022.
-
Canary Extraction in Natural Language Understanding Models
Authors:
Rahil Parikh,
Christophe Dupuy,
Rahul Gupta
Abstract:
Natural Language Understanding (NLU) models can be trained on sensitive information such as phone numbers, zip-codes etc. Recent literature has focused on Model Inversion Attacks (ModIvA) that can extract training data from model parameters. In this work, we present a version of such an attack by extracting canaries inserted in NLU training data. In the attack, an adversary with open-box access to the model reconstructs the canaries contained in the model's training set. We evaluate our approach by performing text completion on canaries and demonstrate that by using the prefix (non-sensitive) tokens of the canary, we can generate the full canary. As an example, our attack is able to reconstruct a four digit code in the training dataset of the NLU model with a probability of 0.5 in its best configuration. As countermeasures, we identify several defense mechanisms that, when combined, effectively eliminate the risk of ModIvA in our experiments.
Submitted 25 March, 2022;
originally announced March 2022.
-
Acoustic To Articulatory Speech Inversion Using Multi-Resolution Spectro-Temporal Representations Of Speech Signals
Authors:
Rahil Parikh,
Nadee Seneviratne,
Ganesh Sivaraman,
Shihab Shamma,
Carol Espy-Wilson
Abstract:
Multi-resolution spectro-temporal features of a speech signal represent how the brain perceives sounds by tuning cortical cells to different spectral and temporal modulations. These features produce a higher dimensional representation of the speech signals. The purpose of this paper is to evaluate how well the auditory cortex representation of speech signals contributes to estimating articulatory features of those corresponding signals. Since obtaining articulatory features from acoustic features of speech signals has been a challenging topic of interest for different speech communities, we investigate the possibility of using this multi-resolution representation of speech signals as acoustic features. We used the U. of Wisconsin X-ray Microbeam (XRMB) database of clean speech signals to train a feed-forward deep neural network (DNN) to estimate articulatory trajectories of six tract variables. The optimal set of multi-resolution spectro-temporal features to train the model was chosen using appropriate scale and rate vector parameters to obtain the best performing model. Experiments achieved a correlation of 0.675 with ground-truth tract variables. We compared the performance of this speech inversion system with prior experiments conducted using Mel Frequency Cepstral Coefficients (MFCCs).
Submitted 25 June, 2022; v1 submitted 11 March, 2022;
originally announced March 2022.
-
Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems
Authors:
Rahil Parikh,
Ilya Kavalerov,
Carol Espy-Wilson,
Shihab Shamma
Abstract:
Recent advancements in deep learning have led to drastic improvements in speech segregation models. Despite their success and growing applicability, few efforts have been made to analyze the underlying principles that these networks learn to perform segregation. Here we analyze the role of harmonicity on two state-of-the-art Deep Neural Networks (DNN)-based models- Conv-TasNet and DPT-Net. We evaluate their performance with mixtures of natural speech versus slightly manipulated inharmonic speech, where harmonics are slightly frequency jittered. We find that performance deteriorates significantly if one source is even slightly harmonically jittered, e.g., an imperceptible 3% harmonic jitter degrades performance of Conv-TasNet from 15.4 dB to 0.70 dB. Training the model on inharmonic speech does not remedy this sensitivity, instead resulting in worse performance on natural speech mixtures, making inharmonicity a powerful adversarial factor in DNN models. Furthermore, additional analyses reveal that DNN algorithms deviate markedly from biologically inspired algorithms that rely primarily on timing cues and not harmonicity to segregate speech.
Submitted 8 March, 2022;
originally announced March 2022.
-
AlphaGarden: Learning to Autonomously Tend a Polyculture Garden
Authors:
Mark Presten,
Yahav Avigal,
Mark Theis,
Satvik Sharma,
Rishi Parikh,
Shrey Aeron,
Sandeep Mukherjee,
Sebastian Oehme,
Simeon Adebola,
Walter Teitelbaum,
Varun Kamat,
Ken Goldberg
Abstract:
This paper presents AlphaGarden: an autonomous polyculture garden that prunes and irrigates living plants in a 1.5m x 3.0m physical testbed. AlphaGarden uses an overhead camera and sensors to track the plant distribution and soil moisture. We model individual plant growth and interplant dynamics to train a policy that chooses actions to maximize leaf coverage and diversity. For autonomous pruning, AlphaGarden uses two custom-designed pruning tools and a trained neural network to detect prune points. We present results for four 60-day garden cycles. Results suggest AlphaGarden can autonomously achieve 0.96 normalized diversity with pruning shears while maintaining an average canopy coverage of 0.86 during the peak of the cycle. Code, datasets, and supplemental material can be found at https://github.com/BerkeleyAutomation/AlphaGarden.
Submitted 22 August, 2022; v1 submitted 10 November, 2021;
originally announced November 2021.
-
Exploring Machine Teaching with Children
Authors:
Utkarsh Dwivedi,
Jaina Gandhi,
Raj Parikh,
Merijke Coenraad,
Elizabeth Bonsignore,
Hernisa Kacorri
Abstract:
Iteratively building and testing machine learning models can help children develop creativity, flexibility, and comfort with machine learning and artificial intelligence. We explore how children use machine teaching interfaces with a team of 14 children (aged 7-13 years) and adult co-designers. Children trained image classifiers and tested each other's models for robustness. Our study illuminates how children reason about ML concepts, offering these insights for designing machine teaching experiences for children: (i) ML metrics (e.g. confidence scores) should be visible for experimentation; (ii) ML activities should enable children to exchange models for promoting reflection and pattern recognition; and (iii) the interface should allow quick data inspection (e.g. images vs. gestures).
Submitted 27 September, 2021; v1 submitted 23 September, 2021;
originally announced September 2021.
-
Recent Advances in Video Question Answering: A Review of Datasets and Methods
Authors:
Devshree Patel,
Ratnam Parikh,
Yesha Shastri
Abstract:
Video Question Answering (VQA) is a recently emerging, challenging task in the field of Computer Vision. Several visual information retrieval techniques like Video Captioning/Description and Video-guided Machine Translation have preceded the task of VQA. VQA helps to retrieve temporal and spatial information from the video scenes and interpret it. In this survey, we review a number of methods and datasets for the task of VQA. To the best of our knowledge, no previous survey has been conducted for the VQA task.
Submitted 18 March, 2021; v1 submitted 14 January, 2021;
originally announced January 2021.
-
Comparative Study of Machine Learning Models and BERT on SQuAD
Authors:
Devshree Patel,
Param Raval,
Ratnam Parikh,
Yesha Shastri
Abstract:
This study aims to provide a comparative analysis of the performance of certain models popular in machine learning and the BERT model on the Stanford Question Answering Dataset (SQuAD). The analysis shows that the BERT model, which was once state-of-the-art on SQuAD, gives higher accuracy in comparison to other models. However, BERT requires a greater execution time even when only 100 samples are used. This shows that higher accuracy comes at the cost of more time invested in training. By contrast, for the preliminary machine learning models, execution time on the full data is lower but accuracy is compromised.
Submitted 22 May, 2020;
originally announced May 2020.
-
Building Resource Adaptive Software Systems (BRASS): Objectives and System Evaluation
Authors:
Jeffrey Hughes,
Cassandra Sparks,
Alley Stoughton,
Rinku Parikh,
Albert Reuther,
Suresh Jagannathan
Abstract:
As modern software systems continue inexorably to increase in complexity and capability, users have become accustomed to periodic cycles of updating and upgrading to avoid obsolescence -- if at some cost in terms of frustration. In the case of the U.S. military, having access to well-functioning software systems and underlying content is critical to national security, but updates are no less problematic than among civilian users and often demand considerable time and expense. To address these challenges, DARPA has announced a new four-year research project to investigate the fundamental computational and algorithmic requirements necessary for software systems and data to remain robust and functional in excess of 100 years. The Building Resource Adaptive Software Systems, or BRASS, program seeks to realize foundational advances in the design and implementation of long-lived software systems that can dynamically adapt to changes in the resources they depend upon and environments in which they operate. MIT Lincoln Laboratory will provide the test framework and evaluation of proposed software tools in support of this revolutionary vision.
Submitted 7 October, 2015;
originally announced October 2015.
-
Table of Content detection using Machine Learning
Authors:
Rachana Parikh,
Avani R. Vasant
Abstract:
Table of contents (TOC) detection has drawn attention recently because it plays an important role in the digitization of multipage documents. Books are generally multipage documents, so it becomes necessary to detect the table of contents page for easy navigation and to speed up retrieval of the desired data from the document. Table of contents pages follow different layouts and different ways of presenting the contents of the document, such as chapters, sections, and subsections. This paper introduces a new method to detect the table of contents using a machine learning technique with different features. The main aim of detecting table of contents pages is to structure the document according to its contents.
Submitted 6 June, 2013;
originally announced June 2013.
-
Relevance Sensitive Non-Monotonic Inference on Belief Sequences
Authors:
Samir Chopra,
Konstantinos Georgatos,
Rohit Parikh
Abstract:
We present a method for relevance sensitive non-monotonic inference from belief sequences which incorporates insights pertaining to prioritized inference and relevance sensitive, inconsistency tolerant belief revision.
Our model uses a finite, logically open sequence of propositional formulas as a representation for beliefs and defines a notion of inference from maxiconsistent subsets of formulas guided by two orderings: a temporal sequencing and an ordering based on relevance relations between the conclusion and formulas in the sequence. The relevance relations are ternary (using context as a parameter) as opposed to standard binary axiomatizations. The inference operation thus defined easily handles iterated revision by maintaining a revision history, blocks the derivation of inconsistent answers from a possibly inconsistent sequence and maintains the distinction between explicit and implicit beliefs. In doing so, it provides a finitely presented formalism and a plausible model of reasoning for automated agents.
Submitted 7 March, 2000;
originally announced March 2000.