-
Fine-grained activity recognition for assembly videos
Authors:
Jonathan D. Jones,
Cathryn Cortesa,
Amy Shelton,
Barbara Landau,
Sanjeev Khudanpur,
Gregory D. Hager
Abstract:
In this paper we address the task of recognizing assembly actions as a structure (e.g. a piece of furniture or a toy block tower) is built up from a set of primitive objects. Recognizing the full range of assembly actions requires perception at a level of spatial detail that has not been attempted in the action recognition literature to date. We extend the fine-grained activity recognition setting to address the task of assembly action recognition in its full generality by unifying assembly actions and kinematic structures within a single framework. We use this framework to develop a general method for recognizing assembly actions from observation sequences, along with observation features that take advantage of a spatial assembly's special structure. Finally, we evaluate our method empirically on two application-driven data sources: (1) an IKEA furniture-assembly dataset and (2) a block-building dataset. On the first, our system recognizes assembly actions with an average framewise accuracy of 70% and an average normalized edit distance of 10%. On the second, which requires fine-grained geometric reasoning to distinguish between assemblies, our system attains an average normalized edit distance of 23%, a relative improvement of 69% over prior work.
Submitted 2 December, 2020;
originally announced December 2020.
-
DASZL: Dynamic Action Signatures for Zero-shot Learning
Authors:
Tae Soo Kim,
Jonathan D. Jones,
Michael Peven,
Zihao Xiao,
Jin Bai,
Yi Zhang,
Weichao Qiu,
Alan Yuille,
Gregory D. Hager
Abstract:
There are many realistic applications of activity recognition where the set of potential activity descriptions is combinatorially large. This makes end-to-end supervised training of a recognition system impractical as no training set is practically able to encompass the entire label set. In this paper, we present an approach to fine-grained recognition that models activities as compositions of dynamic action signatures. This compositional approach allows us to reframe fine-grained recognition as zero-shot activity recognition, where a detector is composed "on the fly" from simple first-principles state machines supported by deep-learned components. We evaluate our method on the Olympic Sports and UCF101 datasets, where our model establishes a new state of the art under multiple experimental paradigms. We also extend this method to form a unique framework for zero-shot joint segmentation and classification of activities in video and demonstrate the first results in zero-shot decoding of complex action sequences on a widely-used surgical dataset. Lastly, we show that we can use off-the-shelf object detectors to recognize activities in completely de-novo settings with no additional training.
Submitted 17 November, 2020; v1 submitted 7 December, 2019;
originally announced December 2019.
-
Material-based Non-neural Analogues of Lateral Inhibition: A Multi-agent Approach
Authors:
Jeff Dale Jones
Abstract:
Lateral Inhibition (LI) phenomena occur in a wide range of sensory modalities and are most famously described in the human visual system. In LI the activity of a stimulated neuron is itself excited and suppresses the activity of its local neighbours via inhibitory connections, increasing the contrast between spatial environmental stimuli. Simple organisms, such as the single-celled slime mould Physarum polycephalum, possess no neural tissue yet, despite this, are known to exhibit complex computational behaviour. Could simple organisms such as slime mould approximate LI without recourse to neural tissue? We describe a model whereby LI can emerge without explicit inhibitory wiring, using only bulk transport effects. We use a multi-agent virtual material model of slime mould to reproduce the characteristic contrast amplification response of LI using excitation via attractant stimuli. Restoration of baseline activity occurs when the stimuli are removed. We also explore an opposite counterpart behaviour, Lateral Activation (LA), using repellent stimuli. These preliminary results suggest that simple organisms without neural tissue may approximate sensory contrast enhancement using alternative analogues of LI and suggest novel approaches towards generating collective contrast enhancement in distributed computing and robotic devices.
Submitted 24 November, 2015;
originally announced November 2015.