subscribe to arXiv mailings

BonnBot-I Plus: A Bio-diversity Aware Precise Weed Management Robotic Platform

Authors: Alireza Ahmadi, Michael Halstead, Claus Smitt, Chris McCool

Abstract: In this article, we focus on the critical tasks of plant protection in arable farms, addressing a modern challenge in agriculture: integrating ecological considerations into the operational strategy of precision weeding robots like \bbot. This article presents the recent advancements in weed management algorithms and the real-world performance of \bbot\ at the University of Bonn's Klein-Altendorf… ▽ More In this article, we focus on the critical tasks of plant protection in arable farms, addressing a modern challenge in agriculture: integrating ecological considerations into the operational strategy of precision weeding robots like \bbot. This article presents the recent advancements in weed management algorithms and the real-world performance of \bbot\ at the University of Bonn's Klein-Altendorf campus. We present a novel Rolling-view observation model for the BonnBot-Is weed monitoring section which leads to an average absolute weeding performance enhancement of $3.4\%$. Furthermore, for the first time, we show how precision weeding robots could consider bio-diversity-aware concerns in challenging weeding scenarios. We carried out comprehensive weeding experiments in sugar-beet fields, covering both weed-only and mixed crop-weed situations, and introduced a new dataset compatible with precision weeding. Our real-field experiments revealed that our weeding approach is capable of handling diverse weed distributions, with a minimal loss of only $11.66\%$ attributable to intervention planning and $14.7\%$ to vision system limitations highlighting required improvements of the vision system. △ Less

Submitted 4 July, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Journal ref: IEEE Robotics and Automation Letters 2024

arXiv:2309.05339 [pdf, other]

PAg-NeRF: Towards fast and efficient end-to-end panoptic 3D representations for agricultural robotics

Authors: Claus Smitt, Michael Halstead, Patrick Zimmer, Thomas Läbe, Esra Guclu, Cyrill Stachniss, Chris McCool

Abstract: Precise scene understanding is key for most robot monitoring and intervention tasks in agriculture. In this work we present PAg-NeRF which is a novel NeRF-based system that enables 3D panoptic scene understanding. Our representation is trained using an image sequence with noisy robot odometry poses and automatic panoptic predictions with inconsistent IDs between frames. Despite this noisy input, o… ▽ More Precise scene understanding is key for most robot monitoring and intervention tasks in agriculture. In this work we present PAg-NeRF which is a novel NeRF-based system that enables 3D panoptic scene understanding. Our representation is trained using an image sequence with noisy robot odometry poses and automatic panoptic predictions with inconsistent IDs between frames. Despite this noisy input, our system is able to output scene geometry, photo-realistic renders and 3D consistent panoptic representations with consistent instance IDs. We evaluate this novel system in a very challenging horticultural scenario and in doing so demonstrate an end-to-end trainable system that can make use of noisy robot poses rather than precise poses that have to be pre-calculated. Compared to a baseline approach the peak signal to noise ratio is improved from 21.34dB to 23.37dB while the panoptic quality improves from 56.65% to 70.08%. Furthermore, our approach is faster and can be tuned to improve inference time by more than a factor of 2 while being memory efficient with approximately 12 times fewer parameters. △ Less

Submitted 11 September, 2023; originally announced September 2023.

arXiv:2307.12588 [pdf, other]

BonnBot-I: A Precise Weed Management and Crop Monitoring Platform

Authors: Alireza Ahmadi, Michael Halstead, Chris McCool

Abstract: Cultivation and weeding are two of the primary tasks performed by farmers today. A recent challenge for weeding is the desire to reduce herbicide and pesticide treatments while maintaining crop quality and quantity. In this paper we introduce BonnBot-I a precise weed management platform which can also performs field monitoring. Driven by crop monitoring approaches which can accurately locate and c… ▽ More Cultivation and weeding are two of the primary tasks performed by farmers today. A recent challenge for weeding is the desire to reduce herbicide and pesticide treatments while maintaining crop quality and quantity. In this paper we introduce BonnBot-I a precise weed management platform which can also performs field monitoring. Driven by crop monitoring approaches which can accurately locate and classify plants (weed and crop) we further improve their performance by fusing the platform available GNSS and wheel odometry. This improves tracking accuracy of our crop monitoring approach from a normalized average error of 8.3% to 3.5%, evaluated on a new publicly available corn dataset. We also present a novel arrangement of weeding tools mounted on linear actuators evaluated in simulated environments. We replicate weed distributions from a real field, using the results from our monitoring approach, and show the validity of our work-space division techniques which require significantly less movement (a 50% reduction) to achieve similar results. Overall, BonnBot-I is a significant step forward in precise weed management with a novel method of selectively spraying and controlling weeds in an arable field △ Less

Submitted 24 July, 2023; originally announced July 2023.

Comments: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

arXiv:2303.08923 [pdf, other]

Panoptic Mapping with Fruit Completion and Pose Estimation for Horticultural Robots

Authors: Yue Pan, Federico Magistri, Thomas Läbe, Elias Marks, Claus Smitt, Chris McCool, Jens Behley, Cyrill Stachniss

Abstract: Monitoring plants and fruits at high resolution play a key role in the future of agriculture. Accurate 3D information can pave the way to a diverse number of robotic applications in agriculture ranging from autonomous harvesting to precise yield estimation. Obtaining such 3D information is non-trivial as agricultural environments are often repetitive and cluttered, and one has to account for the p… ▽ More Monitoring plants and fruits at high resolution play a key role in the future of agriculture. Accurate 3D information can pave the way to a diverse number of robotic applications in agriculture ranging from autonomous harvesting to precise yield estimation. Obtaining such 3D information is non-trivial as agricultural environments are often repetitive and cluttered, and one has to account for the partial observability of fruit and plants. In this paper, we address the problem of jointly estimating complete 3D shapes of fruit and their pose in a 3D multi-resolution map built by a mobile robot. To this end, we propose an online multi-resolution panoptic mapping system where regions of interest are represented with a higher resolution. We exploit data to learn a general fruit shape representation that we use at inference time together with an occlusion-aware differentiable rendering pipeline to complete partial fruit observations and estimate the 7 DoF pose of each fruit in the map. The experiments presented in this paper evaluated both in the controlled environment and in a commercial greenhouse, show that our novel algorithm yields higher completion and pose estimation accuracy than existing methods, with an improvement of 41% in completion accuracy and 52% in pose estimation accuracy while keeping a low inference time of 0.6s in average. Codes are available at: https://github.com/PRBonn/HortiMapping. △ Less

Submitted 22 August, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 8 pages, IROS 2023

arXiv:2303.08689 [pdf, other]

doi 10.1109/LRA.2023.3254451

Panoptic One-Click Segmentation: Applied to Agricultural Data

Authors: Patrick Zimmer, Michael Halstead, Chris McCool

Abstract: In weed control, precision agriculture can help to greatly reduce the use of herbicides, resulting in both economical and ecological benefits. A key element is the ability to locate and segment all the plants from image data. Modern instance segmentation techniques can achieve this, however, training such systems requires large amounts of hand-labelled data which is expensive and laborious to obta… ▽ More In weed control, precision agriculture can help to greatly reduce the use of herbicides, resulting in both economical and ecological benefits. A key element is the ability to locate and segment all the plants from image data. Modern instance segmentation techniques can achieve this, however, training such systems requires large amounts of hand-labelled data which is expensive and laborious to obtain. Weakly supervised training can help to greatly reduce labelling efforts and costs. We propose panoptic one-click segmentation, an efficient and accurate offline tool to produce pseudo-labels from click inputs which reduces labelling effort. Our approach jointly estimates the pixel-wise location of all N objects in the scene, compared to traditional approaches which iterate independently through all N objects; this greatly reduces training time. Using just 10% of the data to train our panoptic one-click segmentation approach yields 68.1% and 68.8% mean object intersection over union (IoU) on challenging sugar beet and corn image data respectively, providing comparable performance to traditional one-click approaches while being approximately 12 times faster to train. We demonstrate the applicability of our system by generating pseudo-labels from clicks on the remaining 90% of the data. These pseudo-labels are then used to train Mask R-CNN, in a semi-supervised manner, improving the absolute performance (of mean foreground IoU) by 9.4 and 7.9 points for sugar beet and corn data respectively. Finally, we show that our technique can recover missed clicks during annotation outlining a further benefit over traditional approaches. △ Less

Submitted 15 March, 2023; originally announced March 2023.

Comments: in IEEE Robotics and Automation Letters (2023)

arXiv:2206.13406 [pdf, other]

Explicitly incorporating spatial information to recurrent networks for agriculture

Authors: Claus Smitt, Michael Halstead, Alireza Ahmadi, Chris McCool

Abstract: In agriculture, the majority of vision systems perform still image classification. Yet, recent work has highlighted the potential of spatial and temporal cues as a rich source of information to improve the classification performance. In this paper, we propose novel approaches to explicitly capture both spatial and temporal information to improve the classification of deep convolutional neural netw… ▽ More In agriculture, the majority of vision systems perform still image classification. Yet, recent work has highlighted the potential of spatial and temporal cues as a rich source of information to improve the classification performance. In this paper, we propose novel approaches to explicitly capture both spatial and temporal information to improve the classification of deep convolutional neural networks. We leverage available RGB-D images and robot odometry to perform inter-frame feature map spatial registration. This information is then fused within recurrent deep learnt models, to improve their accuracy and robustness. We demonstrate that this can considerably improve the classification performance with our best performing spatial-temporal model (ST-Atte) achieving absolute performance improvements for intersection-over-union (IoU[%]) of 4.7 for crop-weed segmentation and 2.6 for fruit (sweet pepper) segmentation. Furthermore, we show that these approaches are robust to variable framerates and odometry errors, which are frequently observed in real-world applications. △ Less

Submitted 27 June, 2022; originally announced June 2022.

arXiv:2109.11936 [pdf, other]

Towards Autonomous Visual Navigation in Arable Fields

Authors: Alireza Ahmadi, Michael Halstead, Chris McCool

Abstract: Autonomous navigation of a robot in agricultural fields is essential for every task from crop monitoring to weed management and fertilizer application. Many current approaches rely on accurate GPS, however, such technology is expensive and also prone to failure (e.g. through lack of coverage). As such, autonomous navigation through sensors that can interpret their environment (such as cameras) is… ▽ More Autonomous navigation of a robot in agricultural fields is essential for every task from crop monitoring to weed management and fertilizer application. Many current approaches rely on accurate GPS, however, such technology is expensive and also prone to failure (e.g. through lack of coverage). As such, autonomous navigation through sensors that can interpret their environment (such as cameras) is important to achieve the goal of autonomy in agriculture. In this paper, we introduce a purely vision-based navigation scheme that is able to reliably guide the robot through row-crop fields without manual intervention. Independent of any global localization or mapping, this approach is able to accurately follow the crop-rows and switch between the rows, only using onboard cameras. With the help of a novel crop-row detection and a novel crop-row switching technique, our navigation scheme can be deployed in a wide range of fields with different canopy types in various growth stages with limited parameter tuning, creating a crop agnostic navigation approach. We have extensively evaluated our approach in three different fields under various illumination conditions using our agricultural robotic platform (BonnBot-I). For navigation, our approach is evaluated on five crop types and achieves an average navigation accuracy of 3.82cm relative to manual teleoperation. △ Less

Submitted 1 July, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

Journal ref: IROS 2022 Kyoto Japan

arXiv:2108.08114 [pdf, other]

Combining Local and Global Viewpoint Planning for Fruit Coverage

Authors: Tobias Zaenker, Chris Lehnert, Chris McCool, Maren Bennewitz

Abstract: Obtaining 3D sensor data of complete plants or plant parts (e.g., the crop or fruit) is difficult due to their complex structure and a high degree of occlusion. However, especially for the estimation of the position and size of fruits, it is necessary to avoid occlusions as much as possible and acquire sensor information of the relevant parts. Global viewpoint planners exist that suggest a series… ▽ More Obtaining 3D sensor data of complete plants or plant parts (e.g., the crop or fruit) is difficult due to their complex structure and a high degree of occlusion. However, especially for the estimation of the position and size of fruits, it is necessary to avoid occlusions as much as possible and acquire sensor information of the relevant parts. Global viewpoint planners exist that suggest a series of viewpoints to cover the regions of interest up to a certain degree, but they usually prioritize global coverage and do not emphasize the avoidance of local occlusions. On the other hand, there are approaches that aim at avoiding local occlusions, but they cannot be used in larger environments since they only reach a local maximum of coverage. In this paper, we therefore propose to combine a local, gradient-based method with global viewpoint planning to enable local occlusion avoidance while still being able to cover large areas. Our simulated experiments with a robotic arm equipped with a camera array as well as an RGB-D camera show that this combination leads to a significantly increased coverage of the regions of interest compared to just applying global coverage planning. △ Less

Submitted 18 August, 2021; originally announced August 2021.

Comments: 7 pages, 7 figures, accepted at ECMR 2021. arXiv admin note: text overlap with arXiv:2011.00275

arXiv:2106.10118 [pdf, other]

Virtual Temporal Samples for Recurrent Neural Networks: applied to semantic segmentation in agriculture

Authors: Alireza Ahmadi, Michael Halstead, Chris McCool

Abstract: This paper explores the potential for performing temporal semantic segmentation in the context of agricultural robotics without temporally labelled data. We achieve this by proposing to generate virtual temporal samples from labelled still images. By exploiting the relatively static scene and assuming that the robot (camera) moves we are able to generate virtually labelled temporal sequences with… ▽ More This paper explores the potential for performing temporal semantic segmentation in the context of agricultural robotics without temporally labelled data. We achieve this by proposing to generate virtual temporal samples from labelled still images. By exploiting the relatively static scene and assuming that the robot (camera) moves we are able to generate virtually labelled temporal sequences with no extra annotation effort. Normally, to train a recurrent neural network (RNN), labelled samples from a video (temporal) sequence are required which is laborious and has stymied work in this direction. By generating virtual temporal samples, we demonstrate that it is possible to train a lightweight RNN to perform semantic segmentation on two challenging agricultural datasets. Our results show that by training a temporal semantic segmenter using virtual samples we can increase the performance by an absolute amount of $4.6$ and $4.9$ on sweet pepper and sugar beet datasets, respectively. This indicates that our virtual data augmentation technique is able to accurately classify agricultural images temporally without the use of complicated synthetic data generation techniques nor with the overhead of labelling large amounts of temporal sequences. △ Less

Submitted 5 September, 2021; v1 submitted 18 June, 2021; originally announced June 2021.

arXiv:2011.00275 [pdf, other]

Viewpoint Planning for Fruit Size and Position Estimation

Authors: Tobias Zaenker, Claus Smitt, Chris McCool, Maren Bennewitz

Abstract: Modern agricultural applications require knowledge about the position and size of fruits on plants. However, occlusions from leaves typically make obtaining this information difficult. We present a novel viewpoint planning approach that builds up an octree of plants with labeled regions of interest (ROIs), i.e., fruits. Our method uses this octree to sample viewpoint candidates that increase the i… ▽ More Modern agricultural applications require knowledge about the position and size of fruits on plants. However, occlusions from leaves typically make obtaining this information difficult. We present a novel viewpoint planning approach that builds up an octree of plants with labeled regions of interest (ROIs), i.e., fruits. Our method uses this octree to sample viewpoint candidates that increase the information around the fruit regions and evaluates them using a heuristic utility function that takes into account the expected information gain. Our system automatically switches between ROI targeted sampling and exploration sampling, which considers general frontier voxels, depending on the estimated utility. When the plants have been sufficiently covered with the RGB-D sensor, our system clusters the ROI voxels and estimates the position and size of the detected fruits. We evaluated our approach in simulated scenarios and compared the resulting fruit estimations with the ground truth. The results demonstrate that our combined approach outperforms a sampling method that does not explicitly consider the ROIs to generate viewpoints in terms of the number of discovered ROI cells. Furthermore, we show the real-world applicability by testing our framework on a robotic arm equipped with an RGB-D camera installed on an automated pipe-rail trolley in a capsicum glasshouse. △ Less

Submitted 18 August, 2021; v1 submitted 31 October, 2020; originally announced November 2020.

Comments: 7 pages, 7 figures, accepted at IROS 2021

arXiv:2010.16272 [pdf, other]

PATHoBot: A Robot for Glasshouse Crop Phenotyping and Intervention

Authors: Claus Smitt, Michael Halstead, Tobias Zaenker, Maren Bennewitz, Chris McCool

Abstract: We present PATHoBot an autonomous crop surveying and intervention robot for glasshouse environments. The aim of this platform is to autonomously gather high quality data and also estimate key phenotypic parameters. To achieve this we retro-fit an off-the-shelf pipe-rail trolley with an array of multi-modal cameras, navigation sensors and a robotic arm for close surveying tasks and intervention. In… ▽ More We present PATHoBot an autonomous crop surveying and intervention robot for glasshouse environments. The aim of this platform is to autonomously gather high quality data and also estimate key phenotypic parameters. To achieve this we retro-fit an off-the-shelf pipe-rail trolley with an array of multi-modal cameras, navigation sensors and a robotic arm for close surveying tasks and intervention. In this paper we describe PATHoBot design choices made to ensure proper operation in a commercial glasshouse environment. As a surveying platform we collect a number of datasets which include both sweet pepper and tomatoes. We show how PATHoBot enables novel surveillance approaches by first improving our previous work on fruit counting by incorporating wheel odometry and depth information. We find that by introducing re-projection and depth information we are able to achieve an absolute improvement of 20 points over the baseline technique in an "in the wild" situation. Finally, we present a 3D mapping case study, further showcasing PATHoBot's crop surveying capabilities. △ Less

Submitted 26 March, 2021; v1 submitted 30 October, 2020; originally announced October 2020.

arXiv:1810.11920 [pdf, other]

A Sweet Pepper Harvesting Robot for Protected Cropping Environments

Authors: Chris Lehnert, Chris McCool, Inkyu Sa, Tristan Perez

Abstract: Using robots to harvest sweet peppers in protected cropping environments has remained unsolved despite considerable effort by the research community over several decades. In this paper, we present the robotic harvester, Harvey, designed for sweet peppers in protected cropping environments that achieved a 76.5% success rate (within a modified scenario) which improves upon our prior work which achie… ▽ More Using robots to harvest sweet peppers in protected cropping environments has remained unsolved despite considerable effort by the research community over several decades. In this paper, we present the robotic harvester, Harvey, designed for sweet peppers in protected cropping environments that achieved a 76.5% success rate (within a modified scenario) which improves upon our prior work which achieved 58% and related sweet pepper harvesting work which achieved 33\%. This improvement was primarily achieved through the introduction of a novel peduncle segmentation system using an efficient deep convolutional neural network, in conjunction with 3D post-filtering to detect the critical cutting location. We benchmark the peduncle segmentation against prior art demonstrating a considerable improvement in performance with an F_1 score of 0.564 compared to 0.302. The robotic harvester uses a perception pipeline to detect a target sweet pepper and an appropriate grasp and cutting pose used to determine the trajectory of a multi-modal harvesting tool to grasp the sweet pepper and cut it from the plant. A novel decoupling mechanism enables the gripping and cutting operations to be performed independently. We perform an in-depth analysis of the full robotic harvesting system to highlight bottlenecks and failure points that future work could address. △ Less

Submitted 28 October, 2018; originally announced October 2018.

arXiv:1809.07896 [pdf, other]

3D Move to See: Multi-perspective visual servoing for improving object views with semantic segmentation

Authors: Chris Lehnert, Dorian Tsai, Anders Eriksson, Chris McCool

Abstract: In this paper, we present a new approach to visual servoing for robotics, referred to as 3D Move to See (3DMTS), based on the principle of finding the next best view using a 3D camera array and a robotic manipulator to obtain multiple samples of the scene from different perspectives. The method uses semantic vision and an objective function applied to each perspective to sample a gradient represen… ▽ More In this paper, we present a new approach to visual servoing for robotics, referred to as 3D Move to See (3DMTS), based on the principle of finding the next best view using a 3D camera array and a robotic manipulator to obtain multiple samples of the scene from different perspectives. The method uses semantic vision and an objective function applied to each perspective to sample a gradient representing the direction of the next best view. The method is demonstrated within simulation and on a real robotic platform containing a custom 3D camera array for the challenging scenario of robotic harvesting in a highly occluded and unstructured environment. It was shown on a real robotic platform that by moving the end effector using the gradient of an objective function leads to a locally optimal view of the object of interest, even amongst occlusions. The overall performance of the 3DMTS method obtained a mean increase in target size by 29.3% compared to a baseline method using a single RGB-D camera, which obtained 9.17%. The results demonstrate qualitatively and quantitatively that the 3DMTS method performed better in most scenarios, and yielded three times the target size compared to the baseline method. The increased target size in the final view will improve the detection of key features of the object of interest for further manipulation, such as grasping and harvesting. △ Less

Submitted 20 September, 2018; originally announced September 2018.

arXiv:1801.08613 [pdf, other]

doi 10.1016/j.compag.2018.02.023

A Rapidly Deployable Classification System using Visual Data for the Application of Precision Weed Management

Authors: David Hall, Feras Dayoub, Tristan Perez, Chris McCool

Abstract: In this work we demonstrate a rapidly deployable weed classification system that uses visual data to enable autonomous precision weeding without making prior assumptions about which weed species are present in a given field. Previous work in this area relies on having prior knowledge of the weed species present in the field. This assumption cannot always hold true for every field, and thus limits… ▽ More In this work we demonstrate a rapidly deployable weed classification system that uses visual data to enable autonomous precision weeding without making prior assumptions about which weed species are present in a given field. Previous work in this area relies on having prior knowledge of the weed species present in the field. This assumption cannot always hold true for every field, and thus limits the use of weed classification systems based on this assumption. In this work, we obviate this assumption and introduce a rapidly deployable approach able to operate on any field without any weed species assumptions prior to deployment. We present a three stage pipeline for the implementation of our weed classification system consisting of initial field surveillance, offline processing and selective labelling, and automated precision weeding. The key characteristic of our approach is the combination of plant clustering and selective labelling which is what enables our system to operate without prior weed species knowledge. Testing using field data we are able to label 12.3 times fewer images than traditional full labelling whilst reducing classification accuracy by only 14%. △ Less

Submitted 26 April, 2018; v1 submitted 25 January, 2018; originally announced January 2018.

Comments: 36 pages, 14 figures, published Computers and Electronics in Agriculture Vol. 148

Journal ref: D. Hall, F. Dayoub, T. Perez, and C. McCool, "A Rapidly Deployable Classification System using Visual Data for the Application of Precision Weed Management," Computers and Electronics in Agriculture, Vol. 148, pp. 107-120, May 2018

arXiv:1801.05560 [pdf, other]

Fruit Quantity and Quality Estimation using a Robotic Vision System

Authors: M. Halstead, C. McCool, S. Denman, T. Perez, C. Fookes

Abstract: Accurate localisation of crop remains highly challenging in unstructured environments such as farms. Many of the developed systems still rely on the use of hand selected features for crop identification and often neglect the estimation of crop quantity and quality, which is key to assigning labor during farming processes. To alleviate these limitations we present a robotic vision system that can a… ▽ More Accurate localisation of crop remains highly challenging in unstructured environments such as farms. Many of the developed systems still rely on the use of hand selected features for crop identification and often neglect the estimation of crop quantity and quality, which is key to assigning labor during farming processes. To alleviate these limitations we present a robotic vision system that can accurately estimate the quantity and quality of sweet pepper (Capsicum annuum L), a key horticultural crop. This system consists of three parts: detection, quality estimation, and tracking. Efficient detection is achieved using the FasterRCNN framework. Quality is then estimated in the same framework by learning a parallel layer which we show experimentally results in superior performance than treating quality as extra classes in the traditional Faster-RCNN framework. Evaluation of these two techniques outlines the improved performance of the parallel layer, where we achieve an F1 score of 77.3 for the parallel technique yet only 72.5 for the best scoring (red) of the multi-class implementation. To track the crop we present a tracking via detection approach, which uses the FasterRCNN with parallel layers, that is also a vision-only solution. This approach is cheap to implement as it only requires a camera and in experiments across 2 days we show that our proposed system can accurately estimate the number of sweet pepper present, within 4.1% of the ground truth. △ Less

Submitted 17 January, 2018; originally announced January 2018.

arXiv:1709.10275 [pdf, other]

In-Field Peduncle Detection of Sweet Peppers for Robotic Harvesting: a comparative study

Authors: Chris Lehnert, Chris McCool, Tristan Perez

Abstract: Robotic harvesting of crops has the potential to disrupt current agricultural practices. A key element to enabling robotic harvesting is to safely remove the crop from the plant which often involves locating and cutting the peduncle, the part of the crop that attaches it to the main stem of the plant. In this paper we present a comparative study of two methods for performing peduncle detection.… ▽ More Robotic harvesting of crops has the potential to disrupt current agricultural practices. A key element to enabling robotic harvesting is to safely remove the crop from the plant which often involves locating and cutting the peduncle, the part of the crop that attaches it to the main stem of the plant. In this paper we present a comparative study of two methods for performing peduncle detection. The first method is based on classic colour and geometric features obtained from the scene with a support vector machine classifier, referred to as PFH-SVM. The second method is an efficient deep neural network approach, MiniInception, that is able to be deployed on a robotic platform. In both cases we employ a secondary filtering process that enforces reasonable assumptions about the crop structure, such as the proximity of the peduncle to the crop. Our tests are conducted on Harvey, a sweet pepper harvesting robot, and is evaluated in a greenhouse using two varieties of sweet pepper, Ducati and Mercuno. We demonstrate that the MiniInception method achieves impressive accuracy and considerably outperforms the PFH-SVM approach achieving an F1 score of 0.564 and 0.302 respectively. △ Less

Submitted 29 September, 2017; originally announced September 2017.

Comments: Submitted to International Conference on Robotics and Automation 2018, under review

arXiv:1706.06203 [pdf]

Lessons Learnt from Field Trials of a Robotic Sweet Pepper Harvester

Authors: Christopher Lehnert, Christopher McCool, Tristan Perez

Abstract: In this paper, we present the lessons learnt during the development of a new robotic harvester (Harvey) that can autonomously harvest sweet pepper (capsicum) in protected cropping environments. Robotic harvesting offers an attractive potential solution to reducing labour costs while enabling more regular and selective harvesting, optimising crop quality, scheduling and therefore profit. Our approa… ▽ More In this paper, we present the lessons learnt during the development of a new robotic harvester (Harvey) that can autonomously harvest sweet pepper (capsicum) in protected cropping environments. Robotic harvesting offers an attractive potential solution to reducing labour costs while enabling more regular and selective harvesting, optimising crop quality, scheduling and therefore profit. Our approach combines effective vision algorithms with a novel end-effector design to enable successful harvesting of sweet peppers. We demonstrate a simple and effective vision-based algorithm for crop detection, a grasp selection method, and a novel end-effector design for harvesting. To reduce the complexity of motion planning and to minimise occlusions we focus on picking sweet peppers in a protected cropping environment where plants are grown on planar trellis structures. Initial field trials in protected cropping environments, with two cultivars, demonstrate the efficacy of this approach. The results show that the robot harvester can successfully detect, grasp, and detach crop from the plant within a real protected cropping system. The novel contributions of this work have resulted in significant and encouraging improvements in sweet pepper picking success rates compared with the state-of-the-art. Future work will look at detecting sweet pepper peduncles and improving the total harvesting cycle time for each sweet pepper. The methods presented in this paper provide steps towards the goal of fully autonomous and reliable crop picking systems that will revolutionise the horticulture industry by reducing labour costs, maximising the quality of produce, and ultimately improving the sustainability of farming enterprises. △ Less

Submitted 19 June, 2017; originally announced June 2017.

Comments: Preprint for Asian-Australasian Conference on Precision Agriculture

arXiv:1706.02023 [pdf, other]

doi 10.1109/LRA.2017.2655622

Autonomous Sweet Pepper Harvesting for Protected Cropping Systems

Authors: Chris Lehnert, Andrew English, Chris McCool, Adam Tow, Tristan Perez

Abstract: In this letter, we present a new robotic harvester (Harvey) that can autonomously harvest sweet pepper in protected cropping environments. Our approach combines effective vision algorithms with a novel end-effector design to enable successful harvesting of sweet peppers. Initial field trials in protected cropping environments, with two cultivar, demonstrate the efficacy of this approach achieving… ▽ More In this letter, we present a new robotic harvester (Harvey) that can autonomously harvest sweet pepper in protected cropping environments. Our approach combines effective vision algorithms with a novel end-effector design to enable successful harvesting of sweet peppers. Initial field trials in protected cropping environments, with two cultivar, demonstrate the efficacy of this approach achieving a 46% success rate for unmodified crop, and 58% for modified crop. Furthermore, for the more favourable cultivar we were also able to detach 90% of sweet peppers, indicating that improvements in the grasping success rate would result in greatly improved harvesting performance. △ Less

Submitted 6 June, 2017; originally announced June 2017.

Journal ref: IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 872-879, April 2017. doi: 10.1109/LRA.2017.2655622

arXiv:1702.01247 [pdf, other]

Towards Unsupervised Weed Scouting for Agricultural Robotics

Authors: David Hall, Feras Dayoub, Jason Kulk, Chris McCool

Abstract: Weed scouting is an important part of modern integrated weed management but can be time consuming and sparse when performed manually. Automated weed scouting and weed destruction has typically been performed using classification systems able to classify a set group of species known a priori. This greatly limits deployability as classification systems must be retrained for any field with a differen… ▽ More Weed scouting is an important part of modern integrated weed management but can be time consuming and sparse when performed manually. Automated weed scouting and weed destruction has typically been performed using classification systems able to classify a set group of species known a priori. This greatly limits deployability as classification systems must be retrained for any field with a different set of weed species present within them. In order to overcome this limitation, this paper works towards developing a clustering approach to weed scouting which can be utilized in any field without the need for prior species knowledge. We demonstrate our system using challenging data collected in the field from an agricultural robotics platform. We show that considerable improvements can be made by (i) learning low-dimensional (bottleneck) features using a deep convolutional neural network to represent plants in general and (ii) tying views of the same area (plant) together. Deploying this algorithm on in-field data collected by AgBotII, we are able to successfully cluster cotton plants from grasses without prior knowledge or training for the specific plants in the field. △ Less

Submitted 25 February, 2017; v1 submitted 4 February, 2017; originally announced February 2017.

Comments: to appear in the proceedings of the IEEE International Conference on Robotics and Automation ICRA2017

arXiv:1701.08608 [pdf, other]

doi 10.1109/LRA.2017.2651952

Peduncle Detection of Sweet Pepper for Autonomous Crop Harvesting - Combined Colour and 3D Information

Authors: Inkyu Sa, Chris Lehnert, Andrew English, Chris McCool, Feras Dayoub, Ben Upcroft, Tristan Perez

Abstract: This paper presents a 3D visual detection method for the challenging task of detecting peduncles of sweet peppers (Capsicum annuum) in the field. Cutting the peduncle cleanly is one of the most difficult stages of the harvesting process, where the peduncle is the part of the crop that attaches it to the main stem of the plant. Accurate peduncle detection in 3D space is therefore a vital step in re… ▽ More This paper presents a 3D visual detection method for the challenging task of detecting peduncles of sweet peppers (Capsicum annuum) in the field. Cutting the peduncle cleanly is one of the most difficult stages of the harvesting process, where the peduncle is the part of the crop that attaches it to the main stem of the plant. Accurate peduncle detection in 3D space is therefore a vital step in reliable autonomous harvesting of sweet peppers, as this can lead to precise cutting while avoiding damage to the surrounding plant. This paper makes use of both colour and geometry information acquired from an RGB-D sensor and utilises a supervised-learning approach for the peduncle detection task. The performance of the proposed method is demonstrated and evaluated using qualitative and quantitative results (the Area-Under-the-Curve (AUC) of the detection precision-recall curve). We are able to achieve an AUC of 0.71 for peduncle detection on field-grown sweet peppers. We release a set of manually annotated 3D sweet pepper and peduncle images to assist the research community in performing further research on this topic. △ Less

Submitted 30 January, 2017; originally announced January 2017.

Comments: 8 pages, 14 figures, Robotics and Automation Letters

arXiv:1609.05258 [pdf, other]

The ACRV Picking Benchmark (APB): A Robotic Shelf Picking Benchmark to Foster Reproducible Research

Authors: Jürgen Leitner, Adam W. Tow, Jake E. Dean, Niko Suenderhauf, Joseph W. Durham, Matthew Cooper, Markus Eich, Christopher Lehnert, Ruben Mangels, Christopher McCool, Peter Kujala, Lachlan Nicholson, Trung Pham, James Sergeant, Liao Wu, Fangyi Zhang, Ben Upcroft, Peter Corke

Abstract: Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress. They make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of contestants, and the test conditions are very difficul… ▽ More Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress. They make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of contestants, and the test conditions are very difficult to replicate after the main event. We present a new physical benchmark challenge for robotic picking: the ACRV Picking Benchmark (APB). Designed to be reproducible, it consists of a set of 42 common objects, a widely available shelf, and exact guidelines for object arrangement using stencils. A well-defined evaluation protocol enables the comparison of \emph{complete} robotic systems -- including perception and manipulation -- instead of sub-systems only. Our paper also describes and reports results achieved by an open baseline system based on a Baxter robot. △ Less

Submitted 14 December, 2016; v1 submitted 16 September, 2016; originally announced September 2016.

Comments: 8 pages, submitted to RA:Letters

arXiv:1608.00486 [pdf, other]

doi 10.1109/DICTA.2016.7797039

Exploiting Temporal Information for DCNN-based Fine-Grained Object Classification

Authors: ZongYuan Ge, Chris McCool, Conrad Sanderson, Peng Wang, Lingqiao Liu, Ian Reid, Peter Corke

Abstract: Fine-grained classification is a relatively new field that has concentrated on using information from a single image, while ignoring the enormous potential of using video data to improve classification. In this work we present the novel task of video-based fine-grained object classification, propose a corresponding new video dataset, and perform a systematic study of several recent deep convolutio… ▽ More Fine-grained classification is a relatively new field that has concentrated on using information from a single image, while ignoring the enormous potential of using video data to improve classification. In this work we present the novel task of video-based fine-grained object classification, propose a corresponding new video dataset, and perform a systematic study of several recent deep convolutional neural network (DCNN) based approaches, which we specifically adapt to the task. We evaluate three-dimensional DCNNs, two-stream DCNNs, and bilinear DCNNs. Two forms of the two-stream approach are used, where spatial and temporal data from two independent DCNNs are fused either via early fusion (combination of the fully-connected layers) and late fusion (concatenation of the softmax outputs of the DCNNs). For bilinear DCNNs, information from the convolutional layers of the spatial and temporal DCNNs is combined via local co-occurrences. We then fuse the bilinear DCNN and early fusion of the two-stream approach to combine the spatial and temporal information at the local and global level (Spatio-Temporal Co-occurrence). Using the new and challenging video dataset of birds, classification performance is improved from 23.1% (using single images) to 41.1% when using the Spatio-Temporal Co-occurrence system. Incorporating automatically detected bounding box location further improves the classification accuracy to 53.6%. △ Less

Submitted 24 October, 2016; v1 submitted 1 August, 2016; originally announced August 2016.

Comments: International Conference on Digital Image Computing: Techniques and Applications, 2016

ACM Class: I.2.6, I.4, I.5

arXiv:1602.01601 [pdf, other]

doi 10.1007/978-3-319-42996-0_10

Joint Recognition and Segmentation of Actions via Probabilistic Integration of Spatio-Temporal Fisher Vectors

Authors: Johanna Carvajal, Chris McCool, Brian Lovell, Conrad Sanderson

Abstract: We propose a hierarchical approach to multi-action recognition that performs joint classification and segmentation. A given video (containing several consecutive actions) is processed via a sequence of overlapping temporal windows. Each frame in a temporal window is represented through selective low-level spatio-temporal features which efficiently capture relevant local dynamics. Features from eac… ▽ More We propose a hierarchical approach to multi-action recognition that performs joint classification and segmentation. A given video (containing several consecutive actions) is processed via a sequence of overlapping temporal windows. Each frame in a temporal window is represented through selective low-level spatio-temporal features which efficiently capture relevant local dynamics. Features from each window are represented as a Fisher vector, which captures first and second order statistics. Instead of directly classifying each Fisher vector, it is converted into a vector of class probabilities. The final classification decision for each frame is then obtained by integrating the class probabilities at the frame level, which exploits the overlapping of the temporal windows. Experiments were performed on two datasets: s-KTH (a stitched version of the KTH dataset to simulate multi-actions), and the challenging CMU-MMAC dataset. On s-KTH, the proposed approach achieves an accuracy of 85.0%, significantly outperforming two recent approaches based on GMMs and HMMs which obtained 78.3% and 71.2%, respectively. On CMU-MMAC, the proposed approach achieves an accuracy of 40.9%, outperforming the GMM and HMM approaches which obtained 33.7% and 38.4%, respectively. Furthermore, the proposed system is on average 40 times faster than the GMM based approach. △ Less

Submitted 4 October, 2016; v1 submitted 4 February, 2016; originally announced February 2016.

ACM Class: I.2.10; I.4; I.5; I.5.4

Journal ref: Lecture Notes in Computer Science (LNCS), Vol. 9794, pp. 115-127, 2016

arXiv:1602.01599 [pdf, other]

doi 10.1007/978-3-319-42996-0_8

Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions

Authors: Johanna Carvajal, Arnold Wiliem, Chris McCool, Brian Lovell, Conrad Sanderson

Abstract: We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite matrices and linear subspaces. For both categories we use their corresponding nearest neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against tr… ▽ More We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite matrices and linear subspaces. For both categories we use their corresponding nearest neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against traditional action recognition techniques based on Gaussian mixture models and Fisher vectors (FVs). We evaluate these action recognition techniques under ideal conditions, as well as their sensitivity in more challenging conditions (variations in scale and translation). Despite recent advancements for handling manifolds, manifold based techniques obtain the lowest performance and their kernel representations are more unstable in the presence of challenging conditions. The FV approach obtains the highest accuracy under ideal conditions. Moreover, FV best deals with moderate scale and translation changes. △ Less

Submitted 4 October, 2016; v1 submitted 4 February, 2016; originally announced February 2016.

ACM Class: I.4; I.5; I.5.4

Journal ref: Lecture Notes in Computer Science (LNCS), Vol. 9794, pp. 88-100, 2016

arXiv:1511.09209 [pdf, other]

Fine-Grained Classification via Mixture of Deep Convolutional Neural Networks

Authors: ZongYuan Ge, Alex Bewley, Christopher McCool, Ben Upcroft, Peter Corke, Conrad Sanderson

Abstract: We present a novel deep convolutional neural network (DCNN) system for fine-grained image classification, called a mixture of DCNNs (MixDCNN). The fine-grained image classification problem is characterised by large intra-class variations and small inter-class variations. To overcome these problems our proposed MixDCNN system partitions images into K subsets of similar images and learns an expert D… ▽ More We present a novel deep convolutional neural network (DCNN) system for fine-grained image classification, called a mixture of DCNNs (MixDCNN). The fine-grained image classification problem is characterised by large intra-class variations and small inter-class variations. To overcome these problems our proposed MixDCNN system partitions images into K subsets of similar images and learns an expert DCNN for each subset. The output from each of the K DCNNs is combined to form a single classification decision. In contrast to previous techniques, we provide a formulation to perform joint end-to-end training of the K DCNNs simultaneously. Extensive experiments, on three datasets using two network structures (AlexNet and GoogLeNet), show that the proposed MixDCNN system consistently outperforms other methods. It provides a relative improvement of 12.7% and achieves state-of-the-art results on two datasets. △ Less

Submitted 30 November, 2015; originally announced November 2015.

arXiv:1505.02269 [pdf, other]

Subset Feature Learning for Fine-Grained Category Classification

Authors: Zongyuan Ge, Christopher Mccool, Conrad Sanderson, Peter Corke

Abstract: Fine-grained categorisation has been a challenging problem due to small inter-class variation, large intra-class variation and low number of training images. We propose a learning system which first clusters visually similar classes and then learns deep convolutional neural network features specific to each subset. Experiments on the popular fine-grained Caltech-UCSD bird dataset show that the pro… ▽ More Fine-grained categorisation has been a challenging problem due to small inter-class variation, large intra-class variation and low number of training images. We propose a learning system which first clusters visually similar classes and then learns deep convolutional neural network features specific to each subset. Experiments on the popular fine-grained Caltech-UCSD bird dataset show that the proposed method outperforms recent fine-grained categorisation methods under the most difficult setting: no bounding boxes are presented at test time. It achieves a mean accuracy of 77.5%, compared to the previous best performance of 73.2%. We also show that progressive transfer learning allows us to first learn domain-generic features (for bird classification) which can then be adapted to specific set of bird classes, yielding improvements in accuracy. △ Less

Submitted 9 May, 2015; originally announced May 2015.

arXiv:1502.07802 [pdf, other]

Modelling Local Deep Convolutional Neural Network Features to Improve Fine-Grained Image Classification

Authors: ZongYuan Ge, Chris McCool, Conrad Sanderson, Peter Corke

Abstract: We propose a local modelling approach using deep convolutional neural networks (CNNs) for fine-grained image classification. Recently, deep CNNs trained from large datasets have considerably improved the performance of object recognition. However, to date there has been limited work using these deep CNNs as local feature extractors. This partly stems from CNNs having internal representations which… ▽ More We propose a local modelling approach using deep convolutional neural networks (CNNs) for fine-grained image classification. Recently, deep CNNs trained from large datasets have considerably improved the performance of object recognition. However, to date there has been limited work using these deep CNNs as local feature extractors. This partly stems from CNNs having internal representations which are high dimensional, thereby making such representations difficult to model using stochastic models. To overcome this issue, we propose to reduce the dimensionality of one of the internal fully connected layers, in conjunction with layer-restricted retraining to avoid retraining the entire network. The distribution of low-dimensional features obtained from the modified layer is then modelled using a Gaussian mixture model. Comparative experiments show that considerable performance improvements can be achieved on the challenging Fish and UEC FOOD-100 datasets. △ Less

Submitted 26 February, 2015; originally announced February 2015.

Comments: 5 pages, three figures

arXiv:1502.01782 [pdf, other]

doi 10.1145/2689746.2689748

Multi-Action Recognition via Stochastic Modelling of Optical Flow and Gradients

Authors: Johanna Carvajal, Conrad Sanderson, Chris McCool, Brian C. Lovell

Abstract: In this paper we propose a novel approach to multi-action recognition that performs joint segmentation and classification. This approach models each action using a Gaussian mixture using robust low-dimensional action features. Segmentation is achieved by performing classification on overlapping temporal windows, which are then merged to produce the final result. This approach is considerably less… ▽ More In this paper we propose a novel approach to multi-action recognition that performs joint segmentation and classification. This approach models each action using a Gaussian mixture using robust low-dimensional action features. Segmentation is achieved by performing classification on overlapping temporal windows, which are then merged to produce the final result. This approach is considerably less complicated than previous methods which use dynamic programming or computationally expensive hidden Markov models (HMMs). Initial experiments on a stitched version of the KTH dataset show that the proposed approach achieves an accuracy of 78.3%, outperforming a recent HMM-based approach which obtained 71.2%. △ Less

Submitted 5 February, 2015; originally announced February 2015.

ACM Class: I.4.6; I.4.7; I.4.8; I.5.1; I.5.4

Journal ref: Workshop on Machine Learning for Sensory Data Analysis (MLSDA), pp. 19-24, 2014

arXiv:1408.2313 [pdf, other]

doi 10.1109/DICTA.2015.7371239

Bags of Affine Subspaces for Robust Object Tracking

Authors: Sareh Shirazi, Conrad Sanderson, Chris McCool, Mehrtash T. Harandi

Abstract: We propose an adaptive tracking algorithm where the object is modelled as a continuously updated bag of affine subspaces, with each subspace constructed from the object's appearance over several consecutive frames. In contrast to linear subspaces, affine subspaces explicitly model the origin of subspaces. Furthermore, instead of using a brittle point-to-subspace distance during the search for the… ▽ More We propose an adaptive tracking algorithm where the object is modelled as a continuously updated bag of affine subspaces, with each subspace constructed from the object's appearance over several consecutive frames. In contrast to linear subspaces, affine subspaces explicitly model the origin of subspaces. Furthermore, instead of using a brittle point-to-subspace distance during the search for the object in a new frame, we propose to use a subspace-to-subspace distance by representing candidate image areas also as affine subspaces. Distances between subspaces are then obtained by exploiting the non-Euclidean geometry of Grassmann manifolds. Experiments on challenging videos (containing object occlusions, deformations, as well as variations in pose and illumination) indicate that the proposed method achieves higher tracking accuracy than several recent discriminative trackers. △ Less

Submitted 5 February, 2016; v1 submitted 11 August, 2014; originally announced August 2014.

Comments: in International Conference on Digital Image Computing: Techniques and Applications, 2015

MSC Class: 14M15; 54B05 ACM Class: I.2.10; I.4.8; I.5.4

arXiv:1403.0315 [pdf, ps, other]

doi 10.1109/WACV.2014.6836025

Summarisation of Short-Term and Long-Term Videos using Texture and Colour

Authors: Johanna Carvajal, Chris McCool, Conrad Sanderson

Abstract: We present a novel approach to video summarisation that makes use of a Bag-of-visual-Textures (BoT) approach. Two systems are proposed, one based solely on the BoT approach and another which exploits both colour information and BoT features. On 50 short-term videos from the Open Video Project we show that our BoT and fusion systems both achieve state-of-the-art performance, obtaining an average F-… ▽ More We present a novel approach to video summarisation that makes use of a Bag-of-visual-Textures (BoT) approach. Two systems are proposed, one based solely on the BoT approach and another which exploits both colour information and BoT features. On 50 short-term videos from the Open Video Project we show that our BoT and fusion systems both achieve state-of-the-art performance, obtaining an average F-measure of 0.83 and 0.86 respectively, a relative improvement of 9% and 13% when compared to the previous state-of-the-art. When applied to a new underwater surveillance dataset containing 33 long-term videos, the proposed system reduces the amount of footage by a factor of 27, with only minor degradation in the information content. This order of magnitude reduction in video data represents significant savings in terms of time and potential labour cost when manually reviewing such footage. △ Less

Submitted 3 March, 2014; originally announced March 2014.

Comments: IEEE Winter Conference on Applications of Computer Vision (WACV), 2014

ACM Class: I.4.6; I.4.7; I.4.9; I.5.4; I.2.10

Showing 1–30 of 30 results for author: McCool, C