5 Conclusion

“A PhD research project is an academic exercise of given effort and length.”

– Dr. Ron Anderson, PhD adviser of prof. Clement Fortin

This thesis is dedicated to designing, implementing, and testing of a framework for estimating forest parameters at the scale of individual trees. The proposed framework relies on fusion of UAV LiDAR point clouds and RGB orthophotos data, deep neural networks designed for processing 3D points clouds to segment the data into individual trees, and a collection of parameter-specific classic machine learning models to process the segmented trees and predict forest parameters of interest. An end-to-end implementation is described in detail, with parts of the system evaluated separately, and the overall system validated on a detailed manual field inventory dataset.

The main contributions of the thesis can be summarized as follows:

Publication of two original open access datasets that will undoubtedly be useful to the research community with a rapidly growing interest in detailed digital forest inventories (Dubrovin and Fortin 2024; Dubrovin, Fortin, and Kedrov 2024). I believe the release of data into open access is an important contribution, especially in the current era of commercial machine learning¹. When having good data is an advantage, fewer people are willing to make their data public, which makes it harder for researchers to explore, develop, and compare novel approaches.
A novel approach to training individual tree segmentation neural networks on forest patches synthetically generated from individual trees with heavy use of augmentations to make the data more closely represent real data.
Design, implementation, and experimental verification of the framework mentioned in the first paragraph. The framework is still in the early prototype stage, since the components are far behind the current state of the art. The focus was on creating a proof of concept, thus prioritizing ease of integration over potential performance to show that the end-to-end system has potential.

Coming back to the research question and the hypothesis stated in Section 1.2, I want to address the ways I believe the proposed framework answers it. As already mentioned there, the baselines are the manual forest inventory and the widely used area-based approach described in Section 2.3. The cost reduction mainly comes from the choice of the platform for remote sensing observations. Using UAVs is cheaper than purchasing high-resolution data from satellites or planes, and it offers a lot more control over the acquisition parameters, although it requires trained operators. The effort reduction mainly comes from a dramatic decrease in the amount of required field inventory data, which is very labor-intensive to collect. UAVs also allow covering huge areas easily by a few people, unlike fully manual inventories or those based on terrestrial LiDAR measurements, where a forester or an operator needs to cover the entire area of interest by foot.

The reported experimental results achieved by the implemented system are by no means state of the art, but they do serve as a proof of concept. The current state of the system, described in the previous chapter, can be seen as a baseline, since it prioritizes simple components to show that the overall system has potential. And I believe the resulting framework has potential, with specific steps I would try if I had unlimited time, listed in the next section.

5.1 Potential further improvements

I see numerous ways to potentially improve the results of the proposed system. Some of the most important are as follows:

Use of more advanced tree segmentation network architectures. As mentioned, the PointNet++ is a pioneering model in the field of deep learning on 3D point clouds that was chosen because it is relatively easy to implement and work with. It is still competitive in many tasks, but the field of deep learning is evolving rapidly, and many more powerful architectures were suggested and proven since its release. The framework will definitely benefit from a better, more efficient tree segmentation network.
Use of more advanced feature extraction for RGB orthophoto-based features. For the same reason the PointNet++ was chosen as a baseline for tree segmentation, a very basic approach was used to extract features from RGB orthophotos. Despite being better than just RGB colors, these features still offer very limited representation power. Instead, convolutional neural networks specialized in creating good representations for downstream tasks can be used.
Use of open-access datasets for pretraining and additional validation. At its current state, the tree segmentation network learns to perform a very hard task from scratch on a dataset of a very limited size. I believe it can benefit from first training on a similar, but larger dataset, for example, the NeonTreeEvaluation Benchmark (Weinstein, Marconi, and White 2022). It is briefly described in the comparison subsection of the Lysva field inventory section. The bounding boxes from the orthophoto can be used to label the point cloud for tree segmentation. The labels will be noisy, but pretraining on a large amount of noisy data often shows good results. Including other data for validation will also make it significantly more reliable.
Use of more sophisticated techniques for creating synthetic forest patches. Both the sampling of the trees and the placement onto the patch can be improved. For example, sampling can take into consideration the species of the trees, to make the patches more realistic: some fully coniferous, some fully deciduous, some mixed in various proportions. Placing of the trees can use packing algorithms, or try to mimic the spatial distribution of trees observed in the inventory, where the tree density is not constant.
Use of more sophisticated augmentations to make the synthetic forest patches look as close as possible to real forest patches. For example, the height-depended dropout can be improved to take into account canopy density and canopy overlap, since dense and overlapping canopies are more likely to block the laser. A shorter tree almost completely covered from above by a larger tree should have fewer points overall, not just at low heights relative to its highest point. It will also be beneficial to generate patches slightly larger than the desired target size and crop them to it randomly as an augmentation. This will mimic the way the data will come to the model during inference, where trees on the edges of the patch are highly likely to be cropped.
Use of a more careful approach to training attribute prediction models. The results of the second step can be further improved by adopting a more careful approach to modeling, including a more careful approach to feature engineering and model selection.
Use of synthetic data for pretraining and fine-tuning on the real data. The most common way to enhance training deep learning models with synthetic data usually consist of pretraining on large amount of synthetic data and then careful fine-tuning on the small amount of real data. The current iteration of the dataset does not have labeling to enable direct fine-tuning, but it is possible to create such labels to enable this in the future. Release of the dataset into open access allows the community to enhance the it in many ways, and adding such labels would be one of the most useful ones.

I am purposefully avoiding the term “artificial intelligence” as it seems to have been abused so much as to become practically meaningless, and thus not useful in technical context.↩︎