-
Calibration Wizard: A Guidance System for Camera Calibration Based on Modelling Geometric and Corner Uncertainty.Songyou Peng and Peter Sturm.
tl;dr: Initializes from three freely acquired poses, then sets up an optimization problem for the next pose such that the expected uncertainty of the intrinsic camera parameters is minimized. The calibration problem is formulated as minimizing geometric reprojection error, and the Jacobian matrices are computed. The data is extended with a hypothetical next pose, so that the next pose and the intrinsic parameters are both parameterized within the Jacobian. Through some matrix transformations, the covariance matrix of the intrinsic parameters can be extracted from the Jacobian. Corner uncertainty is also incorporated, because poses that reduce uncertainty may be perpendicular to the image plane and therefore unusable. Code is available, but in Matlab.
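A minimal sketch of the covariance-extraction step, assuming a stacked Jacobian whose first columns belong to the intrinsics and whose remaining columns belong to the pose parameters. The Schur complement used here is one standard way to marginalize out the poses; the function name, the 4 + 6 parameter split, and the random Jacobian are illustrative assumptions, not the authors' Matlab code.

```python
import numpy as np

def intrinsics_covariance(J, n_intr):
    """Covariance of the first n_intr parameters, up to the noise scale.

    Splits the information matrix J^T J into blocks and inverts the
    Schur complement of the pose block.
    """
    J_i, J_p = J[:, :n_intr], J[:, n_intr:]
    A = J_i.T @ J_i          # intrinsics-intrinsics block
    B = J_i.T @ J_p          # intrinsics-pose block
    D = J_p.T @ J_p          # pose-pose block
    schur = A - B @ np.linalg.solve(D, B.T)
    return np.linalg.inv(schur)

# Hypothetical sizes: 20 corners x 2 residuals, 4 intrinsics + 6 pose parameters.
rng = np.random.default_rng(0)
J = rng.normal(size=(40, 10))

cov = intrinsics_covariance(J, n_intr=4)

# Same result as the top-left block of the full inverse.
full = np.linalg.inv(J.T @ J)[:4, :4]
uncertainty = np.trace(cov)  # one possible scalar to minimize over the next pose
```

A scalar summary such as the trace of this block is the kind of quantity a next-pose search could minimize.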
Today I read.
-
Motion-Based Extrinsic Sensor-to-Sensor Calibration: Effect of Reference Frame Selection for New and Existing Methods.Tuomas Välimäki, Bharath Garigipati, and Reza Ghabcheloo.
tl;dr: Uses the hand-eye robot calibration formulation AX=XB to calibrate sensor pairs in an infrastructure context. The paper explores different calibration methods as well as the choice of relative coordinate frame, since the transformations in the hand-eye calibration problem are relative. I have not seen the question of how to choose the relative transformations when using AX=XB treated in the literature before. The answer: ’it depends.’
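A small synthetic check of the AX=XB constraint, as a sketch only. Here X is an assumed sensor-to-sensor transform, B is a relative motion of one sensor, and A is the corresponding relative motion of the other, constructed as A = X B X^-1. The helper name and all numeric values are made up for illustration.

```python
import numpy as np

def rigid_transform(rotvec, t):
    """4x4 rigid transform from a rotation vector (Rodrigues formula) and a translation."""
    theta = np.linalg.norm(rotvec)
    k = rotvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical extrinsic calibration X and one relative motion B.
X = rigid_transform(np.array([0.1, -0.2, 0.3]), np.array([0.5, 0.0, 1.2]))
B = rigid_transform(np.array([0.0, 0.4, 0.1]), np.array([0.2, -0.1, 0.0]))

# The motion of the other sensor that is consistent with X.
A = X @ B @ np.linalg.inv(X)

residual = np.linalg.norm(A @ X - X @ B)

# A and B are conjugate, so their rotations share the same angle (same trace).
trace_A = np.trace(A[:3, :3])
trace_B = np.trace(B[:3, :3])
```

Because A and B are relative motions, which pair of frames they are taken between is a free choice, which is the choice the paper studies.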
-
How tree roots respond to drought.Ivano Brunner, Claude Herzog, Melissa A. Dawes, Matthias Arend, and Christoph Sperisen.
tl;dr: A review article covering responses of tree roots to drought conditions in the forest context; discusses drought avoidance as well as drought tolerance. New to me was the discussion of how root turnover contributes to soil organic matter. Table 1 on page 5 lists root traits and how each trait is affected by drought; the ’growth’, ’architectural’, and ’morphological’ traits are likely to be of interest for root phenotyping.
-
Simultaneous Direct Depth Estimation and Synthesis Stereo for Single Image Plant Root Reconstruction.Yawen Lu, Yuxing Wang, Devarth Parikh, Awais Khan, and Guoyu Lu.
tl;dr: Root reconstruction from one image of young apple tree roots. Two approaches to generating depth maps: the first is to predict a depth map directly from one image; the second is to generate another image from the single image, and then generate the depth map using a stereo technique. The results are combined to form the resulting point cloud.
-
A scalable, low-cost phenotyping strategy to assess tuber size, shape, and the colorimetric features of tuber skin and flesh in potato breeding populations.Max J. Feldman et al.
tl;dr: Measures the following traits from images: length and width, aspect ratio, eccentricity, and biomass profiles; uses a size marker in the images. Color of skin and flesh is assessed in consumer camera and flat-bed scanner images, using a color checker to perform color calibration. Deep learning is used to classify halved tubers as possessing the hollow heart defect. Lists tools to automate capture with Python and links to code. Population of 189 tubers.
-
The Importance of Coordinate Frames in Dynamic SLAM.Jesse Morris, Yiduo Wang, and Viorela Ila.
tl;dr: A back-end for Dynamic SLAM. Discusses object- versus world-centric dynamic SLAM, advocates for world-centric formulation but evaluates both object- and world-centric versions in factor graph library GTSAM. “Model free” in that tracked points are used (another option would be object pose). Dynamic objects are assumed to be rigid. An example of where gauge choice leads to different formulations and results.
-
Comparing YOLOv8 and Mask RCNN for object segmentation in complex orchard environments.Ranjan Sapkota, Dawood Ahmed, and Manoj Karkee.
tl;dr: Evaluation of YOLOv8 and Mask RCNN on two datasets and for two different tasks. Datasets are color images of production apple trees; Dataset 1 is from the dormant season (leafless trees) and Dataset 2 is from the growing season with fruitlets. Tasks are single-class instance segmentation of fruitlets from Dataset 2, and multi-class instance segmentation of branches and tree trunks from Dataset 1. Total of 1550 images, all manually annotated and split into train / val / test sets; models trained on this data. The references to other works using YOLO-N or Mask RCNN in orchard environments are useful. Concludes that YOLOv8 works better in these environments than Mask RCNN, with better precision and recall and lower inference times.
-
Certifiable Solver for Real-Time N-View Triangulation.Mercedes Garcia-Salguero and Javier Gonzalez-Jimenez.
tl;dr: Formulates the L2 norm N-view triangulation problem as a QCQP (Quadratically Constrained Quadratic Problem), where the constraints are pair-wise epipolar constraints. Solved iteratively using linear relaxations of the QCQP. Solutions are certified for optimality by checking that the constraints are satisfied and that a Hessian is positive semi-definite.
-
Condition numbers in multiview geometry, instability in relative pose estimation, and RANSAC.Hongyi Fan, Joe Kileel, and Benjamin Kimia.
tl;dr: Argues that the 5-point and 7-point algorithms (which compute the essential and fundamental matrix, respectively, from image correspondences) may be numerically unstable even in cases with no outliers. RANSAC then not only filters outliers, but also tends towards selecting data points such that condition numbers are well-behaved.
-
Masked Autoencoders Are Scalable Vision Learners.Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick.
tl;dr: Proposes a masked autoencoder (MAE) for pretraining a Vision Transformer (ViT) for the image recognition task. The masked autoencoder is trained for the reconstruction task, with an asymmetric design; encoder does not take masked patches as input, while the decoder does. For image recognition, the decoder is abandoned and the encoder fine-tuned. Best results: ViT-Huge model, experiments on ImageNet-1K. Ablations abound in the paper.
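To make the asymmetric design concrete, here is a toy sketch of the masking step: the encoder receives only the visible patches, while the decoder receives the encoded visible patches plus a mask token at every masked position. The patch counts, feature size, mask ratio value, identity "encoder", and zero mask token are placeholders of mine, not the paper's implementation.

```python
import numpy as np

def random_masking(patches, mask_ratio, rng):
    """Return the visible patches and their (sorted) indices."""
    n = patches.shape[0]
    n_keep = int(round(n * (1 - mask_ratio)))
    keep = np.sort(rng.permutation(n)[:n_keep])
    return patches[keep], keep

def decoder_input(encoded, keep, n_patches, mask_token):
    """Full-length sequence: encoded patches where visible, mask token elsewhere."""
    full = np.tile(mask_token, (n_patches, 1))
    full[keep] = encoded
    return full

rng = np.random.default_rng(0)
patches = rng.normal(size=(196, 16))  # hypothetical: 14 x 14 patches, 16 features each

visible, keep = random_masking(patches, mask_ratio=0.75, rng=rng)
encoded = visible                     # stand-in for the ViT encoder (identity here)
mask_token = np.zeros(16)             # stand-in for the learned mask token
dec_in = decoder_input(encoded, keep, n_patches=196, mask_token=mask_token)
```

With this split the encoder processes 49 of the 196 patches, which is where the compute saving of the asymmetric design comes from.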
-
Pre-Trained Masked Image Model for Mobile Robot Navigation.
tl;dr: Robotic exploration and map building. Uses an off-the-shelf model for inpainting, MAE (Masked Autoencoder, He et al. 2022), and applies it to three contexts. For field-of-view expansion experiments, the larger the patches to be inpainted, the worse the performance. Tested with semantic and binary (occupancy) maps, synthetic data. No fine-tuning of MAE, and performance is better than classical techniques on single-agent and multiple-agent exploration. I liked the writing in this paper – the hypothesis and themes are very clear throughout.
-
Deep Learning Based 3d Reconstruction for Phenotyping of Wheat Seeds: a Dataset, Challenge, and Baseline Method.Vsevolod Cherepashkin, Erenus Yildiz, Andreas Fischbach, Leif Kobbelt, and Hanno Scharr.
tl;dr: Three-dimensional reconstruction of wheat seeds for phenotyping. The most relevant trait is seed volume, because it is indicative of seed mass, which correlates to nutrients available to a seedling plant. The dataset consists of image data from robotic system phenoSeeder; different proportions of data are used in different scenarios. Test data is held back and uses three views; dataset train / val sets and challenge at the website link. Baseline methods are VGG11 and ResNet-152, no code published for the baselines.
-
Variational autoencoder based anomaly detection using reconstruction probability.Jinwon An and Sungzoon Cho.
tl;dr: In the context of anomaly detection with variational autoencoders, argues that reconstruction probability is a more objective measure than reconstruction error. Experiments with MNIST and the KDD Cup 1999 network intrusion dataset. The VAEs provide reconstructions as well as reconstruction probabilities.
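A rough sketch of how a reconstruction probability can be estimated by Monte Carlo sampling, assuming Gaussian encoder and decoder outputs. The `encode` and `decode` functions below are hand-written toy stand-ins for a trained VAE, and every number is made up; only the sampling-and-averaging structure is the point.

```python
import numpy as np

def gaussian_log_pdf(x, mean, sigma):
    """Log density of an isotropic Gaussian N(mean, sigma^2 I) at x."""
    d = x.size
    return -0.5 * np.sum((x - mean) ** 2) / sigma**2 - d * np.log(sigma * np.sqrt(2 * np.pi))

def log_reconstruction_probability(x, encode, decode, n_samples, rng):
    """Log of the average p(x | z) over samples z drawn from the encoder distribution."""
    mu_z, sigma_z = encode(x)
    log_p = np.empty(n_samples)
    for i in range(n_samples):
        z = mu_z + sigma_z * rng.normal(size=mu_z.shape)
        mu_x, sigma_x = decode(z)
        log_p[i] = gaussian_log_pdf(x, mu_x, sigma_x)
    m = log_p.max()  # subtract the max before exponentiating, for numerical stability
    return m + np.log(np.mean(np.exp(log_p - m)))

# Toy stand-ins for a trained VAE: the "data manifold" is the first axis.
def encode(x):
    return x[:1], 0.1

def decode(z):
    return np.array([z[0], 0.0]), 0.5

rng = np.random.default_rng(0)
normal_score = log_reconstruction_probability(np.array([1.0, 0.0]), encode, decode, 50, rng)
anomaly_score = log_reconstruction_probability(np.array([0.0, 5.0]), encode, decode, 50, rng)
is_anomaly = anomaly_score < normal_score
```

A point would be flagged as anomalous when its score falls below a chosen threshold; unlike a raw reconstruction error, the score accounts for the decoder's variance.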
-
TreeScope: An Agricultural Robotics Dataset for LiDAR-Based Mapping of Trees in Forests and Orchards.Derek Cheng et al.
tl;dr: Dataset paper. Acquired LiDAR scans of forestry and large orchard trees (almond, pistachio) from under the canopy, using small UAVs or a mobile unit in a backpack or cart. Provides semantic segmentation labels of scans for tree stems, ground, and misc. Provides ground truth diameter at breast height (DBH) measurements. Baseline semantic segmentation methods are RangeNet++, SqueezeSegV2, and SqueezeSegV3. Baseline diameter estimation methods are DBCRE and SLOAM.
-
Modelling wine grapevines for autonomous robotic cane pruning.Henry Williams et al.
tl;dr: Systems paper concerning the 3D modeling of grape vines, for cane pruning. Uses learned methods for panoptic segmentation and stereo inference. (Detectron 2 for panoptic segmentation, HSMnet for stereo inference.) Uses an over-the-row unit with two UR5 arms to acquire camera data.
-
The Beauty of Roots.John C. Baez, J. Daniel Christensen, and Sam Derbyshire.
tl;dr: A ’Short Stories’ paper, 3 pages. Considers Littlewood polynomials (each coefficient is +1 or -1) of degree n, and the patterns that arise from plotting the set of all roots for a particular degree. Note: Figures are plots of the complex plane, with the intensity proportional to the number of roots at that point. The plots resemble a unit circle with fractal patterns on the circle’s boundary. Subject area is outside of my regular reading; I enjoyed the article.
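A quick way to reproduce the raw data behind such plots, as a sketch: enumerate every coefficient vector of +1 and -1 for a given degree, collect all roots, and bin them into a 2D histogram over the complex plane. The degree, bin count, and plot range are arbitrary choices of mine, and the actual figures in the article use much higher degrees.

```python
import itertools
import numpy as np

def littlewood_roots(degree):
    """All roots of all polynomials of the given degree with coefficients in {+1, -1}."""
    roots = []
    for coeffs in itertools.product((1, -1), repeat=degree + 1):
        roots.append(np.roots(coeffs))
    return np.concatenate(roots)

roots = littlewood_roots(8)  # 2**9 polynomials, 8 roots each

# Intensity image: number of roots falling in each cell of the complex plane.
hist, _, _ = np.histogram2d(
    roots.real, roots.imag, bins=200, range=[[-2, 2], [-2, 2]]
)

moduli = np.abs(roots)
```

All roots lie in the annulus 1/2 < |z| < 2, so the range of [-2, 2] on both axes captures every root.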
-
Normalization Techniques in Training DNNs: Methodology, Analysis and Application.Lei Huang, Jie Qin, Yi Zhou, Fan Zhu, Li Liu, and Ling Shao.
tl;dr: A review and commentary of normalization methods in DNNs. I skimmed this one. Good for definitions of all of the normalization terms and especially Figure 1.
-
Optimal Whitening and Decorrelation.Agnan Kessy, Alex Lewin, and Korbinian Strimmer.
tl;dr: Covers ‘whitening’: linear transforms that convert a random vector into another random vector whose covariance equals the identity matrix. Five types are discussed: zero-phase components analysis (ZCA) or Mahalanobis whitening, PCA whitening, Cholesky whitening, ZCA-cor, and PCA-cor. ZCA whitening is used in the paper ‘CamP: Camera Preconditioning for Neural Radiance Fields’, Park et al. 2023. ‘Whitening’ is equivalent to the term ‘sphering’.
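A short sketch of three of the five whitening matrices, built from a covariance matrix Σ with eigendecomposition Σ = U Λ U^T: ZCA uses W = Σ^(-1/2), PCA uses W = Λ^(-1/2) U^T, and Cholesky uses W = L^T where L L^T = Σ^(-1). The function name and the random test covariance are my own; the two correlation-based variants are omitted.

```python
import numpy as np

def whitening_matrices(cov):
    """ZCA, PCA, and Cholesky whitening matrices for a covariance matrix."""
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = np.diag(evals ** -0.5)
    W_pca = inv_sqrt @ evecs.T
    W_zca = evecs @ inv_sqrt @ evecs.T
    L = np.linalg.cholesky(np.linalg.inv(cov))  # L @ L.T equals the inverse covariance
    W_chol = L.T
    return {"zca": W_zca, "pca": W_pca, "cholesky": W_chol}

# Hypothetical covariance: random symmetric positive definite matrix.
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
cov = M @ M.T + np.eye(4)

W = whitening_matrices(cov)

# Each W satisfies W @ cov @ W.T = I, so the whitened vector has identity covariance.
whitened_covs = {name: w @ cov @ w.T for name, w in W.items()}
```

The three matrices differ only by a rotation; ZCA is the symmetric one, which keeps the whitened variables as close as possible to the originals.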
-
CamP: Camera Preconditioning for Neural Radiance Fields.Keunhong Park, Philipp Henzler, Ben Mildenhall, Jonathan T. Barron, and Ricardo Martin-Brualla.
tl;dr: NeRF joint optimization of camera parameters and scene reconstruction. Uses a left preconditioner for each camera’s parameters, a zero-phase component analysis (ZCA) whitening transform (Kessy et al. 2018) derived from a projection function; this is applied at the initial iteration of the optimization. The new method is implemented on top of Zip-NeRF (Barron et al. 2023).
-
The Little Book of Deep Learning.François Fleuret.
tl;dr: I really like this introduction to deep learning and reference guide. Want to remember a term without getting in too deep? This little book has it, along with the top-level references if I want to read more. See the website to order a physical version; printing two book pages per printed page worked well for me too.
-
Making Your Python Code Run Faster.Brandon Rohrer.
tl;dr: Profiling, vectorization, pre-compilation with Numba, 10 optimization suggestions, "try it and test it", examples presented in context of a physics simulation. Good discussions about troubleshooting and debugging, when to visualize, determining project goals.
-
Potential of Unmanned Aerial Sampling for Monitoring Insect Populations in Rice Fields.Hong Geun Kim, Jong-Seok Park, and Doo-Hyung Lee.
tl;dr: Need to monitor for seasonal insect migrations in rice fields. Uses a UAS with small nets to collect samples of insects at different altitudes. To my knowledge, the only work to collect insects with a UAS versus using already-tagged insects.