Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only the sessions held on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).

Session Overview
Session 16: Coffee Break & Posters Session 1: Data Analysis and Machine Learning
Time: Wednesday, 29/Nov/2023, 10:00am - 11:00am

Location: Polivalente


Presentations

Graph Embedding of almost constant large graphs

Francesc Serratosa

Universitat Rovira i Virgili

In some machine learning applications, graphs tend to be composed of a large number of tiny, almost constant sub-structures. Current embedding methods are not prepared for this type of graph, and thus their representational power tends to be very low. Our aim is to define a new graph embedding, called GraphFingerprint, that considers this specific type of graph. The three-dimensional characterisation of a chemical metal-oxide nanocompound easily fits into this type of graph, whose nodes are atoms and whose edges are their bonds. Our graph embedding method has been used to predict the toxicity of these nanocompounds, achieving high accuracy compared to other embedding methods.
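For intuition, here is a minimal sketch (our illustration, not the paper's GraphFingerprint) of embedding a graph as a histogram of tiny 1-hop sub-structures; the vocabulary and the TiO2-like fragment are assumptions made for the example:

from collections import Counter

def local_substructure_embedding(nodes, edges, vocabulary):
    """Embed a graph as a histogram of tiny local sub-structures.

    nodes: dict node_id -> label (e.g. atom symbol)
    edges: iterable of (u, v) pairs (e.g. chemical bonds)
    vocabulary: ordered list of sub-structure keys defining the vector layout
    """
    neighbours = {n: [] for n in nodes}
    for u, v in edges:
        neighbours[u].append(nodes[v])
        neighbours[v].append(nodes[u])
    # Each node contributes one key: its own label plus the sorted
    # multiset of neighbour labels (a 1-hop star sub-structure).
    counts = Counter((nodes[n], tuple(sorted(neighbours[n]))) for n in nodes)
    return [counts.get(key, 0) for key in vocabulary]

# Tiny TiO2-like fragment: one Ti atom bonded to two O atoms.
nodes = {0: "Ti", 1: "O", 2: "O"}
edges = [(0, 1), (0, 2)]
vocab = [("Ti", ("O", "O")), ("O", ("Ti",))]
print(local_substructure_embedding(nodes, edges, vocab))  # [1, 2]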



Uncovering manipulated files using mathematical natural laws

Pedro Alexandre Fernandes1, Séamus Ó Ciardhuáin1, Mário Antunes2

1Technological University of the Shannon, Ireland; 2Polytechnic of Leiria, Portugal

The data exchange between different sectors of society has led to the development of electronic documents supported by different reading formats, namely the portable PDF format. These documents have characteristics similar to those of programming languages, allowing the incorporation of potentially malicious code, which makes them a vector for cyberattacks. Thus, detecting anomalies in digital documents, such as PDF files, has become crucial in several domains, such as finance, digital forensic analysis, and law enforcement. Current detection methods are mostly based on machine learning and are characterised by being complex, slow, and largely ineffective at detecting zero-day attacks. This paper proposes a Benford's Law (BL) based model to uncover manipulated PDF documents by analysing potential anomalies in the first digits extracted from the PDF document's characteristics.

The proposed model was evaluated using the CIC Evasive PDFMAL2022 dataset, consisting of 1191 documents (278 benign and 918 malicious).

To classify the PDF documents into malicious or benign based on BL, three statistical models were used in conjunction with the mean absolute deviation: the parametric Pearson model and the non-parametric Spearman and Cramér-von Mises models. The results show a maximum F1 score of 87.63% in detecting malicious documents using Pearson's model, demonstrating the suitability and effectiveness of applying Benford's Law to anomaly detection in digital documents, helping to maintain the accuracy and integrity of information and to promote trust in systems and institutions.
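For illustration, a minimal sketch of a Benford first-digit test in Python; the feature values and decision thresholds are hypothetical, and only the Pearson-plus-MAD combination of the paper's three statistics is reproduced:

import numpy as np
from scipy.stats import pearsonr

BENFORD = np.log10(1 + 1 / np.arange(1, 10))  # P(first digit = d), d = 1..9

def first_digit_profile(values):
    """Empirical first-digit distribution of the non-zero entries."""
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v != 0]
    counts = np.bincount(digits, minlength=10)[1:10]
    return counts / counts.sum()

def benford_scores(values):
    """Pearson correlation with Benford's curve plus mean absolute deviation."""
    observed = first_digit_profile(values)
    r, _ = pearsonr(observed, BENFORD)
    mad = np.mean(np.abs(observed - BENFORD))
    return r, mad

# Hypothetical usage: numeric characteristics extracted from a PDF (object
# counts, stream lengths, etc.). Low correlation / high MAD suggests manipulation.
features = [12, 380, 7, 1920, 45, 3, 212, 9, 150, 28, 7600, 4]
r, mad = benford_scores(features)
flagged = r < 0.9 or mad > 0.015   # thresholds are illustrative only
print(round(r, 3), round(mad, 4), "flagged" if flagged else "ok")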



Vehicle Re-Identification based on unsupervised domain adaptation by incremental generation of Pseudo-labels

Paula Moral, Álvaro García-Martín, José M. Martínez

Universidad Autónoma de Madrid, Spain

The main goal of vehicle re-identification (ReID) is to associate the same vehicle identity across different cameras. This is a challenging task due to variations in lighting, viewpoints, and occlusions; in particular, vehicles present a large intra-class variability and a small inter-class variability. In ReID, the samples in the test sets belong to identities that have not been seen during training. To reduce the domain gap between the training and test sets, this work explores unsupervised domain adaptation, automatically generating pseudo-labels from the test data and using them to fine-tune the ReID models.

Specifically, the pseudo-labels are obtained by clustering with different hyperparameters, and incrementally, by retraining the model several times per hyperparameter with the generated pseudo-labels. The ReID system is evaluated on the CityFlow-ReID-v2 dataset.
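A minimal sketch of one clustering-based pseudo-labelling round, assuming DBSCAN as the clusterer (the abstract does not name one) and synthetic features in place of real ReID embeddings:

import numpy as np
from sklearn.cluster import DBSCAN

def pseudo_label_round(embeddings, eps):
    """One pseudo-labelling round: cluster unlabeled test-domain embeddings;
    DBSCAN marks outliers as -1, and those samples are simply discarded."""
    labels = DBSCAN(eps=eps, min_samples=4, metric="cosine").fit_predict(embeddings)
    keep = labels != -1
    return np.where(keep)[0], labels[keep]

# Synthetic stand-in for ReID features: 8 identities, 40 images each.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(loc=c, scale=0.1, size=(40, 8)) for c in np.eye(8)])

for eps in (0.3, 0.5, 0.7):                  # clustering hyperparameters to sweep
    idx, pseudo = pseudo_label_round(feats, eps)
    # In the full pipeline the model would now be fine-tuned on
    # (feats[idx], pseudo) and features re-extracted before the next round.
    print(eps, len(idx), len(set(pseudo.tolist())))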



How to turn your camera into a perfect pinhole model

Ivan De Boi1, Stuti Pathak1, Marina Oliveira2, Rudi Penne1

1University of Antwerp, Belgium; 2University of Coimbra, Portugal

Camera calibration is a fundamental first step in various computer vision applications. Despite being an active field of research, Zhang's method remains widely used for camera calibration due to its implementation in popular toolboxes like MATLAB and OpenCV. However, this method assumes a pinhole model combined with oversimplified distortion models. In this work, we propose a novel approach that involves a pre-processing step to remove distortions from images by means of Gaussian processes.

Our method does not need to assume any distortion model and can be applied to severely warped images, even in the case of multiple distortion sources, e.g., a fisheye image of a curved mirror reflection. The Gaussian processes capture all distortions and camera imperfections, resulting in virtual images as though taken by an ideal pinhole camera with square pixels. Furthermore, this ideal GP-camera only needs one image of a square grid calibration pattern.

This model allows for a serious upgrade of many algorithms and applications that are designed in a pure projective-geometry setting but whose performance is very sensitive to non-linear lens distortions. We demonstrate the effectiveness of our method by simplifying Zhang's calibration method, reducing the number of parameters and eliminating the distortion parameters and iterative optimization. We validate our approach on synthetic data and real-world images. The contributions of this work include the construction of a virtual ideal pinhole camera using Gaussian processes, a simplified calibration method, and lens distortion removal.
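The following sketch illustrates the core idea under our own assumptions (scikit-learn GPs, a toy radial warp, a 10x10 grid): regress the map from observed corner positions of the calibration pattern to their ideal pinhole positions, then evaluate it anywhere in the image:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Ideal pinhole positions of a square-grid pattern, and the corner positions
# observed in the distorted image (here synthesised with a toy radial warp).
ideal = np.array([(x, y) for y in range(10) for x in range(10)], float) * 50
r2 = ((ideal - 225) ** 2).sum(axis=1, keepdims=True)
observed = 225 + (ideal - 225) * (1 + 2e-6 * r2)   # toy radial distortion

# One GP regression, distorted (u, v) -> undistorted (x, y), no distortion
# model assumed anywhere.
kernel = RBF(length_scale=100.0) + WhiteKernel(noise_level=1e-4)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(observed, ideal)

# Any pixel of the real camera can now be mapped onto the virtual pinhole
# image; resampling over a full pixel grid yields the virtual image.
print(gp.predict(np.array([[260.0, 310.0]])))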



Abandoned Object Detection Using Persistent Homology

Javier Lamar Leon1, Raul Alonso Baryolo2, Rocio Gonzales Diaz3, Edel Garcia Reyes4

1University of Évora, Portugal; 2Software Engineer at Microsoft; 3University of Seville, Spain; 4GEOCUBA Company, Cuba

The automatic detection of suspicious abandoned objects has become a priority in video surveillance in recent years. Terrorist attacks, improperly parked vehicles, abandoned drug packages, and many other events underline the interest in automating this task.

Detecting such objects is challenging due to the many issues affecting video processing in public spaces, such as occlusions, illumination changes, and crowded environments. On the other hand, deep learning can be difficult to apply here, as it is most successful in perceptual, so-called System 1 tasks.

In this work we propose to use topological features to describe the scene objects. These features have previously been used for objects with dynamic shape and remain stable under perturbations.

The objects (foreground) are obtained by applying a background subtraction algorithm. We propose the concept of surveillance points: a set of points uniformly distributed over the scene. We then track the changes in a cubic region centered at each surveillance point. To that end, we construct a simplicial complex (topological space) from the k foreground frames and obtain the topological features (using persistent homology) of the sub-complex for each cubic region, which represents the activity around the corresponding surveillance point.

Finally, for each surveillance point we track the changes of its associated topological signature over time in order to detect abandoned objects. The accuracy of our method is tested on the PETS2006 dataset with promising results.
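A rough sketch of the per-region signature using the GUDHI library's cubical complexes; the region size, frame count, and random masks are illustrative, and the paper's exact complex construction may well differ:

import numpy as np
import gudhi

def region_signature(foreground, k=8, size=16, corner=(60, 80)):
    """Topological signature of one cubic region around a surveillance point.

    foreground: (k, H, W) stack of binary foreground masks.
    Returns the persistence diagram of the region's cubical complex.
    """
    y, x = corner
    region = foreground[:k, y:y + size, x:x + size].astype(float)
    # Build a 3D cubical complex from the space-time block; persistent
    # homology summarises the activity inside it across the k frames.
    complex_ = gudhi.CubicalComplex(top_dimensional_cells=region)
    return complex_.persistence()

rng = np.random.default_rng(1)
masks = (rng.random((8, 120, 160)) > 0.7).astype(np.uint8)
print(region_signature(masks)[:5])   # (dimension, (birth, death)) pairs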



Analysis and Impact of Training Set Size in Cross-Subject Human Activity Recognition

Miguel Matey-Sanz1, Joaquín Torres-Sospedra2, Alberto González-Pérez1, Sven Casteleyn1, Carlos Granell1

1Institute of New Imaging Technologies, Universitat Jaume I, Spain; 2ALGORITMI Research Centre, University of Minho, Portugal

The ubiquity of consumer devices with sensing and computational capabilities, such as smartphones and smartwatches, has increased interest in their use in human activity recognition for healthcare monitoring applications, among others. When developing such a system, researchers rely on input data to train recognition models. In the absence of openly available datasets that meet the model requirements, researchers face a hard and time-consuming process to decide which sensing device to use or how much data needs to be collected. In this paper, we explore the effect of the amount of training data on the performance (i.e., classification accuracy and activity-wise F1-scores) of a CNN model by performing an incremental cross-subject evaluation using data collected from a consumer smartphone and smartwatch. Systematically studying the incremental inclusion of subject data from a set of 22 training subjects, the results show that the model's performance initially improves significantly with each addition, yet the improvement slows as more subjects are included. Comparing models based on smartphone and smartwatch data, the latter is significantly better with smaller amounts of training data, while the former outperforms it with larger amounts. In addition, gait-related activities show significantly better results with smartphone-collected data, while non-gait-related activities, such as standing up or sitting down, are better recognized with smartwatch-collected data.
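The evaluation protocol can be sketched as follows; a logistic regression over random features stands in for the paper's CNN and real sensor data, so only the incremental cross-subject loop itself is faithful:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)
n_subjects, per_subject = 22, 40
X = rng.normal(size=(n_subjects * per_subject, 16))   # stand-in sensor features
y = rng.integers(0, 5, size=len(X))                   # 5 activity classes
subject = np.repeat(np.arange(n_subjects), per_subject)

test = subject >= 18                                  # held-out subjects
for n_train in range(1, 18):                          # incremental inclusion
    train = subject < n_train
    clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    pred = clf.predict(X[test])
    print(n_train,
          round(accuracy_score(y[test], pred), 3),        # overall accuracy
          np.round(f1_score(y[test], pred, average=None), 3))  # activity-wise F1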



YOLOMM – You Only Look Once for Multi-modal Multi-tasking

Filipe Campos1, Francisco Gonçalves Cerqueira1, Ricardo P. M. Cruz1,2, Jaime S. Cardoso1,2

1Faculty of Engineering, University of Porto, Portugal; 2INESC TEC, Porto, Portugal

Autonomous driving can reduce the number of road accidents due to human error and result in safer roads. One important part of the system is the perception unit, which provides information about the environment surrounding the car. Currently, most manufacturers use not only RGB cameras, passive sensors that capture light already present in the environment, but also Lidar, an active sensor that emits laser pulses and measures reflection and time-of-flight. Previous work, YOLOP, proposed a model for object detection and semantic segmentation, but using RGB only. This work extends it to Lidar and evaluates performance on KITTI, a public autonomous driving dataset. The implementation shows improved precision across objects of all sizes and is made entirely available at https://github.com/filipepcampos/yolomm.
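The abstract does not detail how the Lidar branch is fused; one common option is early fusion by channel concatenation, sketched here as an assumption rather than YOLOMM's actual design:

import torch
import torch.nn as nn

class EarlyFusionStem(nn.Module):
    """Fuse an RGB image with a projected Lidar depth map by channel
    concatenation before the first convolution (3 + 1 = 4 input channels)."""

    def __init__(self, out_channels=32):
        super().__init__()
        self.conv = nn.Conv2d(4, out_channels, kernel_size=3, stride=2, padding=1)
        self.act = nn.SiLU()

    def forward(self, rgb, depth):
        x = torch.cat([rgb, depth], dim=1)   # (B, 4, H, W)
        return self.act(self.conv(x))

rgb = torch.rand(2, 3, 384, 1280)            # KITTI-like resolution
depth = torch.rand(2, 1, 384, 1280)          # Lidar points projected to image
print(EarlyFusionStem()(rgb, depth).shape)   # torch.Size([2, 32, 192, 640])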



Towards Robust Defect Detection in Casting Using Contrastive Learning

Eneko Intxausti Arbaiza, Ekhi Zugasti, Carlos Cernuda

Mondragon Unibertsitatea - Faculty of Engineering, Spain

Defect detection plays a vital role in ensuring product quality and safety within industrial casting processes. In these dynamic environments, the occasional emergence of new defects in the production line poses a significant challenge for supervised methods. We present a defect detection framework to effectively detect novel defect patterns without prior exposure during training. Our method is based on contrastive learning applied to the Faster R-CNN model, enhanced with a contrastive head to obtain discriminative representations of different defects. By training on a diverse and comprehensive labeled dataset, our method achieves performance comparable to the supervised baseline model, showcasing commendable defect detection capabilities. To evaluate the robustness of our approach, we authentically replicate a real-world use case by deliberately excluding several defect types from the training data. Remarkably, in this new context, our proposed method significantly improves the detection performance of the baseline model, particularly in situations with very limited training data, showing a remarkable 34.7% enhancement. Our research highlights the potential of the proposed method in real-world environments where the number of available images may be limited or nonexistent. By providing valuable insights into defect detection in challenging scenarios, our framework can contribute to ensuring efficient and reliable product quality and safety in industrial manufacturing processes.
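A contrastive head is typically trained with a supervised contrastive loss over region embeddings; the following is a generic sketch of such a loss, not necessarily the paper's exact formulation:

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of defect RoI embeddings:
    pulls same-class representations together, pushes other classes apart."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature                       # pairwise similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    sim = sim.masked_fill(eye, float("-inf"))         # exclude self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(eye, 0.0)         # avoid -inf * 0 below
    has_pos = pos.any(dim=1)                          # skip rows without positives
    loss = -(log_prob * pos)[has_pos].sum(dim=1) / pos[has_pos].sum(dim=1)
    return loss.mean()

emb = torch.randn(16, 128)                   # RoI features from the head
labels = torch.randint(0, 4, (16,))          # defect classes
print(supervised_contrastive_loss(emb, labels))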



DIF-SR: A Differential Item Functioning-Based Sample Reweighting Method

Diego Minatel, Antonio R. S. Parmezan, Mariana Curi, Alneu de Andrade Lopes

University of Sao Paulo (USP), Brazil

In recent years, numerous machine learning-based systems have actively propagated discriminatory effects and harmed historically disadvantaged groups through their decision-making. This undesired behavior highlights the importance of research topics such as fairness in machine learning, whose primary goal is to include fairness notions into the training process to build fairer models. In parallel, Differential Item Functioning (DIF) is a mathematical tool often used to identify bias in test preparation for candidate selection; DIF detection assists in identifying test items that disproportionately favor or disadvantage candidates solely because they belong to a specific sociodemographic group. This paper argues that transposing DIF concepts into the machine learning domain can lead to promising approaches for developing fairer solutions. As such, we propose DIF-SR, the first DIF-based Sample Reweighting method for weighting samples so that the assigned values help build fairer classifiers. DIF-SR can be seen as a data preprocessor that imposes more importance on the most auspicious examples in achieving equity ideals. We experimentally evaluated our proposal against two baseline strategies by employing twelve datasets, five classification algorithms, four performance measures, one multicriteria measure, and one statistical significance test. Results indicate that the sample weight computed by DIF-SR can guide supervised machine learning methods to fit fairer models, simultaneously improving group fairness notions such as demographic parity, equal opportunity, and equalized odds.
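The abstract does not give DIF-SR's weighting formula. As a simpler, well-known relative, classic reweighing (Kamiran & Calders) also assigns per-sample weights that decorrelate label and group; the sketch below shows that baseline idea, not DIF-SR itself:

import numpy as np

def reweigh(y, group):
    """Classic reweighing: weight each (group, label) cell so that label and
    group look statistically independent. A simpler relative of DIF-SR's idea
    of up-weighting samples that promote fairness, not the DIF-based method."""
    y, group = np.asarray(y), np.asarray(group)
    w = np.empty(len(y))
    for g in np.unique(group):
        for c in np.unique(y):
            cell = (group == g) & (y == c)
            expected = (group == g).mean() * (y == c).mean()
            w[cell] = expected / (cell.mean() + 1e-12)
    return w

y     = np.array([1, 1, 0, 0, 1, 0, 0, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # sociodemographic attribute
print(reweigh(y, group))   # pass as sample_weight to any sklearn classifier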



Improving Pest Detection via Transfer Learning

Dinis Costa1, Catarina Silva1, Joana Costa1,2, Bernardete Ribeiro1

1CISUC, Dep. of Informatics Engineering, Coimbra, Portugal; 2Polytechnic Institute of Leiria, School of Technology and Management, Leiria, Portugal

Pest monitoring models aid in making informed decisions for pest control and in implementing effective management strategies. In the context of smart farming, various approaches have been developed that surpass traditional techniques in efficiency and accuracy. However, the application of Few-Shot Learning (FSL) methods in this domain remains limited. In this study, we address this gap by leveraging Transfer Learning (TL). Our findings highlight the considerable efficacy of applying transfer learning techniques in this context, demonstrating a significant improvement in mAP performance of 24% and a 10% reduction in training time.
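A typical transfer-learning setup for detection, sketched with torchvision's Faster R-CNN as a stand-in (the abstract does not name the detector): load pretrained weights, replace the classification head for the pest classes, and optionally freeze the backbone:

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_pest_detector(num_pest_classes):
    """Transfer learning for detection: start from COCO-pretrained weights
    and replace only the box-prediction head for the pest classes."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # +1 for the background class used by torchvision detectors
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_pest_classes + 1)
    # Freeze the backbone so only the head adapts when samples are few.
    for p in model.backbone.parameters():
        p.requires_grad = False
    return model

model = build_pest_detector(num_pest_classes=5)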



Single Image HDR Synthesis with Histogram Learning

Huei-Yung Lin1, Yi-Rung Lin2, Wen-Chieh Lin3

1National Taipei University of Technology; 2National Chung Cheng University; 3National Yang Ming Chiao Tung University

High dynamic range (HDR) imaging aims for a more accurate representation of the scene, providing a luminance coverage large enough to match the range of human perception. In this paper, we present a technique to synthesize an HDR image from an LDR input. The proposed two-stage approach expands the dynamic range and predicts its histogram with cumulative histogram learning. Histogram matching is then carried out to reallocate the pixel intensities.

In the second stage, HDR images are constructed using reinforcement learning with pixel-wise rewards for local consistency adjustment. Experiments are conducted on the HDR-Real and HDR-EYE datasets. The quantitative evaluation on HDR-VDP-2, PSNR, and SSIM demonstrates the effectiveness of the approach compared to state-of-the-art techniques.
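The first-stage histogram-matching step can be sketched with plain NumPy; the predicted histogram below is synthetic and the bin layout is an assumption:

import numpy as np

def match_histogram(image, target_hist):
    """Reallocate pixel intensities so the image's histogram follows
    target_hist (e.g. the histogram predicted for the HDR output)."""
    values, counts = np.unique(image.ravel(), return_counts=True)
    source_cdf = np.cumsum(counts) / image.size
    target_cdf = np.cumsum(target_hist) / np.sum(target_hist)
    bins = np.arange(len(target_hist))
    # Map each source intensity to the target intensity with the same CDF.
    matched = np.interp(source_cdf, target_cdf, bins)
    return matched[np.searchsorted(values, image)]

rng = np.random.default_rng(0)
ldr = rng.integers(0, 256, size=(64, 64))           # LDR input
predicted = np.exp(-np.linspace(0, 4, 1024))        # toy predicted HDR histogram
hdr = match_histogram(ldr, predicted)
print(hdr.min(), hdr.max())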



Deblur Capsule Networks

Daniel Felipe Silva Santos, Joao Paulo Papa

Sao Paulo State University, Brazil

Blur is often caused by physical limitations of the image acquisition sensor or by unsuitable environmental conditions. Blind image deblurring recovers the underlying sharp image from its blurry counterpart without further knowledge of the blur kernel or the sharp image itself. Traditional deconvolution filters are highly dependent on specific kernels or prior knowledge to guide the deblurring process. This work proposes an end-to-end deep learning approach that addresses blind image deconvolution in three stages: (i) it first predicts the blur type, (ii) it then deconvolves the blurry image with the identified and reconstructed blur kernel, and (iii) it applies deep regularization to the output image. Our proposed approach, called Deblur Capsule Networks, explores the capsule structure in the context of image deblurring. This versatile structure showed promising results for synthetic uniform camera motion and for multi-domain blind deblurring of general-purpose and remote-sensing image datasets, compared to some state-of-the-art techniques.
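Stage (ii), non-blind deconvolution with a reconstructed kernel, can be illustrated with a classical Wiener filter (a stand-in for the paper's learned pipeline); the kernel and image here are synthetic:

import numpy as np
from scipy.signal import convolve2d
from skimage.restoration import wiener

# Once the blur type has been identified and its kernel reconstructed (here a
# uniform horizontal motion kernel), Wiener deconvolution recovers a sharp
# estimate. Stages (i) and (iii) -- blur-type prediction and deep
# regularisation -- are the learned parts of the paper and are not shown.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
kernel = np.full((1, 9), 1 / 9)                       # motion-blur kernel
blurry = convolve2d(sharp, kernel, mode="same", boundary="symm")

restored = wiener(blurry, kernel, balance=0.05)
print(np.abs(restored - sharp).mean())                # reconstruction error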


