This artificial intelligence (AI) paper proposes a new differentiable angle coder named phase-shifting coder (PSC) to accurately predict the orientation of objects

Object detection is the task of identifying, classifying, and locating objects in an image. Most current methods rely on deep learning, which predicts the position of each object quickly and accurately and is therefore useful in a wide range of applications.

Since objects in natural scenes generally stand upright due to gravity, early research focused primarily on detecting objects with horizontal bounding boxes. Oriented bounding boxes are preferred in other contexts, such as aerial imagery, industrial inspection, and scene text, and oriented object detection has grown in importance as a result. Two significant challenges remain, however: the boundary discontinuity problem, caused primarily by angular periodicity, and the square-like problem, which occurs because the orientation of a square bounding box cannot be uniquely defined. To solve these problems, a Chinese research team from Southeast University proposes using phase-shifting coding for angle prediction in oriented object detection.
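The boundary discontinuity can be illustrated with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and the angle convention (degrees in [-90, 90) for elongated boxes) is an assumption made for illustration, not something taken from the paper.

```python
def naive_angle_error(pred_deg, target_deg):
    """Plain absolute difference, as a naive regression loss would see it."""
    return abs(pred_deg - target_deg)


def periodic_angle_error(pred_deg, target_deg, period_deg=180.0):
    """Smallest angular gap once rotations by period_deg count as identical."""
    gap = abs(pred_deg - target_deg) % period_deg
    return min(gap, period_deg - gap)


# Two nearly identical elongated boxes on either side of the +/-90 degree boundary.
print(naive_angle_error(89.0, -89.0))     # large, although the boxes almost coincide
print(periodic_angle_error(89.0, -89.0))  # small once the 180-degree period is respected

# A square box looks the same after a 90-degree rotation, so its period is 90.
print(periodic_angle_error(44.0, -44.0, period_deg=90.0))
```

The naive error is large for two boxes that are visually almost the same, which is the kind of discontinuity a direct angle regressor has to cope with near the boundary.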

The authors propose adapting phase-shifting coding, a technique developed mainly for optical measurement, to oriented object detection. The resulting angle coder is called the phase-shifting coder (PSC). This choice was made for two main reasons:

1 – In optical measurement, phase shifting encodes the measured distance as a periodic phase. The orientation angle can be encoded as a periodic phase in the same way, which resolves the boundary discontinuity automatically.

2 – Phase shifting also suffers from a periodic ambiguity problem, which is comparable to the square-like problem and for which several solutions already exist. The dual-frequency phase-shifting approach, for example, resolves periodic ambiguity by combining phases of multiple frequencies.
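The idea of encoding an angle as a periodic phase can be sketched with the standard N-step phase-shifting formulas. The code below is an illustrative assumption rather than the authors' implementation: the function names, the three-step default, and the [-pi/2, pi/2) angle range are all chosen here for the example.

```python
import numpy as np


def psc_encode(theta, n_steps=3):
    """Encode an angle theta (radians, assumed in [-pi/2, pi/2)) as n_steps cosines."""
    phase = 2.0 * theta  # a box repeats every pi, so stretch the angle to a 2*pi period
    shifts = 2.0 * np.pi * np.arange(1, n_steps + 1) / n_steps
    return np.cos(phase + shifts)


def psc_decode(code):
    """Recover the angle from its phase-shifted cosines (n_steps >= 3)."""
    code = np.asarray(code, dtype=float)
    n_steps = len(code)
    shifts = 2.0 * np.pi * np.arange(1, n_steps + 1) / n_steps
    phase = np.arctan2(-np.sum(code * np.sin(shifts)),
                       np.sum(code * np.cos(shifts)))
    return phase / 2.0


theta = 0.3
print(psc_decode(psc_encode(theta)))  # recovers theta up to floating-point error

# Angles just inside either end of the range get nearly identical codes,
# so a regressor predicting the code sees no jump at the boundary.
near_upper = psc_encode(np.pi / 2 - 0.01)
near_lower = psc_encode(-np.pi / 2 + 0.01)
print(np.max(np.abs(near_upper - near_lower)))
```

In a detector, the network would regress the encoded values instead of the raw angle, and decoding would be applied to its output. Both steps use only differentiable operations, which is consistent with the paper's description of PSC as a differentiable coder.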

The authors argue that the boundary problem and the square-like problem can be treated in a unified way. The boundary problem arises because a rectangular bounding box coincides with itself when rotated by 180 degrees, while the square-like problem arises because a square box coincides with itself when rotated by 90 degrees. Although their periods differ, both are periodic ambiguity problems. An improved version, the dual-frequency phase-shifting coder (PSCD), is proposed to handle both.
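One way to picture the dual-frequency idea is to encode the angle at two frequencies, one with a 180-degree period and one with a 90-degree period, and let the coarse phase select among the candidates left open by the fine phase. The sketch below uses a generic dual-frequency unwrapping rule. The function names, the three-step setting, and the exact way the two phases are combined are assumptions for illustration and may differ from the paper.

```python
import numpy as np

N_STEPS = 3  # assumed number of phase-shifting steps per frequency
SHIFTS = 2.0 * np.pi * np.arange(1, N_STEPS + 1) / N_STEPS


def encode_phase(phase):
    """Phase-shifted cosine code of a single phase value."""
    return np.cos(phase + SHIFTS)


def decode_phase(code):
    """Wrapped phase in [-pi, pi] recovered from a cosine code."""
    return np.arctan2(-np.sum(code * np.sin(SHIFTS)),
                      np.sum(code * np.cos(SHIFTS)))


def pscd_encode(theta):
    """Concatenate a 180-degree-period code and a 90-degree-period code."""
    return np.concatenate([encode_phase(2.0 * theta), encode_phase(4.0 * theta)])


def pscd_decode(code):
    coarse = decode_phase(code[:N_STEPS]) / 2.0  # unique over 180 degrees
    fine = decode_phase(code[N_STEPS:]) / 4.0    # unique only over 90 degrees
    # Pick the 90-degree multiple that brings the fine estimate closest to the coarse one.
    k = np.round((coarse - fine) / (np.pi / 2.0))
    theta = fine + k * np.pi / 2.0
    return (theta + np.pi / 2.0) % np.pi - np.pi / 2.0  # wrap to [-pi/2, pi/2)


print(pscd_decode(pscd_encode(1.0)))  # recovers 1.0 up to floating-point error

# Rotating by 90 degrees leaves the second-frequency part of the code unchanged,
# which is the behaviour wanted for square boxes.
print(np.allclose(pscd_encode(0.2)[N_STEPS:], pscd_encode(0.2 + np.pi / 2)[N_STEPS:]))
```

For a square-like object the first-frequency part carries no reliable information, but the second-frequency part is still well defined, so the decoded box is correct up to a 90-degree rotation that does not change its shape.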

The proposed methods (PSC and PSCD) were evaluated on three publicly available datasets, DOTA, HRSC, and OCDPCB, using PyTorch together with the ultralytics/yolov5 and MMRotate toolkits. Mean average precision (mAP) was chosen as the main metric for comparison with the existing literature.

In addition, to confirm the effectiveness of the dual-frequency module and to help researchers choose between the two variants, the study provides a direct comparison between single-frequency PSC and dual-frequency PSCD. A visual comparison shows that the dual-frequency approach works as expected and offers a unified solution to the boundary discontinuity and square-like problems. The dual-frequency variant is therefore recommended for scenes that contain square-like objects.

According to the authors, this work is the first to use a phase-shifting coder in deep learning to address orientation angle regression. The proposed method encodes the orientation angle as a periodic phase to solve the boundary discontinuity problem. Building on PSC, the improved dual-frequency variant PSCD solves both the boundary discontinuity and square-like problems by mapping rotational periodicities of different cycles onto phases of different frequencies. The authors have also released public code with reproducible results.

Check out the paper and code. All credit for this research goes to the researchers on this project.

Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical sciences and a master's degree in telecommunications systems and networks. His current research focuses on computer vision, stock market prediction and learning. He has produced several scientific articles on the re-identification and study of the robustness and stability of
