
Dialnet


Deep and reinforcement learning in perception and control for autonomous aerial robots

  • Author: Alejandro Rodriguez-Ramos
  • Thesis supervisor: Pascual Campoy Cervera (supervisor)
  • Defense: Universidad Politécnica de Madrid (Spain), 2020
  • Language: Spanish
  • Thesis committee: Martín Molina González (chair), Miguel Hernando Gutiérrez (secretary), Lakmal Seneviratne (member), Arturo de la Escalera Hueso (member), José María Cañas Plaza (member)
  • Doctoral program: Programa de Doctorado en Automática y Robótica, Universidad Politécnica de Madrid
  • Abstract
    • Machine learning methods have expanded at an unprecedented pace over the last decade. Their ability to solve problems in diverse domains has placed them at the focus of several research lines and industrial projects.

      In parallel, research and innovation in robotic systems have grown steadily, in terms of both hardware and classical algorithmic development. Nevertheless, the level of robotic autonomy provided by hand-engineered algorithms, which commonly rely on simplified models of the problem, is reaching a technological limit. In this context, machine learning methods such as deep and reinforcement learning have provided outstanding results in complex scenarios (e.g. computer vision tasks) that require processing high-dimensional information or heterogeneous data sources.

      Following these ideas, this doctoral thesis is framed within a novel paradigm that the robotics field is currently exploring, in which robots learn high-level behaviors in a simulated environment and are then deployed in a relevant real-world environment.

      For the first time, deep and reinforcement learning methods and simulated environments have been used to solve challenging vision-based applications in aerial robotics, such as multirotor landing on a moving platform and non-cooperative multirotor following. Additionally, deep and reinforcement learning techniques have been applied to object detection in video sequences. All of the techniques have been designed, implemented, and thoroughly validated in a wide variety of real and relevant scenarios. The stated applications have been formulated as vision-based problems so that they can be solved with low-cost, off-the-shelf multirotors and/or embedded systems.

      In this doctoral work, autonomous multirotor landing on a moving platform has been explored using deep reinforcement learning methods. The complete approach has been learned in a simulated environment and deployed in a real environment. The strategy has been exhaustively validated and compared with several classical multirotor landing techniques, further verifying its effectiveness. In addition, the task of autonomous non-cooperative multirotor following has been solved through deep and reinforcement learning, which posed a greater challenge owing to, among other factors, its higher dimensionality and the complexity of the maneuver. Similarly, the complete application has relied solely on synthetic information, such as low-level simulated states and images. Moreover, a method for using synthetic photorealistic images for object detection has been proposed, with the final requirement of performing on real-world images.

      Finally, in this thesis, deep and reinforcement learning techniques have been researched for video object detection, with the overall aim of exploiting the temporal information present in the frames of a video sequence to reduce processing latency. The approach is inspired by the way human vision can disregard the context around an object while tracking it attentively. The implementation led to dynamic context reuse across frames, along with a special temporal structure that further reduced the computational cost of video processing. On this subject, an innovative technique has been proposed in which a reinforcement learning policy is trained with a distribution of reward functions, so that a single policy can encapsulate several behaviors. At inference time, the policy can be conditioned on a single behavior, depending on the requirements of the application. The proposed technique has been validated on the video object detection application. However, the method is generic enough to be applied to other reinforcement learning applications.

      This thesis by compendium is composed of three peer-reviewed scientific journal publications. These publications contribute equally to satisfying the objectives of this doctoral thesis, following a clear and progressive thematic unity. Moreover, they have extended the state of the art in the stated aerial applications and have contributed to the use of synthetic information in real-world robotics.
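
The idea mentioned in the abstract, a single policy trained with a distribution of reward functions and conditioned on one behavior at inference time, can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the three actions and their accuracy/cost values, the scalar behavior parameter `w`, the linear softmax policy, and the use of an exact policy gradient over the three actions instead of a deep network trained from sampled rollouts.

```python
import math
import random

# Hypothetical toy actions for a video detector: (accuracy, compute cost).
# These names and numbers are illustrative assumptions, not values from the thesis.
ACTIONS = {
    "full_detection": (1.0, 1.0),
    "reuse_context": (0.7, 0.4),
    "skip_frame": (0.2, 0.0),
}
NAMES = list(ACTIONS)


def reward(action, w):
    """One member of the reward family: w weights accuracy, (1 - w) penalizes cost."""
    accuracy, cost = ACTIONS[action]
    return w * accuracy - (1.0 - w) * cost


def policy(theta, w):
    """Softmax policy whose logits are linear in the behavior parameter w."""
    logits = [theta[a][0] * w + theta[a][1] * (1.0 - w) for a in NAMES]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(theta, w):
    """Expected reward of the policy under the reward function indexed by w."""
    return sum(p * reward(a, w) for p, a in zip(policy(theta, w), NAMES))


def train(steps=5000, lr=0.5, seed=0):
    """Train one shared parameter set while sampling a new reward function each step."""
    rng = random.Random(seed)
    theta = {a: [0.0, 0.0] for a in NAMES}
    for _ in range(steps):
        w = rng.random()  # sample a reward function from a uniform distribution
        probs = policy(theta, w)
        baseline = sum(p * reward(a, w) for p, a in zip(probs, NAMES))
        for p, a in zip(probs, NAMES):
            # Exact gradient of the expected reward with respect to this action's logit.
            grad = p * (reward(a, w) - baseline)
            theta[a][0] += lr * grad * w
            theta[a][1] += lr * grad * (1.0 - w)
    return theta


def mean_return(theta, grid=101):
    """Average expected reward over evenly spaced values of w in [0, 1]."""
    return sum(expected_reward(theta, i / (grid - 1)) for i in range(grid)) / grid


if __name__ == "__main__":
    trained = train()
    for w in (0.0, 0.5, 1.0):  # condition the same policy on different behaviors
        probs = policy(trained, w)
        print(w, NAMES[probs.index(max(probs))])
```

In this sketch the same parameters serve every value of `w`, so choosing `w` at inference time selects a trade-off between accuracy and cost without retraining. A realistic version would presumably replace the linear logits with a neural network and the exact gradient with a sampled reinforcement learning update.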

