
    PhD "Spatial neural audio coding" F/M - Cesson-Sévigné, France - Orange Business Services


    3 weeks ago

    Description

    about the role

    Your role is to conduct PhD work on multichannel audio coding using deep neural networks (DNN).

    Background and problem

    After MPEG-H for streaming, the 3GPP IVAS (Immersive Voice and Audio Services) standard, to which Orange has contributed, makes it possible for a conversational codec to address immersive services, with some spatial user interaction. This converges with virtual meetings (bringing the naturalness of physical meetings and even more interactivity) and with the concept of the augmented collaborator (including augmented reality). "Spatial parametric" approaches seem well suited to interaction needs, as does Higher Order Ambisonics (HOA).

    In recent years, neural approaches to (single-channel) audio coding have made enormous strides in the trade-off between quality and bitrate. Generative adversarial networks (GANs) enabled this improvement. The algorithmic complexity of these approaches is sometimes too high for practical use.

    However, end-to-end neural spatial (multichannel) audio coding has received little attention so far.

    Voice and audio coding is in Orange's DNA, and the quality of its services depends on it.

    Scientific objective - results and challenges

    The aim of this PhD is to design a spatialized audio codec based on a deep neural network (DNN) offering interactive playback possibilities.

    To bring interactivity to decoding, progress is needed in the design of interpretable neural networks. To this end, we consider combining tasks that have so far generally been handled separately: coding, source separation, enhancement, spatial analysis and dereverberation. Combining tasks in this way is common practice in the DNN field, but it has only partially been applied to these particular tasks.

    It will be necessary to make use of state-of-the-art models and compete with them, while aiming for reduced complexity/consumption.

    A critical aspect of the thesis is the use (or even the creation) of 3D audio databases for training and evaluating the algorithms developed.

    about you

  • Scientific and technical skills and soft skills needed for the position
      • A solid background in machine learning techniques and deep neural networks
      • Advanced knowledge of (audio) signal processing
      • Understanding of the spatial properties of acoustics
      • Knowledge of coding principles
      • Autonomy and initiative
      • Ability to summarize and communicate their ideas and results
  • Education required
      • 5-year degree (master, engineering degree, etc.) in signal processing, machine learning or acoustics
      • Knowledge of the audio field is a plus
      • Experience with the Python language (PyTorch library)
  • additional information

    The PhD work must give central importance to interpretability and algorithmic efficiency. These two notions are critical to the development of neural network-based techniques. Indeed, a lack of interpretability in algorithms leads to safety and robustness problems. Algorithmic efficiency, both in training and in deployment, is crucial in an environment of finite resources.

    The performance of deep neural networks relies on the relevance of the training datasets used, as well as on considerable expertise in the phenomena involved, which is needed to design the architecture effectively. You will be part of a team that has both unique databases adapted to multichannel coding and expertise in acoustics, signal processing and psychoacoustics. You'll also be tackling issues that are critical in today's data-driven algorithm development.

    department

    Orange Innovation brings together the research and innovation activities and expertise of the Group's entities and countries. We work every day to ensure that Orange is recognized as an innovative provider by its customers and we create value for the Group and the Brand in each of our projects. With 740 researchers, thousands of marketers, developers, designers and data analysts, it is the expertise of our 6,000 employees that fuels this ambition every day.

    Orange Innovation anticipates technological breakthroughs and supports the Group's countries and entities in making the best technological choices to meet the needs of our consumer and business customers.

    Within Innovation, you will be part of a team at the forefront of innovation and expertise in audio telecommunication systems (sound recording, denoising, scene analysis, spatialization, coding/compression, etc.). You'll be involved in a research ecosystem where researchers and engineers work together to put concepts into practice, using state-of-the-art audio systems for sound recording, broadcast and transmission.

    contract

    Thesis


