Internship Research On Partial Latent Encoding of - Paris, France - Inria

Inria
Verified company
Paris, France

3 weeks ago

Posted by:

Sophie Dupont

beBee Recruiter


Internship
Description
The job description below is in English.


Contract type: Internship agreement
Required degree level: Master's degree (Bac + 5) or equivalent
Role: Research intern
Desired level of experience: Recent graduate

About the centre or functional department

The Inria centre at Université Côte d'Azur includes 37 research teams and 8 support services. The centre's staff (about 500 people) is made up of scientists of different nationalities, engineers, technicians and administrative staff.

The teams are mainly located on the university campuses of Sophia Antipolis and Nice as well as Montpellier, in close collaboration with research and higher education laboratories and establishments (Université Côte d'Azur, CNRS, INRAE, INSERM...), but also with the regional economic players.


With a presence in the fields of computational neuroscience and biology, data science and modeling, software engineering and certification, as well as collaborative robotics, the Inria Centre at Université Côte d'Azur is a major player in terms of scientific excellence through its results and collaborations at both European and international levels.

Context and advantages of the position


The work will be part of a collaborative project between Université Paris Cité (team LIPADE, Paris) and Inria (team EVERGREEN, Montpellier).


By using location on the Earth's surface as the common link between different modalities, a geo-spatial foundation model would be able to incorporate a variety of data sources, including remote sensing imagery, textual descriptions of places, and features in maps.

Leveraging the large amounts of unlabeled geo-spatial data available from these different sources, the GEO-ReSeT (Generalized Earth Observation with Remote Sensing and Text) ANR project aims to learn a better representation of any geo-spatial location and to convey a semantic representation of the information.


By leveraging several data modalities, this foundation model could provide a more comprehensive and accurate understanding of the Earth's surface, enabling more informed decisions and actions.

This will be particularly valuable for new potential users in sectors such as journalism, social sciences or environmental monitoring, who may not have the resources or expertise to collect their own training datasets and develop their own methods, thus moving beyond open Earth observation data and democratizing the access to Earth observation information.

Assigned mission


The work to be conducted during the proposed M2 internship will contribute to the ambition of the GEO-ReSeT ANR project by proposing a new methodology for projecting multi-modal data of different natures to a common latent space.

One classical way to achieve this is through a contrastive self-supervised learning approach. A feature extractor for each modality is trained through a contrastive loss.

This loss ensures that similar examples (in the case of geo-spatial data, from the same geographical location) are close in the feature space, while dissimilar examples are projected far away.
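
To make the idea concrete, here is a minimal sketch of one common contrastive objective, a symmetric InfoNCE loss, written with NumPy. It is an illustration only, not the loss used in the project: the function name `info_nce_loss`, the temperature value and the random toy embeddings are all assumptions, and in practice the embeddings would come from the per-modality feature extractors.

```python
import numpy as np

def info_nce_loss(feats_a, feats_b, temperature=0.1):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of feats_a and row i of feats_b are assumed to describe
    the same geographical location (a positive pair).
    """
    # L2-normalise so that dot products are cosine similarities.
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (batch, batch) similarity matrix

    def cross_entropy_diag(z):
        # Cross-entropy where the correct "class" for row i is column i.
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the a->b and b->a directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))

# Toy data (hypothetical): "text" embeddings are noisy copies of "image" ones.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(8, 16))
aligned_text_emb = image_emb + 0.01 * rng.normal(size=(8, 16))
shuffled_text_emb = aligned_text_emb[::-1]  # break the location pairing

loss_aligned = info_nce_loss(image_emb, aligned_text_emb)
loss_shuffled = info_nce_loss(image_emb, shuffled_text_emb)
```

The loss is low when embeddings from the same location agree and higher when the pairing is broken, which is the behaviour the paragraph above describes.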

These self-supervised models can then be used on downstream tasks through linear probing.
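
As a rough sketch of linear probing, the snippet below fits a single softmax layer on top of fixed features with plain gradient descent in NumPy. The feature matrix stands in for the outputs of a frozen pre-trained encoder; the helper names, the toy two-cluster data and the hyperparameters are assumptions made for the example.

```python
import numpy as np

def fit_linear_probe(features, labels, n_classes, lr=0.1, steps=200):
    """Fit a softmax linear classifier on frozen features by gradient descent."""
    n_samples, n_dims = features.shape
    weights = np.zeros((n_dims, n_classes))
    bias = np.zeros(n_classes)
    one_hot = np.eye(n_classes)[labels]
    for _ in range(steps):
        logits = features @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradient of the mean cross-entropy with respect to the logits.
        grad_logits = (probs - one_hot) / n_samples
        # Only the linear layer is updated; the features never change.
        weights -= lr * features.T @ grad_logits
        bias -= lr * grad_logits.sum(axis=0)
    return weights, bias

def predict(features, weights, bias):
    return np.argmax(features @ weights + bias, axis=1)

# Toy stand-in for frozen encoder outputs: two well-separated clusters.
rng = np.random.default_rng(0)
class0 = rng.normal(loc=-2.0, size=(50, 4))
class1 = rng.normal(loc=2.0, size=(50, 4))
frozen_features = np.vstack([class0, class1])
labels = np.array([0] * 50 + [1] * 50)

weights, bias = fit_linear_probe(frozen_features, labels, n_classes=2)
accuracy = np.mean(predict(frozen_features, weights, bias) == labels)
```

Because the encoder is kept frozen, the probe's accuracy reflects how linearly separable the learned representation already is for the downstream task.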


This approach tends to work well on natural images and has been successfully applied to geo-spatial data, such as remote sensing images.

However, retaining the particularities of each modality, each giving partial information about the underlying reality, is a challenge. In this work, we propose to learn factorized representations of each modality.

Main activities


In this work, our objective is to explicitly model which part of the latent space is concerned with each of the modalities.

We propose to achieve this objective by modeling the uncertainty on the feature representation of each modality.
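
The posting does not specify how this uncertainty would be modeled. One possible reading, given here purely as an assumption, is that each modality's encoder outputs a mean and a per-dimension variance, so that a large variance marks latent dimensions about which the modality carries little information. The sketch below fuses such diagonal Gaussian embeddings by precision weighting (a product of Gaussians); the function name and the numbers are hypothetical.

```python
import numpy as np

def fuse_gaussian_embeddings(means, variances):
    """Precision-weighted fusion of per-modality diagonal Gaussian embeddings.

    means, variances: array-likes of shape (n_modalities, latent_dim).
    Dimensions where a modality has high variance contribute little.
    """
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    fused_var = 1.0 / precisions.sum(axis=0)
    fused_mean = fused_var * (precisions * means).sum(axis=0)
    return fused_mean, fused_var

# Hypothetical 2-D latent space: the image is confident about dimension 0,
# the text about dimension 1.
image_mean, image_var = [1.0, 0.0], [0.01, 100.0]
text_mean, text_var = [0.0, 2.0], [100.0, 0.01]

fused_mean, fused_var = fuse_gaussian_embeddings(
    [image_mean, text_mean], [image_var, text_var]
)
```

In this toy case the fused mean is close to 1 in dimension 0 and close to 2 in dimension 1, so each modality effectively "owns" the part of the latent space where it is certain.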

The work to be performed in this internship will lead to the following three contributions:

Skills

  • Python programming
  • Deep Learning with Python (preferably with PyTorch)
  • Experience with Remote Sensing imagery
Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage
