Internship - AI Engineer - Uncertainty Estimation - Clamart, France - Schlumberger

Schlumberger

Verified company

Clamart, France

2 weeks ago

Posted by:

Sophie Dupont

beBee Recruiter

Internship

Description

Job title:

Internship - AI Engineer - Uncertainty estimation and active learning (6 months)

About Us:

We are a global technology company, driving energy innovation for a balanced planet.

At SLB we create amazing technology that unlocks access to energy for the benefit of all. That is our purpose. As innovators, that's been our mission for 100 years. We are facing the world's greatest balancing act: how to simultaneously reduce emissions and meet the world's growing energy demands. We're working on that answer. Every day, a step closer.

Our collective future depends on decarbonizing the fossil fuel industry, while innovating a new energy landscape. It's what drives us. Ensuring progress for people and the planet, on the journey to net zero and beyond. For a balanced planet.

Our Purpose
Together, we create amazing technology that unlocks access to energy for the benefit of all.

Location:
Clamart, France

Come and join SLB's AI Lab in Paris. We are currently offering internships to bright minds specializing in Data Science and Artificial Intelligence. Discover a multinational company. We have brought a little bit of Silicon Valley to Paris.

Experience working within a team of young, fun, and passionate data scientists, tackling real business challenges in tandem with business experts who sit right alongside you.

Description:

Two of the biggest issues preventing the widespread use of machine learning in the energy industry are the need for expert-labelled data and the lack of confidence in the trained models.

The issue with expert-labelled data is that experts generally have only a very limited amount of time available for labelling.

This means that any project must limit itself to the very minimum number of labels necessary.

The issue with confidence in the trained models, on the other hand, is that we do not know when a model can be trusted.

Given data similar to the training data, the model might indeed perform very well, yet it can fail catastrophically when the data becomes sufficiently different from the training data.

Is there a way for the model itself to predict such failures, and tell us when it should and should not be trusted?

These two seemingly unrelated topics are brought together in the field of "Active Learning".

Its goal is to reduce the number of labels needed by choosing which data points to label, based on a prediction of how much a label on a given data point would improve the model's performance on all unseen data.

In this way, it aims to minimize the labelling cost of bringing a model to a given performance level.

To do this, a key piece of information is how certain the model is of the label of any unseen data point.

If the model already knows the answer, having that point labelled will add nothing to its performance.

This internship aims to investigate active learning at SLB and to help us develop recommendations and best practices.

We are interested in analyzing uncertainty sampling query strategies, which select the instances whose labels the model is least certain about.
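For candidates who would like a concrete picture of uncertainty sampling, here is a minimal sketch of three common scoring rules (least confidence, margin, and entropy) written in plain Python. It assumes the model already outputs class probabilities for each unlabelled instance. The function names and the `pool_probs` values are hypothetical and chosen only for illustration; they do not describe SLB's method or any particular library.

```python
import math

def least_confidence(probs):
    """1 minus the top class probability; higher means less certain."""
    return 1.0 - max(probs)

def margin(probs):
    """Gap between the two most likely classes; LOWER means less certain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def entropy(probs):
    """Shannon entropy in nats; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_queries(pool_probs, k, score=entropy):
    """Return indices of the k pool instances with the highest score."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical predicted class probabilities for four unlabelled instances.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # almost uniform, most uncertain
]

# Ask the expert to label the two most uncertain instances.
queries = select_queries(pool_probs, k=2)

# Margin is "lower is more uncertain", so negate it before ranking.
margin_queries = select_queries(pool_probs, k=2, score=lambda p: -margin(p))
```

On this toy pool both strategies pick the same two instances, but in general the three scores can rank instances differently, which is one of the things worth comparing.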

In order to do this, a trustworthy representation of uncertainty is necessary and should be considered as a key feature of any machine learning method.

A particular focus of this internship will be analyzing the types of uncertainty in prediction models, identifying their sources, and accurately estimating the overall prediction uncertainty.
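As an illustration of what "types of uncertainty" can mean in practice, the sketch below shows one common entropy-based decomposition of predictive uncertainty into an aleatoric part (noise in the data) and an epistemic part (disagreement between plausible models). It assumes you have several predictive distributions for the same input, for example from ensemble members or repeated stochastic forward passes. The function name and the example values are hypothetical, and this is only one possible decomposition, not a prescribed approach.

```python
import math

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for one input.

    member_probs: list of class-probability lists, one per ensemble
    member (or stochastic forward pass), all for the same input.
    Returns (total, aleatoric, epistemic), where
      total     = entropy of the averaged prediction,
      aleatoric = average entropy of the individual predictions,
      epistemic = total - aleatoric (a mutual-information estimate).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean_probs = [
        sum(m[c] for m in member_probs) / n_members for c in range(n_classes)
    ]
    total = entropy(mean_probs)
    aleatoric = sum(entropy(m) for m in member_probs) / n_members
    return total, aleatoric, total - aleatoric

# Hypothetical case 1: every member agrees that the input is ambiguous.
agree = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

# Hypothetical case 2: members are individually confident but disagree.
disagree = [[0.99, 0.01], [0.01, 0.99], [0.99, 0.01], [0.01, 0.99]]

total_a, aleatoric_a, epistemic_a = decompose_uncertainty(agree)
total_d, aleatoric_d, epistemic_d = decompose_uncertainty(disagree)
```

Both inputs have the same total uncertainty, but only the second has a large epistemic part. That distinction matters for active learning: a new label can reduce model disagreement, but it cannot remove noise that is inherent in the data.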

The intern will be in charge of defining the best strategy and methodology to provide a new answer product based on data science technologies (statistics and machine learning algorithms).

The intern will have the opportunity to use the available experimental equipment and numerical tools (SLB's research and engineering centers offer a wide range of possibilities).

Required skills:

Skills:
applied mathematics, probability & statistics, deep learning, time series, Bayesian statistics, and physics/electronics (general).

Cloud computing services:
Google Cloud Platform

Programming language:
Python, with PyTorch or Keras/TensorFlow

References:

YouTube: Professor Yarin Gal's Keynote on Human-in-the-loop Bayesian Deep Learning [link]

Deep Bayesian Active Learning with Image Data [link]
BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian AL [link]
Aleatoric and epistemic uncertainty in ML: An introduction [link]
Bayesian Deep Learning and a Probabilistic Perspective of Generalization [link]

SLB is an equal employment opportunity employer.

Qualified applicants are considered without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or other characteristics protected by law.