Deep Fuzzy Extractors for Document Integrity Check - Lyon, France - Université Lumière Lyon 2

Université Lumière Lyon 2

Verified company

Lyon, France

1 week ago

Posted by:

Sophie Dupont

beBee Recruiter

Description

Ref: ABG-118239

Master 2 / engineering internship
Duration: 6 months
Net monthly salary: approximately €600 to €650
21/11/2023
Université Lumière Lyon 2
Workplace
Lyon, Auvergne-Rhône-Alpes, France
Scientific fields
Computer science

Keywords

Deep learning, fuzzy extractors, integrity checking, document processing
Recruiting institution:

Founded in 1973, Université Lumière Lyon 2 welcomes nearly 30,000 students on its two campuses, ranging from undergraduate to doctoral level.

As a university of literature, languages, and human and social sciences, it comprises 13 teaching units spread over four main areas of teaching and research.

With 33 laboratories and four research federations, which cover the areas of literature, languages, and human and social sciences (LLSHS - Lettres, Langues, Sciences Humaines et Sociales), Université Lumière Lyon 2 bases its approach on innovation, interdisciplinarity, partnership and an international outlook.

Through the projects developed and coordinated by its 1000 researchers, the university aims to foster communication and discussion between the human and social sciences, on the one hand, and the hard sciences, on the other, and to put research at the centre of current societal and scientific challenges.

Université Lumière Lyon 2 has a strong focus on international cooperation and currently has agreements with 350 institutions throughout the world.

International students, whether part of an exchange programme or otherwise, account for more than 15% of the overall student body.

Description:

Context of the study
The current health situation is forcing the authorities to use digitized copies of paper documents.

However, the widespread availability of professional image-editing tools, simple scanning devices and high-quality printers is increasing the number of document forgeries.

Scanned copies can be easily forged using certain image editing tools (such as Photoshop or Gimp) or new approaches based on the use of deep learning.

As a result, there is a great need for efficient, robust solutions for verifying the integrity of printed documents, which are then digitized.

The aim is to extract a signature from an electronic document that can be used to verify the integrity of digitized documents.

When an electronic document is printed and scanned several times, a slightly different image of the document - due to the optical characteristics of the capture devices - is obtained each time.

A similar problem arises in biometrics.

For example, when we capture the same fingerprint several times, it is not possible to obtain perfectly identical images, even if they are very close.

The difficulty of developing a method for verifying the integrity of printed and scanned documents is similar to the difficulties encountered in biometrics: we want to record and then compare characteristics in order to deduce that two sets of data do indeed represent the same thing, despite the presence of noise.
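To make the noise problem concrete, here is a minimal sketch in Python. The feature vectors and the tolerance of 2 bits are made up for illustration and do not come from the cited work. It shows why an exact cryptographic hash comparison fails after a single bit flip, whereas a distance-based comparison still accepts the rescanned copy.

```python
import hashlib


def sha256_hex(bits):
    """Hash a list of 0/1 values with SHA-256."""
    return hashlib.sha256(bytes(bits)).hexdigest()


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical binary features of the original document.
reference = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]

# The same document after print-and-scan: one bit flipped by capture noise.
rescanned = list(reference)
rescanned[4] ^= 1

# Exact comparison breaks under any noise, however small.
exact_match = sha256_hex(reference) == sha256_hex(rescanned)

# A tolerance-based comparison survives it (tolerance of 2 is an assumption).
close_match = hamming(reference, rescanned) <= 2
```

The drawback of the distance-based comparison is that the reference features must be stored in the clear, which is one of the issues fuzzy extractors are designed to avoid.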

To account for the uniqueness property of biometric data, fuzzy extractors [3] that are robust to sensor noise are used [1].
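As an illustration of the general idea, the sketch below implements a toy code-offset construction in Python. It is not the scheme of [1] or [3]: the repetition code, the factor `REP`, the function names and the use of SHA-256 in place of a proper randomness extractor are all simplifying assumptions made for this example.

```python
import hashlib
import secrets

REP = 5  # hypothetical repetition factor: tolerates up to 2 flipped bits per block


def encode(key_bits):
    """Repetition code: repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def decode(code_bits):
    """Majority vote over each block of REP bits."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]


def xor(a, b):
    """Bitwise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]


def digest(bits):
    """SHA-256 of the key bits (a stand-in for a real randomness extractor)."""
    return hashlib.sha256(bytes(bits)).hexdigest()


def generate(features):
    """Enrollment: draw a random key, return its digest and public helper data.

    Assumes len(features) is a multiple of REP.
    """
    key = [secrets.randbelow(2) for _ in range(len(features) // REP)]
    helper = xor(features, encode(key))  # code-offset: codeword masked by the features
    return digest(key), helper


def reproduce(noisy_features, helper):
    """Verification: recover the key digest from a noisy reading and the helper data."""
    return digest(decode(xor(noisy_features, helper)))
```

With a 20-bit feature vector, `generate` yields a digest and helper data. Calling `reproduce` on a reading that differs by at most two bits per block of five returns the same digest, while a reading with three or more flipped bits in one block returns a different one.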

Unlike falsified biometric data, a falsified document does not differ significantly from its authentic version, making integrity verification more complex.

In previous work, an initial document integrity verification system was set up [1,2]. The features extracted are based on the analysis of intersections and bifurcations within alphanumeric characters.
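The posting does not say how [1,2] locate these points. One common way to find them, borrowed from fingerprint minutiae detection, is the crossing number computed on a one-pixel-wide skeleton of the character. The Python sketch below assumes that technique and assumes the input is already a skeletonized binary grid; the function names are illustrative.

```python
# 8 neighbours of a pixel, listed clockwise starting from the pixel above it.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def crossing_number(img, r, c):
    """Count 0 -> 1 transitions while walking once around pixel (r, c)."""
    vals = [img[r + dr][c + dc] for dr, dc in RING]
    return sum(vals[i] == 0 and vals[(i + 1) % 8] == 1 for i in range(8))


def junctions(img):
    """Return (row, col, kind) for interior skeleton pixels where branches meet."""
    found = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            if img[r][c] != 1:
                continue
            cn = crossing_number(img, r, c)
            if cn == 3:
                found.append((r, c, "bifurcation"))   # three branches meet
            elif cn >= 4:
                found.append((r, c, "intersection"))  # two strokes cross
    return found


# A skeletonized "+" shape: two strokes crossing at the centre.
plus = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
plus_points = junctions(plus)
```

In practice the skeleton could be obtained from a binarized scan with a thinning routine such as `skimage.morphology.skeletonize`, and the detected points would then be encoded into the feature vector. That encoding is left open here.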

We are now interested in exploring the use of machine learning for fuzzy feature extraction, based on a method previously developed in biometrics [4].

The objectives of this internship are:

Explore the use of neural networks for fuzzy feature extraction [4].
Adapt method [4] to document integrity verification.
Compare the existing methods [1,2] with the deep learning-based methods developed during the internship.

Profile:

Programming languages: Python.
Libraries for image analysis and processing: OpenCV, scikit-image (Python).
Machine learning frameworks: scikit-learn, PyTorch.
Scientific knowledge: signal processing, image analysis, machine learning and deep learning.
Knowledge in multimedia security will be considered a plus.
Languages: French or English.

Start date: