Phd Position F/m Efficient Deep Learning - Talence, France - Inria
Description
The job description below is in English.
Type of contract: Fixed-term contract (CDD)
Required level of qualification:
Master's degree or equivalent (Bac + 5)
Other valued qualification:
Master's degree in Computer Science, Mathematics, Machine Learning, or another technical field
Position:
PhD student
Context and assets of the position:
Objective
Optimise the training and inference of modern neural networks to create large-scale AI models for science. Develop theoretical approaches and corresponding software.
Is regular travel foreseen for this post?
Short-term visits to conferences and collaborating laboratories. In particular, the team maintains a close collaboration with Caltech within the framework of the ELF Associated Team.
Assigned mission:
Scientific Research context:
Work description:
Concerning the training phase, one group of methods proposes advanced parallelization techniques, such as model and pipelined parallelism, to which members of the Topal team have already contributed [1, 3, 4].
Offloading to CPU saves memory at the price of communication overhead, while activation checkpointing saves memory by recomputing parts of the computational graph during the backward pass, at the price of computation overhead.
All types of techniques can be combined to achieve better throughput. Recent papers consider a combination of pipeline parallelism with activation checkpointing techniques [5, 6].
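The memory/time trade-off behind activation checkpointing can be illustrated with a toy cost model (a sketch with invented unit costs, not part of the posting): checkpointing a chain of n layers every s layers keeps only the segment boundaries plus one active segment in memory, at the price of roughly one extra forward pass.

```python
import math

def checkpoint_costs(n_layers, segment):
    """Toy cost model for activation checkpointing on a chain of layers.

    Keeping only every `segment`-th activation (the segment boundaries)
    plus the activations of the segment currently being backpropagated
    gives peak memory of about n/segment + segment, while each discarded
    activation is recomputed once (roughly one extra forward pass).
    Units: one activation for memory, one layer forward for compute.
    """
    n_segments = math.ceil(n_layers / segment)
    peak_memory = n_segments + segment   # boundaries + active segment
    recompute = n_layers                 # one extra forward pass overall
    return peak_memory, recompute

# Without checkpointing (segment = n), memory grows linearly with depth:
print(checkpoint_costs(100, 100))   # (101, 100)
# Sweeping the segment size recovers the classic O(sqrt(n)) memory result:
best = min(range(1, 101), key=lambda s: checkpoint_costs(100, s)[0])
print(best, checkpoint_costs(100, best))   # 10 (20, 100)
```

Real schedulers such as Rockmate [1] work on arbitrary computational graphs rather than chains, and measure actual per-operation memory and time instead of unit costs.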
An important point is that algorithms with theoretically better time/memory complexity may, in practice, provide smaller benefits than analytical derivations suggest.
To make deep learning algorithms efficient in practice, it is important to combine software and hardware optimisation when designing new algorithms.
During the PhD, we plan to propose novel approaches to improve the efficiency (memory/time/communication costs) of neural network training and inference, in particular by finding the best model execution schedule, one that combines different types of techniques including, but not limited to, parallelism, re-materialization, offloading, and low-bit computations. Along with theoretical contributions to the field, software will be developed to automatically optimise the training and inference of modern deep learning architectures.
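A minimal sketch of what such an execution schedule could look like (the per-layer costs below are invented illustrative integers, not measurements, and exhaustive search only works at toy scale; the thesis would develop principled algorithms for realistic graphs):

```python
from itertools import product

# Hypothetical per-layer costs in arbitrary integer units (illustrative only):
# each technique trades peak activation memory against time overhead.
CHOICES = {
    "store":     {"memory": 10, "overhead": 0},  # keep activation on the GPU
    "recompute": {"memory": 0,  "overhead": 2},  # re-materialization
    "offload":   {"memory": 1,  "overhead": 1},  # move to CPU, pay transfers
}

def best_schedule(n_layers, memory_budget):
    """Pick one technique per layer so that total memory fits the budget
    and total time overhead is minimal (brute-force search, toy scale)."""
    best = None
    for plan in product(CHOICES, repeat=n_layers):
        memory = sum(CHOICES[c]["memory"] for c in plan)
        overhead = sum(CHOICES[c]["overhead"] for c in plan)
        if memory <= memory_budget and (best is None or overhead < best[0]):
            best = (overhead, plan)
    return best

# With a budget of 22 memory units for 4 layers, the cheapest feasible plan
# stores two activations and offloads the other two:
overhead, plan = best_schedule(4, 22)
print(overhead, plan)   # 2 ('store', 'store', 'offload', 'offload')
```

The search space grows exponentially with depth, which is precisely why the automatic tools developed in the team rely on dynamic programming and integer linear programming formulations rather than enumeration.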
References:
[1] Zhao, X., Le Hellard, T., Eyraud-Dubois, L., Gusak, J., & Beaumont, O. (2023). Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch. In Proceedings of the 40th International Conference on Machine Learning (ICML).
[2] Gusak, J., Cherniuk, D., Shilova, A., Katrutsa, A., Bershatsky, D., Zhao, X., Eyraud-Dubois, L., Shlyazhko, O., Dimitrov, D., Oseledets, I., & Beaumont, O. (2022, July). Survey on Large Scale Neural Network Training. In Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI-ECAI). International Joint Conferences on Artificial Intelligence Organization.
[3] Beaumont, O., Eyraud-Dubois, L., Shilova, A., & Zhao, X. Weight Offloading Strategies for Training Large DNN Models.
[4] Beaumont, O., Eyraud-Dubois, L., & Shilova, A. (2021). Efficient Combination of Rematerialization and Offloading for Training DNNs. Advances in Neural Information Processing Systems, 34.
[5] Smith, S., Patwary, M., Norick, B., LeGresley, P., Rajbhandari, S., Casper, J., Liu, Z., Prabhumoye, S., Zerveas, G., Korthikanti, V., & Zhang, E. (2022). Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, a Large-Scale Generative Language Model. arXiv preprint.
[6] Li, S., & Hoefler, T. (2021, November). Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '21).
Main activities:
- Implement different techniques for efficient multi-GPU training and inference.
- Propose new approaches for efficient deep learning (based on pipelining, checkpointing, offloading, and other optimisation techniques).
- Develop software to automatically optimise the training and inference of modern deep learning architectures.
- Analyse the performance of models using profiling tools.
- Write scientific papers.
- Collaborate with Topal colleagues in Europe and the US.
Skills:
Technical skills and level required:
- Good knowledge of Machine Learning and Deep Learning
- Basic knowledge of linear algebra and optimisation