Hai-Vy NGUYEN - Experience

Professional Experience

Ph.D. CIFRE – Data Scientist (2022 – present)
Renault Group (CIFRE), IMT Toulouse & IRIT Toulouse

I am pursuing a CIFRE Ph.D. thesis in collaboration with Renault Group, focused on predicting road conditions in real time (e.g., estimating road friction or grip) from onboard visual data. As part of this effort, I have helped collect data through dedicated road image acquisition campaigns and deployed some of the developed code on prototype in-vehicle computers.

My research combines applied mathematics and deep learning, with an emphasis on developing lightweight, robust neural models suitable for real-world automotive applications. Key contributions include:

Compressing large models into efficient ones through knowledge distillation via feature-space transfer
Designing novel loss functions that improve feature discriminability and robustness to noise
Developing tools for uncertainty quantification in neural predictions to support safer decision-making

In parallel, I actively collaborate with my academic supervisors to investigate new theoretical ideas. This joint academic–industrial work has led to publications and preprints.

Apprentice Data Scientist (Oct 2021 – Sep 2022)
CLS Group, Ramonville

Developed tools to visualize trajectory data for non-specialists.
Used an LSTM autoencoder to encode variable-length animal trajectories into fixed-length latent codes, then applied clustering techniques in the latent space to group similar movements.
Designed a novel unsupervised clustering algorithm, DBSCAN-temporel, which incorporates both spatial and temporal dynamics and includes automated hyperparameter selection. The method proved effective on both animal and maritime trajectories.

ML/DL Intern (Jun 2021 – Sep 2021)
CNRS, Observatoire Midi-Pyrénées

Applied deep learning techniques to non-destructively detect subsurface geological structures by segmenting dispersion images that relate phase velocity and frequency.
Built a synthetic dataset by synchronizing two independently developed simulation tools: SPECFEM2D (for generating seismic signals and dispersion images of complex structures) and CPS (for approximating dispersion curves of simple structures). Developed an algorithm to align inputs and labels automatically, removing the time-consuming manual labeling that previous work required.
Trained a convolutional neural network named DCNet — a dual U-Net architecture with a shared encoder and two independent decoders — to robustly segment dispersion curves for geological structure reconstruction.