MOMI2022: Posters – PhD Seminars

A poster session will take place on the first day of the workshop, which gives you the opportunity to present your current research.

Three prizes will be awarded:
– 500€ for the 1st prize,
– 300€ for the 2nd prize,
– 200€ for the 3rd prize.

Please note that you do not need to have finalized results for this poster. You can also present the scientific question you are tackling, the approach you are using, intermediate results, and difficulties you might encounter. The prizes will take into account the ability to communicate to a broad scientific audience that does not necessarily know the details of your field.

Participants

Lucrezia Carboni – University of Grenoble

Title: Human Brain Functional Network Characterization

Functional brain connectivity networks are challenging data to analyze and characterize properly. While networks are a good model to represent the set of connections between brain regions, distinguishing pathological from healthy states based on the graph structure can be arduous. Indeed, even though many network descriptors and graph comparison distances exist, there is no clear evidence as to which metrics are best for discriminating between different brain states. Such metrics appear to be dataset dependent and tend to be used separately. Moreover, while many different approaches have been proposed with good accuracy, their interpretation, a key point in neuroscience, remains difficult. For these reasons, we propose a way to combine different nodal statistics, i.e. any possible functions of the adjacency matrix defined on the node set of a graph. This allows characterizing graphs at both global and nodal levels. First, we define an equivalence relation on the set of nodes of a graph associated with a single nodal statistic. Next, we extend the definition to any collection of nodal statistics and define a measure of orthogonality among nodal statistics. We show our proposal’s usefulness both for determining which nodal statistics are less redundant depending on the underlying structure of the graph and which structural properties are the predominant ones in the graph. Finally, we propose a way to interpret brain regions’ connectivity at the nodal level. We apply our method to functional connectivity networks constructed with different brain atlases and different databases concerning different pathologies. We show promising results that highlight differences at the nodal level related to pathological states.
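To make the idea of a nodal statistic and its associated equivalence relation concrete, here is a minimal sketch. It assumes, purely for illustration, that two nodes are equivalent when the statistic takes the same value on both; the abstract does not give the exact definition, and the function names and toy graph below are hypothetical.

```python
from collections import defaultdict


def degree(adjacency):
    """Nodal statistic: degree of each node, read off the adjacency matrix."""
    return [sum(row) for row in adjacency]


def equivalence_classes(values):
    """Group node indices on which the nodal statistic takes the same value.

    Assumption for this sketch: "equivalent" means "equal statistic value".
    """
    classes = defaultdict(list)
    for node, value in enumerate(values):
        classes[value].append(node)
    return sorted(classes.values())


# Toy undirected graph: the path 0 - 1 - 2 - 3 (hypothetical example data).
ADJACENCY = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

DEGREE_CLASSES = equivalence_classes(degree(ADJACENCY))
print(DEGREE_CLASSES)  # the two end nodes form one class, the two inner nodes another
```

Under the same assumption, a collection of statistics could be handled by passing one tuple of values per node, so that nodes are grouped only when they agree on every statistic.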

Huiyu Li – INRIA

Title: Data Stealing Attack on Medical Images: Is it Safe to Export Networks from Data Lakes?

In privacy-preserving machine learning, it is common that the owner of the learned model does not have any physical access to the data. Instead, only secured remote access to a data lake is granted to the model owner, without any ability to retrieve data from the data lake. Yet, the model owner may want to export the trained model periodically from the remote repository, and the question arises whether this may cause a risk of data leakage. In this paper, we introduce the concept of a data stealing attack during the export of neural networks. It consists of hiding some information in the exported network that allows images initially stored in the data lake to be reconstructed outside it. More precisely, we show that it is possible to train a network that can perform lossy image compression and at the same time solve some utility tasks such as image segmentation. The attack then proceeds by exporting the compression decoder network together with some image codes that lead to the image reconstruction outside the data lake. We explore the feasibility of such attacks on databases of CT and MR images, showing that it is possible to obtain perceptually meaningful reconstructions of the target dataset, and that the stolen dataset can in turn be used to solve a broad range of tasks. Comprehensive experiments and analyses show that data stealing attacks should be considered a threat to sensitive imaging data sources.

Shakeel Ahmad Sheikh – Université de Lorraine, CNRS, INRIA, LORIA

Title: Stuttering Identification using Deep Learning

Stuttering identification (SI) is a speech characterization problem that has been approached via different signal processing and statistical machine learning methods. Speech technology has been drastically revolutionized thanks to advances in deep learning, but SI has received less attention. This work explores different deep learning algorithms to solve the SI problem. First, we introduce StutterNet, a time-delay neural network architecture for SI. Then, we investigate multi-task (MTL) and adversarial (ADV) learning frameworks to learn robust speech representations. To address the limited data problem, we further introduce speech embeddings for SI, where embeddings are extracted from models trained on large datasets and for separate tasks. We have achieved the best SI performance so far using the Wav2Vec2.0 embeddings with the neural network backend.

Yingyu Yang – INRIA

Title: Patch-based Unsupervised Cardiac Motion Tracking using MLPs and Transformers

Cardiac motion tracking plays an important role in cardiac function analysis. Traditional (non-learning) tracking methods, especially registration-based approaches, rely on the iterative optimisation of a similarity metric, which is usually costly in both time and space. In recent years, convolutional neural network (CNN) based image registration methods have shown promising effectiveness. Meanwhile, recent studies show that attention-based models (e.g. the Transformer) can bring superior performance in pattern recognition tasks. However, it remains unclear whether this superior performance comes from the attention-based architecture or from the use of patches to divide the inputs. In this work, we introduce three patch-based frameworks for image registration using MLPs and Transformers. We provide experiments on 2D-echocardiography motion tracking to partially answer this question and to provide a benchmark solution. Our results on a large public 2D-echocardiography dataset and on a 2D synthetic dataset show that patch-based MLP/Transformer models can be effectively used for image registration and cardiac motion tracking. They demonstrate comparable and even better registration performance than a popular CNN registration model on both in vivo and in silico datasets. In particular, the MLP-Mixer based architecture presents the best generalisability on echocardiography motion estimation. Our results share a similar conclusion with recent research that the attention mechanism in the Transformer model may not be the main determinant of success, at least for image registration.

Mulin Yu – INRIA

Title: Finding Good Configurations of Planar Primitives in Unorganized Point Clouds

We present an algorithm for detecting planar primitives from unorganized 3D point clouds. Departing from an initial configuration, the algorithm refines both the continuous plane parameters and the discrete assignment of input points to them by seeking high fidelity, high simplicity and high completeness. Our key contribution relies upon the design of an exploration mechanism guided by a multiobjective energy function. The transitions within the large solution space are handled by five geometric operators that create, remove and modify primitives. We demonstrate the potential of our method on a variety of scenes, from organic shapes to man-made objects, and sensors, from multiview stereo to laser. We show its efficacy with respect to existing primitive fitting approaches and illustrate its applicative interest in compact mesh reconstruction, when combined with a plane assembly method.

Othmane Marfoq – INRIA

Title: Federated multi-task learning under a mixture of distributions

The increasing size of data generated by smartphones and IoT devices motivated the development of Federated Learning (FL), a framework for on-device collaborative training of machine learning models. First efforts in FL focused on learning a single global model with good average performance across clients, but the global model may be arbitrarily bad for a given client, due to the inherent heterogeneity of local data distributions. Federated multi-task learning (MTL) approaches can learn personalized models by formulating an opportune penalized optimization problem. The penalization term can capture complex relations among personalized models, but eschews clear statistical assumptions about local data distributions. In this work, we propose to study federated MTL under the flexible assumption that each local data distribution is a mixture of unknown underlying distributions. This assumption encompasses most of the existing personalized FL approaches and leads to federated EM-like algorithms for both client-server and fully decentralized settings. Moreover, it provides a principled way to serve personalized models to clients not seen at training time. The algorithms’ convergence is analyzed through a novel federated surrogate optimization framework, which can be of general interest. Experimental results on FL benchmarks show that our approach provides models with higher accuracy and fairness than state-of-the-art methods.
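The mixture assumption stated in this abstract can be written compactly as follows. The notation is an assumption made for this sketch and is not taken from the abstract: $\mathcal{D}_t$ is the local data distribution of client $t$, $\widetilde{\mathcal{D}}_1, \dots, \widetilde{\mathcal{D}}_M$ are the unknown underlying distributions, and $\pi_{tm}$ are client-specific mixture weights.

```latex
\mathcal{D}_t \;=\; \sum_{m=1}^{M} \pi_{tm}\, \widetilde{\mathcal{D}}_m ,
\qquad \pi_{tm} \ge 0 ,
\qquad \sum_{m=1}^{M} \pi_{tm} = 1 .
```

In this notation, heterogeneity across clients is carried entirely by the weights $\pi_{tm}$: if every client used the same weights, all local distributions would coincide.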

Marina Costantini – EURECOM

Title: Spread gossip faster and be happy about it. Let’s share our knowledge, not our data!

Decentralized optimization algorithms allow multiple nodes in a network to collaboratively train a machine learning model using the data of all nodes, but keeping the data private. To achieve this, nodes communicate with their neighbors to exchange optimization values (parameters, gradients) instead of the data itself. By interleaving communication steps with computation steps, all nodes can converge to the optimal solution that a centralized algorithm would find if the data of all nodes was gathered at a single location.

In particular, gossip algorithms allow nodes to wake up at any time and contact one single neighbor to complete an iteration together. These algorithms have the attractive property of not needing a synchronization enforcer, and thus, they offer remarkable time and communication savings. Furthermore, they provide an extra degree of freedom to speed up convergence: the choice of the neighbor to contact when a node goes active.

Gossip algorithms were first proposed in the context of decentralized averaging, where all nodes in the network have a single scalar value and the task is to find the average of all the values in the network. In this setting, and when the neighbor choice is randomized, it is well-known how to choose the neighbor contacting probabilities to maximize convergence speed. However, in the context of decentralized optimization this choice is less clear, and recent work has reported that the probabilities that are optimal for decentralized averaging are not optimal anymore for some decentralized optimization settings.

In this poster we will explain the key differences between the tasks of averaging and optimization in the decentralized setting and will give insights on how to design fast algorithms for the latter. We will place special emphasis on two complementary and perhaps competing forces that drive the convergence speed: the network structure (graph-theoretic point of view), and the optimization landscape (mathematical optimization point of view).
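As an illustration of the decentralized averaging setting described above, here is a minimal sketch of randomized pairwise gossip. It assumes that the waking node and the contacted neighbor are both drawn uniformly at random, which is only one possible choice of contacting probabilities; the function name, the ring topology and the initial values are hypothetical.

```python
import random


def gossip_average(values, neighbors, steps, seed=0):
    """Randomized pairwise gossip for decentralized averaging.

    At each step one node wakes up, contacts a single neighbor, and both
    replace their values by the pair's mean. The sum over the network is
    preserved, so every node drifts towards the global average.
    """
    rng = random.Random(seed)
    x = list(values)
    nodes = list(range(len(x)))
    for _ in range(steps):
        i = rng.choice(nodes)          # node that wakes up (uniform, assumed)
        j = rng.choice(neighbors[i])   # contacted neighbor (uniform, assumed)
        mean = (x[i] + x[j]) / 2.0
        x[i] = x[j] = mean
    return x


# Hypothetical ring of four nodes, each holding a single scalar value.
NEIGHBORS = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
INITIAL = [1.0, 5.0, 3.0, 7.0]

RESULT = gossip_average(INITIAL, NEIGHBORS, steps=500)
print(RESULT)  # every entry is close to the true average of INITIAL
```

The neighbor-selection line is where the extra degree of freedom mentioned in the abstract sits: replacing the uniform draw with non-uniform contacting probabilities changes the convergence speed without changing the limit.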

Ziming Liu – INRIA

Title: A General Hybrid Visual Localization Method for Indoor Healthcare Robots

Recently, more and more healthcare and medical service robots have been deployed for tasks such as hospital service, patient care, and delivery. Visual localization is an important part of the perception module of autonomous robots. Recent advances in deep learning have given rise to hybrid visual localization approaches that combine deep networks with traditional pose estimation methods. One limitation of deep learning approaches is the availability of the ground truth data needed to train the neural networks. For example, it is extremely difficult, if not impossible, to obtain a ground truth dense depth map of the environment to be used for stereo visual localization. Even though unsupervised training of networks has been investigated, supervised training remains more reliable and robust. In this paper, we propose a new hybrid dense stereo visual localization approach in which a dense depth map is obtained with a network that is supervised using ground truth poses, which can be obtained more easily than ground truth depth maps. The depth map obtained from the neural network is used to warp the current image into the reference frame, and the optimal pose is obtained by minimizing a cost function that encodes the similarity between the warped image and the reference image. The experimental results show that the proposed approach not only improves on state-of-the-art depth map estimation networks on some of the standard benchmark datasets, but also outperforms the state-of-the-art visual localization methods.

Bernard Tamba Sandouno – INRIA

Title: Spatial signal strength estimation

Estimating signal strength has always been a subject of interest in both academia and industry. For this purpose, different categories of so-called propagation models have been developed. Some of these categories are known to be faster but less accurate, while the deterministic one is known to be highly accurate because it takes the 3D environment of a receiver into account when estimating the signal strength. However, this accuracy comes at the expense of high memory consumption and high computational load, which makes such models unusable in (near) real-time scenarios.
Our goal in this thesis is to accelerate this model, i.e. to obtain an accurate deterministic model with lower computational load and lower memory consumption. To do this, we developed our own RT model from scratch in Python. During the implementation, we developed a new, optimized way to launch rays from an antenna in order to cover the whole radiation pattern without gaps. The acceleration was then achieved using machine learning on the one hand and the generation of a continuous coverage map on the other.

Evangelos-Marios Nikolados – University of Edinburgh

Title: Deep learning models of protein expression

Machine learning has emerged as a promising tool to leverage large-scale data for strain optimization. However, it remains unclear what models can deliver accurate predictions, or what amount and quality of data is required for training. Using a large protein expression screen in Escherichia coli, we show that non-deep models can achieve prediction accuracy >70% with as few as ∼2,000 DNA sequences, while deep learning further improves performance with the same amount of data. These results highlight the interplay between model accuracy and the structure of the genotypic space, suggesting that controlled sequence diversity can lead to gains in data efficiency.

Yu Wang – LPMT

Title: Numerical simulation of tensile fracture for HMWPE yarns using virtual fibers

As yarns control the ultimate failure of fiber-reinforced composites, their mechanical behavior warrants much consideration. The present study focuses on the experimental and simulation analysis of the tensile fracture behavior of twisted yarns. A quasi-fiber scale model of twisted yarns made of HMWPE fibers is established using virtual fibers. The tensile fracture behavior of twisted yarns is simulated using a maximum stress criterion with a random distribution of properties. The simulated tensile fracture loads of six kinds of twisted yarns agree well with the experimental results. The numerical precision and dispersion of the proposed method are analyzed systematically. The influence of the damage factor and the displacement at failure on the simulation results of twisted yarns is also obtained. The above modeling and analysis method provides a helpful tool for understanding the yarn architectures of fiber-reinforced composites.

Marie Guyomard – I3S, CNRS

Title: Adaptative spline-based Logistic Regression with a ReLU Neural Network

This poster proposes a neural network for nonlinear classification tasks. This method is equivalent to considering a logistic regression applied to a MARS model. The neural network thresholds the features and produces a decision rule that approximates a spline. The partitioning of the input space by hyperplanes induced by the hidden layers is controlled and explainable. Experiments on simulated data demonstrate the relevance of the network’s architecture for medical diagnostic support.
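The link between ReLU units, thresholded features and a spline-shaped decision rule can be sketched in one dimension. The example below is not the poster's network: it assumes a single hidden layer whose units compute the hinge max(0, x - t), the basis function used by MARS, and it fixes the knots, weights and bias by hand instead of learning them. All names and values are hypothetical.

```python
import math


def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def spline_logit(x, knots, weights, bias):
    """Piecewise-linear logit: a weighted sum of hinge features relu(x - t).

    Each hidden unit thresholds the input at its knot t, so the logit is a
    linear spline whose slope changes only at the knots.
    """
    return bias + sum(w * relu(x - t) for w, t in zip(weights, knots))


def predict_proba(x, knots, weights, bias):
    """Logistic regression applied to the hinge features."""
    return 1.0 / (1.0 + math.exp(-spline_logit(x, knots, weights, bias)))


# Hand-picked parameters, for illustration only (not learned).
KNOTS = [0.0, 1.0]
WEIGHTS = [2.0, -4.0]
BIAS = -1.0

for x in (-1.0, 0.5, 1.0, 2.0):
    print(x, spline_logit(x, KNOTS, WEIGHTS, BIAS), predict_proba(x, KNOTS, WEIGHTS, BIAS))
```

With these values the logit is flat below the first knot, rises with slope 2 between the knots and falls with slope -2 after the second one, so the predicted class changes at inputs that can be read directly from the knots and weights.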

Alexandre Bonlarron – INRIA, UCA

Title: Constrained text generation to measure reading performance: A new approach based on multivalued decision diagrams

Measuring reading performance is one of the most widely used methods in ophthalmology clinics to judge the effectiveness of treatments, surgical procedures, or rehabilitation techniques. However, reading tests are limited by the small number of standardized texts available. For the MNREAD test, which is one of the reference tests used as an example in this paper, there are only two sets of 19 sentences in French. These sentences are challenging to write because they have to respect rules of different kinds (e.g., related to grammar, length, lexicon, and display). They are also tricky to find: out of a sample of more than three million sentences from children’s literature, only four satisfy the criteria of the MNREAD reading test. To obtain more sentences, we propose an original approach to text generation that considers all the rules at the generation stage. Our approach is based on Multi-valued Decision Diagrams (MDD). First, we represent the corpus by n-grams and the different rules by MDDs, and then we combine them using operators, notably intersections. The results obtained show that this approach is promising, even if some problems remain, such as memory consumption or a posteriori validation of the meaning of sentences. Using 5-grams, we generate more than 4000 sentences that meet the MNREAD criteria and thus easily provide an extension of a 19-sentence set to the MNREAD test.

Angelo Saadeh – Telecom Paris

Title: Privacy-Preserving Vertical Federated Learning

Growing confidentiality concerns make it harder for organisations – like hospitals, banks and governmental institutions – and for departments within an institution to collaborate in order to federately train machine learning models on combined datasets.

We describe a solution for two parties to train a logistic regression on a vertically split dataset such that the privacy of the data used to train the models is protected not only from members of both collaborating organisations but also from third-party users of the models.

In other words, the data will be protected during the training process and after publishing the models’ parameters. Secure multi-party computation (MPC) and epsilon-differential privacy (DP) devise solutions to address the issues of protection from collaborating parties and from users of the models separately.

Can these be combined to form a unified solution? We propose, present, and evaluate a two-party epsilon-differentially private and fully secure logistic regression on a vertically partitioned dataset where the players need to jointly train a model.
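The two building blocks named above can each be illustrated in a few lines. This is a generic sketch, not the protocol evaluated in the poster: it shows additive secret sharing, a standard MPC primitive, for a two-party sum, followed by the Laplace mechanism, a standard way to obtain epsilon-differential privacy for a numeric output. The modulus, the input values, the sensitivity and the value of epsilon are all assumptions chosen for the example.

```python
import random

MODULUS = 2**61 - 1  # public modulus for the shares (assumed choice)


def share(secret, rng):
    """Split an integer into two additive shares modulo MODULUS."""
    first = rng.randrange(MODULUS)
    second = (secret - first) % MODULUS
    return first, second


def reconstruct(first, second):
    """Recombine two additive shares into the shared value."""
    return (first + second) % MODULUS


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


RNG = random.Random(0)

# Two private inputs (hypothetical), each split into one share per party.
a0, a1 = share(12, RNG)
b0, b1 = share(30, RNG)

# Each party adds only the shares it holds; a single share reveals nothing
# about the underlying input because it is uniformly distributed.
sum0 = (a0 + b0) % MODULUS
sum1 = (a1 + b1) % MODULUS
TOTAL = reconstruct(sum0, sum1)

# Laplace mechanism: noise of scale sensitivity / epsilon on the released value.
EPSILON = 1.0      # privacy budget (assumed)
SENSITIVITY = 1.0  # L1 sensitivity of the released quantity (assumed)
NOISY_TOTAL = TOTAL + laplace_noise(SENSITIVITY / EPSILON, RNG)
print(TOTAL, NOISY_TOTAL)
```

Secret sharing addresses what the collaborating parties see during the computation, while the added noise addresses what a user of the published result can infer, which mirrors the two separate protection goals described in the abstract.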

Riccardo Di Dio – INRIA

Title: Influence of lung physical properties on its flow-volume curves using a detailed multi-scale mathematical model of the lung

We develop a mathematical model of the lung that can independently estimate the air flows and pressures in the upper bronchi. It accounts for the multi-scale properties of the lung and for the air-tissue interactions. The model equations are solved using the Discrete Fourier Transform, which allows quasi-instantaneous solving, within the limits of the model hypotheses. With this model, we explore how the air flow–volume curves are affected by airway obstruction or by changes in lung compliance. Our work suggests that a fine analysis of the flow-volume curves might bring information about the inner phenomena occurring in the lung.