Student Projects

Welcome to the LTS4 Student Projects page!

Below you will find a collection of projects that are available for the coming semesters. Even though most of the proposed projects are categorized as Semester or Master projects, they can generally be modified to fit other formats. For further information, contact the responsible person listed in the detailed project description.

In case you want to work on your own ideas that are closely related to our research activities, we are more than happy to discuss them with you.

The project topics are the following:

Image Analysis and Vision

    • Adversarial perturbations for interpretability and robustness
      Semester/Master Project

      Although Deep Neural Networks are considered the state of the art for image classification tasks, they are also highly sensitive to small adversarial perturbations of their input data; small changes to the image can lead to extreme misclassification scenarios [1]. Towards explaining this phenomenon, it is important to understand and interpret what deep networks are eventually forced to “see” when adversarial perturbations are applied, and to exploit this information for building more robust classifiers.
      In this work, we will mainly focus on sparse adversarial perturbations [2] for generating adversarial data, since the perturbed features are few and localized, and thus easier to analyze. We will extract meaningful information from the perturbed features. Finally, based on our observations, we will try different approaches aiming to improve the robustness of the networks.
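
      As a toy illustration of why sparse perturbations are easy to analyze (a sketch only, not the SparseFool algorithm of [2]): for a linear classifier, the decision can be flipped by changing just the single most influential input coordinate.

```python
import numpy as np

def sparse_flip(w, b, x, overshoot=0.1):
    """Toy one-coordinate 'adversarial' perturbation for a linear
    classifier sign(w.x + b): change only the most influential input
    coordinate just enough to cross the decision boundary."""
    score = w @ x + b
    i = int(np.argmax(np.abs(w)))            # most influential coordinate
    delta = -(1 + overshoot) * score / w[i]  # step needed to flip the sign
    x_adv = x.copy()
    x_adv[i] += delta
    return x_adv

w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([2.0, 0.5, 1.0])   # w.x + b = 1.5  ->  class +1
x_adv = sparse_flip(w, b, x)    # only x[1] changes, yet the class flips
```

      Deep networks are of course not linear, but locally linear approximations of their decision boundary motivate attacks of exactly this flavour.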

      References:

      [1] Szegedy et al., “Intriguing properties of neural networks”, ICLR 2014.
      [2] Modas et al., “SparseFool: a few pixels make a big difference”, arXiv preprint 2018.

      Requirements: Good knowledge of Python, and sufficient familiarity with image processing and machine learning. Experience with PyTorch is a plus.

      Contact: apostolos.modas@epfl.ch

    • Robustness of deep neural networks on different spaces
      Semester/Master Project

      Deep Neural Networks have achieved extraordinary results on many image classification tasks, but have been shown to be vulnerable to small alterations of their input, also known as adversarial perturbations [1]. There are several methods that compute adversarial perturbations, each one of which perturbs different features/characteristics (spatial, spectral, chromatic) of the input image.
      In this project, we will study the robustness of deep neural networks to perturbations that lie in specific spatial, spectral, and color spaces. The goal is to identify the spaces where the networks are most vulnerable, and to use them properly to build methods that increase robustness against adversarial attacks.

      References:

      [1] Szegedy et al., “Intriguing properties of neural networks”, ICLR 2014.

      Requirements: Good knowledge of Python, sufficient familiarity with computer vision and deep learning. Experience with PyTorch or other deep learning library is a plus.

      Contact: apostolos.modas@epfl.ch

    • Behavioural analysis of neural networks under adversarial perturbations
      Semester/Master Project

      The vulnerability of deep neural networks to small, carefully crafted noise, known as adversarial perturbations [1], has raised many questions regarding their safety and their overall behaviour when classifying different images. In order to improve these deep nets and increase their robustness, we first need to better understand their overall behaviour in both correct-classification and misclassification situations.

      In this project, we aim to study the internal behaviour of such deep architectures when different types of perturbations are applied to the input layer. By observing how neuron activations diffuse and by visualising the feature changes, we want to understand how these small perturbations of the input affect the different layers of the network, eventually leading to such significant changes in classification.

      [1] Szegedy et al., “Intriguing properties of neural networks”, ICLR 2014.

      Requirements: Good knowledge of Python and MATLAB, and sufficient familiarity with deep learning and deep neural networks. Experience with PyTorch is a plus.

      Contact: apostolos.modas@epfl.ch

    • Omnidirectional vision for drones
      Semester/Master Project

      Omnidirectional imaging is a hot research topic with important applications to aerial robots equipped with multiple cameras. Beyond numerous use cases in cinematography and film-making, omnidirectional vision can enhance drone autonomy by providing 360-degree visual sensing for tasks such as collision avoidance. The present project aims at implementing and comparing efficient stitching algorithms for omnidirectional vision, possibly enhancing existing solutions. Since the algorithm is to run on the drone's onboard computer, potentially for real-time applications such as collision avoidance, it must have low complexity and run at an acceptable speed.

      The first part of the project will involve the selection and implementation of suitable stitching algorithms. Existing algorithms may be enhanced by taking a priori information, such as the camera configuration, into account. The second part of the project will involve testing of the proposed algorithms on a physical drone platform.

      Requirements: Programming skills (shell scripting, Python, C/C++), basics of image processing.

      Contact: giuseppe.cocco@epfl.ch

    • Bandwidth efficient object recognition for drone swarms
      Semester/Master Project

      The project aims at developing a bandwidth-efficient distributed object detection system that can be flown on a drone swarm. The system exploits the different points of view of the drones in the swarm to improve object recognition, while keeping the amount of data transmitted to the ground station as low as possible. In this way, a more efficient use of the limited wireless resources can be achieved. The project will involve the use of both off-the-shelf neural network algorithms and WiFi communication protocols.

      The first part of the project will focus on the setup of the communication network between the communication modules to be mounted on the drones and the ground station. The second part will focus on the setup of the image capture/object recognition system and some basic onboard image processing. The core part of the project will consist in the implementation and optimization of the detection and communication protocol, building upon the modules developed so far. This part will include the evaluation of the system performance in terms of both bandwidth efficiency and detection accuracy.

      Requirements: Good knowledge of WiFi communication protocols (hands-on experience desirable), programming skills (shell scripting, Python, C/C++), familiarity with computer vision.

      Contact: giuseppe.cocco@epfl.ch

    • Deep Learning for Depth Estimation
      Semester/Master Project

      Although Deep Neural Networks (DNNs) are widely known for their remarkable classification capabilities, it has recently been shown that they can effectively be used for depth estimation as well. The depth estimation problem consists in inferring the depth of a scene from two or more images of the same scene captured from different points of view. It is a critical problem in Computer Vision, as it is necessary for targeting higher-level problems such as semantic segmentation and action recognition. Light field cameras can capture hundreds of images of the same scene from slightly different points of view in a single exposure. Ideally, the availability of so many views makes the depth estimation problem easier to address, compared to the most common scenario where only two images are available.

      In this project, the student will investigate the adaptation of existing DNN-based depth estimation algorithms designed for the common setup with just two images (stereo setup) to the light field case (multi-stereo setup), where multiple images of the same scene are available. Although the availability of multiple images makes the depth estimation problem less ill-conditioned and easier to approach, the large amount of data requires the overall computational and memory complexity of the final depth estimation algorithm to be taken carefully into account, as these can become a major bottleneck when depth estimation is required in real-time applications.

      [1] Alexey Dosovitskiy et al., FlowNet: Learning Optical Flow with Convolutional Networks, ICCV 2015.
      [2] Nikolaus Mayer et al., A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation, IEEE CVPR 2016.
      [3] Wenjie Luo et al., Efficient deep learning for stereo matching, IEEE CVPR 2016.

      Requirements: Knowledge of Python and DNN tools such as PyTorch. Basics of image processing.

      Contact: mattia.rossi@epfl.ch

    • Geometry meets robustness
      Semester Project

      Deep networks have shown great performance in classification tasks. However, they have proven vulnerable to small and often imperceptible noise called “adversarial perturbations”. This becomes extremely critical when such networks are deployed in safety-critical applications such as driverless cars. So far, no definitive methods have been found to counter this vulnerability effectively.

      Nevertheless, adversarial perturbations have recently been revealed to be tightly related to the geometry of the decision boundary of deep networks. In this project, we will characterize these perturbations in terms of the geometrical properties of the decision boundary. Based on this understanding, we will eventually design efficient algorithms to detect adversarial perturbations.

      Requirements: Good knowledge of Python and MATLAB, sufficient familiarity with machine learning and image processing, having experience with PyTorch (or similar deep learning frameworks) is a plus.

      Contact: seyed.moosavi@epfl.ch

    • Investigation of Deep Convolutional Neural Networks on the Information Plane
      Master/Semester Project

      With their success and wide application areas, there is a pressing need for a comprehensive understanding of learning with Deep Convolutional Neural Networks (DCNNs). A recent approach [1,2] studies the information paths of Deep Neural Networks (DNNs) in the information plane, where each layer is tracked throughout the learning procedure via the mutual information between its representation and the network input on one axis, and the desired output on the other. Despite bringing a new perspective and revealing details about the inner workings of DNNs, these two papers experiment only with very small-scale fully-connected feedforward classification networks, for which the required distributions are easy to compute. To adapt this framework to real-world problems, one needs estimation methods for the mutual information and the required distributions. The aim of this project is to analyze simple but more realistic DCNNs, such as LeNet-5 [3], in the information plane, by finding and applying proper and efficient estimation techniques. A possible extension would be to examine how skip connections [4], or routing between capsules (groups of neurons proposed in [5] to achieve equivariance), contribute to the learning process.
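
      A minimal sketch of the kind of estimator the project would build on (illustrative only; real layer activations are high-dimensional and need far more careful estimators): a histogram (binning) estimate of mutual information, the quantity plotted on both axes of the information plane.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram (binning) estimate of I(X;Y) in bits."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                        # joint distribution
    px = pxy.sum(axis=1, keepdims=True)     # marginal of X
    py = pxy.sum(axis=0, keepdims=True)     # marginal of Y
    nz = pxy > 0                            # avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px * py)[nz])).sum())

# A binary variable carries exactly one bit of information about itself:
x = np.array([0.0, 0.0, 1.0, 1.0])
print(mutual_information(x, x, bins=2))  # -> 1.0
```

      The choice of binning strongly biases such estimates, which is precisely why finding proper and efficient estimation techniques is a core part of the project.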

      [1] Tishby, N., & Zaslavsky, N. (2015, April). Deep learning and the information bottleneck principle. In Information Theory Workshop (ITW), 2015 IEEE (pp. 1-5). IEEE.
      [2] Shwartz-Ziv, R., & Tishby, N. (2017). Opening the Black Box of Deep Neural Networks via Information. arXiv preprint arXiv:1703.00810.
      [3] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
      [4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
      [5] Sabour, S., Frosst, N., & Hinton, G. E. (2017). Dynamic Routing Between Capsules. In Advances in Neural Information Processing Systems (pp. 3857-3867).

      Requirements: Sufficient familiarity with machine learning and probability. Experience with one of the deep learning libraries and good knowledge of the corresponding programming language (preferably Python).

      Contact: beril.besbinar@epfl.ch

    • Interpretable machine learning in personalised medicine
      Master/Semester Project

      Modern machine learning models mostly act as black boxes, and their decisions cannot be easily inspected by humans. To trust automated decision-making, we need to understand the reasons behind predictions and gain insights into the models. This can be achieved by building models that are interpretable. Recently, different methods have been proposed for data classification, such as augmenting the training set with useful features [1], visualizing intermediate features in order to understand the input stimuli that excite individual feature maps at any layer of the model [2-3], or introducing logical rules in the network that guide the classification decision [4], [5]. The aim of this project is to study existing algorithms that attempt to interpret deep architectures by studying the structure of their inner-layer representations, and, based on these methods, to find patterns for classification decisions along with coherent explanations. The studied algorithms will mostly be considered in the context of personalised medicine applications.

      [1] R. Collobert, J. Weston, L. Bottou, M. M. Karlen, K. Kavukcuoglu, and P. Kuksa, “Natural language processing (almost) from scratch,”J. Mach. Learn. Res., vol. 12, pp. 2493–2537, Nov. 2011.
      [2] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps,” arXiv:1312.6034, 2013.
      [3] L. M. Zintgraf, T. S. Cohen, T. Adel, and M. Welling, “Visualizing deep neural network decisions: Prediction difference analysis,” arXiv:1702.04595, 2017.
      [4] Z. Hu, X. Ma, Z. Liu, E. Hovy, and E. Xing, “Harnessing deep neural networks with logic rules,” in ACL, 2016.
      [5] Z. Hu, Z. Yang, R. Salakhutdinov, and E. Xing, “Deep neural networks with massive learned knowledge,” in Conf. on Empirical Methods in Natural Language Processing, EMNLP, 2016.

      Requirements: Familiarity with machine learning and deep learning architectures. Experience with one of the deep learning libraries and good knowledge of the corresponding programming language (preferably Python) is a plus.

      Contact: mireille.elgheche@epfl.ch

    • Depth estimation via deep learning: the light field case
      Master Project

      The depth estimation problem is concerned with the inference of the depth of a scene from multiple pictures of it, all captured from different points of view. Depth estimation represents a critical problem in Computer Vision, as it is necessary for targeting higher-level problems such as 3D reconstruction, semantic segmentation, and action recognition. Typically, since depth estimation requires multiple pictures of the same scene, the scene has to remain still while the user moves around and captures the necessary pictures: this represents a significant limitation. On the other hand, light field cameras [1] can capture hundreds of images of the same scene from slightly different points of view in a single exposure; therefore, depth estimation from light field camera data has attracted large interest in the last decade [2][3].
      Due to the recent success of Deep Neural Networks in Computer Vision tasks, the student will address the light field depth estimation problem within the framework of Deep Learning. Although some preliminary work in this direction already exists [3], this research track is still in its infancy. The student will start by studying recent deep learning-based depth estimation algorithms for the standard stereo setup [4] (two pictures available) and the multi-view stereo setup [5] (three or more pictures available). Then, the student will consider their extension to the particular scenario of light field cameras. On the one hand, light field cameras provide a much higher number of pictures than those considered in the traditional stereo and multi-view stereo setups; therefore, the computational and memory complexity will have to be taken carefully into account. On the other hand, light field pictures exhibit a very regular structure [2] that can be largely exploited in the depth estimation task. The depth estimation results will be evaluated on a state-of-the-art light field benchmark [6].

      References:

      [1] Raytrix website (https://raytrix.de/)
      [2] S. Wanner and B. Goldluecke, Globally consistent depth labeling of 4D light fields, IEEE CVPR, pp. 41-48, 2012
      [3] S. Changa et al., EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth from Light Field Images, IEEE CVPR, 2018
      [4] W. Luo et al., Efficient deep learning for stereo matching, IEEE CVPR, pp.5695-5703, 2016
      [5] H. Po-Han et al., DeepMVS: Learning Multi-View Stereopsis, IEEE CVPR, 2018
      [6] Light field depth estimation benchmark (http://hci-lightfield.iwr.uni-heidelberg.de/)

      Requirements: Knowledge of MATLAB and/or Python. Basic knowledge of optimization and machine learning. Knowledge of DCNN libraries (e.g., Caffe) and image processing is a plus.

      Contact: mattia.rossi@epfl.ch

    • Comparative study of CNNs and human visual system under the effect of clutter
      Master Project

      Convolutional Neural Networks (CNNs) are feedforward, hierarchical architectures that achieve extremely accurate classification of natural images. However, there are differences between the visual aspects captured by CNNs and by the human visual system. This project aims at a comparative study between the behaviour of CNNs and of humans in the task of classifying images under the effect of crowding (clutter). According to [1], although the performance of the human visual system decreases in the presence of crowding, it can be progressively regained in the presence of more crowding with specific characteristics (color, size, orientation). The main goal of this project will be to investigate whether CNNs exhibit similar behaviour. The student will learn how to:

      i) create a dataset to train the CNN

      ii) train the CNN so that it classifies accurately the images without the presence of clutter

      iii) design experiments (similar to those in [1]) to test the performance of the trained CNN in the presence of clutter

      iv) provide comments on the results

      References:

      [1] Michael H. Herzog, Mauro Manassi, Uncorking the bottleneck of crowding: a fresh look at object recognition, Current Opinion in Behavioral Sciences, 1, p86-93, 2015.

      [2] Yann LeCun, Yoshua Bengio and Geoffrey Hinton, Deep Learning, Nature 521, no. 7553 (2015): 436-444.

      [3] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012.

      Requirements: Good knowledge of Matlab or Python, sufficient familiarity with machine learning and image processing. Having experience with deep learning is a plus.

      The project will be co-supervised with the Laboratory of Psychophysics from the School of Life Sciences.

      Contact: effrosyni.simou@epfl.ch

    • Transformation invariant deep learning systems
      Semester Project

      Deep learning systems have achieved remarkable results in the last decade. Such networks are able to learn meaningful features for machine learning tasks directly from raw data. Thanks to the max-pooling operator, they can even learn data representations that are stable under some transformations. However, this operator is not sufficient to handle the ambiguity present in natural data.

      In this project we propose to study existing algorithms that build transformation-invariant features with deep networks, and, based on these methods, to build a rotation- and translation-invariant system for an image classification task.

      Requirements: Programming skills, signal processing. Knowledge in deep learning is a plus.

      Contact: renata.khasanova@epfl.ch

    • Omnidirectional vision
      Semester/Master Project

      Omnidirectional cameras have a 360-degree field of view. Therefore, they are powerful tools for object detection and classification tasks [1, 2]. For example, a single omnidirectional camera can be used to capture traffic data based on information about vehicles on the roads, or it can be mounted on a drone and used for object detection and collision avoidance. Despite the broad variety of applications where omnidirectional vision can be used, to the best of our knowledge, only a few methods perform object classification directly on omnidirectional images, without a transformation to classical planar images that increases computational complexity and may lose information.

      We propose to use deep architectures [3] to tackle the classification problem for omnidirectional images. In this project, the student will develop an algorithm for object classification that takes the omnidirectional camera's lens geometry into account.

      References:

      [1] H.C. Karaimer, Y. Bastanlar. Detection and classification of vehicles from omnidirectional videos using temporal average of silhouettes. In Proceedings of the Int. Conference on Computer Vision Theory and Applications (2015), pp. 197-204.

      [2] L.F. Posada, K.K Narayanan, F. Hoffmann, T. Bertram. Semantic classification of scenes and places with omnidirectional vision. In European Conference on Mobile Robots (ECMR), IEEE (2013), pp. 113-118.

      [3] Krizhevsky, A., Sutskever, I., and Hinton, G. E. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (2012), pp. 1097–1105.

      Requirements: Basic knowledge in computer vision, neural networks, signal processing, programming skills.

      Contact: renata.khasanova@epfl.ch

    • Light field depth estimation meets deep learning in the epipolar image domain
      Master Project

      A light field camera [1] looks exactly like your favorite point-and-shoot camera; however, its smart design permits capturing multiple pictures at the same time. These pictures capture the scene from slightly different points of view, thus recording its 3D structure. With some computation, it is possible to exploit these pictures for multiple applications: re-focusing the captured pictures at an arbitrary depth, measuring the depth of the objects in the scene, or even building a 3D model of the scene.

      The first step toward these applications is depth estimation. Depth estimation [2] is a fundamental problem in Computer Vision and consists in assigning a depth value to each object (pixel) in a captured picture. Typically, to solve this problem, multiple pictures of the same scene are necessary; therefore, the scene or subject has to remain still while the user moves around and captures the pictures. A light field camera, on the other hand, captures all the required images in a single shot. In addition, the set of images captured by a light field camera, referred to as the light field, exhibits a very particular structure: the depth associated to each pixel in the light field is in a one-to-one relation with the slope of a line in the Epipolar Image Domain representation of the light field [3]. Therefore, in the context of light fields, the depth estimation problem reduces to a much simpler slope detection problem.
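
      The slope-depth relation can be sketched in a few lines (purely illustrative; the focal length and baseline values below are made up): a scene point traces a line across the views of one light field row, the slope of that line is the disparity, and depth follows from the usual triangulation formula.

```python
import numpy as np

def epi_slope(view_index, pixel_position):
    """Least-squares fit of the slope of the line a scene point traces
    in an epipolar image; the slope is the disparity in pixels/view."""
    slope, _ = np.polyfit(view_index, pixel_position, 1)
    return float(slope)

s = np.arange(9.0)        # 9 views along one row of the light field
u = 12.0 + 0.75 * s       # the point moves by 0.75 px from view to view
d = epi_slope(s, u)       # recovered disparity: 0.75
f, B = 50.0, 0.01         # hypothetical focal length (px) and baseline (m)
Z = f * B / d             # triangulated depth of the point
```

      In the project, a deep network would replace this least-squares fit, detecting the line slopes directly from the (noisy, textured) epipolar images.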

      In this project, the student will first study the special structure of light field data. Then, the student will take advantage of the effectiveness of Deep Neural Networks in detecting patterns in data to determine the line slopes, and therefore the light field depth map. The student will design a network architecture suitable for the considered task and will have to take into account the high dimensionality of the light field data, which may not permit processing the full light field at once. The depth estimation results will be evaluated on a state-of-the-art light field benchmark [4].

      References:

      [1] Raytrix website (https://raytrix.de/)
      [2] M. Bleyer and C. Breiteneder, Stereo Matching: State-of-the-Art and Research Challenges, Advanced Topics in Computer Vision, Springer, pp. 143-179, 2013
      [3] S. Wanner and B. Goldluecke, Globally consistent depth labeling of 4D light fields, IEEE CVPR, pp. 41-48, 2012
      [4] Light field depth estimation benchmark (http://hci-lightfield.iwr.uni-heidelberg.de/)

      Requirements: Fundamentals of image processing, basic knowledge of deep learning and machine learning, Python, PyTorch or TensorFlow.

      Contact: mattia.rossi@epfl.ch

    • Conditional Wasserstein GANs for Video Prediction
      Master Thesis Project

      The capability of predicting the future requires an in-depth understanding of the physical and causal rules that govern the world. Despite appealing applications such as robotic planning and representation learning, predicting raw observations, such as video frames, is inherently challenging due to the high dimensionality of the data, as well as the difficulty of modelling the uncertainties. The first attempts to solve the prediction problem lacked the ability to generate different plausible frames due to the deterministic nature of the proposed schemes. Recently, two different approaches have dominated the literature: i) modelling latent variables as probability distributions to capture the stochasticity (VAE-like approaches); ii) using adversarial training to improve the quality of the generated frames (GAN-like approaches). Many works from both tracks feed the generator a stochastic variable sampled from a learned distribution, independently throughout time.
      We would like to model stochasticity that is conditioned not only on the input data, e.g. the available frames, but also on the previously sampled variables, and hence on the generated frames. Such a framework requires the derivation of a loss function where the inference of the latent distribution might have some Markovian characteristics. The proposed framework will be analysed in a video frame prediction application.

      References:

      [1] Wasserstein Autoencoders: https://arxiv.org/pdf/1711.01558.pdf
      [2] Stochastic Adversarial Video Prediction: https://arxiv.org/pdf/1804.01523.pdf
      [3] Sequential Neural Models with Stochastic Layers: https://arxiv.org/pdf/1605.07571.pdf

      Requirements: Fundamentals of linear algebra, fundamentals of image processing, basic knowledge of deep learning and machine learning, Python, PyTorch or TensorFlow.

      Contact: beril.besbinar@epfl.ch

    • Omnidirectional stereo: Patch Match meets Deep Learning
      Semester/Master Project

      Omnidirectional cameras capture videos with a 360-degree field of view. Thanks to a Head Mounted Display, the user can be placed in the middle of the scene and experience a much deeper immersion than with a traditional video. At this point, although the user can look around while the video plays, their point of view is bound to that of the camera: the user cannot yet take a step in an arbitrary direction in the scene. Interestingly, coupling two omnidirectional cameras permits estimating the geometry of the scene, thus paving the way to 3D reconstruction and unveiling the possibility for the user to navigate the scene freely. Preliminary work in this direction has already been carried out in [1][2].
      The problem of geometry estimation from pairs of traditional perspective cameras is referred to as stereo matching [3] and is a long-studied problem, for which fast and effective methods exist. Omnidirectional stereo, instead, is a very recent research track. In this project, the student will first become familiar with PatchMatch Stereo [4][5], a fast, effective, yet simple stereo matching algorithm, and then extend it to the case of omnidirectional camera pairs. Moreover, while PatchMatch uses a Euclidean metric to match the views captured by the two cameras, and is therefore sensitive to lighting changes and occlusions, the student will consider replacing it with a metric computed by a Deep Neural Network [6], in an attempt to obtain a more robust algorithm.
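
      A sketch of the matching cost at the heart of this comparison (illustrative only, on synthetic data): PatchMatch-style stereo minimises a Euclidean, sum-of-squared-differences cost between patches, and a learned network metric would simply replace the ssd_cost function below. The exhaustive search stands in for PatchMatch's random sampling and spatial propagation, which minimise the same kind of cost far faster.

```python
import numpy as np

def ssd_cost(a, b):
    """Euclidean (sum of squared differences) patch matching cost;
    a learned CNN metric [6] would replace this function."""
    return float(((a - b) ** 2).sum())

def best_disparity(left, right, x, y, half=2, max_d=5):
    """Exhaustively test candidate disparities at pixel (x, y) and keep
    the one whose patch comparison has the lowest cost."""
    ref = left[y - half:y + half + 1, x - half:x + half + 1]
    costs = [ssd_cost(ref, right[y - half:y + half + 1,
                                 x - d - half:x - d + half + 1])
             for d in range(max_d + 1)]
    return int(np.argmin(costs))

rng = np.random.default_rng(0)
left = rng.random((20, 30))
right = np.zeros_like(left)
right[:, :-3] = left[:, 3:]          # synthetic view shifted by 3 pixels
print(best_disparity(left, right, x=10, y=5))  # -> 3
```

      In the omnidirectional case, the rectangular patch window above would have to follow the distorted epipolar geometry of the spherical images.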

      References:

      [1] C. Schroers et al., An Omnistereoscopic Video Pipeline for Capture and Display of Real-World VR, ACM Transactions on Graphics, Special Issue on Production Rendering, vol. 37, no. 3, August 2018
      [2] H. Jingwei et al., 6-DOF VR videos with a single 360-camera, IEEE Virtual Reality, 2017
      [3] M. Bleyer and C. Breiteneder, Stereo Matching: State-of-the-Art and Research Challenges, Advanced Topics in Computer Vision, Springer, pp. 143-179, 2013
      [4] M. Bleyer et al., PatchMatch Stereo – Stereo Matching with Slanted Support Windows, BMVC, pp. 14.1-14.11, 2011
      [5] E. Zheng et al., PatchMatch Based Joint View Selection and Depthmap Estimation, IEEE CVPR, pp. 1510-1517, 2014
      [6] J. Zbontar and Y. LeCun, Stereo matching by training a convolutional neural network to compare image patches, Journal of Machine Learning Research, vol. 17, pp. 1-32, 2016

      Requirements: Fundamentals of linear algebra, fundamentals of image processing, basic knowledge of deep learning and machine learning, Python, PyTorch or TensorFlow.

      Contact: mattia.rossi@epfl.ch

Image/Video Coding and Communication

    • Video compression for moving cameras
      Semester/Master Project

      Video streaming from mobile platforms such as drones is a widespread application that has gained even more popularity in recent years. Efficient video coding standards are used nowadays to compress videos onboard the platform. However, despite impressive progress from both the practical and the theoretical perspective, a full theoretical model of the rate-distortion function (that is, the relationship between the compressed video data rate and quality) for natural videos taken by moving cameras is still missing. This project aims at gathering and processing data in a fully controlled environment in order to move one step closer towards developing such a model.

      The project is divided into three parts.
      • In the first part, raw images will be taken under different conditions in a controlled environment using a camera.
      • In the second part, the data will be post-processed by developing a predictive video coder (e.g. in Matlab).
      • In the third part, fitting to an existing model will be carried out.
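
      A minimal sketch of the transform-and-quantise core of such a coder (illustrative only, not a full predictive coder; shown in Python although the project suggests Matlab): an orthonormal 8x8 DCT followed by uniform quantisation, where the quantisation step trades rate for distortion.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis, the transform at the core of most
    video coding standards."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2)
    return C * np.sqrt(2 / n)

def code_block(block, step):
    """Transform, uniformly quantise, and reconstruct one block; a
    coarser step means fewer bits but more distortion."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T               # 2-D DCT
    quantised = np.round(coeffs / step)    # uniform quantisation
    return C.T @ (quantised * step) @ C    # dequantise and invert

block = np.arange(64, dtype=float).reshape(8, 8)
err = lambda step: np.linalg.norm(block - code_block(block, step))
# distortion grows with the quantisation step: err(20.0) > err(1.0)
```

      Sweeping the quantisation step and recording (rate, distortion) pairs is exactly the kind of measurement the model fitting in the third part would consume.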

      Requirements: Knowledge of Matlab, basic knowledge of image processing (e.g., discrete cosine transform, quantization) or signal processing, precision and motivation.

      Contact: giuseppe.cocco@epfl.ch

    • Peer-assisted adaptive streaming of omnidirectional videos
      Semester/Master Project

      The current state of the art in multimedia streaming over the Internet is mainly based on HTTP Adaptive Streaming (HAS) techniques [1], which provide a standard delivery framework that allows interoperability between different devices and servers while optimizing bandwidth consumption. The key concept behind adaptive streaming is that the same video content is encoded at different resolutions/encoding rates and stored on streaming servers, and each user requests over time the desired version of the content according to their download capacity. The emergence of new media formats supporting improved user experiences, such as omnidirectional and multiview videos, however, requires the transmission of a huge amount of data and imposes many new challenges on the current HAS infrastructure. One possibility for improving the state of the art when transmitting these new media formats is the use of peer-assisted delivery techniques [2].

      This project aims at exploring the use of peer-assisted adaptive streaming in the scope of omnidirectional video. The main goal is to study how current HAS formats and adaptation logics can be adapted to support an optimized omnidirectional video delivery chain and, consequently, improve the experience of users consuming such content.
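      At its core, a HAS adaptation logic maps an estimated download capacity to one representation in the bitrate ladder. A minimal throughput-based sketch (the ladder values and the safety margin are made-up assumptions):

```python
def select_representation(bitrates, throughput, safety=0.8):
    """Pick the highest encoding rate that fits within a safety margin
    of the measured download throughput (throughput-based HAS logic)."""
    feasible = [b for b in sorted(bitrates) if b <= safety * throughput]
    return feasible[-1] if feasible else min(bitrates)

ladder = [400, 1000, 2500, 5000, 8000]   # kbit/s, hypothetical ladder
print(select_representation(ladder, throughput=3200))  # -> 2500
```

      For omnidirectional video, the interesting extension is that the "ladder" can also vary spatially (e.g., per tile, depending on the viewport), which is where peer-assisted delivery and adaptation interact.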

      References:

      [1] Stockhammer, Thomas. “Dynamic adaptive streaming over HTTP–: standards and design principles.” Proceedings of the second annual ACM conference on Multimedia systems. ACM, 2011.

      [2] Streamroot white paper: Peer-Assisted Adaptive Streaming; http://files.streamroot.io/public/whitepapers/Streamroot-Whitepaper-Peer-Assisted-Adaptive-Streaming.pdf

      [3] YouTube 360° https://www.youtube.com/channel/UCzuqhhs6NWbgTzMuM09WKDQ

      Requirements: Basic knowledge of multimedia networking, basic programming skills.

      Contact: roberto.azevedo@epfl.ch

    • Visual quality of omnidirectional images and videos
      Semester/Master Project

      Many algorithms (i.e., objective quality metrics) have been proposed in the literature to quantify the visual quality of a digital image or video sequence as perceived by the end user. These metrics are extremely useful for optimising the different steps of the digital processing chain, such as acquisition, compression, transmission, and rendering, so as to maximise the perceptual quality of the signal presented to the multimedia user [1].

      Nowadays, cameras which allow capturing omnidirectional (i.e., 360°) digital images and video sequences have started to appear as commercial products [2][3]. The spread of applications involving omnidirectional content in the near future will require the optimisation of its processing chain. Thus, algorithms able to quantify the visual quality of omnidirectional images and videos will soon be needed. While many quality metrics exist for classical, non-omnidirectional images and videos, the quality assessment of this new kind of visual media is an open research challenge. The goal of this project is to study the quality perception of omnidirectional content and to design algorithms to assess its quality.
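      As a starting point, here is a sketch of one simple candidate metric: a PSNR variant with latitude weighting, which compensates the oversampling of the poles in equirectangular images. It follows the idea behind spherically weighted PSNR; the implementation details are illustrative.

```python
import numpy as np

def ws_psnr(ref, dist, peak=255.0):
    """Spherically weighted PSNR for an equirectangular image: each row
    is weighted by cos(latitude) to undo the oversampling near the poles."""
    h, w = ref.shape
    lat = (np.arange(h) + 0.5) / h * np.pi - np.pi / 2   # latitude per row
    wgt = np.cos(lat)[:, None] * np.ones((1, w))
    wmse = np.sum(wgt * (ref - dist) ** 2) / np.sum(wgt)
    return 10 * np.log10(peak ** 2 / wmse)
```

      Designing metrics that correlate with actual human ratings of omnidirectional content (including viewport dynamics) is precisely the open part of the project.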

      References:

      [1] Wang and Bovik, “Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures”, IEEE Signal Processing Magazine, 2009

      [2] https://theta360.com/

      [3] http://gopro.com/odyssey

      Requirements: Matlab programming skills, image processing

      Contact: roberto.azevedo@epfl.ch

    • Omnidirectional image and video processing
      Semester/Master Project

      Nowadays, cameras which allow capturing omnidirectional (i.e. 360°) images and video sequences have started to appear as commercial products [1][2]. An omnidirectional image can be seen as a 360° viewing sphere, since the real-world environment surrounding the camera is captured in all directions.

      In order to be processed with widely-spread algorithms designed for standard rectangular planar images, the viewing sphere is often mapped to a plane, resulting in a so-called panoramic image [3]. The alternative to this approach would be to process the signal directly in its original spherical domain. The goal of this project will be the study of a framework to adapt classical digital image and video processing techniques to the omnidirectional scenario, where signals are defined on the surface of a sphere.
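      To fix ideas, here is a minimal sketch of the sphere/plane correspondence for the common equirectangular mapping; the conventions for the angle ranges are an illustrative choice.

```python
import numpy as np

def pixel_to_sphere(u, v, width, height):
    """Map equirectangular pixel coordinates to spherical angles:
    longitude in [-pi, pi), latitude in [-pi/2, pi/2]."""
    lon = (u + 0.5) / width * 2 * np.pi - np.pi
    lat = np.pi / 2 - (v + 0.5) / height * np.pi
    return lon, lat

def sphere_to_unit_vector(lon, lat):
    """Spherical angles to a 3D point on the unit viewing sphere."""
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])
```

      The distortion of this mapping (pixels near the poles cover far less solid angle than pixels at the equator) is exactly what motivates processing the signal directly on the sphere.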

      References:

      [1] https://theta360.com/

      [2] http://gopro.com/odyssey

      [3] David Salomon, “Transformation and projection in Computer Graphics”

      Requirements: Matlab programming skills, signal processing, basics of trigonometry.

      Contact: roberto.azevedo@epfl.ch

    • Plenoptic sampling
      Master Project

      With the recent advances in 3D scene representation with camera networks, like in 3DTV or Free-viewpoint TV, the problem of camera positioning is becoming very important. Optimal camera placement increases the 3D reconstruction precision and augments the compression performance of multi-view images. The problem of camera placement can be formulated as the plenoptic sampling problem, where the best sampling of the viewpoints is sought. The plenoptic function captures the luminance and chrominance properties of a light ray in any direction, at any time instant and from any viewing point.

      This project aims at proposing a novel representation and parameterization of the plenoptic function that can efficiently capture the underlying geometry in 3D scenes. As a second part of this project, the camera positioning problem will be formulated as a sampling problem in the transform domain defined by the novel geometric representation of the plenoptic function.

      Requirements: signal processing, Matlab (C/C++)

      Contact: pascal.frossard@epfl.ch

    • Study and implementation of a channel coding scheme for delay-constrained applications
      Semester/Master Project

      Multimedia streaming over wireless channels will play a fundamental role in the next generation of mobile communication networks. Real-time image and video streaming are particularly challenging due to the strict constraints in terms of delay. Such constraints are particularly strict when the video is streamed from a drone and used by the pilot as a reference to control the machine. The goal of the project is to develop a channel coding scheme that can cope with limited channel state information at the transmitter while providing high reliability and meeting the strict delay constraints of real-time multimedia streaming. The implementation will be based on LDPC codes and multiuser detection principles.
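      As a toy illustration of the machinery involved, here is the syndrome check on which iterative decoding of linear codes (including LDPC codes) is built. The matrix below is a small Hamming(7,4) parity-check matrix, not a real LDPC code.

```python
import numpy as np

# Parity-check matrix of the Hamming(7,4) code in systematic form.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def syndrome(word):
    """A word is a codeword iff H @ word = 0 (mod 2); a nonzero
    syndrome flags transmission errors and drives iterative decoding."""
    return H @ word % 2
```

      In an actual LDPC scheme, H is large and sparse, and decoding iteratively exchanges messages between the bits and the parity checks until all syndromes are satisfied or a (delay-imposed) iteration budget runs out.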

      Requirements: knowledge of channel coding principles, good programming skills (Matlab and C/C++).

      Contact: giuseppe.cocco@epfl.ch

Signal Processing on Graphs

    • Learn to encode graphs using a “dummy task”
      Master Semester/Thesis Project

      From molecules or proteins to social networks, a lot of data is structured as a graph. Graphs are not trivial to process, since most machine learning algorithms only accept tensors as input. A challenge in this domain is thus to learn how to represent or encode a graph as a vector [1]. The state-of-the-art method [2] consists in maximising the mutual information between the encoded vector and local statistics of the graph, but little is known about the quality of the resulting encoding.

      The problem of encoding and decoding data is not restricted to graphs: it is also at the core of Generative Adversarial Networks (GANs). A recent idea that worked well for GANs is to force the discriminator to learn global features rather than focusing only on local statistics. A simple trick to achieve this is to randomly rotate the input images and ask the discriminator to predict by how much each image has been rotated [3].

      This project consists in building a similar method for graphs: finding a “dummy task” that requires computing global features in order to create better encodings. There are many potential applications, depending on the student’s interests: predicting chemical properties, classifying documents by topic, traffic forecasting…
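      For concreteness, here is a minimal numpy sketch of what “encoding a graph into a vector” can look like: one mean-aggregation message-passing step followed by a mean readout. W stands for a weight matrix that would be learned; the architecture is purely illustrative.

```python
import numpy as np

def encode_graph(A, X, W):
    """One message-passing step followed by a mean readout: each node
    averages its neighbours' features, a shared linear map W is applied
    with a ReLU, and node embeddings are pooled into one graph vector."""
    deg = A.sum(axis=1, keepdims=True)
    H = np.maximum((A @ X) / np.maximum(deg, 1) @ W, 0)  # node embeddings
    return H.mean(axis=0)                                # graph-level vector
```

      Because of the mean readout, the embedding is invariant to node permutations; any useful “dummy task” must respect this invariance while still requiring global structure to be solved.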

      References:

      [1]: W. L. Hamilton et al., Representation learning on graphs: Methods and Applications
      [2]: P. Velickovic et al., Deep Graph Infomax
      [3]: S. Gidaris et al., Unsupervised representation learning by predicting image rotations

      Requirements: Good knowledge of Python, at least one machine learning course.

      Contact: clement.vignac@epfl.ch

    • Learn directions from signals on graphs to improve traffic forecasting
      Master Semester/Thesis Project

      A natural way to model traffic is to consider the density and velocity of cars as a signal on a transportation network [1]. Traffic forecasting is a challenging problem because spatio-temporal interactions must be accounted for in a nonlinear way. This problem is currently solved using graph neural networks [2]. However, these networks have some limitations: for example, when applied to a grid, they can only learn spherical filters [3], which form a very small class of functions.

      Because we know that the transportation network lies on a 2D manifold, we can try to improve graph neural networks by adding a notion of direction on the graph. One way to do so is to use the graph signal itself.
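      The filters used in networks such as [2] are polynomials of the graph Laplacian; here is a minimal (non-Chebyshev) sketch of such a filter applied to a graph signal, as a baseline for what "adding direction" would need to go beyond.

```python
import numpy as np

def polynomial_graph_filter(L, x, theta):
    """Apply the spectral filter sum_k theta[k] * L^k to a graph signal x.
    The localisation radius of the filter equals the polynomial order."""
    out = np.zeros_like(x)
    p = x.copy()
    for t in theta:
        out += t * p
        p = L @ p
    return out
```

      An order-K polynomial only mixes values within K hops and treats all directions identically, which is why, on a grid, such isotropic filters cannot distinguish, say, north-bound from east-bound traffic.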

      References:

      [1]: Y. Li et al., Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
      [2]: Defferrard, Michaël, Xavier Bresson, and Pierre Vandergheynst. “Convolutional neural networks on graphs with fast localized spectral filtering.” Advances in Neural Information Processing Systems. 2016.
      [3]: R. Levie et al., CayleyNets: Graph Convolutional Neural Networks With Complex Rational Spectral Filters
      [4]: Velickovic, Petar, et al. “Graph attention networks.” arXiv preprint arXiv:1710.10903 1.2 (2017).

      Requirements: Good knowledge of Python, at least one machine learning course.

      Contact: clement.vignac@epfl.ch

    • Informed Source Separation for Multi-Modal Graph Signals
      Master Semester Project

      We consider that the processes developing at each layer of a multi-graph data representation are distinct and separable, yet we can only perceive an unknown combination of them through the observations. Under this assumption, a graph signal can be decomposed into distinct components living on different structures, so that we may describe the signal components structured by each graph layer separately and write the observations as a linear combination of them.

      To recover the sources of the observations, we can define a problem in the multi-graph setting. Such a problem ultimately aims to estimate the spectral models acting on each graph layer and to reveal the sources of the observations. Specifically, we will consider applications to heat source separation problems, which can be represented by spreading processes on networks.
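      A toy version of the observation model, assuming each source spreads by heat diffusion on its own layer; the graph structures and diffusion times below are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def laplacian(A):
    """Combinatorial graph Laplacian L = D - A."""
    return np.diag(A.sum(1)) - A

def observe(A1, A2, s1, s2, t1=1.0, t2=1.0):
    """Observation model: each source diffuses on its own layer via the
    heat kernel exp(-t L), and only the sum of the two is observed."""
    return expm(-t1 * laplacian(A1)) @ s1 + expm(-t2 * laplacian(A2)) @ s2
```

      Each heat kernel conserves the total mass of its source, so the observation mixes the layers without revealing which layer carried which part; undoing that mixing is the separation problem.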

      References:

      [1] Pena, R., Bresson, X., & Vandergheynst, P. (2016, July). Source localization on graphs via ℓ1 recovery and spectral graph theory. In 2016 IEEE 12th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP) (pp. 1-5). IEEE.
      [2] Cencetti, Giulia, and Federico Battiston. “Diffusive behavior of multiplex networks.” New Journal of Physics (2019).
      [3] Gomez, Sergio, et al. “Diffusion dynamics on multiplex networks.” Physical review letters 110.2 (2013): 028701.
      [4] Sole-Ribalta, Albert, et al. “Spectral properties of the Laplacian of multiplex networks.” Physical Review E 88.3 (2013): 032807.

      Requirements: Python, basics of graph signal processing: graph signal filtering (covered in EE-558 Network Tour of Data Science), familiarity with basic machine learning tools.

      Contact: eda.bayram@epfl.ch

    • Deep Learning for time-varying networked data
      Master Semester/Thesis Project

      In this era of data deluge, we are overwhelmed with massive volumes of extremely complex datasets. Data generated today is complex because it lacks a clear geometric structure, comes in great volumes, and often contains information from multiple domains. The emerging fields of deep learning and graph signal processing (GSP) have attracted a lot of attention as potential tools to overcome these challenges [1].

      In this context, graph convolutional neural networks (GCNNs) have recently been extended to work with multidomain graph signals [2], e.g., time-varying signals, defining a new framework to deal with graph signals defined on top of several domains, such as electroencephalograms or traffic networks. This new convolutional layer can be efficiently implemented to run on a GPU, and it has shown promising generalization abilities on synthetic datasets.

      In this project, we propose to apply this new type of time-varying GCNN to real data, and to analyze the impact that different architectures based on this convolutional layer have on the classification accuracy of time-varying graph signals. Students are encouraged to find an application of their liking, but we provide the following list of suggestions:

      – EEG decoding [3]: A hot topic in neuroscience is the classification of brain signals for the development of brain-computer interfaces (BCI). In this context, EEG signals can be viewed as time-varying graph processes where the spatial domain is represented by a graph of electrodes placed on the scalp.
      – Traffic prediction: Monitoring traffic in a city is a major concern for local governments. TV-GCNNs could be a potential tool for traffic prediction.
      – Characterization of epidemics [4]: One of the main issues in epidemiology is the statistical characterization of the dynamics of a disease outbreak given the recorded time variations of the disease state around the world. To tackle this problem, epidemiologists rely on complicated systems of coupled non-linear differential equations that are governed by a few characteristic parameters, i.e., probability of contagion, illness duration, or death rate. In this context, TV-GCNNs could be used as powerful regressors to infer these parameters from data.
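      One convenient way to make "time-varying graph signal" precise is through a product graph: the flattened spatio-temporal signal lives on the Cartesian product of the spatial graph and a path (or cycle) graph over time. A minimal sketch (this is one possible multidomain construction, not necessarily the one used in [2]):

```python
import numpy as np

def cartesian_product_laplacian(Ls, Lt):
    """Laplacian of the Cartesian product of a spatial graph (Ls) and a
    temporal graph (Lt): L = Ls (+) Lt = kron(Ls, I) + kron(I, Lt).
    A time-varying graph signal, flattened node-by-node, lives on it."""
    ns, nt = Ls.shape[0], Lt.shape[0]
    return np.kron(Ls, np.eye(nt)) + np.kron(np.eye(ns), Lt)
```

      Polynomial filters on this product Laplacian then mix information jointly across space and time, which is the basic operation a time-varying GCNN layer performs.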

      References:

      [1] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, “Geometric deep learning going beyond euclidean data,” IEEE Signal Process. Mag., vol. 34, no. 4, pp. 18–42, 2017.
      [2] G.Ortiz-Jiménez,“Multidomain Graph Signal Processing: Learning and Sampling,” Master’s thesis, Delft University of Technology, Aug 2018.
      [3] R. T. Schirrmeister, J. T. Springenberg, L. D. J. Fiederer, M. Glasstetter, K. Eggensperger, M. Tangermann, F. Hutter, W. Burgard, and T. Ball, “Deep learning with convolutional neural networks for eeg decoding and visualization,” Hum Brain Mapp, vol. 38, no. 11, pp. 5391–5420, 2017.
      [4] R. Anderson and R. May, Infectious Diseases of Humans: Dynamics and Control, ser. Dynamics and Control. OUP Oxford, 1992.

      Requirements: Good knowledge of Python, sufficient familiarity with machine learning and graph theory. Experience with one of the deep learning libraries (preferably TensorFlow) is a plus.

      Contact: guillermo.ortizjimenez@epfl.ch

    • Network inference from a noisy graph structure
      Master Semester/Thesis Project

      Graph structures carry a lot of information on the interactions between data, and can be very useful in data analysis or interpretation. However, the available structures are often very noisy or not entirely representative of the data. Meanwhile, graph inference algorithms can provide very good estimates of this structure, but need a large amount of data to learn from.
      The goal of this project is to propose a graph inference algorithm for applications where less data is available, but a noisy graph estimate already exists. This can be an empirically constructed geometric graph, or an available graph that does not necessarily capture the full information (e.g., a social network graph, where the same type of connection can represent both very close friends and people who have never met in person).
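      Two ingredients such an algorithm could combine, sketched with numpy: the signal-smoothness criterion underlying [1, 2], and a naive noisy graph estimate of the kind that would serve as a prior. The Gaussian-kernel construction is an illustrative baseline, not the method to be developed.

```python
import numpy as np

def smoothness(L, X):
    """Total variation of signals X (columns = signals) on a graph with
    Laplacian L: sum of w_ij (x_i - x_j)^2, computed as trace(X^T L X)."""
    return np.trace(X.T @ L @ X)

def gaussian_graph(X, sigma=1.0):
    """Naive baseline graph estimate: Gaussian kernel on pairwise
    distances between node feature vectors (rows of X)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0)
    return W
```

      A data-scarce inference method would then trade off the smoothness of the few available signals against fidelity to the noisy prior graph.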

      References:

      [1] Dong, Xiaowen, et al. “Learning Laplacian matrix in smooth graph signal representations.” IEEE Transactions on Signal Processing 64.23 (2016): 6160-6173.

      [2] Kalofolias, Vassilis. “How to learn a graph from smooth signals.” Artificial Intelligence and Statistics. 2016.

      [3] Nguyen, Viet Anh, Daniel Kuhn, and Peyman Mohajerin Esfahani. “Distributionally Robust Inverse Covariance Estimation: The Wasserstein Shrinkage Estimator.” arXiv preprint arXiv:1805.07194 (2018).

      Requirements: Python, basics of probability theory, linear algebra, optimization is a plus

      Contact: hermina.petricmaretic@epfl.ch or mireille.elgheche@epfl.ch

    • Graph learning with a degree distribution prior
      Master Semester/Thesis Project

      Graphs are flexible structures that represent connections between data, whether these are connections between people in a social network or the connectivity networks that govern processes in our brains. Very often (as is the case for brain networks), these graph structures are not readily available and need to be inferred. Graph learning methods have become very popular in the last few years, with solutions covering many different models of data behaviour on the graph.
      However, most solutions ignore any prior information we might have about the network structure. For example, additional knowledge about the graph can be included in the form of a prior on its degree distribution. The goal of this work is to propose a graph inference method that incorporates prior information on the degree distribution of the nodes.

      References:

      [1] Dong, Xiaowen, et al. “Learning Laplacian matrix in smooth graph signal representations.” IEEE Transactions on Signal Processing 64.23 (2016): 6160-6173.

      [2] Tzikas, Dimitris G., Aristidis C. Likas, and Nikolaos P. Galatsanos. “The variational approximation for Bayesian inference.” IEEE Signal Processing Magazine 25.6 (2008): 131-146.

      [3] Dong, Xiaowen, et al. “Learning Graphs from Data: A Signal Representation Perspective.” arXiv preprint arXiv:1806.00848(2018).

      Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.

      Contact: hermina.petricmaretic@epfl.ch

    • Inference of multiple functional brain networks using Graph Laplacian Mixture Model
      Master Semester Project/Thesis Project

      Spontaneous brain activity, as measured through resting-state functional magnetic resonance imaging (fMRI), has provided key insights into the functional architecture of the brain. Global patterns of neural activity can be obtained by directly computing the statistical interdependence between different brain regions. This information can then be conveniently summarised into a functional connectome [1], and more intuitively represented as a set of functional brain networks. These networks are observed to be dynamic [2] and spatio-temporally overlapping [3]. In this project, the goal is to simultaneously separate signals corresponding to different phases and infer multiple functional brain networks. This will be done by building upon the emerging field of graph learning, specifically by utilising a Graph Laplacian mixture model [4], a generative model for mixed signals living on multiple networks.

      References:

      [1] Bullmore, E., Sporns, O., 2009. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10 (3), 186–198.

      [2] Chang C, Glover GH (2010) Time-frequency dynamics of resting-state brain connectivity measured with fmri. Neuroimage 50:81–98

      [3] Karahanoğlu, F. I. & Van De Ville, D. (2015) Transient brain activity disentangles fmri resting-state dynamics in terms of spatially and temporally overlapping networks. Nature communications 6

      [4] Maretic, Hermina Petric, and Pascal Frossard. “Graph Laplacian mixture model.” arXiv preprint arXiv:1810.10053 (2018).

      Requirements: Basics of (graph) signal processing, basics of machine learning, basics of linear algebra, basics of probabilities, Python/Matlab coding skills.

      Contact: hermina.petricmaretic@epfl.ch

    • Improving classification performance of convolutional neural networks by changing the convolutional kernel shape
      Semester/Master Project

      Convolutional neural networks have become one of the most efficient tools for tasks such as classification. In such networks, a convolutional kernel is translated over an image in order to detect local patterns that may be characteristic of a particular class. In general, these kernels take the shape of an N×M window of pixels, which is quite arbitrary. The idea of the project is to explore different kernel shapes, possibly learned from the data, that would allow classification improvements.
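      A trivial way to prototype non-square kernel shapes is to mask a standard square kernel; the cross-shaped support below is a hypothetical example, and a learned shape would amount to learning the mask itself.

```python
import numpy as np

def masked_kernel(kernel, mask):
    """Constrain a convolutional kernel to an arbitrary support by
    elementwise masking: taps outside the mask stay at zero, so only
    the chosen shape contributes to the convolution."""
    return kernel * mask

# Hypothetical cross-shaped 3x3 support instead of the full square.
mask = np.array([[0, 1, 0],
                 [1, 1, 1],
                 [0, 1, 0]], dtype=float)
```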

      Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.

      Contact: bastien.pasdeloup@epfl.ch

    • Finding the closest regular topology to a graph representing an irregular space
      Semester/Master Project

      Graphs are often used to represent irregular topologies, such as relations in a social network, roads of a city, brain connectivity, etc. Due to this irregularity, some operations, such as the translation of a signal on a graph, are quite hard to define. However, these tasks can easily be carried out on some particular graphs, such as an N-dimensional grid for instance. The project consists, given a graph, in finding the “closest” regular space to approximate it, using particular properties of the adjacency matrices of such graphs (block circulancy, bandwidth, …). Operations could then be performed on this space as a proxy for the irregular topology.
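      As one concrete notion of “closest”, here is a sketch of the Frobenius-nearest circulant matrix, obtained by averaging each cyclic diagonal; circulant adjacency matrices are exactly those on which cyclic translation of signals is well defined.

```python
import numpy as np

def nearest_circulant(A):
    """Project a matrix onto the set of circulant matrices (Frobenius-
    nearest) by averaging each cyclic diagonal. The cyclic-shift powers
    form an orthogonal basis of circulants, so averaging each diagonal
    is the orthogonal projection."""
    n = A.shape[0]
    c = np.array([np.mean([A[i, (i + k) % n] for i in range(n)])
                  for k in range(n)])
    return np.array([[c[(j - i) % n] for j in range(n)] for i in range(n)])
```

      A signal on the original graph could then be translated via the cyclic shift on this circulant proxy.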

      Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.

      Contact: bastien.pasdeloup@epfl.ch

    • Defining heuristics for translations of a convolutional kernel on an irregular graph
      Semester/Master Project

      Graph convolutional neural networks are possible extensions of classical CNNs on images, allowing one to achieve better classification performance on irregular data. One of the approaches to extending CNNs to graphs consists in translating a convolutional kernel over a graph. However, translation is not easily defined on spaces that are not associated with a notion of distance. A possible way of mimicking translation on Euclidean spaces is to find an injective function on the graph that preserves neighborhoods within the kernel to translate (https://arxiv.org/abs/1710.10035). When the kernel is highly localized, such functions can be explored exhaustively to find those that minimize the kernel deformation upon translation. However, as the kernel grows in size, this approach becomes intractable due to the NP-completeness of finding interesting translations. In this project, we would like to explore possible heuristics for translating a convolutional kernel on a graph.

      Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.

      Contact: bastien.pasdeloup@epfl.ch

    • Exploring hypergraph signal processing
      Semester/Master Project

      Graph signal processing (GSP) has emerged as a possible extension of classical Fourier analysis, allowing one to study signals evolving on complex topologies. In this framework, graphs model the domains on which data evolve, with edges connecting related variables. However, graphs only capture pairwise relationships and ignore more complex relations between three or more variables. Such relations can be modeled using a hypergraph, i.e., a graph in which an edge can link more than two nodes. The goal of this project is to explore which tools from GSP theory can be extended to such structures.
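      A natural starting point is the normalised hypergraph Laplacian of Zhou et al. (2006), which turns the hypergraph structure into a matrix that GSP tools can operate on; a minimal sketch with unit hyperedge weights:

```python
import numpy as np

def hypergraph_laplacian(H):
    """Normalised hypergraph Laplacian (Zhou et al., 2006) from an
    incidence matrix H (nodes x hyperedges), unit hyperedge weights:
    L = I - Dv^{-1/2} H De^{-1} H^T Dv^{-1/2}."""
    Dv = H.sum(axis=1)            # node degrees
    De = H.sum(axis=0)            # hyperedge sizes
    Dvi = np.diag(1 / np.sqrt(Dv))
    return np.eye(H.shape[0]) - Dvi @ H @ np.diag(1 / De) @ H.T @ Dvi
```

      When every hyperedge has exactly two nodes, this construction recovers the usual normalised graph Laplacian up to a factor 1/2, so ordinary GSP appears as a special case.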

      Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.

      Contact: bastien.pasdeloup@epfl.ch

    • Semi-Supervised Learning and Inpainting on Multi-Layer Graph Representations
      Semester/Master Project

      Semi-supervised learning methods for partially labeled data have been studied extensively by expressing the relations between data entities as weighted graphs [1]. The inpainting task, on the other hand, is usually defined on the domain accompanying a signal content, and has been addressed with graph signal representations and operations [2]. Most of these studies focus on a single type of relationship between pairs of data points when constructing the graph structure. However, the connections between entities may be of several different types, which are better represented by multiple graph structures. The objective of this project is to extend semi-supervised clustering and inpainting tasks to multi-layer graph settings, where each graph layer signifies a particular type of relation between vertices. Each graph layer thus has the same set of vertices, yet a different topology (i.e., weight matrix), since each layer focuses on a different relation.

      References:

      [1] Belkin, Mikhail, and Partha Niyogi. “Semi-supervised learning on Riemannian manifolds.” Machine learning 56.1-3 (2004): 209-239.

      [2] Perraudin, Nathanaël, and Pierre Vandergheynst. “Stationary signal processing on graphs.” IEEE Transactions on Signal Processing 65.13 (2017): 3462-3477.

      [3] Davide Eynard, Klaus Glashoff, Michael M Bronstein, and Alexander M. Bronstein. Multimodal diffusion geometry by joint diagonalization of laplacians.arXiv preprint arXiv:1209.2295, 2012.

      [4] Xiaowen Dong, Pascal Frossard, Pierre Vandergheynst, and Nikolai Nefedov. Clustering on multi-layer graphs via subspace analysis on grassmann manifolds. IEEE Transactions on signal processing, 62(4):905–918, 2014.

      [5] Xiaowen Dong, Pascal Frossard, Pierre Vandergheynst, and Nikolai Nefedov. Clustering with multi-layer graphs: A spectral perspective. IEEE Transactions on Signal Processing, 60(11):5820–5831, 2012.

      Requirements: Python, basics of graph signal processing: graph signal filtering and spectral clustering (covered in EE-558 Network Tour of Data Science).

      Contact: eda.bayram@epfl.ch

    • Transfer learning for network data
      Semester/Master Project

      Technology is in constant evolution and the amount of collected network data is increasing every day. Numerous examples can be found in geographical, transportation, biomedical and social networks, such as temperatures within a geographical area, traffic capacities at hubs in a transportation network, or human behaviors in a social network. Due to this growing volume of information and its diverse interactions, data now resides on irregular and complex structures, and can be modelled by graphs [1, 2, 3]. Consider, for example, the traffic congestion problem: if we have a model that links the graphs of two different cities, we can transfer traffic information from one city to the other and predict the latter’s condition. The idea of this project is precisely to build a method, and implement a machine learning algorithm, that is able to transfer information from one graph to another [4].

      References:

      [1] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega and P. Vandergheynst. The Emerging Field of Signal Processing on Graphs: Extending High-Dimensional Data Analysis to Networks and Other Irregular Domains, in IEEE Signal Processing Magazine, vol. 30, num. 3, p. 83-98, 2013.

      [2] B. Yener, Cell-Graphs: Image-Driven Modeling of Structure-Function Relationship, ACM, Vol. 60 No. 1, Pages 74-84, 2017.

      [3] A. Ortega, P. Frossard, J. Kovacevic, J. M. F. Moura, P. Vandergheynst, Graph Signal Processing: Overview, Challenges and Applications, 2018.

      [4] M. Pilanci and E. Vural, “Domain adaptation on graphs by learning aligned graph bases,” 2018.

      Requirements: Good knowledge of Python and MATLAB, familiarity with machine learning and linear algebra.

      Contact: mireille.elgheche@epfl.ch

    • Building Extraction on Aerial LIDAR Point Clouds using Spectral Graph Features
      Semester/Master Project

      Airborne Laser Scanning is a well-known remote sensing technology that provides quite dense and highly accurate, yet unorganized, point cloud descriptions of the Earth’s surface. Weighted graphs are very convenient tools for representing such irregular 3D data; moreover, spectral graph-based methods enable the spectral analysis of signals residing on weighted graph representations. With this in mind, one can consider airborne LIDAR data as an unstructured elevation signal that can be processed on an appropriate graph structure.

      The goal of this project is to discover the spectral attributes of various objects in a LIDAR scene, such as buildings and vegetation, through graph signal processing. Instead of calculating geometric primitives such as normals, slopes and curvatures for each point in a scene and thresholding them, this approach aims to transpose classical signal processing tools to the analysis of 3D aerial LIDAR point clouds.

      In particular, the points on the breaklines of buildings constitute features which can be detected and discriminated from other objects by exploiting the spectral information on the graph. This could be achieved by formulating a spectral descriptor on the detected points and then solving a classification problem to discriminate the points lying on buildings. In a building extraction problem, the last step is to retrieve the whole body of each building object.
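      A minimal sketch of the kind of spectral machinery involved: build a k-nearest-neighbour graph on the points and take the graph Fourier transform of the elevation signal (the parameters and descriptor choice are illustrative):

```python
import numpy as np

def knn_graph(P, k=4):
    """Symmetric k-nearest-neighbour adjacency over 3D points P (n x 3)."""
    d2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    A = np.zeros_like(d2)
    rows = np.arange(len(P))[:, None]
    A[rows, np.argsort(d2, axis=1)[:, :k]] = 1
    return np.maximum(A, A.T)

def spectral_descriptor(P):
    """Graph Fourier transform of the elevation (z) signal on the k-NN
    graph: coefficients ordered from low to high graph frequency."""
    A = knn_graph(P)
    L = np.diag(A.sum(1)) - A
    _, U = np.linalg.eigh(L)          # eigenvalues in ascending order
    return U.T @ P[:, 2]
```

      On a perfectly flat patch all the energy falls into the first (constant) coefficient, whereas breaklines such as building edges inject energy into the high-frequency coefficients, which is what a spectral descriptor would exploit.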

      References:

      [1] Michaël Defferrard, Lionel Martin, Rodrigo Pena, & Nathanaël Perraudin. (2017, October 6). PyGSP: Graph Signal Processing in Python (Version v0.5.0). Zenodo. http://doi.org/10.5281/zenodo.1003158.

      [2] ISPRS, “Isprs test project on 3d semantic labeling contest,”2017, http://www2.isprs.org/commissions/comm3/wg4/3d-semantic-abeling.html.

      [3] Blomley, R., and M. Weinmann. “Using multi-scale features for the 3D semantic labeling of airborne laser scanning data.” ISPRS Annals of Photogrammetry, Remote Sensing & Spatial Information Sciences 4 (2017).

      Requirements: Python, basics of graph signal processing: graph signal filtering (covered in EE-558 Network Tour of Data Science), familiarity with basic machine learning tools.

      Contact: eda.bayram@epfl.ch

Learning

    • Analyse medical images using group equivariant neural networks
      Master Semester/Thesis Project

      Despite being 40 years old, convolutional neural networks are still the state of the art when it comes to analysing images. The reason for this success is that they hard-code into the neural network the prior that the learned features must be equivariant to translations, because the same objects can appear in any part of an image.

      Methods have recently been proposed to extend this idea to other groups of transformations [1]. For example, images for skin cancer detection have no notion of “up” and “down”, and can be arbitrarily rotated [2]. Roughly speaking, the method proposed in [2] amounts to learning one filter together with all of its rotated versions. It allows the network to learn with little training data, but is less efficient in terms of memory and computation, because the image must be fed to all the rotated filters.

      When using standard CNNs, memory usage is not a problem. What makes translation invariance efficient and rotation invariance costly? The answer is that in CNNs we assume that features are local. Similarly, we expect the first layers of a CNN to act as edge detectors, which can be seen as local functions of the angle in polar coordinates. This project consists in creating a neural network based on this hypothesis, in order to improve upon rotation-invariant neural networks.
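      To make the cost of rotation invariance concrete, here is a numpy sketch of the "filter copies" approach in the special case of 90-degree rotations, where max-pooling over orientations yields exact invariance; this is a cheap stand-in for the general constructions of [1, 2].

```python
import numpy as np
from scipy.signal import correlate2d

def rotation_invariant_response(image, kernel):
    """Apply a filter and its three 90-degree rotations, then max-pool
    over orientations: the pooled response map is equivariant, and its
    maximum invariant, to rotating the input by multiples of 90 degrees.
    Note the 4x cost in computation compared to a single filter."""
    responses = [correlate2d(image, np.rot90(kernel, r), mode="valid")
                 for r in range(4)]
    return np.max(np.stack(responses), axis=0)
```

      For finer rotation groups the number of filter copies, and hence the cost, grows proportionally, which is the inefficiency this project aims to sidestep.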

      References:

      [1]: T. Cohen M. Welling, Group equivariant convolutional networks
      [2]: B. S. Veeling et al., Rotation equivariant CNNs for digital pathology

      Requirements: This project requires some notions of group theory, and a good knowledge of Python. During the project, you may have to code in CUDA to run the computations on a GPU, but no prior knowledge of CUDA is required.

      Contact: clement.vignac@epfl.ch

    • Neural Network Pruning
      Semester/Master Thesis Project

      Neural network pruning techniques [1,2] can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy.

      Although the original ideas date back to the 90s [3,4], network pruning techniques have regained a lot of popularity lately, now that more and more applications require deep networks to be deployed on mobile devices.

      Furthermore, some researchers [5] claim that pruning overparameterized neural networks can naturally uncover subnetworks whose initializations made them capable of training effectively. Based on these results, they articulate the lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations. This hypothesis could explain why overparameterized neural networks are easier to optimize in practice, and the work received the Best Paper award at ICLR 2019.

      However, most pruning techniques to date are based on heuristics that assign importance to the weights of a neural network in ways that lack a theoretically grounded explanation. In this project we aim to bridge this theoretical gap, and propose to use recent advances in sampling theory to determine near-optimal pruning strategies for neural networks.
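      For concreteness, the simplest heuristic of this kind, magnitude pruning as in [1], can be sketched as follows (a toy NumPy version operating on a single weight tensor):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Sketch of magnitude pruning in the spirit of [1]: build a binary
    mask that zeroes out the given fraction of weights with the smallest
    absolute value (ties at the threshold are also pruned)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return np.ones_like(weights)
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    return (np.abs(weights) > threshold).astype(weights.dtype)
```

      The mask is then applied to the weights after (or during) training; the question raised above is precisely whether such a magnitude criterion can be justified, or improved upon, with sampling-theoretic tools.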

      References:

      [1] Song Han, Jeff Pool, John Tran, and William Dally. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems, pp. 1135–1143, 2015.

      [2] Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710, 2016.

      [3] Yann LeCun, John S Denker, and Sara A Solla. Optimal brain damage. In Advances in neural information processing systems, pp. 598–605, 1990.

      [4] Babak Hassibi and David G Stork. Second order derivatives for network pruning: Optimal brain surgeon. In Advances in neural information processing systems, pp. 164–171, 1993.

      [5] Jonathan Frankle and Michael Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. In Proceedings of the International Conference on Learning Representations, 2019.

      Requirements: Good knowledge of Python, sufficient familiarity with machine learning and deep learning. Experience with one of the deep learning libraries (preferably TensorFlow) is a plus.

      Contact: guillermo.ortizjimenez@epfl.ch

    • Inside our mind: detecting dreams
      Semester/Master Thesis Project

      Electroencephalogram (EEG) data is usually very complicated: not only does it depend on both time and space, but it is also merely a surface measure of a deep phenomenon taking place inside our brain.
      The purpose of this project is to understand the hidden patterns of our brain’s electric activity during dreaming phases, and to build a classifier that can correctly label a person’s EEG signal as belonging to a dreaming or non-dreaming phase across different sleep stages. This machine-learning project will have a crucial feature engineering component: to begin with, simple yet powerful features will have to be created to take into account the quasi-periodicity of brain waves; more advanced inverse-problem techniques may later be used to reconstruct the 3D electrical signal inside the brain and pass to a voxelised version of the problem. Clever data augmentation techniques may also prove crucial to compensate for the relative scarcity of training samples. The final goal of this project is to engineer and train a number of sequence models (e.g. recurrent neural networks) on the selected features in order to capture the relationship between the EEG patterns and the dreaming process.
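      As a flavour of the feature engineering involved, one classic starting point is the average power in the standard EEG frequency bands, estimated with Welch’s method (a sketch only; the band cut-offs and estimator settings below are illustrative choices, not prescribed by the project):

```python
import numpy as np
from scipy.signal import welch

# Classic EEG band definitions in Hz (illustrative; exact cut-offs vary).
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_power_features(signal, fs):
    """Simple features capturing the quasi-periodicity of brain waves:
    average spectral power in the classic EEG bands, estimated from
    Welch's power spectral density."""
    freqs, psd = welch(signal, fs=fs, nperseg=min(len(signal), 2 * fs))
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}
```

      Such per-window feature vectors would then be fed to the sequence models mentioned above.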

      Requirements: Basics of signal processing, Basics of Machine Learning

      Contact: m.caorsi@l2f.ch or pascal.frossard@epfl.ch

    • Topology of critical phenomena
      Semester/Master Thesis Project

      Topological data analysis (TDA) has been applied successfully to the study of time series: its robustness against noise and its stability under deformations of the point cloud make it an excellent alternative to the standard DFT approach [1]. The aim of this project is to preprocess the time series (e.g. using stabilisation techniques based on standard and fractional derivatives) so as to maximise the effectiveness of TDA. Other preprocessing techniques may require defining a Laplacian operator over the persistent simplicial filtration of the point cloud and using spectral techniques to reduce the dimensionality effectively. Once the preprocessing stage has been carried out, the candidate shall focus on the application of persistent homology, and in particular on the computation of the bottleneck distance between different persistence barcodes. This may lead to a promising approach for anticipating the occurrence of catastrophic events. Time permitting, it would be worth investigating the relationship between these topological techniques and the well-established extreme value theory.
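      As a concrete example of the stabilisation step, a truncated Grünwald-Letnikov fractional derivative of a time series can be computed as follows (a minimal NumPy sketch; the order `alpha` and the memory `window` would be tuned during the project):

```python
import numpy as np

def gl_fractional_diff(x, alpha, window=50):
    """Sketch of the Grünwald-Letnikov fractional derivative of order alpha:
    D^alpha x_t ~ sum_k w_k x_{t-k}, with w_k = (-1)^k C(alpha, k).
    The binomial weights are built recursively and `window` truncates
    the (in principle infinite) memory of the operator."""
    w = np.ones(window)
    for k in range(1, window):
        w[k] = w[k - 1] * (k - 1 - alpha) / k
    return np.convolve(x, w, mode='valid')
```

      For alpha = 1 this reduces to the ordinary first difference, and non-integer orders interpolate between the raw series and its derivative.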

      References:

      [1] Shafie Gholizadeh and Wlodek Zadrozny, “A Short Survey of Topological Data Analysis in Time Series and Systems Analysis”, arXiv:1809.10745.

      Requirements: Good background in signal processing.

      Contact: m.caorsi@l2f.ch or pascal.frossard@epfl.ch

    • Topological filtering
      Semester/Master Thesis Project

      Signal processing is a vast field of engineering and research, with applications everywhere in our society, and genuine innovation in it is hard to come by. Fortunately, little topology has been exploited by engineers in this field over the past century: the purpose of this project is to dive into this world equipped with the tools of persistent homology. The starting point of this project is to build a topological low-pass filter: instead of removing the noise by averaging the signal directly, one can clean its topological representation (obtained, for example, via Takens or phase-space embeddings). One can then develop analogous concepts for high-pass and band-pass filters.
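      A minimal sketch of the embedding step, on which the topological representation would then be built (the dimension and delay below are illustrative parameters):

```python
import numpy as np

def takens_embedding(x, dim=3, delay=1):
    """Delay (Takens) embedding of a 1-D signal into a point cloud:
    each row of the output is (x_t, x_{t+delay}, ..., x_{t+(dim-1)*delay})."""
    n = len(x) - (dim - 1) * delay
    return np.stack([x[i * delay : i * delay + n] for i in range(dim)], axis=1)
```

      Periodic structure in the signal shows up as loops in this point cloud, which is what a topological filter would preserve while discarding noise.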

      Requirements: Good knowledge of signal processing, good programming skills and a lot of out-of-the-box thinking.

      Contact: m.caorsi@l2f.ch or pascal.frossard@epfl.ch

    • Learning similarity metric for video reconstruction
      Semester Project

      Obtaining an unsupervised representation of sequential visual data can be crucial for any autonomous intelligent agent. One of the simplest schemes to learn an unsupervised representation is to encode the data under some constraints in such a way that the error between the reconstructed and input data is minimized.

      The use of latent random variables with neural networks has been argued to model the variability observed in the data efficiently, as in Variational Autoencoders (VAEs) [1] and their counterparts for sequential data, e.g. the variational Recurrent Neural Networks (RNNs) of [2] (an RNN is a type of neural network with an inner loop, so that the same operation is applied to every element of a sequence). The aim of the first part of the project is to understand variational RNNs and use them for video reconstruction on the simple Moving MNIST dataset. Later, instead of element-wise errors, the student is expected to use another neural network to learn a similarity metric as the basis for the reconstruction objective, as done in [3] for VAEs.
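      The difference between the two objectives can be sketched as follows (a toy NumPy version, where `phi` is any callable standing in for an intermediate layer of a learned network, in the spirit of [3]):

```python
import numpy as np

def elementwise_loss(x, x_hat):
    """Pixel-wise squared error: the simple reconstruction objective."""
    return np.mean((x - x_hat) ** 2)

def feature_space_loss(x, x_hat, phi):
    """Learned-similarity objective: compare the two images through a
    feature map `phi` instead of pixel by pixel."""
    return np.mean((phi(x) - phi(x_hat)) ** 2)
```

      A feature map that is insensitive to small spatial shifts penalises a slightly translated reconstruction far less than the pixel-wise loss does, which is why learned similarity metrics tend to produce sharper reconstructions.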

      References:

      [1] Kingma, D. P., & Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

      [2] Chung, J., Kastner, K., Dinh, L., Goel, K., Courville, A. C., & Bengio, Y. (2015). A recurrent latent variable model for sequential data. In Advances in neural information processing systems (pp. 2980-2988)

      [3] Larsen, A. B. L., Sønderby, S. K., Larochelle, H., & Winther, O. (2015). Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300.

      Requirements: Good knowledge of Python, sufficient familiarity with machine learning and probability. Experience with one of the deep learning libraries (preferably TensorFlow) is a plus.

      Contact: beril.besbinar@epfl.ch

    • Spectral vs. spatial approaches of deep learning for data on non-Euclidean domains
      Semester Project

      Deep learning methods have been very successful for signals defined on regular grids, such as images and audio. Recently, several works have extended deep learning models to data defined on irregular domains, such as graphs and manifolds. The approaches presented so far in this field can be classified as (i) spatial and (ii) spectral. The aim of this semester project is to compare these methods and to identify their limitations and advantages.

      The student will (i) read and understand in depth existing methods of both categories, (ii) run experiments to compare the different approaches and (iii, optionally) propose ideas for dealing with the limitations of the current methods.
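      On a toy graph, the two families of approaches can be sketched as follows (a minimal NumPy illustration; for a polynomial frequency response the two coincide, which is a useful sanity check when comparing them):

```python
import numpy as np

# Toy graph: a path on 4 nodes, with combinatorial Laplacian L = D - A.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
L = np.diag(A.sum(1)) - A

def spectral_filter(x, h):
    """Spectral approach: go to the graph Fourier domain (eigenvectors of L)
    and apply the frequency response h to the eigenvalues [2]."""
    lam, U = np.linalg.eigh(L)
    return U @ (h(lam) * (U.T @ x))

def spatial_filter(x, coeffs):
    """Spatial approach: a polynomial in L; the k-th term only mixes
    values within k-hop neighbourhoods."""
    out, Lk = np.zeros_like(x), np.eye(len(x))
    for c in coeffs:
        out, Lk = out + c * (Lk @ x), Lk @ L
    return out
```

      The spatial form avoids the eigendecomposition and stays localised, which is one of the trade-offs the project would investigate.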

      References:

      [1] M.M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, P. Vandergheynst, “Geometric deep learning: going beyond Euclidean data”, arXiv:1611.08097

      [2] D. I Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83-98, May 2013.

      Requirements: Python, background in machine learning and signal processing.

      Contact: effrosyni.simou@epfl.ch

Hands-on

    • User-friendly drawing of large graphs
      Semester Project

      Very large graphs can be difficult to visualise and, more importantly, difficult to make sense of. As there is an ever-increasing amount of data surrounding us, recovering useful information becomes more and more challenging, and interpreting large datasets becomes the main task of data scientists. For a large graph, a coarser representation can make its behaviour and trends much clearer. Furthermore, a coarse representation can be coarsened further until we reach a version that is simple enough to interpret easily. However, once we understand the basic behaviour, we might want to go back to a finer version for more detailed information.

      The goal of this project is to create a tool to plot a coarse graph representation. Representations with different levels of detail (original graph, coarser version, further coarsened version…) will be provided, and the tool should be able to zoom into the graph to recover a finer version and zoom out to draw a coarser one. The tool should be user-friendly and ensure a consistent embedding of the graph across the different levels of detail. Integrating the tool in a website would be a plus.
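      For intuition, one coarsening step can be sketched as follows (a minimal NumPy version, assuming a node-to-supernode partition is already given, e.g. by whatever coarsening algorithm produced the provided representations):

```python
import numpy as np

def coarsen(A, partition):
    """One coarsening step: nodes mapped to the same group in `partition`
    are merged into a supernode. With assignment matrix P, the coarse
    adjacency is A_c = P A P^T, so coarse edge weights sum the original
    edges between the two groups."""
    n_coarse = max(partition) + 1
    P = np.zeros((n_coarse, len(partition)))
    P[partition, np.arange(len(partition))] = 1.0
    Ac = P @ A @ P.T
    np.fill_diagonal(Ac, 0)   # drop self-loops created by merging
    return Ac
```

      The tool would draw such coarse graphs and, when zooming in, fall back to the finer adjacency while keeping the supernode positions consistent.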

      Requirements: Python, basics of linear algebra and graph theory, knowledge of web programming is a plus

      Contact: hermina.petricmaretic@epfl.ch