Welcome to the LTS4 Student Projects page!
Below you will find a collection of projects that are available for the coming semesters. Even though most of the proposed projects are categorized as Semester or Master projects, they can generally be modified to fit other formats. Contact the responsible person that is given in the detailed project description for further information.
The project topics are the following:
 Image Analysis and Vision
 Image Communication
 Graph and Network Signal Processing and Analysis
 Learning
 Hands-on
In case you want to work on your own ideas that are closely related to our research activities, we are more than happy to discuss them with you.
Image Analysis and Vision


Adversarial perturbations for interpretability and robustness
Semester/Master Project
Although Deep Neural Networks are considered the state of the art for image classification tasks, they are also highly sensitive to small adversarial perturbations of their input data: small changes to the image can lead to extreme misclassification [1]. To explain this phenomenon, it is important to understand and interpret what deep networks are eventually forced to “see” when adversarial perturbations are applied, and to exploit this information to build more robust classifiers.
In this work, we will mainly focus on sparse adversarial perturbations [2] for generating adversarial data, since the perturbed features are quite discrete and easier to analyze. We will extract meaningful information from the perturbed features. Finally, based on our observations, we will try different approaches aiming to improve the robustness of the networks.
References:
[1] Szegedy et al., “Intriguing properties of neural networks”, ICLR 2014.
[2] Modas et al., “SparseFool: a few pixels make a big difference”, arXiv preprint 2018.
Requirements: Good knowledge of Python, and sufficient familiarity with image processing and machine learning. Experience with PyTorch is a plus.
Contact: apostolos.modas@epfl.ch

Robustness of deep neural networks on different spaces
Semester/Master Project
Deep Neural Networks have achieved extraordinary results on many image classification tasks, but have been shown to be vulnerable to small alterations of their input, also known as adversarial perturbations [1]. Several methods compute adversarial perturbations, each of which perturbs different features or characteristics (spatial, spectral, chromatic) of the input image.
In this project, we will study the robustness of deep neural networks to perturbations that lie in specific spatial, spectral, and color spaces. The goal is to identify the spaces where the networks are most vulnerable, and to use them to build methods that increase robustness against adversarial attacks.
References:
[1] Szegedy et al., “Intriguing properties of neural networks”, ICLR 2014.
Requirements: Good knowledge of Python, sufficient familiarity with computer vision and deep learning. Experience with PyTorch or another deep learning library is a plus.
Contact: apostolos.modas@epfl.ch

Behavioural analysis of neural networks to adversarial perturbations
Semester/Master Project
The vulnerability of deep neural networks to small, carefully crafted noise known as adversarial perturbations [1] has raised many questions regarding their safety and their overall behaviour when classifying different images. In order to improve these deep nets and increase their robustness, we first need to better understand their overall behaviour in both correct classification and misclassification situations.
In this project, we aim to study the internal behaviour of such deep architectures when different types of perturbations are applied at the input layer. By observing the neuron activation diffusion and visualising the feature changes, we want to understand how these small perturbations in the input affect the different layers of the network, eventually leading to such significant changes in classification.
[1] Szegedy et al., “Intriguing properties of neural networks”, ICLR 2014.
Requirements: Good knowledge of Python and MATLAB, and sufficient familiarity with deep learning and deep neural networks. Experience with PyTorch is a plus.
Contact: apostolos.modas@epfl.ch



Omnidirectional vision for drones
Semester/Master Project
Omnidirectional imaging is a hot research topic and has important applications to aerial robots equipped with multiple cameras. Besides numerous use cases in cinematography and filmmaking, omnidirectional vision can enhance drone autonomy by providing 360-degree visual sensing for tasks such as collision avoidance. The present project aims at implementing and comparing efficient stitching algorithms for omnidirectional vision, possibly enhancing existing solutions. Since the algorithm is to be run on the drone's onboard computer and potentially used for real-time applications such as collision avoidance, it should have low complexity and must be able to run at an acceptable speed.
The first part of the project will involve the selection and implementation of suitable stitching algorithms. Existing algorithms may be enhanced by taking a priori information, such as the camera configuration, into account. The second part of the project will involve testing the proposed algorithms on a physical drone platform.
Requirements: Programming skills (shell scripting, Python, C/C++), basics of image processing.
Contact: giuseppe.cocco@epfl.ch



Bandwidth efficient object recognition for drone swarms
Semester/Master Project
The project aims at developing a bandwidth-efficient distributed object detection system which can be flown on a drone swarm. The system exploits the different points of view of the drones in the swarm to improve object recognition, while keeping the amount of data that is transmitted to the ground station as low as possible. In this way, a more efficient use of the limited wireless resources can be achieved. The project will involve the use of both off-the-shelf neural network algorithms and WiFi communication protocols. The first part of the project will focus on the setup of the communication network between the communication modules to be mounted on the drones and the ground station. The second part of the project will focus on the setup of the image capture/object recognition system and some basic onboard image processing. The core part of the project will consist of the implementation and optimization of the detection and communication protocol, which will build upon the modules developed so far. This part will include the evaluation of the system performance in terms of both bandwidth efficiency and detection accuracy.
Requirements: Good knowledge of WiFi communication protocols (hands-on experience desirable), programming skills (shell scripting, Python, C/C++), familiarity with computer vision.
Contact: giuseppe.cocco@epfl.ch



Deep Learning for Depth Estimation
Semester/Master Project
Although Deep Neural Networks (DNNs) are widely known for their remarkable classification capabilities, it has recently been shown that they can effectively be used for Depth Estimation as well. The Depth Estimation problem consists of inferring the depth of a scene from two or more images of the same scene captured from different points of view. It is a critical problem in Computer Vision, as it is necessary for addressing higher-level problems such as semantic segmentation, action recognition, etc. Light field cameras can capture hundreds of images of the same scene from slightly different points of view in just a single exposure. Ideally, the availability of such a number of views makes the depth estimation problem easier to address, compared to the most common scenario where only two images are available.
In this project, the student will investigate the adaptation of existing DNN-based Depth Estimation algorithms, designed for the common setup with just two images (stereo setup), to the Light Field case (multi-stereo setup), where multiple images of the same scene are available. Although the availability of multiple images of the same scene makes the depth estimation problem less ill-conditioned and easier to approach, the large amount of data requires careful consideration of the overall computational and memory complexity of the final Depth Estimation algorithm, as these can become the main bottleneck when depth estimation is required in real-time applications.
[1] Alexey Dosovitskiy et al., FlowNet: Learning Optical Flow with Convolutional Networks, ICCV 2015.
[2] Nikolaus Mayer et al., A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation, IEEE CVPR 2016.
[3] Wenjie Luo et al., Efficient deep learning for stereo matching, IEEE CVPR 2016.
Requirements: Knowledge of Python and DNN tools such as PyTorch. Basics of Image Processing.
Contact: mattia.rossi@epfl.ch



Geometry meets robustness
Semester Project
Deep networks have shown great performance in classification tasks. However, they have been shown to be vulnerable to small and often imperceptible noise called “adversarial perturbations”. This vulnerability becomes critical when they are deployed in safety-critical applications such as driverless cars. So far, no definitive method has been found to counter it effectively.
Nevertheless, adversarial perturbations have recently been shown to be tightly related to the geometry of the decision boundary of deep networks. In this project, we will characterize these perturbations in terms of the geometrical properties of the decision boundary. Based on this understanding, we will eventually design efficient algorithms to detect adversarial perturbations.
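The link between perturbations and the decision boundary is easiest to see for a linear binary classifier, where the smallest L2 perturbation that changes the label is the orthogonal projection of the input onto the hyperplane. The sketch below illustrates only this textbook case; the weights, the input and the `overshoot` parameter are made-up values for illustration and are not part of the project description.

```python
import math

def score(w, b, x):
    """Signed classifier output w.x + b for a linear binary classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Label +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if score(w, b, x) > 0 else -1

def minimal_perturbation(w, b, x, overshoot=1e-3):
    """Smallest L2 step that moves x just across the hyperplane w.x + b = 0.

    The step is along the normal w; `overshoot` (an assumed small margin)
    pushes the point slightly past the boundary so the label really flips.
    """
    norm_sq = sum(wi * wi for wi in w)
    scale = -(1.0 + overshoot) * score(w, b, x) / norm_sq
    return [scale * wi for wi in w]

# Hypothetical classifier and input, chosen only for illustration.
w, b = [2.0, -1.0], 0.5
x = [1.0, 1.0]

r = minimal_perturbation(w, b, x)
x_adv = [xi + ri for xi, ri in zip(x, r)]

# Distance from x to the boundary; the perturbation norm matches it up to the overshoot.
distance = abs(score(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))
perturbation_norm = math.sqrt(sum(ri * ri for ri in r))
```

For deep networks the boundary is curved, so this projection is only a local approximation; studying how far the true boundary departs from it is the kind of geometric question the project addresses.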
Requirements: Good knowledge of Python and MATLAB, sufficient familiarity with machine learning and image processing, having experience with PyTorch (or similar deep learning frameworks) is a plus.
Contact: seyed.moosavi@epfl.ch



Investigation of Deep Convolutional Neural Networks on the Information Plane
Master/Semester Project
Given their success and wide range of applications, there is a pressing need for a comprehensive understanding of learning with Deep Convolutional Neural Networks (DCNNs). A recent approach [1,2] studies the information paths of Deep Neural Networks (DNNs) in the information plane, where, for each layer, the mutual information between its representation and the input is plotted against the mutual information between its representation and the output throughout the learning procedure. Despite bringing a new perspective and revealing details about the inner workings of DNNs, these two papers experiment only with very small-scale fully-connected feedforward networks for classification, for which the distributions are easy to compute. To adapt this framework to real-world problems, one needs to apply estimation methods to compute the mutual information and the distributions required for the analysis. The aim of this project is to analyze simple but more realistic DCNNs, like LeNet-5 [3], in the information plane by finding and applying proper and efficient estimation techniques. A possible extension would be to examine how skip connections [4] or routing between capsules, which are groups of neurons proposed in [5] to achieve equivariance, contribute to the learning process.
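As a rough illustration of what "estimating mutual information" involves, the sketch below computes a plug-in estimate from paired discrete samples, with a simple binning helper for continuous activations. Binning is only one possible estimator and is used here as an assumption for illustration; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def discretize(values, n_bins, lo, hi):
    """Map continuous values in [lo, hi] to integer bin indices 0..n_bins-1."""
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

# Toy data: a variable shares 1 bit with itself and 0 bits with an independent one.
labels = [0, 1, 0, 1]
same = mutual_information(labels, labels)
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
bins = discretize([0.1, 0.9, 1.0], n_bins=2, lo=0.0, hi=1.0)
```

Plug-in estimates of this kind are biased when the number of bins is large relative to the number of samples, which is one reason more careful estimators are needed for convolutional layers with many units.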
[1] Tishby, N., & Zaslavsky, N. (2015, April). Deep learning and the information bottleneck principle. In Information Theory Workshop (ITW), 2015 IEEE (pp. 1–5). IEEE.
[2] Shwartz-Ziv, R., & Tishby, N. (2017). Opening the Black Box of Deep Neural Networks via Information. arXiv preprint arXiv:1703.00810.
[3] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
[4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770–778).
[5] Sabour, S., Frosst, N., & Hinton, G. E. (2017). Dynamic Routing Between Capsules. In Advances in Neural Information Processing Systems (pp. 3857–3867).
Requirements: Sufficient familiarity with machine learning and probability. Experience with a deep learning library and good knowledge of the corresponding programming language (preferably Python).
Contact: beril.besbinar@epfl.ch



Interpretable machine learning in personalised medicine
Master/Semester Project
Modern machine learning models mostly act as black boxes, and their decisions cannot be easily inspected by humans. To trust automated decision-making, we need to understand the reasons behind predictions and gain insights into the models. This can be achieved by building models that are interpretable. Recently, different methods have been proposed for data classification, such as augmenting the training set with useful features [1], visualizing the intermediate features in order to understand the input stimuli that excite individual feature maps at any layer in the model [2, 3], or introducing logical rules in the network that guide the classification decision [4], [5]. The aim of this project is to study existing algorithms that attempt to interpret deep architectures by studying the structure of their inner layer representations and, based on these methods, to find patterns for classification decisions along with coherent explanations. The studied algorithms will mostly be considered in the context of personalised medicine applications.
[1] R. Collobert, J. Weston, L. Bottou, M. M. Karlen, K. Kavukcuoglu, and P. Kuksa, “Natural language processing (almost) from scratch,” J. Mach. Learn. Res., vol. 12, pp. 2493–2537, Nov. 2011.
[2] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps,” arXiv:1312.6034, 2013.
[3] L. M. Zintgraf, T. S. Cohen, T. Adel, and M. Welling, “Visualizing deep neural network decisions: Prediction difference analysis,” arXiv:1702.04595, 2017.
[4] Z. Hu, X. Ma, Z. Liu, E. Hovy, and E. Xing, “Harnessing deep neural networks with logic rules,” in ACL, 2016.
[5] Z. Hu, Z. Yang, R. Salakhutdinov, and E. Xing, “Deep neural networks with massive learned knowledge,” in Conf. on Empirical Methods in Natural Language Processing, EMNLP, 2016.
Requirements: Familiarity with machine learning and deep learning architectures. Experience with a deep learning library and good knowledge of the corresponding programming language (preferably Python) is a plus.
Contact: mireille.elgheche@epfl.ch



Depth estimation via deep learning: the light field case
Master Project
The depth estimation problem is concerned with inferring the depth of a scene from multiple pictures of it, all captured from different points of view. Depth estimation is a critical problem in Computer Vision, as it is necessary for addressing higher-level problems such as 3D reconstruction, semantic segmentation, action recognition, etc. Typically, since depth estimation requires multiple pictures of the same scene to be available, the scene has to be still while the user moves around and captures the necessary pictures: this represents a significant limitation. On the other hand, Light Field cameras [1] can capture hundreds of images of the same scene from slightly different points of view in just a single exposure; therefore, depth estimation from light field camera data has attracted considerable interest in the last decade [2][3].
Due to the recent success of Deep Neural Networks in Computer Vision tasks, the student will address the light field depth estimation problem within the framework of Deep Learning. Although some preliminary work in this direction already exists [3], this research track is still at its beginning. The student will start by studying the recent deep learning-based depth estimation algorithms for the standard stereo setup [4] (two pictures available) and the multi-view stereo setup [5] (three or more pictures available). Then, the student will consider their extension to the particular scenario of light field cameras. On one hand, light field cameras provide a much higher number of pictures than those considered in the traditional stereo and multi-view stereo setups; therefore, the computational and memory complexity will have to be carefully taken into account. On the other hand, the light field camera pictures exhibit a very regular structure [2] that can be largely exploited in the depth estimation task. The depth estimation results will be evaluated on a state-of-the-art light field benchmark [6].
References:
[1] Raytrix website (https://raytrix.de/)
[2] S. Wanner and B. Goldluecke, Globally consistent depth labeling of 4D light fields, IEEE CVPR, pp. 41–48, 2012
[3] S. Changa et al., EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth from Light Field Images, IEEE CVPR, 2018
[4] W. Luo et al., Efficient deep learning for stereo matching, IEEE CVPR, pp. 5695–5703, 2016
[5] H. Po-Han et al., DeepMVS: Learning Multi-View Stereopsis, IEEE CVPR, 2018
[6] Light field depth estimation benchmark (http://hcilightfield.iwr.uniheidelberg.de/)
Requirements: Knowledge of MATLAB and/or Python. Basic knowledge of optimization and machine learning. Knowledge of DCNN libraries (e.g., Caffe) and image processing is a plus.
Contact: mattia.rossi@epfl.ch



Comparative study of CNNs and human visual system under the effect of clutter
Master Project
Convolutional Neural Networks (CNNs) are feedforward, hierarchical architectures that achieve extremely accurate classification of natural images. However, there are differences in the visual aspects captured by CNNs and by the human visual system. This project aims to carry out a comparative study of the behaviour of CNNs and humans in classifying images under the effect of crowding (clutter). According to [1], although the performance of the human visual system decreases in the presence of crowding, the performance can be progressively regained in the presence of more crowding with specific characteristics (color, size, orientation). The main goal of this project will be to investigate whether CNNs exhibit similar behaviour. The student will learn how to:
i) create a dataset to train the CNN
ii) train the CNN so that it classifies accurately the images without the presence of clutter
iii) design experiments (similar to those in [1]) in order to test the performance of the trained CNN in the presence of clutter
iv) provide comments on the results
References:
[1] Michael H. Herzog, Mauro Manassi, Uncorking the bottleneck of crowding: a fresh look at object recognition, Current Opinion in Behavioral Sciences, vol. 1, pp. 86–93, 2015.
[2] Yann LeCun, Yoshua Bengio and Geoffrey Hinton, Deep Learning, Nature 521, no. 7553 (2015): 436–444.
[3] Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012.
Requirements: Good knowledge of Matlab or Python, sufficient familiarity with machine learning and image processing. Having experience with deep learning is a plus.
The project will be co-supervised with the Laboratory of Psychophysics from the School of Life Sciences.
Contact: effrosyni.simou@epfl.ch



Transformation invariant deep learning systems
Semester Project
Deep learning systems have achieved remarkable results in recent years. A network is able to learn meaningful features for machine learning tasks from raw data. It is even able to learn a data representation that tolerates some transformations, thanks to the max-pooling operator. However, this operator is not sufficient to handle the ambiguity in natural data.
In this project, we propose to study existing algorithms that build transformation-invariant features with deep networks and, based on these methods, to build a rotation- and translation-invariant system for an image classification task.
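The limited invariance offered by max-pooling can be seen on a one-dimensional toy signal: a shift that stays inside a pooling window leaves the output unchanged, while a slightly larger shift does not. This is only an illustrative sketch; the window size and signals are arbitrary choices, and real networks use 2D pooling layers from a deep learning library.

```python
def max_pool_1d(signal, size):
    """Non-overlapping max pooling with window and stride both equal to `size`."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]

# A single peak, the same peak shifted by one sample (still in the first window),
# and the peak shifted by two samples (now in the second window).
signal = [0, 0, 9, 0, 0, 0, 0, 0]
shifted_inside = [0, 0, 0, 9, 0, 0, 0, 0]
shifted_across = [0, 0, 0, 0, 9, 0, 0, 0]

pooled = max_pool_1d(signal, 4)
pooled_inside = max_pool_1d(shifted_inside, 4)   # identical to `pooled`
pooled_across = max_pool_1d(shifted_across, 4)   # differs from `pooled`
```

Pooling therefore absorbs only small translations and does nothing for rotations, which is why dedicated invariant architectures are of interest.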
Requirements: Programming skills, signal processing. Knowledge in deep learning is a plus.
Contact: renata.khasanova@epfl.ch



Omnidirectional vision
Semester/Master Project
Omnidirectional cameras have a 360-degree field of view. Therefore, they are powerful tools for object detection and classification tasks [1, 2]. For example, a single omnidirectional camera can be used to capture traffic data based on the information about vehicles on the roads, or it can be mounted on a drone and used for object detection and collision avoidance. Despite the broad variety of applications where omnidirectional vision can be used, to the best of our knowledge there are only a few methods that perform object classification directly on omnidirectional images. Most methods first transform the data into classical images, which increases the computational complexity and may cause information loss.
We propose to use a deep architecture [3] to tackle the classification problem for omnidirectional images. In this project, the student will be required to develop an algorithm for object classification that takes the omnidirectional camera’s lens geometry into account.
References:
[1] H.C. Karaimer, Y. Bastanlar. Detection and classification of vehicles from omnidirectional videos using temporal average of silhouettes. In Proceedings of the Int. Conference on Computer Vision Theory and Applications (2015), pp. 197–204.
[2] L.F. Posada, K.K Narayanan, F. Hoffmann, T. Bertram. Semantic classification of scenes and places with omnidirectional vision. In European Conference on Mobile Robots (ECMR), IEEE (2013), pp. 113–118.
[3] Krizhevsky, A., Sutskever, I., and Hinton, G. E. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (2012), pp. 1097–1105.
Requirements: Basic knowledge in computer vision, neural networks, signal processing, programming skills.
Contact: renata.khasanova@epfl.ch



Light field depth estimation meets deep learning in the epipolar image domain
Master Project
A Light Field camera [1] looks exactly like your favorite point-and-shoot camera; however, its smart design makes it possible to capture multiple pictures at the same time. These pictures capture the scene from slightly different points of view, thus recording its 3D structure. With some computation, it is possible to exploit these pictures for multiple applications: refocusing the captured pictures at an arbitrary depth, measuring the depth of the objects in the scene, or even building a 3D model of the scene.
The first step toward these applications is depth estimation. Depth estimation [2] is a fundamental problem in Computer Vision and consists of assigning a depth value to each object (pixel) in a captured picture. Typically, multiple pictures of the same scene are necessary to solve this problem; therefore, the scene or subject has to be still while the user moves around and captures the pictures. On the other hand, a light field camera captures all the required images in a single shot. In addition, the set of images captured by a light field camera, referred to as the Light Field, exhibits a very particular structure: the depth associated with each pixel in the light field is in a one-to-one relation with the slope of a line in the Epipolar Image Domain representation of the light field [3]. Therefore, in the context of light fields, the depth estimation problem reduces to a much simpler slope detection problem.
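The slope-to-depth relation can be sketched under strong simplifying assumptions: an idealized row of parallel pinhole views with uniform spacing, where a scene point moves by a constant number of pixels from one view to the next. The slope of that line is the disparity between adjacent views, and depth follows from the standard relation Z = f * b / d. The focal length, baseline and tracked positions below are hypothetical values, and real light field cameras may use a different parameterization and sign convention.

```python
def fit_slope(view_indices, pixel_positions):
    """Least-squares slope of pixel position versus view index (one epipolar line)."""
    n = len(view_indices)
    mean_s = sum(view_indices) / n
    mean_u = sum(pixel_positions) / n
    num = sum((s - mean_s) * (u - mean_u)
              for s, u in zip(view_indices, pixel_positions))
    den = sum((s - mean_s) ** 2 for s in view_indices)
    return num / den

def depth_from_slope(slope_px_per_view, focal_px, baseline_m):
    """Pinhole relation Z = f * b / d, with d the disparity between adjacent views."""
    return focal_px * baseline_m / abs(slope_px_per_view)

# Hypothetical track of one scene point across five horizontally adjacent views.
views = [0, 1, 2, 3, 4]
positions = [100.0, 98.0, 96.0, 94.0, 92.0]

slope = fit_slope(views, positions)
depth = depth_from_slope(slope, focal_px=500.0, baseline_m=0.01)
```

A learned approach would replace the explicit line fit with a network that predicts the slope at every pixel directly from epipolar image patches.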
In this project, the student will first study the special structure of light field data. Then, the student will take advantage of the effectiveness of Deep Neural Networks in detecting patterns in the data to determine the line slopes, and therefore the light field depth map. The student will design a network architecture suitable for the considered task and will have to take into account the high dimensionality of the light field data, which may make it impossible to process the full light field at once. The depth estimation results will be evaluated on a state-of-the-art light field benchmark [4].
References:
[1] Raytrix website (https://raytrix.de/)
[2] M. Bleyer and C. Breiteneder, Stereo Matching: State-of-the-Art and Research Challenges, Advanced Topics in Computer Vision, Springer, pp. 143–179, 2013
[3] S. Wanner and B. Goldluecke, Globally consistent depth labeling of 4D light fields, IEEE CVPR, pp. 41–48, 2012
[4] Light field depth estimation benchmark (http://hcilightfield.iwr.uniheidelberg.de/)
Requirements: Fundamentals of image processing, basic knowledge of deep learning and machine learning, Python, PyTorch or TensorFlow.
Contact: mattia.rossi@epfl.ch



Conditional Wasserstein GANs for Video Prediction
Master Thesis Project
The capability of predicting the future requires an in-depth understanding of the physical and causal rules that govern the world. Despite appealing applications such as robotic planning or representation learning, predicting raw observations, such as video frames, is inherently challenging due to the high dimensionality of the data, as well as the difficulty of modelling the uncertainties. The first attempts to solve the prediction problem lacked the ability to generate different plausible frames due to the deterministic nature of the proposed schemes. Recently, two different approaches have dominated the literature: i) modelling latent variables as probability distributions to capture the stochasticity (VAE-like approaches), ii) using adversarial training to improve the quality of the generated frames (GAN-like approaches). Many works from both tracks use the stochastic variable as an input to the generator by sampling from a learned distribution, independently throughout time.
We would like to model stochasticity that is conditioned not only on the input data, e.g. the available frames, but also on the previously sampled variables, and hence on the generated frames. Such a framework requires the derivation of a loss function where the inference of the latent distribution might have some Markovian characteristics. The proposed framework will be analysed in a video frame prediction application.
References:
[1] Wasserstein Autoencoders: https://arxiv.org/pdf/1711.01558.pdf
[2] Stochastic Adversarial Video Prediction: https://arxiv.org/pdf/1804.01523.pdf
[3] Sequential Neural Models with Stochastic Layers: https://arxiv.org/pdf/1605.07571.pdf
Requirements: Fundamentals of linear algebra, fundamentals of image processing, basic knowledge of deep learning and machine learning, Python, PyTorch or TensorFlow.
Contact: beril.besbinar@epfl.ch

Omnidirectional stereo: Patch Match meets Deep Learning
Semester/Master Project
Omnidirectional cameras capture videos with a 360-degree field of view. Thanks to a Head Mounted Display, the user can be placed in the middle of the scene and experience a much deeper immersion than with a traditional video. At this point, although the user can look around the scene while the video plays, their point of view is bound to that of the camera: the user cannot yet take a step in an arbitrary direction in the scene. Interestingly, coupling two omnidirectional cameras makes it possible to estimate the geometry of the scene, thus paving the way to 3D reconstruction and opening the possibility for the user to navigate the scene freely. Preliminary work in this direction has already been carried out in [1][2].
The problem of geometry estimation from pairs of traditional perspective cameras is referred to as Stereo Matching [3] and is a long-studied problem, for which fast and effective methods exist. Omnidirectional stereo, instead, is a very recent research track. In this project, the student will first become familiar with PatchMatch Stereo [4][5], a fast, effective, yet simple stereo matching algorithm, and then extend it to the case of omnidirectional camera pairs. Moreover, PatchMatch uses a Euclidean-based metric to match the views captured by the two cameras and is therefore sensitive to light changes and occlusions. The student will consider replacing it with a metric computed by a Deep Neural Network [6], in an attempt to obtain a more robust algorithm.
References:
[1] C. Schroers et al., An Omnistereoscopic Video Pipeline for Capture and Display of Real-World VR, ACM Transactions on Graphics, Special Issue on Production Rendering, vol. 37, no. 3, August 2018
[2] H. Jingwei et al., 6-DOF VR videos with a single 360-camera, IEEE Virtual Reality, 2017
[3] M. Bleyer and C. Breiteneder, Stereo Matching: State-of-the-Art and Research Challenges, Advanced Topics in Computer Vision, Springer, pp. 143–179, 2013
[4] M. Bleyer et al., PatchMatch Stereo – Stereo Matching with Slanted Support Windows, BMVC, pp. 14.1–14.11, 2011
[5] E. Zheng et al., PatchMatch Based Joint View Selection and Depthmap Estimation, IEEE CVPR, pp. 1510–1517, 2014
[6] J. Zbontar and Y. LeCun, Stereo matching by training a convolutional neural network to compare image patches, Journal of Machine Learning Research, vol. 17, pp. 1–32, 2016
Requirements: Fundamentals of linear algebra, fundamentals of image processing, basic knowledge of deep learning and machine learning, Python, PyTorch or TensorFlow.
Contact: mattia.rossi@epfl.ch

Image/Video Coding and Communication


Video compression for moving cameras
Semester/Master Project
Video streaming from mobile platforms such as drones is a widespread application that has become increasingly popular in recent years. Efficient video coding standards are used nowadays to compress videos onboard the platform. However, despite impressive progress from both the practical and the theoretical perspective, a full theoretical model of the rate-distortion function (that is, the relationship between the compressed video data rate and quality) for natural videos taken by moving cameras is still missing. This project aims at gathering and processing data in a fully controlled environment in order to move one step closer to the development of such a model.
The project is divided into three parts.
• In the first part, raw images will be taken under different conditions in a controlled environment using a camera.
• In the second part, the data will be post-processed by developing a predictive video coder (e.g. in Matlab).
• In the third part, fitting to an existing model will be carried out.
Requirements: knowledge of Matlab, basic knowledge of image processing (e.g., discrete cosine transform, quantization) or signal processing, precision and motivation
Contact: giuseppe.cocco@epfl.ch

Peer-assisted adaptive streaming of omnidirectional videos
Semester/Master Project
The current state of the art in multimedia streaming over the Internet is mainly based on HTTP Adaptive Streaming (HAS) techniques [1], which provide a standard delivery framework that allows interoperability between different devices and servers while optimizing bandwidth consumption. The key concept behind adaptive streaming is that the same video content is encoded at different resolutions/encoding rates and stored on streaming servers, and each user requests over time the desired version of the content according to its download capacity. The emergence of new media formats supporting improved user experiences, such as omnidirectional and multi-view videos, however, requires the transmission of a huge amount of data and imposes many new challenges on the current HAS infrastructures. One possibility for improving the state of the art when transmitting those new media formats is the use of peer-assisted delivery techniques [2].
This project aims at exploring the use of peer-assisted adaptive streaming in the scope of omnidirectional video. The main goal is to study how the current HAS formats and adaptation logic can be modified to support an optimized omnidirectional video delivery chain and consequently improve the experience of users consuming such content.
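To make the idea of client-side adaptation concrete, the following sketch shows a simple throughput-based rule: the client smooths its measured download rate and requests the highest encoded bitrate that fits within a fraction of it. This is a generic illustration rather than the logic of any particular player or standard; the bitrate ladder, the safety factor and the smoothing weight are assumed values.

```python
def update_throughput(previous_kbps, segment_bits, download_seconds, alpha=0.5):
    """Exponentially smoothed throughput estimate (kbps) after one segment download."""
    sample_kbps = segment_bits / download_seconds / 1000.0
    return alpha * sample_kbps + (1.0 - alpha) * previous_kbps

def select_bitrate(available_kbps, throughput_kbps, safety=0.8):
    """Highest available bitrate not exceeding a safety fraction of the throughput.

    Falls back to the lowest representation when nothing fits the budget.
    """
    budget = safety * throughput_kbps
    candidates = [rate for rate in sorted(available_kbps) if rate <= budget]
    return candidates[-1] if candidates else min(available_kbps)

# Hypothetical bitrate ladder (kbps) advertised by the server.
ladder = [500, 1500, 3000, 6000]

estimate = update_throughput(previous_kbps=2000.0,
                             segment_bits=8_000_000,
                             download_seconds=2.0)
choice = select_bitrate(ladder, estimate)
fallback = select_bitrate(ladder, 400.0)
```

For omnidirectional video the same decision would typically be made per spatial region of the sphere, and a peer-assisted design adds the question of which source (server or peer) each segment is fetched from.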
References:
[1] Stockhammer, Thomas. “Dynamic adaptive streaming over HTTP–: standards and design principles.” Proceedings of the second annual ACM conference on Multimedia systems. ACM, 2011.
[2] Streamroot white paper: Peer-Assisted Adaptive Streaming; http://files.streamroot.io/public/whitepapers/StreamrootWhitepaperPeerAssistedAdaptiveStreaming.pdf
[3] YouTube 360° https://www.youtube.com/channel/UCzuqhhs6NWbgTzMuM09WKDQ
Requirements: Basic knowledge of multimedia networking, basic programming skills.
Contact: roberto.azevedo@epfl.ch



Visual quality of omnidirectional images and videos
Semester/Master Project
Many algorithms (i.e., objective quality metrics) have been proposed in the literature to quantify the visual quality of a digital image or video sequence, as perceived by the end user. These metrics are extremely useful for optimising the different steps of the digital processing chain, such as acquisition, compression, transmission, and rendering, so as to maximise the perceptual quality of the signal presented to the multimedia user [1].
Cameras that capture omnidirectional (i.e., 360°) digital images and video sequences have started to appear as commercial products [2][3]. The expected spread of applications involving omnidirectional content will require the optimisation of its processing chain, so algorithms able to quantify the visual quality of omnidirectional images and videos will soon be needed. While many quality metrics exist for classical non-omnidirectional images and videos, the quality assessment of this new kind of visual media is an open research challenge. The goal of this project is to study the quality perception of omnidirectional content and to design algorithms that assess its quality.
References:
[1] Wang and Bovik, “Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures”, IEEE Signal Processing Magazine, 2009
[2] https://theta360.com/
[3] http://gopro.com/odyssey
Requirements: Matlab programming skills, image processing
Contact: roberto.azevedo@epfl.ch



Omnidirectional image and video processing
Semester/Master Project
Cameras that capture omnidirectional (i.e., 360°) images and video sequences have started to appear as commercial products [1][2]. An omnidirectional image can be seen as a 360° viewing sphere, since the real-world environment surrounding the camera is captured in all directions.
In order to be processed with widely used algorithms designed for standard rectangular planar images, the viewing sphere is often mapped to a plane, resulting in a so-called panoramic image [3]. The alternative to this approach is to process the signal directly in its original spherical domain. The goal of this project is to study a framework that adapts classical digital image and video processing techniques to the omnidirectional scenario, where signals are defined on the surface of a sphere.
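To make the sphere-to-plane mapping mentioned above concrete, here is a minimal Python sketch of one common choice, the equirectangular projection. The text does not state which projection is used, so this choice, the function names and the image size are assumptions made for illustration only.

```python
import math


def sphere_to_equirect(lon, lat, width, height):
    """Map longitude in [-pi, pi) and latitude in [-pi/2, pi/2] to pixel coordinates.

    Assumes an equirectangular panorama with the top row at the north pole.
    """
    u = (lon + math.pi) / (2 * math.pi) * width
    v = (math.pi / 2 - lat) / math.pi * height
    return u, v


def equirect_to_sphere(u, v, width, height):
    """Inverse mapping: pixel coordinates back to (longitude, latitude)."""
    lon = u / width * 2 * math.pi - math.pi
    lat = math.pi / 2 - v / height * math.pi
    return lon, lat


# The point straight ahead on the equator lands at the centre of the panorama.
print(sphere_to_equirect(0.0, 0.0, 2048, 1024))
```

In this mapping, rows near the poles are stretched relative to the equator: every row has the same number of pixels although the corresponding circles on the sphere shrink. This is one reason why planar algorithms may behave poorly on panoramic images.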
References:
[1] https://theta360.com/
[2] http://gopro.com/odyssey
[3] David Salomon, “Transformation and projection in Computer Graphics”
Requirements: Matlab programming skills, signal processing, basics of trigonometry.
Contact: roberto.azevedo@epfl.ch



Plenoptic sampling
Master Project
With the recent advances in 3D scene representation with camera networks, as in 3DTV or Free-viewpoint TV, the problem of camera positioning is becoming very important. Optimal camera placement increases the 3D reconstruction precision and augments the compression performance of multiview images. The problem of camera placement can be formulated as the plenoptic sampling problem, where the best sampling of the viewpoints is sought. The plenoptic function captures the luminance and chrominance properties of a light ray in any direction, at any time instant and from any viewing point.
This project aims at proposing a novel representation and parameterization of the plenoptic function that can efficiently capture the underlying geometry in 3D scenes. As a second part of this project, the camera positioning problem will be formulated as a sampling problem in the transform domain defined by the novel geometric representation of the plenoptic function.
Requirements: signal processing, Matlab (C/C++)
Contact: pascal.frossard@epfl.ch

Study and implementation of a channel coding scheme for delay-constrained applications
Semester/Master Project
Multimedia streaming over wireless channels will play a fundamental role in the next generation of mobile communication networks. Real-time image and video streaming is particularly challenging due to the strict constraints in terms of delay. Such constraints are especially strict when the video is streamed from a drone and is used as a reference by the pilot to control the machine. The goal of the project is to develop a channel coding scheme which can cope with limited channel state information at the transmitter while providing high reliability and meeting the strict delay constraints of real-time multimedia streaming. The implementation will be based on LDPC codes and multiuser detection principles.
Requirements: knowledge of channel coding principles, good programming skills (Matlab and C/C++).
Contact: giuseppe.cocco@epfl.ch

Signal Processing on Graphs


Learn to encode graphs using a “dummy task”
Master Semester/Thesis Project
From molecules or proteins to social networks, a lot of data is structured as a graph. Graphs are not trivial to process, since most machine learning algorithms only accept tensors as input. A challenge in this domain is thus to learn how to represent or encode a graph into a vector [1]. The state-of-the-art method [2] consists in maximising the mutual information between the encoded vector and local statistics in the graph, but little is known about the quality of the resulting encoding.
The problem of encoding and decoding data is not restricted to graphs: it is also at the core of Generative Adversarial Networks (GANs). A recent idea that worked successfully for GANs was to force the discriminator to learn global features, rather than focus only on local statistics. A simple trick to achieve that is to randomly rotate the input images and ask the discriminator to predict by how much each image has been rotated [3].
This project consists in building a similar method for graphs: find a “dummy task” that requires computing global features in order to create better encodings. There are many potential applications, depending on the student’s interests: predicting chemical properties, classifying documents by topic, traffic forecasting…
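The image rotation trick described above can be sketched in a few lines of Python. This is only an illustration of how such self-supervised labels are produced in the image domain; the helper names are hypothetical, images are plain nested lists, and the graph analogue of this task is precisely what the project is meant to find.

```python
import random


def rotate90(image, k):
    """Rotate a 2D list-of-lists image counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


def make_rotation_batch(images, rng):
    """Return (rotated_images, labels); label k means a rotation of k * 90 degrees."""
    rotated, labels = [], []
    for img in images:
        k = rng.randrange(4)
        rotated.append(rotate90(img, k))
        labels.append(k)
    return rotated, labels


rng = random.Random(0)
images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
batch, labels = make_rotation_batch(images, rng)
print(labels)
```

The labels come for free from the transformation itself, so a network trained to predict them needs no manual annotation, yet it has to look at the global layout of the image to succeed.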
References:
[1]: W. L. Hamilton et al., Representation learning on graphs: Methods and Applications
[2]: P. Velickovic et al., Deep Graph Infomax
[3]: S. Gidaris et al., Unsupervised representation learning by predicting image rotations
Requirements: Good knowledge of Python, at least one machine learning course.
Contact: clement.vignac@epfl.ch

Learn directions from signals on graphs to improve traffic forecasting
Master Semester/Thesis Project
A natural way to model traffic is to consider the density and velocity of cars as a signal on a transportation network [1]. Traffic forecasting is a challenging problem because spatio-temporal interactions must be accounted for in a nonlinear way. This problem is currently solved using graph neural networks [2]. However, these networks have some limitations: for example, when applied to a grid, they can only learn spherical filters [3], which form a very small class of functions.
Because we know that the transportation network lies on a 2D manifold, we can try to improve graph neural networks by adding a notion of direction on the graph. One way to do so is to use the graph signal itself.
References:
[1]: Y. Li et al., Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
[2]: Defferrard, Michaël, Xavier Bresson, and Pierre Vandergheynst. “Convolutional neural networks on graphs with fast localized spectral filtering.” Advances in Neural Information Processing Systems. 2016.
[3]: R. Levie et al., CayleyNets: Graph Convolutional Neural Networks With Complex Rational Spectral Filters
[4]: Velickovic, Petar, et al. “Graph attention networks.” arXiv preprint arXiv:1710.10903 (2017).
Requirements: Good knowledge of Python, at least one machine learning course.
Contact: clement.vignac@epfl.ch

Informed Source Separation for Multi-Modal Graph Signals
Master Semester Project
We consider that the processes developing at each layer of a multigraph data representation are distinct and separable, yet we are only able to perceive an unknown combination of them through the observations. This assumption allows a graph signal to be decomposed into distinct components living on different structures, so that we may describe the signal components structured by each graph layer separately and write the observations as a linear combination of them.
In quest of the sources of the observations, we can define a problem in the multigraph setting. Such a problem ultimately aims to estimate the spectral models acting on each graph layer and to reveal the sources of the observations. Specifically, we will consider applications in heat source separation problems, which can be represented by spreading processes on networks.
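The observation model described above can be illustrated with a small forward simulation in Python: two heat sources spread on two different graph layers that share the same nodes, and only the sum of the diffused signals is observed. All names, the toy graphs, the diffusion time and the use of a plain explicit Euler scheme are assumptions chosen for illustration; the sketch does not attempt the separation itself.

```python
def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected, unweighted graph."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def diffuse(L, x0, tau, steps=1000):
    """Approximate exp(-tau * L) x0 with explicit Euler steps of size tau / steps."""
    x = list(x0)
    dt = tau / steps
    n = len(x)
    for _ in range(steps):
        Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi - dt * lxi for xi, lxi in zip(x, Lx)]
    return x


n = 4
layer_a = laplacian(n, [(0, 1), (1, 2), (2, 3)])  # hypothetical layer: a path graph
layer_b = laplacian(n, [(0, 2), (1, 3)])          # hypothetical layer: other edges, same nodes
source_a = [1.0, 0.0, 0.0, 0.0]                   # unit heat source at node 0
source_b = [0.0, 0.0, 0.0, 1.0]                   # unit heat source at node 3

# Only the combination of the two diffused components is observed.
observation = [a + b for a, b in zip(diffuse(layer_a, source_a, 0.5),
                                     diffuse(layer_b, source_b, 0.5))]
print(observation)
```

The separation problem runs in the opposite direction: given `observation` and the layer structures, recover the diffusion models and the sources.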
References:
[1] Pena, R., Bresson, X., & Vandergheynst, P. (2016, July). Source localization on graphs via ℓ1 recovery and spectral graph theory. In 2016 IEEE 12th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP) (pp. 1–5). IEEE.
[2] Cencetti, Giulia, and Federico Battiston. “Diffusive behavior of multiplex networks.” New Journal of Physics (2019).
[3] Gomez, Sergio, et al. “Diffusion dynamics on multiplex networks.” Physical review letters 110.2 (2013): 028701.
[4] Sole-Ribalta, Albert, et al. “Spectral properties of the Laplacian of multiplex networks.” Physical Review E 88.3 (2013): 032807.
Requirements: Python, basics of graph signal processing: graph signal filtering (covered in EE558 Network Tour of Data Science), familiarity with basic machine learning tools.
Contact: eda.bayram@epfl.ch

Deep Learning for time-varying networked data
Master Semester/Thesis Project
In this era of data deluge, we are overwhelmed with massive volumes of extremely complex datasets. Data generated today is complex because it lacks a clear geometric structure, comes in great volumes, and often contains information from multiple domains. The emerging fields of deep learning and graph signal processing (GSP) have attracted a lot of attention as potential tools to overcome these challenges [1].
In this context, graph convolutional neural networks (GCNNs) have recently been extended to work with multidomain graph signals [2], e.g., time-varying signals. This defines a new framework for graph signals defined on top of several domains, such as electroencephalograms or traffic networks. The new convolutional layer can be efficiently implemented to run on a GPU and has shown promising generalization abilities on synthetic datasets.
In this project, we propose to apply this new type of time-varying GCNN (TV-GCNN) to real data, and to analyze the impact that different architectures based on this convolutional layer have on the classification accuracy of time-varying graph signals. Students are encouraged to find an application of their liking, but we provide the following list of suggestions:
– EEG decoding [3]: A hot topic in neuroscience is the classification of brain signals for the development of brain-computer interfaces (BCI). In this context, EEG signals can be viewed as time-varying graph processes where the spatial domain is represented by a graph of electrodes placed on the human skull.
– Traffic prediction: Monitoring traffic in a city is a major concern for local governments. TV-GCNNs could be a potential tool for traffic prediction.
– Characterization of epidemics [4]: One of the main issues in epidemiology is the statistical characterization of the dynamics of a disease outbreak given the recorded time variations of the disease state around the world. To tackle this problem, epidemiologists rely on complicated systems of coupled nonlinear differential equations that are governed by a few characteristic parameters, e.g., probability of contagion, illness duration, or death rate. In this context, TV-GCNNs could be used as powerful regressors to infer these parameters from data.
References:
[1] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, “Geometric deep learning: going beyond Euclidean data,” IEEE Signal Process. Mag., vol. 34, no. 4, pp. 18–42, 2017.
[2] G. Ortiz-Jiménez, “Multidomain Graph Signal Processing: Learning and Sampling,” Master’s thesis, Delft University of Technology, Aug 2018.
[3] R. T. Schirrmeister, J. T. Springenberg, L. D. J. Fiederer, M. Glasstetter, K. Eggensperger, M. Tangermann, F. Hutter, W. Burgard, and T. Ball, “Deep learning with convolutional neural networks for EEG decoding and visualization,” Hum Brain Mapp, vol. 38, no. 11, pp. 5391–5420, 2017.
[4] R. Anderson and R. May, Infectious Diseases of Humans: Dynamics and Control, ser. Dynamics and Control. OUP Oxford, 1992.
Requirements: Good knowledge of Python, sufficient familiarity with machine learning and graph theory. Experience with a deep learning library (preferably TensorFlow) is a plus.
Contact: guillermo.ortizjimenez@epfl.ch

Network inference from a noisy graph structure
Master Semester/Thesis Project
Graph structures carry a lot of information on the interactions between data, and can be very useful in data analysis or interpretation. However, available structures are often very noisy or not entirely representative of the data. Meanwhile, graph inference algorithms can provide very good estimates of this structure, but need a large amount of data to learn from.
The goal of this project is to propose a graph inference algorithm for applications where less data is available, but a noisy graph estimate already exists. This can be an empirically constructed geometric graph or an available graph that does not necessarily capture the full information (e.g., a social network graph, where the same type of connection can represent both a very close friendship and a link between people who have never met in person).
References:
[1] Dong, Xiaowen, et al. “Learning Laplacian matrix in smooth graph signal representations.” IEEE Transactions on Signal Processing 64.23 (2016): 6160–6173.
[2] Kalofolias, Vassilis. “How to learn a graph from smooth signals.” Artificial Intelligence and Statistics. 2016.
[3] Nguyen, Viet Anh, Daniel Kuhn, and Peyman Mohajerin Esfahani. “Distributionally Robust Inverse Covariance Estimation: The Wasserstein Shrinkage Estimator.” arXiv preprint arXiv:1805.07194 (2018).
Requirements: Python, basics of probability theory, linear algebra, optimization is a plus
Contact: hermina.petricmaretic@epfl.ch or mireille.elgheche@epfl.ch

Graph learning with a degree distribution prior
Master Semester/Thesis Project
Graphs are flexible structures that represent connections between data, whether these are connections between people in a social network or connectivity networks that govern processes in our brains. Very often (as is the case in brain networks) these graph structures are not readily available and need to be inferred. Graph learning methods have become very popular in the last few years, with solutions covering many different models of data behaviour on the graph.
However, most solutions ignore any information we might have on the network structure. For example, additional knowledge about the graph can be included in terms of a degree distribution prior. The goal of this work is to propose a method for graph inference that incorporates prior information on the degree distribution of the nodes.
References:
[1] Dong, Xiaowen, et al. “Learning Laplacian matrix in smooth graph signal representations.” IEEE Transactions on Signal Processing 64.23 (2016): 6160–6173.
[2] Tzikas, Dimitris G., Aristidis C. Likas, and Nikolaos P. Galatsanos. “The variational approximation for Bayesian inference.” IEEE Signal Processing Magazine 25.6 (2008): 131–146.
[3] Dong, Xiaowen, et al. “Learning Graphs from Data: A Signal Representation Perspective.” arXiv preprint arXiv:1806.00848 (2018).
Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.
Contact: hermina.petricmaretic@epfl.ch



Inference of multiple functional brain networks using Graph Laplacian Mixture Model
Master Semester Project/Thesis Project
Spontaneous brain activity, as measured through resting-state functional magnetic resonance imaging (fMRI), has provided key insights into the functional architecture of the brain. Global patterns of neural activity can be obtained by directly computing the statistical interdependence between different brain regions. The information can then be conveniently summarised into a functional connectome [1], and more intuitively represented as a set of functional brain networks. These networks are observed to be dynamic [2] and spatio-temporally overlapping [3]. In this project, the goal is to simultaneously separate signals corresponding to different phases and infer multiple functional brain networks. This will be done by building upon the emerging field of graph learning, specifically by utilising a Graph Laplacian mixture model [4], a generative model for mixed signals living on multiple networks.
References:
[1] Bullmore, E., Sporns, O., 2009. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10 (3), 186–198.
[2] Chang C, Glover GH (2010) Time-frequency dynamics of resting-state brain connectivity measured with fMRI. Neuroimage 50:81–98
[3] Karahanoğlu, F. I. & Van De Ville, D. (2015) Transient brain activity disentangles fMRI resting-state dynamics in terms of spatially and temporally overlapping networks. Nature communications 6
[4] Maretic, Hermina Petric, and Pascal Frossard. “Graph Laplacian mixture model.” arXiv preprint arXiv:1810.10053 (2018).
Requirements: Basics of (graph) signal processing, basics of machine learning, basics of linear algebra, basics of probabilities, Python/Matlab coding skills.
Contact: hermina.petricmaretic@epfl.ch



Improving classification performance of convolutional neural networks by changing the convolutional kernel shape
Semester/Master Project
Convolutional neural networks have become one of the most efficient tools for tasks such as classification. In such networks, a convolutional kernel is translated over an image in order to detect local patterns that may be characteristic of a particular class. In general, these kernels take the shape of an N×M window of pixels, which is quite arbitrary. The idea of the project is to explore different kernel shapes, possibly learnt from the data, that would allow classification improvements.
Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.
Contact: bastien.pasdeloup@epfl.ch



Finding the closest regular topology to a graph representing an irregular space
Semester/Master Project
Graphs are often used to represent irregular topologies, such as relations in a social network, roads of a city, brain connectivity, etc. Due to this irregularity, some operations such as the translation of a signal on a graph are quite hard to define. However, these operations can easily be defined for some particular graphs, such as an N-dimensional grid. Given a graph, the project consists in finding the “closest” regular space to approximate it, using particular properties of the adjacency matrices of such regular graphs (block circulancy, bandwidth…). Operations could then be performed on this space as a proxy for the irregular topology.
Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.
Contact: bastien.pasdeloup@epfl.ch



Defining heuristics for translations of a convolutional kernel on an irregular graph
Semester/Master Project
Graph convolutional neural networks are possible extensions of classical CNNs on images, allowing one to achieve better classification performance on irregular data. One of the approaches to extending CNNs to graphs consists in translating a convolutional kernel over a graph. However, translation is not easily defined on spaces that are not associated with a notion of distance. A possible way of mimicking translation on Euclidean spaces is to find an injective function on the graph that preserves neighborhoods within the kernel to translate (https://arxiv.org/abs/1710.10035). When the kernel is highly localized, such functions can be explored exhaustively to find those that minimize the kernel deformation upon translation. However, as the kernel grows in size, this approach becomes intractable due to the NP-completeness of finding interesting translations. In this project, we would like to explore possible heuristics to translate a convolutional kernel on a graph.
Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.
Contact: bastien.pasdeloup@epfl.ch



Exploring hyper graph signal processing
Semester/Master Project
Graph signal processing (GSP) has emerged as a possible extension of classical Fourier analysis, allowing one to study signals evolving on complex topologies. In this framework, graphs model the domains on which data evolve, by creating edges between related variables. However, graphs only capture one-to-one relationships and ignore more complex relations between three or more variables. Such relations can be modeled using a hypergraph, i.e., a graph in which edges can link more than two nodes. The goal of this project is to explore whether tools from GSP theory can be extended to such structures.
Requirements: Basics of (graph) signal processing, basics of machine learning, Python/Matlab coding skills.
Contact: bastien.pasdeloup@epfl.ch



Semi-Supervised Learning and Inpainting on Multi-Layer Graph Representations
Semester/Master Project
On partially labeled data, semi-supervised learning methods have been studied extensively by expressing the relations between the data entities within weighted graph representations [1]. The inpainting task, on the other hand, is usually defined on domains accompanying a signal content, and has been addressed with graph signal representations and operations [2]. Most of these studies focus on one type of relationship between pairs of data points during the construction of the graph structure. However, the connections between the entities may possess different types of relationships, which can be better represented by multiple graph structures. The objective of this project is to extend semi-supervised clustering and inpainting tasks to multi-layer graph settings, where each graph layer signifies a particular type of relation between vertices. This yields the same number of vertices in each graph layer, yet the topology (i.e., the weight matrix) differs from layer to layer because each layer focuses on a different relation.
References:
[1] Belkin, Mikhail, and Partha Niyogi. “Semi-supervised learning on Riemannian manifolds.” Machine learning 56.1–3 (2004): 209–239.
[2] Perraudin, Nathanaël, and Pierre Vandergheynst. “Stationary signal processing on graphs.” IEEE Transactions on Signal Processing 65.13 (2017): 3462–3477.
[3] Davide Eynard, Klaus Glashoff, Michael M Bronstein, and Alexander M. Bronstein. Multimodal diffusion geometry by joint diagonalization of Laplacians. arXiv preprint arXiv:1209.2295, 2012.
[4] Xiaowen Dong, Pascal Frossard, Pierre Vandergheynst, and Nikolai Nefedov. Clustering on multilayer graphs via subspace analysis on grassmann manifolds. IEEE Transactions on signal processing, 62(4):905–918, 2014.
[5] Xiaowen Dong, Pascal Frossard, Pierre Vandergheynst, and Nikolai Nefedov. Clustering with multilayer graphs: A spectral perspective. IEEE Transactions on Signal Processing, 60(11):5820–5831, 2012.
Requirements: Python, basics of Graph signal processing: graphs signal filtering and spectral clustering (covered in EE558 Network Tour of Data Science).
Contact: eda.bayram@epfl.ch



Transfer learning for network data
Semester/Master Project
Technology is in constant evolution and the amount of collected network data is increasing every day. Numerous examples can be found in geographical, transportation, biomedical and social networks, such as temperatures within a geographical area, traffic capacities at hubs in a transportation network, or human behaviors in a social network. Due to this growing volume of information and its diverse interactions, data now resides on irregular and complex structures that can be modelled by graphs [1, 2, 3]. Let us consider, for example, the traffic congestion problem. If we have a model that links the graphs of two different cities, we will be able to transfer the traffic information from one city to the other and then predict the traffic conditions in the latter. The idea of this project is precisely to build a method, and implement a machine learning algorithm, that is able to transfer information from one graph to another [4].
References:
[1] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega and P. Vandergheynst. The Emerging Field of Signal Processing on Graphs: Extending High-Dimensional Data Analysis to Networks and Other Irregular Domains, in IEEE Signal Processing Magazine, vol. 30, num. 3, p. 83–98, 2013.
[2] B. Yener, Cell-Graphs: Image-Driven Modeling of Structure-Function Relationship, Communications of the ACM, Vol. 60 No. 1, Pages 74–84, 2017.
[3] A. Ortega, P. Frossard, J. Kovacevic, J. M. F. Moura, and P. Vandergheynst, Graph Signal Processing: Overview, Challenges and Applications, 2018.
[4] M. Pilanci and E. Vural, “Domain adaptation on graphs by learning aligned graph bases,” 2018.
Requirements: Good knowledge of Python and MATLAB, familiarity with machine learning and linear algebra.
Contact: mireille.elgheche@epfl.ch



Building Extraction on Aerial LIDAR Point Clouds using Spectral Graph Features
Semester/Master Project
Airborne Laser Scanning is a well-known remote sensing technology, which provides quite dense and highly accurate, yet unorganized, point cloud descriptions of the earth's surface. Weighted graphs are very convenient tools for representing such irregular 3D data types. Moreover, spectral graph-based methods provide spectral analysis of signals residing on weighted graph representations. With this in mind, one can consider the airborne LIDAR data as an unstructured elevation signal that can be processed on an appropriate graph structure.
The goal of this project is to discover the spectral attributes of various objects in a LIDAR scene, such as buildings and vegetation, through graph signal processing. Instead of calculating geometric primitives such as normals, slopes and curvatures for each point in a scene and thresholding them, this approach aims to transpose classical signal processing tools to the analysis of 3D aerial LIDAR point clouds.
In particular, the points on the breaklines of buildings constitute features which can be detected and discriminated from other objects by exploiting the spectral information on the graph. This could be achieved by formulating a spectral descriptor on the detected points and then posing a classification problem to identify the points that lie on buildings. In a building extraction problem, the subsequent step is to retrieve the whole body of each building object.
References:
[1] Michaël Defferrard, Lionel Martin, Rodrigo Pena, & Nathanaël Perraudin. (2017, October 6). PyGSP: Graph Signal Processing in Python (Version v0.5.0). Zenodo. http://doi.org/10.5281/zenodo.1003158.
[2] ISPRS, “ISPRS test project on 3D semantic labeling contest,” 2017, http://www2.isprs.org/commissions/comm3/wg4/3dsemanticabeling.html.
[3] Blomley, R., and M. Weinmann. “USING MULTISCALE FEATURES FOR THE 3D SEMANTIC LABELING OF AIRBORNE LASER SCANNING DATA.” ISPRS Annals of Photogrammetry, Remote Sensing & Spatial Information Sciences 4 (2017).
Requirements: Python, basics of Graph signal processing: graphs signal filtering (covered in EE558 Network Tour of Data Science), familiarity with basic machine learning tools.
Contact: eda.bayram@epfl.ch

Learning


Analyse medical images using group equivariant neural networks
Master Semester/Thesis Project
Despite being 40 years old, convolutional neural networks are still the state of the art when it comes to analysing images. The reason for this success is that they hard-code in the neural network the prior that the learned features must be invariant to translations, because the same objects can appear in any part of an image.
Methods have recently been proposed to extend this idea to other groups of transformations as well [1]. For example, images for skin cancer detection have no notion of “up” and “down”, and can be arbitrarily rotated [2]. Roughly speaking, the method proposed in [2] amounts to learning at the same time one filter and all its rotated versions. It allows the network to learn with little training data, but is less efficient in terms of memory and computations, because the image must be fed to all the rotated filters.
When using standard CNNs, memory usage is not a problem. What makes translation invariance efficient and rotation invariance costly? The answer is that in CNNs we assume that the features are local. Similarly, we expect the first layers of CNNs to act as edge detectors, which can be seen as local functions of the angle in polar coordinates. This project consists in creating a neural network based on this hypothesis, to improve upon rotation invariant neural networks.
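The idea of sharing one filter across all its rotated versions, and the extra cost it brings, can be sketched in plain Python for the simple case of 90-degree rotations. The filter, the patch and the function names below are hypothetical, and taking the maximum over rotations is just one possible way of obtaining an invariant response; this is an illustration, not the method of [1] or [2].

```python
def rotate90(kernel):
    """Rotate a square 2D kernel counter-clockwise by 90 degrees."""
    return [list(row) for row in zip(*kernel)][::-1]


def response(patch, kernel):
    """Inner product between a patch and a kernel of the same size."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))


def rotation_pooled_response(patch, kernel):
    """Maximum response over the four 90-degree rotations of the kernel.

    Only one kernel is stored, but four inner products are computed.
    """
    responses = []
    for _ in range(4):
        responses.append(response(patch, kernel))
        kernel = rotate90(kernel)
    return max(responses)


edge = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]    # hypothetical vertical-edge filter
patch = [[5, 5, 5], [0, 0, 0], [-5, -5, -5]]   # a horizontal edge

print(response(patch, edge))                   # the plain filter misses the rotated edge
print(rotation_pooled_response(patch, edge))   # the pooled response detects it
print(rotation_pooled_response(rotate90(patch), edge))
```

The pooled response is the same for the patch and for its rotated copy, at the price of evaluating the filter four times per location, which mirrors the memory and computation overhead mentioned above.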
References:
[1]: T. Cohen and M. Welling, Group equivariant convolutional networks
[2]: B. S. Veeling et al., Rotation equivariant CNNs for digital pathology
Requirements: This project requires some notions of group theory, and a good knowledge of Python. During the project, you may have to code in CUDA to run the computations on a GPU, but no prior knowledge of CUDA is required.
Contact: clement.vignac@epfl.ch

Neural Network Pruning
Semester/Master Thesis Project
Neural network pruning techniques [1,2] can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy.
Although the original ideas date back to the 90s [3,4], network pruning techniques have regained a lot of popularity lately, now that more and more applications require that deep networks are deployed on mobile devices.
Furthermore, some researchers [5] claim that pruning overparameterized neural networks can naturally uncover subnetworks whose initializations made them capable of training effectively. Based on these results, they articulate the lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations. This hypothesis could explain why overparameterized neural networks are easier to optimize in practice. The work [5] received a Best Paper award at ICLR 2019.
However, until now most pruning techniques have been based on heuristics that assign arbitrary importance to the weights of a neural network and lack a theoretically grounded explanation. In this project we aim to bridge this theoretical gap and propose to use recent advances in sampling theory to determine near-optimal pruning strategies for neural networks.
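As an example of the kind of heuristic referred to above, the following Python sketch implements simple magnitude-based pruning, in the spirit of [1]: the weights with the smallest absolute values are treated as the least important and set to zero. The function name, the toy weight matrix and the tie-handling rule are assumptions for illustration; this is the baseline heuristic, not the sampling-theory approach that the project proposes to develop.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Returns (pruned_weights, mask). Weights tied with the threshold are also
    pruned, so slightly more than the requested fraction may be removed.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    n_prune = int(sparsity * len(flat))
    if n_prune == 0:
        return [list(row) for row in weights], [[1] * len(row) for row in weights]
    threshold = flat[n_prune - 1]
    mask = [[0 if abs(w) <= threshold else 1 for w in row] for row in weights]
    pruned = [[w * m for w, m in zip(row, mrow)] for row, mrow in zip(weights, mask)]
    return pruned, mask


# Hypothetical 2x3 weight matrix of a small layer.
weights = [[0.9, -0.05, 0.3],
           [0.01, -0.7, 0.2]]
pruned, mask = magnitude_prune(weights, sparsity=0.5)
print(mask)
```

In a lottery-ticket style experiment, the returned mask would be kept fixed while the surviving weights are reset to their initial values and retrained.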
References:
[1] Song Han, Jeff Pool, John Tran, and William Dally. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems, pp. 1135–1143, 2015.
[2] Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710, 2016.
[3] Yann LeCun, John S Denker, and Sara A Solla. Optimal brain damage. In Advances in neural information processing systems, pp. 598–605, 1990.
[4] Babak Hassibi and David G Stork. Second order derivatives for network pruning: Optimal brain surgeon. In Advances in neural information processing systems, pp. 164–171, 1993.
[5] Jonathan Frankle and Michael Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. In Proceedings of the International Conference on Learning Representations, 2019.
Requirements: Good knowledge of Python, sufficient familiarity with machine learning and deep learning. Experience with a deep learning library (preferably TensorFlow) is a plus.
Contact: guillermo.ortizjimenez@epfl.ch

Inside our mind: detecting dreams
Semester/Master Thesis Project
Electroencephalogram (EEG) data is usually very complicated. Not only does it depend on both time and space; it is also merely a surface measure of a deep phenomenon taking place inside our brain.
The purpose of this project is to understand the hidden patterns of our brain's electric activity during dreaming phases, and to build a classifier which can correctly label a person's EEG signal as belonging to a dreaming or non-dreaming phase across different sleep stages. This machine-learning project will have a crucial feature engineering component: to begin with, simple and powerful features will have to be created to take into account the quasi-periodicity of brain waves; more advanced inverse problem techniques may later be used to reconstruct the 3D electrical signal inside the brain and move to a voxelised version of the problem. Clever data augmentation techniques may also prove crucial to compensate for the relative scarcity of training samples. The final goal of this project will be to engineer and train a number of sequence models (e.g. recurrent neural networks) on the selected features in order to capture the relationship between the EEG patterns and the dreaming process.
Requirements: Basics of signal processing, basics of machine learning.
Contact: m.caorsi@l2f.ch or pascal.frossard@epfl.ch

Topology of critical phenomena
Semester/Master Thesis Project
Topological data analysis (TDA) has been applied successfully to the study of time series: its robustness against noise and stability under deformations of the point cloud make it an excellent alternative to the standard DFT approach [1]. The aim of this project is to preprocess the time series (e.g. using stabilisation techniques based on standard and fractional derivatives) so as to maximise the effectiveness of TDA. Other preprocessing techniques may require defining a Laplacian operator over the persistent simplicial filtration of the point cloud and using spectral techniques to reduce the dimensionality effectively. After the preprocessing stage has been carried out, the candidate shall focus on the application of persistent homology, and in particular on the computation of the bottleneck distance between different persistence barcodes. This may lead to a promising approach for anticipating the occurrence of catastrophic events. Time permitting, it would be worth investigating the relationship between the topological techniques and the well-established extreme value theory.
References:
[1] Shafie Gholizadeh and Wlodek Zadrozny. A Short Survey of Topological Data Analysis in Time Series and Systems Analysis. arXiv preprint arXiv:1809.10745.
Requirements: Good background in signal processing.
Contact: m.caorsi@l2f.ch or pascal.frossard@epfl.ch

Topological filtering
Semester/Master Thesis Project
Signal processing is a vast field of engineering and research. Its applications are everywhere in our society, and creating innovation in this field is not for everyone. Fortunately, not much topology has been exploited over the past century by engineers in this field: the purpose of this project is to dive into this world equipped with the tools of persistent homology. The starting point of this project is to build a topological low-pass filter: instead of removing the noise by averaging the signal directly, one can clean its topological representation (obtained, for example, via Takens or phase-space embeddings). Then, one can develop similar concepts for high-pass and band-pass filters.
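To make the embedding step concrete, here is a minimal sketch of a Takens (time-delay) embedding, which turns a one-dimensional signal into a point cloud on which persistent homology could then be computed. It covers only the embedding, not the topological filtering itself. The function name `takens_embedding`, the parameters `dimension` and `delay`, and the sinusoidal test signal are assumptions chosen for the example.

```python
import numpy as np


def takens_embedding(signal, dimension, delay):
    """Map a 1D signal to points (x[t], x[t + delay], ..., x[t + (dimension - 1) * delay]).

    Returns an array of shape (n_points, dimension).
    (Hypothetical helper, written only for this illustration.)
    """
    signal = np.asarray(signal, dtype=float)
    n_points = signal.size - (dimension - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this dimension and delay")
    # Column i holds the signal shifted by i * delay samples.
    columns = [signal[i * delay : i * delay + n_points] for i in range(dimension)]
    return np.stack(columns, axis=1)


# Toy example: a sine wave embedded in 2D traces out a closed loop.
t = np.linspace(0.0, 4.0 * np.pi, 200)
cloud = takens_embedding(np.sin(t), dimension=2, delay=25)
print(cloud.shape)
```

For a periodic signal the resulting point cloud forms a loop, which shows up as a persistent one-dimensional homology class; additive noise thickens the loop without destroying it, which is the intuition behind cleaning the topological representation rather than the raw samples.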
Requirements: Good knowledge of signal processing, good programming skills and a lot of out-of-the-box thinking.
Contact: m.caorsi@l2f.ch or pascal.frossard@epfl.ch

Learning similarity metric for video reconstruction
Semester Project
Obtaining an unsupervised representation of sequential visual data can be crucial for any autonomous intelligent agent. One of the simplest schemes to learn an unsupervised representation is to encode the data under some constraints in such a way that the error between the reconstructed and input data is minimized.
The use of latent random variables with neural networks has been argued to model the variability observed in the data efficiently, as in Variational Autoencoders (VAE) [1] and their counterparts for sequential data, e.g., variational Recurrent Neural Networks (RNN) in [2] (an RNN is a type of neural network which includes an inner loop so that it performs the same task for every element of a sequence). The aim of the first part of the project is to understand variational RNNs and use them for video reconstruction on the simple Moving MNIST dataset. Later, instead of element-wise errors, the student is expected to use another neural network structure to learn a similarity metric as the basis for the reconstruction objective, as done in [3] for VAEs.
References:
[1] Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
[2] Chung, J., Kastner, K., Dinh, L., Goel, K., Courville, A. C., & Bengio, Y. (2015). A recurrent latent variable model for sequential data. In Advances in neural information processing systems (pp. 2980–2988).
[3] Larsen, A. B. L., Sønderby, S. K., Larochelle, H., & Winther, O. (2015). Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300.
Requirements: Good knowledge of Python, sufficient familiarity with machine learning and probability. Experience with one of the deep learning libraries (preferably Tensorflow) is a plus.
Contact: beril.besbinar@epfl.ch



Spectral vs. spatial approaches of deep learning for data on non-Euclidean domains
Semester Project
Deep learning methods have been very successful for signals defined on regular grids, such as images and audio. Recently, some works have extended deep learning models to data defined on irregular domains, such as graphs and manifolds. The approaches presented so far in this field can be classified as (i) spatial and (ii) spectral. The aim of this semester project is to compare these methods and to identify their limitations and advantages.
The student will (i) read and understand in depth existing methods of both categories, (ii) run experiments to compare different approaches and (iii) optionally, propose ideas for dealing with the limitations of the current methods.
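As a small illustration of the two viewpoints, the sketch below applies the same polynomial graph filter in two ways: spectrally, through the eigendecomposition of the graph Laplacian, and spatially, through repeated multiplication by the Laplacian (i.e. local exchanges between neighbours). It is a toy example and not one of the deep learning methods to be compared. The path graph, the coefficients `theta` and the function names are assumptions made for the example.

```python
import numpy as np

# Assumed toy graph: a path on 5 nodes, with combinatorial Laplacian L = D - A.
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Polynomial filter h(L) = theta[0] * I + theta[1] * L + theta[2] * L^2 (assumed values).
theta = np.array([1.0, -0.5, 0.1])
x = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # impulse on node 0


def spectral_filter(L, x, theta):
    """Filter in the graph Fourier domain: U h(Lambda) U^T x."""
    eigvals, U = np.linalg.eigh(L)
    h = np.polyval(theta[::-1], eigvals)  # h evaluated at each eigenvalue
    return U @ (h * (U.T @ x))


def spatial_filter(L, x, theta):
    """Filter in the vertex domain: sum_k theta[k] * L^k x, using only neighbour exchanges."""
    out = np.zeros_like(x)
    Lk_x = x.copy()
    for coeff in theta:
        out += coeff * Lk_x
        Lk_x = L @ Lk_x
    return out


y_spectral = spectral_filter(L, x, theta)
y_spatial = spatial_filter(L, x, theta)
print(np.allclose(y_spectral, y_spatial))
```

For polynomial filters the two computations agree, but the spatial form avoids the eigendecomposition and is localized: with a degree-2 polynomial, an impulse on node 0 only reaches nodes at most two hops away. Differences between the two families appear with other filter parametrizations and when filters are transferred across graphs.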
References:
[1] M.M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, P. Vandergheynst, “Geometric deep learning: going beyond Euclidean data”, arXiv:1611.08097
[2] D. I Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83–98, May 2013.
Requirements: Python, background in machine learning and signal processing.
Contact: effrosyni.simou@epfl.ch

Hands-on


User-friendly drawing of large graphs
Semester Project
Very large graphs can be difficult to visualise and, more importantly, difficult to make sense of. As there is an ever-increasing amount of data surrounding us, recovering useful information becomes more and more challenging, and interpreting large data becomes the main task of data scientists. For large graphs, a coarser representation can make their behaviour and trends much clearer. Furthermore, a coarse representation can be coarsened further until we reach a version that is simple enough to interpret easily. However, once we recover the basic behaviour, we might want to go back to the finer version for more meaningful information.
The goal of this project is to create a tool to plot a coarse graph representation. Representations with different levels of detail (original graph, coarser version, further coarsened version…) will be provided, and the tool should be able to zoom into the graph to recover a finer version and zoom out to draw a coarser version. The tool should be user-friendly and ensure consistent embedding of graphs with different amounts of detail. Integrating the tool in a website would be a plus.
Requirements: Python, basics of linear algebra and graph theory, knowledge of web programming is a plus
Contact: hermina.petricmaretic@epfl.ch
