PhD Research Fellow Position in Machine Learning and Software Engineering
University of Mons, Department of Computer Science
Belgium

Predictive Health Modelling of Evolving Software Ecosystems


Based on recent developments in machine learning and open source software ecosystem analysis, an innovative and ambitious 5-year research project will start in 2021. It aims to develop prediction, simulation and discovery models to analyse and predict the health of OSS ecosystems and their constituent software components.

The project is led by Tom Mens (Full Professor, Software Engineering Lab) and Souhaib Ben Taieb (Associate Professor, Big Data and Machine Learning Lab), both renowned experts in their respective research domains, working at the Department of Computer Science of the University of Mons. Belgium is centrally located in Europe, and the labs are well-connected to other research teams worldwide. Our group hosts researchers of various nationalities, making a research position in our group an ideal stepping stone for an independent research career in academia or industry.

Open PhD Research Fellowship

We have open 4-year PhD positions on this project. Qualified candidates should hold a Master's degree or equivalent in computer science or a related domain, with a background in machine learning and/or software engineering. A good knowledge of statistics and prior experience in data analysis and open source software development are highly recommended. Candidates should be proficient in English and have good oral and written communication skills.

Interested applicants should contact the principal investigators by e-mail at tom.mens@umons.ac.be and souhaib.bentaieb@umons.ac.be. Official applications should be submitted at your earliest convenience and should contain at least:

  • a motivation letter
  • the earliest available starting date of the candidate
  • a CV, including previous experience relevant to the project
  • a list of previous publications (if applicable)
  • a digital copy of the Master's thesis
  • a copy of relevant grade documents
  • full contact details of the candidate
  • contact information for at least two potential academic referees


Project summary

Open Source Software (OSS) is indispensable in today's software-driven society and industry. OSS communities manage and evolve ecosystems containing millions of interconnected software components released and maintained by thousands of geographically distributed contributors. Software ecosystems face a wide range of health issues induced by bugs, security vulnerabilities, incompatible component updates, and unmaintained, deprecated or outdated component releases. Because of the highly connected and inherently collaborative nature of the socio-technical networks of software ecosystems, these issues frequently impact (transitively) related components, resulting in a combination of fine-grained (component-level) and coarse-grained (network-level) health problems. This raises the need for efficient software health prediction models and techniques addressing OSS ecosystem health, at the level of individual components as well as at the socio-technical network level.

To address this need, we will extract and combine fine-grained events related to the development of individual software components (e.g., new releases, new source code commits, code reviews, reported bugs and their associated fixes, message exchanges between developers), and coarse-grained events related to the evolving socio-technical network (e.g., versions, dependency constraints, new or abandoning contributors). This event data will be gathered from various sources: software distribution managers, version control systems, bug and issue trackers, and online communication channels. Modelling such data is particularly challenging, notably because of its complex temporal dynamics and the heterogeneity, quality, size and complexity of the data. We will develop and apply machine learning models for prediction and causal discovery of software health problems based on temporal point processes and dynamic network modelling techniques to analyse large-scale, multi-granular and evolving software ecosystem data. Based on recent developments in machine learning, we will develop prediction, simulation and discovery models to analyse and predict the health of OSS ecosystems and their constituent components.

Main objectives

Historical events of software development activity will be modelled using multi-dimensional point processes. These processes can capture an inherent property of software development data: past development events can strongly influence future events that affect (negatively or positively) the health of a software component. We will consider state-of-the-art point process models based on deep neural networks to capture more diverse and more complex influences of past events on future events.
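
As a rough illustration of how past events can raise the rate of future events, here is a minimal sketch of a classical multivariate Hawkes process with an exponential kernel. This is only one simple example of a multi-dimensional point process and is not the neural model the project will develop. The event types, timestamps and parameter values are hypothetical and chosen purely for illustration.

```python
import math

def intensity(t, events, mu, alpha, beta):
    """Rate of each event type at time t under a multivariate Hawkes process.

    events: list of (time, type_index) pairs observed so far.
    mu[k]: baseline rate of event type k.
    alpha[j][k]: jump in the rate of type k caused by one event of type j.
    beta: exponential decay rate of that influence.
    """
    rates = list(mu)
    for t_i, j in events:
        if t_i < t:  # only strictly earlier events influence the rate at t
            decay = math.exp(-beta * (t - t_i))
            for k in range(len(rates)):
                rates[k] += alpha[j][k] * decay
    return rates

# Hypothetical event types: 0 = bug report, 1 = fix commit (times in days).
events = [(1.0, 0), (1.2, 0), (3.0, 1)]
mu = [0.1, 0.2]
alpha = [[0.3, 0.9],   # a bug report raises the rate of both event types
         [0.0, 0.1]]   # a fix commit only slightly raises the commit rate

quiet = intensity(0.5, events, mu, alpha, beta=1.0)  # before any event
busy = intensity(1.3, events, mu, alpha, beta=1.0)   # just after two bug reports
```

In this sketch, the rate of fix commits shortly after two bug reports (`busy[1]`) is higher than the baseline (`quiet[1]`), and the effect fades as time passes. Neural point process models replace the fixed exponential kernel with a learned function of the event history.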

In addition to modelling the intrinsic temporal structure at the level of individual components, we will use dynamic network modelling to capture the temporally evolving socio-technical network of the ecosystem. Causal graph learning techniques will be used to infer the causal effects of events and associated health metrics across ecosystem components. Multi-level models will be conceived to combine component-level and network-level health prediction models by capturing the complex dynamic interplay between different levels of granularity and different time scales. To do so, dynamic graph representation learning techniques will be combined with flexible neural models for temporal point processes. Efficient learning techniques will be designed to estimate the parameters of the neural models based on data extracted from selected OSS ecosystems.

The resulting machine learning algorithms and models will be used to provide practical predictive and discovery models for health analysis of OSS ecosystems, taking into account the diversity in activities, granularities and temporal scales, as well as the socio-technical aspects. They will be used to analyse and predict health issues in upcoming component releases, and to assess the network-level health impact by predicting how events with a (positive or negative) effect on health affect other ecosystem components.

