PhD student, on the subject of development of dynamic data analysis pipelines on distributed data clusters
- Science Park, Amsterdam, Noord-Holland
- Temporary contract / Temporary assignment
- Hours per week: 40
- €2,291 - €2,937 per month
The Formal Methods group at CWI has an open position for a PhD student to work in the project Evolutionary changes in Distributed Analysis (ECiDA). This project involves the development of dynamic data analysis pipelines on distributed data clusters.
Distributed server clusters are often used to perform data analysis on voluminous collections of data. These clusters substantially speed up large-scale data analysis by dividing data collections among the available machines, where they can be processed in parallel. For instance, the distributed data processing platform Spark has become a de facto standard in the world of large-scale data processing. The data processing pipelines for such platforms are composed at design time and then submitted to the central “master” component, which distributes the code among several worker nodes.
In many practical situations, the analysis application is not static and evolves over time: developers add new processing steps, data scientists adjust the parameters of their algorithms, and quality assurance discovers new bugs. Currently, an update of a pipeline looks as follows: the developers patch their code, re-submit the updated version, and finally restart the entire pipeline. However, restarting a processing pipeline safely is difficult: the intermediate state is lost and must be recomputed, some data must be reprocessed, and the cost of restarting can be significant, especially for real-time streaming components that require 24/7 availability.
In this project we develop a platform to support evolving data-intensive applications without the need for restarting them when the requirements change (e.g., new data sources or algorithms become available). We apply our developed tools and techniques and evaluate their effectiveness in the context of three different industrial use cases from three top sectors: water treatment, life sciences, and HTSM/Smart Industry.
Candidates are required to have a master's degree in computer science or a related field, with a strong background in formal methods, data analysis, service-oriented computing, software engineering, concurrency and distributed systems, and especially practical software tool development. Preferred qualifications include proven research talent, an excellent command of English, and good academic writing and presentation skills.
The terms of employment are in accordance with the Dutch Collective Labour Agreement for Research Centres ("CAO-onderzoeksinstellingen"). The initial labour agreement will be for a period of 18 months. After a positive evaluation, the agreement will be extended by 30 months. The gross monthly salary for a PhD student on a full-time basis is €2,291 during the first year and increases to €2,937 over the four-year period.
Employees are also entitled to a holiday allowance of 8% of the gross annual salary and a year-end bonus of 8.33%. CWI offers attractive working conditions, including flexible scheduling and help with housing for expat employees.
Please visit our website for more information about our terms of employment.
Applications can be sent before 1 June 2018 through the 'Apply' button. All applications should include a statement of your interest, together with a curriculum vitae, letters of reference, and a list of publications.
For applicants residing outside the EEA, a TOEFL English language test might be required.