Challenging PhD position on automating machine learning itself. We aim to (semi)automate the data science process, from real-world streaming data to machine learning models, to help everyone do better machine learning, faster. Applications ...
We are seeking a highly creative and motivated PhD candidate to join the Data Mining Group at the Eindhoven University of Technology. The candidate will work in collaboration with Dr. ir. Joaquin Vanschoren, the OpenML core team, and the universities of Leiden (Prof. Holger Hoos and Prof. Thomas Back) and Delft (Prof. Jan van Gemert) to develop new methods and tools that automate machine learning itself, and to make them widely available and easy to use for everyone.
Whereas machine learning aims to build systems that are able to learn and improve over time, the creation of such systems is still done largely manually. The wider data science process contains many tedious, error-prone, or downright painful tasks, such as data wrangling, data preprocessing, model selection, and proper model evaluation. Current techniques tackle only small parts of this process, often don't work well with raw (or dirty) data and, importantly, they don't always learn much from one problem to the next. Moreover, when the data that you want to analyse evolves over time, most methods won't automatically evolve accordingly.
The key research question that we want to answer is how we can (semi)automate the data science process on data streams, and how we can learn effectively from one problem to the next (learning to learn), and over time.
This involves the combination of optimization (model-based optimization, bandits, genetic programming, ...) and machine learning, including meta-learning (learning to learn) and deep learning. The end goal is to develop automated processes that use large amounts of prior experiments to build the most promising 'pipelines' of processes for new datasets, evaluate them, and learn from them to propose ever better pipelines. We will leverage OpenML, the open machine learning platform, and develop a series of 'bots' that immediately make the results of this work tangible to the wider data science community.
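To give a feel for the propose, evaluate, and learn loop described above, here is a minimal sketch of one of the listed ingredients, a bandit, choosing among candidate pipelines. It is an illustration only and not the project's method: the epsilon-greedy strategy, the pipeline names, their accuracies, and the helpers `select_pipeline` and `noisy_evaluate` are all assumptions made up for this example, and the evaluation is simulated rather than run on real data.

```python
import random


def select_pipeline(candidates, evaluate, rounds=200, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over candidate pipelines (illustrative sketch).

    Each round picks one pipeline, evaluates it once, and updates the
    running mean of its observed scores.
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in candidates}
    means = {name: 0.0 for name in candidates}

    for _ in range(rounds):
        untried = [name for name in candidates if counts[name] == 0]
        if untried:
            # Try every pipeline at least once before exploiting.
            choice = untried[0]
        elif rng.random() < epsilon:
            # Explore: pick a random pipeline.
            choice = rng.choice(candidates)
        else:
            # Exploit: pick the pipeline with the best mean score so far.
            choice = max(candidates, key=lambda name: means[name])

        score = evaluate(choice, rng)
        counts[choice] += 1
        # Incremental update of the running mean.
        means[choice] += (score - means[choice]) / counts[choice]

    best = max(candidates, key=lambda name: means[name])
    return best, means, counts


# Hypothetical pipelines with made-up "true" accuracies (assumption).
TRUE_ACCURACY = {"scale+svm": 0.70, "impute+forest": 0.80, "pca+logreg": 0.60}


def noisy_evaluate(name, rng):
    """Simulated evaluation: true accuracy plus Gaussian noise (assumption)."""
    return TRUE_ACCURACY[name] + rng.gauss(0.0, 0.05)


best, means, counts = select_pipeline(list(TRUE_ACCURACY), noisy_evaluate)
print("selected pipeline:", best)
print("evaluations per pipeline:", counts)
```

In a real setting the simulated `noisy_evaluate` would be replaced by an actual cross-validated run of each pipeline, and prior experiments (for example from OpenML) could be used to initialize the mean scores instead of starting from zero.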
To demonstrate the approach, two complementary practical application tasks have been selected: the early detection and treatment optimization of Parkinson's disease based on video data analyzing the way people walk, and the cost- and environmentally optimized management of energy for private households with electric vehicles. The first case deals with videos and slow dynamics over time (analyzing the progression of the disease over a sequence of diagnostics), while the second addresses numerical data with fast dynamics and the need for optimal decision making in real time.
This work is set in a very interactive environment, including the Eindhoven Data Mining Group, the OpenML team, and the AutoML community. Interaction is also possible with the Leiden University Hospital (LUMC), Honda Research, and other companies working in this area. The availability of extensive (meta)data and expertise offers a unique opportunity for a bright student to tackle this hard problem.
We are looking for a motivated candidate with: