Supervisor: Florian Leiser
Research group: Critical Information Infrastructures
Start: 26 November 2021
Background: Federated Learning (FL) is an emerging paradigm that enables different clients to train a common model while maintaining data privacy. Within an FL network, every client trains a local Machine Learning (ML) model at its own site. A central server aggregates the weights of the local models into a common model, while the raw data remains at each site. This aggregation works well for parametric approaches such as Linear Regression and Neural Networks, where an average or another aggregate of the parameter values can be computed easily. However, it is still unclear how non-parametric ML approaches, such as Decision Trees and Random Forests, can be aggregated in a comparable way.
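To make the parametric case concrete, the following is a minimal sketch of a sample-size-weighted average of client weight vectors, in the spirit of federated averaging. The function name `federated_average` and the flat-list representation of model weights are assumptions made for illustration only; real FL frameworks handle layered weights, communication, and privacy mechanisms that are omitted here.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client weight vectors, weighted by local sample counts.

    Hypothetical helper: each client is assumed to send a flat list of
    model parameters together with the number of local training samples.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")

    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            averaged[i] += w * n_samples / total
    return averaged


# Two clients with the same model shape but different amounts of data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

The sketch only works because every client's model has the same fixed set of parameters. Decision Trees grown on different data generally differ in structure, so there is no position-by-position correspondence to average over.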
Objective(s): The aim of this thesis is to develop a method for aggregating Random Forests across different clients. Within the thesis, you may develop and pursue multiple approaches to aggregating Random Forests; the different approaches need to be compared. First ideas on how to tackle this problem are presented below, but you are invited to try and follow your own ideas.
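As one possible starting point, and not the approach prescribed by this topic, a naive baseline is to pool the trees trained by all clients into a single ensemble and predict by majority vote. The sketch below assumes that a tree can be represented as a plain callable from a feature vector to a class label; `make_stump`, `pool_forests`, and `predict_majority` are hypothetical names, and the one-split stumps merely stand in for locally trained trees.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence

# Assumption: a tree is any callable mapping a feature vector to a label.
Tree = Callable[[Sequence[float]], Hashable]


def make_stump(feature: int, threshold: float,
               left: Hashable, right: Hashable) -> Tree:
    """Build a one-split tree as a stand-in for a locally trained tree."""
    def predict(x: Sequence[float]) -> Hashable:
        return left if x[feature] <= threshold else right
    return predict


def pool_forests(client_forests: Sequence[Sequence[Tree]]) -> List[Tree]:
    """Form a global forest as the union of all clients' trees."""
    return [tree for forest in client_forests for tree in forest]


def predict_majority(forest: Sequence[Tree], x: Sequence[float]) -> Hashable:
    """Predict the label that receives the most votes across all trees.

    On a tie, the label that was voted for first wins.
    """
    if not forest:
        raise ValueError("forest must contain at least one tree")
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Two clients share their trees; only the tree structures leave each site.
client_a = [make_stump(0, 0.5, "neg", "pos"), make_stump(1, 1.0, "neg", "pos")]
client_b = [make_stump(0, 0.4, "neg", "pos")]
global_forest = pool_forests([client_a, client_b])
label = predict_majority(global_forest, [0.9, 0.2])
```

This baseline ignores differences in client data size and quality, and sharing tree structures may itself leak information about local data; such trade-offs are the kind of points a comparison of approaches could examine.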
Literature: Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., ... & Zhao, S. (2019). Advances and open problems in federated learning. arXiv preprint arXiv:1912.04977. (https://arxiv.org/abs/1912.04977)
Liu, Y., Liu, Y., Liu, Z., Liang, Y., Meng, C., Zhang, J., & Zheng, Y. (2020). Federated forest. IEEE Transactions on Big Data. (https://arxiv.org/abs/1905.10053)
Wu, Y., Cai, S., Xiao, X., Chen, G., & Ooi, B. C. (2020). Privacy preserving vertical federated learning for tree-based models. arXiv preprint arXiv:2008.06170. (https://arxiv.org/pdf/2008.06170.pdf)