We study the problem of learning from multiple untrusted data sources, a scenario of increasing practical relevance given the recent emergence of crowdsourcing and collaborative learning paradigms. Specifically, we analyze the situation in which a learning system obtains datasets from multiple sources, some of which might be biased or even adversarially perturbed. It is known that in the single-source case, an adversary with the power to corrupt a fixed fraction of the training data can prevent "learnability": even in the limit of infinite training data, no learning system can approach the optimal test error. I present recent work with Nikola Konstantinov in which we show that, surprisingly, the same is not true in the multi-source setting, where the adversary can arbitrarily corrupt a fixed fraction of the data sources.
Underlying paper: https://cvml.ist.ac.at/papers/konstantinov-icml2020.pdf
Personal website of Christoph Lampert
The talk can also be joined online via our Zoom meeting.
Meeting room opens at: June 12, 2023, 4:30 pm Vienna time
Meeting ID: 662 5111 2914
Password: 013246