Machine learning methods in large migration data research

4 JUL 2017

With records on more than 400 million individuals from 58 countries, spanning more than 50 years, the IPUMS dataset is the world's largest body of consolidated migration data on an individual level. When analysing a data source of this size to investigate migration drivers, traditional frequentist statistics reach their limits. Instead, we apply clustering and classification techniques to assess determinants of migration on an individual level. First results indicate that education and age are long-lasting drivers across time and space.

The research project by Centre scientists Raya Muttarak, Guy Abel and Fabian Stephany described above was presented by Fabian Stephany at this year's annual meeting of the Italian Statistical Society in Florence, "Statistics and Data Science - New Challenges, New Generations" (http://meetings3.sis-statistica.org/index.php/sis2017/sis2017).

The Wittgenstein Centre aspires to be a world leader in the advancement of demographic methods and their application to the analysis of human capital and population dynamics. In assessing the effects of these forces on long-term human well-being, we combine scientific excellence in a multidisciplinary context with relevance to a global audience. It is a collaboration among the Austrian Academy of Sciences (ÖAW), the International Institute for Applied Systems Analysis (IIASA) and the University of Vienna.