International migration is thought to have accelerated in recent decades. People appear to be pulled toward better opportunities in host nations or pushed out of challenging environments in their native countries. The population in Europe is declining due to population aging, low fertility and emigration, while the ease of travel within Europe has increased population mobility. This is true of Estonia as well, so we think it is important to understand its migration patterns. The aim of the project is to analyze migration data from 2004 to 2021, identify the immigration and emigration trends, and summarize what conclusions can be drawn from them. Special focus is put on whether there are gender differences in migration.
To do so, one research question was identified:
RQ1: Is there a difference by gender in the trends of migration?
The dataset on migration in Estonia was taken from the Statistics Estonia web page. It contains the number of people emigrating and immigrating between 2004 and 2021, broken down by the country or region the migration is directed to or from. Emigration is defined as the action by which a person de-registers his or her place of residence from the administrative unit, settlement unit or settlement region where it was at the beginning of the year; immigration is defined as the action by which a person registers his or her place of residence in an administrative unit, settlement unit or settlement region other than the one at the beginning of the year.
## Year Country Immigration.Males Immigration.Females
## Min. :2004 Length:486 Min. : 0.00 Min. : 0.00
## 1st Qu.:2008 Class :character 1st Qu.: 10.00 1st Qu.: 9.00
## Median :2012 Mode :character Median : 33.00 Median : 27.50
## Mean :2012 Mean : 115.31 Mean : 85.02
## 3rd Qu.:2017 3rd Qu.: 89.75 3rd Qu.: 69.00
## Max. :2021 Max. :1783.00 Max. :1369.00
## Emigration.Males Emigration.Females
## Min. : 0.0 Min. : 0.0
## 1st Qu.: 7.0 1st Qu.: 10.0
## Median : 20.0 Median : 24.0
## Mean : 100.4 Mean : 100.6
## 3rd Qu.: 55.0 3rd Qu.: 63.0
## Max. :2459.0 Max. :2668.0
The dataset contains 486 observations of 6 variables. Year and the four migration counts (male and female immigration, male and female emigration) are stored as integers, and Country as a character string. Country takes 27 distinct values, 4 of which are continents rather than individual countries.
## Year Country Immigration.Males Immigration.Females Emigration.Males
## 1 2004 ..Austria 2 1 2
## 2 2004 ..Belgium 5 4 3
## 3 2004 ..Spain 4 4 8
## 4 2004 ..Netherlands 4 2 6
## 5 2004 ..Ireland 2 1 11
## 6 2004 ..Italy 11 7 0
## Emigration.Females
## 1 5
## 2 2
## 3 20
## 4 8
## 5 7
## 6 4
## Year Country Immigration.Males Immigration.Females Emigration.Males
## 481 2021 ..Russia 821 694 305
## 482 2021 Africa 159 118 43
## 483 2021 Asia 662 377 237
## 484 2021 America 241 205 126
## 485 2021 ..USA 116 150 89
## 486 2021 Oceania 88 65 56
## Emigration.Females
## 481 291
## 482 20
## 483 155
## 484 125
## 485 87
## 486 51
## 'data.frame': 486 obs. of 6 variables:
## $ Year : int 2004 2004 2004 2004 2004 2004 2004 2004 2004 2004 ...
## $ Country : chr "..Austria" "..Belgium" "..Spain" "..Netherlands" ...
## $ Immigration.Males : int 2 5 4 4 2 11 10 0 19 8 ...
## $ Immigration.Females: int 1 4 4 2 1 7 15 0 27 2 ...
## $ Emigration.Males : int 2 3 8 6 11 0 5 2 8 2 ...
## $ Emigration.Females : int 5 2 20 8 7 4 4 3 8 8 ...
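The outputs above have the shape produced by R's `summary()`, `head()`, `tail()` and `str()`. A minimal sketch of those calls follows, using only the twelve rows printed above rather than the full 486-row file. The object name `migration` and the reading of the `..` prefix as marking an individual country are our assumptions; the source does not state either.

```r
# Twelve rows copied from the head() and tail() output above.
# The real dataset has 486 rows; `migration` is a hypothetical object name.
migration <- data.frame(
  Year = c(rep(2004L, 6), rep(2021L, 6)),
  Country = c("..Austria", "..Belgium", "..Spain", "..Netherlands",
              "..Ireland", "..Italy", "..Russia", "Africa", "Asia",
              "America", "..USA", "Oceania"),
  Immigration.Males   = c(2L, 5L, 4L, 4L, 2L, 11L, 821L, 159L, 662L, 241L, 116L, 88L),
  Immigration.Females = c(1L, 4L, 4L, 2L, 1L, 7L, 694L, 118L, 377L, 205L, 150L, 65L),
  Emigration.Males    = c(2L, 3L, 8L, 6L, 11L, 0L, 305L, 43L, 237L, 126L, 89L, 56L),
  Emigration.Females  = c(5L, 2L, 20L, 8L, 7L, 4L, 291L, 20L, 155L, 125L, 87L, 51L),
  stringsAsFactors = FALSE
)

summary(migration)  # five-number summary and mean per column
head(migration)     # first six rows
tail(migration)     # last six rows
str(migration)      # dimensions and column types

# Count distinct Country values, and how many lack the ".." prefix
# (assumed here to mean "continent" rather than "country").
n_regions <- length(unique(migration$Country))
n_continents <- sum(!startsWith(unique(migration$Country), ".."))
print(c(regions = n_regions, continents = n_continents))
```

On the full dataset the same two counts would be expected to give the 27 and 4 reported in the text, provided the prefix assumption holds.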
The overall picture from the immigration data is a fairly linear trend, but there are clear gender differences in the sample regions checked for immigration to Estonia. Throughout the years, more males than females appear to immigrate. We have illustrated this using a scatter plot below.
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
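The `geom_smooth()` messages above report the formula `y ~ x`, which is consistent with a straight-line fit per gender. As a rough sketch of what such a line summarizes, the same kind of fit can be done with base R's `lm()`. The data below are synthetic and the names `toy`, `fit_m` and `fit_f` are hypothetical; this does not reproduce the actual plots or the Statistics Estonia figures.

```r
# Synthetic yearly counts for illustration only (NOT the real data).
set.seed(1)
years <- 2004:2021
toy <- data.frame(
  Year = rep(years, times = 2),
  Gender = rep(c("Males", "Females"), each = length(years)),
  Count = c(50 + 6 * (years - 2004) + rnorm(length(years), sd = 5),
            40 + 4 * (years - 2004) + rnorm(length(years), sd = 5))
)

# One straight-line fit per gender, Count as a function of Year.
fit_m <- lm(Count ~ Year, data = subset(toy, Gender == "Males"))
fit_f <- lm(Count ~ Year, data = subset(toy, Gender == "Females"))

# The slope is the estimated change in count per year.
slopes <- c(Males = unname(coef(fit_m)["Year"]),
            Females = unname(coef(fit_f)["Year"]))
print(round(slopes, 2))

# Scatter plot with both fitted lines.
plot(Count ~ Year, data = toy, pch = 19,
     col = ifelse(toy$Gender == "Males", "blue", "red"))
abline(fit_m, col = "blue")
abline(fit_f, col = "red")
```

Comparing the two slopes gives a numeric counterpart to reading the gap between the fitted lines off the plot.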
Below are histograms of male and female immigration respectively. Both distributions are right-skewed: most counts for each gender are small and rarely exceed 500 people, with a long tail of larger values (in the summary above, the means are well above the medians). Values above 500 can be treated as potential outliers. A later section (2.4) examines whether these outliers are tied to particular regions or to particular years.
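One quick way to check the direction of skew is to compare the mean with the median and to compute a moment-based skewness coefficient. The sketch below does this for the twelve male immigration counts printed in the `head()` and `tail()` output earlier; these rows are not representative of the full dataset, and the helper `skewness()` is our own small function, not one from the original analysis.

```r
# Male immigration counts from the twelve rows shown earlier.
x <- c(2, 5, 4, 4, 2, 11, 821, 159, 662, 241, 116, 88)

# Moment-based skewness: third central moment over variance^(3/2).
# Positive values mean a long tail to the right.
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

stats <- c(mean = mean(x), median = median(x), skewness = skewness(x))
print(round(stats, 2))

# Histogram of the same values; most of the mass sits near zero.
hist(x, breaks = 10, main = "Male immigration (12 sample rows)",
     xlab = "People per country-year")
```

A mean far above the median together with positive skewness indicates a tail toward large values, which is the pattern the full-data summary also shows (mean 115.31 against median 33 for males).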
The overall picture from the emigration data shows that, as with immigration, there are some gender differences in emigration from Estonia from a regional perspective. Throughout the years, the plots suggest more males than females emigrating. We have illustrated this using scatter plots below.
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
Below are histograms of male and female emigration respectively. Both distributions are right-skewed: most counts for each gender are small and rarely exceed 500 people, with a long tail of larger values. Values above 500 can be treated as potential outliers. A later section (2.5) examines whether these outliers are tied to particular regions or to particular years.
Looking at immigration by country, Finland, Ukraine and Russia spike the most for both males and females, so most immigrants appear to come from nearby countries.
To build on the scatter plot above and to confirm that the outliers are due to specific countries rather than to yearly variation, we use box plots for both males and females for every region, colored by year.
A spike can be seen in 2020 and 2021, and it appears across all countries. Therefore, given that most immigration counts were within the 500 range (see the previous section), the main outlier values are caused by certain countries, specifically Finland, Russia and Ukraine.
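The box plots answer the question visually; the same check can be sketched numerically by listing which regions have counts above the rough 500 cut-off. The example below uses only the six 2021 rows printed in the `tail()` output earlier, so Finland and Ukraine cannot appear in it. The names `rows_2021`, `threshold` and `flagged` are hypothetical.

```r
# The six 2021 rows shown in the tail() output (immigration columns only).
rows_2021 <- data.frame(
  Country = c("..Russia", "Africa", "Asia", "America", "..USA", "Oceania"),
  Immigration.Males   = c(821, 159, 662, 241, 116, 88),
  Immigration.Females = c(694, 118, 377, 205, 150, 65),
  stringsAsFactors = FALSE
)

# Rough cut-off taken from the histogram discussion.
threshold <- 500

# Keep rows where either gender exceeds the cut-off.
flagged <- subset(rows_2021,
                  Immigration.Males > threshold | Immigration.Females > threshold)
print(flagged$Country)
# Prints: "..Russia" "Asia"
```

Applied to the full dataset, tabulating `flagged$Country` against `flagged$Year` would show whether the large values cluster in a few countries or in a few years.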
Looking at emigration by country, Finland spikes the most for both males and females, so Finland appears to be the main destination for both male and female emigrants.
To build on the scatter plot above and to confirm that the outliers are due to specific countries rather than to yearly variation, we use box plots for both males and females for every region, colored by year.
This confirms that Finland is the outright outlier in the emigration data.
The aim of the research is to understand whether migration differs by gender. To do so, we ran four two-sample t-tests (Welch), two each for immigration and emigration.
Why four? Each test is run once on all rows and once with a country filter, to check how the p-value changes when the filter is applied. If the p-value shifts substantially, the filtering of outliers has to be redone, because the shift would then reflect the loss of a large share of the population rather than the removal of a simple anomaly. The filter excludes Finland for both immigration and emigration.
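The with-and-without-filter comparison can be sketched as follows. The data are synthetic, the country labels (`..Finland`, `..Latvia`, `..Sweden`) and the helper `run_test()` are assumptions made for illustration, and the resulting p-values say nothing about the real dataset.

```r
# Synthetic data for illustration only; one dominant country plus two small ones.
set.seed(42)
toy <- data.frame(
  Country = rep(c("..Finland", "..Latvia", "..Sweden"), each = 18),
  Emigration.Males   = c(rpois(18, 1500), rpois(18, 40), rpois(18, 30)),
  Emigration.Females = c(rpois(18, 1500), rpois(18, 45), rpois(18, 35)),
  stringsAsFactors = FALSE
)

# t.test() with two vectors runs Welch's test by default (var.equal = FALSE).
run_test <- function(d) t.test(d$Emigration.Males, d$Emigration.Females)

full <- run_test(toy)                                  # all rows
filtered <- run_test(subset(toy, Country != "..Finland"))  # Finland removed

# Compare the two p-values side by side.
p_values <- c(all_rows = full$p.value, without_finland = filtered$p.value)
print(round(p_values, 4))
```

A large gap between the two p-values signals that the filtered country dominates the variance, which is the situation the paragraph above describes.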
To proceed with the hypothesis testing, the research hypotheses were stated:
Immigration
H0: There is NO difference in immigration trends based on gender
H1: There IS a difference in immigration trends based on gender
Emigration
H00: There is NO difference in emigration trends based on gender
H01: There IS a difference in emigration trends based on gender
We will analyse immigration and emigration separately to look into both aspects of the migration.
Immigration T-test Without filter…
##
## Welch Two Sample t-test
##
## data: migration_data_subset2$Immigration.Male and migration_data_subset2$Immigration.Female
## t = 2.1545, df = 873.81, p-value = 0.03148
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.696558 57.891919
## sample estimates:
## mean of x mean of y
## 115.31070 85.01646
Immigration T-test With filter…
##
## Welch Two Sample t-test
##
## data: migration_data_subset2$Immigration.Male and migration_data_subset2$Immigration.Female
## t = 2.3027, df = 790.16, p-value = 0.02156
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3.123255 39.218626
## sample estimates:
## mean of x mean of y
## 83.22009 62.04915
The immigration tests give p-values of about 3% (0.031) without the filter and about 2% (0.022) with it, both below the 5% significance level, so the null hypothesis is rejected. There is a statistically significant difference between genders in immigration, with a higher mean for males than for females.
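As a consistency check on the two immigration outputs above, the confidence intervals can be rebuilt from the reported means, t statistic and degrees of freedom: the standard error is the mean difference divided by t, and the interval is the difference plus or minus the t quantile times that standard error. The helper name `reconstruct_ci()` is ours, and small deviations are expected because the printed values are rounded.

```r
# Rebuild a two-sided confidence interval from reported t-test output.
reconstruct_ci <- function(mean_x, mean_y, t_stat, df, level = 0.95) {
  diff <- mean_x - mean_y            # estimated difference in means
  se <- diff / t_stat                # standard error implied by t = diff / se
  margin <- qt(1 - (1 - level) / 2, df) * abs(se)
  c(diff = diff, se = se, lower = diff - margin, upper = diff + margin)
}

# Values copied from the two immigration outputs above.
no_filter   <- reconstruct_ci(115.31070, 85.01646, 2.1545, 873.81)
with_filter <- reconstruct_ci(83.22009, 62.04915, 2.3027, 790.16)

print(round(rbind(no_filter, with_filter), 3))
```

The rebuilt bounds agree with the reported intervals (2.70 to 57.89 and 3.12 to 39.22) to about two decimals. The interval narrows once Finland is filtered out, and in both cases it excludes zero.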
Emigration T-test Without filter…
##
## Welch Two Sample t-test
##
## data: migration_data_subset3$Emigration.Male and migration_data_subset3$Emigration.Female
## t = -0.012813, df = 965.34, p-value = 0.9898
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -39.65104 39.13664
## sample estimates:
## mean of x mean of y
## 100.3807 100.6379
Emigration T-test With filter…
##
## Welch Two Sample t-test
##
## data: migration_data_subset4$Emigration.Male and migration_data_subset4$Emigration.Female
## t = -1.596, df = 929.32, p-value = 0.1108
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -13.749670 1.416337
## sample estimates:
## mean of x mean of y
## 39.73291 45.89957
The emigration tests give p-values of about 99% (0.990) without the filter and about 11% (0.111) with it, both above the 5% significance level, so the null hypothesis is not rejected. There is no statistically significant difference between genders in emigration.
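The four decisions can be laid out side by side by collecting the reported means and p-values in one table and comparing each p-value with the significance level. The numbers are copied from the outputs above; the 0.05 level is the conventional choice and is our assumption, as are the names `tests` and `alpha`.

```r
# Means and p-values copied from the four Welch t-test outputs above.
tests <- data.frame(
  test = c("Immigration, no filter", "Immigration, with filter",
           "Emigration, no filter", "Emigration, with filter"),
  mean_males   = c(115.31070, 83.22009, 100.3807, 39.73291),
  mean_females = c(85.01646, 62.04915, 100.6379, 45.89957),
  p_value      = c(0.03148, 0.02156, 0.9898, 0.1108),
  stringsAsFactors = FALSE
)

alpha <- 0.05  # assumed significance level

# Difference in means (males minus females) and the test decision.
tests$difference <- tests$mean_males - tests$mean_females
tests$reject_H0 <- tests$p_value < alpha

print(tests[, c("test", "difference", "p_value", "reject_H0")])
```

Only the two immigration rows fall below the level. The emigration differences are small and, with the filter, slightly negative, meaning the female mean is the higher one.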
Based on the available data and the statistical tests, males tend to immigrate into Estonia more than females, although this trend has been gradually declining since 2020. Finland, Russia, Ukraine and, to a small extent, Asia lead this trend.
For emigration, however, the patterns for males and females are more or less the same, and the great majority of emigrants go to Finland.
In terms of limitations, one factor to keep in mind is that migration data are often considered incomplete, because people often do not register their new place of residence, so the actual number of migrants may be much larger. Since migration changed significantly during 2020 and 2021, it would be wise to re-evaluate these findings within the next five years.
In conclusion, it is important to look into migration trends in order to understand the real reasons behind them. As shown, male and female immigration differ, which could indicate that different approaches for different genders are needed to fully understand what is driving migration. By taking these differences into account, countries can also develop better approaches to managing the possibilities and difficulties that come with migration.
On a positive note, since population decline has recently been a matter of great concern for many countries (Japan, China and Germany, among others), it is encouraging that the most recent years in the data, such as the 2021 rows shown earlier, show immigration to Estonia exceeding emigration, even though the summary means over the whole 2004 to 2021 period are similar. For the time being, there is one less thing to worry about in terms of migration.
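The balance between immigration and emigration can be sketched by summing both genders per year and subtracting. The example uses only the twelve rows printed in the `head()` and `tail()` output earlier, so the totals are partial. Note also that continent rows may overlap with country rows (America and the USA, for instance), which would double-count people if the full dataset were summed this way. The names `rows` and `yearly` are hypothetical.

```r
# Twelve rows from the head() and tail() output (six for 2004, six for 2021).
rows <- data.frame(
  Year = c(rep(2004, 6), rep(2021, 6)),
  Immigration.Males   = c(2, 5, 4, 4, 2, 11, 821, 159, 662, 241, 116, 88),
  Immigration.Females = c(1, 4, 4, 2, 1, 7, 694, 118, 377, 205, 150, 65),
  Emigration.Males    = c(2, 3, 8, 6, 11, 0, 305, 43, 237, 126, 89, 56),
  Emigration.Females  = c(5, 2, 20, 8, 7, 4, 291, 20, 155, 125, 87, 51)
)

# Combine genders, then total per year.
rows$Immigration <- rows$Immigration.Males + rows$Immigration.Females
rows$Emigration <- rows$Emigration.Males + rows$Emigration.Females
yearly <- aggregate(cbind(Immigration, Emigration) ~ Year, data = rows, FUN = sum)

# Net migration: positive means more arrivals than departures.
yearly$Net <- yearly$Immigration - yearly$Emigration
print(yearly)
# 2004: 47 in, 76 out, net -29
# 2021: 3696 in, 1585 out, net 2111
```

For these sample rows the balance is slightly negative in 2004 and clearly positive in 2021; the same aggregation on the full data, with overlapping regions handled, would show how the balance evolved in between.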