Analysis of the Impact of COVID-19 on the Global Economy

The COVID-19 outbreak affected the global economy. The rise in the number of COVID-19 cases had a negative effect on every country.

Analysis (practice case)

In this practice task, we analyze how COVID-19 cases spread and identify the consequences of COVID-19 for the economy.

Covid-19 Impacts Analysis

The dataset used for this analysis of the impact of COVID-19 was downloaded from Kaggle. It contains:

  • the country code
  • the country name
  • the date of the record
  • the Human Development Index (HDI) of each country
  • COVID-19 cases recorded for each date (total_cases in the raw file)
  • COVID-19 deaths recorded for each date (total_deaths in the raw file)
  • the stringency index of each country
  • the population of each country
  • the GDP per capita of each country

In [1]:
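import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

# transformed_data.csv holds the transformed columns (HDI, TC, TD, STI, POP, GDPCAP)
data = pd.read_csv("./archive/transformed_data.csv")
# raw_data.csv holds the raw, untransformed values
data2 = pd.read_csv("./archive/raw_data.csv")
data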
Out[1]:
CODE COUNTRY DATE HDI TC TD STI POP GDPCAP
0 AFG Afghanistan 2019-12-31 0.498 0.000000 0.000000 0.000000 17.477233 7.497754
1 AFG Afghanistan 2020-01-01 0.498 0.000000 0.000000 0.000000 17.477233 7.497754
2 AFG Afghanistan 2020-01-02 0.498 0.000000 0.000000 0.000000 17.477233 7.497754
3 AFG Afghanistan 2020-01-03 0.498 0.000000 0.000000 0.000000 17.477233 7.497754
4 AFG Afghanistan 2020-01-04 0.498 0.000000 0.000000 0.000000 17.477233 7.497754
... ... ... ... ... ... ... ... ... ...
50413 ZWE Zimbabwe 2020-10-15 0.535 8.994048 5.442418 4.341855 16.514381 7.549491
50414 ZWE Zimbabwe 2020-10-16 0.535 8.996528 5.442418 4.341855 16.514381 7.549491
50415 ZWE Zimbabwe 2020-10-17 0.535 8.999496 5.442418 4.341855 16.514381 7.549491
50416 ZWE Zimbabwe 2020-10-18 0.535 9.000853 5.442418 4.341855 16.514381 7.549491
50417 ZWE Zimbabwe 2020-10-19 0.535 9.005405 5.442418 4.341855 16.514381 7.549491

50418 rows × 9 columns

Data preparation

In [2]:
data.head()
Out[2]:
CODE COUNTRY DATE HDI TC TD STI POP GDPCAP
0 AFG Afghanistan 2019-12-31 0.498 0.0 0.0 0.0 17.477233 7.497754
1 AFG Afghanistan 2020-01-01 0.498 0.0 0.0 0.0 17.477233 7.497754
2 AFG Afghanistan 2020-01-02 0.498 0.0 0.0 0.0 17.477233 7.497754
3 AFG Afghanistan 2020-01-03 0.498 0.0 0.0 0.0 17.477233 7.497754
4 AFG Afghanistan 2020-01-04 0.498 0.0 0.0 0.0 17.477233 7.497754
In [3]:
data2.head()
Out[3]:
iso_code location date total_cases total_deaths stringency_index population gdp_per_capita human_development_index Unnamed: 9 Unnamed: 10 Unnamed: 11 Unnamed: 12 Unnamed: 13
0 AFG Afghanistan 2019-12-31 0.0 0.0 0.0 38928341 1803.987 0.498 #NUM! #NUM! #NUM! 17.477233 7.497754494
1 AFG Afghanistan 2020-01-01 0.0 0.0 0.0 38928341 1803.987 0.498 #NUM! #NUM! #NUM! 17.477233 7.497754494
2 AFG Afghanistan 2020-01-02 0.0 0.0 0.0 38928341 1803.987 0.498 #NUM! #NUM! #NUM! 17.477233 7.497754494
3 AFG Afghanistan 2020-01-03 0.0 0.0 0.0 38928341 1803.987 0.498 #NUM! #NUM! #NUM! 17.477233 7.497754494
4 AFG Afghanistan 2020-01-04 0.0 0.0 0.0 38928341 1803.987 0.498 #NUM! #NUM! #NUM! 17.477233 7.497754494
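
Comparing Out[2] with Out[3], the POP and GDPCAP values in the transformed file look like natural logarithms of population and gdp_per_capita in the raw file. The sketch below checks this on the Afghanistan row only, using numbers copied from the outputs above. That every row and column was transformed the same way is an assumption, and the variable names are illustrative.

```python
import math

# Raw values for Afghanistan, copied from Out[3]
population = 38928341
gdp_per_capita = 1803.987

# Natural logarithms, to compare with POP and GDPCAP in Out[2]
log_population = math.log(population)
log_gdp_per_capita = math.log(gdp_per_capita)

print(round(log_population, 6))      # compare with POP = 17.477233
print(round(log_gdp_per_capita, 6))  # compare with GDPCAP = 7.497754
```

If the two printed values agree with Out[2], the transformed columns can be mapped back to raw units with math.exp.
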
In [4]:
data["COUNTRY"].value_counts()
Out[4]:
Afghanistan        294
Indonesia          294
Macedonia          294
Luxembourg         294
Lithuania          294
                  ... 
Tajikistan         172
Comoros            171
Lesotho            158
Hong Kong           51
Solomon Islands      4
Name: COUNTRY, Length: 210, dtype: int64
In [5]:
data['COUNTRY'].value_counts().mode()
Out[5]:
0    294
Name: COUNTRY, dtype: int64
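
The value 294 returned by mode() is the most common number of rows per country, while Out[4] shows that some countries have fewer rows. A small sketch on made-up data shows how the typical row count and the shorter series could be found. The toy frame and the names rows_per_country, typical_rows, and short_series are illustrative and not part of the dataset.

```python
import pandas as pd

# Toy data: countries A and B have 3 rows each, country C has only 1
toy = pd.DataFrame({"COUNTRY": ["A"] * 3 + ["B"] * 3 + ["C"] * 1})

# Number of rows per country, as in In [4]
rows_per_country = toy["COUNTRY"].value_counts()

# Most common row count, as in In [5]
typical_rows = int(rows_per_country.mode().iloc[0])

# Countries with fewer rows than the typical count
short_series = rows_per_country[rows_per_country < typical_rows]

print(typical_rows)
print(short_series.index.tolist())
```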

Building a new dataset

In [6]:
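# One entry per country, in order of first appearance
code = data["CODE"].unique().tolist()
country = data["COUNTRY"].unique().tolist()
hdi = []
tc = []
td = []
sti = []
# Note: this list starts out filled with the unique POP values from the
# transformed data, so those values come first when zip() pairs the lists below
population = data["POP"].unique().tolist()
gdp = []  # not used in this section

# 294 is the most common number of rows per country (see Out[5])
mod_value = 294
for value in country:
    # HDI and STI: sum over the country's rows divided by the fixed row count
    hdi.append((data.loc[data["COUNTRY"] == value, "HDI"]).sum()/mod_value)
    # Cases and deaths: sum of the raw columns over all of the country's rows
    tc.append((data2.loc[data2["location"] == value, "total_cases"]).sum())
    td.append((data2.loc[data2["location"] == value, "total_deaths"]).sum())
    sti.append((data.loc[data["COUNTRY"] == value, "STI"]).sum()/mod_value)
    population.append((data2.loc[data2["location"] == value, "population"]).sum()/mod_value)

# zip() stops at the shortest list, so the table has one row per country
aggregated_data = pd.DataFrame(
    list(zip(code, country, hdi, tc, td, sti, population)),
    columns = ["Country Code", "Country", "HDI",
               "Total Cases", "Total Deaths", "Stringency Index", "Population"]
)

aggregated_data.head()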
Out[6]:
Country Code Country HDI Total Cases Total Deaths Stringency Index Population
0 AFG Afghanistan 0.498000 5126433.0 165875.0 3.049673 17.477233
1 ALB Albania 0.600765 1071951.0 31056.0 3.005624 14.872537
2 DZA Algeria 0.754000 4893999.0 206429.0 3.195168 17.596309
3 AND Andorra 0.659551 223576.0 9850.0 2.677654 11.254996
4 AGO Angola 0.418952 304005.0 11820.0 2.965560 17.307957
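
Dividing every sum by the fixed value 294 gives a true mean only for countries that have exactly 294 rows. For a shorter series it understates the average. One possible alternative, sketched here on made-up data, is to let groupby divide by each country's own row count. The toy frame and the names per_country and fixed_divisor are illustrative, and this is not the method used in the notebook.

```python
import pandas as pd

# Toy stand-in for the transformed data: country B has fewer rows than A
toy = pd.DataFrame({
    "COUNTRY": ["A", "A", "A", "A", "B", "B"],
    "HDI": [0.5, 0.5, 0.5, 0.5, 0.8, 0.8],
    "STI": [2.0, 4.0, 2.0, 4.0, 3.0, 5.0],
})

# Per-country mean: each sum is divided by that country's own row count
per_country = toy.groupby("COUNTRY", as_index=False)[["HDI", "STI"]].mean()

# Fixed-divisor version, with 4 playing the role of 294
fixed_divisor = toy.groupby("COUNTRY")["HDI"].sum() / 4

print(per_country)
print(fixed_divisor)
```

In the toy data the two approaches agree for country A, which has the full four rows, and differ for country B, which has two.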

Sorting the data by Total Cases

In [7]:
data = aggregated_data.sort_values(by=["Total Cases"], ascending=False)
data.head()
Out[7]:
Country Code Country HDI Total Cases Total Deaths Stringency Index Population
200 USA United States 0.92400 746014098.0 26477574.0 3.350949 19.617637
27 BRA Brazil 0.75900 425704517.0 14340567.0 3.136028 19.174732
90 IND India 0.64000 407771615.0 7247327.0 3.610552 21.045353
157 RUS Russia 0.81600 132888951.0 2131571.0 3.380088 18.798668
150 PER Peru 0.59949 74882695.0 3020038.0 3.430126 17.311165
In [8]:
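# Keep the ten countries with the most cases; copy() so that new columns
# can be added later without a SettingWithCopyWarning
data = data.head(10).copy()
data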
Out[8]:
Country Code Country HDI Total Cases Total Deaths Stringency Index Population
200 USA United States 0.924000 746014098.0 26477574.0 3.350949 19.617637
27 BRA Brazil 0.759000 425704517.0 14340567.0 3.136028 19.174732
90 IND India 0.640000 407771615.0 7247327.0 3.610552 21.045353
157 RUS Russia 0.816000 132888951.0 2131571.0 3.380088 18.798668
150 PER Peru 0.599490 74882695.0 3020038.0 3.430126 17.311165
125 MEX Mexico 0.774000 74347548.0 7295850.0 3.019289 18.674802
178 ESP Spain 0.887969 73717676.0 5510624.0 3.393922 17.660427
175 ZAF South Africa 0.608653 63027659.0 1357682.0 3.364333 17.898266
42 COL Colombia 0.581847 60543682.0 1936134.0 3.357923 17.745037
199 GBR United Kingdom 0.922000 59475032.0 7249573.0 3.353883 18.033340
In [9]:
# GDP per capita figures entered by hand, one per row of Out[8], in the same order
data["GDP Before COVID"] = [65279.53, 8897.49, 2100.75,
                            11497.65, 7027.61, 9946.03,
                            29564.74, 6001.40, 6424.98, 42354.41]

data["GDP During COVID"] = [63466.68, 6546.78, 1900.52,
                            10126.72, 6146.78, 8534.70,
                            27057.16, 5090.72, 5332.77, 40284.64]
data
Out[9]:
Country Code Country HDI Total Cases Total Deaths Stringency Index Population GDP Before COVID GDP During COVID
200 USA United States 0.924000 746014098.0 26477574.0 3.350949 19.617637 65279.53 63466.68
27 BRA Brazil 0.759000 425704517.0 14340567.0 3.136028 19.174732 8897.49 6546.78
90 IND India 0.640000 407771615.0 7247327.0 3.610552 21.045353 2100.75 1900.52
157 RUS Russia 0.816000 132888951.0 2131571.0 3.380088 18.798668 11497.65 10126.72
150 PER Peru 0.599490 74882695.0 3020038.0 3.430126 17.311165 7027.61 6146.78
125 MEX Mexico 0.774000 74347548.0 7295850.0 3.019289 18.674802 9946.03 8534.70
178 ESP Spain 0.887969 73717676.0 5510624.0 3.393922 17.660427 29564.74 27057.16
175 ZAF South Africa 0.608653 63027659.0 1357682.0 3.364333 17.898266 6001.40 5090.72
42 COL Colombia 0.581847 60543682.0 1936134.0 3.357923 17.745037 6424.98 5332.77
199 GBR United Kingdom 0.922000 59475032.0 7249573.0 3.353883 18.033340 42354.41 40284.64
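
With both GDP columns in place, the relative drop for each country can be computed directly. The sketch below uses three rows copied from Out[9]. The frame name gdp and the column name GDP Drop (%) are introduced here for illustration only.

```python
import pandas as pd

# Three rows copied from Out[9]
gdp = pd.DataFrame({
    "Country": ["United States", "Brazil", "India"],
    "GDP Before COVID": [65279.53, 8897.49, 2100.75],
    "GDP During COVID": [63466.68, 6546.78, 1900.52],
})

# Percentage drop relative to the pre-COVID value
gdp["GDP Drop (%)"] = (
    (gdp["GDP Before COVID"] - gdp["GDP During COVID"])
    / gdp["GDP Before COVID"] * 100
).round(2)

print(gdp)
```

The same expression would apply to the full ten-row table, since it only relies on the two GDP columns.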

Analyzing the spread of COVID

In [10]:
# The chart plots Total Cases, so the title refers to cases rather than deaths
figure = px.bar(data, y='Total Cases', x='Country', title="Countries with Highest Covid Cases")
figure.show()