Country : Assignment Help Australia

University : Edith Cowan University Australia

Course Code : 101551 Coronavirus Impact On Society And Social Inequalities

Is the adverse impact of COVID-19 more prevalent for a specific gender or age group?

Abstract

COVID-19, the highly contagious disease that has been spreading through the world like wildfire, is a black box. There is limited visibility on how the virus behind this disease acts or who it impacts more. Robust data analysis to answer some of the unknowns associated with the disease can help save many lives.

The purpose of this analysis is to determine whether the death rate for men above 50 years of age is higher than for other age groups and for women. Once this has been ascertained for confirmed cases around the world, the same question can be asked specifically for Australia.

Publicly available demographic data on COVID-19 has been used for the analysis. The data primarily contains the country of each confirmed case, the number of cases and the outcome (death or recovery), along with demographic details such as gender and age.

The analysis found more infections among men than among women. Further, men over 50 years of age have been more affected than men under 50.

This category of the population is at greater risk, and the finding can help plan resources and precautions for them. The death rate in the dataset is high, at 5.8%. The most commonly exhibited symptoms are fever and cough.

Introduction

The world has been taken by storm by the spread of the novel coronavirus disease, COVID-19. Most economic activity is on hold, with people around the world surviving on essentials. The virus, which belongs to a class of coronaviruses common among wild animals, is alleged to have jumped from bats to humans. This unlikely animal-to-human transmission occurred in China last year: the first case was reported late in 2019 in Wuhan, which became the epicentre of this highly contagious disease. Because people continued to move in and out of Wuhan, travellers carried the disease around the world.

Very little is known about these viruses. Doctors, scientists and data scientists are making every effort to learn how the virus behaves, whom it affects, how much risk it poses to an individual and how to treat those affected. In the past month we have seen how this disease has severely strained the resources of countries like Italy and Spain. In such a scenario, it is important to know which part of the population is most at risk, so that the necessary preventive action and extra caution can be taken. With the help of demographic data on COVID-19, we can assess whether a specific age group and gender is at greater risk. Several reports have already suggested that elderly men are more likely to suffer the adverse effects of the disease.

The role of data scientists in the current pandemic is huge. Almost 4 million people have been affected worldwide, and the medical fraternity is already working beyond what we would have considered humanly possible. The best way to defeat the virus is to find out whatever we can about it. The phrase ‘knowledge is power’ makes perfect sense in this situation.

Knowing which gender and age group is at greater risk can help a country plan its resources better. Extra precautionary measures can be prescribed for the identified population groups. Several past reports have indicated that elderly men are at higher risk, which could be attributed to their lower immunity. Confirming this hypothesis can also help prioritize vaccines, once available, on logical grounds.

Data

COVID-19 data is publicly available. All related data has been made open source so that data scientists can work on it for novel insights. The chosen dataset was taken from the Johns Hopkins GitHub repository. A random sample of 1085 records with 19 variables has been chosen for analysis.

This data is compiled from the official reports published daily by nations around the world. Johns Hopkins publishes a dashboard on confirmed cases and outcomes, split by geographical area.

There are 19 variables in the dataset. List of the variables –

Figure 1 Dataset variables

From the list of column names, we can determine the nature of the data. It contains a unique identifier for every row, the number of cases in a country, the date the case was reported, a summary of the case, and its location and country. Further, there are demographic details, i.e. gender and age. There are also dates of onset and exposure, and flags for whether the case was from Wuhan or had visited Wuhan. The outcome of the case is also available. The credibility of a case can be checked against the source of information.

Methods

The steps performed for this analysis and used to arrive at the insights are described below. R has been installed with the help of the manual from CRAN (R Installation and Administration, 2020). The version used for the analysis, as reported by RStudio.Version(), is ‘1.1.383’.

Data representation

A preliminary understanding of the data is important before starting any analysis. The data has been imported into a data frame using the read.csv command. The str command has been used to identify the data types and example values in the data frame (str function, 2020).
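A minimal sketch of this import-and-inspect step is given here. It uses a tiny inline CSV in place of the real file; the three rows and the four columns are made up for illustration. Setting stringsAsFactors = TRUE reproduces the behaviour that was the default before R 4.0.0, in which text columns are read as factors.

```r
# Hypothetical three-row sample standing in for the real CSV file
csv_text <- "id,country,gender,age
1,China,male,66
2,Japan,female,45
3,Australia,male,58"

# read.csv() accepts a `text` argument, so no file on disk is needed here
sample_data <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# str() prints each column's type together with its first few values
str(sample_data)
```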

Type conversion

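Text data is stored as factors. Because the symptom text is needed to find the most common symptoms in patients, the type of this column has been changed to character using the lapply function (lapply function, 2020).

Age data has been converted from factor to numeric using the as.numeric function (Statistic Globe, 2020). A factor has to be converted to character first, because as.numeric applied directly to a factor returns its internal level codes rather than the original values.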
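The difference between the two conversions can be seen in a short sketch. The four ages below are hypothetical values, not taken from the dataset.

```r
# Hypothetical age column that was read in as a factor
age_factor <- factor(c("66", "45", "58", "45"))

# as.numeric() on a factor returns the internal level codes, not the ages
codes <- as.numeric(age_factor)

# Converting to character first recovers the original values
ages <- as.numeric(as.character(age_factor))

print(codes)
print(ages)
```

The level codes depend only on the sorted order of the distinct labels, so a histogram built from them would not show real ages.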

Unstructured to structured data

One of the columns, called ‘symptom’, holds free text. It has been cleaned to remove quotes, line breaks and other stray characters so that it can be analysed more easily. This has been done with the help of additional libraries: dplyr, stringr, ggplot2 and tidytext (Text processing in R, 2020).
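As a rough illustration of this kind of cleaning, the sketch below uses base R gsub instead of stringr so that it runs without extra packages. The three raw strings are invented, and treating a slash as a separator is an assumption made for this example only.

```r
# Hypothetical raw symptom entries with quotes, a line break and a slash
symptom_raw <- c("Fever, cough", "\"sore throat\"\n", "fever/ chills")

clean <- gsub("\"", "", symptom_raw, fixed = TRUE)  # drop double quotes
clean <- gsub("\n", "", clean, fixed = TRUE)        # drop line breaks
clean <- gsub("/", ",", clean, fixed = TRUE)        # treat slashes as separators
clean <- trimws(tolower(clean))                     # lower-case and trim spaces

print(clean)
```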

Data cleaning

The death rate can be calculated from the death column. However, for cases where the outcome was death, the column holds the date of death rather than a flag. The date is not required, so such values have been replaced with 1 by applying the condition mydata$death != "0". The data, stored as character, is then converted to numeric.

The number of deaths was found to be 63 out of 1085, i.e. a 5.8% death rate according to the chosen dataset.
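The recoding and the rate calculation can be sketched as follows. The five values in death_raw, including the date format, are hypothetical; the figures 63 and 1085 are the ones reported above.

```r
# Hypothetical death column: "0" for survivors, a date string for deaths
death_raw <- c("0", "0", "2/21/2020", "0", "2/19/2020")

# Any value other than "0" marks a death, so recode it as 1
death_flag <- ifelse(death_raw != "0", 1, 0)

# Death rate in the toy sample = deaths / total cases
toy_rate <- mean(death_flag)

# The same arithmetic with the figures reported for the full dataset
reported_rate <- 63 / 1085
print(round(100 * reported_rate, 1))
```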

Group based data summarization

The data has been summarized by country using the table function to understand its distribution (Table function in R, 2020). This gives a fair idea of how each country is represented in the dataset. China has the highest number of records (197), followed by Japan (190), South Korea (114) and Hong Kong (94).
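A small sketch of the same summarization, on a made-up country vector rather than the real column, might look like this:

```r
# Hypothetical country column for six cases
country <- c("China", "Japan", "China", "South Korea", "Japan", "China")

# table() counts the rows per country; sort() puts the largest group first
country_counts <- sort(table(country), decreasing = TRUE)

print(country_counts)
```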

Data visualization

With the data cleaned, we turn to the main objective of finding the proportion of male and female cases. There have been more cases among men: 60% of those infected are male and 40% are female. The plot has been created using the barplot command (Barplot, 2020) and shows proportions rather than raw counts for a clearer picture.

Figure 2 Gender split for all data
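The proportions behind such a plot can be computed as in the sketch below. The five gender values are invented for illustration and were chosen only so that the result is easy to check by hand.

```r
# Hypothetical gender column for five cases
gender <- c("male", "female", "male", "male", "female")

# table() counts each gender; prop.table() turns counts into shares of 1
gender_share <- prop.table(table(gender))

print(round(gender_share, 2))

# Passing gender_share to barplot() would draw these shares as bars
```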

Further, age has been plotted as a histogram, with breaks at intervals of 10 years. The hist function has been used (Hist function, 2020) (The subset function, 2020).

Figure 3 Age split for all data
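The counts that the histogram displays can be inspected without drawing it. The sketch below assumes ten hypothetical ages and uses the same ten-year breaks; with the default settings each bin includes its upper bound.

```r
# Hypothetical ages for ten cases
age <- c(8, 23, 35, 47, 52, 58, 61, 66, 74, 83)

# Ten-year breaks from 0 to 100, as used for the plots
decade_breaks <- seq(0, 100, by = 10)

# plot = FALSE returns the bin counts instead of drawing the histogram
age_hist <- hist(age, breaks = decade_breaks, plot = FALSE)
print(age_hist$counts)

# Share of cases older than 50 in this toy sample
over_50_share <- mean(age > 50)
print(over_50_share)
```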

The age distribution is consistent with our hypothesis that those over the age of 50 are at higher risk.

Data subset selection

A data subset has been created to check whether the observations on gender and age for all nations apply to Australia as well. The subset function has been used for this (The subset function, 2020).

Figure 4 Gender split for Australia

Figure 5 Age split for Australia

The data visualizations suggest that the worldwide observations not only apply to Australia but are more pronounced there. However, no conclusion can be drawn, as the Australian subset contains fewer than 30 records.
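A sketch of the subsetting step, with an explicit check of the subset size against the threshold of 30 mentioned above, is shown here. The four-row data frame is hypothetical.

```r
# Hypothetical case data for four patients
cases <- data.frame(
  country = c("Australia", "China", "Australia", "Japan"),
  gender  = c("male", "female", "male", "male"),
  age     = c(58, 45, 63, 30),
  stringsAsFactors = FALSE
)

# Keep only the rows reported for Australia
aus_cases <- subset(cases, country == "Australia")

# Flag subsets that are too small to support a conclusion
n_aus <- nrow(aus_cases)
too_small <- n_aus < 30

print(n_aus)
print(too_small)
```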

Exploratory visualization

After cleaning the symptoms data, it was found that a few symptoms occurred more frequently than others. The ggplot2 package (ggplot2, 2020) has been used to plot the frequencies and identify the most common symptoms.

Figure 6 Most common symptoms using ggplot2
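One possible way to obtain the frequencies that feed such a plot is sketched below in base R. It counts the number of cases mentioning each term, which is an assumption about how the frequency is defined; the five symptom strings are invented.

```r
# Hypothetical cleaned symptom entries, one per case (NA = not recorded)
symptoms <- c("fever, cough", "fever", "cough, sore throat", NA, "fever, chills")

# Terms to look for
terms <- c("fever", "cough", "throat", "chills")

# For each term, count the cases whose text mentions it (NA never matches)
freq <- sapply(terms, function(term) sum(grepl(term, symptoms, fixed = TRUE)))

# Assemble a data frame and list the most frequent symptoms first
symptoms_freq <- data.frame(name = terms, freq = unname(freq))
print(symptoms_freq[order(-symptoms_freq$freq), ])
```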

Results and Discussion

Through data exploration we identified the split of this random dataset between countries, gender and age. We also identified that if grouped by countries, a conclusive statement can be made only for a few countries due to lack of data. There are a few key points that emerged from the analysis about COVID-19 –

Population at higher risk – Male, above 50 years of age
Death rate – 5.8%
Most common symptoms – Fever and cough
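The comparison stated in the purpose of the analysis, death rate by gender and age group, could be computed along the following lines. This is only a sketch: the eight rows are invented, so the resulting rates say nothing about the real dataset, and the cut-off of 50 years is taken from the hypothesis.

```r
# Hypothetical cases with a 0/1 death flag
cases <- data.frame(
  gender = c("male", "male", "male", "female", "female", "male", "female", "male"),
  age    = c(72, 65, 34, 58, 29, 55, 67, 41),
  death  = c(1, 0, 0, 0, 0, 1, 0, 0)
)

# Split cases at 50 years of age
cases$age_group <- ifelse(cases$age > 50, "over 50", "50 or under")

# Mean of a 0/1 flag within each group is that group's death rate
rates <- aggregate(death ~ gender + age_group, data = cases, FUN = mean)
print(rates)
```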

Conclusions

Data understanding and cleaning, followed by data conversion, subset formation, grouping and visualization, resulted in some actionable insights. These insights have wider implications for everyday life. We have determined that the death rate for the disease is high, which implies that without proper planning and preparation we may lose many lives. Elderly men should be extra cautious, and governments can take measures to restrict the movement of this category of the population. There should be new services aimed at making their lives easier and limiting their interaction with the outside world. Also, those exhibiting the symptoms of COVID-19, i.e. fever and cough, should be tested and isolated to prevent further spread of the disease.

Bibliography

R Installation and Administration. (2020). Retrieved from CRAN R Project: https://cran.r-project.org/doc/manuals/r-release/R-admin.html

str function. (2020). Retrieved from R documentation: https://www.rdocumentation.org/packages/utils/versions/3.6.2/topics/str

lapply function. (2020). Retrieved from R documentation: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/lapply

Statistic Globe. (2020). Retrieved from Factors to Numbers: https://statisticsglobe.com/how-to-convert-a-factor-to-numeric-in-r/

Text processing in R. (2020). Retrieved from https://www.mjdenny.com/Text_Processing_In_R.html

Table function in R. (2020). Retrieved from Data science made simple: http://www.datasciencemadesimple.com/table-function-in-r/

Barplot. (2020). Retrieved from R documentation: https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/barplot

Hist function. (2020). Retrieved from R documentation: https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/hist

The subset function. (2020). Retrieved from R Bloggers: https://www.r-bloggers.com/r-101-the-subset-function/

ggplot2. (2020). Retrieved from Tidyverse: https://ggplot2.tidyverse.org/

Appendices

Figure 7 str(mydata)

Figure 9 Type conversion- lapply

R Code

#check RStudio version

RStudio.Version()

#load csv file

mydata <- read.csv(file.choose())

#check data

colnames(mydata)

#check data types

str(mydata)

#convert country column to character

mydata[18] <- lapply(mydata[18], as.character)

mydata[18]

#plot of gender

barplot(prop.table(table(mydata[7])))

#convert age data to numeric (via character, so that factor level codes are not returned)

mydata[8] <- lapply(mydata[8], function(x) as.numeric(as.character(x)))

#plot of age

hist( mydata$age,

breaks = c(0,10,20,30,40,50,60,70,80,90,100))

#subset country data for Australia

ausdata<-subset(mydata, mydata$country=="Australia")

ausdata

#plot of gender for Australia

barplot(prop.table(table(ausdata[7])))

#plot of age for Australia

hist( ausdata$age,

breaks = c(0,10,20,30,40,50,60,70,80,90,100))

#load libraries for converting unstructured text data to more structured form

library(dplyr) # Data wrangling & manipulation

library(stringr) # For managing text

library(ggplot2) # For data visualizations & graphs

#create a character vector of symptoms

symptomslist <- as.character(mydata$symptom)

symptomslist

#clean text data

symptomslist <- paste(symptomslist, collapse = " ") # Collapse all entries into one string

symptomslist <- str_replace_all(symptomslist, pattern = '\"', replacement = "") # Remove double quotes

symptomslist <- str_replace_all(symptomslist, pattern = '\n', replacement = "") # Remove line breaks

symptomslist <- str_replace_all(symptomslist, pattern = '\u0092', replacement = "'") # Replace stray quote character with apostrophe

symptomslist <- str_replace_all(symptomslist, pattern = '\u0091', replacement = "'") # Replace stray quote character with apostrophe

#create new vectors with symptom name and frequency

name<-c('fever','cough','throat','pain','headache','diarrhea','chills','breath','runny nose')

#count every occurrence of each symptom in the collapsed string
#(length(grep()) on a single string can only return 0 or 1)

freq <- str_count(symptomslist, fixed(name))

#create a dataframe of symptom frequencies

symptomsfreq<-data.frame(name,freq)

#ggplot to determine most common symptoms

ggplot(data=symptomsfreq, aes(x=freq,y=name))+geom_bar(stat="identity")
