art <- read.csv("art.csv",header = TRUE, stringsAsFactors = FALSE)
str(art)
## 'data.frame': 10000 obs. of 9 variables:
## $ date : chr "1/3/2012" "1/3/2012" "1/3/2012" "1/4/2012" ...
## $ year : int 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
## $ rep : chr "Qiaoli" "Qiaoli" "Barakat" "Thomas" ...
## $ store : chr "Portland" "Portland" "Portland" "Davenport" ...
## $ paper : chr "watercolor" "drawing" "drawing" "watercolor" ...
## $ paper.type: chr "pad" "roll" "pads" "pad" ...
## $ unit.price: num 12.2 21 10.3 12.2 10.3 ...
## $ units.sold: int 1 1 1 2 1 1 16 2 1 1 ...
## $ total.sale: num 12.2 21 10.3 24.3 10.3 ...
dim(art)
## [1] 10000 9
- Make 4 different plots that show different ways that we can see the distribution of the total.sale column. Use the par() function to put all 4 plots in the same plot space.
par(mfrow=c(2,2))
arttest <- art
arttest$date <- as.Date(art$date, format = "%m/%d/%Y")
arttest$month <- strftime(arttest$date,"%m")
arttest$monthyear <- strftime(arttest$date,"%m%Y")
barplot(table(arttest$year), col="#d5f4e6", main="Number of Sales per Year")
barplot(table(art$paper.type), col="#80ced6", main="Number of Sales by Paper Type")
units.by.store <- aggregate(art$units.sold, list(store=art$store), FUN = sum)
barplot(units.by.store$x, names.arg=units.by.store$store, col="#fefbd8", main="Total Units Sold by Store")
units.by.rep<- aggregate(art$units.sold,list(rep=art$rep), FUN = sum)
barplot(units.by.rep$x,names.arg=units.by.rep$rep,col="#618685",main="Total Units Sold by Rep")
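The question above asks specifically about the distribution of total.sale. A minimal sketch of four common views of a single numeric column (histogram, boxplot, kernel density, empirical CDF) might look like this. The values are simulated stand-ins so the block runs without art.csv; the name `sale` and the log-normal simulation are assumptions for illustration, and with the real data `sale` would be `art$total.sale`.

```r
# Simulated, right-skewed stand-in for art$total.sale (an assumption, not real data)
set.seed(1)
sale <- round(rlnorm(1000, meanlog = 3, sdlog = 0.8), 2)

old.par <- par(mfrow = c(2, 2))   # 2 x 2 grid; remember the old layout
hist(sale, col = "#d5f4e6", main = "Histogram of total.sale", xlab = "Sale amount")
boxplot(sale, horizontal = TRUE, col = "#80ced6", main = "Boxplot of total.sale")
plot(density(sale), main = "Density of total.sale", xlab = "Sale amount")
plot(ecdf(sale), main = "Empirical CDF of total.sale", xlab = "Sale amount")
par(old.par)                      # restore the previous plotting layout
```

Saving the value returned by par() and passing it back at the end keeps later plots from inheriting the 2 x 2 grid.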
- Does the art company sell more units of drawing paper or watercolor paper? (Remember: each line is one sale that may include more than one unit, so use aggregate.)
## paper x
## 1 drawing 6576
## 2 watercolor 15613
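The output above has the shape that aggregate() produces when the grouping list is named paper. A minimal sketch of such a call on a small made-up data frame follows; the `toy` data frame and its values are assumptions for illustration, and on the real data `art` would take its place.

```r
# Toy stand-in for art; the values are invented for illustration
toy <- data.frame(
  paper      = c("watercolor", "drawing", "drawing", "watercolor"),
  units.sold = c(1, 1, 1, 2)
)

# Sum units per paper; each row is one sale that may cover several units
units.by.paper <- aggregate(toy$units.sold, list(paper = toy$paper), FUN = sum)
units.by.paper
```

In the output shown above, watercolor paper (15613 units) outsells drawing paper (6576 units).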
- Does the art company bring in more money (revenue) selling drawing paper or watercolor paper?
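# Sum revenue (total.sale) per paper; table() would only count the number of sales
barplot(tapply(art$total.sale, art$paper, sum), main="Total Revenue by Paper", col="#80ced6")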
- Each paper (watercolor and drawing) has different subtypes. For drawing paper only, do some subtypes always sell more (or fewer) units no matter which store it is, or do some stores tend to sell more units of one subtype than others?
drawing <- subset(art, paper == "drawing")  # separate name so art is not overwritten
art.data <- tapply(drawing$units.sold, list(drawing$paper.type, drawing$store), FUN=sum)
# Row names of art.data give the subtypes in the same order as the bars
barplot(art.data, col=c("#d5f4e6","#80ced6", "#fefbd8", "#618685"), beside = TRUE,
main="Drawing Paper Units Sold by Subtype and Store",
legend.text = rownames(art.data), args.legend=list(x="topright", bty="o"))
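Raw unit counts are hard to compare when stores differ in overall volume. One possible way to approach the question is to convert each store's column to within-store shares with prop.table(). The matrix below is invented toy data standing in for art.data; the counts and the "sheet" subtype are hypothetical.

```r
# Invented counts: rows are subtypes, columns are stores
units <- matrix(c(120, 30, 50,
                   60, 15, 25),
                nrow = 3,
                dimnames = list(c("pad", "roll", "sheet"),
                                c("Portland", "Davenport")))

# margin = 2 divides each column by its own total, giving within-store shares
shares <- prop.table(units, margin = 2)
round(shares, 2)

barplot(shares, beside = TRUE, legend.text = rownames(shares),
        main = "Share of Units by Subtype Within Each Store")
```

If the shares look similar across stores, the subtype ranking holds regardless of store; large differences would point to store-specific preferences.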