Correlation_Plots with R - corrplot()


# Creating corrplots for the WCD data with outliers, then for WCD with capped outliers, to check the difference and report the correlations for the capped-outlier data set to the client.

> library(corrplot)
> M <- cor(WCD)
> corrplot(M, method = "color",type = "upper")
> corrplot(M, method = "number",type = "upper")
> cor(Delicassen,Fresh)
[1] 0.24469
> cor(Delicassen,Milk)
[1] 0.4063683
> cor(Delicassen,Grocery)
[1] 0.2054965
> xx<-data.frame(WCD$Fresh,WCD$Milk,WCD$Grocery,WCD$Frozen,WCD$Detergents_Paper,WCD$Delicassen)
> View(xx)
> M <- cor(xx)
> corrplot(M, method = "color",type = "upper")
> corrplot(M, method = "number",type = "upper")

> ## Created a data frame (xx) that excludes Channel and Region.
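A minimal sketch of another way to drop Channel and Region, assuming the data frame uses the column names seen above. Selecting columns by name keeps the original names (Fresh, Milk, ...) rather than the WCD.Fresh style names that data.frame(WCD$Fresh, ...) produces. The `toy_wcd` data frame and its values are made up purely so the snippet runs on its own; it is not the WCD data.

```r
# Hypothetical stand-in for WCD; the values are invented for illustration only.
toy_wcd <- data.frame(
  Channel = c(1, 2, 1, 2, 1),
  Region  = c(3, 3, 1, 2, 3),
  Fresh   = c(120, 70, 63, 132, 226),
  Milk    = c(96, 98, 88, 12, 54),
  Grocery = c(75, 95, 76, 42, 72)
)

# Keep every column except the two categorical codes.
spend_cols <- setdiff(names(toy_wcd), c("Channel", "Region"))
toy_spend  <- toy_wcd[, spend_cols, drop = FALSE]

# Correlation matrix of the spending columns only.
M_toy <- cor(toy_spend)
print(round(M_toy, 2))
```

With the real data, the same two selection lines applied to WCD would give a data frame that can be passed to cor() and corrplot() exactly as xx is above.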

> yy<-data.frame(I_W$WCD.Fresh,I_W$WCD.Milk,I_W$WCD.Grocery,I_W$WCD.Frozen,I_W$WCD.Detergents_Paper,I_W$WCD.Delicassen)
> N <- cor(yy)
> corrplot(N, method = "color",type = "upper")
> corrplot(N, method = "number",type = "upper")

# As seen above, for data frame (WCD):
> cor(Delicassen,Milk)
[1] 0.4063683
> cor(Delicassen,Fresh)
[1] 0.24469
> cor(Delicassen,Grocery)
[1] 0.2054965
# For data frame (WCD): high correlation, as also seen in the corrplot above
> cor(Detergents_Paper,Grocery)
[1] 0.9246407
## We now measure the correlation between the same variables after imputing the outliers.
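How I_W was built is not shown in this section. As a sketch only, one common way to cap outliers is to clamp each column to the 1.5 x IQR fences. The function name `cap_iqr`, the 1.5 multiplier, and the toy vector are assumptions for illustration, and not necessarily the method used to produce I_W.

```r
# Clamp values outside the Tukey fences (Q1 - k*IQR, Q3 + k*IQR) to the fences.
# Hypothetical helper: the name and the default k = 1.5 are assumptions.
cap_iqr <- function(x, k = 1.5) {
  q     <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr   <- q[2] - q[1]
  lower <- q[1] - k * iqr
  upper <- q[2] + k * iqr
  pmin(pmax(x, lower), upper)
}

# Made-up vector with one obvious outlier (200).
x      <- c(10, 12, 11, 13, 12, 200)
capped <- cap_iqr(x)
print(capped)  # the 200 is pulled down to the upper fence, 15
```

Applied column by column, for example with lapply() over the spending columns, this gives a capped copy of the data whose correlations can then be compared with the original.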

A ggplot scatter plot makes the high correlation between Detergents_Paper and Grocery visible.
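A minimal sketch of such a plot, assuming the ggplot2 package is installed. The `toy_pair` data frame and its values are invented so the snippet runs on its own; with the real data, WCD (or I_W with its WCD.Grocery and WCD.Detergents_Paper columns) would take its place.

```r
library(ggplot2)

# Hypothetical stand-in for the two WCD columns; the values are made up.
toy_pair <- data.frame(
  Grocery          = c(10, 20, 30, 40, 50, 60),
  Detergents_Paper = c(4, 9, 11, 18, 21, 27)
)

# Scatter plot with a fitted straight line to show the linear relationship.
p <- ggplot(toy_pair, aes(x = Grocery, y = Detergents_Paper)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(title = "Detergents_Paper vs Grocery",
       x = "Grocery", y = "Detergents_Paper")
print(p)
```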


# For data frame (I_W), the data set with capped / imputed outliers:

> cor(I_W$WCD.Delicassen,I_W$WCD.Milk)
[1] 0.2176773
> cor(I_W$WCD.Delicassen,I_W$WCD.Fresh)
[1] 0.1177586
> cor(I_W$WCD.Delicassen,I_W$WCD.Grocery)
[1] 0.1295903

# For data frame (I_W): high correlation, as also seen in the corrplot above
> cor(I_W$WCD.Detergents_Paper,I_W$WCD.Grocery)
[1] 0.6856996
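The figures reported above can be gathered into one table to make the before/after comparison easier to read. The correlations are copied from the output above; the data frame name `comparison` and its column names are only illustrative.

```r
# Correlations copied from the console output above (WCD = original, I_W = capped).
comparison <- data.frame(
  pair = c("Delicassen-Milk", "Delicassen-Fresh",
           "Delicassen-Grocery", "Detergents_Paper-Grocery"),
  WCD  = c(0.4063683, 0.24469, 0.2054965, 0.9246407),
  I_W  = c(0.2176773, 0.1177586, 0.1295903, 0.6856996)
)

# Change in correlation after capping the outliers.
comparison$change <- comparison$I_W - comparison$WCD
print(comparison)
```

Every pair shows a lower correlation after capping, and Detergents_Paper with Grocery remains the strongest pair in both data sets.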

Thus, if the wholesaler wants to increase grocery sales in a region, they should also focus on Detergents_Paper, and vice versa, since the two show a high correlation in both the original WCD data (0.92) and the imputed I_W data (0.69).