I’m not sure which dataset to use when computing a correlation matrix:
correlation = dataset.corr()
I have dataset1 with the original values and dataset2 with a log-transformed target variable. When I compute the correlation on both datasets, the correlation matrices give quite different results: the log-transformed dataset2 shows a higher correlation for the same pair of variables than dataset1 does.
(I believe the default method for .corr() is the Pearson correlation coefficient.)
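To make the comparison concrete, here is a minimal sketch of what I am doing, using synthetic data in place of my real dataset. The column names `x` and `target`, the random seed, and the exponential relationship are all assumptions made up for illustration. It also computes Spearman alongside Pearson, since Spearman is rank-based and a log transform does not change the ranks of positive values.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (hypothetical columns "x" and "target").
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=500)
# The target grows exponentially with x, so it is linear in x only after a log.
target = np.exp(x + rng.normal(0, 0.3, size=500))

dataset1 = pd.DataFrame({"x": x, "target": target})            # original values
dataset2 = dataset1.assign(target=np.log(dataset1["target"]))  # log-transformed target

# Pearson measures linear association, so it can change under a log transform.
pearson_raw = dataset1.corr(method="pearson").loc["x", "target"]
pearson_log = dataset2.corr(method="pearson").loc["x", "target"]

# Spearman works on ranks, which a strictly increasing transform preserves.
spearman_raw = dataset1.corr(method="spearman").loc["x", "target"]
spearman_log = dataset2.corr(method="spearman").loc["x", "target"]

print("Pearson  raw vs log:", pearson_raw, pearson_log)
print("Spearman raw vs log:", spearman_raw, spearman_log)
```

In this toy setup the Pearson value is higher on the log-transformed data, which matches what I see on my real datasets, while the Spearman value is the same for both.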
If I want to examine the correlation between variables, does it matter which dataset or correlation method I use?