Following the previous posts on small n correlations, in this post we’re going to consider power estimation (if you do not care about power, but you’d rather focus on estimation, this post is for you).

To get started, let’s look at examples of n = 1000 samples from bivariate populations with known correlations (rho), with rho increasing from 0.1 to 0.9 in steps of 0.1. For each rho, we draw a random sample and plot Y as a function of X. The variance of Y does not depend on X: there is homoscedasticity. Later we will look at heteroscedasticity, when the variance of Y varies with X.

For the same distributions illustrated in the previous figure, we compute the proportion of positive Pearson’s correlation tests for different sample sizes. This gives us power curves (here based on simulations with 50,000 samples). We also include rho = 0 to determine the proportion of false positives.

Power increases with sample size and with rho. When rho = 0, the proportion of positive tests is the proportion of false positives. It should be around 0.05 for a test with alpha = 0.05. This is the case here, as Pearson’s correlation is well behaved for bivariate normal data.

For a given expected population correlation and a desired long run power value, we can use interpolation to find the matching sample size. To achieve at least 80% power given an expected population rho of 0.4, the minimum sample size is 46 observations. To achieve at least 90% power given an expected population rho of 0.3, the minimum sample size is 118 observations.

Alternatively, for a given sample size and a desired power, we can determine the minimum effect size we can hope to detect. For instance, given n = 40 and a desired power of at least 90%, the minimum effect size we can detect is 0.49.

So far, we have only considered situations where we sample from bivariate normal distributions. Wilcox (2012, p. 444-445) describes 6 aspects of data that affect Pearson’s r, including the magnitude of the slope around which points are clustered. The effect of outliers on Pearson’s and Spearman’s correlations is described in detail in Pernet et al.

Let’s look at Wilcox’s heteroscedasticity example (2012). If we correlate variable X with variable Y, heteroscedasticity means that the variance of Y depends on X.

“X and Y have normal distributions with both means equal to zero. X and Y have variance 1 unless |X|>0.5, in which case Y has standard deviation |X|.”

Next, Wilcox (2012) considers the effect of this heteroscedastic situation on false positives. We superimpose results for the homoscedastic case for comparison. In the homoscedastic case, as expected for a test with alpha = 0.05, the proportion of false positives is very close to 0.05 at every sample size. In the heteroscedastic case, instead of 5%, the proportion of false positives is between 12% and 19%.
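The simulations described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce the figures: the function names are hypothetical, the number of simulations is kept far below 50,000 so the example runs quickly, and the heteroscedastic sampler assumes X and Y are otherwise independent (rho = 0), which is how the false positive question is read here. The interpolation step is a simple linear interpolation of the simulated power curve.

```python
import numpy as np
from scipy import stats


def simulate_bivariate_normal(rng, n, rho):
    """Draw n pairs (X, Y) from a bivariate normal: means 0, variances 1, correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    return xy[:, 0], xy[:, 1]


def simulate_heteroscedastic(rng, n):
    """Assumed reading of the heteroscedastic example: X and Y independent normals,
    Y has standard deviation |X| when |X| > 0.5 and standard deviation 1 otherwise."""
    x = rng.standard_normal(n)
    sd = np.where(np.abs(x) > 0.5, np.abs(x), 1.0)
    y = rng.standard_normal(n) * sd
    return x, y


def proportion_positive(sampler, n_sim=2000, alpha=0.05):
    """Proportion of simulated samples in which Pearson's test gives p < alpha.
    This estimates power when rho > 0 and the false positive rate when rho = 0."""
    hits = 0
    for _ in range(n_sim):
        x, y = sampler()
        hits += stats.pearsonr(x, y)[1] < alpha  # element 1 is the p value
    return float(hits / n_sim)


def min_sample_size(sample_sizes, power, target):
    """Linearly interpolate a power curve to find the n that reaches the target power.
    If the target is never reached, the largest sample size is returned."""
    # Monte Carlo noise can produce small dips; force the curve to be non-decreasing.
    power = np.maximum.accumulate(np.asarray(power, dtype=float))
    return float(np.interp(target, power, sample_sizes))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ns = np.arange(10, 101, 10)

    # Power curve for rho = 0.4, then the sample size needed for 80% power.
    power = [proportion_positive(lambda: simulate_bivariate_normal(rng, n, 0.4)) for n in ns]
    print("n for 80% power at rho = 0.4:", min_sample_size(ns, power, 0.80))

    # False positives at n = 50: homoscedastic versus heteroscedastic data.
    print("homoscedastic:", proportion_positive(lambda: simulate_bivariate_normal(rng, 50, 0.0)))
    print("heteroscedastic:", proportion_positive(lambda: simulate_heteroscedastic(rng, 50)))
```

With more simulations and a finer grid of sample sizes, the interpolated sample size becomes more stable. The same `proportion_positive` function serves for both power and false positives because the only thing that changes is the population the samples are drawn from.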