Functions for sample size and error
Here I show two R functions to compute sample sizes and sampling errors for a proportion, taking into account the design effect, response rate, finite population correction, and stratification. They are useful when one needs to do these calculations quickly.
Note: I created a package with similar functions. See here.
The inputs are:
- n = sample size
- e = sampling error
- deff = design effect, by default 1 (simple random sampling, SRS)
- rr = response rate, by default 1
- N = population size, by default NULL (infinite population)
- cl = confidence level, by default .95
- p = proportion, by default 0.5 (maximum variance of a proportion)
- relative = whether to estimate the relative error, by default FALSE
First, load the functions:
serr: sampling error
An example for n = 400 and all inputs at their default values:
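As a rough cross-check of this default case, here is a minimal sketch of the textbook margin of error for a proportion under simple random sampling. The helper `moe_srs` is a hypothetical name used only for illustration; it is not the `serr` function from this post, which may differ in its details.

```r
# Hypothetical helper (not the post's serr()): margin of error for a
# proportion under simple random sampling with an infinite population.
moe_srs <- function(n, cl = 0.95, p = 0.5) {
  z <- qnorm(1 - (1 - cl) / 2)  # two-sided critical value, about 1.96 for cl = .95
  z * sqrt(p * (1 - p) / n)     # standard error of p times z
}

round(moe_srs(400), 4)  # about 0.049
```

With p = 0.5 the variance term p(1 - p) is at its maximum, so this is the most conservative error for a given n.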
The output is rounded to 4 decimals. A more complete example:
- n = 400
- deff = 1.5
- response rate = 80%
- population size = 1000
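To make the role of each input concrete, the following is a sketch of how such a calculation could be put together. It assumes that n is first deflated by the design effect and scaled by the response rate, and that the finite population correction is (N - n) / (N - 1) applied to that adjusted n. The name `serr_sketch` is hypothetical, and the actual `serr` may order or define these steps differently.

```r
# Hypothetical sketch, not the post's serr(). Assumptions: effective n is
# n / deff * rr, and the FPC (N - n_eff) / (N - 1) multiplies the variance.
serr_sketch <- function(n, deff = 1, rr = 1, N = NULL, cl = 0.95,
                        p = 0.5, relative = FALSE) {
  stopifnot(is.null(N) || all(n <= N))   # sample cannot exceed the population
  n_eff <- n / deff * rr                 # sample size actually informing the estimate
  z <- qnorm(1 - (1 - cl) / 2)           # two-sided critical value
  fpc <- if (is.null(N)) 1 else (N - n_eff) / (N - 1)
  e <- z * sqrt(p * (1 - p) / n_eff * fpc)
  if (relative) e <- e / p               # express the error relative to p
  round(e, 4)
}

serr_sketch(n = 400, deff = 1.5, rr = 0.8, N = 1000)
```

Under these assumptions the 400 cases behave like roughly 213 effective cases (400 / 1.5 * 0.8), which is why the error is larger than in the default example.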
The sample size (n) must always be smaller than the population size (N). It is important to note that the final sample size used to compute the sampling error is:
\[n_{final} = \frac{n}{deff} \times rr\]
ssize: sample size
Let’s get a sample size for an error of .03, a population of 1000 elements, a response rate of 0.80, and a design effect of 1.2:
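For readers who want to see the arithmetic behind such a request, here is a sketch under common textbook assumptions: start from the infinite-population size z²p(1 - p)/e², apply the finite population correction, then inflate by deff / rr and round up. The name `ssize_sketch` is hypothetical, and the post's `ssize` may combine these steps differently.

```r
# Hypothetical sketch, not the post's ssize(). Assumption: FPC is applied
# before inflating for design effect and nonresponse.
ssize_sketch <- function(e, deff = 1, rr = 1, N = NULL, cl = 0.95, p = 0.5) {
  z <- qnorm(1 - (1 - cl) / 2)                 # two-sided critical value
  n <- z^2 * p * (1 - p) / e^2                 # SRS, infinite population
  if (!is.null(N)) n <- n / (1 + (n - 1) / N)  # finite population correction
  n <- ceiling(n * deff / rr)                  # inflate for deff and nonresponse
  if (!is.null(N)) n <- pmin(n, N)             # never ask for more than N
  n
}

ssize_sketch(e = 0.03, deff = 1.2, rr = 0.8, N = 1000)
```

Rounding up with `ceiling` is a deliberate choice here: rounding down would leave the achieved error slightly above the target.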
If the required sample size is bigger than the population because of low response rates or large design effects, the sample size is capped at N:
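The capping step can be illustrated on its own. The numbers below (error of .02, design effect of 2, response rate of 0.5) are made up for illustration, and the formula is the same textbook assumption as elsewhere, not necessarily what `ssize` does internally.

```r
# Illustrative values only; the calculation is an assumed textbook formula.
N    <- 1000
e    <- 0.02
deff <- 2
rr   <- 0.5

z        <- qnorm(0.975)                    # 95% confidence level
n0       <- z^2 * 0.25 / e^2                # p = 0.5, infinite population
n_fpc    <- n0 / (1 + (n0 - 1) / N)         # finite population correction
n_needed <- ceiling(n_fpc * deff / rr)      # inflated size, exceeds N here
n_final  <- min(n_needed, N)                # cap at the population size

c(needed = n_needed, final = n_final)
```

When the cap is hit, the target error is no longer attainable with this design, even with a full census attempt.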
Working with strata
Finally, we can estimate different sample sizes by strata using vectors or a data frame:
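Because R arithmetic is vectorized, a per-stratum calculation can be sketched by passing vectors or data frame columns as inputs. The function `ssize_sketch` and the `strata` data frame below are hypothetical, with made-up population sizes and response rates; they only illustrate the idea and are not the post's `ssize`.

```r
# Hypothetical sketch: the same assumed formula, evaluated elementwise.
ssize_sketch <- function(e, deff = 1, rr = 1, N = NULL, cl = 0.95, p = 0.5) {
  z <- qnorm(1 - (1 - cl) / 2)
  n <- z^2 * p * (1 - p) / e^2                 # SRS, infinite population
  if (!is.null(N)) n <- n / (1 + (n - 1) / N)  # FPC, one value per stratum
  n <- ceiling(n * deff / rr)                  # inflate for deff and nonresponse
  if (!is.null(N)) n <- pmin(n, N)             # cap each stratum at its own N
  n
}

# Made-up strata for illustration
strata <- data.frame(
  stratum = c("A", "B", "C"),
  N       = c(500, 1000, 5000),
  rr      = c(0.9, 0.8, 0.7)
)

# One sample size per row, with a common error and design effect
strata$n <- ssize_sketch(e = 0.05, deff = 1.2, rr = strata$rr, N = strata$N)
strata
```

Using `pmin` rather than `min` keeps the cap elementwise, so each stratum is compared with its own population size.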
As easy as falling off a log!