Author: Genpro SAS/R Programming Team
Choosing an appropriate sample size is one of the most important tasks a statistician performs when planning a study. Whether the statistician is deciding how many patients to enroll in a clinical trial, how many voters to poll in a political survey, or how many mice to include in a lab experiment, the same ingredients apply: power, the significance criterion, and the effect size. Increasing the number of subjects directly increases the sensitivity of an experiment: the standard error decreases, which improves our ability to detect a genuine treatment difference. An experiment with too few subjects has a poor chance of detecting a genuine treatment difference and may be a waste of time and money. Enrolling too many subjects, on the other hand, can expose people unnecessarily to an inferior treatment. Sample size calculations can be done by hand or with one of the many available software packages, such as SAS and R. This blog compares sample size and power calculations in SAS and R.
STATISTICAL POWER
Power analysis is a widely used technique for determining sample size. Statistical power is defined as the probability of rejecting the null hypothesis when it is false, that is, the probability of a correct rejection. Mathematically, it can be written as Pr(reject H0 | H1 is true) or as 1 – β, where β is the probability of a Type II error. Because power is a probability, it takes values between 0 and 1. Although the target varies considerably with study design and field of study, conventional thresholds for statistical power are typically 0.8 to 0.9 (80% to 90%).
Statistical power and sample size are inextricably linked: all else being equal, power increases with sample size. Several factors influence the power of a test, including the sample size, the effect size, the significance level, and the inherent variability in the data. A higher required power yields a larger required sample size. Of these factors, the sample size is the one you have the most control over. A power analysis can therefore be used to calculate the minimum sample size needed to detect a given effect size.
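The link between power and sample size can be sketched in a few lines of base R. This is a minimal illustration, assuming a two-sided one-sample z-test with a known standard deviation of 1 (a simplification of the t-tests used later in this post). The helper name `z_test_power` and the effect size of 0.3 are illustrative choices, not part of the original analysis.

```r
# Approximate power of a two-sided one-sample z-test (known SD = 1).
# d: standardized effect size, n: sample size, alpha: significance level.
z_test_power <- function(d, n, alpha = 0.05) {
  z_crit <- qnorm(1 - alpha / 2)   # critical value for the two-sided test
  shift <- d * sqrt(n)             # mean of the test statistic under H1
  # Probability that the statistic falls in either rejection region
  pnorm(shift - z_crit) + pnorm(-shift - z_crit)
}

# Power at several sample sizes for an assumed effect size of 0.3
sizes <- c(10, 25, 50, 100, 200)
powers <- sapply(sizes, z_test_power, d = 0.3)
print(data.frame(n = sizes, power = round(powers, 3)))
```

With the effect size held fixed, the printed power should rise steadily as n grows, which is the relationship described above.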
TYPE I AND TYPE II ERROR
When we make inferences from a sample, we cannot do so with 100% confidence. Acceptance or rejection of the null hypothesis can be stated only with a certain amount of error, or equivalently with a certain amount of confidence (100% – error). The error can be of two types, termed Type I and Type II.
- Type I Error: Rejection of the null hypothesis when it is actually true (i.e., a false positive), or finding an effect when there is actually no effect. Its probability is denoted by α and can be written as Pr(reject H0 | H0 is true).
- Type II Error: Acceptance of the null hypothesis when the alternative hypothesis is true (i.e., a false negative), or not finding an effect when there actually is one. Its probability is denoted by β and can be written as Pr(accept H0 | H1 is true).
| Decision based on sample | H0 true (groups are not different) | H0 false (groups are different) |
| --- | --- | --- |
| Accept H0 (groups are not different) | Correct decision (true negative), probability = 1 – α | Type II error (false negative), probability = β |
| Reject H0 (groups are different) | Type I error (false positive), probability = α | Correct decision (true positive), probability = 1 – β |
Another factor in computing sample size is the significance criterion. It is the probability of a Type I error, denoted by α, and is another important assumption in sample size calculations. By convention, which may differ by study design and field of study, the significance criterion is usually set at 0.05 or less.
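The two error rates can also be estimated by simulation. The sketch below is illustrative only: it assumes normally distributed data with a standard deviation of 1, 30 subjects per group, and a true difference of 0.5 when H1 holds. The function name `rejection_rate` and all numeric settings are hypothetical choices made for this example.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Fraction of simulated two-group experiments in which a two-sample
# t-test rejects H0. true_diff is the real difference in means
# (0 means H0 is true).
rejection_rate <- function(true_diff, n_per_group = 30, alpha = 0.05,
                           n_sim = 2000) {
  rejected <- replicate(n_sim, {
    control <- rnorm(n_per_group, mean = 0, sd = 1)
    treated <- rnorm(n_per_group, mean = true_diff, sd = 1)
    t.test(treated, control)$p.value < alpha
  })
  mean(rejected)
}

type1_rate <- rejection_rate(true_diff = 0)    # estimates alpha
power_est  <- rejection_rate(true_diff = 0.5)  # estimates 1 - beta
type2_rate <- 1 - power_est                    # estimates beta
print(c(type1 = type1_rate, power = power_est, type2 = type2_rate))
```

When H0 is true, the rejection rate should land close to the chosen α of 0.05. When H1 is true, the rejection rate is the power, and its complement is the Type II error rate.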
EFFECT SIZE
Another important factor in calculating sample size is the effect size. An effect size is a number measuring the strength of the relationship between two variables in a statistical population, or a sample-based estimate of that quantity. For clinical trials, the effect size is usually specified by a clinician, possibly supported by the literature, as the smallest clinically important effect. This could be the number of points of improvement on a test needed to make a difference in the patient's quality of life, or the improvement of a disease condition to a greater degree than existing treatments achieve.
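As an illustration, two commonly used standardized effect sizes can be computed in base R: Cohen's d for a difference in means, and Cohen's h for a difference in proportions (based on the arcsine transformation). The function names and input values below are hypothetical and chosen only to show the arithmetic.

```r
# Cohen's d: difference in means divided by a common standard deviation
cohens_d <- function(mean1, mean2, sd_common) {
  (mean1 - mean2) / sd_common
}

# Cohen's h: difference between arcsine-transformed proportions
cohens_h <- function(p1, p2) {
  2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
}

# Hypothetical inputs: means of 12 vs 10 with SD 4; proportions 0.30 vs 0.15
d_example <- cohens_d(mean1 = 12, mean2 = 10, sd_common = 4)
h_example <- cohens_h(p1 = 0.30, p2 = 0.15)
print(c(d = d_example, h = h_example))
```

Values like these are what the `d` and `h` arguments of the power functions expect, so a clinically meaningful difference has to be converted to this scale before running the calculation.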
STATISTICAL ANALYSIS SOFTWARE
Many computer programs are available for statistical analysis. Some are free and can be downloaded from the web. Two of the main packages in use are R and SAS (Statistical Analysis System).
FINDING POWER USING R AND SAS
To conduct a power or sample size analysis with the pwr package in R, the package must first be installed and loaded. It provides different functions for different tests. For every calculation, exactly one of the arguments (the one you want to find, most likely power or sample size) must be left NULL. In SAS, the equivalent calculations are done with PROC POWER, where the quantity to be solved for is set to a missing value (.).
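The "leave exactly one argument NULL" convention can be tried without installing anything, because `power.t.test` in base R's stats package follows the same rule. The sketch below swaps in that base function in place of pwr. The effect size of 0.5 and the 80% power target are assumed values for illustration.

```r
# Solve for sample size: power is supplied, n is left NULL (its default)
res_n <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
n_per_group <- ceiling(res_n$n)  # round up to whole subjects per group

# Solve for power: n is supplied, power is left NULL (its default)
res_p <- power.t.test(n = n_per_group, delta = 0.5, sd = 1, sig.level = 0.05)

print(c(n_per_group = n_per_group, achieved_power = res_p$power))
```

Rounding the sample size up means the achieved power is at least the requested 80%, which is the usual practice when reporting a required sample size.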
CODES FOR SAMPLE SIZES
TWO PROPORTIONS WITH DIFFERENT SAMPLE SIZES
Function: pwr.2p2n.test
Arguments:
- h: Effect size
- n1: Number of observations in first sample
- n2: Number of observations in second sample
- sig.level: Significance level
- power: Power of test
- alternative: Character string specifying the alternative hypothesis ('two.sided', 'greater', 'less')
Sample size for two independent proportions. The example below uses power.prop.test from base R, which assumes equal group sizes.
R CODE
power.prop.test(p1 = 0.15, p2 = 0.30, power = 0.85, sig.level = 0.05)
Two-sample comparison of proportions power calculation
n = 137.604
p1 = 0.15
p2 = 0.3
sig.level = 0.05
power = 0.85
alternative = two.sided
NOTE: n is number in *each* group
SAS CODE
Note that this SAS example uses different inputs (proportions 0.75 and 0.5, power 0.8), so its result is not directly comparable with the R output above.
PROC POWER;
  TWOSAMPLEFREQ TEST=PCHI
    GROUPPROPORTIONS = (0.75 0.5)
    POWER = 0.8
    NPERGROUP = .;
RUN;
ONE SAMPLE PROPORTION TESTS
Function: pwr.p.test
Arguments:
- h: Effect size
- n: Number of observations
- sig.level: Significance level
- power: Power of test
- alternative: Character string specifying the alternative hypothesis ('two.sided', 'greater', 'less')
R CODE
pwr.p.test(h = 0.2, power = 0.95, sig.level = 0.05, alternative = "two.sided")
proportion power calculation for binomial distribution (arcsine transformation)
h = 0.2
n = 324.8677
sig.level = 0.05
power = 0.95
alternative = two.sided
SAS CODE
PROC POWER;
  ONESAMPLEFREQ TEST=Z METHOD=NORMAL
    NULLPROPORTION = 0.8
    PROPORTION = 0.85
    SIDES = U
    NTOTAL = .
    POWER = .9;
RUN;
ONE SAMPLE, TWO SAMPLE, OR PAIRED T-TEST
Function: pwr.t.test
Arguments:
- n: Sample size
- d: Effect size
- sig.level: Significance level
- power: Power of test
- type: Type of t-test ('one.sample', 'two.sample', 'paired')
- alternative: Character string specifying the alternative hypothesis ('two.sided', 'greater', 'less')
ONE SAMPLE
pwr.t.test(d = 0.2, n = 60, sig.level = 0.10, type = "one.sample", alternative = "two.sided")
One-sample t test power calculation
n = 60
d = 0.2
sig.level = 0.1
power = 0.4555818
alternative = two.sided
TWO SAMPLE
pwr.t.test(d = 0.3, power = 0.75, sig.level = 0.05, type = "two.sample", alternative = "greater")
Two-sample t test power calculation
n = 120.2232
d = 0.3
sig.level = 0.05
power = 0.75
alternative = greater
NOTE: n is number in *each* group
TWO SAMPLE OF DIFFERENT SIZES T-TEST
Function: pwr.t2n.test
Arguments:
- n1: Number of observations in first sample
- n2: Number of observations in second sample
- d: Effect size
- sig.level: Significance level
- power: Power of test
- alternative: Character string specifying the alternative hypothesis ('two.sided', 'greater', 'less')
R CODE
The example below uses pwr.t.test, which assumes equal group sizes.
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80, type = "two.sample", alternative = "greater")
Two-sample t test power calculation
n = 50.1508
d = 0.5
sig.level = 0.05
power = 0.8
alternative = greater
NOTE: n is number in *each* group
SAS CODE
PROC POWER;
  TWOSAMPLEMEANS
    MEANDIFF = 0.5
    STDDEV = 1
    POWER = 0.8
    NPERGROUP = .;
RUN;
POWER CALCULATION CODES
TWO PROPORTIONS WITH DIFFERENT SAMPLE SIZES
R CODE
power.prop.test(n = 28, p1 = 0.3, p2 = 0.55)
Two-sample comparison of proportions power calculation
n = 28
p1 = 0.3
p2 = 0.55
sig.level = 0.05
power = 0.4720963
alternative = two.sided
NOTE: n is number in *each* group
SAS CODE
PROC POWER;
  TWOSAMPLEFREQ TEST=PCHI
    GROUPPROPORTIONS = (.3 .2)
    NPERGROUP = 50
    POWER = .;
RUN;
ONE SAMPLE PROPORTION TESTS
ONE SAMPLE
R CODE
The R example here is a one-sample t-test. The SAS example that follows is an exact one-sample proportion test.
pwr.t.test(d = 0.2, n = 60, sig.level = 0.10, type = "one.sample", alternative = "two.sided")
One-sample t test power calculation
n = 60
d = 0.2
sig.level = 0.1
power = 0.4555818
alternative = two.sided
SAS CODE
PROC POWER;
  ONESAMPLEFREQ TEST=EXACT /* default */
    NULLPROPORTION = 0.2
    PROPORTION = 0.3
    NTOTAL = 100
    POWER = .;
RUN;
TWO INDEPENDENT SAMPLES (POWER)
d <- 0.559017  # effect size, set to the value reported in the output below
pwr.t.test(d = d, n = 30, sig.level = 0.05, type = "two.sample", alternative = "two.sided")
Two-sample t test power calculation
n = 30
d = 0.559017
sig.level = 0.05
power = 0.5671879
alternative = two.sided
NOTE: n is number in *each* group
PAIRED SAMPLES (POWER)
pwr.t.test(d = d, n = 40, sig.level = 0.05, type = "paired", alternative = "two.sided")
Paired t test power calculation
n = 40
d = 0.559017
sig.level = 0.05
power = 0.9315248
alternative = two.sided
NOTE: n is number of *pairs*
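As a cross-check that does not need the pwr package, the two power values above can be recomputed with `power.t.test` from base R. This is a sketch under the assumption that the effect size d is expressed in standard deviation units, so `sd` is set to 1 and `delta` to d. `strict = TRUE` is used so that both tails of the two-sided test are counted.

```r
d <- 0.559017  # effect size taken from the pwr output above

# Two independent samples, 30 subjects per group
two_sample <- power.t.test(n = 30, delta = d, sd = 1, sig.level = 0.05,
                           type = "two.sample", alternative = "two.sided",
                           strict = TRUE)

# Paired samples, 40 pairs (sd is the SD of the within-pair differences)
paired <- power.t.test(n = 40, delta = d, sd = 1, sig.level = 0.05,
                       type = "paired", alternative = "two.sided",
                       strict = TRUE)

print(c(two_sample = two_sample$power, paired = paired$power))
```

The printed values should closely reproduce the pwr results, which gives some reassurance that the two R routes agree for these designs.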
SAS CODES
ONE-SAMPLE T TEST POWER CALCULATION
PROC POWER;
  ONESAMPLEMEANS TEST=T
    MEAN = 1
    STDDEV = 1
    NTOTAL = 10
    POWER = .;
RUN;
TWO INDEPENDENT SAMPLES (POWER)
PROC POWER;
  TWOSAMPLEMEANS TEST=DIFF_SATT
    MEANDIFF = 3
    GROUPSTDDEVS = 5 | 8
    GROUPWEIGHTS = (1 2)
    NTOTAL = 60
    POWER = .;
RUN;
PAIRED T TEST POWER CALCULATION
PROC POWER;
  PAIREDMEANS TEST=DIFF
    MEANDIFF = 1.5
    CORR = 0.4
    PAIREDSTDDEVS = (3 1)
    NPAIRS = 20
    POWER = .;
RUN;
CONCLUSION
Both R and SAS can calculate sample size and power. Comparatively, R offers more methods and more freely available information. The code is also simpler in R, and R provides many ways (multiple libraries) to write these programs.
REFERENCES
- Sample Size Calculation Using SAS®, R, and nQuery Software. Jenna Cody, Johnson & Johnson
- Kr Sundaram-medical statistics
- Statistics power by jim
- Wikipedia effect size
- Wikipedia power
- Salkind, Neil (2010). Encyclopedia of Research Design. doi:10.4135/9781412961288. ISBN 9781412961271.
- https://intellipaat.com/blog/tutorial/r-programming/introduction/