# Parametric and Nonparametric Tests: Key Differences

Parametric and nonparametric tests are two types of statistical hypothesis tests used to make inferences about populations based on samples of data.

## Parametric Tests

Parametric tests are a type of statistical test that assumes that the data being analyzed comes from a population that follows a specific distribution, typically the normal distribution. These tests are often used in research studies to compare means or variances between two or more groups or to examine the relationship between two continuous variables. In order for parametric tests to be valid, the data being analyzed must meet several assumptions.

### Major assumptions of parametric tests

- **Normality:** The data should be normally distributed, following a bell-shaped curve with the mean, median, and mode approximately equal.

- **Homogeneity of variance:** The variance of the data should be the same across groups; that is, the spread of the data should be roughly equal in each group.

- **Independence:** The observations in the sample should be independent of each other, so the value of one observation does not influence the value of another.

- **Linearity:** For regression models, the relationship between the independent and dependent variables should be linear: the change in the dependent variable should be proportional to the change in the independent variable.

- **Sample size:** Parametric tests generally require a reasonably large sample (a common rule of thumb is n > 30) for their distributional assumptions to be defensible.

If these assumptions are not met, the results of parametric tests may be biased or unreliable. In such cases, non-parametric tests may be more appropriate. Additionally, it is important to use caution when interpreting the results of parametric tests, as they are sensitive to outliers and may produce inaccurate results if the assumptions are not met.
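The first two assumptions above can be checked directly. As a quick sketch using SciPy on simulated data (the group names and effect sizes here are hypothetical, chosen only for illustration), the Shapiro-Wilk test checks normality and Levene's test checks homogeneity of variance:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=70, scale=10, size=40)  # hypothetical test scores
group_b = rng.normal(loc=75, scale=10, size=40)

# Normality: Shapiro-Wilk test (null hypothesis: the data are normal)
_, p_norm_a = stats.shapiro(group_a)
_, p_norm_b = stats.shapiro(group_b)

# Homogeneity of variance: Levene's test (null hypothesis: equal variances)
_, p_levene = stats.levene(group_a, group_b)

print(f"Shapiro p-values: {p_norm_a:.3f}, {p_norm_b:.3f}")
print(f"Levene p-value:   {p_levene:.3f}")
```

Large p-values (above the chosen significance level, e.g. 0.05) mean we fail to reject normality and equal variances, so a parametric test such as the t-test is defensible for these data.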

### Examples of commonly used parametric tests

- **Student's t-test:** Compares the means of two independent groups. For example, a researcher may want to compare the average test scores of students who received a particular treatment with those who did not.

- **Analysis of variance (ANOVA):** Compares the means of two or more independent groups. For example, a researcher may want to compare the average test scores of students who received different treatments or were exposed to different conditions.

- **Pearson correlation:** Measures the strength of the linear relationship between two continuous variables. For example, a researcher may want to examine the relationship between a student's GPA and SAT scores.

- **Linear regression:** Models the relationship between two or more continuous variables. For example, a researcher may want to predict a student's future earnings based on their level of education, years of experience, and other factors.

Parametric tests are used to make inferences about the population from the sample data. These tests assume that the data follow a specific distribution (usually a normal distribution) and that the data points are independent of each other.
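The first three tests above are each a one-liner in SciPy. The sketch below runs them on simulated data; the group labels and the GPA/SAT relationship are hypothetical, constructed only so the example is self-contained:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
treated = rng.normal(78, 8, 35)  # hypothetical scores, treatment group
control = rng.normal(72, 8, 35)  # hypothetical scores, control group
third   = rng.normal(75, 8, 35)  # a third condition, for the ANOVA

# Student's t-test: compare the means of two independent groups
t_stat, t_p = stats.ttest_ind(treated, control)

# One-way ANOVA: compare the means of three independent groups
f_stat, f_p = stats.f_oneway(treated, control, third)

# Pearson correlation between two continuous variables (e.g. GPA vs. SAT)
gpa = rng.uniform(2.0, 4.0, 50)
sat = 400 * gpa + rng.normal(0, 100, 50)  # roughly linear relationship
r, r_p = stats.pearsonr(gpa, sat)

print(f"t-test:  t={t_stat:.2f}, p={t_p:.4f}")
print(f"ANOVA:   F={f_stat:.2f}, p={f_p:.4f}")
print(f"Pearson: r={r:.2f}, p={r_p:.4f}")
```

A small p-value from the t-test or ANOVA indicates a difference in group means; a Pearson `r` near 1 indicates a strong positive linear relationship.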

## Non-Parametric Tests

Non-parametric tests are a type of statistical test that makes few assumptions about the underlying distribution of the population from which the data is sampled. These tests are often used in research studies when the data is not normally distributed, or when the sample size is too small to justify distributional assumptions about the population.

### Assumptions

- **Independence:** The observations in the sample should be independent of each other, so the value of one observation does not influence the value of another.

- **Random sampling:** The sample should be selected at random from the population of interest, so that every member of the population has an equal chance of being included.

- **Nominal or ordinal scale:** Non-parametric tests are appropriate for data measured on an ordinal scale (values that can be ranked) or a nominal scale (categories with no inherent order).

- **Sample size:** Non-parametric tests are generally more robust to small sample sizes than parametric tests.

- **Symmetry:** For some non-parametric tests, such as the Wilcoxon signed-rank test, the distribution of the data (or of the paired differences) should be symmetric around the median. Strong skew can affect the results of the test.

If these assumptions are not met, the results of non-parametric tests may be biased or unreliable. In such cases, other statistical methods may be more appropriate. Additionally, it is important to use caution when interpreting the results of non-parametric tests, as they may not be as powerful as parametric tests when the data meet the assumptions of the latter.
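The symmetry assumption above can be eyeballed numerically with a sample skewness statistic: roughly zero for symmetric data, clearly positive for right-skewed data. A minimal sketch on simulated data (the distributions are chosen only to illustrate the contrast):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
skewed    = rng.exponential(scale=5, size=200)     # right-skewed sample
symmetric = rng.normal(loc=5, scale=2, size=200)   # roughly symmetric sample

# Sample skewness: near 0 for symmetric data, positive for right-skewed data
print(f"skew(skewed)    = {stats.skew(skewed):.2f}")
print(f"skew(symmetric) = {stats.skew(symmetric):.2f}")
```

A clearly nonzero skewness is a warning sign for tests that assume symmetry, such as the Wilcoxon signed-rank test.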

### Examples of commonly used non-parametric tests

- **Mann-Whitney U-test:** Compares the medians of two independent groups. For example, a researcher may want to compare the median test scores of students who received a particular treatment with those who did not.

- **Wilcoxon signed-rank test:** Compares the medians of two related groups. For example, a researcher may want to compare the median test scores of students before and after a particular treatment.

- **Kruskal-Wallis test:** Compares the medians of two or more independent groups. For example, a researcher may want to compare the median test scores of students who received different treatments or were exposed to different conditions.

- **Spearman's rank correlation:** Measures the strength of the monotonic relationship between two continuous or ordinal variables. For example, a researcher may want to examine the relationship between a student's rank in their class and their GPA.

- **Chi-squared test:** Examines the relationship between two categorical variables. For example, a researcher may want to examine the relationship between a student's gender and their likelihood of passing a test.

Non-parametric tests do not assume any particular distribution for the data. These tests are used when the data is not normally distributed or when the sample size is too small to make assumptions about the population. Non-parametric tests are based on ranks, rather than actual values of the data.
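All five non-parametric tests above are also available in SciPy. The sketch below applies them to simulated, deliberately skewed data; the scenarios (treatment groups, before/after scores, the pass/fail counts in the contingency table) are hypothetical, invented only to make the example runnable:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
treated = rng.exponential(scale=10, size=25)  # skewed, non-normal data
control = rng.exponential(scale=8,  size=25)
before  = rng.exponential(scale=8,  size=25)
after   = before + rng.normal(2, 1, 25)       # paired "after treatment" scores

# Mann-Whitney U: two independent, non-normal samples
u_stat, u_p = stats.mannwhitneyu(treated, control)

# Wilcoxon signed-rank: two related samples (before vs. after)
w_stat, w_p = stats.wilcoxon(before, after)

# Kruskal-Wallis: three or more independent groups
h_stat, h_p = stats.kruskal(treated, control, before)

# Spearman's rank correlation: monotonic relationship between two variables
rho, rho_p = stats.spearmanr(np.arange(25), before)

# Chi-squared test of independence for a 2x2 contingency table
table = np.array([[30, 10],   # e.g. pass/fail counts, group 1
                  [22, 18]])  #      pass/fail counts, group 2
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"Mann-Whitney p={u_p:.3f}, Wilcoxon p={w_p:.3f}, "
      f"Kruskal-Wallis p={h_p:.3f}, chi-squared p={chi_p:.3f}")
```

Note that these functions work on the ranks of the data internally, which is why they tolerate the skewed exponential samples that would violate the assumptions of a t-test or ANOVA.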

## Key Differences

| Basis of Difference | Parametric Test | Non-Parametric Test |
|---|---|---|
| Data distribution | Assumes a normal distribution of data. | Does not assume a normal distribution of data. |
| Sample size | Typically requires a larger sample size (n > 30). | Can be used for small sample sizes (n < 30). |
| Measurement scale | Mainly used for continuous (interval or ratio) data. | Mainly used for ordinal and nominal data. |
| Hypothesis testing | Tests the mean or variance of the population. | Tests the median or frequency distribution of the population. |
| Assumptions | Assumes homogeneity of variance and independence of observations. | Does not assume homogeneity of variance, but still assumes independence of observations. |
| Precision of estimates | More precise estimates when the assumptions hold. | Less precise estimates due to fewer assumptions. |
| Typical statistics | t-test, ANOVA, Pearson correlation coefficient. | Wilcoxon rank-sum (Mann-Whitney U) test, Kruskal-Wallis test, Spearman's rank correlation coefficient. |
| Robustness | Not robust to outliers and extreme values. | Robust to outliers and extreme values. |
| Power | Higher power when the assumptions are met. | Lower power, because fewer assumptions are exploited. |
| Sample selection | Assumes random sampling from the population. | Also assumes random sampling from the population. |
| Tailed tests | Can use both one-tailed and two-tailed tests. | Can also use both one-tailed and two-tailed tests. |
| Interpretation | Provides parameter estimates and confidence intervals. | Provides rank-based statistics and median or percentile summaries. |
| Data transformations | Data may need to be transformed to meet assumptions. | Data can usually be analyzed without transformations. |
| Statistical modeling | Supports advanced modeling techniques (e.g. linear regression). | Generally limited to simpler modeling techniques. |
| Time and cost | Can be more time-consuming due to assumption checking and data requirements. | Often quicker due to fewer data requirements. |

Note that these differences are not exhaustive and there may be exceptions and nuances for specific tests within each category. Additionally, the choice of parametric or non-parametric tests depends on the specific research question, study design, and characteristics of the data being analyzed.