# Basics of OLS regression

### Some Preliminary Ideas

#### Sample:

A sample is a subset of the population selected in order to infer population parameters. It should be representative of the population so that its characteristics can be generalized to the population as a whole.

#### Estimation:

Estimation is the process of inferring population parameters from sample data. It covers the techniques and methods used to obtain estimates of those parameters.

#### Estimate:

An estimate is the numerical value of a parameter obtained through estimation. The rule or formula that produces it is called the estimator.

#### Properties of estimators:

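An estimator that is linear, unbiased, and efficient is called BLUE (Best Linear Unbiased Estimator). The key properties of estimators are: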

• Unbiasedness: An estimator (theta_hat) is said to be unbiased if its expected value, E(theta_hat), is equal to the population parameter (theta).
• Consistency: An estimator of a parameter is said to be consistent if it converges in probability to the true value of the parameter as the sample size tends to infinity.
• Efficiency: An estimator is said to be efficient if it has minimum variance in comparison to other estimators.
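
The three properties can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings (a normal population with mean 5 and standard deviation 2, and the hypothetical helper `simulate_estimators`): it compares the sample mean and the sample median as estimators of the population mean. The average of the sample means staying close to 5 reflects unbiasedness, their variance shrinking as n grows reflects consistency, and the mean having a smaller variance than the median reflects efficiency.

```python
import random
import statistics


def simulate_estimators(n, reps=1000, mu=5.0, sigma=2.0, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return the
    sample mean and sample median of each sample (illustrative helper)."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return means, medians


results = {}
for n in (10, 100, 1000):
    means, medians = simulate_estimators(n)
    results[n] = {
        # Unbiasedness: the average estimate should be close to mu = 5.
        "mean_of_means": statistics.fmean(means),
        # Consistency: the spread of the estimates should shrink as n grows.
        "var_of_means": statistics.pvariance(means),
        # Efficiency: the median should be more variable than the mean here.
        "var_of_medians": statistics.pvariance(medians),
    }
    print(n, results[n])
```

The comparison with the median holds for normally distributed data; for other distributions a different estimator may be the more efficient one.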

#### Hypothesis Testing

Hypothesis testing is the process of using the information contained in a sample to decide whether a claim (hypothesis) about the population is supported by the data.

#### Data types:

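Data commonly come in three forms: cross-section (many units observed at one point in time), time series (one unit observed over time), and panel data (many units observed over time).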

#### Ordinary Least Squares (OLS)

Ordinary Least Squares (OLS) is a method for estimating the parameters of a regression model, such as the population regression model (PRM), by minimizing the sum of the squared residuals.

Y = a + bX + e --- (i)

Equation (i) is a simple regression model with one independent variable (X) and one dependent variable (Y). 'e' is the error term, often assumed to be white noise.

In OLS, we estimate 'a' and 'b' by choosing the values that minimize the sum of the squared residuals, i.e., the sum of e^2 over all observations.
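
For the simple model in equation (i), this minimization has a well-known closed-form solution: b = Sxy / Sxx and a = mean(Y) - b * mean(X), where Sxy is the sum of cross-deviations of X and Y from their means and Sxx is the sum of squared deviations of X. The following is a minimal sketch of that calculation; the function names `ols_simple` and `ssr` and the five data points are made up for illustration.

```python
def ols_simple(x, y):
    """Return the OLS estimates (a, b) for the model Y = a + bX + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sxy: sum of cross-deviations; Sxx: sum of squared deviations of X.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def ssr(x, y, a, b):
    """Sum of squared residuals for a given intercept a and slope b."""
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


# Illustrative data only (not from any real data set).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a_hat, b_hat = ols_simple(x, y)
print("a =", round(a_hat, 4), "b =", round(b_hat, 4))

# Any other choice of (a, b) gives a larger sum of squared residuals.
print(ssr(x, y, a_hat, b_hat) <= ssr(x, y, a_hat + 0.1, b_hat))
print(ssr(x, y, a_hat, b_hat) <= ssr(x, y, a_hat, b_hat - 0.1))
```

Comparing the sum of squared residuals at the estimates with its value at nearby alternatives is a quick sanity check that the formulas do locate the minimum.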

#### Assumptions

1. Linear in parameters: The model must be linear in the coefficients 'a' and 'b', though not necessarily in X.
   • Correct forms: Y = a + bX + e and Y = a + bX^2 + e
   • Incorrect form: Y = a + (b^2)X + e
2. The mean value of the error term is zero.
   • E(e) = 0
3. No autocorrelation: Previous disturbances must not affect the present disturbance.
   • Cov(e(t), e(t-1)) = 0, and more generally Cov(e(t), e(s)) = 0 for t ≠ s
4. No heteroscedasticity (homoscedasticity): The variance of the error term must be constant.
   • Var(e(t)) = c, where c is a constant
5. Normality of the error term: The error term must be normally and independently distributed (NID).
6. The independent variable is uncorrelated with the error term.
   • Cov(X, e) = 0
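
Some of these assumptions can be examined informally using the residuals of a fitted model. The sketch below is illustrative only: it simulates data that satisfy the assumptions (the true values a = 1, b = 2 and the helpers `fit_ols` and `durbin_watson` are choices made for this example), fits the model, and computes a few simple diagnostics. Note that with an intercept in the model, the residuals average to zero and are uncorrelated with X by construction, so those two quantities describe the algebra of OLS rather than test the assumptions about the true errors.

```python
import random


def fit_ols(x, y):
    """Closed-form OLS estimates (a, b) for Y = a + bX + e."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


def durbin_watson(res):
    """Durbin-Watson statistic: values near 2 suggest no first-order
    autocorrelation in the residuals."""
    num = sum((res[t] - res[t - 1]) ** 2 for t in range(1, len(res)))
    return num / sum(r * r for r in res)


# Simulated data that satisfy the assumptions (illustrative values).
rng = random.Random(42)
n = 500
x = [rng.uniform(0, 10) for _ in range(n)]
e = [rng.gauss(0, 1) for _ in range(n)]  # zero mean, constant variance
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]

a_hat, b_hat = fit_ols(x, y)
res = [yi - a_hat - b_hat * xi for xi, yi in zip(x, y)]

# Zero by construction when the model has an intercept.
mean_res = sum(res) / n
x_bar = sum(x) / n
cov_x_res = sum((xi - x_bar) * ri for xi, ri in zip(x, res)) / n

# Autocorrelation check.
dw = durbin_watson(res)

# Rough heteroscedasticity check: residual variance for high X values
# relative to low X values; a ratio near 1 suggests constant variance.
pairs = sorted(zip(x, res))
low = [r for _, r in pairs[: n // 2]]
high = [r for _, r in pairs[n // 2:]]
var_ratio = (sum(r * r for r in high) / len(high)) / (
    sum(r * r for r in low) / len(low)
)

print("mean residual:", mean_res)
print("cov(X, residual):", cov_x_res)
print("Durbin-Watson:", dw)
print("variance ratio (high X / low X):", var_ratio)
```

These checks are rough screening devices, not formal tests; a Durbin-Watson value far from 2 or a variance ratio far from 1 would be a reason to look more closely.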
