As in multiple linear regression, the main purpose of partial least squares regression is to build a linear model, Y=XB+E, where Y is an n cases by m variables response matrix, X is an n cases by p variables predictor (design) matrix, B is a p by m regression coefficient matrix, and E is a noise term for the model which has the same dimensions as Y. Usually, the variables in X and Y are centered by subtracting their means and scaled by dividing by their standard deviations. For more information about centering and scaling in partial least squares regression, you can refer to Geladi and Kowalski(1986).
Both principal components regression and partial least squares regression produce factor scores as linear combinations of the original predictor variables, so that there is no correlation between the factor score variables used in the predictive regression model. For example, suppose we have a data set with response variables Y (in matrix form) and a large number of predictor variables X (in matrix form), some of which are highly correlated. A regression using factor extraction for this type of data computes the factor score matrix T=XW for an appropriate weight matrix W, and then considers the linear regression model Y=TQ+E, where Q is a matrix of regression coefficients (loadings) for T, and E is an error (noise) term. Once the loadings Q are computed, the above regression model is equivalent to Y=XB+E, where B=WQ, which can be used as a predictive regression model.
Principal components regression and partial least squares regression differ in the methods used in extracting factor scores. In short, principal components regression produces the weight matrix W reflecting the covariance structure between the predictor variables, while partial least squares regression produces the weight matrix W reflecting the covariance structure between the predictor and response variables.
For establishing the model, partial least squares regression produces a p by c weight matrix W for X such that T=XW, i.e., the columns of W are weight vectors for the X columns producing the corresponding n by c factor score matrix T. These weights are computed so that each of them maximizes the covariance between responses and the corresponding factor scores. Ordinary least squares procedures for the regression of Y on T are then performed to produce Q, the loadings for Y (or weights for Y) such that Y=TQ+E. Once Q is computed, we have Y=XB+E, where B=WQ, and the prediction model is complete.
One additional matrix necessary for a complete description of partial least squares regression procedures is the p by cfactor loading matrix P which gives a factor model X=TP+F, where F is the unexplained part of the X scores. We now can describe the algorithms for computing partial least squares regression.
Resource : Statsoft