Principal Component Analysis (PCA)
Principal component analysis (PCA) is a method of dimensionality reduction. It aims to reduce the number of variables or features in a data set while preserving as much information or variation as possible. PCA is widely applied to numerical data sets containing numerous interrelated features or variables.
The vignette conducts a PCA on a new scale created by Bovey Management. The scale measures an individual’s behavioural intentions towards organisational change. The goal is to reduce the number of original variables and reveal only the most important variables that explain the variation in the data set.
The original scale for behavioural intentions was conceptualised on two latent factors, support and resistance to organisational change, across four dimensions illustrated in Figure 1.
Figure 1 Conceptual framework for behavioural intentions
The survey instrument consisted of 20 items, with data recorded on a seven-point Likert scale. Each item incorporated one of the keywords (or variables) in Figure 1. The scale was administered across 10 public and private sector organisations undertaking major organisational change projects, gathering 599 usable responses for analysis.
The raw data set was tidied before processing. This involved recoding reverse-scored items, removing missing values, and relabelling and reshaping data as required. A brief exploratory analysis, consisting of a statistical summary and distribution and correlation analyses, was then conducted to understand the manifest variables.
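Recoding a reverse-scored item on a seven-point scale amounts to mirroring each response around the scale midpoint. The following is a minimal sketch in Python rather than the vignette's own R code. The function names and sample responses are illustrative assumptions, and missing values are assumed to be handled by dropping any respondent with an unanswered item.

```python
def reverse_score(responses, scale_min=1, scale_max=7):
    """Mirror Likert responses around the midpoint: 1 <-> 7, 2 <-> 6, ..."""
    return [scale_min + scale_max - r for r in responses]


def drop_missing(rows):
    """Keep only respondents (rows) with no missing (None) answers."""
    return [row for row in rows if all(value is not None for value in row)]


# Hypothetical responses to one reverse-scored item (not from the data set).
raw = [1, 2, 4, 7, 6]
print(reverse_score(raw))  # [7, 6, 4, 1, 2]

# Hypothetical respondents; the second has an unanswered item and is dropped.
print(drop_missing([[5, 6], [None, 3], [2, 7]]))  # [[5, 6], [2, 7]]
```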
An initial PCA was conducted to determine a preferred number of components to extract from the data set. The PCA then focussed on exploring the variables in Figure 1 regarding the quality of representation and contribution to the components. An alternative PCA methodology was implemented to verify that the most important variables had been identified. The results of both methods were then compared to extract a reduced set of the most important variables.
Respondents were then briefly explored to identify similarities. The results from exploring the variables and respondents were combined and enriched by incorporating a response factor variable, reaction to change. To conclude, this outcome was supported by a 3D interactive visualisation.
Chart 1 illustrates the distribution of responses, following recoding, for each of the 20 manifest variables.
Table 1 is a statistical summary for each of the 20 manifest variables.
| variable | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| undermine | 599 | 5.98 | 1.30 | 6 | 6.19 | 1.48 | 1 | 7 | 6 | -1.38 | 1.63 | 0.05 |
| dismantle | 599 | 5.90 | 1.44 | 6 | 6.15 | 1.48 | 1 | 7 | 6 | -1.40 | 1.43 | 0.06 |
| obstruct | 599 | 5.88 | 1.38 | 6 | 6.10 | 1.48 | 1 | 7 | 6 | -1.31 | 1.27 | 0.06 |
| stall | 599 | 5.87 | 1.46 | 6 | 6.10 | 1.48 | 1 | 7 | 6 | -1.26 | 0.84 | 0.06 |
| ignore | 599 | 5.79 | 1.40 | 6 | 6.02 | 1.48 | 1 | 7 | 6 | -1.28 | 1.15 | 0.06 |
| refrain | 599 | 5.75 | 1.44 | 6 | 5.99 | 1.48 | 1 | 7 | 6 | -1.34 | 1.33 | 0.06 |
| comply | 599 | 5.73 | 1.19 | 6 | 5.88 | 1.48 | 1 | 7 | 6 | -1.32 | 2.22 | 0.05 |
| avoid | 599 | 5.71 | 1.43 | 6 | 5.94 | 1.48 | 1 | 7 | 6 | -1.25 | 1.02 | 0.06 |
| accept | 599 | 5.64 | 1.34 | 6 | 5.84 | 1.48 | 1 | 7 | 6 | -1.43 | 2.15 | 0.05 |
| withdraw | 599 | 5.60 | 1.43 | 6 | 5.79 | 1.48 | 1 | 7 | 6 | -1.03 | 0.51 | 0.06 |
| cooperate | 599 | 5.60 | 1.34 | 6 | 5.79 | 1.48 | 1 | 7 | 6 | -1.31 | 1.80 | 0.05 |
| agree | 599 | 4.98 | 1.61 | 5 | 5.16 | 1.48 | 1 | 7 | 6 | -0.83 | 0.10 | 0.07 |
| oppose | 599 | 4.86 | 1.87 | 5 | 4.99 | 2.97 | 1 | 7 | 6 | -0.39 | -1.18 | 0.08 |
| give_in | 599 | 4.72 | 1.67 | 5 | 4.86 | 1.48 | 1 | 7 | 6 | -0.71 | -0.21 | 0.07 |
| argue | 599 | 4.71 | 1.79 | 5 | 4.80 | 2.97 | 1 | 7 | 6 | -0.28 | -1.14 | 0.07 |
| embrace | 599 | 4.31 | 1.75 | 4 | 4.39 | 1.48 | 1 | 7 | 6 | -0.37 | -0.83 | 0.07 |
| support | 599 | 4.23 | 1.84 | 4 | 4.28 | 2.97 | 1 | 7 | 6 | -0.22 | -1.08 | 0.08 |
| initiate | 599 | 4.11 | 1.84 | 4 | 4.16 | 2.97 | 1 | 7 | 6 | -0.26 | -1.00 | 0.08 |
| observe | 599 | 4.00 | 1.75 | 4 | 3.96 | 1.48 | 1 | 7 | 6 | 0.10 | -1.03 | 0.07 |
| wait | 599 | 3.65 | 1.74 | 4 | 3.59 | 2.97 | 1 | 7 | 6 | 0.32 | -0.87 | 0.07 |
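
The se column is the standard error of the mean, which follows directly from sd and n. As a quick consistency check on Table 1, the sketch below (Python, with a helper name chosen for illustration) recomputes it from the rounded values reported for two of the variables.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)


# sd and n copied from Table 1 for 'undermine' and 'oppose'.
print(round(standard_error(1.30, 599), 2))  # 0.05
print(round(standard_error(1.87, 599), 2))  # 0.08
```

Both values agree with the se column, as expected for 599 complete responses.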
Chart 2 is a box plot of the 20 manifest variables measuring behavioural intentions towards organisational change. After recoding, the median ranges from “4” to “6”.
Chart 3 presents a correlation heatmap of behavioural intentions. The correlation coefficients range from -0.27 to 0.72.
Output from the parallel analysis in Chart 4 suggests three principal components for this data set. This is supported by the eigenvalues: each of the first three dimensions has an eigenvalue (the variance it explains) greater than one.
Chart 5 shows that the first three dimensions explain approximately 60 per cent of the variation.
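The link between eigenvalues and explained variation can be made concrete. Assuming the PCA was run on standardised variables, each variable carries one unit of variance, so the eigenvalues sum to the number of variables and an eigenvalue above one marks a component that explains more than a single original variable. The Python sketch below uses hypothetical eigenvalues for a five-variable example, not the vignette's results, and the function names are illustrative.

```python
def explained_variance(eigenvalues):
    """Percentage of total variance explained by each component."""
    total = sum(eigenvalues)
    return [100 * ev / total for ev in eigenvalues]


def kaiser_count(eigenvalues):
    """Number of components with an eigenvalue greater than one."""
    return sum(1 for ev in eigenvalues if ev > 1)


# Hypothetical eigenvalues for a 5-variable PCA (they sum to 5).
eigenvalues = [2.5, 1.25, 0.75, 0.25, 0.25]
percentages = explained_variance(eigenvalues)
print(kaiser_count(eigenvalues))  # 2
print(percentages[:2])            # [50.0, 25.0]
```

By the same arithmetic, with 20 standardised variables, three components explaining approximately 60 per cent of the variation would have eigenvalues summing to roughly 12.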
Squared cosine (cos2) measures the quality of representation of variables for a given principal component. Chart 6 illustrates the strength of cos2 for all variables in each dimension.
Chart 6 Visualise quality of representation with cos2
Chart 7 is an ordered bar chart of cos2 for each variable.
Variable quality of representation can also be visualised with a correlation circle factor map illustrated in Charts 8 and 9, which show the relationship between dimensions 1 and 2 and dimensions 2 and 3, respectively. These charts show that positively correlated variables group together. In contrast, negatively correlated variables are positioned on opposite sides of the plot origin. Furthermore, the closer a variable is to the circle of correlations (the circumference), the better its representation on the factor map and the more important this variable is to the component. Conversely, variables close to the centre of the factor map are less important to the components they represent.
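The geometry behind cos2 and the correlation circle can be sketched in a few lines. Assuming a PCA on standardised variables, a variable's coordinate on a component is its correlation with that component, cos2 is the square of that coordinate, and the arrow length on a two-dimensional factor map is the square root of the summed cos2 for the two plotted dimensions. The coordinates below are hypothetical and the function names are illustrative (Python, not the vignette's R code).

```python
import math


def cos2(coordinates):
    """Squared coordinates: quality of representation on each component."""
    return [c ** 2 for c in coordinates]


def arrow_length(coordinates, dims=(0, 1)):
    """Distance from the origin on the plane spanned by two components."""
    return math.sqrt(sum(coordinates[d] ** 2 for d in dims))


# Hypothetical coordinates of one variable on dimensions 1 to 3.
coords = [0.8, -0.5, 0.1]
print([round(q, 2) for q in cos2(coords)])  # [0.64, 0.25, 0.01]
print(round(arrow_length(coords), 3))       # 0.943
```

An arrow length close to 1 means the variable sits near the circumference and is well represented on that plane. In this example the remaining cos2 of about 0.10 lies on components that are not listed.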
The importance of a variable can be measured by its percentage contribution to a principal component. The bar chart in Chart 10 orders variables by their percentage contribution to the components, with the red line indicating the expected average contribution. In terms of dimensionality reduction, variables with a below-average contribution could be eligible for removal.
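For a single component, a variable's contribution is commonly computed as its cos2 divided by the sum of cos2 over all variables on that component, expressed as a percentage. If every variable contributed equally, each would contribute 100 divided by the number of variables, which is 5 per cent for 20 variables. The Python sketch below illustrates this under those assumptions, using placeholder variable names and hypothetical cos2 values rather than the vignette's results.

```python
def contributions(cos2_values):
    """Percentage contribution of each variable to one component."""
    total = sum(cos2_values.values())
    return {name: 100 * value / total for name, value in cos2_values.items()}


def above_average(contrib):
    """Variables whose contribution exceeds the uniform share, 100 / p."""
    threshold = 100 / len(contrib)
    return [name for name, value in contrib.items() if value > threshold]


# Hypothetical cos2 values for five placeholder variables on one component.
cos2_dim1 = {
    "item_a": 0.5,
    "item_b": 0.25,
    "item_c": 0.125,
    "item_d": 0.0625,
    "item_e": 0.0625,
}
contrib = contributions(cos2_dim1)
print(contrib["item_a"])       # 50.0
print(above_average(contrib))  # ['item_a', 'item_b']
```

When contributions are pooled over several components, the reference line is typically weighted by each component's eigenvalue, so the simple 100 / p threshold here applies to one component at a time.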
To verify the results in Chart 10, a second PCA was conducted using an alternative method for comparison. Chart 11 shows the results of this PCA, which aimed to extract three principal components from the data set with a loading cut-off of 0.70. No variables loaded onto PC3 at this cut-off.
The variables loading onto principal components in Chart 11 (with a 0.70 cut-off) were very similar to those with an above-average contribution in Chart 10. The variables in common for both PCA methods were extracted and listed in Figure 2.
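The comparison of the two methods amounts to filtering loadings by the cut-off and intersecting the result with the above-average contributors. A minimal Python sketch follows. It assumes the cut-off is applied to the absolute loading and is inclusive, and it uses placeholder variable names with hypothetical loadings rather than the values behind Charts 10 and 11.

```python
def salient_variables(loadings, cutoff=0.70):
    """Variables whose absolute loading on any component meets the cut-off."""
    return {
        name
        for name, row in loadings.items()
        if any(abs(value) >= cutoff for value in row)
    }


# Hypothetical loadings on three components for placeholder variables.
loadings = {
    "item_a": [0.82, 0.10, 0.05],
    "item_b": [0.15, -0.76, 0.20],
    "item_c": [0.55, 0.40, 0.30],
}
by_loading = salient_variables(loadings)

# Hypothetical set of variables with an above-average contribution.
by_contribution = {"item_a", "item_c"}

common = sorted(by_loading & by_contribution)
print(sorted(by_loading))  # ['item_a', 'item_b']
print(common)              # ['item_a']
```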
Figure 2 Extracted behavioural intentions variables from PCA
Chart 12 shows PCA results for individual respondents summarised by squared cosine (cos2). Similar respondents are grouped together in the chart. Respondent IDs are removed from the chart to reduce clutter.
After exploring PCA for variables and individual respondents, the next step was to combine these results in Chart 13.
Chart 14 is a 2D visualisation of respondent behavioural intentions, incorporating reaction to organisational change, with overlapping ellipses at a 0.95 confidence level.
To complement Chart 14, Chart 15 is a 3D visualisation of PCA, assigning respondents by their reaction to change. This interactive chart, when rotated, provides more insights into individual reactions to organisational change across the three principal components.
Implementing PCA on the behavioural intentions scale has extracted the most important variables. During this dimensionality reduction process, the number of variables was reduced from the original 20 (in Figure 1) to 11 (in Figure 2), a 45 per cent decrease in the number of variables employed.
To compare the results of this PCA on the behavioural intentions scale with another dimensionality reduction method, review the vignette on Exploratory Factor Analysis (EFA).
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.4.0 (2024-04-24 ucrt)
## os Windows 11 x64 (build 22631)
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate English_Australia.utf8
## ctype English_Australia.utf8
## tz Australia/Brisbane
## date 2024-07-29
## pandoc 3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-5 2016-07-21 [1] CRAN (R 4.4.0)
## backports 1.5.0 2024-05-23 [1] CRAN (R 4.4.0)
## broom 1.0.6 2024-05-17 [1] CRAN (R 4.4.0)
## bslib 0.7.0 2024-03-29 [1] CRAN (R 4.4.0)
## cachem 1.1.0 2024-05-16 [1] CRAN (R 4.4.0)
## car 3.1-2 2023-03-30 [1] CRAN (R 4.4.0)
## carData 3.0-5 2022-01-06 [1] CRAN (R 4.4.0)
## cli 3.6.3 2024-06-21 [1] CRAN (R 4.4.1)
## cluster 2.1.6 2023-12-01 [2] CRAN (R 4.4.0)
## coda 0.19-4.1 2024-01-31 [1] CRAN (R 4.4.0)
## codetools 0.2-20 2024-03-31 [2] CRAN (R 4.4.0)
## colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.4.1)
## corrplot * 0.92 2021-11-18 [1] CRAN (R 4.4.0)
## crosstalk 1.2.1 2023-11-23 [1] CRAN (R 4.4.0)
## data.table * 1.15.4 2024-03-30 [1] CRAN (R 4.4.0)
## devtools 2.4.5 2022-10-11 [1] CRAN (R 4.4.0)
## digest 0.6.36 2024-06-23 [1] CRAN (R 4.4.1)
## dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.4.0)
## DT 0.33 2024-04-04 [1] CRAN (R 4.4.0)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.4.0)
## emmeans 1.10.3 2024-07-01 [1] CRAN (R 4.4.1)
## estimability 1.5.1 2024-05-12 [1] CRAN (R 4.4.0)
## evaluate 0.24.0 2024-06-10 [1] CRAN (R 4.4.0)
## factoextra * 1.0.7 2020-04-01 [1] CRAN (R 4.4.0)
## FactoMineR * 2.11 2024-04-20 [1] CRAN (R 4.4.0)
## fansi 1.0.6 2023-12-08 [1] CRAN (R 4.4.0)
## farver 2.1.2 2024-05-13 [1] CRAN (R 4.4.0)
## fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.0)
## flashClust 1.01-2 2012-08-21 [1] CRAN (R 4.4.0)
## forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.4.0)
## fs 1.6.4 2024-04-25 [1] CRAN (R 4.4.0)
## generics 0.1.3 2022-07-05 [1] CRAN (R 4.4.0)
## GGally * 2.2.1 2024-02-14 [1] CRAN (R 4.4.0)
## ggplot2 * 3.5.1 2024-04-23 [1] CRAN (R 4.4.0)
## ggpubr * 0.6.0 2023-02-10 [1] CRAN (R 4.4.0)
## ggrepel 0.9.5 2024-01-10 [1] CRAN (R 4.4.0)
## ggsignif 0.6.4 2022-10-13 [1] CRAN (R 4.4.0)
## ggstats 0.6.0 2024-04-05 [1] CRAN (R 4.4.0)
## glue 1.7.0 2024-01-09 [1] CRAN (R 4.4.0)
## gridExtra 2.3 2017-09-09 [1] CRAN (R 4.4.0)
## gtable 0.3.5 2024-04-22 [1] CRAN (R 4.4.0)
## here * 1.0.1 2020-12-13 [1] CRAN (R 4.4.0)
## highr 0.11 2024-05-26 [1] CRAN (R 4.4.0)
## hms 1.1.3 2023-03-21 [1] CRAN (R 4.4.0)
## htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
## htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.4.0)
## httpuv 1.6.15 2024-03-26 [1] CRAN (R 4.4.0)
## httr 1.4.7 2023-08-15 [1] CRAN (R 4.4.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.4.0)
## jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.4.0)
## kableExtra * 1.4.0 2024-01-24 [1] CRAN (R 4.4.0)
## knitr 1.48 2024-07-07 [1] CRAN (R 4.4.1)
## labeling 0.4.3 2023-08-29 [1] CRAN (R 4.4.0)
## later 1.3.2 2023-12-06 [1] CRAN (R 4.4.0)
## lattice 0.22-6 2024-03-20 [2] CRAN (R 4.4.0)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.4.0)
## leaps 3.2 2024-06-10 [1] CRAN (R 4.4.1)
## lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.4.0)
## likert 1.3.5 2016-12-31 [1] CRAN (R 4.4.0)
## lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.4.0)
## magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0)
## MASS 7.3-60.2 2024-04-24 [2] local
## Matrix 1.7-0 2024-03-22 [2] CRAN (R 4.4.0)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.4.0)
## mime 0.12 2021-09-28 [1] CRAN (R 4.4.0)
## miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0)
## mnormt 2.1.1 2022-09-26 [1] CRAN (R 4.4.0)
## multcomp 1.4-26 2024-07-18 [1] CRAN (R 4.4.1)
## multcompView 0.1-10 2024-03-08 [1] CRAN (R 4.4.0)
## munsell 0.5.1 2024-04-01 [1] CRAN (R 4.4.0)
## mvtnorm 1.2-5 2024-05-21 [1] CRAN (R 4.4.0)
## nlme 3.1-164 2023-11-27 [2] CRAN (R 4.4.0)
## pillar 1.9.0 2023-03-22 [1] CRAN (R 4.4.0)
## pkgbuild 1.4.4 2024-03-17 [1] CRAN (R 4.4.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.4.0)
## pkgload 1.4.0 2024-06-28 [1] CRAN (R 4.4.1)
## plotly * 4.10.4 2024-01-13 [1] CRAN (R 4.4.0)
## plyr 1.8.9 2023-10-02 [1] CRAN (R 4.4.0)
## profvis 0.3.8 2023-05-02 [1] CRAN (R 4.4.0)
## promises 1.3.0 2024-04-05 [1] CRAN (R 4.4.0)
## psych * 2.4.6.26 2024-06-27 [1] CRAN (R 4.4.1)
## purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.4.0)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.0)
## RColorBrewer * 1.1-3 2022-04-03 [1] CRAN (R 4.4.0)
## Rcpp 1.0.13 2024-07-17 [1] CRAN (R 4.4.1)
## readr * 2.1.5 2024-01-10 [1] CRAN (R 4.4.0)
## remotes 2.5.0 2024-03-17 [1] CRAN (R 4.4.0)
## reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.4.0)
## rlang 1.1.4 2024-06-04 [1] CRAN (R 4.4.0)
## rmarkdown 2.27 2024-05-17 [1] CRAN (R 4.4.0)
## rprojroot 2.0.4 2023-11-05 [1] CRAN (R 4.4.0)
## rstatix 0.7.2 2023-02-01 [1] CRAN (R 4.4.0)
## rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.4.0)
## sandwich 3.1-0 2023-12-11 [1] CRAN (R 4.4.0)
## sass 0.4.9 2024-03-15 [1] CRAN (R 4.4.0)
## scales 1.3.0 2023-11-28 [1] CRAN (R 4.4.0)
## scatterplot3d 0.3-44 2023-05-05 [1] CRAN (R 4.4.0)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0)
## shiny 1.8.1.1 2024-04-02 [1] CRAN (R 4.4.0)
## stringi 1.8.4 2024-05-06 [1] CRAN (R 4.4.0)
## stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.4.0)
## survival 3.5-8 2024-02-14 [2] CRAN (R 4.4.0)
## svglite 2.1.3 2023-12-08 [1] CRAN (R 4.4.0)
## systemfonts 1.1.0 2024-05-15 [1] CRAN (R 4.4.0)
## TH.data 1.1-2 2023-04-17 [1] CRAN (R 4.4.0)
## tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.4.0)
## tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.4.0)
## tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.4.0)
## tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.4.0)
## timechange 0.3.0 2024-01-18 [1] CRAN (R 4.4.0)
## tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.4.0)
## urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.4.0)
## usethis 2.2.3 2024-02-19 [1] CRAN (R 4.4.0)
## utf8 1.2.4 2023-10-22 [1] CRAN (R 4.4.0)
## vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.4.0)
## viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.4.0)
## withr 3.0.0 2024-01-16 [1] CRAN (R 4.4.0)
## xfun 0.46 2024-07-18 [1] CRAN (R 4.4.1)
## xml2 1.3.6 2023-12-04 [1] CRAN (R 4.4.0)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.4.0)
## yaml 2.3.9 2024-07-05 [1] CRAN (R 4.4.1)
## zoo 1.8-12 2023-04-13 [1] CRAN (R 4.4.0)
##
## [1] C:/Users/wayne/AppData/Local/R/win-library/4.4
## [2] C:/Program Files/R/R-4.4.0/library
##
## ──────────────────────────────────────────────────────────────────────────────