Objective

According to IBM, logistic regression (also known as the logit model) is often used for classification and predictive analytics. Logistic regression estimates the probability of an event occurring based on a given data set of independent variables. Since the outcome is a probability, the dependent variable is bounded between 0 and 1.
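For reference, the standard form of the model described above can be written as follows. The notation is conventional rather than taken from the vignette: p is the probability of the event, x_1 to x_k are the explanatory variables, and the beta terms are the coefficients to be estimated.

```latex
\[
\mathrm{logit}(p) = \ln\!\left(\frac{p}{1-p}\right)
  = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k,
\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\]
```

The second expression is why the fitted value always falls between 0 and 1, whatever the value of the linear predictor.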

This vignette conducts a stepwise logistic regression on a data set of 208 respondents experiencing significant organisational change. Respondents reported self-efficacy, irrational ideas, maladaptive defence mechanisms, emotion, behavioural intentions and reaction towards change in their organisation.

This vignette has two objectives. First, model and identify statistically significant relationships between the outcome and explanatory variables. Second, predict outcomes and evaluate the accuracy of those predictions.

Workflow

The raw data set was wrangled and tidied before processing. Since this was a logistic regression, the outcome variable, a seven-point Likert scale, was replaced with a binary variable. Next, a brief exploratory analysis comprising a statistical summary, correlation, and comparative analysis was conducted to understand the variables.

A stepwise binomial logistic regression was then conducted to identify statistically significant explanatory variables, and fit statistics for the stepwise model were reviewed.

In the final section of this vignette, the model was used to predict outcomes. The model’s prediction performance was evaluated with a confusion matrix heatmap, model fit statistics and an ROC curve.

Results

1. Explore variables

Before building the logit model, the outcome variable (reaction to organisational change) was transformed from an interval variable to a binary variable. The following tables show the conversion of the outcome variable from a seven-point Likert scale (Table 1) to a binary scale (Table 2) with corresponding frequencies. Scale measures opposing change were coded as “0”, and measures supporting change were coded as “1”. The neutral measure on the Likert scale was dropped from the binary scale.

Table 1 Original seven-point Likert scale
Reaction to change Freq
Totally Oppose 73
Oppose 71
Partially Oppose 90
Neutral 124
Partially Support 189
Support 23
Totally Support 46

Table 2 New binary scale for logistic regression
Reaction to change Freq
0 184
1 359
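
The recoding described above can be sketched in base R as follows. This is a minimal illustration on hypothetical responses, not the vignette's own code: the variable names and the toy vector are assumptions, and the cut points (scale points 1 to 3 coded 0, 5 to 7 coded 1, the neutral midpoint dropped) follow the description in the text.

```r
# Seven-point scale in order from strongest opposition to strongest support
likert_levels <- c("Totally Oppose", "Oppose", "Partially Oppose", "Neutral",
                   "Partially Support", "Support", "Totally Support")

# Hypothetical responses, for illustration only
responses <- c("Oppose", "Neutral", "Support", "Totally Oppose",
               "Partially Support", "Neutral", "Totally Support")

reaction <- factor(responses, levels = likert_levels, ordered = TRUE)
score    <- as.integer(reaction)            # 1 (Totally Oppose) to 7 (Totally Support)

# Below the midpoint -> 0, above the midpoint -> 1, midpoint -> NA
binary <- ifelse(score < 4, 0L, ifelse(score > 4, 1L, NA_integer_))

recoded <- data.frame(reaction, binary)
recoded <- recoded[!is.na(recoded$binary), ]  # drop the neutral responses

table(recoded$binary)
```

Keeping the factor ordered makes the integer conversion reflect the scale position rather than alphabetical order, which is the usual source of recoding mistakes with Likert labels.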

The data set was then filtered to retain only those respondents who reported experiencing significant organisational change, leaving 208 respondents. Table 3 provides a statistical summary of the proposed explanatory variables.

Table 3 Statistical summary of explanatory variables
variable n mean sd median trimmed mad min max range skew kurtosis se
self_efficacy 208 5.55 0.81 5.65 5.63 0.78 1.71 6.94 5.24 −1.19 2.71 0.06
needs_approval 208 4.19 1.40 4.00 4.22 1.48 1.00 7.00 6.00 −0.14 −0.73 0.10
fears_failure 208 4.06 1.51 4.00 4.07 1.48 1.00 7.00 6.00 0.00 −0.83 0.10
labelling_blame 208 3.19 1.32 3.00 3.11 1.48 1.00 7.00 6.00 0.55 −0.31 0.09
catastrophising 208 3.70 1.32 3.50 3.66 1.48 1.00 7.00 6.00 0.15 −0.89 0.09
managing_feelings 208 3.88 1.31 4.00 3.88 1.48 1.00 6.50 5.50 −0.05 −0.72 0.09
anxious_thoughts 208 4.10 1.16 4.00 4.13 0.74 1.50 7.00 5.50 −0.17 −0.38 0.08
avoidance 208 2.25 0.99 2.00 2.14 0.74 1.00 5.50 4.50 1.05 0.90 0.07
past_influences 208 2.79 1.30 2.50 2.71 1.48 1.00 6.00 5.00 0.56 −0.76 0.09
facing_reality 208 4.54 1.44 5.00 4.60 1.48 1.00 7.00 6.00 −0.35 −0.80 0.10
passive_existence 208 4.23 1.24 4.00 4.26 1.48 1.00 7.00 6.00 −0.18 −0.46 0.09
dissociation 208 2.74 1.21 2.50 2.61 0.74 1.00 6.50 5.50 0.90 0.24 0.08
displacement 208 3.04 1.28 3.00 2.99 1.48 1.00 7.00 6.00 0.34 −0.50 0.09
isolation_of_affect 208 3.41 1.49 3.50 3.38 2.22 1.00 7.00 6.00 0.16 −0.92 0.10
reaction_formation 208 4.17 1.25 4.00 4.17 1.48 1.50 7.00 5.50 0.07 −0.65 0.09
denial 208 2.51 1.03 2.00 2.44 0.74 1.00 6.50 5.50 0.79 0.40 0.07
projection 208 2.49 1.11 2.00 2.39 0.74 1.00 6.00 5.00 0.88 0.09 0.08
passive_aggression 208 2.62 1.08 2.50 2.52 0.74 1.00 6.50 5.50 0.88 0.55 0.07
acting_out 208 3.55 1.34 3.50 3.54 1.48 1.00 6.50 5.50 0.04 −0.88 0.09
emotion 208 3.80 1.15 3.80 3.79 1.19 1.00 6.90 5.90 0.09 −0.32 0.08
behavioural_intentions 208 5.08 1.13 5.25 5.15 1.26 1.75 7.00 5.25 −0.57 −0.21 0.08
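
Several columns in Table 3 can be reproduced with base R functions. The sketch below uses a hypothetical vector of scores, not the survey data, and covers only a subset of the columns. It assumes the mad column is the scaled median absolute deviation that base R's mad() returns by default (constant 1.4826), which would explain why values such as 0.74, 1.48 and 2.22 recur in the table.

```r
# Hypothetical scores on a seven-point scale, for illustration only
x <- c(2, 3, 3, 4, 4, 4, 5, 5, 6, 7)

summary_stats <- c(
  n      = length(x),
  mean   = mean(x),
  sd     = sd(x),
  median = median(x),
  mad    = mad(x),                   # scaled by 1.4826 by default
  min    = min(x),
  max    = max(x),
  range  = max(x) - min(x),
  se     = sd(x) / sqrt(length(x))   # standard error of the mean
)
round(summary_stats, 2)

# Check against Table 3: se for self_efficacy from its reported sd and n
se_self_efficacy <- 0.81 / sqrt(208)
round(se_self_efficacy, 2)
```

The last line returns 0.06, matching the se reported for self_efficacy.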

Chart 1 supports Table 3, showing the correlation coefficients between each pair of explanatory variables.

Because of the number of explanatory variables under consideration, two separate pairs plots were prepared. Chart 2 compares the relationship between irrational ideas and the outcome variable, reaction to organisational change. Chart 3 compares the relationship between maladaptive defence mechanisms and the outcome variable, reaction to organisational change.

2. Stepwise logistic regression

A binomial stepwise logistic regression was conducted using both forward selection and backward elimination. Both approaches arrived at the same statistically significant explanatory variables.
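
A minimal sketch of forward selection and backward elimination is shown below. The vignette does not say which function was used, so this assumes base R's step(), which selects terms by AIC. The data are simulated, and the names toy, x1, x2 and x3 are hypothetical.

```r
set.seed(42)                                   # reproducible toy data
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)                                 # unrelated to the outcome
y  <- rbinom(n, 1, plogis(-0.5 + 1.2 * x1 - 0.8 * x2))
toy <- data.frame(y, x1, x2, x3)

# Starting points: intercept-only model and model with every candidate term
null_model <- glm(y ~ 1, family = binomial, data = toy)
full_model <- glm(y ~ x1 + x2 + x3, family = binomial, data = toy)

# Forward selection: start empty, add terms while AIC improves
forward <- step(null_model,
                scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3),
                direction = "forward", trace = 0)

# Backward elimination: start full, drop terms while AIC improves
backward <- step(full_model, direction = "backward", trace = 0)

formula(forward)
formula(backward)
```

Comparing the two formulas is a quick way to check whether both search directions settle on the same set of terms, as reported for the vignette's model.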

Table 4 summarises the coefficient estimates for the stepwise logistic regression model. Estimates are on the log-odds scale.

Table 4 Stepwise logistic regression coefficient estimates (arranged by p.value)
term estimate std.error statistic p.value
(Intercept) −27.3799 4.6754 −5.8562 0.0000
behavioural_intentions 3.0873 0.5860 5.2685 0.0000
emotion 2.3993 0.5055 4.7461 0.0000
reaction_formation −0.7021 0.2517 −2.7898 0.0053
past_influences 0.8061 0.2986 2.6996 0.0069
anxious_thoughts 0.6389 0.2959 2.1593 0.0308
avoidance 0.6689 0.3334 2.0062 0.0448
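
To show how the coefficients in Table 4 translate into a predicted probability, the sketch below applies them to the first respondent listed in Table 6. The coefficients and predictor values are copied from those tables. The object names are hypothetical, and the result is approximate because the published coefficients are rounded to four decimal places.

```r
# Coefficients from Table 4 (log-odds scale)
coefs <- c("(Intercept)"          = -27.3799,
           behavioural_intentions =   3.0873,
           emotion                =   2.3993,
           reaction_formation     =  -0.7021,
           past_influences        =   0.8061,
           anxious_thoughts       =   0.6389,
           avoidance              =   0.6689)

# First row of Table 6, with 1 standing in for the intercept
respondent <- c("(Intercept)"          = 1,
                behavioural_intentions = 4.40,
                emotion                = 2.95,
                reaction_formation     = 3.00,
                past_influences        = 4.50,
                anxious_thoughts       = 3.00,
                avoidance              = 1.00)

log_odds    <- sum(coefs * respondent[names(coefs)])  # linear predictor
probability <- plogis(log_odds)                       # inverse logit

# Odds ratios: multiplicative change in odds per one-unit increase
odds_ratios <- exp(coefs[-1])

round(c(log_odds = log_odds, probability = probability), 4)
round(odds_ratios, 2)
```

The linear predictor comes to about −2.61 and the probability to about 0.068, in line with the 0.0684 reported for that respondent in Table 6.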

In place of the coefficient of determination (R2) as a measure of fit, a pseudo-R2 value is adopted when the outcome variable is nominal or ordinal. There are several variants of pseudo-R2. Table 5 shows pseudo-R2 ranging from 0.6908 to 0.8996 for the selected variants.

Table 5 Pseudo-R2 variants for stepwise model
variant pseudo-R2
McFadden 0.6908
Nagelkerke 0.8199
VeallZimmermann 0.8409
McKelveyZavoina 0.8996
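
As an illustration of how one of these variants is computed, McFadden's pseudo-R2 compares the log-likelihood of the fitted model with that of an intercept-only model. The sketch below uses simulated data with hypothetical names and is not the vignette's own calculation.

```r
set.seed(1)                                    # reproducible toy data
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.3 + 1.5 * x))

fit  <- glm(y ~ x, family = binomial)          # model of interest
null <- glm(y ~ 1, family = binomial)          # intercept-only baseline

# McFadden: 1 - logLik(model) / logLik(null)
mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))

# For a 0/1 outcome the same value follows from the deviances
mcfadden_dev <- 1 - fit$deviance / fit$null.deviance

round(c(mcfadden = mcfadden, from_deviance = mcfadden_dev), 4)
```

A value near 0 means the explanatory variables add little over the baseline, while values approaching 1 indicate a much higher likelihood than the intercept-only model.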

3. Prediction and evaluation

3.1 Model prediction

The next step involved predicting each respondent’s reaction to organisational change. Table 6 shows the predicted probability of supporting change compared to the actual reaction in a small sample of cases, along with the values of the significant explanatory variables.

Table 6 Sample of model predictions
anxious_thoughts avoidance past_influences reaction_formation emotion behavioural_intentions reaction prediction
3.00 1.00 4.50 3.00 2.95 4.40 0 0.0684
5.50 2.00 5.50 4.00 2.45 4.40 0 0.1914
4.50 2.50 4.00 4.00 2.85 3.20 0 0.0033
5.50 1.50 3.50 2.00 2.05 5.05 0 0.2816
4.00 1.00 1.50 4.00 2.55 4.90 0 0.0109
6.00 3.00 5.50 4.00 4.50 6.25 1 1.0000
3.50 2.00 3.50 4.00 1.90 1.85 0 0.0000
5.50 1.50 2.00 4.50 4.80 5.85 1 0.9943
5.00 1.00 3.50 4.00 4.70 4.65 0 0.8936
6.00 1.50 2.00 3.50 2.70 4.90 0 0.1439
2.00 2.00 2.00 5.50 2.35 4.35 0 0.0004
4.00 1.00 2.00 3.00 3.80 5.05 1 0.5145
6.50 4.00 1.00 4.00 3.95 5.80 1 0.9921
4.00 1.50 4.00 2.50 3.60 5.10 1 0.8839
5.50 3.50 2.00 3.50 3.85 5.70 1 0.9886
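
Predicted probabilities become predicted classes once a cut-off is applied. The sketch below does this for the 15 cases in Table 6. The 0.5 cut-off is an assumption, since the vignette does not state the threshold it used, and the object names are hypothetical.

```r
# Actual reaction and predicted probability, copied from Table 6
sample_cases <- data.frame(
  reaction   = c(0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1),
  prediction = c(0.0684, 0.1914, 0.0033, 0.2816, 0.0109, 1.0000, 0.0000,
                 0.9943, 0.8936, 0.1439, 0.0004, 0.5145, 0.9921, 0.8839,
                 0.9886)
)

cutoff <- 0.5                                   # assumed threshold
sample_cases$predicted_class <- as.integer(sample_cases$prediction >= cutoff)

# Cross-tabulate predicted against actual classes
table(predicted = sample_cases$predicted_class,
      actual    = sample_cases$reaction)

n_correct <- sum(sample_cases$predicted_class == sample_cases$reaction)
n_correct
```

With this cut-off, 14 of the 15 sampled cases are classified correctly. The exception is the respondent with a predicted probability of 0.8936 whose actual reaction was 0.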

3.2 Model evaluation

The predictive performance of the stepwise model was evaluated with a confusion matrix, model statistics and ROC curve.

The confusion matrix in Chart 4 summarises predictions by comparing the predicted and actual responses for reaction to change. The confusion matrix shows good performance for the stepwise model, recording 92.8 per cent accuracy (true positives and true negatives combined). False positive (top left) and false negative (bottom right) predictions account for the remaining 7.2 per cent.

Table 7 summarises the stepwise model prediction performance.

Table 7 Summary of model prediction metrics
.metric .estimator .estimate
accuracy binary 0.9279
kap binary 0.8529
sens binary 0.9328
spec binary 0.9213
ppv binary 0.9407
npv binary 0.9111
mcc binary 0.8530
j_index binary 0.8541
bal_accuracy binary 0.9271
detection_prevalence binary 0.5673
precision binary 0.9407
recall binary 0.9328
f_meas binary 0.9367
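
The metrics in Table 7 all derive from the four cells of the confusion matrix. The sketch below uses cell counts back-calculated from the rounded values in Table 7 for 208 respondents. The source does not report these counts, so they are an assumption, as is the choice of which class is treated as positive. Metric names follow the labels in Table 7.

```r
# Cell counts inferred from Table 7 (assumption, not reported in the source)
tp <- 111   # true positives
fp <- 7     # false positives
fn <- 8     # false negatives
tn <- 82    # true negatives
n  <- tp + fp + fn + tn

accuracy     <- (tp + tn) / n
sens         <- tp / (tp + fn)                 # recall
spec         <- tn / (tn + fp)
ppv          <- tp / (tp + fp)                 # precision
npv          <- tn / (tn + fn)
f_meas       <- 2 * ppv * sens / (ppv + sens)
j_index      <- sens + spec - 1
bal_accuracy <- (sens + spec) / 2

# Cohen's kappa: agreement beyond what chance alone would produce
expected <- ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n^2
kap      <- (accuracy - expected) / (1 - expected)

# Matthews correlation coefficient
mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

round(c(accuracy = accuracy, kap = kap, sens = sens, spec = spec,
        ppv = ppv, npv = npv, mcc = mcc, j_index = j_index,
        bal_accuracy = bal_accuracy, f_meas = f_meas), 4)
```

Rounded to four decimal places, these counts reproduce the corresponding values in Table 7.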

The ROC curve (receiver operating characteristic curve) plots the true positive rate (sensitivity) against the false positive rate (1 − specificity) at all classification thresholds. AUC (area under the curve) measures the entire two-dimensional area underneath the ROC curve. It summarises how well a logistic regression model separates positive and negative outcomes across every possible threshold. An AUC from 0.9 to 1 is commonly regarded as “A” grade classification performance. Chart 5 illustrates the ROC curve for the stepwise model, which has an AUC of 0.972.
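
AUC can also be read as the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case. The base R sketch below uses that rank-based (Mann-Whitney) formulation on the 15 cases from Table 6 only, so its result describes that small sample and is not the 0.972 reported for the full model. Object names are hypothetical.

```r
# Actual reaction and predicted probability, copied from Table 6
reaction   <- c(0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1)
prediction <- c(0.0684, 0.1914, 0.0033, 0.2816, 0.0109, 1.0000, 0.0000,
                0.9943, 0.8936, 0.1439, 0.0004, 0.5145, 0.9921, 0.8839,
                0.9886)

n_pos <- sum(reaction == 1)
n_neg <- sum(reaction == 0)

# Rank all predictions, then sum the ranks of the positive cases
ranks    <- rank(prediction)
rank_sum <- sum(ranks[reaction == 1])

# Mann-Whitney U for the positives, scaled by the number of pos/neg pairs
auc_sample <- (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
auc_sample
```

For this sample, 52 of the 54 positive and negative pairs are ordered correctly, giving an AUC of about 0.963.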


References:

Self-efficacy was measured using the ‘Self-efficacy scale: Construction and validation’ by Sherer, Maddux, Mercandante, Prentice-Dunn and Rogers, published in Psychological Reports.
Irrational ideas were measured using the ‘Irrational belief scale’ developed by Malouff and Schutte, published in the Sourcebook of Adult Assessment Strategies, based on Ellis and Harper’s work, published in A New Guide to Rational Living.
Maladaptive defence mechanisms were measured using selected items from ‘The Defense Style Questionnaire’ by Andrews, Singh and Bond, published in The Journal of Nervous and Mental Disease.
Emotion was measured using ‘A semantic differential mood scale’ by Lorr and Wunderlich, published in the Journal of Clinical Psychology.


Session information and package update

## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.0 (2024-04-24 ucrt)
##  os       Windows 11 x64 (build 22631)
##  system   x86_64, mingw32
##  ui       RTerm
##  language (EN)
##  collate  English_Australia.utf8
##  ctype    English_Australia.utf8
##  tz       Australia/Brisbane
##  date     2024-07-30
##  pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package      * version    date (UTC) lib source
##  backports      1.5.0      2024-05-23 [1] CRAN (R 4.4.0)
##  boot           1.3-30     2024-02-26 [2] CRAN (R 4.4.0)
##  broom        * 1.0.6      2024-05-17 [1] CRAN (R 4.4.0)
##  bslib          0.7.0      2024-03-29 [1] CRAN (R 4.4.0)
##  cachem         1.1.0      2024-05-16 [1] CRAN (R 4.4.0)
##  cellranger     1.1.0      2016-07-27 [1] CRAN (R 4.4.0)
##  checkmate      2.3.1      2023-12-04 [1] CRAN (R 4.4.0)
##  class          7.3-22     2023-05-03 [2] CRAN (R 4.4.0)
##  cli            3.6.3      2024-06-21 [1] CRAN (R 4.4.1)
##  codetools      0.2-20     2024-03-31 [2] CRAN (R 4.4.0)
##  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.4.1)
##  cvms         * 1.6.1      2024-02-27 [1] CRAN (R 4.4.0)
##  data.table   * 1.15.4     2024-03-30 [1] CRAN (R 4.4.0)
##  DescTools    * 0.99.54    2024-02-03 [1] CRAN (R 4.4.0)
##  devtools       2.4.5      2022-10-11 [1] CRAN (R 4.4.0)
##  dials        * 1.2.1      2024-02-22 [1] CRAN (R 4.4.0)
##  DiceDesign     1.10       2023-12-07 [1] CRAN (R 4.4.0)
##  digest         0.6.36     2024-06-23 [1] CRAN (R 4.4.1)
##  dplyr        * 1.1.4      2023-11-17 [1] CRAN (R 4.4.0)
##  e1071          1.7-14     2023-12-06 [1] CRAN (R 4.4.0)
##  ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.4.0)
##  evaluate       0.24.0     2024-06-10 [1] CRAN (R 4.4.0)
##  Exact          3.3        2024-07-21 [1] CRAN (R 4.4.1)
##  expm           0.999-9    2024-01-11 [1] CRAN (R 4.4.0)
##  fansi          1.0.6      2023-12-08 [1] CRAN (R 4.4.0)
##  farver         2.1.2      2024-05-13 [1] CRAN (R 4.4.0)
##  fastmap        1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
##  fontawesome    0.5.2      2023-08-19 [1] CRAN (R 4.4.0)
##  forcats      * 1.0.0      2023-01-29 [1] CRAN (R 4.4.0)
##  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.4.0)
##  fs             1.6.4      2024-04-25 [1] CRAN (R 4.4.0)
##  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.4.0)
##  future         1.33.2     2024-03-26 [1] CRAN (R 4.4.0)
##  future.apply   1.11.2     2024-03-28 [1] CRAN (R 4.4.0)
##  generics       0.1.3      2022-07-05 [1] CRAN (R 4.4.0)
##  GGally       * 2.2.1      2024-02-14 [1] CRAN (R 4.4.0)
##  ggplot2      * 3.5.1      2024-04-23 [1] CRAN (R 4.4.0)
##  ggstats        0.6.0      2024-04-05 [1] CRAN (R 4.4.0)
##  gld            2.6.6      2022-10-23 [1] CRAN (R 4.4.0)
##  globals        0.16.3     2024-03-08 [1] CRAN (R 4.4.0)
##  glue           1.7.0      2024-01-09 [1] CRAN (R 4.4.0)
##  gower          1.0.1      2022-12-22 [1] CRAN (R 4.4.0)
##  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.4.0)
##  gt           * 0.11.0     2024-07-09 [1] CRAN (R 4.4.1)
##  gtable         0.3.5      2024-04-22 [1] CRAN (R 4.4.0)
##  gtExtras     * 0.5.0      2023-09-15 [1] CRAN (R 4.4.0)
##  hardhat        1.4.0      2024-06-02 [1] CRAN (R 4.4.0)
##  here         * 1.0.1      2020-12-13 [1] CRAN (R 4.4.0)
##  highr          0.11       2024-05-26 [1] CRAN (R 4.4.0)
##  hms            1.1.3      2023-03-21 [1] CRAN (R 4.4.0)
##  htmltools      0.5.8.1    2024-04-04 [1] CRAN (R 4.4.0)
##  htmlwidgets    1.6.4      2023-12-06 [1] CRAN (R 4.4.0)
##  httpuv         1.6.15     2024-03-26 [1] CRAN (R 4.4.0)
##  httr           1.4.7      2023-08-15 [1] CRAN (R 4.4.0)
##  infer        * 1.0.7      2024-03-25 [1] CRAN (R 4.4.0)
##  ipred          0.9-15     2024-07-18 [1] CRAN (R 4.4.1)
##  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.4.0)
##  jquerylib      0.1.4      2021-04-26 [1] CRAN (R 4.4.0)
##  jsonlite       1.8.8      2023-12-04 [1] CRAN (R 4.4.0)
##  knitr          1.48       2024-07-07 [1] CRAN (R 4.4.1)
##  labeling       0.4.3      2023-08-29 [1] CRAN (R 4.4.0)
##  later          1.3.2      2023-12-06 [1] CRAN (R 4.4.0)
##  lattice        0.22-6     2024-03-20 [2] CRAN (R 4.4.0)
##  lava           1.8.0      2024-03-05 [1] CRAN (R 4.4.0)
##  lhs            1.2.0      2024-06-30 [1] CRAN (R 4.4.1)
##  lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
##  listenv        0.9.1      2024-01-29 [1] CRAN (R 4.4.0)
##  lmom           3.0        2023-08-29 [1] CRAN (R 4.4.0)
##  lubridate    * 1.9.3      2023-09-27 [1] CRAN (R 4.4.0)
##  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
##  MASS           7.3-60.2   2024-04-24 [2] local
##  Matrix         1.7-0      2024-03-22 [2] CRAN (R 4.4.0)
##  memoise        2.0.1      2021-11-26 [1] CRAN (R 4.4.0)
##  mime           0.12       2021-09-28 [1] CRAN (R 4.4.0)
##  miniUI         0.1.1.1    2018-05-18 [1] CRAN (R 4.4.0)
##  mnormt         2.1.1      2022-09-26 [1] CRAN (R 4.4.0)
##  modeldata    * 1.4.0      2024-06-19 [1] CRAN (R 4.4.1)
##  munsell        0.5.1      2024-04-01 [1] CRAN (R 4.4.0)
##  mvtnorm        1.2-5      2024-05-21 [1] CRAN (R 4.4.0)
##  nlme           3.1-164    2023-11-27 [2] CRAN (R 4.4.0)
##  nnet           7.3-19     2023-05-03 [2] CRAN (R 4.4.0)
##  paletteer      1.6.0      2024-01-21 [1] CRAN (R 4.4.0)
##  parallelly     1.37.1     2024-02-29 [1] CRAN (R 4.4.0)
##  parsnip      * 1.2.1      2024-03-22 [1] CRAN (R 4.4.0)
##  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.4.0)
##  pkgbuild       1.4.4      2024-03-17 [1] CRAN (R 4.4.0)
##  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.4.0)
##  pkgload        1.4.0      2024-06-28 [1] CRAN (R 4.4.1)
##  plyr           1.8.9      2023-10-02 [1] CRAN (R 4.4.0)
##  pROC         * 1.18.5     2023-11-01 [1] CRAN (R 4.4.0)
##  prodlim        2024.06.25 2024-06-24 [1] CRAN (R 4.4.1)
##  profvis        0.3.8      2023-05-02 [1] CRAN (R 4.4.0)
##  promises       1.3.0      2024-04-05 [1] CRAN (R 4.4.0)
##  proxy          0.4-27     2022-06-09 [1] CRAN (R 4.4.0)
##  psych        * 2.4.6.26   2024-06-27 [1] CRAN (R 4.4.1)
##  purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.4.0)
##  R6             2.5.1      2021-08-19 [1] CRAN (R 4.4.0)
##  RColorBrewer   1.1-3      2022-04-03 [1] CRAN (R 4.4.0)
##  Rcpp           1.0.13     2024-07-17 [1] CRAN (R 4.4.1)
##  readr        * 2.1.5      2024-01-10 [1] CRAN (R 4.4.0)
##  readxl         1.4.3      2023-07-06 [1] CRAN (R 4.4.0)
##  recipes      * 1.1.0      2024-07-04 [1] CRAN (R 4.4.1)
##  rematch2       2.1.2      2020-05-01 [1] CRAN (R 4.4.0)
##  remotes        2.5.0      2024-03-17 [1] CRAN (R 4.4.0)
##  rlang          1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
##  rmarkdown      2.27       2024-05-17 [1] CRAN (R 4.4.0)
##  rootSolve      1.8.2.4    2023-09-21 [1] CRAN (R 4.4.0)
##  rpart          4.1.23     2023-12-05 [2] CRAN (R 4.4.0)
##  rprojroot      2.0.4      2023-11-05 [1] CRAN (R 4.4.0)
##  rsample      * 1.2.1      2024-03-25 [1] CRAN (R 4.4.0)
##  rstudioapi     0.16.0     2024-03-24 [1] CRAN (R 4.4.0)
##  sass           0.4.9      2024-03-15 [1] CRAN (R 4.4.0)
##  scales       * 1.3.0      2023-11-28 [1] CRAN (R 4.4.0)
##  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.4.0)
##  shiny          1.8.1.1    2024-04-02 [1] CRAN (R 4.4.0)
##  stringi        1.8.4      2024-05-06 [1] CRAN (R 4.4.0)
##  stringr      * 1.5.1      2023-11-14 [1] CRAN (R 4.4.0)
##  survival       3.5-8      2024-02-14 [2] CRAN (R 4.4.0)
##  tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.4.0)
##  tidymodels   * 1.2.0      2024-03-25 [1] CRAN (R 4.4.0)
##  tidyr        * 1.3.1      2024-01-24 [1] CRAN (R 4.4.0)
##  tidyselect     1.2.1      2024-03-11 [1] CRAN (R 4.4.0)
##  tidyverse    * 2.0.0      2023-02-22 [1] CRAN (R 4.4.0)
##  timechange     0.3.0      2024-01-18 [1] CRAN (R 4.4.0)
##  timeDate       4032.109   2023-12-14 [1] CRAN (R 4.4.0)
##  tune         * 1.2.1      2024-04-18 [1] CRAN (R 4.4.0)
##  tzdb           0.4.0      2023-05-12 [1] CRAN (R 4.4.0)
##  urlchecker     1.0.1      2021-11-30 [1] CRAN (R 4.4.0)
##  usethis        2.2.3      2024-02-19 [1] CRAN (R 4.4.0)
##  utf8           1.2.4      2023-10-22 [1] CRAN (R 4.4.0)
##  vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
##  withr          3.0.0      2024-01-16 [1] CRAN (R 4.4.0)
##  workflows    * 1.1.4      2024-02-19 [1] CRAN (R 4.4.0)
##  workflowsets * 1.1.0      2024-03-21 [1] CRAN (R 4.4.0)
##  xfun           0.46       2024-07-18 [1] CRAN (R 4.4.1)
##  xml2           1.3.6      2023-12-04 [1] CRAN (R 4.4.0)
##  xtable         1.8-4      2019-04-21 [1] CRAN (R 4.4.0)
##  yaml           2.3.9      2024-07-05 [1] CRAN (R 4.4.1)
##  yardstick    * 1.3.1      2024-03-21 [1] CRAN (R 4.4.0)
## 
##  [1] C:/Users/wayne/AppData/Local/R/win-library/4.4
##  [2] C:/Program Files/R/R-4.4.0/library
## 
## ──────────────────────────────────────────────────────────────────────────────