Objective

Wikipedia describes exploratory data analysis (EDA) as analysing datasets to summarise their main characteristics, often using statistical graphics and other data visualisation methods. In R for Data Science, Wickham and Grolemund suggest two questions that help make discoveries within data: what type of variation occurs within variables, and what type of covariation occurs between variables?

This vignette explores one categorical and one numerical variable. The categorical variable is job role, and the numerical variable is irrational ideas. The dataset comprises 616 respondents from 10 public and private sector organisations experiencing organisational change.

Workflow

The raw dataset was tidied prior to exploration. This included renaming variables, updating data types and checking for anomalies. Cases with an unworkable number of missing values, of which there were very few, were removed. Apart from routine reshaping of data, no other wrangling was required.
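The tidying steps described above could look roughly like the following sketch. It is not the code used for this vignette: the raw column names (Q1_role, Q2_ii, Q3_other), the toy values and the missing-value threshold (max_missing) are all assumptions made for illustration, and dplyr is used only because it appears as an attached package in the session information.

```r
library(dplyr)

# Hypothetical raw survey extract; names and values are illustrative only.
raw <- data.frame(
  Q1_role  = c("Employee", "Supervisor", NA, "Middle management"),
  Q2_ii    = c("3.5", "4.1", NA, "2.8"),
  Q3_other = c(1, NA, NA, 4),
  stringsAsFactors = FALSE
)

# Job role categories as they appear in the results tables.
role_levels <- c("Employee", "Supervisor", "Middle management",
                 "Executive/Senior management")

# Assumed cut-off: keep cases with at most this many missing values.
max_missing <- 1

tidy <- raw |>
  # Rename variables to descriptive names.
  rename(job_role = Q1_role, irrational_ideas = Q2_ii) |>
  # Update data types: categorical as factor, score as numeric.
  mutate(
    job_role = factor(job_role, levels = role_levels),
    irrational_ideas = as.numeric(irrational_ideas)
  ) |>
  # Remove cases with too many missing values.
  filter(rowSums(is.na(pick(everything()))) <= max_missing)

# A quick anomaly check: inspect ranges, factor counts and remaining NAs.
summary(tidy)
```

In this toy data only the third case, which is missing on every variable, is dropped. An appropriate threshold for real data depends on the survey design.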

The exploratory data analysis produces statistical summaries and visualisations for job role and irrational ideas. It concludes by exploring the extent of covariation between job role and irrational ideas with a descriptive summary and visualisations.

Results

1. Explore job role

Table 1 summarises the distribution of job role by count and proportion. Chart 1 visualises the distribution of job role.

Table 1 Job role statistical summary (ordered by frequency)
job role                       n     percent
Employee                       358   59%
Middle management              121   20%
Supervisor                     92    15%
Executive/Senior management    40    7%
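As a small worked check, the percentages in Table 1 can be recomputed from the counts. The sketch below uses base R only, and the object names are illustrative. The counts sum to 611, five fewer than the 616 respondents reported for the dataset; the source does not explain the difference. Because each percentage is rounded to a whole number, the column adds up to 101% rather than 100%.

```r
# Counts copied from Table 1.
job_role <- c(
  "Employee" = 358,
  "Middle management" = 121,
  "Supervisor" = 92,
  "Executive/Senior management" = 40
)

total   <- sum(job_role)        # respondents with a recorded job role
prop    <- job_role / total     # proportion in each role
percent <- round(100 * prop)    # whole-number percentages, as in Table 1

# Rebuild the table and show the effect of rounding on the column total.
data.frame(n = job_role, percent = paste0(percent, "%"))
sum(percent)
```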

2. Explore irrational ideas

Table 2 presents a statistical summary of irrational ideas.

Table 2 Irrational ideas statistical summary
                  vars  n    mean  sd    median  trimmed  mad   min  max  range  skew  kurtosis  se
irrational ideas  1     616  3.67  0.77  3.65    3.66     0.82  1.2  5.8  4.6    0.04  -0.19     0.03
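Two of the columns in Table 2 follow directly from the others: the range is the maximum minus the minimum, and the standard error (se) is the standard deviation divided by the square root of n. The sketch below reproduces both from the reported values in base R; the object names are illustrative. The column layout resembles the output of describe() from the psych package, which is attached in the session information, but the source does not name the function, so that is an assumption.

```r
# Values copied from Table 2.
n         <- 616
sd_score  <- 0.77
min_score <- 1.2
max_score <- 5.8

range_score <- max_score - min_score   # spread between the extremes
se_score    <- sd_score / sqrt(n)      # standard error of the mean

round(range_score, 1)
round(se_score, 2)
```

Both results agree with the range and se columns of Table 2. The mean (3.67) and median (3.65) are also close, and skew is near zero, which is consistent with a roughly symmetric distribution.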

Charts 2 and 3 visualise irrational ideas in a combined density histogram and a combined violin box plot, respectively.
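Charts of this kind could be drawn along the following lines with ggplot2, which is attached in the session information. This is a sketch rather than the vignette's plotting code: the data are simulated from a normal distribution using the mean and standard deviation in Table 2, so they are a stand-in and not the survey responses, and the bin count, colours and labels are arbitrary choices.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for the scores; NOT the study data.
df <- data.frame(irrational_ideas = rnorm(616, mean = 3.67, sd = 0.77))

# Combined density histogram: histogram on the density scale
# with a kernel density curve overlaid.
p_hist <- ggplot(df, aes(x = irrational_ideas)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", colour = "white") +
  geom_density() +
  labs(x = "Irrational ideas", y = "Density")

# Combined violin box plot: a narrow box plot drawn inside the violin.
p_violin <- ggplot(df, aes(x = "", y = irrational_ideas)) +
  geom_violin(fill = "grey80") +
  geom_boxplot(width = 0.1) +
  labs(x = NULL, y = "Irrational ideas")

p_hist
p_violin
```

Putting the histogram on the density scale lets the curve and the bars share one axis, and keeping the box plot narrow leaves the shape of the violin visible.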

3. Explore covariation

Table 3 is a descriptive summary of the level of irrational ideas by job role.

Table 3 Irrational ideas by job role statistical summary
job role                     attribute         count  prop (%)  mean  min   p0.25  median  p0.75  max
Employee                     irrational_ideas  358    58.59     3.76  1.70  3.22   3.74    4.30   5.50
Supervisor                   irrational_ideas  92     15.06     3.74  2.00  3.18   3.75    4.25   5.80
Middle management            irrational_ideas  121    19.80     3.54  1.50  3.10   3.50    4.00   5.75
Executive/Senior management  irrational_ideas  40     6.55      3.13  1.32  2.70   3.08    3.41   5.00
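The figures in Table 3 can be cross-checked against the earlier tables with a few lines of base R. The sketch below uses illustrative object names. It recomputes the prop column as a percentage of the 611 respondents with a recorded job role, forms the count-weighted mean of the group means, and measures the gap between the highest and lowest group means.

```r
# Counts and means copied from Table 3.
summary_by_role <- data.frame(
  job_role = c("Employee", "Supervisor", "Middle management",
               "Executive/Senior management"),
  count = c(358, 92, 121, 40),
  mean  = c(3.76, 3.74, 3.54, 3.13)
)

# Share of respondents in each role, in percent (the prop column).
summary_by_role$prop <- round(
  100 * summary_by_role$count / sum(summary_by_role$count), 2
)

# Count-weighted mean of the group means.
pooled_mean <- weighted.mean(summary_by_role$mean, summary_by_role$count)

# Gap between the highest and lowest group means.
mean_gap <- max(summary_by_role$mean) - min(summary_by_role$mean)

summary_by_role
round(pooled_mean, 2)
round(mean_gap, 2)
```

The weighted mean rounds to 3.67, which agrees with the overall mean in Table 2 even though that table covers all 616 respondents. The gap of 0.63 lies between the Employee and Executive/Senior management groups.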

Charts 4 and 5 illustrate the covariation between the level of irrational ideas and job role.

In this sample, executive/senior management respondents recorded lower irrational ideas scores (mean 3.13, median 3.08) than respondents in the other job roles (means from 3.54 to 3.76). Whether the relationship between job role and the level of irrational ideas is statistically significant is addressed in the vignette on categorical and numerical hypothesis testing.


Reference:

Irrational ideas were measured using the ‘Irrational belief scale’ developed by Malouff and Schutte, published in the Sourcebook of Adult Assessment Strategies, based on Ellis and Harper’s work, published in A New Guide to Rational Living.


Session information and package update

## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.0 (2024-04-24 ucrt)
##  os       Windows 11 x64 (build 22631)
##  system   x86_64, mingw32
##  ui       RTerm
##  language (EN)
##  collate  English_Australia.utf8
##  ctype    English_Australia.utf8
##  tz       Australia/Brisbane
##  date     2024-07-29
##  pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package      * version  date (UTC) lib source
##  bslib          0.7.0    2024-03-29 [1] CRAN (R 4.4.0)
##  cachem         1.1.0    2024-05-16 [1] CRAN (R 4.4.0)
##  cli            3.6.3    2024-06-21 [1] CRAN (R 4.4.1)
##  colorspace     2.1-0    2023-01-23 [1] CRAN (R 4.4.1)
##  data.table   * 1.15.4   2024-03-30 [1] CRAN (R 4.4.0)
##  devtools       2.4.5    2022-10-11 [1] CRAN (R 4.4.0)
##  digest         0.6.36   2024-06-23 [1] CRAN (R 4.4.1)
##  dplyr        * 1.1.4    2023-11-17 [1] CRAN (R 4.4.0)
##  ellipsis       0.3.2    2021-04-29 [1] CRAN (R 4.4.0)
##  evaluate       0.24.0   2024-06-10 [1] CRAN (R 4.4.0)
##  fansi          1.0.6    2023-12-08 [1] CRAN (R 4.4.0)
##  farver         2.1.2    2024-05-13 [1] CRAN (R 4.4.0)
##  fastmap        1.2.0    2024-05-15 [1] CRAN (R 4.4.0)
##  forcats      * 1.0.0    2023-01-29 [1] CRAN (R 4.4.0)
##  fs             1.6.4    2024-04-25 [1] CRAN (R 4.4.0)
##  generics       0.1.3    2022-07-05 [1] CRAN (R 4.4.0)
##  GGally         2.2.1    2024-02-14 [1] CRAN (R 4.4.0)
##  ggplot2      * 3.5.1    2024-04-23 [1] CRAN (R 4.4.0)
##  ggstats        0.6.0    2024-04-05 [1] CRAN (R 4.4.0)
##  glue           1.7.0    2024-01-09 [1] CRAN (R 4.4.0)
##  gridExtra      2.3      2017-09-09 [1] CRAN (R 4.4.0)
##  gtable         0.3.5    2024-04-22 [1] CRAN (R 4.4.0)
##  here         * 1.0.1    2020-12-13 [1] CRAN (R 4.4.0)
##  highr          0.11     2024-05-26 [1] CRAN (R 4.4.0)
##  hms            1.1.3    2023-03-21 [1] CRAN (R 4.4.0)
##  htmltools      0.5.8.1  2024-04-04 [1] CRAN (R 4.4.0)
##  htmlwidgets    1.6.4    2023-12-06 [1] CRAN (R 4.4.0)
##  httpuv         1.6.15   2024-03-26 [1] CRAN (R 4.4.0)
##  ISLR           1.4      2021-09-15 [1] CRAN (R 4.4.0)
##  jquerylib      0.1.4    2021-04-26 [1] CRAN (R 4.4.0)
##  jsonlite       1.8.8    2023-12-04 [1] CRAN (R 4.4.0)
##  kableExtra   * 1.4.0    2024-01-24 [1] CRAN (R 4.4.0)
##  knitr          1.48     2024-07-07 [1] CRAN (R 4.4.1)
##  labeling       0.4.3    2023-08-29 [1] CRAN (R 4.4.0)
##  later          1.3.2    2023-12-06 [1] CRAN (R 4.4.0)
##  lattice        0.22-6   2024-03-20 [2] CRAN (R 4.4.0)
##  lifecycle      1.0.4    2023-11-07 [1] CRAN (R 4.4.0)
##  lpSolve        5.6.20   2023-12-10 [1] CRAN (R 4.4.0)
##  lubridate    * 1.9.3    2023-09-27 [1] CRAN (R 4.4.0)
##  magrittr       2.0.3    2022-03-30 [1] CRAN (R 4.4.0)
##  MASS           7.3-60.2 2024-04-24 [2] local
##  memoise        2.0.1    2021-11-26 [1] CRAN (R 4.4.0)
##  mime           0.12     2021-09-28 [1] CRAN (R 4.4.0)
##  miniUI         0.1.1.1  2018-05-18 [1] CRAN (R 4.4.0)
##  mnormt         2.1.1    2022-09-26 [1] CRAN (R 4.4.0)
##  munsell        0.5.1    2024-04-01 [1] CRAN (R 4.4.0)
##  nlme           3.1-164  2023-11-27 [2] CRAN (R 4.4.0)
##  pillar         1.9.0    2023-03-22 [1] CRAN (R 4.4.0)
##  pkgbuild       1.4.4    2024-03-17 [1] CRAN (R 4.4.0)
##  pkgconfig      2.0.3    2019-09-22 [1] CRAN (R 4.4.0)
##  pkgload        1.4.0    2024-06-28 [1] CRAN (R 4.4.1)
##  plyr           1.8.9    2023-10-02 [1] CRAN (R 4.4.0)
##  profvis        0.3.8    2023-05-02 [1] CRAN (R 4.4.0)
##  promises       1.3.0    2024-04-05 [1] CRAN (R 4.4.0)
##  psych        * 2.4.6.26 2024-06-27 [1] CRAN (R 4.4.1)
##  purrr        * 1.0.2    2023-08-10 [1] CRAN (R 4.4.0)
##  R6             2.5.1    2021-08-19 [1] CRAN (R 4.4.0)
##  RColorBrewer   1.1-3    2022-04-03 [1] CRAN (R 4.4.0)
##  Rcpp           1.0.13   2024-07-17 [1] CRAN (R 4.4.1)
##  readr        * 2.1.5    2024-01-10 [1] CRAN (R 4.4.0)
##  remotes        2.5.0    2024-03-17 [1] CRAN (R 4.4.0)
##  rlang          1.1.4    2024-06-04 [1] CRAN (R 4.4.0)
##  rmarkdown      2.27     2024-05-17 [1] CRAN (R 4.4.0)
##  rprojroot      2.0.4    2023-11-05 [1] CRAN (R 4.4.0)
##  rstudioapi     0.16.0   2024-03-24 [1] CRAN (R 4.4.0)
##  sampling       2.10     2023-10-29 [1] CRAN (R 4.4.0)
##  sass           0.4.9    2024-03-15 [1] CRAN (R 4.4.0)
##  scales         1.3.0    2023-11-28 [1] CRAN (R 4.4.0)
##  sessioninfo    1.2.2    2021-12-06 [1] CRAN (R 4.4.0)
##  shiny          1.8.1.1  2024-04-02 [1] CRAN (R 4.4.0)
##  SmartEDA     * 0.3.10   2024-01-30 [1] CRAN (R 4.4.0)
##  stringi        1.8.4    2024-05-06 [1] CRAN (R 4.4.0)
##  stringr      * 1.5.1    2023-11-14 [1] CRAN (R 4.4.0)
##  svglite        2.1.3    2023-12-08 [1] CRAN (R 4.4.0)
##  systemfonts    1.1.0    2024-05-15 [1] CRAN (R 4.4.0)
##  tibble       * 3.2.1    2023-03-20 [1] CRAN (R 4.4.0)
##  tidyr        * 1.3.1    2024-01-24 [1] CRAN (R 4.4.0)
##  tidyselect     1.2.1    2024-03-11 [1] CRAN (R 4.4.0)
##  tidyverse    * 2.0.0    2023-02-22 [1] CRAN (R 4.4.0)
##  timechange     0.3.0    2024-01-18 [1] CRAN (R 4.4.0)
##  tzdb           0.4.0    2023-05-12 [1] CRAN (R 4.4.0)
##  urlchecker     1.0.1    2021-11-30 [1] CRAN (R 4.4.0)
##  usethis        2.2.3    2024-02-19 [1] CRAN (R 4.4.0)
##  utf8           1.2.4    2023-10-22 [1] CRAN (R 4.4.0)
##  vctrs          0.6.5    2023-12-01 [1] CRAN (R 4.4.0)
##  viridisLite    0.4.2    2023-05-02 [1] CRAN (R 4.4.0)
##  withr          3.0.0    2024-01-16 [1] CRAN (R 4.4.0)
##  xfun           0.46     2024-07-18 [1] CRAN (R 4.4.1)
##  xml2           1.3.6    2023-12-04 [1] CRAN (R 4.4.0)
##  xtable         1.8-4    2019-04-21 [1] CRAN (R 4.4.0)
##  yaml           2.3.9    2024-07-05 [1] CRAN (R 4.4.1)
## 
##  [1] C:/Users/wayne/AppData/Local/R/win-library/4.4
##  [2] C:/Program Files/R/R-4.4.0/library
## 
## ──────────────────────────────────────────────────────────────────────────────