Part 42 Read in a sample SPSS file.

Let’s load a sample SPSS file and work with it. Download the file from here and save it in your data folder in your class GitHub repo.

These data are a random subset of the data used in this paper. The study examined which personality traits distinguish C-level executives from lower-level managers, among both men and women.

The subset of data here consists of 200 cases, with variables indicating:

  1. The language of assessment (English, Dutch, or French)
  2. Gender
  3. Whether the person is C-level or not
  4. Extraversion level, as well as four facet traits (Leading, Communion, Persuasive, Motivating)

Let’s load in the data using the haven package.

# Wrapping the assignment in parentheses also prints the result.
(clevel <- haven::read_spss(
  here::here("participation", "data", "clevel.sav")
))

Notice that the language and gender variables in this tibble look a little different from ordinary columns: each numeric value carries a label. This is how SPSS stores categorical variables, but it’s not the standard representation in R. Let’s convert those variables, and isClevel as well, to factors:

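library(dplyr)  # mutate()
library(haven)  # as_factor() for labelled variables

clevel_cleaned <-
  clevel |> 
  mutate(
    # as_factor() replaces each numeric code with its SPSS value label
    language = as_factor(language),
    gender = as_factor(gender),
    # isClevel is coded 0/1, so supply the labels by hand
    isClevel = factor(isClevel, 
                      levels = c(0, 1), 
                      labels = c("No", "Yes"))
  )
print(clevel_cleaned)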

Notice how the variables are now factors with the labels as their entries, instead of the original numeric codes.
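
To see what the conversion does without needing the data file, here is a minimal sketch on toy vectors. The numeric codes (1/2/3 for language, 0/1 for C-level) and the object names are assumptions made up for illustration; they are not read from clevel.sav, and the real file may code things differently.

```r
library(haven)

# Hypothetical toy data: the codes below are assumptions for
# illustration, not values taken from clevel.sav.
language <- labelled(
  c(1, 2, 3, 1),
  labels = c(English = 1, Dutch = 2, French = 3)
)
is_clevel <- c(0, 1, 1, 0)

# A labelled vector is still numeric underneath; the value labels
# are stored in its "labels" attribute.
attr(language, "labels")

# as_factor() swaps each numeric code for its label.
language_fct <- as_factor(language)

# factor() with explicit levels and labels does the same job for a
# plain numeric vector (assumed here to carry no SPSS labels).
is_clevel_fct <- factor(is_clevel, levels = c(0, 1), labels = c("No", "Yes"))

# Check the results: both are factors whose levels are the labels.
is.factor(language_fct)
levels(language_fct)
table(is_clevel_fct)
```

The same checks, levels() and table(), can be run on the columns of clevel_cleaned to confirm that every code was matched to a label.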