Part 31 Lab 5B: Multivariate pivoting

Copy this code into your script to import the data for this lab.

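library(tidyverse)

# Fix the random seed so everyone gets the same pattern of missing data
set.seed(123)

# Choose 30 families whose parent is missing wave 2 data
missing_w2_parent <- sample(1:500, 30)
# For children: 5 of those same families plus 25 more drawn at random
missing_w2_child <- c(missing_w2_parent[1:5], sample(1:500, 25))

family  <- read_csv(
  "https://raw.githubusercontent.com/bwiernik/progdata/main/inst/tutorials/data/family_data.csv"
) |>
  mutate(
    # Set the wave 2 parent scores to NA for the selected families
    across(
      starts_with("w2") & contains("parent"),
      ~ ifelse(family_id %in% missing_w2_parent, NA_real_, .x)
    ),
    # Set the wave 2 child scores to NA for the selected families
    across(
      starts_with("w2") & contains("child"),
      ~ ifelse(family_id %in% missing_w2_child, NA_real_, .x)
    )
  )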

You’re working on a longitudinal study of parent-child relationships. You have collected data from 500 families over 2 waves. In each wave, both the child and parent completed measures of communication behavior and relationship satisfaction.

family |> 
  knitr::kable()

Reshape the dataset to a “longer” format.

Make each row 1 score
Have columns for family_id, family_member, wave, scale, and score.

family_longest <- family |> 
  pivot_longer()

print(family_longest)
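
If you are unsure how to split one column name into several variables, the sketch below may help. It uses a small made-up tibble, `toy`, not the lab data. The column naming pattern (wave, then scale, then family member, separated by underscores) is an assumption for illustration, so check the names in `family` before adapting it.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data. The wave_scale_member naming pattern is an
# assumption and may not match the columns in `family`.
toy <- tibble(
  family_id      = 1:2,
  w1_comm_parent = c(4, 3),
  w1_comm_child  = c(5, 2),
  w2_comm_parent = c(4, NA),
  w2_comm_child  = c(3, 1)
)

# Split each column name at "_" into three new variables and put every
# value into a single `score` column, so each row holds one score.
toy_longest <- toy |>
  pivot_longer(
    cols      = -family_id,
    names_to  = c("wave", "scale", "family_member"),
    names_sep = "_",
    values_to = "score"
  )

print(toy_longest)
```

Here `names_to` takes one name per piece of the original column name, and `names_sep` says where to split.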

Now reshape the dataset to a second “longer” format.

Make each row 1 person
Have columns for family_id, family_member, wave, comm, and satis.

family_long <- family |> 
  pivot_longer()

print(family_long)
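
When some parts of the column names should become variables and another part should stay as separate columns, the special `".value"` entry in `names_to` may be useful. The sketch below uses a made-up tibble, `toy`, with an assumed wave_scale_member naming pattern. It is an illustration of the technique, not necessarily the exact call needed for `family`.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data. The wave_scale_member naming pattern is an
# assumption and may not match the columns in `family`.
toy <- tibble(
  family_id       = 1:2,
  w1_comm_parent  = c(4, 3),
  w1_comm_child   = c(5, 2),
  w1_satis_parent = c(2, 4),
  w1_satis_child  = c(1, 5),
  w2_comm_parent  = c(4, NA),
  w2_comm_child   = c(3, 1),
  w2_satis_parent = c(3, NA),
  w2_satis_child  = c(2, 4)
)

# ".value" tells pivot_longer() that the middle piece of each name
# (comm or satis) should become its own output column, so each row
# holds one person in one wave.
toy_long <- toy |>
  pivot_longer(
    cols      = -family_id,
    names_to  = c("wave", ".value", "family_member"),
    names_sep = "_"
  )

print(toy_long)
```

Because the scale names become columns, no `values_to` argument is needed in this form.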

Some families are missing wave 2 data for parent, child, or both. Which families are missing wave 2 data for at least one person?
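
As a starting point, here is a rough sketch of two ways to look for missing values, one on wide data and one on long data. It uses a made-up tibble, `toy`, with assumed column names, so the selectors and the `wave == "w2"` condition are assumptions to adapt to your own data frames.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy data. Column names are an assumption for illustration.
toy <- tibble(
  family_id      = 1:3,
  w1_comm_parent = c(4, 3, 5),
  w1_comm_child  = c(5, 2, 4),
  w2_comm_parent = c(4, NA, 2),
  w2_comm_child  = c(3, 1, NA)
)

# Wide format: keep rows where any wave 2 column is missing.
missing_wide <- toy |>
  filter(if_any(starts_with("w2"), is.na)) |>
  pull(family_id)

# Long format: pivot first, then keep wave 2 rows with a missing score.
missing_long <- toy |>
  pivot_longer(
    cols      = -family_id,
    names_to  = c("wave", "scale", "family_member"),
    names_sep = "_",
    values_to = "score"
  ) |>
  filter(wave == "w2", is.na(score)) |>
  distinct(family_id) |>
  pull(family_id)

print(missing_wide)
print(missing_long)
```

Both approaches should return the same family IDs. Try each on your own data before answering the question that follows.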

Question: Is it easier to find the missing data in the wide or long format?