R dplyr Sum – How to Add Two Columns Multiple Times with dplyr

dplyrrsum

Assume I have a tibble (in my example here only with four columns). In real I have two hourly time series therefore 2 times 24 columns:

s1.x,...,s24.x, s1.y,...,s24.y

Now I want to sum up column s1.x with s1.y, s2.x with s2.y to s24.x with s24.y.

a <- tibble(s1.x=2:7, s2.x=3:8, s1.y=4:9, s2.y=5:10)

a %>% 
  mutate(s1.tot=s1.x+s1.y, s2.tot=s2.x+s2.y)

How can I do this in a short (and elegant way) for all 24 hours?

Best Answer

A base R solution

cbind(
    a,
    lapply(
        split.default(a, sub("\\..*", ".tot", names(a))),
        rowSums
    )
)

or even shorter

cbind(a, t(rowsum(t(a), sub("\\..*", ".tot", names(a)))))

which gives

  s1.x s2.x s1.y s2.y s1.tot s2.tot
1    2    3    4    5      6      8
2    3    4    5    6      8     10
3    4    5    6    7     10     12
4    5    6    7    8     12     14
5    6    7    8    9     14     16
6    7    8    9   10     16     18

Benchmarking

If you are care about the speed

edward <- function() {
    a %>%
        mutate(id = 1:nrow(a)) %>%
        pivot_longer(-id,
            names_to = c("s", ".value"),
            names_pattern = "(s\\d+).(.)"
        ) |>
        mutate(tot = x + y, .by = id) |>
        pivot_wider(names_from = s, values_from = c(x, y, tot), names_glue = "{s}.{.value}") %>%
        select(-id)
}


mael <- function() {
    a |>
        mutate(across(matches(".x$"), .names = "{gsub('x', '', .col)}tot") +
            across(matches(".y$")))
}

tic1 <- function() {
    cbind(
        a,
        lapply(
            split.default(a, sub("\\..*", ".tot", names(a))),
            rowSums
        )
    )
}

tic2 <- function() {
    cbind(a, t(rowsum(t(a), sub("\\..*", ".tot", names(a)))))
}

microbenchmark(
    edward(),
    mael(),
    tic1(),
    tic2(),
    unit = "relative",
    check = "equivalent"
)

shows

Unit: relative
     expr       min        lq      mean    median        uq      max neval
 edward() 43.286275 33.290854 29.937941 32.085324 30.763830 8.997682   100
   mael()  8.465686  7.081354  6.652635  6.930750  6.867252 2.572953   100
   tic1()  1.673039  1.469692  1.527453  1.417756  1.315485 1.425672   100
   tic2()  1.000000  1.000000  1.000000  1.000000  1.000000 1.000000   100