Two-table joins. Check the "Joining verbs" vignette for more information.

# S3 method for ExprBuilder
anti_join(x, y, ...)

# S3 method for data.table
anti_join(x, ..., .expr = FALSE)

# S3 method for ExprBuilder
full_join(x, y, ..., sort = TRUE, allow = TRUE, .parent_env)

# S3 method for data.table
full_join(x, ...)

# S3 method for ExprBuilder
inner_join(x, y, ...)

# S3 method for data.table
inner_join(x, ..., .expr = FALSE)

# S3 method for ExprBuilder
left_join(
  x,
  y,
  ...,
  nomatch,
  mult,
  roll,
  rollends,
  .parent_env,
  .to_eager = FALSE
)

# S3 method for data.table
left_join(x, y, ..., allow = FALSE, .expr = FALSE)

mutate_join(x, y, ...)

# S3 method for ExprBuilder
mutate_join(
  x,
  y,
  ...,
  .SDcols,
  mult,
  roll,
  rollends,
  allow = FALSE,
  .by_each = NULL,
  .parent_env
)

# S3 method for EagerExprBuilder
mutate_join(x, ..., .parent_env = rlang::caller_env())

# S3 method for data.table
mutate_join(x, y, ...)

# S3 method for ExprBuilder
right_join(
  x,
  y,
  ...,
  allow = FALSE,
  which,
  nomatch,
  mult,
  roll,
  rollends,
  .selecting,
  .framing
)

# S3 method for data.table
right_join(x, y, ..., allow = FALSE, .expr = FALSE, .selecting, .framing)

# S3 method for ExprBuilder
semi_join(x, y, ..., allow = FALSE, .eager = FALSE)

# S3 method for data.table
semi_join(x, y, ..., allow = FALSE, .eager = FALSE)

Arguments

x

An ExprBuilder instance.

y

A data.table::data.table or, for some verbs (see details), a call to nest_expr().

...

Expressions for the on part of the join.

.expr

If the input is a data.table and .expr is TRUE, an instance of EagerExprBuilder will be returned. Useful if you want to add clauses to j, e.g. with mutate-table.express.

sort

Passed to data.table::merge.

allow

Passed as data.table's allow.cartesian.

.parent_env

See end_expr().

nomatch, mult, roll, rollends

See data.table::data.table.

.to_eager

Internal, should be left as FALSE in all external calls.

.SDcols

For mutate_join. See the details below.

.by_each

For mutate_join. See the details below.

which

If TRUE, return the row numbers that matched in x instead of the result of the join.

.selecting

One or more expressions, possibly contained in a call to list or ., that will be added to j in the same frame as the join.

.framing

Similar to .selecting, but added to the frame with frame_append().

.eager

For semi_join. If TRUE, it uses nest_expr() to build an expression like this instead of the default one. This uses the captured data.table eagerly, so use chain() when needed. The default is lazy.

Details

The following joins support nest_expr() in y:

  • anti_join

  • inner_join

  • right_join

The full_join method is really a wrapper for data.table::merge that specifies all = TRUE. The expression in x gets evaluated, merged with y, and the result is captured in a new ExprBuilder. Useful in case you want to keep building expressions after the merge.

Mutating join

The ExprBuilder method for mutate_join implements the idiom described in this link. The columns specified in .SDcols are those that will be added to x from y. The specification can be done by:

  • Using tidyselect::select_helpers.

  • Passing a character vector. If the character is named, the names are taken as the new column names for the values added to x.

  • A list, using base::list() or .(), containing:

    • Column names, either as characters or symbols.

    • Named calls expressing how the column should be summarized/modified before adding it to x.

The last case mentioned above is useful when the join returns many rows from y for each row in x, so they can be summarized while joining. The value of by in the join depends on what is passed to .by_each:

  • If NULL (the default), by is set to .EACHI if a call is detected in any of the expressions from the list in .SDcols

  • If TRUE, by is always set to .EACHI

  • If FALSE, by is never set to .EACHI

Examples


lhs <- data.table::data.table(x = rep(c("b", "a", "c"), each = 3),
                              y = c(1, 3, 6),
                              v = 1:9)

rhs <- data.table::data.table(x = c("c", "b"),
                              v = 8:7,
                              foo = c(4, 2))


rhs %>%
    anti_join(lhs, x, v)
#>    x v foo
#> 1: b 7   2


lhs %>%
    inner_join(rhs, x)
#>    x y v i.v foo
#> 1: c 1 7   8   4
#> 2: c 3 8   8   4
#> 3: c 6 9   8   4
#> 4: b 1 1   7   2
#> 5: b 3 2   7   2
#> 6: b 6 3   7   2


# creates new data.table
lhs %>%
    left_join(rhs, x)
#>    x  v foo y i.v
#> 1: b  7   2 1   1
#> 2: b  7   2 3   2
#> 3: b  7   2 6   3
#> 4: a NA  NA 1   4
#> 5: a NA  NA 3   5
#> 6: a NA  NA 6   6
#> 7: c  8   4 1   7
#> 8: c  8   4 3   8
#> 9: c  8   4 6   9


# would modify lhs by reference
lhs %>%
    start_expr %>%
    mutate_join(rhs, x, .SDcols = c("foo", rhs.v = "v"))
#> .DT_[, `:=`(c("foo", "rhs.v"), rhs[.SD, list(foo = x.foo, rhs.v = x.v), 
#>     on = list(x)])]

# would modify rhs by reference, summarizing 'y' before adding it.
rhs %>%
    start_expr %>%
    mutate_join(lhs, x, .SDcols = .(y = mean(y)))
#> .DT_[, `:=`("y", lhs[.SD, list(y = mean(y)), by = .EACHI, on = list(x)][, 
#>     list(y)])]


# creates new data.table
lhs %>%
    right_join(rhs, x)
#>    x y v i.v foo
#> 1: c 1 7   8   4
#> 2: c 3 8   8   4
#> 3: c 6 9   8   4
#> 4: b 1 1   7   2
#> 5: b 3 2   7   2
#> 6: b 6 3   7   2


# keep only columns from lhs
lhs %>%
    semi_join(rhs, x)
#>    x y v
#> 1: c 1 7
#> 2: c 3 8
#> 3: c 6 9
#> 4: b 1 1
#> 5: b 3 2
#> 6: b 6 3