R/UTILS-joins.R
, R/VERBS-anti_join.R
, R/VERBS-full_join.R
, and 5 more
joins.Rd
Two-table joins. Check the "Joining verbs" vignette for more information.
# S3 method for ExprBuilder
anti_join(x, y, ...)
# S3 method for data.table
anti_join(x, ..., .expr = FALSE)
# S3 method for ExprBuilder
full_join(x, y, ..., sort = TRUE, allow = TRUE, .parent_env)
# S3 method for data.table
full_join(x, ...)
# S3 method for ExprBuilder
inner_join(x, y, ...)
# S3 method for data.table
inner_join(x, ..., .expr = FALSE)
# S3 method for ExprBuilder
left_join(
x,
y,
...,
nomatch,
mult,
roll,
rollends,
.parent_env,
.to_eager = FALSE
)
# S3 method for data.table
left_join(x, y, ..., allow = FALSE, .expr = FALSE)
mutate_join(x, y, ...)
# S3 method for ExprBuilder
mutate_join(
x,
y,
...,
.SDcols,
mult,
roll,
rollends,
allow = FALSE,
.by_each = NULL,
.parent_env
)
# S3 method for EagerExprBuilder
mutate_join(x, ..., .parent_env = rlang::caller_env())
# S3 method for data.table
mutate_join(x, y, ...)
# S3 method for ExprBuilder
right_join(
x,
y,
...,
allow = FALSE,
which,
nomatch,
mult,
roll,
rollends,
.selecting,
.framing
)
# S3 method for data.table
right_join(x, y, ..., allow = FALSE, .expr = FALSE, .selecting, .framing)
# S3 method for ExprBuilder
semi_join(x, y, ..., allow = FALSE, .eager = FALSE)
# S3 method for data.table
semi_join(x, y, ..., allow = FALSE, .eager = FALSE)
An ExprBuilder instance.
A data.table::data.table or, for some verbs (see details), a call to
nest_expr()
.
Expressions for the on
part of the join.
If the input is a data.table
and .expr
is TRUE
, an instance of
EagerExprBuilder will be returned. Useful if you want to add clauses to j
, e.g. with
mutate-table.express.
Passed to data.table::merge.
Passed as data.table
's allow.cartesian
.
See end_expr()
.
Internal, should be left as FALSE
in all external calls.
For mutate_join
. See the details below.
For mutate_join
. See the details below.
If TRUE
, return the row numbers that matched in x
instead of the result of the
join.
One or more expressions, possibly contained in a call to list
or .
, that
will be added to j
in the same frame as the join.
Similar to .selecting
, but added to the frame with frame_append()
.
For semi_join
. If TRUE
, it uses nest_expr()
to build an expression like
this instead of the default one. This uses the
captured data.table
eagerly, so use chain()
when needed. The default is lazy.
The following joins support nest_expr()
in y
:
anti_join
inner_join
right_join
The full_join
method is really a wrapper for data.table::merge
that specifies all = TRUE
.
The expression in x
gets evaluated, merged with y
, and the result is captured in a new
ExprBuilder. Useful in case you want to keep building expressions after the merge.
The ExprBuilder method for mutate_join
implements the idiom described in this link. The columns specified in .SDcols
are
those that will be added to x
from y
. The specification can be done by:
Using tidyselect::select_helpers.
Passing a character vector. If the character is named, the names are taken as the new column
names for the values added to x
.
A list, using base::list()
or .()
, containing:
Column names, either as characters or symbols.
Named calls expressing how the column should be summarized/modified before adding it to
x
.
The last case mentioned above is useful when the join returns many rows from y
for each row
in x
, so they can be summarized while joining. The value of by
in the join depends on what
is passed to .by_each
:
If NULL
(the default), by
is set to .EACHI
if a call is detected in any of the
expressions from the list in .SDcols
If TRUE
, by
is always set to .EACHI
If FALSE
, by
is never set to .EACHI
lhs <- data.table::data.table(x = rep(c("b", "a", "c"), each = 3),
y = c(1, 3, 6),
v = 1:9)
rhs <- data.table::data.table(x = c("c", "b"),
v = 8:7,
foo = c(4, 2))
rhs %>%
anti_join(lhs, x, v)
#> x v foo
#> 1: b 7 2
lhs %>%
inner_join(rhs, x)
#> x y v i.v foo
#> 1: c 1 7 8 4
#> 2: c 3 8 8 4
#> 3: c 6 9 8 4
#> 4: b 1 1 7 2
#> 5: b 3 2 7 2
#> 6: b 6 3 7 2
# creates new data.table
lhs %>%
left_join(rhs, x)
#> x v foo y i.v
#> 1: b 7 2 1 1
#> 2: b 7 2 3 2
#> 3: b 7 2 6 3
#> 4: a NA NA 1 4
#> 5: a NA NA 3 5
#> 6: a NA NA 6 6
#> 7: c 8 4 1 7
#> 8: c 8 4 3 8
#> 9: c 8 4 6 9
# would modify lhs by reference
lhs %>%
start_expr %>%
mutate_join(rhs, x, .SDcols = c("foo", rhs.v = "v"))
#> .DT_[, `:=`(c("foo", "rhs.v"), rhs[.SD, list(foo = x.foo, rhs.v = x.v),
#> on = list(x)])]
# would modify rhs by reference, summarizing 'y' before adding it.
rhs %>%
start_expr %>%
mutate_join(lhs, x, .SDcols = .(y = mean(y)))
#> .DT_[, `:=`("y", lhs[.SD, list(y = mean(y)), by = .EACHI, on = list(x)][,
#> list(y)])]
# creates new data.table
lhs %>%
right_join(rhs, x)
#> x y v i.v foo
#> 1: c 1 7 8 4
#> 2: c 3 8 8 4
#> 3: c 6 9 8 4
#> 4: b 1 1 7 2
#> 5: b 3 2 7 2
#> 6: b 6 3 7 2
# keep only columns from lhs
lhs %>%
semi_join(rhs, x)
#> x y v
#> 1: c 1 7
#> 2: c 3 8
#> 3: c 6 9
#> 4: b 1 1
#> 5: b 3 2
#> 6: b 6 3