Conditions related to duplicated values

row_duplicated(.data, match_type = NULL, output_class, ...)

# S3 method for matrix
row_duplicated(.data, match_type = NULL, output_class, ...)

# S3 method for data.frame
row_duplicated(.data, match_type = NULL, output_class, ...)

Arguments

.data

A two-dimensional data structure.

match_type

One of (NULL, "any", "none", "which_first", "count"). Possibly abbreviated.

output_class

Passed to op_ctrl(). If missing, it will be inferred.

...

Arguments passed on to op_ctrl

cols

A vector indicating which columns to consider for the operation. If NULL, all columns are used. If its length is 0, no columns are considered. Negative numbers, logical values, character vectors representing column names, and tidyselect::select_helpers are supported.

rows

Like cols but for row indices, and without tidyselect support.

factor_mode

One of ("character", "integer"), possibly abbreviated. If a column is a factor, this determines whether the operation uses its internal integer values, or the character values from its levels.

Details

For each row, different checks can be performed on whether there are duplicated values among the columns. The default (match_type = NULL) returns a logical matrix/data.frame with the same dimensions as the input (after any cols/rows selection), where each entry indicates whether the value in that column duplicates a value from an earlier column of the same row. The other match_type values aggregate this per-row result, as their names suggest.

Care must be taken for input data frames whose columns have different types. Type promotion will follow normal R rules:

logical -> integer -> double -> complex -> character
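The promotion chain above can be checked with base R alone. The snippet below is only an illustration of R's coercion rules using c(); it does not call row_duplicated, and the variable names direct and stepped are made up for this sketch.

```r
# Each call to c() promotes its arguments to the highest type present.
typeof(c(TRUE, 1L))    # "integer"
typeof(c(1L, 1.5))     # "double"
typeof(c(1.5, 1i))     # "complex"
typeof(c(1i, "a"))     # "character"

# Promotion happens step by step, so earlier columns affect the result:
direct  <- c(TRUE, "x")          # TRUE becomes "TRUE"
stepped <- c(c(TRUE, 1L), "x")   # TRUE becomes 1L first, then "1"
direct[1]   # "TRUE"
stepped[1]  # "1"
```

The last two lines show why column order matters: a logical value that has already been promoted to integer ends up as "1" rather than "TRUE" once a character column is reached.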

For each row, the set of seen values starts as logical. If the value in a new column requires promotion, all values in the set are promoted accordingly. This continues iteratively for each column in the row. See the examples.
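The iterative promotion described above can be sketched in base R for a single row. This is a rough emulation under stated assumptions, not the package's implementation: sketch_row_duplicated is a hypothetical helper, it takes the row as a list of scalar values, and it ignores factors, NA handling, and the cols/rows options.

```r
# Hypothetical helper, not part of the package: emulates the default
# (match_type = NULL) check for one row supplied as a list of scalars.
sketch_row_duplicated <- function(row_values) {
  seen <- logical(0)                      # the set of seen values starts as logical
  out <- logical(length(row_values))
  for (i in seq_along(row_values)) {
    combined <- c(seen, row_values[[i]])  # c() applies R's usual type promotion
    n <- length(combined)
    value <- combined[n]                  # current value, after promotion
    seen <- combined[-n]                  # previously seen values, after promotion
    out[i] <- value %in% seen             # duplicated if already in the set
    seen <- unique(combined)              # add the current value to the set
  }
  out
}

first  <- sketch_row_duplicated(list(TRUE, "TRUE", TRUE))
second <- sketch_row_duplicated(list(TRUE, 1L, "FALSE", TRUE))
first   # FALSE TRUE TRUE
second  # FALSE TRUE FALSE FALSE
```

Because the set is stored in its promoted form, a value seen as TRUE and later promoted to 1 and then "1" no longer matches a logical TRUE that arrives after a character column.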

Examples

# after processing the 2nd column, all values will be promoted to character
row_duplicated(data.frame(TRUE, "TRUE", TRUE))
#>      V1   V2   V3
#> 1 FALSE TRUE TRUE

# after the 1st column the set of seen values is (TRUE)
# after the 2nd column, TRUE is promoted to 1, and the set is (1)
# after the 3rd column, 1 is promoted to "1", and the set is ("1", "FALSE")
# the final column is thus promoted to "TRUE"
row_duplicated(data.frame(TRUE, 1L, "FALSE", TRUE))
#>      V1   V2    V3    V4
#> 1 FALSE TRUE FALSE FALSE