Duplicates in unique ID — isUniqueIDDuplicated • HighFrequencyChecks

This function checks that every interview in the dataset has a unique ID. Optionally, surveys with a duplicated unique ID can be automatically marked for deletion.

isUniqueIDDuplicated(
  ds = NULL,
  uniqueID = NULL,
  surveyConsent = NULL,
  reportingColumns = c(enumeratorID, uniqueID),
  deleteIsUniqueIDDuplicated = FALSE
)

Arguments

ds	dataset containing the survey (from kobo): data.frame
uniqueID	name of the field where the survey unique ID is stored: string
surveyConsent	name of the field in the dataset where the survey consent is stored: string
reportingColumns	(Optional, by default built from the enumeratorID and the uniqueID) names of the columns from the dataset you want in the result: list of strings (c('col1','col2',...))
deleteIsUniqueIDDuplicated	(Optional, FALSE by default) if TRUE, the surveys in error will be marked as 'deletedIsUniqueIDDuplicated': boolean (TRUE/FALSE)
enumeratorID	name of the field where the enumerator ID is stored: string

Value

dst same dataset as the input one, but with surveys marked for deletion if errors are found and deleteIsUniqueIDDuplicated=TRUE (or NULL)

ret_log list of the errors found (or NULL)

var a list of values (or NULL)

graph graphical representation of the results (or NULL)
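
To make the relationship between dst and ret_log more concrete, here is a minimal base-R sketch of how a duplicate-ID check of this kind could work. It is not the package's implementation: flag_duplicated_ids and the toy data frame are hypothetical, and two behaviours are assumptions, namely that the first occurrence of an ID is treated as valid and that the deletion mark is written into the consent column.

```r
# Hypothetical helper, NOT part of HighFrequencyChecks: a base-R sketch of
# duplicate-ID detection that returns a dst/ret_log pair.
flag_duplicated_ids <- function(ds, uniqueID, surveyConsent,
                                reportingColumns, delete = FALSE) {
  # duplicated() flags the second and later occurrences of each ID
  # (assumption: the first occurrence is kept as the valid record).
  is_dup <- duplicated(ds[[uniqueID]])

  # Log of the offending rows, restricted to the requested columns.
  ret_log <- ds[is_dup, reportingColumns, drop = FALSE]

  # Copy of the input; optionally mark the duplicates for deletion
  # (assumption: the mark is stored in the consent column).
  dst <- ds
  if (delete) {
    dst[[surveyConsent]][is_dup] <- "deletedIsUniqueIDDuplicated"
  }

  list(dst = dst, ret_log = ret_log)
}

# Made-up data: the ID "b2" appears twice.
toy <- data.frame(
  enumerator_id = c(1, 2, 3, 2),
  X_uuid = c("a1", "b2", "c3", "b2"),
  survey_consent = c("yes", "yes", "yes", "yes"),
  stringsAsFactors = FALSE
)

res <- flag_duplicated_ids(toy, "X_uuid", "survey_consent",
                           c("enumerator_id", "X_uuid"), delete = TRUE)
res$ret_log             # the single row with the repeated ID
res$dst$survey_consent  # fourth entry carries the deletion mark
```

With delete = FALSE the sketch returns the dataset unchanged alongside the log, so the duplicates can be reviewed before anything is marked.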

Author

Yannick Pascaud

Examples

{
ds <- HighFrequencyChecks::sample_dataset
uniqueID <- "X_uuid"
surveyConsent <- "survey_consent"
enumeratorID <- "enumerator_id"
reportingColumns <- c(enumeratorID, uniqueID)

list[dst,ret_log,var,graph] <- isUniqueIDDuplicated(ds=ds,
                                                    uniqueID=uniqueID,
                                                    surveyConsent=surveyConsent,
                                                    reportingColumns=reportingColumns,
                                                    deleteIsUniqueIDDuplicated=FALSE)
head(ret_log, 10)
}
#>     enumerator_id X_uuid survey_consent
#> 194            46                   yes