ps_checkData
Data checks and summaries: duplicate records, negative analytic values, numbers of analytic results, percentiles of results
Usage
ps_checkData(
doc = "ps_checkData",
data,
CheckDupVars,
GroupVar,
Groups = "All",
ByGroup = TRUE,
ID = " ",
AnalyticVars,
folder = " "
)
Arguments
- doc
A character string written to the output list; default is the function name
- data
An R object (data frame) containing analytic data
- CheckDupVars
A vector with names of identifying variables, typically group and lab ID
- GroupVar
The name of the variable defining the groups (required)
- Groups
A character vector of groups for which numbers of samples and statistics will be tabulated, or "All" (the default) to use all groups
- ByGroup
Logical: default is TRUE. If FALSE, tabulations are for all groups combined
- ID
The name of the lab ID variable; default is " " (no lab ID)
- AnalyticVars
A character vector of names of analytic variables for which tabulations are done
- folder
The path to the folder in which data frames will be saved; default is " ", no path
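A call might look like the following minimal sketch. The data frame `toy` and its column names (`Code`, `ID`, `Rb`, `Sr`, `Zr`) are hypothetical and are not taken from this page. The call is attempted only if `ps_checkData` is available in the session, which assumes the package that provides it is already attached.

```r
# Hypothetical toy data: two groups, a lab ID, and three analytic variables
toy <- data.frame(
  Code = c("A", "A", "A", "B", "B", "B"),
  ID   = c("a1", "a2", "a2", "b1", "b2", "b3"),
  Rb   = c(110, 125, 125, 98, NA, 102),
  Sr   = c(10, 12, 12, -1, 15, 14),
  Zr   = c(70, 75, 75, 160, 155, 158),
  stringsAsFactors = FALSE
)

# AnalyticVars must name at least two variables
analyticVars <- c("Rb", "Sr", "Zr")

# Only attempt the call when the function is available in this session
if (exists("ps_checkData", mode = "function")) {
  check <- ps_checkData(
    data         = toy,
    CheckDupVars = c("Code", "ID"),
    GroupVar     = "Code",
    Groups       = "All",
    ByGroup      = TRUE,
    ID           = "ID",
    AnalyticVars = analyticVars
  )
  # The component names are those described under Value
  print(names(check))
}
```

The arguments `doc` and `folder` are left at their defaults here.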
Value
The function returns a list with the following components:
usage: A string with the contents of the argument doc, the date run, and the R version used
dataUsed: The data frame specified by the arguments data and GroupVar
params: A character vector with the values of CheckDupVars, GroupVar, and Groups
analyticVars: The vector of names specified by the argument AnalyticVars
Duplicates: A data frame containing the observations with duplicate values
NegativeValues: A data frame containing the observations with at least one negative value for a variable in AnalyticVars
Nvalues: A data frame containing the number of observations with a value for each analytic variable
statistics: A data frame containing the descriptive statistics (by group, if ByGroup = TRUE)
location: The value of the parameter folder
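To make the Duplicates and NegativeValues components concrete, here is a base R sketch of the kind of checks they describe. It is not the function's own implementation. It assumes that duplicates are rows sharing the same combination of the CheckDupVars columns, and the data frame `toy` and its columns are hypothetical.

```r
# Hypothetical toy data with one duplicated key and one negative value
toy <- data.frame(
  Code = c("A", "A", "A", "B", "B"),
  ID   = c("a1", "a2", "a2", "b1", "b2"),
  Rb   = c(110, 125, 125, 98, 101),
  Sr   = c(10, 12, 12, -1, 15),
  stringsAsFactors = FALSE
)
checkDupVars <- c("Code", "ID")
analyticVars <- c("Rb", "Sr")

# Flag every row whose key combination occurs more than once
# (assumption: duplicates are defined on the CheckDupVars columns)
keys <- toy[, checkDupVars]
isDup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
duplicates <- toy[isDup, ]

# Flag rows with at least one negative analytic value;
# missing values are not counted as negative
hasNegative <- rowSums(toy[, analyticVars] < 0, na.rm = TRUE) > 0
negativeValues <- toy[hasNegative, ]
```

Using `duplicated()` in both directions keeps every member of a duplicated set, not only the second and later occurrences.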
Detail
AnalyticVars must be a vector of length at least 2. If Groups specifies selected groups (that is, it is not equal to "All"), it must also be a vector of length at least 2. The returned list includes four data frames: duplicate observations, observations with negative values for one or more analytic variables, numbers of observations for each analytic variable, and descriptive statistics (quantiles and number missing). If the largest value is < 10 (as is the case when log10 transforms are used), the descriptive statistics are rounded to 2 digits; otherwise they are rounded to integers. If ByGroup = TRUE, the numbers of observations and the statistics are tabulated by group.
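The counting and rounding rules can be illustrated with a short base R sketch. This is an illustration under assumptions, not the function's code: the helper `describe`, the choice of quantile probabilities, and the decision to apply the "largest value < 10" test to each vector separately are all hypothetical, as is the data frame `toy`.

```r
# Hypothetical toy data: two groups, one missing Rb value
toy <- data.frame(
  Code = c("A", "A", "A", "B", "B", "B"),
  Rb   = c(110, 125, 118, 98, NA, 102),
  Sr   = c(10, 12, 11, 9, 15, 14),
  stringsAsFactors = FALSE
)
analyticVars <- c("Rb", "Sr")

# Number of non-missing values per analytic variable, by group
nValues <- aggregate(
  toy[, analyticVars],
  by  = list(Code = toy$Code),
  FUN = function(x) sum(!is.na(x))
)

# Quantiles plus number missing for one vector of values.
# Rounding follows the rule in the text: 2 digits if the largest
# value is < 10, otherwise integers (applied per vector here, an assumption).
describe <- function(x) {
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
  digits <- if (max(x, na.rm = TRUE) < 10) 2 else 0
  c(round(q, digits), n_missing = sum(is.na(x)))
}

# Statistics by group for raw and log10-transformed values
statsRb    <- lapply(split(toy$Rb, toy$Code), describe)
statsLogRb <- lapply(split(log10(toy$Rb), toy$Code), describe)
```

With raw values the quantiles come out as integers, while the log10-transformed values stay below 10 and so keep two decimal places.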