Declare a null hypothesis about variables selected in specify().
Learn more in vignette("infer").
Usage
hypothesize(x, null, p = NULL, mu = NULL, med = NULL, sigma = NULL)
hypothesise(x, null, p = NULL, mu = NULL, med = NULL, sigma = NULL)
Arguments
- x
A data frame that can be coerced into a tibble.
- null
The null hypothesis. Options include "independence", "point", and "paired independence".
independence: Should be used with both a response and explanatory variable. Indicates that the values of the specified response variable are independent of the associated values in explanatory.
point: Should be used with only a response variable. Indicates that a point estimate based on the values in response is associated with a parameter. Sometimes requires supplying one of p, mu, med, or sigma.
paired independence: Should be used with only a response variable giving the pre-computed difference between paired observations. Indicates that the order of subtraction between paired values does not affect the resulting distribution.
- p
The true proportion of successes (a number between 0 and 1). To be used with point null hypotheses when the specified response variable is categorical.
- mu
The true mean (any numerical value). To be used with point null hypotheses when the specified response variable is continuous.
- med
The true median (any numerical value). To be used with point null hypotheses when the specified response variable is continuous.
- sigma
The true standard deviation (any numerical value). To be used with point null hypotheses.
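The two null types not covered in the Examples section can be sketched as follows. This is a minimal illustration, not part of the original reference: it assumes the gss data shipped with infer has a sex column with a "female" level, and that the installed infer version supports the "paired independence" null. The paired data frame and its diff column are hypothetical, simulated only for demonstration.

```r
library(infer)

# Point null for a categorical response: the true proportion of
# "female" respondents is hypothesized to be 0.5.
null_prop <- gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = 0.5)

# Paired-independence null: the response must already hold the
# pre-computed differences between paired observations.
# `paired` and `diff` are hypothetical, simulated for illustration.
set.seed(1)
paired <- data.frame(diff = rnorm(30, mean = 0.5))

null_paired <- paired %>%
  specify(response = diff) %>%
  hypothesize(null = "paired independence")
```

In both cases only the response column is carried forward, since no explanatory variable is specified.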
Value
A tibble containing the response (and explanatory, if specified) variable data with parameter information stored as well.
Examples
# load the package (provides gss, %>%, specify(), and hypothesize())
library(infer)

# hypothesize independence of two variables
gss %>%
  specify(college ~ partyid, success = "degree") %>%
  hypothesize(null = "independence")
#> Dropping unused factor levels DK from the supplied explanatory variable
#> 'partyid'.
#> Response: college (factor)
#> Explanatory: partyid (factor)
#> Null Hypothesis: independence
#> # A tibble: 500 × 2
#> college partyid
#> <fct> <fct>
#> 1 degree ind
#> 2 no degree rep
#> 3 degree ind
#> 4 no degree ind
#> 5 degree rep
#> 6 no degree rep
#> 7 no degree dem
#> 8 degree ind
#> 9 degree rep
#> 10 no degree dem
#> # ℹ 490 more rows
# hypothesize a mean number of hours worked per week of 40
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40)
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 500 × 1
#> hours
#> <dbl>
#> 1 50
#> 2 31
#> 3 40
#> 4 40
#> 5 40
#> 6 53
#> 7 32
#> 8 20
#> 9 40
#> 10 40
#> # ℹ 490 more rows
# more in-depth explanation of how to use the infer package
if (FALSE) {
vignette("infer")
}
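The object returned by hypothesize() is usually passed on to later steps in the infer pipeline rather than inspected directly. The sketch below extends the hours example under stated assumptions: the number of replicates (200), the seed, and the two-sided direction are arbitrary choices for illustration, not recommendations from the original page.

```r
library(infer)
set.seed(2024)

# Build a null distribution of sample means under the point null mu = 40.
null_dist <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 200, type = "bootstrap") %>%
  calculate(stat = "mean")

# Observed sample mean, computed without a hypothesis.
obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

# Compare the observed statistic with the null distribution.
get_p_value(null_dist, obs_stat = obs_mean, direction = "two-sided")
```

Here generate() relies on the null declared in hypothesize() to decide how to resample, which is why the hypothesis is declared before generating replicates.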