Changelog
Source: NEWS.md
infer 1.0.7
CRAN release: 2024-03-25
The aliases p_value() and conf_int(), first deprecated 6 years ago, now return an error (#530).
Addresses ggplot2 warnings when shading p-values for test statistics that are outside of the range of the generated distribution (#528).
Fixed bug in shade_p_value() and shade_confidence_interval() where fill = NULL was ignored when it was documented as preventing any shading (#525).
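Code that still calls the removed aliases can be migrated by switching to the full function names. A minimal sketch, assuming the bundled gss data, with a seed and number of replicates chosen only for illustration:

```r
library(infer)

set.seed(1)  # arbitrary seed, for reproducibility of the sketch only

# bootstrap distribution of the mean number of hours worked
boot_dist <- gss %>%
  specify(response = hours) %>%
  generate(reps = 200, type = "bootstrap") %>%
  calculate(stat = "mean")

# previously conf_int(boot_dist, level = 0.95)
ci <- get_confidence_interval(boot_dist, level = 0.95)

# null distribution for a point null of mu = 40, plus the observed mean
null_dist <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 200, type = "bootstrap") %>%
  calculate(stat = "mean")

obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

# previously p_value(null_dist, obs_stat = obs_mean, direction = "two-sided")
p <- get_p_value(null_dist, obs_stat = obs_mean, direction = "two-sided")
```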
infer v1.0.6
CRAN release: 2024-01-31
Updated infrastructure for errors, warnings, and messages (#513). Most of these changes will not be visible to users, though:
 Many longer error messages are now broken up into several lines.
 For references to help-files, users can now click on the error message’s text to navigate to the cited documentation.
Various improvements to documentation (#501, #504, #508, #512).
Fixed bug where get_confidence_interval() would error uninformatively when the supplied distribution of estimates contained missing values. The function will now warn and return a confidence interval calculated using the non-missing estimates (#521).
Fixed bug where generate() could not be used without first specify()ing variables, even in cases where that specification would not affect resampling/simulation (#448).
infer v1.0.5
CRAN release: 2023-09-06
Implemented support for permutation hypothesis tests for paired data via the argument value null = "paired independence" in hypothesize() (#487).
The weight_by argument to rep_slice_sample() can now be passed either as a vector of numeric weights or an unquoted column name in .data (#480).
Newly accommodates variables with spaces in names in the wrapper functions t_test() and prop_test() (#472).
Fixed bug in two-sample prop_test() where the response and explanatory variable were passed in place of each other to prop.test(). This enables using prop_test() with explanatory variables with greater than 2 levels and, in the process, addresses a bug where prop_test() collapsed levels other than the success when the response variable had more than 2 levels.
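The paired permutation test introduced in this release could be used roughly as follows. This is a sketch only: the paired data frame and its before, after, and diff columns are hypothetical, simulated here for illustration, and the seed and number of replicates are arbitrary.

```r
library(infer)

set.seed(1)  # arbitrary seed, for reproducibility of the sketch only

# hypothetical paired measurements on the same 50 units
paired <- data.frame(
  before = rnorm(50, mean = 10),
  after  = rnorm(50, mean = 10.5)
)
paired$diff <- paired$after - paired$before

# observed mean of the within-pair differences
obs_diff <- paired %>%
  specify(response = diff) %>%
  calculate(stat = "mean")

# null distribution under the "paired independence" hypothesis
null_dist <- paired %>%
  specify(response = diff) %>%
  hypothesize(null = "paired independence") %>%
  generate(reps = 200, type = "permute") %>%
  calculate(stat = "mean")

p <- get_p_value(null_dist, obs_stat = obs_diff, direction = "two-sided")
```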
infer v1.0.4
CRAN release: 2022-12-01
 Fixed bug in p-value shading where shaded regions no longer correctly overlaid histogram bars.
 Addressed deprecation warning ahead of upcoming dplyr release.
infer v1.0.2
CRAN release: 2022-06-10
 Fix p-value shading when the calculated statistic falls exactly on the boundaries of a histogram bin (#424).
 Fix generate() errors when columns are named x (#431).
 Fix error from visualize when passed generate()d infer_dist objects that had not been passed to hypothesize() (#432).
 Update visual checks for visualize output to align with the R 4.1.0+ graphics engine (#438).
 specify() and wrapper functions now appropriately handle ordered factors (#439).
 Clarify error when incompatible statistics and hypotheses are supplied (#441).
 Updated generate() unexpected type warnings to be more permissive: the warning will be raised less often when type = "bootstrap" (#425).
 Allow passing additional arguments to stats::chisq.test via ... in calculate(). Ellipses are now always passed to the applicable base R hypothesis testing function, when applicable (#414)!
 The package will now set the levels of logical variables on conversion to factor so that the first level (regarded as success by default) is TRUE. Core verbs have warned without an explicit success value already, and this change makes behavior consistent with the functions being wrapped by shorthand test wrappers (#440).
 Added new statistic stat = "ratio of means" (#452).
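As an illustration of the new statistic, the following sketch computes the ratio of mean hours worked between the two college groups in the bundled gss data. The choice of variables and of the order argument is an assumption made for this example.

```r
library(infer)

# ratio of mean hours worked: degree holders relative to non-degree holders
ratio <- gss %>%
  specify(hours ~ college) %>%
  calculate(stat = "ratio of means", order = c("degree", "no degree"))

ratio
```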
infer v1.0.1 (GitHub Only)
This release reflects the infer version accepted to the Journal of Open Source Software.
infer 1.0.0
CRAN release: 2021-08-13
v1.0.0 is the first major release of the {infer} package! By and large, the core verbs specify(), hypothesize(), generate(), and calculate() will interface as they did before. This release makes several improvements to behavioral consistency of the package and introduces support for theory-based inference as well as randomization-based inference with multiple explanatory variables.
Behavioral consistency
A major change to the package in this release is a set of standards for behavioral consistency of calculate() (#356). Namely, the package will now
 supply a consistent error when the supplied stat argument isn’t well-defined for the variables specify()d
gss %>%
specify(response = hours) %>%
calculate(stat = "diff in means")
#> Error: A difference in means is not well-defined for a
#> numeric response variable (hours) and no explanatory variable.
or
gss %>%
specify(college ~ partyid, success = "degree") %>%
calculate(stat = "diff in props")
#> Error: A difference in proportions is not well-defined for a dichotomous categorical
#> response variable (college) and a multinomial categorical explanatory variable (partyid).
 supply a consistent message when the user supplies unneeded information via hypothesize() to calculate() an observed statistic
# supply mu = 40 when it's not needed
gss %>%
specify(response = hours) %>%
hypothesize(null = "point", mu = 40) %>%
calculate(stat = "mean")
#> Message: The point null hypothesis `mu = 40` does not inform calculation of
#> the observed statistic (a mean) and will be ignored.
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
and
 supply a consistent warning and assume a reasonable null value when the user does not supply sufficient information to calculate an observed statistic
# don't hypothesize `p` when it's needed
gss %>%
specify(response = sex, success = "female") %>%
calculate(stat = "z")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 1.16
#> Warning message:
#> A z statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null value: `p = .5`.
or
# don't hypothesize `p` when it's needed
gss %>%
specify(response = partyid) %>%
calculate(stat = "Chisq")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 334.
#> Warning message:
#> A chi-square statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null values: `p = c(dem = 0.2, ind = 0.2, rep = 0.2, other = 0.2, DK = 0.2)`.
To accommodate this behavior, a number of new calculate methods were added or improved. Namely:
 Implemented the standardized proportion z statistic for one categorical variable
 Extended calculate() with stat = "t" by passing mu to the calculate() method for stat = "t" to allow for calculation of t statistics for one numeric variable with hypothesized mean
 Extended calculate() to allow lowercase aliases for stat arguments (#373).
 Fixed bugs in calculate() to allow for programmatic calculation of statistics
This behavioral consistency also allowed for the implementation of observe(), a wrapper function around specify(), hypothesize(), and calculate(), to calculate observed statistics. The function provides a shorthand alternative to calculating observed statistics from data:
# calculating the observed mean number of hours worked per week
gss %>%
observe(hours ~ NULL, stat = "mean")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
# equivalently, calculating the same statistic with the core verbs
gss %>%
specify(response = hours) %>%
calculate(stat = "mean")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
# calculating a t statistic for hypothesized mu = 40 hours worked/week
gss %>%
observe(hours ~ NULL, stat = "t", null = "point", mu = 40)
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
# equivalently, calculating the same statistic with the core verbs
gss %>%
specify(response = hours) %>%
hypothesize(null = "point", mu = 40) %>%
calculate(stat = "t")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
We don’t anticipate that these changes are “breaking” in the sense that code that previously worked will continue to, though it may now message or warn in a way that it did not before or error with a different (and hopefully more informative) message.
A framework for theoretical inference
This release also introduces a more complete and principled interface for theoretical inference. While the package previously supplied some methods for visualization of theory-based curves, the interface did not provide any object that was explicitly a “null distribution” that could be supplied to helper functions like get_p_value() and get_confidence_interval(). The new interface is based on a new verb, assume(), that returns a null distribution that can be interfaced in the same way that simulation-based null distributions can be interfaced with.
As an example, we’ll work through a full infer pipeline for inference on a mean using infer’s gss dataset. Suppose that we believe the true mean number of hours worked by Americans in the past week is 40.
First, calculating the observed t-statistic:
obs_stat <- gss %>%
specify(response = hours) %>%
hypothesize(null = "point", mu = 40) %>%
calculate(stat = "t")
obs_stat
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
The code to define the null distribution is very similar to that required to calculate a theorized observed statistic, switching out calculate() for assume() and replacing arguments as needed.
null_dist <- gss %>%
specify(response = hours) %>%
assume(distribution = "t")
null_dist
#> A T distribution with 499 degrees of freedom.
This null distribution can now be interfaced with in the same way as a simulation-based null distribution elsewhere in the package. For example, calculating a p-value by juxtaposing the observed statistic and null distribution:
get_p_value(null_dist, obs_stat, direction = "both")
#> # A tibble: 1 x 1
#> p_value
#> <dbl>
#> 1 0.0376
…or visualizing the null distribution alone:
visualize(null_dist)
…or juxtaposing the two visually:
visualize(null_dist) +
shade_p_value(obs_stat, direction = "both")
Confidence intervals lie in data space rather than the standardized scale of the theoretical distributions. Calculating a mean rather than the standardized t-statistic:
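The obs_mean object passed as point_estimate in the confidence interval call is not defined in this text. A minimal sketch, assuming it is the observed sample mean computed with the core verbs in the same way as the earlier observed statistics:

```r
library(infer)

# assumed definition of obs_mean: the observed mean number of hours worked
obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

obs_mean
```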
The null distribution here just defines the spread for the standard error calculation.
ci <-
get_confidence_interval(
null_dist,
level = .95,
point_estimate = obs_mean
)
ci
#> # A tibble: 1 x 2
#> lower_ci upper_ci
#> <dbl> <dbl>
#> 1 40.1 42.7
Visualizing the confidence interval results in the theoretical distribution being recentered and rescaled to align with the scale of the observed data:
visualize(null_dist) +
shade_confidence_interval(ci)
Previous methods for interfacing with theoretical distributions are superseded; they will continue to be supported, though documentation will forefront the assume() interface.
Support for multiple regression
The 2016 “Guidelines for Assessment and Instruction in Statistics Education” [1] state that, in introductory statistics courses, “[s]tudents should gain experience with how statistical models, including multivariable models, are used.” In line with this recommendation, we introduce support for randomization-based inference with multiple explanatory variables via a new fit.infer core verb.
If passed an infer object, the method will parse a formula out of the formula or response and explanatory arguments, and pass both it and data to a stats::glm call.
gss %>%
specify(hours ~ age + college) %>%
fit()
#> # A tibble: 3 x 2
#> term estimate
#> <chr> <dbl>
#> 1 intercept 40.6
#> 2 age 0.00596
#> 3 collegedegree 1.53
Note that the function returns the model coefficients as estimate rather than their associated t-statistics as stat.
If passed a generate()d object, the model will be fitted to each replicate.
gss %>%
specify(hours ~ age + college) %>%
hypothesize(null = "independence") %>%
generate(reps = 100, type = "permute") %>%
fit()
#> # A tibble: 300 x 3
#> # Groups: replicate [100]
#> replicate term estimate
#> <int> <chr> <dbl>
#> 1 1 intercept 44.4
#> 2 1 age 0.0767
#> 3 1 collegedegree 0.121
#> 4 2 intercept 41.8
#> 5 2 age 0.00344
#> 6 2 collegedegree 1.59
#> 7 3 intercept 38.3
#> 8 3 age 0.0761
#> 9 3 collegedegree 0.136
#> 10 4 intercept 43.1
#> # … with 290 more rows
If type = "permute", a set of unquoted column names in the data to permute (independently of each other) can be passed via the variables argument to generate. It defaults to only the response variable.
gss %>%
specify(hours ~ age + college) %>%
hypothesize(null = "independence") %>%
generate(reps = 100, type = "permute", variables = c(age, college)) %>%
fit()
#> # A tibble: 300 x 3
#> # Groups: replicate [100]
#> replicate term estimate
#> <int> <chr> <dbl>
#> 1 1 intercept 39.4
#> 2 1 age 0.0748
#> 3 1 collegedegree 2.98
#> 4 2 intercept 42.8
#> 5 2 age 0.0190
#> 6 2 collegedegree 1.83
#> 7 3 intercept 40.4
#> 8 3 age 0.0354
#> 9 3 collegedegree 1.31
#> 10 4 intercept 40.9
#> # … with 290 more rows
This feature allows for more detailed exploration of the effect of disrupting the correlation structure among explanatory variables on outputted model coefficients.
Each of the auxiliary functions get_p_value(), get_confidence_interval(), visualize(), shade_p_value(), and shade_confidence_interval() has methods to handle fit() output! See their help-files for example usage. Note that shade_* functions now delay evaluation until they are added to an existing ggplot (e.g. that outputted by visualize()) with +.
Improvements
 Following extensive discussion, the generate() type type = "simulate" has been renamed to the more evocative type = "draw". We will continue to support type = "simulate" indefinitely, though supplying that argument will now prompt a message notifying the user of its preferred alias. (#233, #390)
 Fixed several bugs related to factors with unused levels. specify() will now drop unused factor levels and message that it has done so. (#374, #375, #397, #380)
 Added two.sided as an acceptable alias for two_sided for the direction argument in get_p_value() and shade_p_value(). (#355)
 Various improvements to documentation, including extending example sections in help-files, reorganizing the function reference in the {pkgdown} site, and linking more extensively among help-files.
Breaking changes
We don’t anticipate that any changes made in this release are “breaking” in the sense that code that previously worked will continue to, though it may now message or warn in a way that it did not before or error with a different (and hopefully more informative) message. If you currently teach or research with infer, we recommend re-running your materials and noting any changes in messaging and warning.
 Move forward with a number of planned deprecations. Namely, the GENERATION_TYPES object is now fully deprecated, and arguments that were relocated from visualize() to shade_p_value() and shade_confidence_interval() are now fully deprecated in visualize(). If supplied a deprecated argument, visualize() will warn the user and ignore the argument.
 Added a prop argument to rep_slice_sample() as an alternative to the n argument for specifying the proportion of rows in the supplied data to sample per replicate (#361, #362, #363). This changes the order of arguments of rep_slice_sample() (in order to be more aligned with dplyr::slice_sample()), which might break code if it didn’t use named arguments (like rep_slice_sample(df, 5, TRUE)). To fix this, use named arguments (like rep_slice_sample(df, 5, replace = TRUE)).
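To make the named-argument advice concrete, here is a short sketch using the bundled gss data. The sample sizes, number of replicates, and seed are arbitrary choices for illustration.

```r
library(infer)

set.seed(1)  # arbitrary seed, for reproducibility of the sketch only

# sample a fixed number of rows per replicate, naming every argument after the data
samples_n <- rep_slice_sample(gss, n = 5, replace = TRUE, reps = 3)

# alternatively, sample a proportion of the rows per replicate
samples_prop <- rep_slice_sample(gss, prop = 0.1, reps = 3)
```

Naming the arguments keeps such calls robust to any further reordering of the function signature.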
Other
 Added Simon P. Couch as an author. Long deserved for his reliable maintenance and improvements of the package.
[1]: GAISE College Report ASA Revision Committee, “Guidelines for Assessment and Instruction in Statistics Education College Report 2016,” http://www.amstat.org/education/gaise.
infer 0.5.4
CRAN release: 2021-01-13
 rep_sample_n() no longer errors when supplied a prob argument (#279)
 Added rep_slice_sample(), a light wrapper around rep_sample_n(), that more closely resembles dplyr::slice_sample() (the function that supersedes dplyr::sample_n()) (#325)
 Added a success, correct, and z argument to prop_test() (#343, #347, #353)
 Implemented observed statistic calculation for the standardized proportion z statistic (#351, #353)
 Various bug fixes and improvements to documentation and errors.
infer 0.5.3
CRAN release: 2020-07-14
Breaking changes
 get_confidence_interval() now uses column names (‘lower_ci’ and ‘upper_ci’) in output that are consistent with other infer functionality (#317).
New functionality
 get_confidence_interval() can now produce bias-corrected confidence intervals by setting type = "bias-corrected". Thanks to @davidbaniadam for the initial implementation (#237, #318)!
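A bias-corrected interval might be requested as in the sketch below, which assumes the bundled gss data and an arbitrary seed and number of bootstrap replicates. The point_estimate argument supplies the observed statistic that the correction is computed against.

```r
library(infer)

set.seed(1)  # arbitrary seed, for reproducibility of the sketch only

# observed mean number of hours worked
obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

# bootstrap distribution of the mean
boot_dist <- gss %>%
  specify(response = hours) %>%
  generate(reps = 500, type = "bootstrap") %>%
  calculate(stat = "mean")

# bias-corrected interval
ci_bc <- get_confidence_interval(
  boot_dist,
  level = 0.95,
  type = "bias-corrected",
  point_estimate = obs_mean
)
```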
infer 0.5.2
CRAN release: 2020-06-14
 Warn the user when a p-value of 0 is reported (#257, #273)
 Added new vignettes: chi_squared and anova (#268)
 Updates to documentation and existing vignettes (#268)
 Add alias for hypothesize() (hypothesise()) (#271)
 Subtraction order no longer required for difference-based tests: a warning will be raised in the case that the user doesn’t supply an order argument (#275, #281)
 Add new messages for common errors (#277)
 Increase coverage of theoretical methods in documentation (#278, #280)
 Drop missing values and reduce size of gss dataset used in examples (#282)
 Add stat = "ratio of props" and stat = "odds ratio" to calculate (#285)
 Add prop_test(), a tidy interface to prop.test() (#284, #287)
 Updates to visualize() for compatibility with ggplot2 v3.3.0 (#289)
 Fix error when bootstrapping with small samples and raise warnings/errors when appropriate (#239, #244, #291)
 Fix unit test failures resulting from breaking changes in dplyr v1.0.0
 Fix error in generate() when response variable is named x (#299)
 Add two-sided and two sided as aliases for two_sided for the direction argument in get_p_value() and shade_p_value() (#302)
 Fix t_test() and t_stat() ignoring the order argument (#310)
infer 0.5.0
CRAN release: 2019-09-27
Breaking changes
 shade_confidence_interval() now plots vertical lines starting from zero (previously from the bottom of a plot) (#234).
 shade_p_value() now uses “area under the curve” approach to shading (#229).
Other
 Updated chisq_test() to take arguments in a response/explanatory format, perform goodness of fit tests, and default to the approximation approach (#241).
 Updated chisq_stat() to do goodness of fit (#241).
 Make interface to hypothesize() clearer by adding the options for the point null parameters to the function signature (#242).
 Manage infer class more systematically (#219).
 Use vdiffr for plot testing (#221).
infer 0.4.1
 Added Evgeni Chasnovski as author for his incredible work on refactoring the package and providing excellent support.
infer 0.4.0
CRAN release: 2018-11-15
Breaking changes
 Changed method of computing two-sided p-value to a more conventional one. It also makes get_pvalue() and visualize() more aligned (#205).
Deprecation changes
 Deprecated p_value() (use get_p_value() instead) (#180).
 Deprecated conf_int() (use get_confidence_interval() instead) (#180).
 Deprecated (via warnings) plotting p-value and confidence interval in visualize() (use new functions shade_p_value() and shade_confidence_interval() instead) (#178).
New functions
 shade_p_value(): {ggplot2}-like layer function to add information about p-value region to visualize() output. Has alias shade_pvalue().
 shade_confidence_interval(): {ggplot2}-like layer function to add information about confidence interval region to visualize() output. Has alias shade_ci().
Other
 Account for NULL value in left hand side of formula in specify() (#156) and type in generate() (#157).
 Update documentation code to follow tidyverse style guide (#159).
 Remove help page for internal set_params() (#165).
 Fully use {tibble} (#166).
 Fix calculate() to not depend on order of p for type = "simulate" (#122).
 Reduce code duplication (#173).
 Make transparency in visualize() to not depend on method and data volume.
 Make visualize() work for “One sample t” theoretical type with method = "both".
 Add stat = "sum" and stat = "count" options to calculate() (#50).
infer 0.3.1
CRAN release: 2018-08-06
 Stop using package {assertive} in favor of custom type checks (#149)
 Fixed t_stat() to use ... so var.equal works
 With the help of @echasnovski, fixed var.equal = TRUE for specify() %>% calculate(stat = "t")
 Use custom functions for error, warning, message, and paste() handling (#155)
infer 0.3.0
CRAN release: 2018-07-11
 Added conf_int logical argument and conf_level argument to t_test()
 Switched shade_color argument in visualize() to be pvalue_fill instead since fill color for confidence intervals is also added now
 Shading for Confidence Intervals in visualize()
  Green is default color for CI and red for p-values
  direction = "between" to get the green shading
  Currently working only for simulation-based methods
 Implemented conf_int() function for computing confidence interval provided a simulation-based method with a stat variable
  get_ci() and get_confidence_interval() are aliases for conf_int()
  Converted longer confidence interval calculation code in vignettes to use get_ci() instead
 Implemented p_value() function for computing p-value provided a simulation-based method with a stat variable
  get_pvalue() is an alias for p_value()
  Converted longer p-value calculation code in vignettes to use get_pvalue() instead
 Implemented Chi-square Goodness of Fit observed stat depending on params being set in hypothesize with specify() %>% calculate() shortcut
 Removed “standardized” slope t since its formula is different than “standardized” correlation and there is no way currently to give one over the other
 Implemented correlation with bootstrap CI and permutation hypothesis test
 Filled the type argument automatically in generate() based on specify() and hypothesize()
  Added message if type is given differently than expected
 Implemented specify() %>% calculate() for getting observed statistics.
  visualize() works with either a 1x1 data frame or a vector for its obs_stat argument
  Got stat = "t" working
 Refactored calculate() into smaller functions to reduce complexity
 Produced error if mu is given in hypothesize() but stat = "median" is provided in calculate() and other similar misspecifications
 Tweaked chisq_stat() and t_stat() to match with specify() %>% calculate() framework
  Both work in the one sample and two sample cases by providing formula
  Added order argument to t_stat()
 Added implementation of one sample t_test() by passing in the mu argument to t.test from hypothesize()
 Tweaked pkgdown page to include ToDo’s using {dplyr} example
infer 0.2.0
CRAN release: 2018-05-15
 Switched to !! instead of UQ() since UQ() is deprecated in {rlang} 0.2.0
 Added many new files: CONDUCT.md, CONTRIBUTING.md, and TODO.md
 Updated README file with more development information
 Added wrapper functions t_test() and chisq_test() that use a formula interface and provide an intuitive wrapper to t.test() and chisq.test()
 Created stat = "z" and stat = "t" options
 Added many new arguments to visualize() to prescribe colors to shade and use for observed statistics and theoretical density curves
 Added check so that a bar graph is created with visualize() if number of unique values for generated statistics is small
 Added shading for method = "theoretical"
 Implemented shading for simulation methods w/o a traditional distribution
  Use percentiles to determine two-tailed shading
 Changed method = "randomization" to method = "simulation"
 Added warning when theoretical distribution is used that assumptions should be checked
 Added theoretical distributions to visualize() alone and as overlay with current implementations being
  Two sample t
  ANOVA F
  One proportion z
  Two proportion z
  Chi-square test of independence
  Chi-square Goodness of Fit test
  Standardized slope (t)
infer 0.1.1
CRAN release: 2018-01-22
 Added additional tests
 Added order argument in calculate()
 Fixed bugs post-CRAN release
 Automated travis build of pkgdown to gh-pages branch
infer 0.1.0
CRAN release: 2018-01-08
 Altered the way that successes are indicated in an infer pipeline. They now live in specify().
 Updated documentation with examples
 Created pkgdown site materials
  Deployed to https://infer.tidymodels.org/