pacs
pacs: A set of tools that make life easier for developers and maintainers of R packages.
renv
lock
files.Hint0: An Internet connection is required to take full advantage of most features.
Hint1: Almost all calls that requiring an Internet
connection are cached (for 30 minutes) by the memoise
package, so the second invocation of the same command (and arguments) is
immediate. Restart the R session if you want to clear cached data.
Hint2: Version
variable is mostly a
minimal required i.e. max(version1, version2 , …).
Hint3: When working with many packages, global
functions are recommended, which retrieve data for many packages at
once. An example will be the usage of
pacs::checked_packages()
over
pacs::pac_checkpage
(or pacs::pac_checkred
).
Another example will be the usage of
utils::available.packages
over pacs::pac_last
.
Finally, the most important one will be pacs::lib_validate
over pacs::pac_validate
and pacs::pac_checkred
and others.
Hint4: Character string “all” is shorthand for the
c("Depends", "Imports", "LinkingTo", "Suggests", "Enhances")
vector, character string “most” for the same vector without “Enhances”,
character string “strong” (default setup) for the first three elements
of that vector.
Hint5: Use parallel::mclapply
(Linux
and Mac) or parallel::parLapply
(Windows, Linux and Mac) to
speed up loop calculations. Nevertheless, under
parallel::mclapply
, computation results are NOT cached with
memoise
package. Warning: Parallel computations might be
unstable.
Hint6: withr
and remotes
packages are a valuable addition.
This procedure will be crucial for R developers, clearly showing the
possible broken packages inside the local library.
We could assess which packages require versions to update.
Default validation of the library with the
pacs::lib_validate
function.
The field
argument is equal to
c("Depends", "Imports", "LinkingTo")
by default as these
are the dependencies installed when the install.packages
function is used.
The full library validation requires activation of two additional
arguments, lifeduration
and checkred
.
Additional arguments are by default turned off as they are
time-consuming, for lifeduration
assessment might take even
a few minutes for bigger libraries.
Assessment of status on CRAN check pages takes only a few additional
seconds, even for all R CRAN packages.
pacs::checked_packages()
is used to gather all package
check statuses for all CRAN servers.
When lifeduration
is triggered then assessment might
take even few minutes.
Not only scope
field inside checkred
list
could be updated, to remind any of
c("ERROR", "FAIL", "WARN", "NOTE")
. We could specify
flavors
field inside the checkred
list
argument and narrow the tested machines. The full list of CRAN servers
(flavors) might be get with
pacs::cran_flavors()$Flavor
.
flavs <- pacs::cran_flavors()$Flavor[1:2]
pacs::lib_validate(checkred = list(scope = c("ERROR", "FAIL"),
flavors = flavs))
Packages are not installed (and should be) or have too low version:
lib <- pacs::lib_validate(checkred = list(scope = c("ERROR", "FAIL")))
# not installed (and should be) or too low version
lib[(lib$version_status == -1), ]
# not installed (and should be)
lib[is.na(lib$Version.have), ]
# too low version
lib[(!is.na(lib$Version.have)) & (lib$version_status == -1), ]
Packages which have at least one CRAN server which ERROR or FAIL:
Packages that are not a dependency (default
c("Depends", "Imports", "LinkingTo")
) of any other
package:
Non-CRAN packages:
Not newest packages:
The core idea behind the function comes from proper processing of the
installed.packages
function result.
# aggregate function is needed as we could have different versions
# installed under different `.libPaths()`.
installed_packages_unique <- stats::aggregate(
installed.packages()[, c("Version", "Depends", "Imports", "LinkingTo")],
list(Package = installed.packages()[, "Package"]),
function(x) x[1]
)
# installed_descriptions function transforms direct dependencies DESCRIPTION file fields
# installed_packages_unique[, c("Depends", "Imports", "LinkingTo")]
# to the two column data.frame with Package name
# and minimum required Version i.e. max(version1, version2 , ...).
installed_descriptions <- pacs:::installed_descriptions(
lib.loc = .libPaths(),
fields = c("Depends", "Imports", "LinkingTo")
)
merge(
installed_descriptions,
installed_packages_unique[, c("Package", "Version")],
by = "Package",
all = TRUE,
suffix = c(".expected.min", ".have")
)
When a project is based on renv
and all needed
dependencies are installed in the renv
directory, we mostly
want to validate only the isolated renv
library. In the new
renv
versions the .libPaths()
contains the
main library path too (renv
library and the main library).
Please remember to limit the library path when using
pacs::lib_validate
, to limit the validation to only
renv
library.
Warning, at least rsconnect
(and its
packrat
connected dependencies) related packages could
still not be in the renv
library.
There is a way to validate the renv
lock file in the
same way the local library or packages are validated.
# a path or url
url <- "https://raw.githubusercontent.com/Polkas/pacs/master/tests/testthat/files/renv_test.lock"
pacs::lock_validate(url)
pacs::lock_validate(
url,
checkred = list(scope = c("ERROR", "FAIL"), flavors = NULL)
)
pacs::lock_validate(
url,
lifeduration = TRUE,
checkred = list(scope = c("ERROR", "FAIL"), flavors = NULL)
)
checked_packages
was built to extend the
.packages
family functions, like
utils::installed.packages()
and
utils::available.packages()
.
pacs::checked_packages
retrieves all current package checks
from CRAN
webpage.
Use pacs::pac_checkpage("dplyr")
to get the check page
per package. However, pacs::checked_packages()
will be more
efficient for many packages. Remember that
pacs::checked_packages()
result is cached after the first
invoke.
We could determine if a specific package version lived more than 14 days (or other x limit days). If not then we might assume something needed to be fixed with it, as had to be quickly updated.
e.g. dplyr
under the “0.8.0” version seems to be a
broken release, we could find out that it was published only for 1
day.
With a 14 day limit we get a proper health status. We are sure about
this state as this is not the newest release. For the newest packages we
are checking if there are any red messages on CRAN check pages too,
specified with a scope
argument.
For the newest package, we will check the CRAN check page too, the scope might be adjusted.
withr
packagewithr
package is recommended for the isolated download
process.
We could use a temporary library path
(withr::with_temp_libpaths
) to check if the process is as
expected.
Checking what packages need to be installed/(optionally updated)
parallel with a specific package, with remotes
package. The
full list even with packages which are already installed could be get
with pacs::pac_deps_user
.
Isolated download of a package and the validation.
Using R CRAN website to get packages version/versions used at a specific Date or a Date interval.
pacs::pac_timemachine("dplyr")
pacs::pac_timemachine("dplyr", version = "0.8.0")
pacs::pac_timemachine("dplyr", at = as.Date("2017-02-02"))
pacs::pac_timemachine("dplyr", from = as.Date("2017-02-02"), to = as.Date("2018-04-02"))
pacs::pac_timemachine("dplyr", at = Sys.Date())
pacs::pac_timemachine("tidyr", from = as.Date("2020-06-01"), to = Sys.Date())
One of the main functionality is to get versions for all package dependencies. Versions might come from installed packages or DESCRIPTION files.
pac_deps
for an extremely fast retrieving of package
dependencies, packages versions might come from installed ones or from
DESCRIPTION files (required minimum). The default setup is to show
dependencies recursively, recursive = TRUE
.
# Providing more than tools::package_dependencies and packrat:::recursivePackageVersion
# pacs::pac_deps is providing the min required version for each package
# Use it to answer what we should have
res <- pacs::pac_deps("shiny", description_v = TRUE)
res
attributes(res)
Packages dependencies with versions from DESCRIPTION files.
Remote (newest CRAN) package dependencies with versions.
Raw dependencies from DESCRIPTION file. The same which is needed by
the the install.packages
function.
Depends/Imports/LinkingTo
DESCRIPTION fields dependencies, recursively.
pacs::pac_deps_user
could be used to check them.
pacs::pac_deps("memoise", description_v = TRUE, recursive = FALSE, local = FALSE)
# or
pacs::pac_deps_user("memoise")
The field
argument is used to change the scope of
exploration. The field
argument is equal to
c("Depends", "Imports", "LinkingTo")
by default as these
are the dependencies installed when the install.packages
function is used. When the field
argument is extended the
number of dependencies will grow. Remember that we are looking for
dependencies recursively by default. At the moment of writing it the
first invoke returns 3 dependencies whereas the second over an one
thousand. It should be clear that when extending the scope (and
recursively) with the "Suggests"
field then the number of
dependencies is exploding.
nrow(pacs::pac_deps("memoise", fields = c("Depends", "Imports", "LinkingTo")))
nrow(pacs::pac_deps("memoise", fields = c("Depends", "Imports", "LinkingTo", "Suggests")))
The developer dependencies are the ones needed when
e.g. R CMD check
is run. These are
Depends/Imports/LinkingTo/Suggests
DESCRIPTION fields dependencies, and for them
Depends/Imports/LinkingTo
recursively. pacs::pac_deps_dev
could be used to check
them. Obviously the list is much longer as the one for
pacs::pac_deps_user
.
For a certain version (archived), might take some time.
Reading raw dcf
DESCRIPTION files scrapped from the
github CRAN mirror or if not worked from the CRAN website.
Comparing DESCRIPTION file dependencies between local and the newest package. We will get duplicated columns if the local version is the newest one.
Comparing DESCRIPTION file dependencies between package versions.
Comparing NAMESPACE between local and the newest package.
Comparing NAMESPACE between package versions.
Take into account that packages sizes are appropriate for your local
system (Sys.info()
). Installation with
install.packages
and some devtools
functions
might result in different packages sizes.
If you do not want to install anything in your current library
(.libPaths()
) and still inspect a package size, then a
usage of the withr
package is recommended.
withr::with_temp_libpaths
is recommended to isolate the
download process.
# restart of R session could be needed
withr::with_temp_libpaths({install.packages("devtools"); cat(pacs::pac_true_size("devtools") / 10**6, "MB", "\n")})
Installation in your main library.
Size of the devtools
package:
True size of the package as taking into account its dependencies. At
the time of writing it, it is 113MB
for
devtools
without base packages
(Mac OS arm64
).
A reasonable assumption might be to count only dependencies which are
not used by any other package. Then we could use
exclude_joint
argument to limit them. However hard to
assume if your local installation is a reasonable proxy for an average
user.
# exclude packages if at least one other package use it too
cat(pacs::pac_true_size("devtools", exclude_joint = 1L) / 10**6, "MB", "\n")
We could check out which of the direct dependencies are heaviest ones:
The shiny app dependencies packages are checked. By default the
c("Depends", "Imports", "LinkingTo")
DESCRIPTION files
fields are check recursively for each package recognized with the
renv::dependencies
function. The required dependencies have
to be installed in the local repository.
pacs::app_deps(system.file("examples/04_mpg", package = "shiny"), description_v = TRUE)
pacs::app_deps(system.file("examples/04_mpg", package = "shiny"), description_v = TRUE, local = FALSE)
When we want to check only direct dependencies,
recursive
argument has to be set to FALSE
.
Then you could use the renv::dependencies
function
directly.
The size of shiny app is a sum of dependencies and the app directory. The app dependencies (packages) are checked recursively.