The dataverse package uses disk and session caches to improve network performance. This page describes how the cache is used.

cache_dataset(version)

cache_path()

cache_info()

cache_reset()

Arguments

version

A character string specifying a version of the dataset. This can be of the form "1.1" or "1" (where in "x.y", x is a major version and y is an optional minor version), or ":latest" (the default, the latest published version). We recommend using the number format so that the function stores a cache of the data (see cache_dataset). If the user specifies a key or DATAVERSE_KEY argument, they can access the draft version with ":draft" (the current draft) or ":latest" (which will prioritize the draft over the latest published version). Finally, set use_cache = "none" to skip the cache and download afresh even when version is provided.

Value

cache_dataset() returns "disk" if the dataset version is to be cached to disk, "none" otherwise.

cache_path() returns the file path to the directory containing the cache.

cache_info() returns a data.frame containing names and sizes of files in the cache.

cache_reset() returns the path to the (now empty) cache, invisibly.

Details

Use of the cache is determined by the value of the use_cache argument to dataset and other API calls, or by the environment variable DATAVERSE_USE_CACHE. Possible values are:

  • "none": do not use the cache. This is the default for datasets that are versioned with ":draft", ":latest", and ":latest-published".

  • "session": cache API requests for the duration of the R session. This is the default for API calls that do not involve file or dataset retrieval.

  • "disk": use a permanent disk cache. This is the default for files and explicitly versioned datasets.
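
As a minimal sketch of the environment-variable route, the snippet below sets DATAVERSE_USE_CACHE for part of a session using base R only and then restores the previous state. It assumes the package reads the variable when an API call is made; the exact timing is not stated on this page.

```r
# Remember the current setting so it can be restored afterwards.
old <- Sys.getenv("DATAVERSE_USE_CACHE", unset = NA)

# Turn caching off for subsequent calls (assumption: the variable is
# consulted at call time rather than once at package load).
Sys.setenv(DATAVERSE_USE_CACHE = "none")
current <- Sys.getenv("DATAVERSE_USE_CACHE")

# ... dataverse API calls would go here ...

# Restore the previous state.
if (is.na(old)) {
  Sys.unsetenv("DATAVERSE_USE_CACHE")
} else {
  Sys.setenv(DATAVERSE_USE_CACHE = old)
}
```

Passing use_cache directly to a single call is the narrower alternative when only one request should bypass the cache.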

cache_dataset() determines whether a dataset or file should be cached based on the version specification.
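
The version rule can be sketched in a few lines. The function sketch_cache_dataset() below is a hypothetical stand-in written for illustration, not the package's implementation; it assumes the only specifiers that disable caching are ":draft", ":latest", and ":latest-published".

```r
# Hypothetical re-statement of the rule; not the dataverse source code.
sketch_cache_dataset <- function(version) {
  # Moving targets can change between calls, so they are not cached.
  moving_targets <- c(":draft", ":latest", ":latest-published")
  if (version %in% moving_targets) "none" else "disk"
}

sketch_cache_dataset(":latest")
sketch_cache_dataset("1.2")
```

The point of the rule is that an explicit version number refers to fixed content, so a cached copy stays valid, whereas ":latest" or ":draft" may resolve to different content on a later call.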

cache_path() finds or creates the location (directory) on the file system containing the cache.

cache_info() queries the cache for information about the name, size, and other attributes of files in the cache. The file name is a 'hash' of the function used to retrieve the file; it is not useful for identifying specific files.
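
Because the file names are hashes, the result is most useful in aggregate. The sketch below totals the size column, assuming the data.frame has the same columns as base R's file.info() (as the sample output in the Examples suggests). To run without touching a real cache, it builds a stand-in data frame from a temporary directory; demo_dir and the file names are hypothetical.

```r
# Stand-in for the cache directory (hypothetical; the real one is cache_path()).
demo_dir <- file.path(tempdir(), "demo_cache")
dir.create(demo_dir, showWarnings = FALSE)
writeBin(raw(10), file.path(demo_dir, "a"))  # 10-byte file
writeBin(raw(20), file.path(demo_dir, "b"))  # 20-byte file

# Assumed to have the same shape as the result of cache_info().
info <- file.info(list.files(demo_dir, full.names = TRUE))

# Aggregate summaries: number of cached files and total bytes on disk.
n_files <- nrow(info)
total_bytes <- sum(info$size)

# Clean up the stand-in directory.
unlink(demo_dir, recursive = TRUE)
```

With the package loaded, replacing the file.info() call with cache_info() should give the same summaries for the actual cache, under the column assumption above.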

cache_reset() clears all downloaded files from the disk cache.

Examples

cache_dataset(":latest")  # "none"
#> [1] "none"
cache_dataset("1.2")      # "disk"
#> [1] "disk"

if (FALSE) { # \dontrun{
  # Specifying a version stores a cache by default.
  # Add `use_cache = "none"` to turn caching off.
  df_tab <-
    get_dataframe_by_name(
      filename = "roster-bulls-1996.tab",
      dataset  = "doi:10.70122/FK2/HXJVJU",
      server   = "demo.dataverse.org",
      version  = "3"
    )
} # }

cache_path()
#> [1] "/Users/runner/Library/Caches/org.R-project.R/R/dataverse/api_cache"

cache_info()
#>  [1] size   isdir  mode   mtime  ctime  atime  uid    gid    uname  grname
#> <0 rows> (or 0-length row.names)