
Fast Data Manipulation
fast-data-manipulation.Rd
collapse provides the following functions for fast manipulation of (mostly) data frames.
fselect is a much faster alternative to dplyr::select to select columns using expressions involving column names.
get_vars is a more versatile and programmer-friendly function to efficiently select and replace columns by names, indices, logical vectors, regular expressions, or using functions to identify columns.
num_vars, cat_vars, char_vars, fact_vars, logi_vars and date_vars are convenience functions to efficiently select and replace columns by data type.
add_vars efficiently adds new columns at any position within a data frame (default at the end). This can be done via replacement (i.e. add_vars(data) <- newdata) or by returning the appended data, e.g., add_vars(data, newdata1, newdata2, ...). It is thus also an efficient alternative to cbind.data.frame.
rowbind efficiently combines data frames / lists row-wise. The implementation is derived from data.table::rbindlist; it is also a fast alternative to rbind.data.frame.
join provides fast, class-agnostic, and verbose table joins.
pivot efficiently reshapes data, supporting longer, wider and recast pivoting, as well as multi-column pivots and pivots taking along variable labels.
fsubset is a much faster version of subset to efficiently subset vectors, matrices and data frames. If the non-standard evaluation offered by fsubset is not needed, the function ss is a much faster and more secure alternative to [.data.frame.
fslice(v) is a much faster alternative to dplyr::slice_[head|tail|min|max] for filtering/deduplicating matrix-like objects (by groups).
fsummarise is a much faster version of dplyr::summarise, especially when used together with the Fast Statistical Functions and fgroup_by.
fmutate is a much faster version of dplyr::mutate, especially when used together with the Fast Statistical Functions, the fast Data Transformation Functions, and fgroup_by.
ftransform(v) is a much faster version of transform, which also supports list input and nested pipelines. settransform(v) does all of that by reference, i.e. it assigns to the calling environment. fcompute(v) is similar to ftransform(v) but only returns modified/computed columns.
roworder is a fast substitute for dplyr::arrange, but the syntax is inspired by data.table::setorder.
colorder efficiently reorders columns in a data frame, see also data.table::setcolorder.
frename is a fast substitute for dplyr::rename, to efficiently rename various objects. setrename renames objects by reference. relabel and setrelabel do the same thing for variable labels (see also vlabels).
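A few of the functions listed above can be tried on R's built-in datasets. The following is a minimal sketch, not a complete tour: the object names (sel, nums, four_cyl, agg, out) and the computed column hp_per_wt are made up for illustration, and the calls assume a recent collapse version.

```r
library(collapse)

# Select columns by expression (fselect) and by data type (num_vars)
sel  <- fselect(mtcars, mpg, cyl, hp)
nums <- num_vars(iris)   # keeps the numeric columns, drops the factor Species

# Subset rows with non-standard evaluation, keeping two columns
four_cyl <- fsubset(mtcars, cyl == 4, mpg, hp)

# Grouped aggregation with a Fast Statistical Function (fmean)
agg <- fsummarise(fgroup_by(mtcars, cyl), mean_mpg = fmean(mpg))

# Add a computed column, then sort by cyl (ascending) and mpg (descending)
out <- roworder(ftransform(mtcars, hp_per_wt = hp / wt), cyl, -mpg)
```

Nested calls are used here instead of a pipe so the sketch does not depend on a particular pipe operator; the same steps can be chained with a pipe if preferred.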
Table of Functions
Function / S3 Generic | Methods | Description
fselect(<-) | No methods, for data frames | Fast select or replace columns (non-standard evaluation)
get_vars(<-), num_vars(<-), cat_vars(<-), char_vars(<-), fact_vars(<-), logi_vars(<-), date_vars(<-) | No methods, for data frames | Fast select or replace columns
add_vars(<-) | No methods, for data frames | Fast add columns
rowbind | No methods, for lists of lists/data frames | Fast row-binding lists
join | No methods, for data frames | Fast table joins
pivot | No methods, for data frames | Fast reshaping
fsubset | default, matrix, data.frame, pseries, pdata.frame | Fast subset data (non-standard evaluation)
ss | No methods, for data frames | Fast subset data frames
fslice(v) | No methods, for matrices and data frames | Fast slicing of rows
fsummarise | No methods, for data frames | Fast data aggregation
fmutate, (f/set)transform(v)(<-) | No methods, for data frames | Compute, modify or delete columns (non-standard evaluation)
fcompute(v) | No methods, for data frames | Compute or modify columns, returned in a new data frame (non-standard evaluation)
roworder(v) | No methods, for data frames incl. pdata.frame | Reorder rows and return data frame (standard and non-standard evaluation)
colorder(v) | No methods, for data frames | Reorder columns and return data frame (standard and non-standard evaluation)
(f/set)rename, (set)relabel | No methods, for all objects with 'names' attribute | Rename and return object / relabel columns in a data frame
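The shorthand in the table can be read as follows: "(<-)" marks functions that also have a replacement form, "(v)" marks a standard-evaluation variant that takes columns as a vector, and "set" marks variants that assign the result back to the input object. The sketch below illustrates each on mtcars; the object names (d, logged, sorted, reordered) and the new columns hp_per_wt and wt_kg are hypothetical, and the calls assume a recent collapse version.

```r
library(collapse)

d <- mtcars

# "(<-)": replacement form, here overwriting one selected column
get_vars(d, "mpg") <- d$mpg * 2

# add_vars(<-): append columns; fcompute returns only the computed column
add_vars(d) <- fcompute(d, hp_per_wt = hp / wt)

# "(v)": standard-evaluation variants take column names as character vectors
logged    <- ftransformv(mtcars, c("mpg", "hp"), log)
sorted    <- roworderv(mtcars, c("cyl", "mpg"))
reordered <- colorderv(mtcars, c("hp", "wt"))   # moves hp and wt to the front

# "set": the result is assigned back to `d` in the calling environment
settransform(d, wt_kg = wt * 453.592)
```

The "(v)" variants are usually the more convenient choice inside functions, where column names arrive as strings rather than as unquoted expressions.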