
Fast Data Manipulation
collapse provides the following functions for fast manipulation of (mostly) data frames.
- `fselect` is a much faster alternative to `dplyr::select` to select columns using expressions involving column names. `get_vars` is a more versatile and programmer-friendly function to efficiently select and replace columns by names, indices, logical vectors, regular expressions, or using functions to identify columns.
- `num_vars`, `cat_vars`, `char_vars`, `fact_vars`, `logi_vars` and `date_vars` are convenience functions to efficiently select and replace columns by data type.
- `add_vars` efficiently adds new columns at any position within a data frame (default at the end). This can be done via replacement (i.e. `add_vars(data) <- newdata`) or by returning the appended data, e.g., `add_vars(data, newdata1, newdata2, ...)`. It is thus also an efficient alternative to `cbind.data.frame`.
- `rowbind` efficiently combines data frames / lists row-wise. The implementation is derived from `data.table::rbindlist`; it is also a fast alternative to `rbind.data.frame`.
- `join` provides fast, class-agnostic, and verbose table joins.
- `pivot` efficiently reshapes data, supporting longer, wider and recast pivoting, as well as multi-column pivots and pivots taking along variable labels.
- `fsubset` is a much faster version of `subset` to efficiently subset vectors, matrices and data frames. If the non-standard evaluation offered by `fsubset` is not needed, the function `ss` is a much faster and more secure alternative to `[.data.frame`.
- `fslice(v)` is a much faster alternative to `dplyr::slice_[head|tail|min|max]` for filtering/deduplicating matrix-like objects (by groups).
- `fsummarise` is a much faster version of `dplyr::summarise`, especially when used together with the Fast Statistical Functions and `fgroup_by`.
- `fmutate` is a much faster version of `dplyr::mutate`, especially when used together with the Fast Statistical Functions, the fast Data Transformation Functions, and `fgroup_by`.
- `ftransform(v)` is a much faster version of `transform`, which also supports list input and nested pipelines. `settransform(v)` does all of that by reference, i.e. it assigns to the calling environment. `fcompute(v)` is similar to `ftransform(v)` but only returns modified/computed columns.
- `roworder` is a fast substitute for `dplyr::arrange`, but the syntax is inspired by `data.table::setorder`.
- `colorder` efficiently reorders columns in a data frame, see also `data.table::setcolorder`.
- `frename` is a fast substitute for `dplyr::rename`, to efficiently rename various objects. `setrename` renames objects by reference. `relabel` and `setrelabel` do the same thing for variable labels (see also `vlabels`).
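As a rough illustration of the column selection and subsetting functions described above, the following sketch uses the built-in `mtcars` and `iris` datasets. The object names (`cars_sel`, `by_regex`, `heavy`, and so on) are made up for this example, and the calls show only one plausible way of using each function; the function help pages remain the authoritative reference for arguments and defaults.

```r
library(collapse)

# Select columns by name with non-standard evaluation
cars_sel <- fselect(mtcars, mpg, cyl, hp)

# Select columns programmatically: by names, indices, or a regular expression
by_name  <- get_vars(mtcars, c("mpg", "cyl"))
by_index <- get_vars(mtcars, 1:3)
by_regex <- get_vars(iris, "^Sepal", regex = TRUE)

# Select columns by data type
iris_num  <- num_vars(iris)   # numeric measurement columns
iris_fact <- fact_vars(iris)  # the Species factor column

# Add columns: returning the appended data, or via replacement
iris_back <- add_vars(iris_num, iris_fact)
iris_copy <- iris_num
add_vars(iris_copy) <- iris_fact

# Subset rows (and select columns) with non-standard evaluation
heavy <- fsubset(mtcars, wt > 4, mpg, cyl, wt)

# Subset rows and columns with standard evaluation using ss()
first_rows <- ss(mtcars, 1:5, c("mpg", "wt"))
```

`fsubset` is convenient for interactive use, whereas `get_vars` and `ss` take ordinary R objects as arguments, which tends to make them easier to use inside functions.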
Table of Functions
| Function / S3 Generic | Methods | Description |
|---|---|---|
| `fselect(<-)` | No methods, for data frames | Fast select or replace columns (non-standard evaluation) |
| `get_vars(<-)`, `num_vars(<-)`, `cat_vars(<-)`, `char_vars(<-)`, `fact_vars(<-)`, `logi_vars(<-)`, `date_vars(<-)` | No methods, for data frames | Fast select or replace columns |
| `add_vars(<-)` | No methods, for data frames | Fast add columns |
| `rowbind` | No methods, for lists of lists/data frames | Fast row-binding lists |
| `join` | No methods, for data frames | Fast table joins |
| `pivot` | No methods, for data frames | Fast reshaping |
| `fsubset` | default, matrix, data.frame, pseries, pdata.frame | Fast subset data (non-standard evaluation) |
| `ss` | No methods, for data frames | Fast subset data frames |
| `fslice(v)` | No methods, for matrices and data frames | Fast slicing of rows |
| `fsummarise` | No methods, for data frames | Fast data aggregation |
| `fmutate`, `(f/set)transform(v)(<-)` | No methods, for data frames | Compute, modify or delete columns (non-standard evaluation) |
| `fcompute(v)` | No methods, for data frames | Compute or modify columns, returned in a new data frame (non-standard evaluation) |
| `roworder(v)` | No methods, for data frames incl. pdata.frame | Reorder rows and return data frame (standard and non-standard evaluation) |
| `colorder(v)` | No methods, for data frames | Reorder columns and return data frame (standard and non-standard evaluation) |
| `(f/set)rename`, `(set)relabel` | No methods, for all objects with 'names' attribute | Rename and return object / relabel columns in a data frame |
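
The aggregation, transformation, ordering and renaming functions in the table can be combined as in the following sketch, again using the built-in `mtcars` dataset. The variable names and the new column names (`mean_mpg`, `mpg_centered`, `hp_per_cyl`, `miles_per_gallon`) are chosen for illustration only, and the example assumes the default arguments of each function.

```r
library(collapse)

# Group once, then reuse the grouped data frame
grouped <- fgroup_by(mtcars, cyl)

# Grouped aggregation using Fast Statistical Functions
agg <- fsummarise(grouped, mean_mpg = fmean(mpg), max_hp = fmax(hp))

# Grouped column computation: subtract the group mean of mpg within each cyl group
centered <- fungroup(fmutate(grouped, mpg_centered = mpg - fmean(mpg)))

# fcompute() returns only the computed column(s)
ratio <- fcompute(mtcars, hp_per_cyl = hp / cyl)

# Order rows by cyl (ascending), then by mpg (descending)
sorted <- roworder(mtcars, cyl, -mpg)

# Move selected columns to the front
reordered <- colorder(mtcars, hp, cyl)

# Rename with old = new pairs, or by applying a function to all names
renamed <- frename(mtcars, mpg = miles_per_gallon)
upper   <- frename(mtcars, toupper)
```

Note that `frename` takes `old = new` pairs, which is the reverse of the `new = old` convention used by `dplyr::rename`.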