flag is an S3 generic to compute (sequences of) lags and leads. L and F are wrappers around flag representing the lag- and lead-operators, such that L(x,-1) = F(x,1) = F(x) and L(x,-3:3) = F(x,3:-3). L and F provide more flexibility than flag when applied to data frames (i.e. column subsetting, formula input and id-variable-preservation capabilities...), but are otherwise identical.

(flag is more of a programmer's function in the style of the Fast Statistical Functions, while L and F are more practical to use in regression formulas or for computations on data frames.)

## Usage

flag(x, n = 1, ...)
L(x, n = 1, ...)
F(x, n = 1, ...)

# S3 method for default
flag(x, n = 1, g = NULL, t = NULL, fill = NA, stubs = TRUE, ...)
# S3 method for default
L(x, n = 1, g = NULL, t = NULL, fill = NA, stubs = TRUE, ...)
# S3 method for default
F(x, n = 1, g = NULL, t = NULL, fill = NA, stubs = TRUE, ...)

# S3 method for matrix
flag(x, n = 1, g = NULL, t = NULL, fill = NA, stubs = length(n) > 1L, ...)
# S3 method for matrix
L(x, n = 1, g = NULL, t = NULL, fill = NA, stubs = TRUE, ...)
# S3 method for matrix
F(x, n = 1, g = NULL, t = NULL, fill = NA, stubs = TRUE, ...)

# S3 method for data.frame
flag(x, n = 1, g = NULL, t = NULL, fill = NA, stubs = length(n) > 1L, ...)
# S3 method for data.frame
L(x, n = 1, by = NULL, t = NULL, cols = is.numeric,
  fill = NA, stubs = TRUE, keep.ids = TRUE, ...)
# S3 method for data.frame
F(x, n = 1, by = NULL, t = NULL, cols = is.numeric,
  fill = NA, stubs = TRUE, keep.ids = TRUE, ...)

# Methods for compatibility with plm:

# S3 method for pseries
flag(x, n = 1, fill = NA, stubs = TRUE, ...)
# S3 method for pseries
L(x, n = 1, fill = NA, stubs = TRUE, ...)
# S3 method for pseries
F(x, n = 1, fill = NA, stubs = TRUE, ...)

# S3 method for pdata.frame
flag(x, n = 1, fill = NA, stubs = length(n) > 1L, ...)
# S3 method for pdata.frame
L(x, n = 1, cols = is.numeric, fill = NA, stubs = TRUE,
  keep.ids = TRUE, ...)
# S3 method for pdata.frame
F(x, n = 1, cols = is.numeric, fill = NA, stubs = TRUE,
  keep.ids = TRUE, ...)

# Methods for grouped data frame / compatibility with dplyr:

# S3 method for grouped_df
flag(x, n = 1, t = NULL, fill = NA, stubs = length(n) > 1L, keep.ids = TRUE, ...)
# S3 method for grouped_df
L(x, n = 1, t = NULL, fill = NA, stubs = TRUE, keep.ids = TRUE, ...)
# S3 method for grouped_df
F(x, n = 1, t = NULL, fill = NA, stubs = TRUE, keep.ids = TRUE, ...)

## Arguments

x
a vector / time series, (time series) matrix, data frame, panel series (plm::pseries), panel data frame (plm::pdata.frame) or grouped data frame (class 'grouped_df'). Data need not be numeric, i.e. you can also lag a date variable, character data etc.

n
integer. A vector indicating the lags / leads to compute (passing negative integers to flag or L computes leads, passing negative integers to F computes lags).

g
a factor, GRP object, atomic vector (internally converted to factor) or a list of vectors / factors (internally converted to a GRP object) used to group x.

by
data.frame method: Same as g, but also allows one- or two-sided formulas, i.e. ~ group1 or var1 + var2 ~ group1 + group2. See Examples.

t
same input as g/by, to indicate the time-variable(s). For safe computation of lags / leads on unordered time series and panels. The data.frame method also allows a one-sided formula, i.e. ~time. The grouped_df method supports lazy evaluation, i.e. time (no quotes).

cols
data.frame method: Select columns to lag / lead using a function, column names, indices or a logical vector. Default: All numeric variables. Note: cols is ignored if a two-sided formula is passed to by.

fill
value to insert when vectors are shifted. Default is NA.

stubs
logical. TRUE will rename all lagged / leaded columns by adding a stub or prefix "Ln." / "Fn.".

keep.ids
data.frame / pdata.frame / grouped_df methods: Logical. If FALSE, drop all panel-identifiers from the output (which includes all variables passed to by or t). Note: For grouped / panel data frames identifiers are dropped, but the 'groups' / 'index' attributes are kept.

...
arguments to be passed to or from other methods.

## Details

If a single integer is passed to n, and g/by and t are left empty, flag/L/F just returns x with all columns lagged / leaded by n. If length(n)>1, and x is an atomic vector (time series), flag/L/F returns a (time series) matrix with lags / leads computed in the same order as passed to n. If instead x is a matrix / data frame, a matrix / data frame with ncol(x)*length(n) columns is returned where columns are sorted first by variable and then by lag (so all lags computed on a variable are grouped together). x can be of any standard data type.
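As a rough illustration of the ungrouped case described above, the following base R sketch shifts a vector and arranges several lags / leads as matrix columns in the order given by n. The helpers lag_sketch and lags_sketch are hypothetical names introduced here; they are not part of collapse and do not reflect its compiled implementation. The column labels merely mimic the "Ln" / "Fn" / "--" labels visible in the Examples.

```r
# Hypothetical helper (not collapse's implementation): shift x by n positions.
# Positive n gives a lag, negative n a lead; vacated positions receive `fill`.
lag_sketch <- function(x, n = 1L, fill = NA) {
  len <- length(x)
  src <- seq_len(len) - n            # index each output element is copied from
  valid <- src >= 1L & src <= len    # sources that fall inside the vector
  out <- x
  out[] <- fill                      # all-fill vector of the same length
  out[valid] <- x[src[valid]]
  out
}

# Several lags / leads as matrix columns, in the same order as n
lags_sketch <- function(x, n) {
  res <- do.call(cbind, lapply(n, function(k) lag_sketch(x, k)))
  colnames(res) <- ifelse(n > 0, paste0("L", n),
                   ifelse(n < 0, paste0("F", -n), "--"))
  res
}

x <- c(112, 118, 132, 129)
lag_sketch(x, 1)       # NA 112 118 132
lag_sketch(x, -1)      # 118 132 129 NA
lags_sketch(x, -1:1)   # columns F1, --, L1
```

Because a lead is just a lag with the sign of n flipped, lag_sketch(x, -1) plays the role of F(x, 1) in this sketch.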

With groups/panel-identifiers supplied to g/by, flag/L/F efficiently computes a panel-lag/lead by shifting the entire vector(s) but inserting fill elements in the right places. If t is left empty, the data needs to be ordered such that all values belonging to a group are consecutive and in the right order. It is not necessary that the groups themselves occur in the right order. If a time-variable is supplied to t (or a list of time-variables uniquely identifying the time-dimension), the panel is fully identified and lags / leads can be securely computed even if the data is unordered.

It is also possible to lag unordered or irregular time series utilizing only the t argument to identify the temporal dimension of the data.

Since v1.5.0 flag/L/F provide full built-in support for irregular time series and unbalanced panels. The suggested workaround using the seqid function is therefore no longer necessary.

Computationally, if both g/by and t are supplied, flag/L/F uses two initial passes to create an ordering through which the data are accessed. First pass: calculate the minimum and maximum time-value for each individual. Second pass: generate the ordering by placing the current element index into the vector slot obtained by adding the cumulative group size to the current time-value minus its individual minimum. This method of computation is faster than any sort-based method and delivers optimal performance if the panel-id supplied to g/by is already a factor variable, and if t is either an integer or factor variable. If g/by is not a factor, or t is not a factor or integer, qG or GRP will be called to group the respective identifier, which can be expensive, so for optimal performance prepare the data (or use plm classes).
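The two passes described above can be sketched in base R as follows. This is only a minimal illustration under stated assumptions, not the package's actual code: the function name panel_lag_sketch is hypothetical, the "cumulative group size" is taken here to be the cumulative time span (max - min + 1) of the preceding groups so that gaps in irregular panels get empty slots, and each (group, time) pair is assumed to occur at most once.

```r
# Hypothetical sketch of a sort-free panel lag (not collapse's implementation).
# Assumes integer-valued t and unique (g, t) combinations.
panel_lag_sketch <- function(x, n = 1L, g, t, fill = NA) {
  g <- as.integer(factor(g))                 # group ids 1..ng
  t <- as.integer(t)
  ng <- max(g)

  # First pass: minimum and maximum time-value of each individual
  tmin <- rep(.Machine$integer.max, ng)
  tmax <- rep(-.Machine$integer.max, ng)
  for (i in seq_along(x)) {
    if (t[i] < tmin[g[i]]) tmin[g[i]] <- t[i]
    if (t[i] > tmax[g[i]]) tmax[g[i]] <- t[i]
  }
  span  <- tmax - tmin + 1L                  # number of time slots per group
  start <- cumsum(c(0L, span[-ng]))          # slots occupied by preceding groups

  # Second pass: place each element index into its (group, time) slot
  slot <- start[g] + t - tmin[g] + 1L
  ord  <- integer(sum(span))                 # 0 marks a gap (time not observed)
  ord[slot] <- seq_along(x)

  # Access the data through the ordering: look n slots back within the group
  src <- slot - n
  ok  <- src > start[g] & src <= start[g] + span[g]
  has_src <- rep(FALSE, length(x))
  has_src[ok] <- ord[src[ok]] > 0L           # the source slot must be observed
  out <- x
  out[] <- fill
  out[has_src] <- x[ord[src[has_src]]]
  out
}

# Unordered panel; group "b" has no observation at time 3
x  <- c(10, 20, 30, 1, 2, 3)
id <- c("a", "a", "a", "b", "b", "b")
tm <- c(3, 1, 2, 2, 1, 4)
panel_lag_sketch(x, 1, id, tm)   # 30 NA 20 2 NA NA
```

No sorting is involved: both passes are linear in the number of observations, which is the property the paragraph above attributes to this approach.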

The methods applying to plm objects (panel series and panel data frames) automatically utilize the factor panel-identifiers attached to these objects and thus securely and efficiently compute fully identified panel-lags. If these objects have > 2 panel-identifiers attached to them, the last identifier is assumed to be the time-variable, and the others are taken as grouping-variables and interacted. Note that flag/L/F is significantly faster than plm::lag/plm::lead since the latter is written in R and based on a Split-Apply-Combine logic.

## Value

x lagged / leaded n-times, grouped by g/by, ordered by t. See Details and Examples.

## See also

fdiff, fgrowth, Time Series and Panel Series, Collapse Overview

## Examples

## Simple Time Series: AirPassengers
L(AirPassengers)                      # 1 lag
#>      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
#> 1949  NA 112 118 132 129 121 135 148 148 136 119 104
#> 1950 118 115 126 141 135 125 149 170 170 158 133 114
#> 1951 140 145 150 178 163 172 178 199 199 184 162 146
#> 1952 166 171 180 193 181 183 218 230 242 209 191 172
#> 1953 194 196 196 236 235 229 243 264 272 237 211 180
#> 1954 201 204 188 235 227 234 264 302 293 259 229 203
#> 1955 229 242 233 267 269 270 315 364 347 312 274 237
#> 1956 278 284 277 317 313 318 374 413 405 355 306 271
#> 1957 306 315 301 356 348 355 422 465 467 404 347 305
#> 1958 336 340 318 362 348 363 435 491 505 404 359 310
#> 1959 337 360 342 406 396 420 472 548 559 463 407 362
#> 1960 405 417 391 419 461 472 535 622 606 508 461 390
F(AirPassengers)                      # 1 lead
#>      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
#> 1949 118 132 129 121 135 148 148 136 119 104 118 115
#> 1950 126 141 135 125 149 170 170 158 133 114 140 145
#> 1951 150 178 163 172 178 199 199 184 162 146 166 171
#> 1952 180 193 181 183 218 230 242 209 191 172 194 196
#> 1953 196 236 235 229 243 264 272 237 211 180 201 204
#> 1954 188 235 227 234 264 302 293 259 229 203 229 242
#> 1955 233 267 269 270 315 364 347 312 274 237 278 284
#> 1956 277 317 313 318 374 413 405 355 306 271 306 315
#> 1957 301 356 348 355 422 465 467 404 347 305 336 340
#> 1958 318 362 348 363 435 491 505 404 359 310 337 360
#> 1959 342 406 396 420 472 548 559 463 407 362 405 417
#> 1960 391 419 461 472 535 622 606 508 461 390 432  NA
all_identical(L(AirPassengers),       # 3 identical ways of computing 1 lag
              flag(AirPassengers),
              F(AirPassengers, -1))
#> [1] TRUE
head(L(AirPassengers, -1:3))          # 1 lead and 3 lags - output as matrix
#>       F1  --  L1  L2  L3
#> [1,] 118 112  NA  NA  NA
#> [2,] 132 118 112  NA  NA
#> [3,] 129 132 118 112  NA
#> [4,] 121 129 132 118 112
#> [5,] 135 121 129 132 118
#> [6,] 148 135 121 129 132
## Time Series Matrix of 4 EU Stock Market Indicators, 1991-1998
tsp(EuStockMarkets)                                     # Data is recorded on 260 days per year
#> [1] 1991.496 1998.646  260.000
freq <- frequency(EuStockMarkets)
plot(stl(EuStockMarkets[,"DAX"], freq))                 # There is some obvious seasonality
head(L(EuStockMarkets, -1:3 * freq))                    # 1 annual lead and 3 annual lags
#>      F260.DAX     DAX L260.DAX L520.DAX L780.DAX F260.SMI    SMI L260.SMI
#> [1,]  1755.98 1628.75       NA       NA       NA   1846.6 1678.1       NA
#> [2,]  1754.95 1613.63       NA       NA       NA   1854.8 1688.5       NA
#> [3,]  1759.90 1606.51       NA       NA       NA   1845.3 1678.6       NA
#> [4,]  1759.84 1621.04       NA       NA       NA   1854.5 1684.1       NA
#> [5,]  1776.50 1618.16       NA       NA       NA   1870.5 1686.6       NA
#> [6,]  1769.98 1610.61       NA       NA       NA   1862.6 1671.6       NA
#>      L520.SMI L780.SMI F260.CAC    CAC L260.CAC L520.CAC L780.CAC F260.FTSE
#> [1,]       NA       NA   1907.3 1772.8       NA       NA       NA    2515.8
#> [2,]       NA       NA   1900.6 1750.5       NA       NA       NA    2521.2
#> [3,]       NA       NA   1880.9 1718.0       NA       NA       NA    2493.9
#> [4,]       NA       NA   1873.5 1708.1       NA       NA       NA    2476.1
#> [5,]       NA       NA   1883.6 1723.1       NA       NA       NA    2497.1
#> [6,]       NA       NA   1868.5 1714.3       NA       NA       NA    2469.0
#>        FTSE L260.FTSE L520.FTSE L780.FTSE
#> [1,] 2443.6        NA        NA        NA
#> [2,] 2460.2        NA        NA        NA
#> [3,] 2448.2        NA        NA        NA
#> [4,] 2470.4        NA        NA        NA
#> [5,] 2484.7        NA        NA        NA
#> [6,] 2466.8        NA        NA        NA
summary(lm(DAX ~., data = L(EuStockMarkets,-1:3*freq))) # DAX regressed on its own annual lead,
#>
#> Call:
#> lm(formula = DAX ~ ., data = L(EuStockMarkets, -1:3 * freq))
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -158.092  -30.174    1.355   28.741  211.844
#>
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -1.030e+03  1.016e+02 -10.141  < 2e-16 ***
#> F260.DAX     1.037e-01  2.621e-02   3.957 8.25e-05 ***
#> L260.DAX    -3.544e-01  4.394e-02  -8.066 2.65e-15 ***
#> L520.DAX    -2.232e-01  4.116e-02  -5.423 7.75e-08 ***
#> L780.DAX     9.451e-02  4.484e-02   2.107 0.035391 *
#> F260.SMI     4.968e-02  1.554e-02   3.198 0.001441 **
#> SMI          2.616e-01  2.301e-02  11.366  < 2e-16 ***
#> L260.SMI     6.138e-02  2.740e-02   2.240 0.025342 *
#> L520.SMI    -2.153e-01  2.707e-02  -7.954 6.15e-15 ***
#> L780.SMI    -2.208e-01  3.091e-02  -7.144 2.04e-12 ***
#> F260.CAC    -1.392e-01  3.583e-02  -3.884 0.000111 ***
#> CAC          7.165e-01  3.189e-02  22.470  < 2e-16 ***
#> L260.CAC    -5.482e-02  3.874e-02  -1.415 0.157455
#> L520.CAC     2.326e-01  4.570e-02   5.090 4.46e-07 ***
#> L780.CAC    -7.636e-02  3.966e-02  -1.925 0.054583 .
#> F260.FTSE   -9.505e-02  2.174e-02  -4.372 1.39e-05 ***
#> FTSE         3.158e-01  3.103e-02  10.176  < 2e-16 ***
#> L260.FTSE    1.745e-01  3.104e-02   5.621 2.63e-08 ***
#> L520.FTSE    2.809e-01  3.381e-02   8.308 4.14e-16 ***
#> L780.FTSE    1.535e-01  3.014e-02   5.092 4.42e-07 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 49.64 on 800 degrees of freedom
#>   (1040 observations deleted due to missingness)
#> Multiple R-squared:  0.9926,	Adjusted R-squared:  0.9925
#> F-statistic:  5668 on 19 and 800 DF,  p-value: < 2.2e-16
#>
                                                        # lags and the lead/lags of the other series

## World Development Panel Data
head(flag(wlddev, 1, wlddev$iso3c, wlddev$year))        # This lags all variables,
#>       country iso3c       date year decade     region     income  OECD PCGDP
#> 1        <NA>  <NA>       <NA>   NA     NA       <NA>       <NA>    NA    NA
#> 2 Afghanistan   AFG 1961-01-01 1960   1960 South Asia Low income FALSE    NA
#> 3 Afghanistan   AFG 1962-01-01 1961   1960 South Asia Low income FALSE    NA
#> 4 Afghanistan   AFG 1963-01-01 1962   1960 South Asia Low income FALSE    NA
#> 5 Afghanistan   AFG 1964-01-01 1963   1960 South Asia Low income FALSE    NA
#> 6 Afghanistan   AFG 1965-01-01 1964   1960 South Asia Low income FALSE    NA
#>   LIFEEX GINI       ODA
#> 1     NA   NA        NA
#> 2 32.292   NA 114440000
#> 3 32.742   NA 233350000
#> 4 33.185   NA 114880000
#> 5 33.624   NA 236450000
#> 6 34.060   NA 302480000
head(L(wlddev, 1, ~iso3c, ~year))                       # This lags all numeric variables
#>   iso3c year L1.decade L1.PCGDP L1.LIFEEX L1.GINI    L1.ODA
#> 1   AFG 1960        NA       NA        NA      NA        NA
#> 2   AFG 1961      1960       NA    32.292      NA 114440000
#> 3   AFG 1962      1960       NA    32.742      NA 233350000
#> 4   AFG 1963      1960       NA    33.185      NA 114880000
#> 5   AFG 1964      1960       NA    33.624      NA 236450000
#> 6   AFG 1965      1960       NA    34.060      NA 302480000
head(L(wlddev, 1, ~iso3c))                              # Without t: Works because data is ordered
#> Panel-lag computed without timevar: Assuming ordered data
#>   iso3c L1.year L1.decade L1.PCGDP L1.LIFEEX L1.GINI    L1.ODA
#> 1   AFG      NA        NA       NA        NA      NA        NA
#> 2   AFG    1960      1960       NA    32.292      NA 114440000
#> 3   AFG    1961      1960       NA    32.742      NA 233350000
#> 4   AFG    1962      1960       NA    33.185      NA 114880000
#> 5   AFG    1963      1960       NA    33.624      NA 236450000
#> 6   AFG    1964      1960       NA    34.060      NA 302480000
head(L(wlddev, 1, PCGDP + LIFEEX ~ iso3c, ~year))       # This lags GDP per Capita & Life Expectancy
#>   iso3c year L1.PCGDP L1.LIFEEX
#> 1   AFG 1960       NA        NA
#> 2   AFG 1961       NA    32.292
#> 3   AFG 1962       NA    32.742
#> 4   AFG 1963       NA    33.185
#> 5   AFG 1964       NA    33.624
#> 6   AFG 1965       NA    34.060
head(L(wlddev, 0:2, ~ iso3c, ~year, cols = 9:10))       # Same, also retaining original series
#>   iso3c year PCGDP L1.PCGDP L2.PCGDP LIFEEX L1.LIFEEX L2.LIFEEX
#> 1   AFG 1960    NA       NA       NA 32.292        NA        NA
#> 2   AFG 1961    NA       NA       NA 32.742    32.292        NA
#> 3   AFG 1962    NA       NA       NA 33.185    32.742    32.292
#> 4   AFG 1963    NA       NA       NA 33.624    33.185    32.742
#> 5   AFG 1964    NA       NA       NA 34.060    33.624    33.185
#> 6   AFG 1965    NA       NA       NA 34.495    34.060    33.624
head(L(wlddev, 1:2, PCGDP + LIFEEX ~ iso3c, ~year,      # Two lags, dropping id columns
       keep.ids = FALSE))
#>   L1.PCGDP L2.PCGDP L1.LIFEEX L2.LIFEEX
#> 1       NA       NA        NA        NA
#> 2       NA       NA    32.292        NA
#> 3       NA       NA    32.742    32.292
#> 4       NA       NA    33.185    32.742
#> 5       NA       NA    33.624    33.185
#> 6       NA       NA    34.060    33.624
# Different ways of regressing GDP on its lags and life expectancy and its lags
summary(lm(PCGDP ~ ., L(wlddev, 0:2, ~iso3c, ~year, 9:10, keep.ids = FALSE)))     # 1 - Precomputing
#>
#> Call:
#> lm(formula = PCGDP ~ ., data = L(wlddev, 0:2, ~iso3c, ~year,
#>     9:10, keep.ids = FALSE))
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -16621.0   -100.0    -17.2     86.2  11935.3
#>
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -321.51378   63.37246  -5.073    4e-07 ***
#> L1.PCGDP       1.31801    0.01061 124.173   <2e-16 ***
#> L2.PCGDP      -0.31550    0.01070 -29.483   <2e-16 ***
#> LIFEEX        -1.93638   38.24878  -0.051    0.960
#> L1.LIFEEX     10.01163   71.20359   0.141    0.888
#> L2.LIFEEX     -1.66669   37.70885  -0.044    0.965
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 791.3 on 7988 degrees of freedom
#>   (4750 observations deleted due to missingness)
#> Multiple R-squared:  0.9974,	Adjusted R-squared:  0.9974
#> F-statistic: 6.166e+05 on 5 and 7988 DF,  p-value: < 2.2e-16
#>
summary(lm(PCGDP ~ L(PCGDP,1:2,iso3c,year) + L(LIFEEX,0:2,iso3c,year), wlddev))   # 2 - Ad-hoc
#>
#> Call:
#> lm(formula = PCGDP ~ L(PCGDP, 1:2, iso3c, year) + L(LIFEEX, 0:2,
#>     iso3c, year), data = wlddev)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -16621.0   -100.0    -17.2     86.2  11935.3
#>
#> Coefficients:
#>                                 Estimate Std. Error t value Pr(>|t|)
#> (Intercept)                   -321.51378   63.37246  -5.073    4e-07 ***
#> L(PCGDP, 1:2, iso3c, year)L1     1.31801    0.01061 124.173   <2e-16 ***
#> L(PCGDP, 1:2, iso3c, year)L2    -0.31550    0.01070 -29.483   <2e-16 ***
#> L(LIFEEX, 0:2, iso3c, year)--   -1.93638   38.24878  -0.051    0.960
#> L(LIFEEX, 0:2, iso3c, year)L1   10.01163   71.20359   0.141    0.888
#> L(LIFEEX, 0:2, iso3c, year)L2   -1.66669   37.70885  -0.044    0.965
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 791.3 on 7988 degrees of freedom
#>   (4750 observations deleted due to missingness)
#> Multiple R-squared:  0.9974,	Adjusted R-squared:  0.9974
#> F-statistic: 6.166e+05 on 5 and 7988 DF,  p-value: < 2.2e-16
#>
summary(lm(PCGDP ~ L(PCGDP,1:2,iso3c) + L(LIFEEX,0:2,iso3c), wlddev))             # 3 - same no year
#> Panel-lag computed without timevar: Assuming ordered data
#> Panel-lag computed without timevar: Assuming ordered data
#>
#> Call:
#> lm(formula = PCGDP ~ L(PCGDP, 1:2, iso3c) + L(LIFEEX, 0:2, iso3c),
#>     data = wlddev)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -16621.0   -100.0    -17.2     86.2  11935.3
#>
#> Coefficients:
#>                           Estimate Std. Error t value Pr(>|t|)
#> (Intercept)             -321.51378   63.37246  -5.073    4e-07 ***
#> L(PCGDP, 1:2, iso3c)L1     1.31801    0.01061 124.173   <2e-16 ***
#> L(PCGDP, 1:2, iso3c)L2    -0.31550    0.01070 -29.483   <2e-16 ***
#> L(LIFEEX, 0:2, iso3c)--   -1.93638   38.24878  -0.051    0.960
#> L(LIFEEX, 0:2, iso3c)L1   10.01163   71.20359   0.141    0.888
#> L(LIFEEX, 0:2, iso3c)L2   -1.66669   37.70885  -0.044    0.965
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 791.3 on 7988 degrees of freedom
#>   (4750 observations deleted due to missingness)
#> Multiple R-squared:  0.9974,	Adjusted R-squared:  0.9974
#> F-statistic: 6.166e+05 on 5 and 7988 DF,  p-value: < 2.2e-16
#>
g = qF(wlddev$iso3c); t = qF(wlddev$year)                                         # 4- Precomputing
summary(lm(PCGDP ~ L(PCGDP,1:2,g,t) + L(LIFEEX,0:2,g,t), wlddev))                 # panel-id's
#>
#> Call:
#> lm(formula = PCGDP ~ L(PCGDP, 1:2, g, t) + L(LIFEEX, 0:2, g,
#>     t), data = wlddev)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -16621.0   -100.0    -17.2     86.2  11935.3
#>
#> Coefficients:
#>                          Estimate Std. Error t value Pr(>|t|)
#> (Intercept)            -321.51378   63.37246  -5.073    4e-07 ***
#> L(PCGDP, 1:2, g, t)L1     1.31801    0.01061 124.173   <2e-16 ***
#> L(PCGDP, 1:2, g, t)L2    -0.31550    0.01070 -29.483   <2e-16 ***
#> L(LIFEEX, 0:2, g, t)--   -1.93638   38.24878  -0.051    0.960
#> L(LIFEEX, 0:2, g, t)L1   10.01163   71.20359   0.141    0.888
#> L(LIFEEX, 0:2, g, t)L2   -1.66669   37.70885  -0.044    0.965
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 791.3 on 7988 degrees of freedom
#>   (4750 observations deleted due to missingness)
#> Multiple R-squared:  0.9974,	Adjusted R-squared:  0.9974
#> F-statistic: 6.166e+05 on 5 and 7988 DF,  p-value: < 2.2e-16
#>
## Using plm:
pwlddev <- plm::pdata.frame(wlddev, index = c("iso3c","year"))
head(L(pwlddev, 0:2, 9:10))                                     # Again 2 lags of GDP and LIFEEX
#>          iso3c year PCGDP L1.PCGDP L2.PCGDP LIFEEX L1.LIFEEX L2.LIFEEX
#> ABW-1960   ABW 1960    NA       NA       NA 65.662        NA        NA
#> ABW-1961   ABW 1961    NA       NA       NA 66.074    65.662        NA
#> ABW-1962   ABW 1962    NA       NA       NA 66.444    66.074    65.662
#> ABW-1963   ABW 1963    NA       NA       NA 66.787    66.444    66.074
#> ABW-1964   ABW 1964    NA       NA       NA 67.113    66.787    66.444
#> ABW-1965   ABW 1965    NA       NA       NA 67.435    67.113    66.787
PCGDP <- pwlddev$PCGDP                                          # A panel-Series of GDP per Capita
head(L(PCGDP))                                                  # Lagging the panel series
#> ABW-1960 ABW-1961 ABW-1962 ABW-1963 ABW-1964 ABW-1965
#>       NA       NA       NA       NA       NA       NA
summary(lm(PCGDP ~ ., L(pwlddev, 0:2, 9:10, keep.ids = FALSE))) # Running the lm again
#>
#> Call:
#> lm(formula = PCGDP ~ ., data = L(pwlddev, 0:2, 9:10, keep.ids = FALSE))
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -16621.0   -100.0    -17.2     86.2  11935.3
#>
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -321.51378   63.37246  -5.073    4e-07 ***
#> L1.PCGDP       1.31801    0.01061 124.173   <2e-16 ***
#> L2.PCGDP      -0.31550    0.01070 -29.483   <2e-16 ***
#> LIFEEX        -1.93638   38.24878  -0.051    0.960
#> L1.LIFEEX     10.01163   71.20359   0.141    0.888
#> L2.LIFEEX     -1.66669   37.70885  -0.044    0.965
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 791.3 on 7988 degrees of freedom
#>   (4750 observations deleted due to missingness)
#> Multiple R-squared:  0.9974,	Adjusted R-squared:  0.9974
#> F-statistic: 6.166e+05 on 5 and 7988 DF,  p-value: < 2.2e-16
#>
# THIS DOES NOT WORK: -> a pseries is only created when subsetting the pdata.frame using $ or [[
summary(lm(PCGDP ~ L(PCGDP,1:2) + L(LIFEEX,0:2), pwlddev))      # ..so L.default is used here..
#>
#> Call:
#> lm(formula = PCGDP ~ L(PCGDP, 1:2) + L(LIFEEX, 0:2), data = pwlddev)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -16621.0   -100.0    -17.2     86.2  11935.3
#>
#> Coefficients:
#>                    Estimate Std. Error t value Pr(>|t|)
#> (Intercept)      -321.51378   63.37246  -5.073    4e-07 ***
#> L(PCGDP, 1:2)L1     1.31801    0.01061 124.173   <2e-16 ***
#> L(PCGDP, 1:2)L2    -0.31550    0.01070 -29.483   <2e-16 ***
#> L(LIFEEX, 0:2)--   -1.93638   38.24878  -0.051    0.960
#> L(LIFEEX, 0:2)L1   10.01163   71.20359   0.141    0.888
#> L(LIFEEX, 0:2)L2   -1.66669   37.70885  -0.044    0.965
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 791.3 on 7988 degrees of freedom
#>   (4750 observations deleted due to missingness)
#> Multiple R-squared:  0.9974,	Adjusted R-squared:  0.9974
#> F-statistic: 6.166e+05 on 5 and 7988 DF,  p-value: < 2.2e-16
#>
LIFEEX <- pwlddev$LIFEEX                                        # To make it work, create pseries
summary(lm(PCGDP ~ L(PCGDP,1:2) + L(LIFEEX,0:2)))               # THIS WORKS !
#>
#> Call:
#> lm(formula = PCGDP ~ L(PCGDP, 1:2) + L(LIFEEX, 0:2))
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -16621.0   -100.0    -17.2     86.2  11935.3
#>
#> Coefficients:
#>                    Estimate Std. Error t value Pr(>|t|)
#> (Intercept)      -321.51378   63.37246  -5.073    4e-07 ***
#> L(PCGDP, 1:2)L1     1.31801    0.01061 124.173   <2e-16 ***
#> L(PCGDP, 1:2)L2    -0.31550    0.01070 -29.483   <2e-16 ***
#> L(LIFEEX, 0:2)--   -1.93638   38.24878  -0.051    0.960
#> L(LIFEEX, 0:2)L1   10.01163   71.20359   0.141    0.888
#> L(LIFEEX, 0:2)L2   -1.66669   37.70885  -0.044    0.965
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 791.3 on 7988 degrees of freedom
#>   (4750 observations deleted due to missingness)
#> Multiple R-squared:  0.9974,	Adjusted R-squared:  0.9974
#> F-statistic: 6.166e+05 on 5 and 7988 DF,  p-value: < 2.2e-16
#>
## Using dplyr:
library(dplyr)
wlddev %>% group_by(iso3c) %>% select(PCGDP,LIFEEX) %>% L(0:2)
#> Adding missing grouping variables: iso3c
#> Panel-lag computed without timevar: Assuming ordered data
#> # A tibble: 12,744 x 7
#> # Groups:   iso3c [216]
#>    iso3c PCGDP L1.PCGDP L2.PCGDP LIFEEX L1.LIFEEX L2.LIFEEX
#>  * <fct> <dbl>    <dbl>    <dbl>  <dbl>     <dbl>     <dbl>
#>  1 AFG      NA       NA       NA   32.3      NA        NA
#>  2 AFG      NA       NA       NA   32.7      32.3      NA
#>  3 AFG      NA       NA       NA   33.2      32.7      32.3
#>  4 AFG      NA       NA       NA   33.6      33.2      32.7
#>  5 AFG      NA       NA       NA   34.1      33.6      33.2
#>  6 AFG      NA       NA       NA   34.5      34.1      33.6
#>  7 AFG      NA       NA       NA   34.9      34.5      34.1
#>  8 AFG      NA       NA       NA   35.4      34.9      34.5
#>  9 AFG      NA       NA       NA   35.8      35.4      34.9
#> 10 AFG      NA       NA       NA   36.2      35.8      35.4
#> # ... with 12,734 more rows
wlddev %>% group_by(iso3c) %>% select(year,PCGDP,LIFEEX) %>% L(0:2,year) # Also using t (safer)
#> Adding missing grouping variables: iso3c
#> # A tibble: 12,744 x 8
#> # Groups:   iso3c [216]
#>    iso3c  year PCGDP L1.PCGDP L2.PCGDP LIFEEX L1.LIFEEX L2.LIFEEX
#>  * <fct> <int> <dbl>    <dbl>    <dbl>  <dbl>     <dbl>     <dbl>
#>  1 AFG    1960    NA       NA       NA   32.3      NA        NA
#>  2 AFG    1961    NA       NA       NA   32.7      32.3      NA
#>  3 AFG    1962    NA       NA       NA   33.2      32.7      32.3
#>  4 AFG    1963    NA       NA       NA   33.6      33.2      32.7
#>  5 AFG    1964    NA       NA       NA   34.1      33.6      33.2
#>  6 AFG    1965    NA       NA       NA   34.5      34.1      33.6
#>  7 AFG    1966    NA       NA       NA   34.9      34.5      34.1
#>  8 AFG    1967    NA       NA       NA   35.4      34.9      34.5
#>  9 AFG    1968    NA       NA       NA   35.8      35.4      34.9
#> 10 AFG    1969    NA       NA       NA   36.2      35.8      35.4
#> # ... with 12,734 more rows