add_level() can be used to create more complicated patterns of nesting. For example, when creating lower-level data, it is possible to use a different N for each value of the higher-level data:
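A minimal sketch of such a call, assuming the fabricatr package is installed. The level names cities and citizens, the age variable, and the counts 2 and 4 follow the surrounding text; the elevation variable, the numeric ranges, and the seed are illustrative assumptions.

```r
library(fabricatr)

set.seed(1)  # illustrative seed, only for reproducibility

variable_n <- fabricate(
  # Higher level: two cities (elevation is an illustrative variable)
  cities = add_level(N = 2, elevation = runif(n = N, min = 1000, max = 2000)),
  # Lower level: one N per city, so 2 citizens in city 1 and 4 in city 2
  citizens = add_level(N = c(2, 4), age = runif(N, 18, 70))
)

variable_n
```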
Here, each city has a different number of citizens, and the value of N used to create the age variable updates automatically as needed. The result is a dataset with 6 citizens: 2 in the first city and 4 in the second. As long as N is either a single number or a vector with the same length as the current lowest level of the data, add_level() will know what to do.
It is also possible to provide a function to N, enabling a random number of citizens per city:
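One way this might look, again as a sketch rather than a definitive listing. The call to sample() with size = 2 matches the description of a length-2 vector of draws between 1 and 6; the elevation and age variables and the seed are illustrative assumptions.

```r
library(fabricatr)

set.seed(2)  # illustrative seed, only for reproducibility

my_data <- fabricate(
  cities = add_level(N = 2, elevation = runif(n = N, min = 1000, max = 2000)),
  # sample() returns two draws from 1:6, one citizen count per city
  citizens = add_level(
    N = sample(1:6, size = 2, replace = TRUE),
    age = runif(N, 18, 70)
  )
)

my_data
```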
Here, each city is given a random number of citizens between 1 and 6. Since the sample() function returns a vector of length 2, this is like specifying 2 separate Ns as in the example above.
It is also possible to define N on the basis of higher-level variables themselves. Consider the following example:
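A sketch of this idea, assuming a population variable on the city level and a 30% sampling fraction as described in the text. The number of cities, the population range, and the seed are illustrative assumptions, so the row counts will generally differ from any figures quoted in the text.

```r
library(fabricatr)

set.seed(3)  # illustrative seed, only for reproducibility

variable_n <- fabricate(
  # Each city gets a population (range chosen for illustration)
  cities = add_level(N = 5, population = runif(N, 10, 200)),
  # The number of citizens per city is 30% of that city's population
  citizens = add_level(N = round(population * 0.3), age = runif(N, 18, 70))
)

head(variable_n)  # first 6 rows only
```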
Here, the city has a defined population, and the number of citizens in our simulated data reflects a sample of 30% of that population. Although we only display the first 6 rows for brevity’s sake, the first city would have 27 rows in total.
Finally, it is possible to define N on the basis of the N of the higher level:
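A sketch of what such a call might look like. Inside the citizens level, the N passed to sample() is assumed to refer to the number of cities, so one count is drawn per city; the elevation and age variables and the seed are illustrative assumptions.

```r
library(fabricatr)

set.seed(4)  # illustrative seed, only for reproducibility

my_data <- fabricate(
  cities = add_level(N = 2, elevation = runif(n = N, min = 1000, max = 2000)),
  # The inner N is the number of cities (2), giving one draw from 1:10 per city
  citizens = add_level(
    N = sample(1:10, N, replace = TRUE),
    age = runif(N, 18, 70)
  )
)

my_data
```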
Here, each city has a random number of citizens from 1 to 10, but we need to supply N to the sample function to ensure that one draw is made per city.
Because the functions in fabricatr take data and return data, they are cross-compatible with a tidyverse workflow. Here is an example of using magrittr’s pipe operator (%>%) and dplyr’s mutate verbs to add new data.
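A sketch of such a pipeline, assuming both fabricatr and dplyr are installed. The use of group_by() and n() to compute a per-city count, the pop column name, and the specific level sizes are illustrative assumptions rather than details given in the text.

```r
library(fabricatr)
library(dplyr)  # provides %>%, group_by(), mutate(), n()

set.seed(5)  # illustrative seed, only for reproducibility

my_data <- fabricate(
  cities = add_level(N = 2, elevation = runif(n = N, min = 1000, max = 2000)),
  citizens = add_level(N = c(2, 3), age = runif(N, 18, 70))
) %>%
  group_by(cities) %>%
  mutate(pop = n()) %>%  # number of simulated citizens in each city
  ungroup()

my_data
```

Because fabricate() returns an ordinary data frame, any dplyr verb can follow it in the pipe.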
It is also possible to use the pipe operator (%>%) to direct the flow of data between fabricate() calls. Remember that every fabricate() call can import existing data frames, and every call returns a single data frame.
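    library(fabricatr)
    library(dplyr)  # provides the %>% pipe

    # Start from an existing two-row data frame, then nest 3 rows under each row
    my_data <- data.frame(Y = sample(1:10, 2)) %>%
      fabricate(lower_level = add_level(N = 3, Y2 = Y + rnorm(N)))
    my_data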