R Split Continuous Variable Into Bins

R has various data types, and the most common one is the vector. Merging and splitting are common operations in any programming language, and today we will see how to split vectors and data frames into groups in R.

split in R

split() is a built-in R function that divides a vector or data frame into the groups defined by a grouping factor. It accepts the vector or data frame and the grouping as arguments and returns the data divided into those groups.

The value returned from split() is a list whose components hold the values of each group: vectors when x is a vector, data frames when x is a data frame. The unsplit() function does the reverse of split().

Syntax

    split(x, f, drop = FALSE, ...)

    split(x, f, drop = FALSE, sep = ".", lex.order = FALSE, ...)

Parameters

x is a vector or data frame to be divided into groups.

f is a 'factor' in the sense that as.factor(f) defines the grouping, or a list of such factors, in which case their interaction is used for the grouping.

drop is a logical argument indicating whether levels that do not occur should be dropped.

sep is a character string used as the separator; it is passed to interaction() when f is a list.

lex.order is a logical argument that is passed to interaction() when f is a list.
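As a minimal sketch of how drop behaves, assume a grouping factor that declares a level with no observations. The names scores and grp are made up for this illustration and do not come from the article.

```r
# Illustrative data: 'scores' and 'grp' are made-up names for this sketch.
scores <- c(10, 20, 30, 40)
grp <- factor(c("a", "b", "a", "b"), levels = c("a", "b", "c"))

# drop = FALSE (the default) keeps the unused level "c" as an empty group.
kept <- split(scores, grp)
names(kept)      # "a" "b" "c"
length(kept$c)   # 0

# drop = TRUE removes groups whose level never occurs.
dropped <- split(scores, grp, drop = TRUE)
names(dropped)   # "a" "b"
```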

Example

Suppose you have a named vector, where the name of each element corresponds to the group to which the element belongs.

You can then split the vector into two vectors whose items belong to the same group by passing the names of the vector, obtained with the names() function, to the argument f.

Let's define a named vector using the c() function.

    rv <- c(x = 3, y = 5, x = 1, x = 4, y = 3)
    rv

Output

    x y x x y
    3 5 1 4 3

To divide into groups, use the split() function. We will divide the data into the x and y groups.

    rv <- c(x = 3, y = 5, x = 1, x = 4, y = 3)
    rv
    data <- split(rv, f = names(rv))
    data

Output

    x y x x y
    3 5 1 4 3
    $x
    x x x
    3 1 4

    $y
    y y
    5 3

You can see that the vector is divided into the groups defined by its names.

You can also pass a character vector, or a factor object directly, to f to indicate the group of each element.

    rv <- c("Mando1", "Mando2", "Mando1", "Mando1", "Mando2")
    rv
    data <- split(rv, f = factor(rv))
    data

Output

    [1] "Mando1" "Mando2" "Mando1" "Mando1" "Mando2"
    $Mando1
    [1] "Mando1" "Mando1" "Mando1"

    $Mando2
    [1] "Mando2" "Mando2"
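The example above splits a vector by its own values. As a small sketch of the more typical case, where the grouping lives in a separate character vector, consider the following; heights and team are made-up names used only for illustration.

```r
# Illustrative data: 'heights' and 'team' are made-up names for this sketch.
heights <- c(170, 182, 165, 178, 190)
team <- c("red", "blue", "red", "red", "blue")

# split() coerces the character vector 'team' with as.factor(),
# so the groups come back in sorted level order: blue, then red.
by_team <- split(heights, f = team)
by_team$blue   # 182 190
by_team$red    # 170 165 178
```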

Split data in Multiple groups in R

To split the data by more than one grouping at once, pass a list as the argument f.

    rv <- c(x = 3, y = 5, x = 1, x = 4, y = 3)
    rv1 <- c("Mando1", "Mando2", "Mando1", "Mando1", "Mando2")
    rv1
    rv2 <- c("DarkTrooper1", "DarkTrooper2", "DarkTrooper2", "DarkTrooper1", "DarkTrooper1")
    rv2
    data <- split(rv, f = list(rv1, rv2))
    data

Output

    [1] "Mando1" "Mando2" "Mando1" "Mando1" "Mando2"
    [1] "DarkTrooper1" "DarkTrooper2" "DarkTrooper2" "DarkTrooper1" "DarkTrooper1"
    $Mando1.DarkTrooper1
    x x
    3 4

    $Mando2.DarkTrooper1
    y
    3

    $Mando1.DarkTrooper2
    x
    1

    $Mando2.DarkTrooper2
    y
    5

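You can see that, by default, the names of the group interactions are joined with a dot. Also by default (drop = FALSE), the output contains every possible combination of groups, even combinations with no observations; in this example all four combinations occur, so no group is empty.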

However, you can customize that with the sep and drop arguments, respectively. See the following code.

    rv <- c(x = 3, y = 5, x = 1, x = 4, y = 3)
    rv1 <- c("Mando1", "Mando2", "Mando1", "Mando1", "Mando2")
    rv2 <- c("DarkTrooper1", "DarkTrooper2", "DarkTrooper2", "DarkTrooper1", "DarkTrooper1")
    data <- split(rv, f = list(rv1, rv2), drop = TRUE, sep = ": ")
    data

Output

    $`Mando1: DarkTrooper1`
    x x
    3 4

    $`Mando2: DarkTrooper1`
    y
    3

    $`Mando1: DarkTrooper2`
    x
    1

    $`Mando2: DarkTrooper2`
    y
    5
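Every combination in the example above has at least one observation, so drop makes no visible difference there. The following sketch uses made-up vectors (vals, g1 and g2 are illustrative names) in which one combination never occurs, and also shows how lex.order changes the order of the groups.

```r
# Illustrative data: 'vals', 'g1' and 'g2' are made-up names for this sketch.
vals <- c(1, 2, 3, 4)
g1 <- c("A", "A", "B", "B")
g2 <- c("u", "v", "u", "u")   # the combination B with v never occurs

# Default: every combination is listed, including the empty B.v group.
all_groups <- split(vals, f = list(g1, g2))
names(all_groups)            # "A.u" "B.u" "A.v" "B.v"

# drop = TRUE removes the empty combination.
non_empty <- split(vals, f = list(g1, g2), drop = TRUE)
names(non_empty)             # "A.u" "B.u" "A.v"

# lex.order = TRUE orders groups by the first factor, then the second.
ordered_groups <- split(vals, f = list(g1, g2), lex.order = TRUE)
names(ordered_groups)        # "A.u" "A.v" "B.u" "B.v"
```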

Splitting the data frame in R

To split a data frame in R, use the split() function. You can split a data set into subsets based on one or more variables that represent groups in the data. R comes with some built-in data sets, one of which we will use in this example.

Let's use the built-in R data set called ToothGrowth.

    data("ToothGrowth")
    head(ToothGrowth)

Output

       len supp dose
    1  4.2   VC  0.5
    2 11.5   VC  0.5
    3  7.3   VC  0.5
    4  5.8   VC  0.5
    5  6.4   VC  0.5
    6 10.0   VC  0.5

The head() function returns the first six rows of the dataset.

You can use the split() function to split the data frame into groups based on the len variable.

    data("ToothGrowth")
    df <- head(ToothGrowth)
    data <- split(df, f = df$len)
    data

Output

    $`4.2`
      len supp dose
    1 4.2   VC  0.5

    $`5.8`
      len supp dose
    4 5.8   VC  0.5

    $`6.4`
      len supp dose
    5 6.4   VC  0.5

    $`7.3`
      len supp dose
    3 7.3   VC  0.5

    $`10`
      len supp dose
    6  10   VC  0.5

    $`11.5`
       len supp dose
    2 11.5   VC  0.5

You can see from the output that the data frame has been divided into six subsets, one for each distinct value of len. Because len is a continuous variable and all six values here are different, every subset contains exactly one row.
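Splitting on the raw values of a continuous column rarely gives useful groups. One common approach, added here as a sketch rather than taken from the walkthrough above, is to bin the column with the base R cut() function first and split on the bins. The break points 0, 5, 10 and 15 are arbitrary assumptions chosen for this example.

```r
data("ToothGrowth")
df <- head(ToothGrowth)

# Bin the continuous 'len' column; the break points are arbitrary choices
# made for this sketch, not values from the article.
len_bin <- cut(df$len, breaks = c(0, 5, 10, 15))

# Split on the bins instead of on the raw values.
binned <- split(df, f = len_bin)
names(binned)            # "(0,5]" "(5,10]" "(10,15]"
sapply(binned, nrow)     # number of rows that fall in each bin
```

Because cut() uses right-closed intervals by default, the row with len equal to 10.0 falls in the (5,10] bin.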

If you want to divide a data frame based on more than one column or group, pass a list as the value of f. For example, see the following code snippet.

    split(df, f = list(df$len, df$dose))
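As a rough sketch of what this call produces, the group names can be inspected with names(). The choice of "_" as a separator below is an assumption made for readability, since the default dot is hard to tell apart from the decimal points in the values.

```r
data("ToothGrowth")
df <- head(ToothGrowth)

# With the default sep = ".", group names such as "4.2.0.5" are hard to read
# because the values themselves contain dots.
by_dot <- split(df, f = list(df$len, df$dose), drop = TRUE)
names(by_dot)

# A different separator (an arbitrary choice here) keeps the two parts distinct.
by_underscore <- split(df, f = list(df$len, df$dose), drop = TRUE, sep = "_")
names(by_underscore)
```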

To recover the original data frame from the output of split(), use the unsplit() function. Its syntax is unsplit(value, f, drop = FALSE), where value is the list returned by split() and f is the same grouping that was used to split.
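As a rough sketch of the round trip, assuming the same grouping that was used for splitting is passed back to unsplit(), the full ToothGrowth data set can be split by its supp column and then reassembled. The variable names pieces and restored are illustrative.

```r
data("ToothGrowth")

# Split the full data set by supplement type, then put it back together.
pieces <- split(ToothGrowth, f = ToothGrowth$supp)
restored <- unsplit(pieces, f = ToothGrowth$supp)

# The rows return to their original positions, so the columns match again.
all(restored$len == ToothGrowth$len)
all(restored$supp == ToothGrowth$supp)
all(restored$dose == ToothGrowth$dose)
```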

Conclusion

To split a vector or data frame in R, use the split() function. To recover the original vector or data frame from the split result, use the unsplit() function.
