Discover how you can subset matrices using R.
Just as for vectors, there are situations in which you want to select single elements or entire parts of a matrix to continue your analysis with. Again, you can use square brackets for this, but the fact that you're dealing with two dimensions now complicates things a bit.
Have a look at this matrix containing some random numbers.
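The matrix itself is shown on screen and is not part of this transcript. The sketch below rebuilds it from the values mentioned in the narration. The entries in rows 1 and 2 of column 4 are never named, so the 3 and 9 used here are assumed placeholders.

```r
# Reconstruction of the example matrix; matrix() fills column by column.
# Assumption: the 3 and 9 in column 4 are placeholders, not from the transcript.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)
print(m)
#      [,1] [,2] [,3] [,4]
# [1,]    5   11   15    3
# [2,]   12   14    8    9
# [3,]    6    1    4    2
```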
If you want to select a single element from this matrix, you'll have to specify both the row and the column of the element of interest. Suppose we want to select the number 15, located at the first row and the third column. We type m, open brackets, 1, comma, 3, close brackets.
As you can probably tell, the first index refers to the row, the second one refers to the column. Likewise, to select the number 1, at row 3 and column 2, we write the following line:
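The on-screen code is missing from the transcript, so here is a sketch of the two selections just described. The matrix is rebuilt from the narration, with the 3 and 9 in column 4 as assumed placeholders.

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

first <- m[1, 3]    # row 1, column 3
second <- m[3, 2]   # row 3, column 2
print(first)        # 15
print(second)       # 1
```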
Works like a charm! Notice that the results are single values, so vectors of length 1.
Now, what if you want to select an entire row or column from this matrix? You can do this by leaving out some of the indices between square brackets. Instead of writing 3, comma, 2 inside square brackets to select the element at row 3 and column 2, you can leave out the 2 and keep the 3, comma part. Now, you select all elements that are in row 3, namely 6, 1, 4 and 2.
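As a sketch of that row selection, again on a matrix rebuilt from the narration (the 3 and 9 in column 4 are assumed placeholders):

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

row3 <- m[3, ]            # keep the comma, leave out the column index
print(row3)               # 6 1 4 2
print(is.matrix(row3))    # FALSE: the result is a plain vector
```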
Notice here that the result is not a matrix anymore! It's also a vector, but this time one that contains more than 1 element. You selected a single row from the matrix so a vector suffices to store this one-dimensional information. To select columns, you can work similarly, but this time the index that comes before the comma should be removed. To select the entire 3rd column, you should write m, open brackets, comma, 3, close brackets.
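The column selection described above could be written like this. The matrix is a reconstruction from the narration; the 3 and 9 in column 4 are assumed placeholders and do not affect the third column.

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

col3 <- m[, 3]        # leave out the row index, keep the comma
print(col3)           # 15 8 4
print(length(col3))   # 3
```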
Again, a vector results, this time of length 3, corresponding to the third column of `m`.
Now, what happens when you decide not to include a comma to clearly discern between column and row indices? Let's simply try it out and see if we can explain it. Suppose you simply type m and then 4 inside brackets.
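A sketch of that single-index call, using the matrix rebuilt from the narration (the 3 and 9 in column 4 are assumed placeholders):

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

fourth <- m[4]   # one index, no comma
print(fourth)
```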
The result is 11. How did R get to that? Well, when you pass a single index to subset a matrix, R simply goes through the matrix column by column, from left to right. The first index is then 5, the second one 12, the third one 6 and the fourth one is 11, in the next column. This means that if we pass 9 to m, we should get 4, in the third row and third column.
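To make the column-by-column counting concrete, here is a small sketch. It assumes the matrix rebuilt from the narration (the 3 and 9 in column 4 are placeholders), and the name `idx` is only for illustration.

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

# Position of row 3, column 3 when counting down the columns:
# two full columns of nrow(m) elements, plus 3 more.
idx <- (3 - 1) * nrow(m) + 3
print(idx)                  # 9
print(as.vector(m)[1:4])    # 5 12 6 11
print(m[idx])               # 4, the same element as m[3, 3]
```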
Correct! There aren't a lot of cases in which using a single index without commas in a matrix is useful, but I just wanted to point out that the comma is really crucial here.
In vector subsetting, you also learned how to select multiple elements. In matrices, this is of course also possible and the principles are just the same. Say, for example, you want to select the values 14 and 8, in the middle of the matrix. This command will do that for you:
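The command is not included in the transcript; a likely form, on the matrix rebuilt from the narration (the 3 and 9 in column 4 are assumed placeholders), is:

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

middle <- m[2, c(2, 3)]   # row 2, columns 2 and 3
print(middle)             # 14 8
```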
You select elements that are on the second row and in the second and third columns. Again, the result is a vector, because one dimension suffices. But you can't select elements this way that don't share a row or column index. Suppose you want to select the 11, on row 1 and column 2, and the 8, on row 2 and column 3, and you pass the rows 1 and 2 and the columns 2 and 3 as two index vectors.
This call will not give the wanted result. Instead, a submatrix gets returned that spans the elements on rows 1 and 2 and columns 2 and 3. These submatrices can also be built up from disjoint places in your matrix. Creating a submatrix that contains the elements on rows 1 and 3 and columns 1, 3 and 4, for example, would look like this:
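Both submatrix calls can be sketched as follows. The matrix is rebuilt from the narration, and the 3 and 9 in column 4 are assumed placeholders, so the 3 in the second result is an assumed value too.

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

# Rows 1-2 crossed with columns 2-3: a 2 x 2 submatrix, not just 11 and 8
sub1 <- m[c(1, 2), c(2, 3)]
print(sub1)

# Rows 1 and 3 crossed with columns 1, 3 and 4: a 2 x 3 submatrix
sub2 <- m[c(1, 3), c(1, 3, 4)]
print(sub2)
```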
Now, remember these other ways of performing subsetting, by using names and with logical vectors? These work just as well for matrices. Let's have a look at subsetting by names first. First, though, we'll have to name the matrix:
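A sketch of the naming step. Only the names r2, r3 and c come up in the narration; r1, a, b and d are assumed to complete the pattern, and the 3 and 9 in column 4 of the matrix are assumed placeholders.

```r
# Reconstructed matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

# "r2", "r3" and "c" are mentioned in the narration; the other names are assumed.
rownames(m) <- c("r1", "r2", "r3")
colnames(m) <- c("a", "b", "c", "d")
print(m)
```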
In fact, subsetting by name works exactly the same as by index, but you just replace the indices with the corresponding names. To select 8, you could use the row index 2 and column index 3, or use the row name r2 and column name c:
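Side by side, the two equivalent selections might look like this. The names r1, a, b and d and the 3 and 9 in column 4 are assumptions, since the transcript does not give them.

```r
# Reconstructed matrix; the 3 and 9 in column 4 and the names
# "r1", "a", "b", "d" are assumptions.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c", "d")))

by_index <- m[2, 3]
by_name <- m["r2", "c"]
print(by_index)   # 8
print(by_name)    # 8
```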
You can even use a combination of both:
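A sketch of mixing an index with a name, under the same assumptions as before (the names r1, a, b, d and the 3 and 9 in column 4 are not from the transcript):

```r
# Reconstructed matrix; the 3 and 9 in column 4 and the names
# "r1", "a", "b", "d" are assumptions.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c", "d")))

mixed1 <- m[2, "c"]    # row by index, column by name
mixed2 <- m["r2", 3]   # row by name, column by index
print(mixed1)          # 8
print(mixed2)          # 8
```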
Just remember to surround the row and column names with quotes.
Selecting multiple elements and submatrices from a matrix is straightforward as well. To select elements on row r3 and in the last two columns, you can use:
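That call could be sketched as below, assuming the last two columns are named c and d. The names r1, a, b, d and the 3 and 9 in column 4 are assumptions, not taken from the transcript.

```r
# Reconstructed matrix; the 3 and 9 in column 4 and the names
# "r1", "a", "b", "d" are assumptions.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c", "d")))

last_two <- m["r3", c("c", "d")]
print(last_two)   # 4 and 2, labelled c and d
```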
Finally, you can also use logical vectors. Again, the same rules apply: rows and columns corresponding to a TRUE are kept, while those corresponding to FALSE are left out. To select the same elements as in the previous call, you can use:
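With logical vectors, the same selection might be written as follows. As before, the names r1, a, b, d and the 3 and 9 in column 4 are assumptions.

```r
# Reconstructed matrix; the 3 and 9 in column 4 and the names
# "r1", "a", "b", "d" are assumptions.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c", "d")))

# TRUE keeps a row or column, FALSE drops it
by_logical <- m[c(FALSE, FALSE, TRUE), c(FALSE, FALSE, TRUE, TRUE)]
print(by_logical)                                    # 4 and 2
print(identical(by_logical, m["r3", c("c", "d")]))   # TRUE
```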
The rules of vector recycling also apply here. Suppose that you only pass a vector of length 2 to perform a selection on the columns:
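The narration implies a column vector of FALSE, TRUE here. A sketch under that reading, with the names r1, a, b, d and the 3 and 9 in column 4 as assumptions:

```r
# Reconstructed matrix; the 3 and 9 in column 4 and the names
# "r1", "a", "b", "d" are assumptions.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c", "d")))

# Only two logicals for four columns: R recycles them
short <- m[c(FALSE, FALSE, TRUE), c(FALSE, TRUE)]
print(short)   # 1 and 2, from columns b and d
```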
The column selection vector gets recycled to FALSE, TRUE, FALSE, TRUE:
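Writing the recycled vector out in full, and checking it against the short form, could look like this. The names r1, a, b, d and the 3 and 9 in column 4 are assumptions.

```r
# Reconstructed matrix; the 3 and 9 in column 4 and the names
# "r1", "a", "b", "d" are assumptions.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c", "d")))

full <- m[c(FALSE, FALSE, TRUE), c(FALSE, TRUE, FALSE, TRUE)]
short <- m[c(FALSE, FALSE, TRUE), c(FALSE, TRUE)]
print(full)                      # 1 and 2, from columns b and d
print(identical(full, short))    # TRUE
```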
This gives the same result as the call with the shorter vector.