Data Frames in R
In this tutorial, we will learn about data frames in R. We will cover the basics of creating, accessing, modifying, and performing operations on data frames.
What is a Data Frame
A data frame in R is a table or two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column. Data frames are used for storing data tables.
Creating Data Frames
Data frames can be created in R using the data.frame() function:
df <- data.frame(column1 = c(1, 2, 3), column2 = c('A', 'B', 'C'))
The above code creates a data frame with two columns, where column1 contains numeric values and column2 contains character values.
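To confirm what data.frame() produced, the column types and dimensions can be inspected. The following is a minimal sketch rather than a required step; stringsAsFactors = FALSE is passed explicitly as an assumption, so that column2 stays character on R versions before 4.0 as well, where strings were converted to factors by default. The name col_classes is illustrative only.

```r
# Build the same two-column data frame as above.
# stringsAsFactors = FALSE keeps column2 as character on older R versions too.
df <- data.frame(
  column1 = c(1, 2, 3),
  column2 = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# Class of each column, returned as a named character vector.
col_classes <- sapply(df, class)
print(col_classes)

# Number of rows, then number of columns.
print(dim(df))

# Compact summary of the whole structure.
str(df)
```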
Example 1: Creating a Simple Data Frame
- We start by creating a data frame named df using the data.frame function.
- The data frame has three columns: id and age with numeric values, and name with character values.
- We print the data frame to the console to see its contents.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
print(df)
Output
  id    name age
1  1   Alice  25
2  2     Bob  30
3  3 Charlie  35
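Note that c(1, 2, 3) produces numeric (double) values in R, not integers. If true integer storage is wanted for id, one option is the 1:3 sequence or the L suffix (for example 1L). The sketch below illustrates the difference; the names df_int, id_class, and age_class are hypothetical and not part of the example above.

```r
# Same data as Example 1, but id is built with 1:3, which yields integers.
df_int <- data.frame(
  id = 1:3,
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# Compare the storage class of the two number columns.
id_class <- class(df_int$id)
age_class <- class(df_int$age)
print(id_class)
print(age_class)
```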
Example 2: Accessing Data Frame Elements
- We create a data frame named df with columns id, name, and age.
- We access the name column using the dollar sign ($) operator and print it.
- We access the element in the first row and second column using the [row, column] notation and print it.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
print(df$name)
print(df[1, 2])
Output
[1] "Alice" "Bob" "Charlie"
[1] "Alice"
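Beyond $ and [row, column], base R offers a few other ways to reach into a data frame. The sketch below shows some common ones on the same data; the variable names on the left-hand side are illustrative, and stringsAsFactors = FALSE is an added assumption so the name column is character on any R version.

```r
df <- data.frame(
  id = c(1, 2, 3),
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# Double brackets return the column as a vector, like df$name.
name_col <- df[["name"]]

# Leaving the column index empty selects a whole row (a one-row data frame).
second_row <- df[2, ]

# A character vector of names selects several columns at once.
id_age <- df[, c("id", "age")]

# Columns can be addressed by name instead of position; same as df[1, 2].
first_name <- df[1, "name"]

print(name_col)
print(second_row)
print(id_age)
print(first_name)
```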
Example 3: Modifying Data Frame Elements
- We create a data frame named df with columns id, name, and age.
- We modify the age of the second row by assigning a new value to it.
- We add a new column named gender to the data frame.
- We print the modified data frame to see the changes.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
df$age[2] <- 32
df$gender <- c('F', 'M', 'M')
print(df)
Output
  id    name age gender
1  1   Alice  25      F
2  2     Bob  32      M
3  3 Charlie  35      M
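Other common modifications follow the same assignment pattern. The sketch below, offered as one possible approach, updates a value selected by a condition, removes a column, and appends a row with rbind(). The added row ("Dana", 28) is made-up data for illustration only, and stringsAsFactors = FALSE is an added assumption.

```r
df <- data.frame(
  id = c(1, 2, 3),
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# Update a value by condition instead of by row position.
df$age[df$name == "Alice"] <- 26

# Assigning NULL to a column removes it.
df$id <- NULL

# Append a row; the new data frame must use the same column names.
new_row <- data.frame(name = "Dana", age = 28, stringsAsFactors = FALSE)
df <- rbind(df, new_row)

print(df)
```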
Example 4: Filtering Data Frames
- We create a data frame named df with columns id, name, and age.
- We filter the data frame to include only rows where the age is greater than 30.
- We assign the filtered data frame to a new variable named df_filtered.
- We print the filtered data frame.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
df_filtered <- df[df$age > 30, ]
print(df_filtered)
Output
  id    name age
3  3 Charlie  35
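The same logical-indexing idea extends to combined conditions, and base R's subset() function is an alternative way to write a filter. The sketch below also computes a simple column summary with mean(). The variable names are illustrative, and stringsAsFactors = FALSE is an added assumption.

```r
df <- data.frame(
  id = c(1, 2, 3),
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# Combine conditions with | (or) and & (and).
older_or_alice <- df[df$age > 30 | df$name == "Alice", ]

# subset() takes the condition directly and can also pick columns.
age_30_plus <- subset(df, age >= 30, select = c(name, age))

# Columns are ordinary vectors, so vector functions apply to them.
mean_age <- mean(df$age)

print(older_or_alice)
print(age_30_plus)
print(mean_age)
```

One difference worth knowing: when the condition evaluates to NA for some row, bracket indexing returns a row of NA values for it, whereas subset() drops that row.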