Data Frames in R
In this tutorial, we will learn about data frames in R. We will cover the basics of creating, accessing, modifying, and performing operations on data frames.
What is a Data Frame
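A data frame in R is a two-dimensional, table-like structure in which each column holds the values of one variable and each row holds one observation, that is, one value from each column. Columns can have different types (numeric, character, logical, and so on), but all columns must have the same length. Data frames are the standard way to store tabular data in R.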
Creating Data Frames
Data frames can be created in R using the data.frame() function:
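df <- data.frame(column1 = c(1, 2, 3), column2 = c('A', 'B', 'C'))
The above code creates a data frame with two columns, where column1 contains numeric values and column2 contains character values. This holds in R 4.0.0 and later; earlier versions convert strings to factors by default unless stringsAsFactors = FALSE is passed.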
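A quick way to confirm what data.frame() produced is to inspect the column types. The sketch below is illustrative only: it passes stringsAsFactors = FALSE explicitly so the result should not depend on the R version, and the variable name col_types is just a placeholder.

```r
# stringsAsFactors = FALSE keeps strings as character on every R version
df <- data.frame(column1 = c(1, 2, 3),
                 column2 = c('A', 'B', 'C'),
                 stringsAsFactors = FALSE)

# str() prints a compact summary: dimensions plus each column's type
str(df)

# sapply() applies class() to every column and returns a named vector
col_types <- sapply(df, class)
print(col_types)
```

Here col_types should report numeric for column1 and character for column2.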
Example 1: Creating a Simple Data Frame
- We start by creating a data frame named df using the data.frame function.
- The data frame has three columns: id with numeric values, name with character values, and age with numeric values.
- We print the data frame to the console to see its structure.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
print(df)
Output
  id    name age
1  1   Alice  25
2  2     Bob  30
3  3 Charlie  35
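Note that c(1, 2, 3) yields double-precision numeric values, not integers. If integer storage is wanted for a column such as id, one option is the L suffix. The following sketch, which uses the hypothetical name df_int, also shows a few base R functions for checking the shape of a data frame.

```r
# The L suffix creates integer literals; 1:3 would work as well
df_int <- data.frame(id = c(1L, 2L, 3L),
                     name = c('Alice', 'Bob', 'Charlie'),
                     age = c(25, 30, 35))

print(class(df_int$id))   # storage class of the id column
print(nrow(df_int))       # number of rows
print(ncol(df_int))       # number of columns
print(names(df_int))      # column names
```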
Example 2: Accessing Data Frame Elements
- We create a data frame named df with columns id, name, and age.
- We access the name column using the dollar sign $ operator and print it.
- We access the element in the first row and second column using the [row, column] notation and print it.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
print(df$name)
print(df[1, 2])
Output
[1] "Alice"   "Bob"     "Charlie"
[1] "Alice"
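The $ operator and [row, column] indexing have a few close relatives in base R that are worth knowing. The sketch below is a minimal illustration; the result variable names are placeholders, and stringsAsFactors = FALSE is set so the name column is character on any R version.

```r
df <- data.frame(id = c(1, 2, 3),
                 name = c('Alice', 'Bob', 'Charlie'),
                 age = c(25, 30, 35),
                 stringsAsFactors = FALSE)

name_col   <- df[["name"]]          # same vector as df$name
second_row <- df[2, ]               # a one-row data frame
id_age     <- df[, c("id", "age")]  # a data frame with two of the columns
first_name <- df[1, "name"]         # same element as df[1, 2], selected by column name

print(second_row)
print(id_age)
```

Selecting columns by name rather than by position tends to be more robust, because the code keeps working if the column order changes.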
Example 3: Modifying Data Frame Elements
- We create a data frame named df with columns id, name, and age.
- We modify the age of the second row by assigning a new value to it.
- We add a new column named gender to the data frame.
- We print the modified data frame to see the changes.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
df$age[2] <- 32
df$gender <- c('F', 'M', 'M')
print(df)
Output
  id    name age gender
1  1   Alice  25      F
2  2     Bob  32      M
3  3 Charlie  35      M
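Two other common modifications are removing a column and appending a row. The sketch below shows one way to do each in base R; the extra row (id 4, 'Dana', age 28) is made-up illustration data, not part of the original example.

```r
df <- data.frame(id = c(1, 2, 3),
                 name = c('Alice', 'Bob', 'Charlie'),
                 age = c(25, 30, 35),
                 stringsAsFactors = FALSE)
df$gender <- c('F', 'M', 'M')

# Assigning NULL to a column removes it from the data frame
df$gender <- NULL

# rbind() appends rows; the new row must use the same column names
new_row <- data.frame(id = 4, name = 'Dana', age = 28,
                      stringsAsFactors = FALSE)  # hypothetical extra record
df <- rbind(df, new_row)

print(df)
```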
Example 4: Filtering Data Frames
- We create a data frame named df with columns id, name, and age.
- We filter the data frame to include only rows where the age is greater than 30.
- We assign the filtered data frame to a new variable named df_filtered.
- We print the filtered data frame.
R Program
df <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Charlie'), age = c(25, 30, 35))
df_filtered <- df[df$age > 30, ]
print(df_filtered)
Output
  id    name age
3  3 Charlie  35
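The same logical-indexing pattern extends to several conditions combined with & (and) or | (or), and base R's subset() function offers a shorter spelling. The following is a minimal sketch; the variable names df_both, df_subset, and mean_age are placeholders.

```r
df <- data.frame(id = c(1, 2, 3),
                 name = c('Alice', 'Bob', 'Charlie'),
                 age = c(25, 30, 35),
                 stringsAsFactors = FALSE)

# Combine conditions with &; each side is a logical vector with one entry per row
df_both <- df[df$age >= 30 & df$name != 'Bob', ]

# subset() evaluates the condition inside the data frame, so df$ is not needed
df_subset <- subset(df, age >= 30 & name != 'Bob')

# A simple column operation: the mean of all ages
mean_age <- mean(df$age)

print(df_both)
print(mean_age)
```

Both filters should keep only Charlie's row. Note that >= includes the boundary value 30, whereas the > used in the example above excludes it.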
