R commands
Get help
| Command | Description |
|---|---|
| `?<function name>` | show the help page of a function |
| `help(<function name>)` | show the help page of a function |
| `help(package = <package name>)` | show the help index of a package |
Read and write files
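| Command | Description |
|---|---|
| `read.table(file = <file name>)` | read a file; by default the separator is any whitespace and no header line is expected |
| `read.delim(file = <file name>)` | read a file; by default the separator is a tab and the first line is read as a header |
| `write.table(x = <object>, file = <file name>)` | write an object to a file; by default strings are quoted, row names are written and the separator is a space, so set `quote`, `row.names` and `sep` explicitly if needed |
| `vroom(file = <file name>)` | read a file (vroom package); the separator is detected automatically |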
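
As a minimal sketch of why the `write.table()` defaults deserve attention, the round trip below writes the same data frame once with the defaults and once with explicit options. The data frame `df` and the temporary file names are illustrative assumptions.

```r
# Hypothetical example data; any data frame would do.
df <- data.frame(gene = c("A", "B"), count = c(10, 20))

# Defaults: space-separated, strings quoted, row names written.
default_file <- tempfile(fileext = ".txt")
write.table(x = df, file = default_file)
default_lines <- readLines(default_file)

# Explicit options: tab-separated, no quotes, no row names.
tsv_file <- tempfile(fileext = ".tsv")
write.table(x = df, file = tsv_file, sep = "\t", quote = FALSE, row.names = FALSE)

# read.delim() expects a tab separator and a header line,
# so it reads the second file back with the original two columns.
df_back <- read.delim(file = tsv_file)
```

Printing `default_lines` shows the quoted names and the extra row-name column that the default call adds.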
| Command | Description |
|---|---|
| `dim(<object>)` | print the dimensions of a matrix/data.frame |
| `length(<object>)` | print the length of a vector/list |
| `head(<object>)` | print the first six elements of an object (rows for a matrix/data.frame); the output is hard to read if the number of columns is large |
| `tail(<object>)` | print the last six elements of an object (rows for a matrix/data.frame); the output is hard to read if the number of columns is large |
| `class(<object>)` | get the class of an object |
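
A short sketch of these inspection functions on `mtcars`, a data set shipped with base R (choosing it here is an assumption made for illustration). Subsetting the columns before calling `head()` is one way to keep the output readable for wide tables.

```r
dim(mtcars)        # number of rows, then number of columns
length(letters)    # length of a built-in character vector
class(mtcars)      # "data.frame"

first_rows <- head(mtcars)          # first six rows
last_rows <- tail(mtcars, n = 3)    # n changes how many rows are returned
narrow <- head(mtcars[, 1:3])       # first six rows, first three columns only
```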
Working directory
| Command | Description |
|---|---|
| `getwd()` | print the working directory |
| `setwd(<path>)` | set the working directory to `<path>` |
Data manipulation with the `dplyr` package
| Command | Description |
|---|---|
| `filter(df, condition)` | Keep the rows that satisfy a condition |
| `select(df, columns)` | Select specific columns |
| `mutate(df, new_column = expression)` | Add new variables that are functions of existing variables |
| `group_by(df, grouping_columns)` | Group data by one or more columns |
| `summarize(df, new_column = summary_function(column))` | Summarize data by group; `summary_function` stands for any function returning a single value, such as `mean` |
| `arrange(df, columns)` | Sort rows by the values of the given columns |
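
The verbs above can be sketched one step at a time on a small data frame. The data frame `df` and its column names are hypothetical, chosen only to make each call concrete.

```r
library(dplyr)

# Hypothetical data: one row per measurement.
df <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 20)
)

kept <- filter(df, value > 1)                  # drop the row where value is 1
doubled <- mutate(kept, double = value * 2)    # add a derived column
selected <- select(doubled, group, double)     # keep two columns
grouped <- group_by(selected, group)           # one group per level of `group`
summary_df <- summarize(grouped, mean_double = mean(double))
sorted <- arrange(summary_df, desc(mean_double))   # largest mean first
```

Each function takes the data as its first argument and returns a new data frame, which is what makes the calls easy to chain.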
Tidy data with the `tidyr` package
| Command | Description |
|---|---|
| `pivot_longer(df, cols, names_to = "key", values_to = "value")` | Convert data from wide to long format |
| `pivot_wider(df, names_from = key, values_from = value)` | Convert data from long to wide format |
| `separate(df, column, into = c("new1", "new2"), sep = "separator")` | Split a single character column into multiple columns |
| `unite(df, new_column, columns, sep = "separator")` | Combine multiple columns into a single character column |
| `drop_na(df)` | Drop rows that contain a missing value |
| `replace_na(df, list(column = value))` | Replace missing values in the named columns with the specified values |
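
A minimal sketch of these reshaping functions, using the tidyr argument names `cols`, `names_to`, `values_to`, `names_from` and `values_from`. The data frame `wide` and its column names are hypothetical.

```r
library(tidyr)

# Hypothetical wide table: one column per gene, one missing count.
wide <- data.frame(
  sample = c("s1", "s2"),
  gene_a = c(5, NA),
  gene_b = c(7, 9)
)

# Wide to long: one row per sample/gene pair.
long <- pivot_longer(wide, cols = c(gene_a, gene_b),
                     names_to = "gene", values_to = "count")

# Long back to wide.
wide_again <- pivot_wider(long, names_from = gene, values_from = count)

# Split "gene_a" into "gene" and "a", then glue the parts back together.
split_cols <- separate(long, gene, into = c("prefix", "letter"), sep = "_")
joined <- unite(split_cols, gene, prefix, letter, sep = "_")

# Two ways of handling the missing count.
complete_rows <- drop_na(long)
filled <- replace_na(long, list(count = 0))
```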
A modern reimplementation of the `data.frame` with the `tibble` package
| Command | Description |
|---|---|
| `tibble()` | Create a tibble (modern data frame) |
| `as_tibble(df)` | Convert an existing data frame to a tibble |
| `tribble()` | Create a tibble row by row |
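
The three constructors can be compared on the same small table. The column names `x` and `y` are illustrative assumptions.

```r
library(tibble)

# Column by column.
by_column <- tibble(x = 1:3, y = c("a", "b", "c"))

# From an existing data frame.
converted <- as_tibble(data.frame(x = 1:3, y = c("a", "b", "c")))

# Row by row: column names are given as formulas (~name) on the first line.
by_row <- tribble(
  ~x, ~y,
  1L, "a",
  2L, "b",
  3L, "c"
)
```

`tribble()` is mostly convenient for small tables typed by hand, since the code layout mirrors the table layout.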
Data visualization with the `ggplot2` package
| Command | Description |
|---|---|
| `ggplot(df, aes(x = x_var, y = y_var)) + geom_*()` | Create a basic plot |
| `+ labs(title = "Title", x = "X-axis label", y = "Y-axis label")` | Add labels |
| `+ theme_*()` | Customize plot appearance |
| `+ facet_wrap(vars(facet_var))` | Create facet plots |
| `+ scale_*()` | Adjust scales (e.g., color, size) |
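
As a sketch of how these layers combine, the plot below replaces each `*` placeholder with one concrete choice. The `mtcars` data set ships with base R; the particular geom, scale, and theme are illustrative assumptions, not recommendations.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +                              # geom_*(): one point per car
  labs(title = "Fuel economy by weight",
       x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       color = "Cylinders") +
  scale_color_brewer(palette = "Dark2") +     # scale_*(): color palette
  theme_minimal() +                           # theme_*(): overall appearance
  facet_wrap(vars(gear))                      # one panel per number of gears

# The plot is only drawn when the object is printed, e.g. with print(p).
```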
The use of pipes
You may use the `%>%` operator from the `magrittr` package to chain operations together. Here are some examples:
- Data manipulation

      df %>%
        filter(condition) %>%
        select(columns) %>%
        group_by(grouping_columns) %>%
        summarize(new_column = summary_function(column))
- Data visualization

      df %>%
        ggplot(aes(x = x_var, y = y_var)) +
        geom_point() +
        labs(title = "Scatter Plot", x = "X-axis label", y = "Y-axis label") +
        theme_minimal() +
        facet_wrap(vars(facet_var))
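
To make the data manipulation template concrete, here is a sketch on the built-in `mtcars` data set, next to the equivalent nested call. The threshold of 20 and the chosen columns are arbitrary assumptions.

```r
library(dplyr)  # loading dplyr also makes %>% available

# Piped version: read top to bottom, each step feeds the next.
piped <- mtcars %>%
  filter(mpg > 20) %>%
  select(cyl, mpg, wt) %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg))

# Same computation without pipes: read from the innermost call outwards.
nested <- summarize(
  group_by(select(filter(mtcars, mpg > 20), cyl, mpg, wt), cyl),
  mean_mpg = mean(mpg)
)
```

`x %>% f(y)` is evaluated as `f(x, y)`, so both versions return the same table; the piped form simply lists the steps in the order they happen.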