R commands

Get help

?<function name> open the help page of a function
help(<function name>) open the help page of a function (function-call form of ?)
help(package = <package name>) show the help index of a package

Read and write files

read.table(file = <file name>) read a file - by default the separator is any white space and no header line is expected
read.delim(file = <file name>) read a file - by default the separator is a tab and the first line is read as the header
write.table(x = <object>, file = <file name>) write an object into a file - be careful of the default options (row names are written, strings are quoted, the separator is a space)
vroom(file = <file name>) read a file (vroom package), automatically detects the separator
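
To illustrate the warning about the write.table() defaults, here is a minimal base R sketch of a write/read round trip. The data frame and the temporary file are made up for the example; only functions listed above are used.

```r
# Hypothetical example data; any data.frame would do.
df <- data.frame(id = 1:3, name = c("a", "b", "c"))
path <- tempfile(fileext = ".tsv")

# By default write.table() adds row names and quotes strings,
# so both are switched off here and a tab separator is set.
write.table(df, file = path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim() expects a tab-separated file with a header line.
df_back <- read.delim(file = path)

# read.table() needs both options spelled out to read the same file.
df_back2 <- read.table(file = path, header = TRUE, sep = "\t")
```

Both calls should give the same data frame; the difference is only in the default values of header and sep.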

Information about the objects

dim(<object>) return the dimensions (rows, columns) of a matrix/data.frame
length(<object>) return the length of a vector/list
head(<object>) return the first six elements of an object (rows for a matrix/data.frame) - be careful if the number of columns is large
tail(<object>) return the last six elements of an object (rows for a matrix/data.frame) - be careful if the number of columns is large
class(<object>) get the class of an object
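
As a short illustration, the functions above can be tried on mtcars, a data frame shipped with R. The column subset and the n = 3 argument are choices made for this sketch, as one way to keep the output narrow.

```r
# mtcars is a built-in data.frame with 32 rows and 11 columns.
dims <- dim(mtcars)      # rows, then columns
n_cols <- length(mtcars) # for a data.frame, length() counts the columns

# Restrict the columns before calling head()/tail() on wide objects.
first_rows <- head(mtcars[, 1:4], n = 3)
last_rows <- tail(mtcars[, 1:4], n = 3)

obj_class <- class(mtcars)
```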

Working directory

getwd() print the working directory
setwd(<path>) set the working directory to <path>
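
A possible pattern, sketched here with the session's temporary directory standing in for a real project path, is to store the current directory before changing it so that it can be restored afterwards.

```r
# Remember where we started.
old_wd <- getwd()

# tempdir() is used only as a directory that is guaranteed to exist.
setwd(tempdir())
new_wd <- getwd()

# Go back to the original working directory.
setwd(old_wd)
```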

Data manipulation with the dplyr package

filter(df, condition) Filter rows based on a condition
select(df, columns) Select specific columns
mutate(df, new_column = expression) Add new variables that are functions of existing variables
group_by(df, grouping_columns) Group data by one or more columns
summarize(df, new_column = summary_function(column)) Summarize data into one row per group (one row in total if df is not grouped)
arrange(df, columns) Sort data by columns
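
The following sketch applies each verb in turn to the built-in mtcars data. It assumes dplyr is installed; the threshold on wt and the new column names are arbitrary choices for the example.

```r
library(dplyr)

# Keep only the cars heavier than 3 (wt is in 1000 lbs).
heavy <- filter(mtcars, wt > 3)

# Keep three columns.
slim <- select(heavy, mpg, cyl, wt)

# Add a column computed from existing ones.
with_ratio <- mutate(slim, mpg_per_wt = mpg / wt)

# Group by number of cylinders, then compute one summary row per group.
by_cyl <- group_by(with_ratio, cyl)
summary_tbl <- summarize(by_cyl, mean_mpg = mean(mpg), n_cars = n())

# Sort the groups by decreasing mean mpg.
sorted <- arrange(summary_tbl, desc(mean_mpg))
```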

Tidy data with the tidyr package

pivot_longer(df, cols, names_to = "key", values_to = "value") Convert data from wide to long format
pivot_wider(df, names_from = key, values_from = value) Convert data from long to wide format
separate(df, column, into = c("new1", "new2"), sep = "separator") Split a single character column into multiple columns
unite(df, new_column, columns, sep = "separator") Combine multiple columns into a single character column
drop_na(df) Drop rows that contain a missing value
replace_na(df, replace = list(column = value)) Replace missing values with specified values, column by column
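
Here is a small sketch of these functions on a made-up table of counts (the sample and gene names are hypothetical). It assumes tidyr is installed.

```r
library(tidyr)

# Hypothetical wide table: one row per sample, one column per gene.
wide <- data.frame(
  sample = c("s1", "s2"),
  gene_a = c(1, NA),
  gene_b = c(3, 4)
)

# Wide to long: one row per sample/gene pair.
long <- pivot_longer(wide, cols = c(gene_a, gene_b),
                     names_to = "gene", values_to = "count")

# Long back to wide.
wide_again <- pivot_wider(long, names_from = gene, values_from = count)

# Two ways to deal with the missing count.
complete_rows <- drop_na(long)
filled <- replace_na(long, list(count = 0))

# Split "gene_a" into "gene" and "a", then glue the parts back together.
parts <- separate(long, gene, into = c("type", "id"), sep = "_")
joined <- unite(parts, "gene", type, id, sep = "_")
```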

A modern reimplementation of the data.frame with the tibble package

tibble() Create a tibble (modern data frame)
as_tibble(df) Convert an existing data frame to a tibble
tribble() Create a tibble row by row
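
The three constructors can be compared with a short sketch, assuming the tibble package is installed. All column names and values are invented for the example.

```r
library(tibble)

# Column by column; a later column may refer to an earlier one.
t1 <- tibble(x = 1:3, y = x * 2)

# From an existing data.frame; row names are moved to a regular column.
t2 <- as_tibble(mtcars, rownames = "model")

# Row by row, with column names given as formulas.
t3 <- tribble(
  ~name, ~score,
  "a",   1,
  "b",   2
)
```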

Data visualization with the ggplot2 package

ggplot(df, aes(x = x_var, y = y_var)) + geom_*() Create a basic plot
+ labs(title = "Title", x = "X-axis label", y = "Y-axis label") Add labels
+ theme_*() Customize plot appearance
+ facet_wrap(vars(facet_var)) Create facet plots
+ scale_*() Adjust scales (e.g., color, size)
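
As an illustration, the layers above can be combined into one plot of the built-in mtcars data. This is only a sketch, assuming ggplot2 is installed; the choice of geom, palette, theme and labels is arbitrary.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl))) +       # one geom_*() layer
  labs(title = "Fuel economy by weight",
       x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       color = "Cylinders") +
  scale_color_brewer(palette = "Dark2") +      # one scale_*() adjustment
  theme_minimal() +                            # one theme_*() preset
  facet_wrap(vars(gear))                       # one panel per number of gears

# The plot is drawn when the object is printed, e.g. with print(p).
```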

The use of pipes

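You may use the %>% operator from the magrittr package (also made available by dplyr) to chain operations together: the value on the left is passed as the first argument of the function on the right. Here are some examples: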

  • Data manipulation

    df %>%
      filter(condition) %>%
      select(columns) %>%
      group_by(grouping_columns) %>%
      summarize(new_column = summary_function(column))
    

  • Data visualization

    df %>%
      ggplot(aes(x = x_var, y = y_var)) +
      geom_point() +
      labs(title = "Scatter Plot", x = "X-axis label", y = "Y-axis label") +
      theme_minimal() +
      facet_wrap(vars(facet_var))
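
To make the templates above concrete, this sketch writes the same computation once with nested calls and once with pipes, on the built-in mtcars data. It assumes dplyr is installed; the filter threshold is an arbitrary choice.

```r
library(dplyr)

# x %>% f(y) is another way of writing f(x, y).
same_head <- identical(mtcars %>% head(3), head(mtcars, 3))

# Nested calls are read from the inside out...
nested <- summarize(
  group_by(filter(mtcars, wt > 3), cyl),
  mean_mpg = mean(mpg)
)

# ...while the piped version is read from top to bottom.
piped <- mtcars %>%
  filter(wt > 3) %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg))
```

Both versions should return the same table; the pipe only changes how the code is written.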
    

Info

See https://rstudio.com/resources/cheatsheets/ for more information