Data Import/Export

Importing and Exporting Data in R

Author

Raju Rimal

Published

November 30, 2024

Modified

March 19, 2025

One of the most important steps in data analysis is getting your data into and out of your analysis software. R provides powerful tools to work with various data formats, allowing you to seamlessly import and export data for analysis and sharing. This guide introduces how to handle common data formats in R, including CSV, Text, Excel, SPSS, Stata, and more.


1. Importing Data into R

CSV (Comma-Separated Values) is one of the most common file formats for tabular data.

# Import a CSV file
data <- read.csv("data.csv")
print(head(data))
Customizing Import Options
# Specify column types, separator, and missing value indicator
data <- read.csv("data.csv", stringsAsFactors = FALSE, sep = ",", na.strings = "")

The fread function is part of the data.table package and is optimized for fast data reading. It automatically detects the delimiter, column types, and handles common formatting issues.

library(data.table)

# Import a CSV file
data <- fread("data.csv")
print(head(data))
Why Use fread?
  • Speed: Significantly faster than read.csv for large datasets.
  • Ease of Use: Automatically detects column types, missing values, and delimiters.
  • Custom Options: Easily configurable for specific needs, e.g., skipping rows, handling encodings.

The readr package provides faster and more flexible tools for reading data.

library(readr)

data <- read_csv("data.csv")
print(head(data))
Comparison: read.csv vs. fread
Feature read.csv fread readr
Speed Slower Faster Fast
Auto Type Detection Limited Excellent Excellent (Customizable)
Handling Large Files May struggle Highly efficient Efficient

Text files often use custom delimiters, such as tabs or pipes (|).

# Import a tab-delimited text file
data <- read.table("data.txt", header = TRUE, sep = "\t")
print(head(data))

For more control over delimiters:

data <- read_delim("data.txt", delim = "\t")

fread function from data.table also works for text file.


Excel files are commonly used in business and research settings.

library(readxl)

# Read the first sheet of an Excel file
data <- read_excel("data.xlsx")

# Specify a sheet
data <- read_excel("data.xlsx", sheet = "Sheet1")
library(openxlsx)

# Read an Excel file
data <- read.xlsx("data.xlsx", sheet = 1)

library(haven)

# Import SPSS file
data <- read_sav("data.sav")

# Import Stata file
data <- read_dta("data.dta")

  • JSON: Use the jsonlite package (fromJSON() function).
  • SQL Databases: Use the DBI package to connect and query databases.
  • RData/Serialized Files: Use load() to import .RData files.

2. Exporting Data from R

# Export a data frame to CSV
write.csv(data, "output.csv", row.names = FALSE)

The data.table package also offers a fast function for exporting data: fwrite.

library(data.table)

# Export to CSV
fwrite(data, "output.csv", sep = ",")

Why Use fwrite?

  • Speed: Faster than write.csv for large datasets.
  • Custom Delimiters: Easily supports different delimiters like tabs, pipes, etc.
# Export a data frame to a tab-delimited text file
write.table(data, "output.txt", sep = "\t", row.names = FALSE)

Using openxlsx

library(openxlsx)

# Export a data frame to Excel
write.xlsx(data, "output.xlsx")

Using writexl

library(writexl)

# Export a data frame to Excel
write_xlsx(data, "output.xlsx")

library(haven)

# Export to SPSS
write_sav(data, "output.sav")

# Export to Stata
write_dta(data, "output.dta")

3. Tips for Handling Data Formats

  1. Inspect Your Data: After importing data, always inspect it using functions like head(), str(), or summary().
  2. Handle Missing Data: Specify na.strings when importing files to ensure proper handling of missing values.
  3. Export Without Row Names: By default, R includes row numbers when exporting. Use row.names = FALSE to avoid this.
  4. Use Relevant Packages: Many packages enhance R’s ability to import/export data efficiently.
File Type Recommended Packages
CSV/Text readr, Base R (read.csv)
Excel readxl, openxlsx
SPSS/Stata haven
JSON jsonlite

4. Common Issues and Troubleshooting

  1. Encoding Problems:

    If you encounter strange characters, specify the encoding.

    data <- read.csv("data.csv", fileEncoding = "UTF-8")
  2. Column Type Issues:

    If columns are not imported correctly, manually specify column types:

    data <- read_csv("data.csv", col_types = cols(
      Column1 = col_double(),
      Column2 = col_character()
    ))
  3. Missing Packages:

    If a package is not installed, install it using install.packages("package_name").


5. Practice Exercise

  1. Import the following datasets into R:
    • A CSV file with employee data.
    • An Excel file containing sales data (Sheet 2).
    • A Stata file with survey responses.
  2. Inspect the datasets using head(), str(), and summary().
  3. Export the cleaned data to CSV and Excel formats.

Conclusion

Importing and exporting data in R is a crucial skill that ensures smooth transitions between data collection, analysis, and sharing. By mastering these tools and techniques, you’ll be well-equipped to handle real-world datasets in a variety of formats.