CSV Files and Their Operations in the R Programming Language

Rumman Ansari     2023-03-24

Getting and Setting the Working Directory

You can check which directory the R workspace is pointing to with the getwd() function, and set a new working directory with the setwd() function.

 
 # Get and print current working directory.
print(getwd())

# Set current working directory.
setwd("E:/R-Programming-Script-files")

# Get and print current working directory.
print(getwd())
 
 

When we execute the above code, it produces the following result:

  
[1] "C:/Users/Hello World/Documents" 
[1] "E:/R-Programming-Script-files"
 
 

The exact paths depend on your operating system and on the directory you are working in.
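Because setwd() changes the working directory for the whole session, it is often useful to remember the previous directory and restore it afterwards. The following is a minimal sketch of that pattern; tempdir() stands in for a project folder here and is only an assumption for illustration.

```r
# setwd() invisibly returns the directory that was current before the change.
old_wd <- setwd(tempdir())

# Build file paths with file.path() instead of pasting separators by hand.
csv_path <- file.path(getwd(), "inputData.csv")
print(csv_path)

# Restore the original working directory.
setwd(old_wd)
print(getwd())
```

Capturing the return value of setwd() avoids hard-coding the original path, so the same script works on any machine.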

Input as CSV File

A CSV file is a text file in which the values in each row are separated by commas. Consider the following data, stored in a file named inputData.csv.

You can create this file in Windows Notepad by copying and pasting the data below. Save it as inputData.csv, choosing the All Files (*.*) type in the Save As dialog.

 
id,Name,salary,Job_date,dept
1,Rumman,12623.3,01-01-2012,IT
2,Jaman,32515.2,23-09-2013,Operations
3,Inza,342611,15-11-2014,IT
4,Azam,232729,11-05-2014,HR
5,Sabir,45843.25,27-03-2015,Finance
6,Jakir,322578,21-05-2013,IT
7,Sourav,221632.8,30-07-2013,Operations
8,Ramu,22722.5,17-06-2014,Finance
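
As an alternative to a text editor, the same file can be written from R itself. This is a sketch under one assumption: it writes to a temporary directory (via tempdir()) so that nothing in your project is overwritten; replace csv_path with "inputData.csv" to create the file in the working directory instead.

```r
# Lines of the sample file, exactly as listed above.
csv_lines <- c(
  "id,Name,salary,Job_date,dept",
  "1,Rumman,12623.3,01-01-2012,IT",
  "2,Jaman,32515.2,23-09-2013,Operations",
  "3,Inza,342611,15-11-2014,IT",
  "4,Azam,232729,11-05-2014,HR",
  "5,Sabir,45843.25,27-03-2015,Finance",
  "6,Jakir,322578,21-05-2013,IT",
  "7,Sourav,221632.8,30-07-2013,Operations",
  "8,Ramu,22722.5,17-06-2014,Finance"
)

# Assumed location for this sketch; use "inputData.csv" for the working directory.
csv_path <- file.path(tempdir(), "inputData.csv")
writeLines(csv_lines, csv_path)

# Confirm the file exists and holds a header line plus eight records.
print(file.exists(csv_path))
print(length(readLines(csv_path)))
```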

 
 

Reading a CSV File

The following is a simple example of using the read.csv() function to read a CSV file from your current working directory:

 
 data <- read.csv("inputData.csv")
print(data)
 
 

When we execute the above code, it produces the following result:

 
  id   Name    salary   Job_date       dept
1  1 Rumman  12623.30 01-01-2012         IT
2  2  Jaman  32515.20 23-09-2013 Operations
3  3   Inza 342611.00 15-11-2014         IT
4  4   Azam 232729.00 11-05-2014         HR
5  5  Sabir  45843.25 27-03-2015    Finance
6  6  Jakir 322578.00 21-05-2013         IT
7  7 Sourav 221632.80 30-07-2013 Operations
8  8   Ramu  22722.50 17-06-2014    Finance
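
read.csv() also decides a type for each column while reading. The sketch below inspects those types on an excerpt of the sample data. It uses the text argument so that it does not depend on a file on disk, and sets stringsAsFactors = FALSE explicitly because the default differs between R versions before and after 4.0.

```r
# First three records of the sample file, supplied inline.
csv_text <- "id,Name,salary,Job_date,dept
1,Rumman,12623.3,01-01-2012,IT
2,Jaman,32515.2,23-09-2013,Operations
3,Inza,342611,15-11-2014,IT"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Class of every column: note that Job_date is read as plain text, not as a date.
col_types <- sapply(data, class)
print(col_types)
```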

 
 

Analyzing the CSV File

By default, the read.csv() function returns its result as a data frame. We can verify this with is.data.frame(), and check the number of columns and rows with ncol() and nrow().

 
 data <- read.csv("inputData.csv")

print(is.data.frame(data))
print(ncol(data))
print(nrow(data))
 
 

When we execute the above code, it produces the following result:

 
 > data <- read.csv("inputData.csv")
> print(is.data.frame(data))
[1] TRUE
> print(ncol(data))
[1] 5
> print(nrow(data))
[1] 8
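
A few other base R functions give the same kind of overview in fewer calls. This is a small sketch; the data is supplied inline through the text argument only so that the example runs without the file.

```r
csv_text <- "id,Name,salary,Job_date,dept
1,Rumman,12623.3,01-01-2012,IT
2,Jaman,32515.2,23-09-2013,Operations
3,Inza,342611,15-11-2014,IT
4,Azam,232729,11-05-2014,HR
5,Sabir,45843.25,27-03-2015,Finance
6,Jakir,322578,21-05-2013,IT
7,Sourav,221632.8,30-07-2013,Operations
8,Ramu,22722.5,17-06-2014,Finance"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# dim() returns the number of rows and columns in one vector.
print(dim(data))

# names() lists the column names taken from the header line.
print(names(data))

# str() prints a compact summary of each column's type and first values.
str(data)
```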
 
 

Once the data is in a data frame, we can apply any function that works on data frames, as the following sections show.

Get the maximum salary

 
 # Create a data frame.
data <- read.csv("inputData.csv")

# Get the max salary from data frame.
sal <- max(data$salary)
print(sal)
 
 

When we execute the above code, it produces the following result:

 
 [1] 342611
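
Two details are worth knowing when taking a maximum: which.max() gives the position of the largest value, and max() returns NA if any value is missing unless na.rm = TRUE is passed. The sketch below rebuilds the Name and salary columns inline; the trailing NA is a hypothetical missing value added only to show the effect.

```r
data <- data.frame(
  Name = c("Rumman", "Jaman", "Inza", "Azam", "Sabir", "Jakir", "Sourav", "Ramu"),
  salary = c(12623.3, 32515.2, 342611, 232729, 45843.25, 322578, 221632.8, 22722.5),
  stringsAsFactors = FALSE
)

# Position of the first occurrence of the largest salary.
top_row <- which.max(data$salary)
print(data$Name[top_row])

# Hypothetical vector with one missing salary appended.
with_gap <- c(data$salary, NA)
print(max(with_gap))                # NA, because one value is missing
print(max(with_gap, na.rm = TRUE))  # missing values are ignored
```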
 
 

Get the details of the person with max salary

We can fetch the rows that meet a filter condition, much like a SQL WHERE clause.

 
 # Create a data frame.
data <- read.csv("inputData.csv")

# Get the details of the person with the max salary.
retval <- subset(data, salary == max(salary))
print(retval)
 
 

When we execute the above code, it produces the following result:

 
   id Name salary   Job_date dept
3  3 Inza 342611 15-11-2014   IT
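
The same filter can be written with bracket indexing, but the two forms treat missing values differently: subset() drops rows where the condition is NA, while a logical index keeps them as all-NA rows. A minimal sketch follows; the "Unknown" row with a missing salary is hypothetical and is not part of inputData.csv.

```r
data <- data.frame(
  Name = c("Rumman", "Inza", "Azam", "Unknown"),
  salary = c(12623.3, 342611, 232729, NA),  # last value is a made-up gap
  stringsAsFactors = FALSE
)

top_salary <- max(data$salary, na.rm = TRUE)

# subset() silently discards the row whose comparison evaluates to NA.
by_subset <- subset(data, salary == top_salary)

# A logical index keeps that row, filled with NA values.
by_index <- data[data$salary == top_salary, ]

# Wrapping the condition in which() removes the NA position again.
by_which <- data[which(data$salary == top_salary), ]

print(nrow(by_subset))
print(nrow(by_index))
print(nrow(by_which))
```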
 
 

Get all the people working in the IT department

 
 # Create a data frame.
data <- read.csv("inputData.csv")

retval <- subset( data, dept == "IT")
print(retval)
 
 

When we execute the above code, it produces the following result:

 
   id   Name   salary   Job_date dept
1  1 Rumman  12623.3 01-01-2012   IT
3  3   Inza 342611.0 15-11-2014   IT
6  6  Jakir 322578.0 21-05-2013   IT
 
 

Get the people in the IT department whose salary is greater than 25000

 
 # Create a data frame.
data <- read.csv("inputData.csv")

info <- subset(data, salary > 25000 & dept == "IT")
print(info)
 
 

When we execute the above code, it produces the following result:

 
   id  Name salary   Job_date dept
3  3  Inza 342611 15-11-2014   IT
6  6 Jakir 322578 21-05-2013   IT
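
Conditions can also be combined with | (or) and %in% (membership), and the select argument of subset() limits which columns are returned. The sketch below shows both on the sample data, supplied inline so that it runs without the file.

```r
csv_text <- "id,Name,salary,Job_date,dept
1,Rumman,12623.3,01-01-2012,IT
2,Jaman,32515.2,23-09-2013,Operations
3,Inza,342611,15-11-2014,IT
4,Azam,232729,11-05-2014,HR
5,Sabir,45843.25,27-03-2015,Finance
6,Jakir,322578,21-05-2013,IT
7,Sourav,221632.8,30-07-2013,Operations
8,Ramu,22722.5,17-06-2014,Finance"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Rows from either of two departments, keeping only three columns.
it_or_hr <- subset(data, dept %in% c("IT", "HR"), select = c(Name, salary, dept))
print(it_or_hr)

# Rows that are in Finance OR earn more than 300000.
finance_or_high <- subset(data, dept == "Finance" | salary > 300000)
print(finance_or_high)
```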
 
 

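Get the people who joined in or after 2013

 
 # Create a data frame.
data <- read.csv("inputData.csv")

# Job_date is stored as day-month-year text, so as.Date() needs an explicit format.
retval <- subset(data, as.Date(Job_date, format = "%d-%m-%Y") >= as.Date("2013-01-01"))
print(retval)
 
 

When we execute the above code, it produces the following result:

 
  id   Name    salary   Job_date       dept
2  2  Jaman  32515.20 23-09-2013 Operations
3  3   Inza 342611.00 15-11-2014         IT
4  4   Azam 232729.00 11-05-2014         HR
5  5  Sabir  45843.25 27-03-2015    Finance
6  6  Jakir 322578.00 21-05-2013         IT
7  7 Sourav 221632.80 30-07-2013 Operations
8  8   Ramu  22722.50 17-06-2014    Finance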
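
Date filters on this file only behave as intended when Job_date is parsed with its real layout. as.Date() expects year-month-day text by default, while the file stores day-month-year, so a format string is needed. The sketch below converts the column once and then filters and groups on it; the data is supplied inline so that the example runs without the file.

```r
csv_text <- "id,Name,salary,Job_date,dept
1,Rumman,12623.3,01-01-2012,IT
2,Jaman,32515.2,23-09-2013,Operations
3,Inza,342611,15-11-2014,IT
4,Azam,232729,11-05-2014,HR
5,Sabir,45843.25,27-03-2015,Finance
6,Jakir,322578,21-05-2013,IT
7,Sourav,221632.8,30-07-2013,Operations
8,Ramu,22722.5,17-06-2014,Finance"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Convert the text column to Date, stating the day-month-year layout.
data$Job_date <- as.Date(data$Job_date, format = "%d-%m-%Y")
print(class(data$Job_date))

# Everyone who joined on or after 1 January 2013.
joined_2013_on <- subset(data, Job_date >= as.Date("2013-01-01"))
print(nrow(joined_2013_on))

# Number of people who joined in each calendar year.
by_year <- table(format(data$Job_date, "%Y"))
print(by_year)
```

Converting the column once, right after reading, keeps later filters short and avoids repeating the format string.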
 
 

Writing into a CSV File

R can create a CSV file from an existing data frame using the write.csv() function. Unless a path is given, the file is created in the working directory.

 
 # Create a data frame.
data <- read.csv("inputData.csv")
retval <- subset(data, as.Date(Job_date, format = "%d-%m-%Y") > as.Date("2013-01-01"))

# Write filtered data into a new file.
write.csv(retval,"output.csv")
newdata <- read.csv("output.csv")
print(newdata)
 
 

When we execute the above code, it produces the following result:

 
   X id   Name    salary   Job_date       dept
1 2  2  Jaman  32515.20 23-09-2013 Operations
2 3  3   Inza 342611.00 15-11-2014         IT
3 4  4   Azam 232729.00 11-05-2014         HR
4 5  5  Sabir  45843.25 27-03-2015    Finance
5 6  6  Jakir 322578.00 21-05-2013         IT
6 7  7 Sourav 221632.80 30-07-2013 Operations
7 8  8   Ramu  22722.50 17-06-2014    Finance
 
 

Here the column X holds the row names of retval, which write.csv() saves by default. It can be dropped by passing row.names = FALSE when writing the file.

 
 # Create a data frame.
data <- read.csv("inputData.csv")
retval <- subset(data, as.Date(Job_date, format = "%d-%m-%Y") > as.Date("2013-01-01"))

# Write filtered data into a new file.
write.csv(retval,"output.csv", row.names = FALSE)
newdata <- read.csv("output.csv")
print(newdata)
 
 

When we execute the above code, it produces the following result:

 
   id   Name    salary   Job_date       dept
1  2  Jaman  32515.20 23-09-2013 Operations
2  3   Inza 342611.00 15-11-2014         IT
3  4   Azam 232729.00 11-05-2014         HR
4  5  Sabir  45843.25 27-03-2015    Finance
5  6  Jakir 322578.00 21-05-2013         IT
6  7 Sourav 221632.80 30-07-2013 Operations
7  8   Ramu  22722.50 17-06-2014    Finance
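
The effect of row.names can be checked with a quick round trip. This is a sketch on a three-row excerpt of the data; it writes to a path from tempfile() rather than the working directory, which is an assumption made only to keep the example from leaving files behind.

```r
data <- data.frame(
  id = 1:3,
  Name = c("Rumman", "Jaman", "Inza"),
  salary = c(12623.3, 32515.2, 342611),
  stringsAsFactors = FALSE
)

# Temporary output path used only for this sketch.
out_path <- tempfile(fileext = ".csv")

# Default behaviour: row names are written as an unnamed first column.
write.csv(data, out_path)
with_rownames <- read.csv(out_path)
print(names(with_rownames))

# With row.names = FALSE the file has the original columns only.
write.csv(data, out_path, row.names = FALSE)
without_rownames <- read.csv(out_path)
print(names(without_rownames))

# Remove the temporary file.
unlink(out_path)
```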
 
 

All code in one place

 
 # Get and print current working directory.
print(getwd())

# Set current working directory.
setwd("E:/R-Programming-Script-files")

# Get and print current working directory.
print(getwd())

data <- read.csv("inputData.csv")
print(data)

data <- read.csv("inputData.csv")

print(is.data.frame(data))
print(ncol(data))
print(nrow(data))

# Create a data frame.
data <- read.csv("inputData.csv")

# Get the max salary from data frame.
sal <- max(data$salary)
print(sal)


# Create a data frame.
data <- read.csv("inputData.csv")

# Get the details of the person with the max salary.
retval <- subset(data, salary == max(salary))
print(retval)

# Create a data frame.
data <- read.csv("inputData.csv")

retval <- subset( data, dept == "IT")
print(retval)


# Create a data frame.
data <- read.csv("inputData.csv")

info <- subset(data, salary > 25000 & dept == "IT")
print(info)

# Create a data frame.
data <- read.csv("inputData.csv")

retval <- subset(data, as.Date(Job_date, format = "%d-%m-%Y") >= as.Date("2013-01-01"))
print(retval)


# Create a data frame.
data <- read.csv("inputData.csv")
retval <- subset(data, as.Date(Job_date, format = "%d-%m-%Y") > as.Date("2013-01-01"))

# Write filtered data into a new file.
write.csv(retval,"output.csv")
newdata <- read.csv("output.csv")
print(newdata)

# Create a data frame.
data <- read.csv("inputData.csv")
retval <- subset(data, as.Date(Job_date, format = "%d-%m-%Y") > as.Date("2013-01-01"))

# Write filtered data into a new file.
write.csv(retval,"output.csv", row.names = FALSE)
newdata <- read.csv("output.csv")
print(newdata)