Week2: Working directory and accessor

vector

working directory

Author

Tien-Cheng

Welcome to the second session! You will learn what a working directory is and how to subset elements from a vector, a list and a data frame.

Note

data type logical and operator
accessor []
What is a working directory (wd)?
How to access elements from vector, list and dataframe

1 Conditions: logical operators and vectors

Logical vector Logical operators ¹

¹ accessors

L==R: direction doesn't matter.
L%in%R: one-sided; checks whether each element of L is present in R.
!: negates the logical vector.

library(dplyr) # provides the pipe operator %>% used below

# check whether a value exists in a vector
3%in%c(1,3)
# what is the difference?
c(1,3)%in%3
2%in%c(1,3) 
1==2 
1==c(2,1) 
# element-wise check whether each element of L equals R
c(2,1)==1
# check identity pairwise
c(1,2)==c(2,4)
# does '!' negate the logical vector?
!1==2 
1!=2 
c(1,3)==2
# what does which() return?
which(c(1,3)==3) 
# what will be the data type? check with str()
c(1,2,NA) %>% is.na() 
c(1,2,NA) %>% is.na() %>% which() 
c(1,2,NA) %>% is.na() %>% !.
# the next line is expected to fail: the pipe hands the vector to `!`, not to is.na()
# c(1,2,NA) %>% !is.na() 
!is.na(c(1,2,NA))
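
To tie the operators above together, here is a minimal sketch that filters a vector with %in%, ! and which(). The names values and wanted are made up for illustration and are not part of the course data.

```r
# hypothetical example vectors (not part of the course data)
values <- c(1, 3, 5, 7)
wanted <- c(3, 5)

# one TRUE/FALSE per element of `values`
is_wanted <- values %in% wanted
is_wanted

# positions of the TRUE elements
wanted_pos <- which(is_wanted)
wanted_pos

# negate the logical vector to keep everything that is not wanted
not_wanted <- values[!is_wanted]
not_wanted
```

Because %in% always returns one logical value per element of its left-hand side, the result can be used directly inside the accessor [].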

Examples of preconditions inside a function

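# check if the data type matches
arg <- ""
is.character(arg)
if(is.character(arg)){
  print("character")
}
# stop() raises an error when the type is not as expected
if(is.character(arg)){
  print("character")
}else{
  stop("type other than character")
}
# warning() reports the problem but lets the code continue
if(!is.character(arg)){
  warning("wrong")
}
# stop() reports the problem and halts execution
if(!is.character(arg)){
  stop("wrong")
}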
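
The checks above can be placed at the top of a function so that wrong input is rejected before any computation happens. The following is only a sketch: the function to_kelvin and its argument temp_c are hypothetical names chosen for illustration.

```r
# hypothetical helper: convert degrees Celsius to Kelvin
to_kelvin <- function(temp_c) {
  # precondition: refuse anything that is not numeric
  if (!is.numeric(temp_c)) {
    stop("temp_c must be numeric")
  }
  temp_c + 273.15
}

# valid input passes the check
to_kelvin(c(20, 15, 13))

# invalid input triggers stop(); tryCatch() captures the error message
# so that the script keeps running
msg <- tryCatch(to_kelvin("20"), error = function(e) conditionMessage(e))
msg
```

Wrapping the call in tryCatch() is only needed here to show the message without halting the script; in normal use the error would simply stop execution.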

challenge

Inside your plusone function, first check whether the input x is numeric before proceeding. If it is not, stop with the message “wrong input type” using stop().

2 Working directory

2.1 preparation

Open the folder that contains Wheat_BSC_project.Rproj.
Download the data from HU-box and save it in the folder data.
Create Week2.R and save it in the folder src.

What is a working directory (wd)?

2.2 abbreviation path: “.” for wd and “..” for parent of wd

"." means the working directory (wd) where this R script exists.

".." means the parent (one level higher) directory of ".".


library(dplyr)

# working directory, abbreviated as "."
getwd()
# parent directory, abbreviated as ".."
dirname(getwd())
# assign current path to variable
current_path <- getwd()
# check the type 
current_path %>% str()


# check files in the directory

# are they different?
"." %>% list.files(path=.)
getwd() %>% list.files(path=.)

# are they different?
".." %>% list.files(path=.)
getwd() %>% dirname() %>% list.files(path=.)
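
One way to confirm that the abbreviations and the function calls point to the same folders is to expand both to full paths with normalizePath() and compare them. This is a small sketch; the variable names are chosen only for illustration.

```r
# "." and getwd() should resolve to the same folder
wd_from_dot <- normalizePath(".")
wd_from_getwd <- normalizePath(getwd())
same_wd <- identical(wd_from_dot, wd_from_getwd)
same_wd

# ".." and dirname(getwd()) should resolve to the same parent folder
parent_from_dots <- normalizePath("..")
parent_from_dirname <- normalizePath(dirname(getwd()))
same_parent <- identical(parent_from_dots, parent_from_dirname)
same_parent
```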

challenge

Although . has the same meaning as getwd(), the folder it points to depends on the environment you are working in.

Right-click the RStudio logo, open a new RStudio window, and compare the result of getwd() inside the R project with the result in a plain R session.

2.3 accessing files and folders inside an R project

Which one do you prefer? Why do we prefer relative paths?

# absolute path, did you get an error?
"C:/Users/marse/seadrive_root/Tien-Che/My Libraries/PhD_Tien/Project/Postdoc_teaching/BSC_project_IPFS2023/data" %>% list.files(path=.)
# path built from the working directory in base R
parent_path <- getwd() 
paste0(parent_path,"/data") %>% list.files(path=.)

# Does this work? Try it in the console: a single backslash starts an escape sequence in R strings
# ".\data" %>% list.files(path=.)
"./data" %>% list.files(path=.)
"data" %>% list.files(path=.)
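
To see why a relative path is convenient, the sketch below builds a throwaway project in a temporary folder and lists the same files once with a relative path and once with a full path. The folder name demo_project and the file example.csv are hypothetical and are not part of the course material.

```r
# create a small, hypothetical project layout inside a temporary folder
project_dir <- file.path(tempdir(), "demo_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(project_dir, "data", "example.csv"))

# make the project folder the working directory and remember the old one
old_wd <- setwd(project_dir)

# relative path: resolved against the working directory
rel_files <- list.files("data")
# full path: built from the working directory with file.path()
abs_files <- list.files(file.path(getwd(), "data"))

rel_files
identical(rel_files, abs_files)

# restore the previous working directory
setwd(old_wd)
```

The relative path "data" keeps working if the whole project folder is moved or shared, while a hard-coded absolute path only works on the machine where it was written.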


symbol	Absolute path	relative path	color
A	`C/users/Wheat_BSC_project`	`..`	black
B	`C/users/Wheat_BSC_project/data`	?	blue
C	`C/users/Wheat_BSC_project/src`	`.`	red
D	`C/users/Wheat_BSC_project/src/data`	?	blue

Below are four relative paths. Please rewrite them in absolute (full) path form. Which two are the same? Based on the folder layout (symbols A to D in the table above), should each of paths 1-4 be A, B, C or D?

"ear_summarized.csv"
"data/ear_summarized.csv"
"./data/ear_summarized.csv"
"../data/ear_summarized.csv"

3 get element from a vector with `accessors []`

Vector indices run from 1 to the length of the vector.

empty_vec <- c()
length(empty_vec)
# what is the type of the empty vec?
empty_vec %>% str()

# NULL: empty 
empty_vec[1]
empty_vec[0]


vec <- c(1,3,5)
vec[1]
#reorder the vector 
vec[c(2,1,3)]

# removing the indexed elements
vec[-1]
vec[-2]

# indexing starts from 1, not 0
# therefore index 0 returns numeric(0)
vec[0]
# when the index exceeds the range of a vector, what data type do you get? 
vec[4]
vec %>% .[length(.)+1]
vec[1:4]
vec[4:1]

# find specific element or position
vec[c(F,T,F)]
vec[vec==5]
# when the condition does not match at all, what will it return? 
vec[vec==2]
vec[c(F,F,F)]
vec %>% .[c(F)]
vec[vec=="a"]

# built-in character vectors
letters
LETTERS
# when the query does not match, guess what will be the datatype? 
letters %>% .[.==2]
letters %>% .[c(F)]
# overwrite the vector
vec
vec <- c(2,1,3)
vec
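
The three ways of indexing shown above (positive positions, negative positions and logical vectors) can be summarized in a short sketch. The vector temps and its values are hypothetical.

```r
# hypothetical temperature readings
temps <- c(20, 15, 13)

# positive positions: keep the selected elements
first_two <- temps[1:2]
first_two

# negative positions: drop the selected elements
without_first <- temps[-1]
without_first

# logical vector: keep the elements where the condition is TRUE
equal_15 <- temps[temps == 15]
equal_15

# negated condition: keep the elements where the condition is FALSE
not_15 <- temps[temps != 15]
not_15
```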

challenge

vec <- c(1, 2, 3, 4, 5)
logical_vec <- c(TRUE, FALSE)
subset_vec <- vec[logical_vec]
subset_vec

[1] 1 3 5

What did you observe? Is there any vector recycling?

What happens when you enter vec[TRUE]?

Supplementary information on special data types:²

² Data types: an empty vector is NULL; indexing at position zero returns numeric(0).

4 list: keep the diversity of data type

Making a list is like putting cookies (the content of a list element) into a cookie jar (the list element).

# list without element name
list_a <- list(c(1,2))

(Figure: list without name)

There are three common accessors for a list:

[] accesses the list element (the cookie jar) by position; the result is still a list.

[[]] accesses the content of a list element (all cookies in the jar) by position or name.

[[]][] accesses specific content of a list element (selected cookies) by a position vector or a logical vector.


vec_obj <- c(1,2,4,5)
# position vector
pos_vec <- c(2,3,1)
list_obj <- list(vec_obj)
# element id
ele_ind <- 1

action	vector	list
extract from content	`vec_obj[pos_vec]`	`list_obj[[ele_ind]][pos_vec]`
refer to content (data type:list)	`vec_obj`	`list_obj[[ele_ind]]`
refer to position (data type:list)		`list_obj[ele_ind]`

pos_vec and ele_ind can each be either numeric or logical.
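
The difference between the three accessors becomes visible when checking the class of each result. The following sketch uses a hypothetical list called jar.

```r
# a hypothetical cookie jar: a list with one element
jar <- list(c(1, 2, 4, 5))

# [] returns the jar itself: still a list
jar_class <- class(jar[1])
jar_class

# [[]] returns the cookies: the numeric vector stored inside
cookies_class <- class(jar[[1]])
cookies_class

# [[]][] picks selected cookies by position ...
by_position <- jar[[1]][c(2, 3)]
by_position

# ... or by logical vector
by_logical <- jar[[1]][c(TRUE, FALSE, FALSE, TRUE)]
by_logical
```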

4.1 access content by name

$ accesses the content of a list element by name: list_object$element_name or list_object[["element_name"]]

# list with element name
list_b <- list(nam=c(1,2))

More about the accessors. ³

³ accessors

# create a simple list
list(1)
# create a simple list with name "x" for first element
list(x=1)
list(x=1)["x"]
# extract content
list(x=1)$"x"
list(x=1)[[1]]
list(x=1)[["x"]]

# extract with pipe
list(x=1) %>% .[[1]]
list(x=1) %>% .$"x"

# long list
long_list_example <- list(1,c(1,2),
                          T,c(T,T),
                          "str",c("a","b"),
                          list(1),
                          mean,data.frame())
# check the content
long_list_example
# check structure of this list 
# long_list_example %>% str()
# long_list_example %>% glimpse()
# long_list_example
# first list element (still a list)
long_list_example[1]
# content of the first list element
long_list_example[[1]]
# first element of content of first list
long_list_example[[1]][1]
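
A practical difference between $ and [[]] is that [[]] also accepts a name stored in a variable, while $ takes the name literally. Below is a minimal sketch; the list plot_info and the variable field are hypothetical.

```r
# hypothetical named list
plot_info <- list(id = "A1", heights = c(80, 85))

# both forms return the content of the element named "heights"
via_dollar <- plot_info$heights
via_brackets <- plot_info[["heights"]]
identical(via_dollar, via_brackets)

# [[]] evaluates the variable and looks up the name stored in it
field <- "heights"
via_variable <- plot_info[[field]]
via_variable

# $ looks for an element literally called "field", which does not exist
via_dollar_variable <- plot_info$field
via_dollar_variable
```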

challenge

Can you guess the data type of each result?

# nonsense
long_list_example[[1]][2]
long_list_example[1][1]
long_list_example[1][2]
long_list_example[2][2]
# meaningful
long_list_example[[2]][2]

4.2 lapply: apply functions and return `list`

lapply(vector, function) applies the function to each element and returns a list. See ?lapply for details.

# input is vector
c(1,4) %>% 
  lapply(.,FUN=function(x){x+3})
# input is list
list(2,4,c(1,4)) %>% 
  lapply(.,FUN=function(x){x+3})
# input has different types
list(2,4,c(1,4),"8") %>% 
  lapply(.,FUN=function(x){x+3})
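
Since lapply() always returns a list, its result can be accessed with the list accessors from the previous section. The sketch below uses a hypothetical named list temps_by_day and the base function mean().

```r
# hypothetical temperature readings per day
temps_by_day <- list(day1 = c(20, 15), day2 = c(13, 18, 22))

# apply mean() to each list element; names are carried over to the result
daily_means <- lapply(temps_by_day, mean)
class(daily_means)

# access one result by name
day1_mean <- daily_means$day1
day1_mean

# unlist() flattens the list into a named vector when all results have the same type
unlist(daily_means)
```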

challenge

Why do you get an error in the last call?

5 dataframe is a special type of list

Each column holds a single data type.

# create a dataframe 
df <- data.frame(time=as.Date("2023-04-16",format="%Y-%m-%d")+seq(1,3,1),
                 temp=c(20,15,13),
                 thermal_time=cumsum(c(20,15,13)))
# another way
df <- data.frame(time=as.Date("2023-04-16",format="%Y-%m-%d")+seq(1,3,1)) 
df$temp=c(20,15,13)
df$thermal_time=cumsum(df$temp)

# third method
library(dplyr)
df <- data.frame(time=as.Date("2023-04-16",format="%Y-%m-%d")+seq(1,3,1)) %>% 
  mutate(temp=c(20,15,13), 
         thermal_time=cumsum(temp))
df
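
That a data frame is a list of equally long columns can be checked directly. The sketch below uses a separate, hypothetical data frame df_demo so that df from above stays untouched.

```r
# hypothetical data frame with two columns of different types
df_demo <- data.frame(temp = c(20, 15, 13),
                      label = c("a", "b", "c"),
                      stringsAsFactors = FALSE)

# a data frame passes the list check
is_list <- is.list(df_demo)
is_list

# its length is the number of columns (list elements)
n_elements <- length(df_demo)
n_elements

# list accessors work on columns
temp_col <- df_demo[["temp"]]
identical(temp_col, df_demo$temp)

# lapply() walks over the columns and returns a list
col_classes <- lapply(df_demo, class)
col_classes
```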

challenge

Is it possible to create a data frame from vectors of different lengths?

data.frame(time=as.Date("2023-04-16",format="%Y-%m-%d")+seq(1,3,1), 
           temp=c(20,13))

5.1 extract columns from data frame

You can subset a data frame by indexing with [row,column].

dataframe[,column] selects all rows of the selected column(s).

dataframe[row,] selects all columns of the selected row(s).

Select multiple rows or columns by putting a logical or numeric vector in the square brackets.
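
The indexing rules above can be tried on a small example. The data frame plots and its values are hypothetical and are used here instead of df.

```r
# hypothetical yield data for three plots
plots <- data.frame(plot = c("a", "b", "c"),
                    yield = c(5.1, 6.3, 4.8),
                    stringsAsFactors = FALSE)

# [,column]: all rows of one column, returned as a vector
yield_col <- plots[, "yield"]
yield_col

# [row,]: all columns of one row, returned as a data frame
second_row <- plots[2, ]
second_row

# logical row condition combined with a column name
yield_b <- plots[plots$plot == "b", "yield"]
yield_b

# logical vector selecting several rows
rows_1_3 <- plots[c(TRUE, FALSE, TRUE), ]
rows_1_3
```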

challenge

Using df:

Access column thermal_time as a vector.
Extract temp when time is 2023-04-17.
Extract the first row and the first column with [1,] and [,1].

If you want to turn a data frame (df) by 90 degrees (“transpose”), which function can you use? Can you find the answer on Google or ChatGPT?
