- misc
- R is case sensitive.
- Variable names can contain alphanumeric characters, underscore (_) and period (.).
- `#` is used to add comments, so everything after `#` is ignored by R.
- Each command can be terminated by a semicolon (;) or a new line.
- The number of spaces doesn't matter.
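
The rules above can be tried in a short session. This is only a sketch; the variable names are made up for illustration.

```r
# Names are case sensitive: 'total' and 'Total' are two different variables
total <- 10
Total <- 20

# Names may contain letters, digits, underscore and period
my.value_2 <- total + Total   # everything after '#' is ignored by R

# Two commands on one line, separated by a semicolon; extra spaces are harmless
a <- 1;   b   <-   2
a + b
```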

- Assignment of **variables (objects)** and expression

Basic commands can be *assignment* or *expression*.

> d <- 6.2 # Assignment
> d # Expression: display the value assigned to variable d
> 6.2 -> d # another way of assignment

Once you assign values to variables, R will remember the values in memory. But I bet that your memory isn't as good as R's. When you forget what variables you have defined, type:

> ls() # list

If you want R to forget the definition of variables, type:

> rm(d) # remove d
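
A minimal sketch of the assign, check, and remove cycle. The name `tmp.value` is hypothetical, and `exists()` is used here only to confirm what `rm()` did.

```r
# 'tmp.value' is a made-up variable name used only for this demonstration
tmp.value <- 6.2
was.defined <- exists("tmp.value")     # TRUE: R remembers the variable

rm(tmp.value)                          # ask R to forget it
still.defined <- exists("tmp.value")   # FALSE: the variable is gone
```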

- Getting help

R comes with great online documentation. Since R is extremely feature rich, nobody can memorize all the details, so it's essential to know how to find the documentation.

> help("rm")
> help.search("bootstrap") # keyword search
> help.start()

- Vectors

`c` means concatenate.

> d <- c(2.5, 2, 4, 5) # a vector of 4 elements
> d
> e <- c(1,1,1) # a vector of 3 elements
> e
> f <- c(d, e) # f formed by concatenating d and e, 7 elements
> f
> f[4] # extract the 4th element
> g <- f[3:6] # create a new vector with the 3rd to 6th elements

- `()` is used to indicate *functions* or arithmetic groupings. `[]` is used to indicate the *indices* of a vector.

Other ways to create vectors

> a <- rep(2, 10)
> a
> a <- seq(3, 9)
> a
> 1:10
> seq(3, 4, 0.1)
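
The following sketch collects a few more ways to index and build vectors. The vector `v` is a made-up example; `by` and `length.out` are standard arguments of `seq()`.

```r
v <- c(10, 20, 30, 40, 50)   # hypothetical example vector

v[2]            # second element: 20
v[c(1, 5)]      # first and fifth elements: 10 50
v[2:4]          # elements 2 through 4: 20 30 40
length(v)       # number of elements: 5

rep(c(1, 2), 3)             # repeat the whole vector 3 times: 1 2 1 2 1 2
seq(0, 1, by = 0.25)        # fixed step size
seq(0, 1, length.out = 3)   # fixed number of points
```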

- Arithmetic operations
> 5.1 / 3 + 2 * ((3 + 4.1)^2 - 5)
> d2 - 3

Operations on vectors: vectorized operations are one of the quirks/strengths of R. Operations are performed element by element.

> g <- c(1,2,3)
> g + e
> e / g
> g + 2
> g * (e + 1)
> 1:9 / g
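
When the two vectors differ in length, the shorter one is reused from its start. A small sketch with made-up vectors `short` and `long` shows the effect; the last line writes the reuse out explicitly with `rep()`.

```r
short <- c(1, 2, 3)   # hypothetical example vectors
long  <- 1:6

short + 10             # 11 12 13: the single value 10 is reused for every element
long * short           # 1 4 9 4 10 18: 'short' is reused as 1 2 3 1 2 3
long * rep(short, 2)   # the same calculation with the reuse written out
```

If the longer length is not a multiple of the shorter one, R still computes a result but prints a warning.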

Note that the two vectors can have different lengths. The shorter vector is *recycled* as often as needed.

- Math functions

Most of these functions are applied to *each* element; `sum()`, `prod()` and `mean()` instead summarize the whole vector as a single value.

> a <- c(1, 2, 3, 4, 5, 6)
> sqrt(a)
> exp(a)
> log(a)
> a^2
> sum(a)
> prod(a)
> mean(a)
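
The difference between element-wise functions and summary functions can be seen in a small sketch. The vector `vals` is a made-up example chosen so the results are easy to check by hand.

```r
vals <- c(1, 4, 9, 16)   # hypothetical example vector

sqrt(vals)     # element-wise, same length as vals: 1 2 3 4
vals^2         # element-wise: 1 16 81 256
sum(vals)      # one value for the whole vector: 30
prod(vals)     # one value: 576
mean(vals)     # one value: 7.5
```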

- Types of vectors

**Numeric vectors/variables:**
- Examples: c(1.0, 2.1, -0.3), 2:10

**Character vectors/variables:**
- Each element is a character string

> pets <- c(2,11,4)
> names(pets) <- c("cat","fish","shrimp")
> num.fur.balls <- pets["cat"]
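
In the example above, the names are a character vector attached to a numeric vector. A short sketch of how such names can be used for lookup, reusing the same `pets` data:

```r
pets <- c(2, 11, 4)
names(pets) <- c("cat", "fish", "shrimp")

pets["fish"]               # one element, selected by name
pets[c("cat", "shrimp")]   # several elements, selected by name
names(pets)                # the character vector of names
names(pets)[pets > 3]      # names of the entries larger than 3: "fish" "shrimp"
```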

**Logical vectors/variables:**
- Contains TRUE or FALSE
- R allows manipulation of logical quantities: TRUE or FALSE.
- Additionally, a logical vector can take NA (not available) for missing data.
- Generally, logical vectors are created by *conditions*
- Logical operators: `<, <=, >, >=, ==, !=`

> a <- 1:5
> t.or.f <- c(T, F, F, F, T)
> gt3 <- a > 3
> even <- a %% 2 == 0
> a[t.or.f]
> b <- c(0.5, 0.2, NA, 0.1)
> b <- b[! is.na(b)] # eliminate the missing data
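
A sketch of how logical vectors are typically used: a condition produces TRUE/FALSE values, and those values select elements. The data mirror the example above.

```r
a <- 1:5
gt3 <- a > 3                 # FALSE FALSE FALSE TRUE TRUE
a[gt3]                       # keep the elements where gt3 is TRUE: 4 5
a[a %% 2 == 0]               # even values: 2 4
sum(gt3)                     # TRUE counts as 1, so this counts the matches: 2

b <- c(0.5, 0.2, NA, 0.1)
is.na(b)                     # FALSE FALSE TRUE FALSE
mean(b)                      # NA, because one value is missing
mean(b[!is.na(b)])           # mean of the non-missing values only
```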

- Data frames

- Data frames contain rows and columns, similar to spread-sheets.
> x <- c(1, 2, 3, 4)
> y <- c(5, 6, 7, 8)
> z <- c(9, 10, 11, 12)
> dat <- data.frame(x, y, z) # creates a data frame with 4 rows and 3 columns
> dat
> names(dat) # each column has a name
> names(dat) <- c("c1", "c2", "z") # changing the column names

- Extraction of elements
> dat[2,3] # element of row 2, column 3
> dat[2, "z"] # same thing, but using the column name
> dat[1,] # first row
> dat[,2] # 2nd column
> dat[2:4,c(1,3)] # subset, rows 2-4 and columns 1 & 3
> dat$c1 # extracting the c1 column by name
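
The different extraction forms return different kinds of objects, which the sketch below makes visible. The data frame `people` and its column names are hypothetical.

```r
# Hypothetical data frame used only for this illustration
people <- data.frame(id     = 1:4,
                     height = c(1.5, 1.8, 1.6, 1.7),
                     weight = c(50, 80, 60, 70))

people[2, 3]             # single value at row 2, column 3: 80
people[2, "weight"]      # same value, column chosen by name
people[1, ]              # one row: still a data frame
people[, 2]              # one column: a plain numeric vector
people$height            # same column, chosen by name
people[2:4, c(1, 3)]     # 3 rows and 2 columns: a smaller data frame
```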

- Attach/detach

Frequently, you need to access the columns of data frames. `attach()` will make the column names visible temporarily.

> attach(dat)
> newVect <- c1 + c2 # exactly same as newVect <- dat$c1 + dat$c2
> z <- c1 * c2 # Note dat$z is not changed
> dat$z <- c1 * c2 # This changes dat$z.
> dat$modded <- z + c2 # This will add a new column with name "modded"
> detach(dat) # Stop the attach
> c1 # error: c1 is no longer visible after detach()
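
A minimal sketch of the point that assigning to an attached column name does not modify the data frame. The data frame `scores` is hypothetical. The last line uses `with()`, a base R function that evaluates an expression with the columns visible and needs no `detach()`; it is shown as an alternative, not as part of the original example.

```r
# Hypothetical data frame used only for this illustration
scores <- data.frame(math = c(70, 80, 90), art = c(65, 85, 75))

attach(scores)
total <- math + art      # columns are visible by name while attached
math <- math * 2         # creates a separate variable 'math'
detach(scores)

scores$math              # the column in the data frame is unchanged: 70 80 90
math                     # the separate variable: 140 160 180

# The same sum without attach(), using with()
total2 <- with(scores, math + art)
```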

- Manipulating data frames
`cbind()`: column bind, combine data frames or vectors by columns.

`rbind()`: row bind, combine data frames or vectors by rows.

> dim(dat) # shows the number of rows and columns
> a <- c(13,14,15,16)
> dat <- cbind(dat, a) # add a as a new column
> dat
> rbind(dat, dat[3:4,]) # extract the 3rd-4th rows and attach them at the end
> dat[dat[,1] > 2,] # select the rows whose 1st column > 2

For the comparison, you can use `>, <, >=, <=, ==, !=`.

Also you can use `&` (and), `|` (or), `!` (not) to make logical conditions.

- Getting information about variables/objects
> dim(dat) # dimension of the object
> ncol(dat) # number of columns
> nrow(dat) # number of rows
> length(a) # length of a vector
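
Binding, row selection with combined conditions, and the size queries can be put together in one sketch. The data frame `tbl` and its column names are made up for illustration.

```r
# Hypothetical data frame used only for this illustration
tbl <- data.frame(c1 = c(1, 2, 3, 4), c2 = c(5, 6, 7, 8))
tbl <- cbind(tbl, c3 = c(9, 10, 11, 12))   # now 4 rows, 3 columns
more <- rbind(tbl, tbl[1:2, ])             # first two rows appended: 6 rows

both   <- tbl[tbl$c1 > 1 & tbl$c2 < 8, ]     # both conditions hold: c1 is 2, 3
either <- tbl[tbl$c1 == 1 | tbl$c1 == 4, ]   # at least one holds: c1 is 1, 4
negate <- tbl[!(tbl$c1 > 2), ]               # condition does not hold: c1 is 1, 2

dim(more)    # rows first, then columns: 6 3
nrow(both)   # 2
ncol(tbl)    # 3
```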

- Importing data

If you have data stored in a spreadsheet format, export the data in tab-delimited text format (any other decent text format, such as comma-separated text, also works). If you want to try this, download an example data file here

> dat.in <- read.table("data.txt", header=T, sep="\t")

- Use `header=T` if the 1st row of the text file contains the column names.
- Use `header=F` if not. In both cases `read.table()` returns a data frame.
- You may need to specify the full path:
> dat.in <- read.table("/Users/naoki/doc/analysis/data.txt", header=T)

Or use `setwd()` to set the current working directory of the R process.

> getwd() # print out the current working directory
> setwd("/Users/naoki/doc/analysis/")
> getwd()
> dat.in <- read.table("data.txt", header=T)
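
To try the import step without an existing file, the sketch below first writes a tiny tab-delimited file to a temporary location and then reads it back. The file contents and column names are invented for this illustration; `tempfile()`, `writeLines()` and `unlink()` are base R functions.

```r
# Write a small tab-delimited file to a temporary location
# (contents and column names are made up for this illustration)
tmp.file <- tempfile(fileext = ".txt")
writeLines(c("name\tage\theight",
             "ann\t31\t1.62",
             "bob\t45\t1.80"), tmp.file)

# Read it back; the first line supplies the column names
dat.in <- read.table(tmp.file, header = TRUE, sep = "\t")
dat.in
str(dat.in)          # shows the column types chosen by read.table()

unlink(tmp.file)     # delete the temporary file
```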

- Other types of objects

- **matrices** or more generally **arrays**: multi-dimensional generalizations of vectors.
- **factors**: handle categorical data (e.g. sex: female, male, or hermaphrodite).
- **lists**: a general form of vector in which the various elements need not be of the same type.
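
A brief sketch of each of these object types. All names and values are hypothetical.

```r
# Matrix: a 2 x 3 table of numbers, filled column by column
m <- matrix(1:6, nrow = 2)
m[2, 3]                      # row 2, column 3: 6
dim(m)                       # 2 3

# Factor: categorical data with a fixed set of levels
sex <- factor(c("female", "male", "female", "hermaphrodite"))
levels(sex)                  # "female" "hermaphrodite" "male"
table(sex)                   # counts per category

# List: elements may have different types and lengths
rec <- list(name = "cat", count = 2, weights = c(3.1, 4.0))
rec$name                     # "cat"
rec[["weights"]][2]          # 4
```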

- More advanced, but useful functions for data manipulations
`apply(), lapply(), sapply(), tapply()`

`is.na(), any(), all()`
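
A short sketch of what these functions do, on made-up data. It is meant as a first orientation, not a full description of their options.

```r
m <- matrix(1:6, nrow = 2)       # hypothetical 2 x 3 matrix
apply(m, 1, sum)                 # apply sum to each row: 9 12
apply(m, 2, sum)                 # apply sum to each column: 3 7 11

lst <- list(a = 1:3, b = c(10, 20))
lapply(lst, mean)                # result is a list
sapply(lst, mean)                # result simplified to a named vector: a = 2, b = 15

weight <- c(50, 80, 60, 70)
group  <- c("x", "y", "x", "y")
tapply(weight, group, mean)      # mean weight per group: x = 55, y = 75

any(is.na(c(1, NA, 3)))          # TRUE: at least one value is missing
all(c(1, 2, 3) > 0)              # TRUE: every value is positive
```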