# Objectives

Introduce the following concepts:

- Object-orientedness
- Vectors
- Functions

# Working through the script

## Object-oriented

\({\bf\textsf{R}}\) is “object-oriented”, which here means that values can be stored in named objects and then referred to by name. We have two options for defining objects in a script:

- `<-` assigns the result of the operation on the right to the named object on the left.
- `=` will do the same, and is shorter.

For programming purposes I like how the arrow in `<-` shows the right side moving into the left, especially when the right side has many entries or is large.
I tend to reserve `=` for defining variables or long file paths in my script, and use `<-` when creating data objects or storing statistical results.

Note that \({\bf\textsf{R}}\) creates objects on the fly and does not need them to be declared before they are used, as `C++` does.
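As a minimal sketch of the two assignment operators (the object names `n.reps` and `total` are made up for illustration), both create an object the moment the line runs, with no prior declaration:

```r
# Both operators create the object on the fly; the names are hypothetical
n.reps = 4             # `=` assigns the value 4 to n.reps
total <- n.reps * 10   # `<-` assigns the result of the right-hand side to total
total
## [1] 40
```

Either operator would work on either line; which one to use where is a matter of style.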

`answer <- 2+(2*20)`

Note that you should now see `answer` in the global environment pane of RStudio.

Calling the object will print its content in the console:

`answer`

`## [1] 42`

This object can now be used for additional operations…

`answer*2`

`## [1] 84`

…and the creation of new objects:

```
new.answer <- answer*2
new.answer
```

`## [1] 84`

## Functions

Functions are a special type of \({\bf\textsf{R}}\) object that, instead of containing data, contains a series of operations. Functions are essentially shortcuts for common sets of operations.

For example, researchers often want to find the mean of data. Say we have the following five observations:

`24`, `13`, `12`, `22`, and `15`.

The arithmetic mean is defined as the sum of the observations divided by the number of observations, which in \({\bf\textsf{R}}\) looks like:

`(24 + 13 + 12 + 22 + 15) / 5 `

`## [1] 17.2`

Alternatively, we can assign the data to an object using the `c` function, which stands for *concatenate*.
It joins everything between the parentheses, separated by commas, into a *vector* that we’ll call `data`:

```
data <- c(24, 13, 12, 22, 15)
data
```

`## [1] 24 13 12 22 15`

To find the mean of `data`, one might first think we can simply divide the object by 5…

`data / 5`

`## [1] 4.8 2.6 2.4 4.4 3.0`

…but this is obviously incorrect.
Here, \({\bf\textsf{R}}\) has applied the “divide by five” operation to each value in the vector.
This is an example of how \({\bf\textsf{R}}\) is *vectorized*: it is designed to perform its operations along vectors.
Although it will be a while before you feed \({\bf\textsf{R}}\) large enough datasets to notice the difference, vectorization optimizes performance and makes \({\bf\textsf{R}}\) computations quick.
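Vectorization applies to other operators as well. The lines below are an illustrative sketch rather than part of the original script. They reuse the `data` vector defined above, and the second vector `c(1, 2, 3, 4, 5)` is made up for the example:

```r
data <- c(24, 13, 12, 22, 15)

# A single value is applied to every element of the vector
data * 2
## [1] 48 26 24 44 30

# Two vectors of equal length are combined element by element
data + c(1, 2, 3, 4, 5)
## [1] 25 15 15 26 20
```

In each case \({\bf\textsf{R}}\) returns a vector of the same length as `data`, not a single number.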

Calculating the mean is a two-step process, and we need to define both.
Thus, we must first find the sum of the data, for which we can use the shortcut function `sum`:

`sum(data)`

`## [1] 86`

Then we divide the sum by 5 to calculate the mean:

`sum(data) / 5 `

`## [1] 17.2`

This is an example of *hard-coding*: we’ve specified the divisor in this operation as a fixed value (5).
But what if the value varies? Say your technician (definitely not you!) inadvertently lost or failed to enter some data, and a given set of replicates does not have the number of observations you expect.
Hard-coding your count creates problems:

```
data2 <- c(24, 13, 12, 22)
sum(data2) / 5
```

`## [1] 14.2`

The calculated mean is too low, because our hard-coded operation divided the sum of four observations by five.

It is preferable to have \({\bf\textsf{R}}\) determine the count for each operation, so if counts differ, \({\bf\textsf{R}}\) can automatically account for it.

We can use the `length` function to determine how many observations are in the set:

`length(data)`

`## [1] 5`

If *length* sounds odd, remember `data` is a vector composed of individual values.
The number of entries determines how long the vector is, and so length is a convenient way to count the number of observations.
This is a core concept in \({\bf\textsf{R}}\) that we will return to frequently.

Let’s see how this combination of functions performs:

`sum(data) / length(data)`

`## [1] 17.2`

`length(data2)`

`## [1] 4`

`sum(data2) / length(data2)`

`## [1] 17.75`

Of course, calculating the mean of a vector is a very common operation, and \({\bf\textsf{R}}\) has a built-in function that combines the `sum`, `length`, and `/` operations into one shortcut:

`mean(data)`

`## [1] 17.2`

### Custom functions

\({\bf\textsf{R}}\) has a lot of functions built in, and thousands of packages supply additional functions. But one often still encounters a situation where one’s life, or at least one’s script, is made simpler with a custom function.

Writing your own functions is easy.
They are a special type of object in \({\bf\textsf{R}}\) that can be defined and added to the global environment.
The `function()` function helps create them: one simply lists the arguments between the `( )` and specifies the operation between curly brackets `{ }`.

Even though \({\bf\textsf{R}}\) already has `mean()`, let’s make our own alternative, called `Meaner()`:

`Meaner <- function(x) { sum(x) / length(x) }`

We can type its name without parentheses to see what is stored in the object:

`Meaner `

`## function(x) { sum(x) / length(x) }`

Then we can call it on our data:

`Meaner(data)`

`## [1] 17.2`

Our custom `Meaner()` function performs the same as the base `mean()`.
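One way to check that claim, sketched here with the `data` and `data2` vectors from earlier, is to compare the two functions directly. `all.equal()` is a base \({\bf\textsf{R}}\) function that tests for equality within a small numerical tolerance:

```r
Meaner <- function(x) { sum(x) / length(x) }

data  <- c(24, 13, 12, 22, 15)
data2 <- c(24, 13, 12, 22)

# Because the count comes from length(), the shorter vector is handled correctly
Meaner(data2)
## [1] 17.75

# Compare the custom function against the built-in one
all.equal(Meaner(data), mean(data))
## [1] TRUE
```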

Let’s make it truly custom, and add a little excitement to the operation:

```
Meaner <- function(x) {
  m = sum(x) / length(x)
  m1 = paste(m, "!", sep="")
  return(m1)
}
Meaner(data)
```

`## [1] "17.2!"`

Notice how the function created two objects, `m` and `m1`, that were not added to the global environment; they existed only while the function was running.
`return()` specified which value should be handed back to \({\bf\textsf{R}}\) when the operation was complete.
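To keep the returned value, assign the function’s output to an object. The sketch below assumes a fresh session in which no object named `m1` has been created. The name `result` is made up for illustration:

```r
Meaner <- function(x) {
  m = sum(x) / length(x)
  m1 = paste(m, "!", sep = "")
  return(m1)
}

# Store what the function returns in a new object
result <- Meaner(c(24, 13, 12, 22, 15))
result
## [1] "17.2!"

# List the objects in the current environment and check whether m1 is among them;
# in a fresh session this should be FALSE, because m1 only existed inside the function
"m1" %in% ls()
```

Note that `result` is a character string, not a number, because `paste()` converts its inputs to text.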

## Comments

Note that anything preceded by `#` on a line is ignored by \({\bf\textsf{R}}\). We call it a comment operator, and it is useful for adding explanation to the script.

At a very basic level \({\bf\textsf{R}}\) is a fancy calculator. It will chug through arithmetic operations:
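The expressions below are illustrative rather than taken from the original script. They sketch the basic arithmetic operators alongside the comment operator:

```r
# This whole line is a comment and is ignored
2 + 3    # addition; everything after the # is ignored too
## [1] 5
10 - 4   # subtraction
## [1] 6
6 * 7    # multiplication
## [1] 42
9 / 2    # division
## [1] 4.5
```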

\({\bf\textsf{R}}\) follows proper order of operations, including parentheses:
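As a small sketch, again with made-up expressions rather than ones from the original script, the same numbers give different results depending on where the parentheses sit:

```r
2 + 2 * 20     # multiplication is done before addition
## [1] 42
(2 + 2) * 20   # parentheses are evaluated first
## [1] 80
2 * 3^2        # exponents are evaluated before multiplication
## [1] 18
```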