IntroRangeR
/
Recent content on IntroRangeR
Hugo -- gohugo.io
en-us
Tue, 20 Oct 2020 00:00:00 +0000
Intermediate R markdown
/post/intermediate-r-markdown/
Tue, 20 Oct 2020 00:00:00 +0000
Objective Materials How-to video Template & example files Add a bibliography The yaml header The .csl file The .bib file Compile new Create from zotero Add script appendix Knit to .html Publishing Alternative formats
Objective
Step up your \({\bf\textsf{R}}\) markdown game by creating polished documents:
Adding in-text citations and a bibliography that are compiled and formatted while knitting
Hide code chunks from main body of text but include at end as an appendix
Knit .
Introduction to confidence intervals
/post/intro-conf-ints/
Tue, 06 Oct 2020 00:00:00 +0000
Hopefully this space will soon feature a blog post on confidence intervals.
About
/about/
Mon, 31 Aug 2020 00:00:00 +0000
Course objectives Required software Introductory video What do I know about \({\bf\textsf{R}}\)??
This blog supports my graduate-level Introduction to R seminar at North Dakota State University. Fork or bookmark the GitHub project, IntroRangeR, to keep up to date with course materials. If you’re officially enrolled, homework assignments are submitted on Blackboard.
The syllabus lives on Google Drive.
Course objectives
After working through this material, one should have a better-than-beginner proficiency in \({\bf\textsf{R}}\) for the purposes of data management, analysis, and presentation.
Introducing multivariate data
/post/intro-mv/
Sun, 23 Aug 2020 00:00:00 +0000
Objective Multivariate data Multivariate analyses Finding patterns The distance matrix Ordination The latent variable post-hoc statistical tests
Objective
Multivariate data differ from the data we’ve used so far, so multivariate analyses differ as well. This post introduces these differences ahead of the lessons on multivariate analyses.
Multivariate data
The primary difference between multivariate and univariate data is the number of response variables.
Lesson 10 | GLMs & GLMMs
/post/glm-glmm/
Sun, 23 Aug 2020 00:00:00 +0000
Materials Data Script Video lecture Walking through the script Packages & data Analyzing count data Fitting a GLM The car package Analyzing non-independent data Fitting a GLMM summary on glmer object Analysis of Deviance on glmer object Homework
Materials
Data
This lesson uses those same data on arrests around athletics venues in Cleveland, Ohio:
AllClevelandCrimeData.csv
Script
GLM_GLMM.R
Video lecture
Video on YouTube
Lesson 11.1 | Cluster analysis
/post/cluster-analysis/
Sun, 23 Aug 2020 00:00:00 +0000
Objectives Materials Script Video lecture Walking through the script The vegan package Calculating a distance matrix Fitting cluster diagrams k-means clustering vegan::cascadeKM Clustering on scaled data Homework assignment
Objectives
Introduce the vegan package, which includes the primary tools most ecologists use for multivariate analysis in \({\bf\textsf{R}}\)
Introduce clustering as a form of multivariate data analysis
Fit cluster diagrams
Identify known groups in cluster diagrams
Find unknown groups via k-means clustering
Materials
Script
clustering.
Lesson 11.2.1 | Fitting & plotting ordinations
/post/intro-ordinations/
Sun, 23 Aug 2020 00:00:00 +0000
Objectives Materials Script Video lecture Walking through the script Packages and data set-up Fitting an ordination Biplot Interpreting scores Assessment Ordination vs. clustering Scaled data Scale & fit Eigenvalues Great power = great responsibility Homework assignment
Objectives
This lesson introduces ordination as a form of multivariate analysis and covers several relevant vegan functions.
Fitting an ordination object with capscale
Graphing and interpreting the biplot
Assessing the solution via eigenvalues and scree plots
Extracting and interpreting site and species scores
Understanding the impact of scaling on ordination results
Materials
Remember, if you aren’t familiar with multivariate data analysis, be sure to go over this introductory blog post.
Lesson 11.2.2 | Ordination groups & gradients
/post/ord-groups-gradients/
Sun, 23 Aug 2020 00:00:00 +0000
Objectives Materials Script Video lecture Walking through the script Packages and data Plot by known groups ordihull ordiellipse ordispider Testing groups Gradients Vectors Testing Smoothed surfaces Homework assignment
Objectives
This second part of the introduction to ordinations focuses on showing a priori groups and environmental gradients in ordination graphics and testing them with vegan functions.
Categorical variables:
  Fit and plot groups with ordihull, ordiellipse, and ordispider
  Test groups with envfit and RVAideMemoire
Continuous variables (environmental gradients):
  Fit, plot, and test vectors with envfit
  Fit, plot, and test non-linear response surfaces with ordisurf
Materials
This lesson follows Lesson 11.
Lesson 11.3 | Pretty ordination plots
/post/pretty-ords/
Sun, 23 Aug 2020 00:00:00 +0000
Materials Script Video lecture Walking through the script GGplotting cluster analysis GGplotting ordinations Getting set up Load extension package for ordinations in ggplot Homework assignment
Materials
Script
IntroMultivariateGGplotting.R
Bonus script: IntroMultivariateDissectOrdObjects.R
Video lecture
Video on YouTube
Walking through the script
Remember, only run install.packages once per package, and do not include it in a .Rmd file.
install.packages("gridExtra")
install.
Lesson 12.1.1 | Introduction to mapping
/post/intro-mapping/
Sun, 23 Aug 2020 00:00:00 +0000
Materials Script Video lecture Walking through the script Packages Basic ggplot maps maps data Adding features Coordinate Reference Systems The sf package Geometry geom_sf CRS Natural Earth data Homework assignment
This lesson is broken into two parts, with a single homework assignment. This post covers Part 1: Basic mapping with ggplot2
Materials
Script
IntroMapping.R
Video lecture
Video on YouTube
Walking through the script
Packages
Handling spatial data
Lesson 12.1.2 | Adding map features
/post/map-features/
Sun, 23 Aug 2020 00:00:00 +0000
Materials Script Video lecture Walking through the script Packages and data Modifying layers Dissolve boundaries Add feature information Create and plot new data column Add features from other data sources Download and open files from the Web Add features with different CRS Cropping via intersect Associating spatial and non-spatial data Homework assignment
This second part of the introduction to mapping in \({\bf\textsf{R}}\) covers adding features to maps with ggplot2 and sf.
Lesson 12.2 | Spatial interpolation
/post/interpolation-lesson/
Sun, 23 Aug 2020 00:00:00 +0000
Overview Materials Script Video lecture Walking through the script Preparations Setup Load Data Geometries Point Data Cities Preprocess Data Clip Point Data to Buffered Germany Make Regular Grid Prepare input data Run KNN interpolation procedure Visualize Summarize data Homework assignment
Overview
This lesson adapts a great blog post from Timo Grossenbacher on spatial interpolation into a lecture for IntroRangeR.
This exercise creates a map like this (Timo’s original)…
Lesson 9 | Time series & count data
/post/time-series-count/
Sat, 22 Aug 2020 00:00:00 +0000
Materials Data Script Video lecture Walking through the script Bar plots for counts Game day vs. non-game day Stop the shouting! Time series Create a timestamp Homework assignment
Materials
Data
This lesson uses data on arrests around athletics venues in Cleveland, Ohio to address questions of the effects of sporting events on the frequency, timing, and type of crimes.
AllClevelandCrimeData.csv
Menaker, BE, DA McGranahan & RD Sheptak Jr.
Concerns with F-tests on multiple regression models in R
/post/f-test-error-types/
Thu, 20 Aug 2020 00:00:00 +0000
This post is pending
anova(mr_lm)
car::Anova(mr_lm, type = 2)
We can see from the \(P\) values that hp and wt are significant terms, whereas drat is not.
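As a minimal sketch of the two calls above: the model behind mr_lm is not shown in this excerpt, so the formula below (mpg regressed on hp, wt, and drat from the built-in mtcars data) is an assumption made for illustration. anova() reports sequential (Type I) tests, so the sum of squares credited to a term depends on where it sits in the formula, whereas car::Anova(type = 2) does not depend on term order. The car call is guarded because that package may not be installed.

```r
# Assumption: the post's model is not shown here, so this stand-in regresses
# mpg on hp, wt, and drat from the built-in mtcars data.
mr_lm <- lm(mpg ~ hp + wt + drat, data = mtcars)

# Sequential (Type I) table: each term is tested after the terms listed before it
seq_tab <- anova(mr_lm)
print(seq_tab)

# Same predictors with hp entered last: the fit is identical, but the
# sequential sums of squares are credited to the terms differently
mr_lm_reordered <- lm(mpg ~ wt + drat + hp, data = mtcars)
reord_tab <- anova(mr_lm_reordered)
print(reord_tab)

# Type II tests do not depend on term order (needs the car package)
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(mr_lm, type = 2))
}
```

Comparing the hp rows of the two sequential tables shows how much of an F value can come from term order alone.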
But remember that \(P\) values are derived from \(F\) statistics, and the \(F\) values in this ANOVA table are interesting. Note that the \(F\) value for hp is an order of magnitude greater than that of wt. Does that mean that it is an order of magnitude more important?
Multiple alternative hypothesis testing
/post/multiple-alternative-hypothesis-testing/
Thu, 20 Aug 2020 00:00:00 +0000
A blog post on comparing models based on multiple competing hypotheses will someday appear here. In the meantime, the Analysis of Ecosystems project folder has script for AIC-based model selection.
Lesson 8.2 | Linear regression
/post/linear-regression/
Fri, 14 Aug 2020 00:00:00 +0000
Overview Objectives Materials Lecture Script Walking through the script Linear regression Fitting multiple regression models Assessing model results Significance testing Determining effect sizes Confidence intervals Visualization Continuous + categorical variables Interaction terms
Overview
To conduct a linear regression is to test for a linear relationship between two continuous variables: as one variable changes, does the other respond in a predictable manner? The linear model sorts out how much of the change in the response variable can be attributed to change in the predictor variable relative to random noise.
How ANOVA is linear regression
/post/anova-is-regression/
Thu, 13 Aug 2020 00:00:00 +0000
The statistics of a line Linear regression on continuous data Fitting lines to categorical data Comparing two groups t-statistics vs. F statistics Comparing three (or more) groups Post-hoc Tukey tests
This post illustrates how Analysis of Variance – ANOVA, used for testing for differences among groups – is a special case of linear regression. Along the way, we parse the various components of results from statistical tests in \({\bf\textsf{R}}\) and illustrate post-hoc pairwise tests using TukeyHSD().
Lesson 8.1 | Basic statistical analysis
/post/basic-stats/
Wed, 12 Aug 2020 00:00:00 +0000
Overview Objectives Materials Lecture Script Walking through the script Getting started Research question & hypotheses Checking distribution t-tests Analysis of variance Multiple comparisons
Overview
\({\bf\textsf{R}}\) shines at performing statistical analyses. Lessons 8.1 & 8.2 cover basic functions for fitting common statistical models and retrieving their results. These lessons do not cover questions about when a given statistical model is appropriate, or how to interpret the results.
Reflections on distributions
/post/reflections-on-distributions/
Wed, 12 Aug 2020 00:00:00 +0000
Always consider the biological or ecological implications of your statistical assumptions.
In the introductory lesson on data distributions, we used data from this paper on raptor responses to prescribed fire to demonstrate how using the gamma distribution produces evidence that Swainson’s Hawks are attracted to prescribed fires, but using the normal distribution does not indicate a non-zero effect.
Although I did the statistics for that paper, I have mixed thoughts on whether I took the proper approach.
Lesson 7.1 | Intro data distributions
/post/data-distributions1/
Tue, 11 Aug 2020 00:00:00 +0000
Overview Objectives Materials Script Lecture Walking through the script Some basics of statistics Plotting distributions Histogram Kernel density estimate Moments Assessing fit Improving fit Transformations
Overview
An essential first step before conducting statistical analysis of data – even before deciding which statistical analysis to conduct – is understanding the distribution of those data.
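A quick first look at a distribution might be sketched as follows. The airquality data set ships with R and is used here only as a stand-in; the lesson's own data and script are not shown in this excerpt, and the object names ozone and skew are made up for illustration.

```r
# Stand-in data: ozone readings from the built-in airquality data set,
# with missing values dropped
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# Histogram on the density scale, with a kernel density estimate overlaid
hist(ozone, freq = FALSE, main = "Ozone", xlab = "Ozone (ppb)")
lines(density(ozone))

# A simple moment-based skewness estimate:
# third central moment divided by the cubed standard deviation
skew <- mean((ozone - mean(ozone))^3) / sd(ozone)^3
skew  # positive values indicate a longer right tail
```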
Many statistical models have one or more assumptions about the nature of the data being tested, and several assumptions refer to the distribution.
Lesson 7.2 | The Gamma distribution
/post/gamma-dist/
Tue, 11 Aug 2020 00:00:00 +0000
Overview Objectives Materials Script Lecture Walking through the script Alternative distributions Using r*dist functions Confidence intervals
Overview
An essential first step before conducting statistical analysis of data – even before deciding which statistical analysis to conduct – is understanding the distribution of those data. It is critical to first understand the type of data, and then the distribution of those data, before selecting a statistical model to ensure that an appropriate test is performed.
Lesson 6 | More tidyverse
/post/more-tidyverse/
Thu, 06 Aug 2020 00:00:00 +0000
Objectives Materials Lecture Script Data Walking through the script Load data Modifying data with tidyverse Creating columns Wide vs long formats Manipulating string variables Consolidate into one object Homework
Objectives
Access data directly from Excel workbooks via readxl
Combine familiar tidyverse verbs like mutate() and full_join() with new functions to manipulate data:
  Combine (and split) identifying columns with unite() and separate()
  Switch between long and wide data formats with pivot_ verbs
  Split character strings into individual observations with str_split()
Materials
Lecture
Video on YouTube
Lesson 5 | More ggplot
/post/more-ggplot/
Tue, 04 Aug 2020 00:00:00 +0000
Objectives Materials Lecture Script Walking through the script Boxplots geom_jitter geom_violin Plotting error bars geom_errorbar Additional variables position_dodge Show data while emphasizing trends Homework
Objectives
Practice ggplot basics
Learn new geoms to explore patterns within data
  geom_boxplot
  geom_violin
  geom_jitter
Clearly plot group means and variance with geom_errorbar and position_dodge
Connect repeated measures data through time with geom_line and aes(group=)
De-emphasize plot elements with alpha=
Materials
Lecture
Video on YouTube
Lesson 4 | Introduction to ggplot2
/post/intro-ggplot2/
Sun, 02 Aug 2020 00:00:00 +0000
Objectives Online resources ggplot2 how-to pages General graphing resources Materials Lecture Script Walking through the script Package loading Data loading Introducing ggplot Learning geom Component placement Adding information Adding variables to aes() Using facets Customizing ggplot appearance scale_ settings Adjusting themes Homework
Objectives
The script and exercises for this week introduce new users to the ggplot2 package:
Core components of the basic ggplot() call
Adding information one variable at a time via aes() and facet_*
Formatting and themes
Online resources
Here are some helpful resources for learning how to use ggplot and just thinking about graphing, in general.
Script writing style
/post/script-writing-style/
Fri, 31 Jul 2020 00:00:00 +0000
Style is important when writing script. I generally follow the tidyverse style guide and cannot recommend strongly enough that you do, too. I’ll admit that I’m definitely not perfect and have adopted some shortcuts the style guide says to avoid. However, I do my best not to let those habits creep into script for class.
Two main points for now:
Firstly, realize that \({\bf\textsf{R}}\) ignores line breaks as long as they are preceded by a comma dividing arguments, as in vars(.
Introducing R Markdown
/post/intro-r-markdown/
Thu, 30 Jul 2020 00:00:00 +0000
Resources R Markdown: What and why Getting started Video on YouTube Opening a new file Producing output Components of an R markdown file yaml header Code chunks
You are expected to prepare all homework assignments in R Markdown. These are files with extension .Rmd that are easily created, edited, and processed in RStudio; the basics for doing so are described below. Basic proficiency in using R Markdown is a specific objective of this course, to enhance your ability to fully integrate \({\bf\textsf{R}}\) into a workflow that includes interpreting, presenting, and sharing your results.
Lesson 1 | A very basic introduction to R
/post/lesson-1/
Thu, 30 Jul 2020 00:00:00 +0000
Objectives Materials Working through the script Comments Object-oriented Functions Custom functions
Objectives
Introduce the following concepts:
Object-orientedness
Vectors
Functions
Materials
Script: VeryBasicIntro.R
Video on YouTube
Working through the script
Comments
Note that anything preceded by # is ignored by \({\bf\textsf{R}}\). We call it a comment operator, and it is useful for adding explanation to the script.
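For instance, a minimal sketch (the object name total is made up for illustration):

```r
# This whole line is a comment, so R skips it
total <- 2 + 3  # the code before the # still runs; the text after it is ignored
total
```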
At a very basic level \({\bf\textsf{R}}\) is a fancy calculator.
Lesson 2 | Objects, classes, and data structures
/post/lesson-2/
Thu, 30 Jul 2020 00:00:00 +0000
Materials Lecture Script Walking through the script Working directories Saving objects Packages Diagnostic data functions Classes Working with specific columns Evaluating specific rows and cells A more complex dataset Basic plotting Homework
Materials
Lecture
Video on YouTube
Script
ObjectClassesStructure.R
Walking through the script
Working directories
Working directories help pull data from, and save to, specific files. You can always check what your current working directory is:
Lesson 3 | Intro to tidyverse
/post/lesson-3-intro-to-tidyverse/
Thu, 30 Jul 2020 00:00:00 +0000
Objectives Materials Walking through the script Data loading read.csv() vs. read_csv() Data manipulation Adding variables Joining datasets Summary statistics Wrap up Homework assignment
Objectives
Introduce data wrangling using tidyverse verbs to:
Add columns
Change and rename columns
Combine data objects
Assign data to groups and calculate summary statistics
Basically we want to start being able to perform in \({\bf\textsf{R}}\) the intermediary steps between raw data and graphs/analysis, rather than going back to Excel.
Lesson 0 | Getting started
/post/getting-started/
Wed, 29 Jul 2020 00:00:00 +0000
Set-up Get-to-know-you survey Getting started Script Loading packages Loading data Graphing Basic bar graphs Stacked bar graphs Mapping Word clouds
Video on YouTube
Set-up
Here are some things you should do to make working through this course as easy and productive as possible:
Install the necessary software
Required: The R Statistical Environment
RStudio is optional but highly encouraged because all the course materials assume one is using it and, honestly, I don’t know how I’d do the homework without it.