2026 Summer Workshops: 2026 Biostatistics Week 1
Three one-week courses were offered. Each course met four hours per day: 10 am to 12 pm (lecture and discussion) and 1 to 3 pm (R lab and application). Morning sessions and course lab instructions were recorded.
Course Presenter: Dr. Alexander McLain, University of South Carolina Arnold School of Public Health, Professor of Epidemiology and Biostatistics
Week 1 (June 1-5): Foundations of Data Science in R
Week 2 (June 8-12): Statistical Modeling
Week 3 (June 15-19): Bioinformatics & High-Dimensional Data
10 am to 12 pm: Lecture and discussion
1 to 3 pm: R Lab and application
Target audience: All participants working with biological data. No prior R experience is assumed.
Monday, June 1, Day 1
MORNING: Course Overview & Orientation to Data Science
Why data science in biology? Course logistics, expectations, reproducibility principles.
AFTERNOON: Getting Started with R and RStudio
Tuesday, June 2, Day 2
MORNING: R Programming Fundamentals
Data types, vectors, matrices, lists, data frames; indexing; control flow (if/else, for loops); writing functions.
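The topics in this session can be illustrated with a minimal base R sketch. It is not taken from the course materials; the gene names, sample values, and the helper `classify_weight()` are hypothetical and exist only for illustration.

```r
# Hypothetical expression values, invented for illustration only
expr <- c(geneA = 2.5, geneB = 7.1, geneC = 0.4)  # named numeric vector

# Indexing by position, by name, and by logical condition
first_value <- expr[1]
gene_b      <- expr["geneB"]
high        <- expr[expr > 1]

# A data frame holds columns of different types
samples <- data.frame(
  id        = c("s1", "s2", "s3"),
  treated   = c(TRUE, FALSE, TRUE),
  weight_mg = c(12.1, 9.8, 11.4)
)

# A small user-defined function using if/else (hypothetical helper)
classify_weight <- function(w, cutoff = 10) {
  if (w >= cutoff) "heavy" else "light"
}

# A for loop that fills a preallocated character vector
labels <- character(nrow(samples))
for (i in seq_len(nrow(samples))) {
  labels[i] <- classify_weight(samples$weight_mg[i])
}
samples$label <- labels
```

Preallocating `labels` before the loop avoids growing a vector one element at a time, a habit worth forming early.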
AFTERNOON: Working with Data in R
Importing CSV/Excel files; inspecting data (str, summary, head); basic data manipulation with base R.
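A rough sketch of this workflow in base R follows. It is an illustration rather than the lab's actual exercise: the file contents, column names, and values are made up, and the CSV is written to a temporary file so the example is self-contained.

```r
# Write a small, made-up CSV to a temporary file so the example runs anywhere
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "sample,group,count",
  "s1,control,14",
  "s2,treated,31",
  "s3,treated,27",
  "s4,control,"      # empty field: read in as NA
), csv_path)

dat <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect structure, summary statistics, and the first rows
str(dat)
summary(dat)
head(dat, 3)

# Basic manipulation with base R
treated       <- dat[dat$group == "treated", ]       # subset rows
dat$log_count <- log2(dat$count + 1)                 # add a derived column
mean_by_group <- aggregate(count ~ group, data = dat, FUN = mean)  # drops NA rows
```

For Excel files, `readxl::read_excel()` is one commonly used option; whether the course used it is not stated here.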
Wednesday, June 3, Day 3
MORNING: Data Wrangling with the tidyverse
Tidy data principles; dplyr verbs (filter, select, mutate, group_by, summarize); pipes (%>%); tidyr (pivot_longer, pivot_wider).
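The verbs and reshaping functions listed above might be combined as in the sketch below. It assumes the `dplyr` and `tidyr` packages are installed; the table, gene names, and counts are invented for illustration and do not come from the course.

```r
library(dplyr)
library(tidyr)

# Made-up wide table: one row per gene, one column per sample
wide <- tibble(
  gene   = c("geneA", "geneB", "geneC"),
  ctrl_1 = c(5, 20, 1),
  ctrl_2 = c(7, 18, 3),
  trt_1  = c(15, 22, 2),
  trt_2  = c(13, 24, 2)
)

# Reshape to tidy (long) form: one row per gene-sample observation
long <- wide %>%
  pivot_longer(cols = -gene, names_to = "sample", values_to = "count") %>%
  mutate(condition = sub("_.*$", "", sample)) %>%   # "ctrl_1" -> "ctrl"
  select(gene, condition, count)

# Filter, group, and summarize
summary_tbl <- long %>%
  filter(count > 0) %>%
  group_by(gene, condition) %>%
  summarize(mean_count = mean(count), .groups = "drop")

# Spread the conditions back into columns for side-by-side comparison
back_to_wide <- summary_tbl %>%
  pivot_wider(names_from = condition, values_from = mean_count)
```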
AFTERNOON: tidyverse Lab
Hands-on cleaning and reshaping of a messy biological dataset; joining tables; handling missing data. (Instructions for the afternoon lab were given in the morning session.)
Thursday, June 4, Day 4
MORNING: Data Visualization with ggplot2
Grammar of graphics; geom types (point, bar, box, line, histogram, density); faceting; themes; color palettes for biological data.
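As one possible illustration of the layered grammar, the sketch below builds a faceted scatter plot from R's built-in `iris` dataset. It assumes `ggplot2` is installed; the choice of dataset, palette, and theme is an assumption made for this example, not the course's.

```r
library(ggplot2)

# Data + aesthetic mapping + geom + facets + scales + theme, added layer by layer
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point(alpha = 0.7) +          # one point per flower
  facet_wrap(~ Species) +            # one panel per species
  scale_color_viridis_d() +          # colorblind-friendly discrete palette
  labs(
    x     = "Sepal length (cm)",
    y     = "Petal length (cm)",
    color = "Species"
  ) +
  theme_minimal()

# print(p) draws the plot in an interactive session
```

Swapping `geom_point()` for another geom, such as `geom_boxplot()` with a categorical x aesthetic, changes the plot type while the rest of the specification stays the same.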
AFTERNOON: ggplot2 Lab
Recreating publication-quality figures from genomics/ecology datasets; customizing axes, legends, and themes.
Friday, June 5, Day 5
MORNING: Probability, Distributions, and Statistical Inference
Random variables; common distributions (Normal, Binomial, Poisson); Central Limit Theorem; p-values, confidence intervals, and their correct interpretation.
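The Central Limit Theorem can be checked informally by simulation. The sketch below is an illustration under assumed settings (Poisson data with rate 3, samples of size 50, 5000 replicates, an arbitrary seed); none of these values come from the course.

```r
set.seed(1)      # arbitrary seed so the run is repeatable

n      <- 50     # observations per sample (assumed)
lambda <- 3      # Poisson rate (assumed); the Poisson is right-skewed

# Draw 5000 samples and record the mean of each
sample_means <- replicate(5000, mean(rpois(n, lambda)))

# Theory: sample means are approximately Normal with
# mean lambda and standard deviation sqrt(lambda / n)
mean(sample_means)
sd(sample_means)
sqrt(lambda / n)

# hist(sample_means) shows the roughly bell-shaped distribution
```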
AFTERNOON: Simulation & Inference Lab
Simulating data in R; visualizing distributions; one- and two-sample t-tests; chi-square tests; Wilcoxon rank-sum test; interpreting output.
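A lab along these lines might look like the following sketch. The group means, standard deviations, sample sizes, seed, and the 2x2 table are all assumptions chosen for illustration; only base R and the bundled `stats` package are used.

```r
set.seed(42)  # arbitrary seed so the simulation is repeatable

# Simulate two groups of measurements (parameters are assumed)
control <- rnorm(30, mean = 10, sd = 2)
treated <- rnorm(30, mean = 12, sd = 2)

# One-sample t-test: is the control mean different from 10?
t_one <- t.test(control, mu = 10)

# Two-sample t-test (Welch's unequal-variance version is R's default)
t_two <- t.test(treated, control)

# Wilcoxon rank-sum test: a rank-based alternative to the two-sample t-test
w_res <- wilcox.test(treated, control)

# Chi-square test of independence on a made-up 2x2 contingency table
tab <- matrix(
  c(18, 12,
     9, 21),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("control", "treated"), outcome = c("no", "yes"))
)
chi_res <- chisq.test(tab)

# Pull out the pieces usually reported
t_two$p.value
t_two$conf.int   # 95% CI for the difference in means (treated minus control)
chi_res$statistic
```

The p-value is the probability, computed assuming the null hypothesis is true, of a test statistic at least as extreme as the one observed. It is not the probability that the null hypothesis is true.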