Registration is closed
Timetable
Please check back before the event for updates
Install instructions for each of the workshop sessions are HERE
Venue
The event will be at the University of Otago, with each day starting in Castle 1 lecture theatre.
Talks will be held in the Castle 1 lecture theatre and workshops will be at the Otago Business School
Venue (workshops: 11-13 February)
The workshops will be held at the Otago Business School (aka Commerce Building), University of Otago.
Exact rooms for the workshops will be detailed at the event following the first talk of each day
Workshop sessions
Unlike previous Research Bazaars, this year workshop places will be allocated on a first-come, first-served basis on the day.
These are the intended workshop sessions but are subject to change. Make sure to check back before the event
Unless specified otherwise, a laptop is required
We have an exciting lineup of workshops, all of which will be of an introductory nature and provide a foundation for further learning.
Workshop timetable
The timetable is now set but please check back before the event for updates. All workshops within a session are run concurrently in separate rooms. Nearly all workshops will require you to bring a laptop.
Skill level description:
In order to convey the skill level of a particular workshop the following terms are used to describe the assumed levels of prior knowledge or experience.
Beginner: Someone new to the topic with minimal prior knowledge of the workshop topic beyond the stated pre-requisites
Post-beginner: Greater expectation of prior knowledge or experience beyond the direct topic being covered in the workshop
Monday 11th February
Session 1:
Good data organization is the foundation of any research project. Spreadsheets are commonly used to store data, and we tend to organize data in them in the ways that we as humans want to work with it. Computers, however, require data to be organized in particular ways, so to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way computers need it.
This session will cover the best practices for using spreadsheets with data. This will include:
- Learning about the "Tidy data" principles
- How to organise data according to "Tidy data" principles
- Dealing with dates
- Exporting data for use with other tools
- people who collect data
- people who want to begin analysing data
Targeted skill level: beginner
Software required: spreadsheet program installed (e.g. MS Excel or Libre Office Calc)
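As a rough illustration of the "Tidy data" idea mentioned above (one observation per row, one variable per column), the sketch below reshapes a small made-up table from a wide layout into a long one using only the Python standard library. The column names, values and the `wide_to_long` helper are hypothetical and are not workshop material.

```python
import csv
import io

# Hypothetical "wide" sheet: one column per year, which is easy for people
# to read but awkward for analysis tools.
WIDE_CSV = """site,2017,2018
Dunedin,12,15
Oamaru,7,9
"""


def wide_to_long(text, id_column, variable_name, value_name):
    """Return one row per (id, variable) pair instead of one column per variable."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        for column, value in record.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            rows.append({
                id_column: record[id_column],
                variable_name: column,
                value_name: value,
            })
    return rows


tidy_rows = wide_to_long(WIDE_CSV, "site", "year", "count")
for row in tidy_rows:
    print(row)
```

Each output row now holds a single site, year and count, which is the shape that tools such as R or Python generally expect.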
- Setting up a GitHub account
- Creating a repository to store the notebook
- Introduction to markdown syntax for formatting
- Adding entries to the notebook
- Modifying existing entries
- Collaborating with multiple authors
- create a reproducible and collaborative document
- people who want to create a digital notebook
- people who want to collaborate with notes
- creating documentation
Targeted skill level: beginner
Software required: web-browser
- define terms, phrases, and concepts in software development and data science
- understand what tasks are best performed by a computer
- identify and use best practices in data structures
No prior knowledge required
Targeted skill level: beginner
Software required: web-browser
- basic R syntax
- the components of the RStudio interface
- how to compute basic statistics
- where to find further help for R
- Data analysis
- Day 2: functions in R and data manipulation in R
- Day 3: data visualisation in R
Targeted skill level: beginner
Software required: R and RStudio
- open the commandline
- understand how to navigate and create files and directories
- run commandline programs
- task automation
- day 2 lessons for Make and Docker
Targeted skill level: beginner
Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash
This session will cover the syntax for creating patterns (regular expressions) for use in searching text.
- match text using simple patterns
- understand basic regular expression syntax
- People who want to improve their understanding of how search works
- People who use find/replace
Targeted skill level: beginner
Software required: web-browser
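To give a feel for what a simple pattern looks like, here is a minimal sketch using Python's built-in `re` module. The sample sentence and the pattern are made up for illustration; the workshop itself may use a different tool and different examples.

```python
import re

# Hypothetical sample text; the pattern below is illustrative only.
text = "Sessions run on 11 February, 12 February and 13 February."

# \d{1,2} matches one or two digits, followed by a space and the literal word.
pattern = re.compile(r"\d{1,2} February")

matches = pattern.findall(text)
print(matches)
```

Here `findall` returns every non-overlapping piece of the text that fits the pattern, in the order it appears.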
Finding and replacing is a common task for text editing. This session will cover using regular expressions for the purposes of manipulating text by creating patterns. By attending this session you can expect to:
- extract text from files that match patterns
- find and replace text using patterns
- rearrange columns in files
pre-requisites: a working knowledge of navigating the filesystem on the commandline and running commandline programs.
Targeted skill level: post-beginner
Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash
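The three outcomes listed above (extracting, replacing and rearranging columns) can be sketched with capture groups. The example below uses Python's `re` module rather than the command line tools the session requires, and the tab-separated sample lines are hypothetical.

```python
import re

# Hypothetical tab-separated lines: surname, first name, year.
lines = ["Smith\tJane\t2019", "Ngata\tTama\t2018"]

# Three capture groups, one per column; [^\t]+ means
# "one or more characters that are not a tab".
pattern = re.compile(r"^([^\t]+)\t([^\t]+)\t([^\t]+)$")

# Extract: keep only lines that match and pull out the third column.
years = [m.group(3) for m in map(pattern.match, lines) if m]

# Find and replace: \1, \2 and \3 refer back to the captured groups,
# so this swaps the first two columns and leaves the third in place.
swapped = [pattern.sub(r"\2\t\1\t\3", line) for line in lines]

print(years)
print(swapped)
```

The same idea of capturing parts of a match and reusing them in the replacement carries over to most tools that support regular expressions, although the exact syntax for back-references varies.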
By attending this session you can expect to understand how to:
- subset data based on columns
- filter rows by conditions
- create new columns based on other columns
- create data summaries
- create columns or summaries by data groupings
Targeted skill level: beginner
Software required: R and RStudio (please also install the tidyverse package)
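The session itself uses R and the tidyverse. Purely as a language-neutral sketch of the five operations listed above, the snippet below performs each of them on a tiny made-up dataset using plain Python. All field names and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; each dict is one row of the dataset.
rows = [
    {"site": "A", "year": 2018, "height_cm": 120.0, "weight_kg": 30.0},
    {"site": "A", "year": 2019, "height_cm": 150.0, "weight_kg": 45.0},
    {"site": "B", "year": 2019, "height_cm": 100.0, "weight_kg": 20.0},
]

# Subset data based on columns: keep only two of the four fields.
subset = [{"site": r["site"], "weight_kg": r["weight_kg"]} for r in rows]

# Filter rows by a condition.
recent = [r for r in rows if r["year"] == 2019]

# Create a new column from existing columns.
with_ratio = [{**r, "kg_per_cm": r["weight_kg"] / r["height_cm"]} for r in rows]

# Create a summary over the whole dataset.
overall_mean_weight = mean(r["weight_kg"] for r in rows)

# Create a summary by grouping: mean weight per site.
groups = defaultdict(list)
for r in rows:
    groups[r["site"]].append(r["weight_kg"])
mean_weight_by_site = {site: mean(values) for site, values in groups.items()}

print(mean_weight_by_site)
```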
Targeted skill level: beginner
By attending this session you can expect to:
- Understand the purpose of data.govt.nz
- Understand what open data is
- Understand how to browse for open data sets
- Be familiar with some of the main sources for open data
Software required: web-browser
Targeted skill level: beginner
Tuesday 12th February
Session 4:
This session will cover using version control with Git to track and manage changes when writing scripts within RStudio.
The session will cover:
- setting up Git
- adding files to be tracked
- making and tracking changes
- reviewing changes
pre-requisites: Introduction to R or prior experience with R
Targeted skill level: beginner
Software required: R and RStudio (please also install the tidyverse package)
- understand targets and dependencies
- create a basic scripted workflow
- want to repeat a workflow with changing data
Targeted skill level: post-beginner
Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash
- Load data into OpenRefine
- perform basic data cleaning operations
- Export data from OpenRefine
- clean and organise data
- want to apply the same data cleaning operation to multiple datasets
Targeted skill level: beginner
Software required: OpenRefine
- Understand how to create a function
- Understand how to specify arguments to a function
- Understand how to return data from a function
- People who want to specify their own methods for dealing with data
Targeted skill level: beginner
Software required: R and RStudio
- obtain a pre-built docker image
- create a Dockerfile to build a custom container
- access the docker container and run a command
- share data between the host and the container
Targeted skill level: post-beginner
Software required: Docker
Targeted skill level: all
Software required: none
pre-requisites: Introduction to unix shell or equivalent
Targeted skill level: beginner
Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash
suggested prior knowledge: Introduction to R or prior experience with R
Targeted skill level: beginner
Software required: R and RStudio
pre-requisites: Reproducible computational environments using containers
Good data organization is the foundation of any research project. Spreadsheets are commonly used to store data, and we tend to organize data in them in the ways that we as humans want to work with it. Computers, however, require data to be organized in particular ways, so to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way computers need it.
This session is very similar to "Best practices for data organisation in spreadsheets" on Monday.
This session will cover the best practices for using spreadsheets with data. This will include:
- Learning about the "Tidy data" principles
- How to organise data according to "Tidy data" principles
- Dealing with dates
- Exporting data for use with other tools
- people who collect data
- people who want to begin analysing data
Targeted skill level: beginner
Software required: spreadsheet program installed (e.g. MS Excel or Libre Office Calc)
Wednesday 13th February
Session 7:
- specify the data to be visualised
- be able to create scatter plots, line graphs, bar plots, box and whisker plots, and histograms
- understand how to make customisations to default themes
pre-requisites: Introduction to R or prior experience with R
Targeted skill level: beginner
Software required: R and RStudio (please also install the tidyverse package)
- understand what a relational database is
- create basic queries for choosing columns
- create basic queries for filtering data
- create queries for summarising data based on groups
Targeted skill level: beginner
Software required: SQLite and http://sqlitebrowser.org
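The kinds of query listed above (choosing columns, filtering, and summarising by group) can be sketched against an in-memory SQLite database using Python's built-in `sqlite3` module. The `surveys` table and its contents are made up for illustration and are not the workshop's dataset.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (species TEXT, year INTEGER, weight REAL)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [("kiwi", 2018, 2.0), ("kiwi", 2019, 3.0), ("tui", 2019, 0.1)],
)

# Choosing columns: name only the columns you want after SELECT.
chosen = conn.execute("SELECT species, weight FROM surveys").fetchall()

# Filtering data: WHERE keeps only the rows that meet a condition.
filtered = conn.execute(
    "SELECT species, year FROM surveys WHERE year = 2019 ORDER BY species"
).fetchall()

# Summarising by group: GROUP BY produces one output row per species.
summary = conn.execute(
    "SELECT species, COUNT(*), AVG(weight) FROM surveys "
    "GROUP BY species ORDER BY species"
).fetchall()

conn.close()
print(filtered)
print(summary)
```

The SQL statements themselves should also work if typed into a graphical tool for SQLite, assuming a table with the same columns exists.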
This session will cover getting started with GitHub and the basic syntax of markdown for creating formatted documents which can be converted into a website.
- Setting up a GitHub account
- Creating a repository
- Introduction to markdown syntax for formatting
- Creating a simple webpage with markdown
- Modifying pages and tracking changes
- How to use GitHub for collaboration
Targeted skill level: beginner
Software required: web-browser
suggested prior knowledge: Introduction to R or prior experience with R
Targeted skill level: post-beginner
Software required: R and RStudio
pre-requisites: Introduction to getting data from databases
This session directly continues from Introduction to Github
Pre-requisites: Introduction to Github