Registration is closed

Timetable

Please check back before the event for updates

Install instructions for each of the workshop sessions are HERE

Venue

The event will be at the University of Otago, with each day starting in Castle 1 lecture theatre.

Talks will be held in the Castle 1 lecture theatre and workshops will be at the Otago Business School

Venue (workshops: 11-13 February)

The workshops will be held at the Otago Business School (aka Commerce Building), University of Otago.

Exact rooms for the workshops will be announced at the event, following the first talk of each day.

Workshop sessions

Unlike previous Research Bazaars, this year workshop places will be allocated on a first-come, first-served basis on the day.

These are the intended workshop sessions, but they are subject to change. Make sure to check back before the event.

Unless specified otherwise, a laptop is required

We have an exciting lineup of workshops, all of which will be of an introductory nature and provide a foundation for further learning.

Workshop timetable

The timetable is now set but please check back before the event for updates. All workshops within a session are run concurrently in separate rooms. Nearly all workshops will require you to bring a laptop.

Skill level description:
In order to convey the skill level of a particular workshop the following terms are used to describe the assumed levels of prior knowledge or experience.

Beginner: Someone new to the topic with minimal prior knowledge of the workshop topic beyond the stated pre-requisites

Post-beginner: Greater expectation of prior knowledge or experience beyond the direct topic being covered in the workshop

Install instructions for each of the workshop sessions are HERE

Monday 11th February

Session 1:

Good data organisation is the foundation of any research project. Spreadsheets are commonly used to store data, and we tend to organise data in them in the ways that we as humans want to work with it. Computers, however, require data to be organised in particular ways. To use tools that make computation more efficient, such as the programming languages R or Python, we need to structure our data the way that computers need it.

This session will cover the best practices for using spreadsheets with data. This will include:

Learning about the "Tidy data" principles
How to organise data according to "Tidy data" principles
Dealing with dates
Exporting data for use with other tools
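
As a rough illustration of the points above (not material from the workshop itself), the sketch below reshapes a small "wide" table into tidy form, rewrites day/month/year dates in the unambiguous ISO 8601 form, and exports the result as CSV. Python, the column names and the measurement values are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime

# Hypothetical "wide" spreadsheet layout: one column per measurement day,
# with the dates typed as day/month/year column headers.
wide_rows = [
    {"plot": "A", "11/02/2019": "4.2", "12/02/2019": "4.8"},
    {"plot": "B", "11/02/2019": "3.9", "12/02/2019": "4.1"},
]


def to_tidy(rows):
    """Return one record per observation: plot, date, value."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == "plot":
                continue
            # Parse the day/month/year header and store it as ISO 8601.
            iso_date = datetime.strptime(key, "%d/%m/%Y").date().isoformat()
            tidy.append({"plot": row["plot"], "date": iso_date, "value": float(value)})
    return tidy


def to_csv(records):
    """Write the tidy records to CSV text that other tools can read."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["plot", "date", "value"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


tidy_rows = to_tidy(wide_rows)
csv_text = to_csv(tidy_rows)
print(csv_text)
```

Each row of the output describes a single observation, so tools such as R or Python can filter and group it without first unpicking the layout.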

This session would be useful for:

people who collect data
people who want to begin analysing data

Targeted skill level: beginner

Software required: spreadsheet program installed (e.g. MS Excel or Libre Office Calc)

Note taking is a key part of research. Digital notebooks let you not only capture ideas but also enrich them with embedded images and links to reference material, and modify and improve them while keeping track of their previous states. They can also be used to promote your work. This session will cover the creation of a simple web-based notebook that could serve as a lab notebook or blog. By the end of this workshop, participants will have covered:

Setting up a GitHub account
Creating a repository to store the notebook
Introduction to markdown syntax for formatting
Adding entries to the notebook
Modifying existing entries
Collaborating with multiple authors

This session would be useful for:

people who want to create a reproducible and collaborative document
people who want to create a digital notebook
people who want to collaborate on notes
people who create documentation

Targeted skill level: beginner

Software required: web-browser

What is data? What data might I have? Is there an easier way to do what I'm doing? These are some of the questions that you may ask during your research, and this session is designed to help you start answering them. Computational research is full of jargon, making it difficult to know whether a particular program is going to solve the problems you have. Being able to put names to your problems is a powerful step towards solving them. This lesson (based on Library Carpentry) introduces librarians and others to working with data. At the conclusion of the lesson you will be able to:

define terms, phrases, and concepts in software development and data science
understand what tasks are best performed by a computer
identify and use best practices in data structures

No prior knowledge required

Targeted skill level: beginner

Software required: web-browser

This slot is reserved for a yet-to-be-determined session at a more advanced level.

Session 2:

R is a programming language that is useful for data analysis, and by learning R you can improve the efficiency and reproducibility of your analysis. This is an introductory session to the R programming language. By the end of the session, participants can expect to understand:

basic R syntax
the components of the RStudio interface
how to compute basic statistics
where to find further help for R

Useful for:

Data analysis
Day 2: functions in R and data manipulation in R
Day 3: data visualisation in R

Targeted skill level: beginner

Software required: R and RStudio

The power of the Unix shell comes from its reproducibility and its ability to automate and scale tasks. This is an introductory session to the Unix command line. By the end of this session participants can expect to:

open the commandline
understand how to navigate and create files and directories
run commandline programs

Useful for:

task automation
day 2 lessons for Make and Docker

Targeted skill level: beginner

Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash

Have you found yourself repeating the same search but only making small changes each time? Maybe you want to know where all the occurrences of a list of words are in a document. Instead of performing each search individually, there are more efficient ways, such as writing a pattern that captures what your search terms have in common.
This session will cover the syntax for creating patterns (regular expressions) for use in searching text. By attending this session you can expect to:

match text using simple patterns
understand basic regular expression syntax
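
To give a flavour of what such patterns look like, here is a minimal sketch using Python's built-in `re` module. The workshop itself only requires a web browser, so the choice of Python and the sample sentences are assumptions made for illustration.

```python
import re

text = "The workshops run on 11 February, 12 February and 13 February 2019."

# \b marks a word boundary and \d{1,2} matches one or two digits, so a
# single pattern finds every day number that is followed by "February".
day_pattern = re.compile(r"\b(\d{1,2}) February\b")
days = day_pattern.findall(text)
print(days)

# Alternation (|) lets one pattern stand in for a list of search words,
# and s? makes a trailing "s" optional.
word_pattern = re.compile(r"\b(workshop|session)s?\b", re.IGNORECASE)
sample = "Workshop sessions: each session has a workshop."
words = [match.group(0) for match in word_pattern.finditer(sample)]
print(words)
```

The same pattern syntax is shared, with small variations, by most search tools and text editors that support regular expressions.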

Useful for:

People who want to improve their understanding of how search works
People who use find/replace

Targeted skill level: beginner

Software required: web-browser

Finding and replacing is a common task for text editing. This session will cover using regular expressions for the purposes of manipulating text by creating patterns. By attending this session you can expect to:

extract text from files that match patterns
find and replace text using patterns
rearrange columns in files
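
The three tasks above can be sketched as follows. The session itself works on the command line, so this Python version, along with the sample lines and column layout, is an assumption made purely for illustration.

```python
import re

# Hypothetical file contents: surname,firstname,date (day/month/year).
lines = [
    "Smith,Jane,11/02/2019",
    "Ngata,Tane,12/02/2019",
    "# a comment line that should be skipped",
]

# Extract: keep only the lines that match the three-column pattern.
record = re.compile(r"^(\w+),(\w+),(\d{2}/\d{2}/\d{4})$")
matching = [line for line in lines if record.match(line)]

# Rearrange columns: \1, \2 and \3 refer back to the captured groups,
# so swapping their order swaps the first two columns.
swapped = [record.sub(r"\2,\1,\3", line) for line in matching]

# Find and replace: rewrite day/month/year dates as year-month-day.
iso_dates = [
    re.sub(r"(\d{2})/(\d{2})/(\d{4})", r"\3-\2-\1", line) for line in swapped
]

for line in iso_dates:
    print(line)
```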

pre-requisites: a working knowledge of navigating the filesystem on the commandline and running commandline programs.

Targeted skill level: post-beginner

Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash

Session 3:

Often the data we have is not in the format, or subset in the way, that we need in order to analyse it. This session is all about manipulating the data you have into the formats and groupings you need for analysis in R.

By attending this session you can expect to understand how to:

subset data based on columns
filter rows by conditions
create new columns based on other columns
create data summaries
create columns or summaries by data groupings

pre-requisites: Introduction to R or prior experience with R

Targeted skill level: beginner

Software required: R and RStudio (please also install the tidyverse package)

This session continues directly on from Introduction to unix shell.

pre-requisites: Introduction to unix shell

Targeted skill level: beginner

There are many publicly available datasets that can be used for research and to supplement data you may already have. Sites like data.govt.nz help people discover, learn about and use open data easily, enabling informed decision making and problem-solving for citizens and businesses alike.

By attending this session you can expect to:

Understand the purpose of data.govt.nz
Understand what open data is
Understand how to browse for open data sets
Be familiar with some of the main sources for open data

Software required: web-browser

Targeted skill level: beginner

This session is an unstructured breakout session to provide space to work or discuss together.

Tuesday 12th February

Session 4:

Being able to know what you did in the past and how it differs from the present is a key part of research. Using software, this can be automated so that you can focus on your research without having to be concerned about manually keeping track of all the different versions of documents or scripts you have.
This session will cover using version control with Git to automate tracking and dealing with changes when writing scripts within RStudio.

The session will cover:

setting up Git
adding files to be tracked
making and tracking changes
reviewing changes

pre-requisites: Introduction to R or prior experience with R

Targeted skill level: beginner

Software required: R and RStudio (please also install the tidyverse package)

This session will cover creating a workflow script to manage dependencies and outputs. After this session you should be able to:

understand targets and dependencies
create a basic scripted workflow

This session is useful for people who:

want to repeat a workflow with changing data

pre-requisites: Introduction to unix shell or equivalent

Targeted skill level: post-beginner

Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash

This session will cover using OpenRefine for cleaning and tidying data. By the end of the session attendees should expect to:

Load data into OpenRefine
perform basic data cleaning operations
Export data from OpenRefine

This session would be useful for people who:

clean and organise data
want to apply the same data cleaning operation to multiple datasets

Targeted skill level: beginner

Software required: OpenRefine

This slot is reserved for a yet-to-be-determined session at a more advanced level.

Session 5:

This session will be an introduction to how to create your own functions (methods) in R. By the end of the session you should:

Understand how to create a function
Understand how to specify arguments to a function
Understand how to return data from a function

Useful for:

People who want to specify their own methods for dealing with data

pre-requisites: Introduction to R or prior experience with R

Targeted skill level: beginner

Software required: R and RStudio

This introductory session will cover creating a reproducible workflow environment using Docker images. By the end of the session you should be able to:

obtain a pre-built Docker image
create a Dockerfile to build a custom container
access the docker container and run a command
share data between the host and the container

pre-requisites: Introduction to unix shell or equivalent

Targeted skill level: post-beginner

Software required: Docker

This presentation will cover general best-practice principles for the management, storage and sharing of research data. It will include practical tips for improving data management practices that can be implemented immediately, regardless of the type of data. By attending, students will feel better prepared to respond to university, employer, funder and/or publisher data requirements.

Targeted skill level: all

Software required: none

This session will cover saving commands used in the Unix shell into scripts for reuse.

pre-requisites: Introduction to unix shell or equivalent

Targeted skill level: beginner

Software required: MacOS/Linux - Terminal (comes pre-installed). Windows - GitBash

Session 6:

This session will cover using the markdown syntax and R code to create reproducible documents.

suggested prior knowledge: Introduction to R or prior experience with R

Targeted skill level: beginner

Software required: R and RStudio

This session continues directly from Reproducible computational environments using containers.

pre-requisites: Reproducible computational environments using containers

This session is very similar to "Best practices for data organisation in spreadsheets" on Monday.

This session will cover the best practices for using spreadsheets with data. This will include:

Learning about the "Tidy data" principles
How to organise data according to "Tidy data" principles
Dealing with dates
Exporting data for use with other tools

This session would be useful for:

people who collect data
people who want to begin analysing data

Targeted skill level: beginner

Software required: spreadsheet program installed (e.g. MS Excel or Libre Office Calc)

This slot is reserved for a yet-to-be-determined session at a more advanced level.

Wednesday 13th February

Session 7:

This session will cover how to make plots from data in R. This will be done using the ggplot2 package for R. Participants can expect to learn how to:

specify the data to be visualised
create scatter plots, line graphs, bar plots, box and whisker plots, and histograms
customise the default themes

pre-requisites: Introduction to R or prior experience with R

Targeted skill level: beginner

Software required: R and RStudio (please also install the tidyverse package)

This session will be an introduction to querying databases. After this session participants can expect to:

understand what a relational database is
create basic queries for choosing columns
create basic queries for filtering data
create queries for summarising data based on groups
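
For a sense of what these queries look like, the sketch below runs them against a throwaway in-memory SQLite database using Python's built-in `sqlite3` module. The `surveys` table, its columns and its values are invented for the example, and driving SQLite from Python is an assumption rather than how the workshop itself is run.

```python
import sqlite3

# Build a small in-memory database (hypothetical table and values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (species TEXT, year INTEGER, weight REAL)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [
        ("kiwi", 2018, 2.4),
        ("kiwi", 2019, 2.6),
        ("tui", 2019, 0.1),
        ("tui", 2019, 0.12),
    ],
)

# Choosing columns: SELECT names the columns to return.
columns = conn.execute("SELECT species, weight FROM surveys").fetchall()

# Filtering rows: WHERE keeps only the rows that meet a condition.
filtered = conn.execute(
    "SELECT species, weight FROM surveys WHERE year = 2019 ORDER BY weight"
).fetchall()

# Summarising by group: GROUP BY produces one result row per species.
summary = conn.execute(
    "SELECT species, COUNT(*), MAX(weight) FROM surveys "
    "GROUP BY species ORDER BY species"
).fetchall()

conn.close()

print(filtered)
print(summary)
```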

Targeted skill level: beginner

Software required: SQLite and DB Browser for SQLite (http://sqlitebrowser.org)

GitHub provides an online way to collaborate on and track changes to plain text files, such as markdown. Markdown is a simple text language that encodes the formatting of a document, which can then be converted into multiple formats such as HTML, DOC, or PDF.
This session will cover getting started with GitHub and the basic syntax of markdown for creating formatted documents which can be converted into a website. It will cover:

Setting up a GitHub account
Creating a repository
Introduction to markdown syntax for formatting
Creating a simple webpage with markdown
Modifying pages and tracking changes
How to use GitHub for collaboration

Targeted skill level: beginner

Software required: web-browser

This will be an unstructured session, with the topic based on the interests of participants.

Session 8:

This session will cover how to take your own functions in R and turn them into R packages to improve maintainability.

suggested prior knowledge: Introduction to R or prior experience with R

Targeted skill level: post-beginner

Software required: R and RStudio

This session will continue directly from Introduction to getting data from databases.

pre-requisites: Introduction to getting data from databases

This session directly continues from Introduction to Github

Pre-requisites: Introduction to Github

This will be an unstructured session, with the topic based on the interests of participants.

Dunedin

11-13 February, 2019
