

Overview and Table of Contents
The curriculum for the high school Datathon4Justice teaches students how to use RStudio to analyze demographic variables (gender, ethnicity, geographic origin, and birth decade) of artists in major U.S. art museums. This page provides more information on the data, resources to help you prepare for and work through the curriculum, and guidance on setting up your own Datathon4Justice event.
On this page, you’ll find the following information:
- The Curriculum: step-by-step instructions for downloading and analyzing museum data in RStudio.
- The Data: a link to the original, peer-reviewed paper, as well as the source data you’ll need for the curriculum.
- RStudio: step-by-step instructions for downloading and installing RStudio, the software we’ll use for the curriculum.
- Statistical Concepts: links to primers and tutorials on the main statistical concepts you’ll use in the curriculum.
- Datathon4Justice Event Planning: guidance on how to set up a Datathon4Justice event at your school.
The Curriculum
If you are already familiar with RStudio, as well as with the basic statistical concepts needed, feel free to simply download the curriculum and get going!
Note: These materials are licensed under the Creative Commons “Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)” license. See “More about the Creative Commons License” below for a summary.

The Data
The data used in the Data4Justice curriculum was gathered in a large-scale study led by QSIDE Co-Founder and President Chad Topaz, and the findings were published in an academic article. The PDF version of the article is provided below, and here is the link to the online version. The raw data for this study, which the curriculum uses, is available online through GitHub and as a zip file with this link.
RStudio
RStudio is a program that simplifies working with the R programming language for statistical computing and graphics. It is available in two formats: RStudio Desktop is a regular desktop application, while RStudio Server runs on a remote server and is accessed through a web browser. Here is more information on downloading and getting started with RStudio:
Downloading and installing RStudio:
Introduction to using RStudio:
Statistical Concepts
Although the Data4Justice curriculum does not require any previous experience with RStudio, it does require foundational knowledge of certain statistical concepts. Below is a list of the concepts the curriculum covers, along with links to Khan Academy lessons on each topic.
- Representing data
- Categorical data example
- Analyzing trends in categorical data
- Reading a bar graph
- Creating a histogram
- Random sampling
- Confidence intervals and margin of error
- Reading box plots
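As a rough illustration of two of the concepts listed above (analyzing categorical data, and confidence intervals and margin of error), here is a minimal sketch. It is written in Python purely for illustration and is not part of the curriculum, which uses R. The sample values are made up and do not come from the museum dataset, and the function names `category_proportions` and `margin_of_error` are hypothetical. The margin of error uses the normal approximation for a proportion at roughly 95% confidence (z = 1.96), which is one common textbook approach and not necessarily the method used in the study.

```python
import math
from collections import Counter


def category_proportions(values):
    """Return the share of observations that fall in each category."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion p
    from a sample of size n (z = 1.96 is roughly 95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)


# Made-up sample for illustration only (NOT taken from the museum data).
sample = ["A"] * 60 + ["B"] * 30 + ["C"] * 10

proportions = category_proportions(sample)
for category, p in sorted(proportions.items()):
    moe = margin_of_error(p, len(sample))
    print(f"{category}: {p:.2f} +/- {moe:.3f}")
```

A larger sample shrinks the margin of error: the same proportion estimated from four times as many observations has half the margin.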
Datathon4Justice Event Planning
Getting students involved in a Datathon4Justice event can be a great opportunity to develop data science skills and connect STEM concepts to social justice work. Below is advice on how to properly plan for one of these events at your school:
- Timeline: we recommend picking a date at least 4 to 6 weeks away so that you have enough time to advertise and prepare for the Datathon. We estimate that the event will take at most 5 to 6 hours. If you want to hold it during the school day, coordinate with other teachers and school staff to accommodate everyone’s schedules. Alternatively, you can hold the event on a weekend, either in person or virtually.
- Recruitment: because the Datathon4Justice uses a data science curriculum, we recommend advertising this event to students interested in STEM subjects. However, students with a diverse range of interests outside of STEM can also benefit from participating in the Datathon. The curriculum is for everyone who cares about social justice and/or data science. Here is the link to a sample Datathon4Justice poster that you are welcome to copy and use to advertise your event.
- Computers and RStudio: ideally, each student should have their own computer in order to fully participate in this event. We recommend installing RStudio in advance on the computers that will be used for the event. If students will be using school computers, check that the program is available on those devices (see the information on downloading and using RStudio above). Chromebooks may not work as well with this particular program. It may be helpful to provide an introductory lesson on using RStudio before the actual Datathon.
- Event Day Operations: to begin the Datathon4Justice, it’s best to discuss the data and how to download it in RStudio as a whole group. Then, we recommend dividing participating students into smaller groups that teachers or staff can lead through each lesson. After every group completes a lesson, everyone can come together to discuss their findings and thoughts. At the end of the event, we recommend having students reflect on what they learned and how their perceptions of social justice and data science have changed.
If you are thinking about planning a Datathon4Justice, let us know! Reach out to us through email at qside@qsideinstitute.org.
More about the Creative Commons License:
This is a human-readable summary of (and not a substitute for) the license.
Under this license, you are free to:
- Share the materials, including making a copy and redistributing the material in any medium or format, and
- Adapt the materials, including remixing, transforming, and/or building upon these materials.
If you choose to share and/or adapt the resource, you must:
- Attribute the original source, the National Math Festival and the QSIDE Institute;
- Link to the Creative Commons license;
- Use the materials only in non-commercial ways; that is, you may not charge for, solicit donations for, or sell the original or any adaptations made from the original source materials;
- License any adaptations made from the original using the same CC license — that is, you *must* use the Attribution-NonCommercial-ShareAlike 4.0 International CC license for any adaptations made from the originals; and
- Limit restrictions to the original CC license — that is, you may not apply legal terms or technological measures that legally restrict others from doing anything the license permits, either to the original or to any adaptations you make.