One of the challenges to understanding the spread of COVID19 is the availability of data. Each day, members of the Data User Group have been updating a CSV file on our GitHub and a shared Google Sheet with information from the Ontario Ministry of Health’s COVID19 daily updates. As the capacity for testing has expanded and the number of identified cases has increased, the once detailed tables of data available on the Ministry website have become daily notices: “Information for all cases today are pending.”
Unfortunately, since the Ministry of Health does not provide updates on the reported cases, every record coded as “pending” effectively becomes missing data. Fortunately, each of the regional Health Units is providing updates on the number of cases on its website. These regional updates have been useful to validate and cross-reference the tables that Wikipedia has been maintaining (mentioned in the previous post). This compiled data set is available in the shared Google Sheet under the tab “Wikipedia”.
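To illustrate why “pending” records behave like missing data in analysis, here is a minimal Python sketch. The column names (case_id, health_unit, transmission) and the sample rows are hypothetical and are not the actual layout of our CSV file.

```python
import csv
import io

# Hypothetical sample rows; the real CSV columns and values may differ.
SAMPLE = """case_id,health_unit,transmission
101,Toronto,Travel
102,pending,pending
103,Peel,Close contact
104,Pending,pending
"""


def load_cases(text):
    """Read CSV text and convert any 'pending' value to None (missing)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        cleaned = {
            key: (None if value.strip().lower() == "pending" else value.strip())
            for key, value in row.items()
        }
        rows.append(cleaned)
    return rows


def missing_share(rows, column):
    """Return the fraction of rows whose value in `column` is missing."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r[column] is None) / len(rows)


cases = load_cases(SAMPLE)
print(missing_share(cases, "health_unit"))
```

Converting “pending” to an explicit missing value up front keeps it from being counted as if it were a real health unit or transmission category in later summaries.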
Following are a few of the questions we have explored with the data we have compiled (click on the visual to see it in a larger format):
What are the origins of internationally acquired COVID19 cases in Ontario?
Data source: “Wikipedia” tab in the DUG Google Sheet
Where are the confirmed COVID19 cases in Ontario?
Data source: Github csv
Tool: R – Shiny Dashboard, Interactive Map (Greg Rousell)
What does the growth of COVID19 confirmed cases look like in each region?
Data source: Github csv
Tool: R – Shiny Dashboard, Interactive Plots (Greg Rousell)
Data source: “Wikipedia” tab in the DUG Google Sheet
Data Collection Updates
- Records in the Wikipedia tab of the Google Sheet have been cross-referenced with local media reporting to fill in gaps on the formal Wikipedia site. Links to the media reports are included in the record as an additional reference and validation.
- CDUID codes have been added to the records in the Wikipedia tab of the Google Sheet to make it easier to produce maps in GIS platforms. These CDUID codes correspond to the 2016 Census Divisions managed by Statistics Canada. The 2016 Statistics Canada Census Division .shp file (polygons) is available here.
- Some of the data in the Wikipedia tab are reported for aggregated regions. An .shp file with corresponding merged regions is in development and will be posted shortly.
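As a rough sketch of how the CDUID codes can be used, the Python below counts cases per Census Division and attaches the counts to a lookup keyed by CDUID, which is the kind of table a GIS platform would join to the polygon file. The records, the field names and the small lookup table are illustrative stand-ins, not the contents of the Google Sheet or the Statistics Canada file.

```python
from collections import Counter

# Hypothetical case records; only the CDUID field matters for the join.
records = [
    {"case_id": 1, "cduid": "3520"},
    {"case_id": 2, "cduid": "3520"},
    {"case_id": 3, "cduid": "3521"},
    {"case_id": 4, "cduid": "3506"},
]

# Illustrative stand-in for the attribute table of the Census Division polygons.
divisions = {
    "3506": "Ottawa",
    "3519": "York",
    "3520": "Toronto",
    "3521": "Peel",
}


def cases_by_division(records, divisions):
    """Count cases per CDUID and keep divisions with zero cases in the result."""
    counts = Counter(r["cduid"] for r in records)
    return {
        cduid: {"name": name, "cases": counts.get(cduid, 0)}
        for cduid, name in divisions.items()
    }


summary = cases_by_division(records, divisions)
for cduid, info in sorted(summary.items()):
    print(cduid, info["name"], info["cases"])
```

The codes are kept as text rather than numbers here on the assumption that the polygon file stores them that way; if it does not, both sides of the join should be converted to the same type first.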
We are a group of researchers and analysts who are interested in data science and would like to use our expertise to contribute to the understanding of COVID-19 in our communities.
Looking for data…
One of the challenges we encountered in trying to understand the spread of COVID-19 was finding a data source in a format that is easily accessible for analysis. When we were unable to locate such a file (and found that scraping the data through R was too messy given the formats in which the information has been released), we decided to take a manual approach. Using a few different sources, we have compiled data tables which are easily accessible in R (our favorite) and Python.
…Compiling our own
March 19, 2020
A COVID19 Google Sheet for Ontario cases has been created, and is being maintained, with data from an Ontario government website and resources available on two Wikipedia pages. We will continue to update these tables until a more authoritative source of case records is made available, ideally by Public Health Ontario.
Resources: An invitation to explore and dive deeper
As we explore this data we will be sharing visualizations and insights on the Data User Group website. Our hope is that others will find our summaries useful. We extend an open invitation to others interested in data science to engage in additional analysis and use this data set for your own exploration. Resources include:
- A GitHub repository has also been created which will include the R code of our members as well as a .csv file that will be updated regularly. The Google Sheet will serve as our authoritative data source and the GitHub repository will serve as our central repository for code. We invite anyone who is interested to contribute to the GitHub repository.
- A Shiny app built from the GitHub code has been created to provide interactive explorations of the COVID19 data by regional spread.
Data Background and Sources
The “Provincial Reporting” tab in the Google Sheet is a compilation of data from this Ontario government website. This webpage provides a table on new cases of COVID-19 diagnosed in the province. Following are notes about the data:
- Using the Wayback Machine, the earliest records that could be obtained began at case 32.
- The first 31 cases were then compiled by parsing the press releases available at the bottom of the page.
- Currently, case numbers 6, 16, 17 and 18 have not been found in the available press releases.
- Coding with respect to regional health unit appears to have changed over time. A new column has been added with recoded health unit labels for consistency.
- On March 18th the website stopped posting the hospitals the cases are related to.
- The data in the Google Sheet is updated daily from this website, around 10:30 am and 5:30 pm when the data is released.
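Two of the notes above lend themselves to a small check in Python: finding gaps in the sequence of case numbers, and recoding health unit labels to one consistent form. This is only a sketch of the idea; the label variants in the recode table are hypothetical examples, not the actual labels used on the government website or in the Google Sheet.

```python
def missing_case_numbers(found, first, last):
    """Return the case numbers between first and last (inclusive) not in `found`."""
    return sorted(set(range(first, last + 1)) - set(found))


# Hypothetical label variants mapped to one consistent health unit label.
RECODE = {
    "toronto public health": "Toronto",
    "toronto": "Toronto",
    "peel public health": "Peel",
}


def recode_health_unit(label):
    """Normalize case and spacing, then map known variants to a single label."""
    key = " ".join(label.lower().split())
    return RECODE.get(key, label.strip())


# Cases 1 to 31 came from press releases; 6, 16, 17 and 18 were not located.
found = [n for n in range(1, 32) if n not in {6, 16, 17, 18}]
print(missing_case_numbers(found, 1, 31))
print(recode_health_unit("  Toronto  Public Health "))
```

Labels that are not in the recode table are returned unchanged (apart from trimmed whitespace), so new variants show up in the output and can be added to the table by hand.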
The “Wikipedia” tab in the Google Sheet is a compilation of data from Wikipedia’s Thematic Google Map. This map includes interactive summaries by region, which include the number of cases, the number of patients in local hospitals, and buildings that have been impacted. Following are notes about the data that is published on this page:
- The data in the Google Sheet is updated daily from this website.
- Wikipedia also provides a table with time-series data for the spread of COVID-19 for each province. The data from this website is available in the “Wikipedia National” tab in the Google Sheet and is updated less frequently than the other two tabs.
A posting is now open until February 8, 2019 for a Research Analyst contract position at the Durham District School Board. The start date for this position is to be determined. This posting can be viewed on and applied to through applytoeducation.com.
Click here for more information
Summary of the Research Analyst: This position will take a lead role in a project … as part of the Ontario Education Equity Action Plan.
Reports To: Administrative Officer, Accountability and Assessment.
A posting is now open until July 18, 2018 for a Research Officer position at the Peel District School Board. The position begins September 4, 2018. This application is posted on applytoeducation.com (link at bottom of this pdf with more information) so to apply you will need to create an account first.
Click here for more information (link to apply at the bottom of the pdf)
The Peel District School Board (PDSB) is one of the largest school boards in Canada, with more than 150,000 students in over 250 schools. At PDSB, everything we do is designed to help all students achieve to the best of their ability. We have the incredible opportunity to inspire a smile in each student. Our collective, daily efforts make a positive difference in the lives of our students, their families and the world. Guided by our mission, vision and values, we build positive places for learning and working … together at http://www.peelschools.org. We are currently accepting applications for a Research Officer.
Are you an experienced professional highly skilled in qualitative and quantitative research? Do you welcome the opportunity to draw on this expertise to support services and programs across the Peel District School Board? If so, take the next step in your successful career by joining our team.
Job Duties/Responsibilities and Details
Reporting to the Chief Research Officer, Research and Accountability, you will work both independently and as part of a team of education researchers in the design, implementation and interpretation of research and evaluation projects to support the board’s system-wide strategic goals, equity and diversity initiatives, and curriculum and instruction programs.
Being a Research Officer at the Peel District School Board means acting as a research and evaluation resource to support the use of data for planning and decision-making. This will include being responsible for consultation and development of assessments (curriculum, alternative programs, special education) as well as the evaluation of educational programs (equity, diversity, instruction, special education). The research team and Board staff will also rely on your assessment of current educational trends, and on the literature reviews and environmental scans you can provide on topics of interest as they carry out their functions.
Over the past couple of years the Data User Group and Barrie Region MISA PNC have hosted a “Researcher Coffee Break” where school board employees with research, evaluation and data related roles can connect on a teleconference to discuss current issues, challenges or new approaches to common data sets.
Given the popularity of these periodic teleconferences, the Data User Group is collaborating with the Association of Educational Researchers of Ontario for a day of networking and sharing:
Click here for the flyer which includes a link to the online registration.