Here is a link to the slides for the presentation about finding and evaluating data sets, given at the January 2023 Graduate Student Data Bootcamp.
What is data?
Data consists of discrete values or units of information that can take many forms: numbers, words, characters, images, sound recordings, and videos, among others. Data is anything that can be collected, stored, organized, and analyzed.
A data set is information that someone has collected, assembled, and organized for the analysis of an issue, phenomenon, or subject. It may contain many kinds of information (text, numbers, images, sound, video, code, and geospatial data) in a variety of formats (CSV, XML, TIFF, PDF, etc.).
When you create a data set, you want to make sure that it is "good" data: accurate, complete, organized, and, ultimately, reusable.
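As one illustration of what checking for "complete" and "organized" data might look like in practice, here is a minimal Python sketch that counts empty cells per column and exact duplicate rows in a small CSV. The function name `completeness_report` and the sample values are hypothetical and chosen only for this example; a real check would depend on the data set and on what counts as missing in it.

```python
import csv
import io


def completeness_report(csv_text):
    """Count empty cells per column and exact duplicate rows in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    missing = {name: 0 for name in fieldnames}
    seen = set()
    duplicates = 0
    total = 0
    for row in reader:
        total += 1
        for name in fieldnames:
            value = row[name]
            # A cell counts as missing if it is absent or only whitespace.
            if value is None or value.strip() == "":
                missing[name] += 1
        key = tuple(row[name] for name in fieldnames)
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"rows": total, "missing": missing, "duplicates": duplicates}


# Hypothetical sample: one missing temperature and one repeated row.
sample = """site,date,temperature_c
A,2023-01-05,4.2
B,2023-01-05,
A,2023-01-05,4.2
"""

report = completeness_report(sample)
print(report)
```

A report like this only flags gaps and repeats. Judging whether the values themselves are accurate still requires comparing them against the original source or documentation.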