What is data?
Data consists of discrete values or units of information that can take many forms: numbers, words, characters, images, sound recordings, and videos, among others. Broadly, data is anything that can be collected, stored, organized, and analyzed.
A data set is information that someone has collected, assembled, and organized for analysis of an issue, phenomenon, or subject. It may contain many kinds of information (text, numbers, images, sound, video, code, geospatial data) in a variety of formats (CSV, XML, TIFF, PDF, etc.).
"Free" datasets and software may have limits on how you can use them.
Here are some of the basic types of licenses for datasets:
"Open source" refers to the software involved: the source code of open source software (OSS) is publicly available to use, modify, and share. The term itself does not name a single license; OSS is released under one of several licenses. Some basic types of licenses for OSS include:
When you create a data set, make sure that it is "good" data: accurate, complete, organized, and, ultimately, reusable.
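As a rough illustration of what checking for "complete" and "organized" data can look like in practice, the sketch below counts blank cells and exact duplicate rows in a small CSV data set using only Python's standard library. The function name `quality_report`, the sample data, and its column names are hypothetical and chosen only for this example; a real data set would need checks suited to its own contents.

```python
import csv
import io


def quality_report(csv_text, required_columns):
    """Count rows, blank required cells, and exact duplicate rows in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {col: 0 for col in required_columns}
    seen = set()
    duplicates = 0
    total = 0
    for row in reader:
        total += 1
        for col in required_columns:
            # Treat an absent column or a blank cell as missing.
            if not (row.get(col) or "").strip():
                missing[col] += 1
        # Compare rows by their values in header order.
        key = tuple(row.get(name) for name in reader.fieldnames)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": total, "missing": missing, "duplicates": duplicates}


# Hypothetical sample data: one blank name, one blank city, one repeated row.
SAMPLE = """id,name,city
1,Ada,London
2,,Paris
1,Ada,London
3,Grace,
"""

report = quality_report(SAMPLE, ["id", "name", "city"])
print(report)
```

A report like this does not show that the values are accurate, only that cells are filled in and rows are not repeated; accuracy still has to be checked against the original source.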
Here are some tools that can help: