Data management refers to how researchers organize the data they create, collect, describe, store, and work with. Data management does not include data analysis (examining and interpreting the data).
When you start a new research project, it can be valuable to think about what research data you’ll be using and how you’ll be using it.
Be aware of your files
- Before you start collecting data, consider how many total files you think you’ll end up with and how large all of them will be individually and combined. There’s a big difference between having 10 small files and thousands of large files. This will affect how and where you store your data.
- Think about what file formats you plan on using and what software is needed to open those formats. Are you using a common file format that anyone will be able to use or will people need to have specific software to open your files?
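If you already have some files, a short script can answer the "how many and how large" question for you. The sketch below is one possible approach, not a required tool: the function name `summarize_files` and the folder you point it at are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path


def summarize_files(root):
    """Count files and total bytes under root, grouped by file extension."""
    counts = Counter()
    sizes = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Treat ".CSV" and ".csv" as the same format.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return counts, sizes
```

Calling `summarize_files("my_project/data")` (a hypothetical folder) returns two tallies, one of file counts and one of sizes in bytes, each keyed by extension. An unexpected extension in the results is a prompt to check what software is needed to open it.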
Use consistent file and directory names
- The most important thing to do when naming files is to be consistent. Try to use the same format of file names for all of your data. This will make it easier for you to find a specific file.
- Use names that accurately represent the contents of the files (e.g. “DataSet001-2024-[your initials]”).
- If you use any abbreviations or acronyms in your file names, write down what they mean somewhere that other people can easily find.
- Use directories or folders to help people understand the different steps of your research process (e.g. have different directories for “unprocessed/raw data” and “processed/cleaned data”).
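One way to stay consistent is to generate file names from a single rule and check existing names against it. The sketch below assumes a pattern modeled on the example name above (a label, a three-digit number, a year, and initials); the pattern and the function names `make_name` and `inconsistent_names` are illustrative choices, not a standard.

```python
import re
from pathlib import Path

# Assumed pattern: letters, a three-digit number, a four-digit year,
# and two to four uppercase initials, e.g. "DataSet001-2024-AB".
NAME_PATTERN = re.compile(r"[A-Za-z]+\d{3}-\d{4}-[A-Z]{2,4}")


def make_name(label, number, year, initials):
    """Build a file name (without extension) that follows the pattern."""
    return f"{label}{number:03d}-{year}-{initials.upper()}"


def inconsistent_names(file_names):
    """Return the file names whose stem does not follow the pattern."""
    return [
        name for name in file_names
        if not NAME_PATTERN.fullmatch(Path(name).stem)
    ]
```

For example, `make_name("DataSet", 1, 2024, "ab")` produces `DataSet001-2024-AB`, and `inconsistent_names` flags any file in a list that has drifted from the agreed format. If your project uses a different convention, adjust the pattern to match it.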
Make backups of your data
- A backup is a copy of your data that you create in case you can no longer work with your original files because of technology failure (spilling soda on your laptop), natural disaster (fires or floods), theft, or accidental deletion.
- A best practice for backups is to have one copy stored somewhere other than your primary computer. This could be an external hard drive or a cloud storage system (e.g. Dropbox or Microsoft OneDrive).
- USB flash drives can be used to transfer data between computers, but should not be relied upon for backups or long-term data storage.
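A backup is only useful if the copy actually matches the original. The following is a minimal sketch, assuming you are copying a project folder to a separate location such as an external drive: it copies the folder and then compares checksums of each original and its copy. The function names `sha256_of` and `back_up` are made up for this example, and dedicated backup software may serve you better.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir, backup_dir):
    """Copy source_dir to backup_dir; return originals whose copies differ."""
    source = Path(source_dir)
    backup = Path(backup_dir)
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(source, backup, dirs_exist_ok=True)
    mismatched = []
    for original in source.rglob("*"):
        if original.is_file():
            copy = backup / original.relative_to(source)
            if not copy.is_file() or sha256_of(original) != sha256_of(copy):
                mismatched.append(str(original))
    return mismatched
```

An empty list from `back_up` means every file was copied and its checksum matched. The backup folder should sit outside the source folder, ideally on a different device, in keeping with the advice above.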
Document your data
- Documentation is the process of writing down what was done to the data throughout the research process. This includes:
  - What was done
  - Who did it
  - When it happened
  - Where it happened
  - How it was done
  - Why it was done
- Create a “data dictionary” that defines acronyms, abbreviations, and other terms used in your data that may not be immediately obvious.
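A data dictionary can be as simple as a spreadsheet or CSV file with one row per term. The sketch below shows one possible layout, assuming three columns (`term`, `definition`, `units_or_values`); the column names, the helper functions, and the sample entries are all hypothetical and should be replaced with whatever fits your data.

```python
import csv

# Assumed column layout for the data dictionary.
FIELDS = ["term", "definition", "units_or_values"]


def write_data_dictionary(path, entries):
    """Write a list of entry dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(entries)


def read_data_dictionary(path):
    """Load the CSV file into a dict keyed by term for quick lookup."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {row["term"]: row for row in csv.DictReader(handle)}


# Hypothetical entries, for illustration only.
EXAMPLE_ENTRIES = [
    {"term": "temp_c",
     "definition": "Air temperature at the sampling site",
     "units_or_values": "degrees Celsius"},
    {"term": "NR",
     "definition": "No response recorded",
     "units_or_values": "used in place of a missing answer"},
]
```

Storing the dictionary as a CSV file keeps it in a common format that can be opened without specialized software, and saving it alongside the data means the definitions travel with the files they describe.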