Tidy data for librarians
- We will discuss good practices for data entry and formatting
- We will not discuss analysis or visualisation
- Don’t use multiple tables in one sheet
- Don’t use multiple tabs in a file
- Fill in zero when you mean zero
- Use an appropriate null value to record missing data
- Don’t use formatting to convey information or make the spreadsheet
look pretty
- Don’t put units or comments in cells
- Don’t combine several values in one cell
- Take care over column names
- Avoid including special characters in your data file
- Put metadata (units, legends, etc.) in a separate file
- Excel is notoriously bad at handling dates: it stores them as serial numbers and can silently convert date-like text.
- Treating dates as multiple pieces of data rather than one makes them
easier to handle and exchange between programs.
- Use data validation tools to minimise the possibility of input
errors.
- Use sorting and conditional formatting to identify possible
errors.
- Use the .csv file format for data storage and processing
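
As a minimal sketch of three of the points above (treating dates as separate pieces of data, distinguishing zero from missing values, and storing data as .csv), the following Python example uses only the standard library. The loan records, the column names, and the helper functions `split_date` and `to_csv` are hypothetical, made up for illustration; an empty cell is assumed here as the null value, which is only one of several reasonable choices.

```python
import csv
import io
from datetime import date


def split_date(value):
    """Split an ISO 8601 date string (YYYY-MM-DD) into year, month and day."""
    parsed = date.fromisoformat(value)
    return {"year": parsed.year, "month": parsed.month, "day": parsed.day}


def to_csv(rows, fieldnames):
    """Return rows as CSV text; None (missing) becomes an empty cell, 0 stays 0."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {name: ("" if row.get(name) is None else row[name]) for name in fieldnames}
        )
    return buffer.getvalue()


# Hypothetical records: the first has zero renewals, the second has no recorded value.
loans = [
    {"title": "Dune", "loan_date": "2019-03-07", "renewals": 0},
    {"title": "Emma", "loan_date": "2019-11-21", "renewals": None},
]

fields = ["title", "year", "month", "day", "renewals"]
tidy = [
    {"title": r["title"], **split_date(r["loan_date"]), "renewals": r["renewals"]}
    for r in loans
]
csv_text = to_csv(tidy, fields)
print(csv_text)
```

Each date ends up in three plain integer columns, so no program needs to guess a date format when the file is opened, and the zero in the first record stays distinct from the missing value in the second.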