Instructor Notes
General notes on OpenRefine
Common problems
If learners are using a browser other than Firefox, or if OpenRefine does not open automatically when they click the .exe file, have them point their browser at http://127.0.0.1:3333/ or http://localhost:3333/ to open the program.
Mac users with the newest operating system will have to permit OpenRefine to run by “allowing everything” to run. They can change the setting back after the exercise.
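If a learner's browser shows nothing at either address, it can help to confirm that a server is actually listening on port 3333 before troubleshooting the browser itself. The following is a minimal Python sketch of such a check and is not part of the lesson. The function name `is_listening` is made up for illustration, and the host and port are simply the defaults quoted above.

```python
import socket


def is_listening(host="127.0.0.1", port=3333, timeout=1.0):
    """Return True if something accepts TCP connections at host:port.

    Hypothetical helper: it only confirms that a server answers on the
    port, not that the server is OpenRefine.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_listening():
        print("A server is answering on http://127.0.0.1:3333/")
    else:
        print("Nothing is answering on port 3333; OpenRefine may not have started.")
```

If the check reports that nothing is answering, the problem is more likely with starting OpenRefine than with the browser.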
Some students will run into issues with:
- unzipping
- finding the .exe file once the software has been unzipped
- finding the data file on their computers after downloading
If OpenRefine crashes when launched from a network share drive, do the following:
- Copy the OpenRefine folder to a local drive not mapped to a network share, e.g. “C:\Users\JaneDoe”
- Open a Windows Command Prompt
- Change the working directory to the OpenRefine folder at “C:\Users\JaneDoe”
- Run openrefine.exe
If “https” doesn’t work when fetching from CrossRef during Advanced OpenRefine Functions, learners can try “http” instead.
If learners need to diagnose a failure to fetch content from a URL, they can check the “Store error” option in the “Add column by fetching URLs” dialogue and look at the common problems listed in the documentation.
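For instructors who want to show what these two tips amount to, here is a minimal Python sketch. It is an illustration only and not what OpenRefine does internally: `fetch_with_fallback` and `default_fetch` are hypothetical names, and the example.org URL is a placeholder rather than the address used in the lesson. The function retries an “https” URL over “http” and, in the spirit of “Store error”, returns the error text instead of silently leaving the result blank.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def default_fetch(url):
    """Fetch a URL and return its body as text (assumes UTF-8 content)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def fetch_with_fallback(url, fetch=default_fetch):
    """Return (body, error). Exactly one of the two is None.

    Hypothetical helper: tries the URL as given; if it is an https URL
    and the request fails, retries once over plain http.
    """
    try:
        return fetch(url), None
    except Exception as first_error:
        parts = urlsplit(url)
        if parts.scheme != "https":
            # Nothing to fall back to: keep the error text for diagnosis.
            return None, str(first_error)
        http_url = parts._replace(scheme="http").geturl()
        try:
            return fetch(http_url), None
        except Exception as second_error:
            return None, str(second_error)


if __name__ == "__main__":
    # Simulated fetcher so the example runs without network access.
    def fake_fetch(url):
        if url.startswith("https://"):
            raise OSError("simulated https failure")
        return "fetched " + url

    print(fetch_with_fallback("https://example.org/placeholder", fake_fetch))
```

Keeping the error text next to the failed row is what makes the failure diagnosable later, which is the point of enabling “Store error”.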
The data for this lesson was pulled from DOAJ in 2015 and may not match the data available from DOAJ on the day of your workshop.
Introduction to OpenRefine
Importing data into OpenRefine
Instructor Note
This is a good moment to review the points from What Should I Know When Working with OpenRefine?
Instructor Note
Carefully guide learners on how to revisit OpenRefine’s homepage to explore import options when creating new projects or re-opening existing ones. To get there, select the large blue diamond in the upper-left corner of the browser window.
Layout of OpenRefine, Rows vs Records
Faceting and filtering
Clustering
Working with columns and sorting
Sorting and Reorder Rows Permanently
Do not rush these last two sentences. Repeat them slowly after a pause and allow learners to explore how sorting works for a moment.
Although the “Undo/Redo” tab is not introduced until episode 9, it may be worth noting that applying a sort does not count as a change to the data because removing the sort will restore the data to its original order. However, once you select “Reorder Rows Permanently” this does count as a data change and adds an entry to the Undo/Redo history.
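The difference between a sort and a permanent reorder can be illustrated with a toy model in Python. This is only a sketch of the behaviour described above, not how OpenRefine is implemented: the `Project` class, its method names, the history label, and the journal titles are all hypothetical.

```python
class Project:
    """Toy model of a project: stored rows plus an undo/redo history."""

    def __init__(self, rows):
        self.rows = list(rows)   # the stored data
        self.history = []        # entries for data changes only
        self.sort_key = None     # a sort affects the view, not the data

    def sort(self, key):
        """Apply a sort to the view. The stored rows are untouched."""
        self.sort_key = key

    def remove_sort(self):
        """Remove the sort. The view returns to the stored order."""
        self.sort_key = None

    def view(self):
        """Return the rows as they would currently be displayed."""
        if self.sort_key is None:
            return list(self.rows)
        return sorted(self.rows, key=self.sort_key)

    def reorder_rows_permanently(self):
        """Write the sorted order into the data and record the change."""
        if self.sort_key is not None:
            self.rows = sorted(self.rows, key=self.sort_key)
            self.history.append("Reorder rows")
            self.sort_key = None


if __name__ == "__main__":
    project = Project(["C journal", "A journal", "B journal"])
    project.sort(key=str.lower)
    print(project.view(), project.rows, project.history)
    project.reorder_rows_permanently()
    print(project.view(), project.rows, project.history)
```

In this model the history stays empty while a sort is merely applied, and gains an entry only once the reorder is made permanent.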
Introduction to Transformations
Writing Transformations
Transformations - Undo and Redo
Transforming Strings, Numbers, Dates and Booleans
Transformations - Handling Arrays
Different meanings of ‘transformation’
Ask the students what transformation currently means to them. Many may only know it from Excel, where it converts columns into rows or vice versa. Discuss how, in OpenRefine, transformation refers specifically to the working window: these values are neither stored nor displayed in the cells or output.
Recap on best practice for separators
Recall the previous discussion of the dangers of changing separators, and remind learners to avoid a separator character that already appears in the text. A possible question to pose to learners could be: Which subject would be broken if a hyphen were used as a separator?
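A short Python sketch can make the separator problem concrete. The subject values below are invented for illustration and are not taken from the lesson data, and both function names are hypothetical. The sketch joins a list of values with a separator, splits it again, and checks whether the original values come back intact.

```python
def round_trip(values, separator):
    """Join the values with the separator, then split them again."""
    return separator.join(values).split(separator)


def safe_separators(values, candidates):
    """Return the candidate separators that appear in none of the values."""
    return [sep for sep in candidates if not any(sep in v for v in values)]


if __name__ == "__main__":
    # Hypothetical subjects; one of them contains a hyphen.
    subjects = ["Socio-economic studies", "Library science"]

    print(round_trip(subjects, "|"))   # the two subjects survive unchanged
    print(round_trip(subjects, "-"))   # the hyphenated subject is split in two
    print(safe_separators(subjects, ["-", "|", ";"]))
```

With the hyphen as separator, two subjects go in and three fragments come out, which is exactly the breakage the question above asks learners to spot.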