Getting Started
- You can launch JupyterLab from the command line or from Anaconda Navigator.
- You can use a JupyterLab notebook to edit and run Python.
- Notebooks can include both code and markdown (text) cells.
Variables and Types
- Use variables to store values.
- Use `print` to display values.
- Format output with f-strings.
- Variables persist between cells.
- Variables must be created before they are used.
- Variables can be used in calculations.
- Use an index to get a single character from a string.
- Use a slice to get a portion of a string.
- Use the built-in function `len` to find the length of a string.
- Python is case-sensitive.
- Every object has a type.
- Use the built-in function `type` to find the type of an object.
- Types control what operations can be done on objects.
- Variables only change value when something is assigned to them.
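The points above can be seen together in a short sketch. The variable names and values are made up for illustration only.

```python
# Hypothetical values chosen only for illustration.
name = "Ada"
age = 36

# f-strings embed variable values directly in text.
greeting = f"{name} is {age} years old"
print(greeting)

# Variables can be used in calculations.
age_in_months = age * 12

# Indexing starts at 0; a slice [start:stop] excludes the stop position.
first_letter = name[0]
first_two = name[0:2]

# len() counts characters; type() reports an object's type.
name_length = len(name)
print(type(name), type(age))

# Variables only change when assigned to: updating age leaves
# age_in_months at the value computed earlier.
age = 37
print(age_in_months)
```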
Lists
- A list stores many values in a single structure.
- Use an item’s index to fetch it from a list.
- Lists’ values can be replaced by assigning to them.
- Appending items to a list lengthens it.
- Use `del` to remove items from a list entirely.
- Lists may contain values of different types.
- Character strings can be indexed like lists.
- Character strings are immutable.
- Indexing beyond the end of the collection is an error.
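A minimal sketch of the list and string behaviour summarised above. The measurements and names are hypothetical.

```python
# Hypothetical measurements, for illustration only.
pressures = [0.273, 0.275, 0.277]

first = pressures[0]          # fetch an item by its index
pressures[0] = 0.265          # replace a value by assigning to it
pressures.append(0.280)       # appending lengthens the list
del pressures[1]              # remove the item at index 1 entirely

mixed = [1, "two", 3.0]       # lists may hold values of different types

element = "carbon"
initial = element[0]          # strings can be indexed like lists

# Strings are immutable, so item assignment raises TypeError.
try:
    element[0] = "C"
except TypeError as error:
    print("Cannot change a string in place:", error)

# Indexing beyond the end of a collection raises IndexError.
try:
    pressures[99]
except IndexError as error:
    print("Index out of range:", error)
```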
Built-in Functions and Help
- Use comments to add documentation to programs.
- A function may take zero or more arguments.
- Commonly-used built-in functions include `max`, `min`, and `round`.
- Functions may only work for certain (combinations of) arguments.
- Functions may have default values for some arguments.
- Use the built-in function `help` to get help for a function.
- Every function returns something.
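As a brief illustration of these points, the sketch below uses only built-in functions. The input values are arbitrary.

```python
# This is a comment: Python ignores everything after the '#'.
largest = max(1, 2, 3)
smallest = min("a", "A", "0")   # strings are compared character by character

# round() has a default for its second argument (number of decimal places).
rounded_default = round(3.712)
rounded_one_place = round(3.712, 1)

# Some combinations of arguments do not work: a number and a string
# cannot be compared, so max() raises TypeError.
try:
    max(1, "a")
except TypeError as error:
    print("Invalid combination:", error)

# Every function returns something; print() returns None.
result = print("example")

# help(round) would display the documentation for round.
```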
Libraries & Pandas
- Most of the power of a programming language is in its libraries.
- A program must import a library module in order to use it.
- Use `help` to learn about the contents of a library module.
- Import specific items from a library to shorten programs.
- Create an alias for a library when importing it to shorten programs.
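The three import styles can be compared side by side. This sketch uses the standard-library `math` module as a stand-in so that it runs without extra installs; the same pattern applies to other libraries.

```python
import math                      # import a whole library module
from math import cos, pi         # import specific items from it
import math as m                 # import it under a shorter alias

circumference = 2 * math.pi * 1.0   # access contents as module.name
cos_of_pi = cos(pi)                 # imported items need no prefix
same_value = m.cos(m.pi)            # the alias replaces the module name

# help(math) would list the contents of the module.
```

By convention, pandas is imported with an alias in the same way: `import pandas as pd`.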
For Loops
- A for loop executes commands once for each value in a collection.
- The first line of the `for` loop must end with a colon, and the body must be indented.
- Indentation is always meaningful in Python.
- A `for` loop is made up of a collection, a loop variable, and a body.
- Loop variables can be called anything, but giving them a meaningful name is strongly advised.
- The body of a loop can contain many statements.
- Use `range` to iterate over a sequence of numbers.
- The Accumulator pattern turns many values into one.
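A minimal sketch of the accumulator pattern with `range`. The numbers are arbitrary and the variable names are illustrative.

```python
# Accumulator pattern: start from an initial value, then update it
# once for each item in the collection.
total = 0
for number in range(1, 6):      # range(1, 6) yields 1, 2, 3, 4, 5
    total = total + number      # the body is indented under the for line

# The body can hold several statements, and the accumulator can be a list.
squares = []
for number in range(3):         # range(3) yields 0, 1, 2
    square = number ** 2
    squares.append(square)

print(total, squares)
```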
Looping Over Data Sets
- Use a `for` loop to process files given a list of their names.
- Use `glob.glob` to find sets of files whose names match a pattern.
- Use `glob` and `for` to process batches of files.
- Use a list “accumulator”: start with an empty list `[]` and append each DataFrame to it.
- The `.merge()` and `.join()` methods and the `pd.concat()` function can combine pandas DataFrames.
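The batch-processing pattern above might look like the following sketch. To stay self-contained it first writes two small CSV files to a temporary folder; the file names (`sample_a.csv`, `sample_b.csv`) and their contents are invented for illustration.

```python
import glob
import os
import tempfile

import pandas as pd

# Write two small CSV files to a temporary folder so the sketch is
# self-contained; the file names and contents are made up.
data_dir = tempfile.mkdtemp()
pd.DataFrame({"year": [2001, 2002], "value": [1.0, 2.0]}).to_csv(
    os.path.join(data_dir, "sample_a.csv"), index=False
)
pd.DataFrame({"year": [2003], "value": [3.0]}).to_csv(
    os.path.join(data_dir, "sample_b.csv"), index=False
)

# glob.glob returns the file names that match the pattern.
filenames = sorted(glob.glob(os.path.join(data_dir, "sample_*.csv")))

# List accumulator: start empty, append one DataFrame per file.
frames = []
for filename in filenames:
    frames.append(pd.read_csv(filename))

# pd.concat stacks the collected DataFrames into a single one.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Sorting the result of `glob.glob` is optional, but it makes the processing order predictable.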
Using Pandas
- Use built-in methods `.sum()`, `.mean()`, `.unique()`, and `.nunique()` to explore summary statistics on the rows and columns in your DataFrame.
- Use `.groupby()` to work with subsets of your dataset.
- Sort pandas Series with `.sort_values()`.
- Use `.loc[]` and `.iloc[]` to pinpoint specific locations in pandas DataFrames.
- Save DataFrames to CSV and pickle files using `.to_csv()` and `.to_pickle()`.
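The methods listed above can be tried on a tiny table. The DataFrame below is hypothetical, and the output files go to a temporary folder so nothing is written to the working directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame(
    {
        "city": ["Leeds", "Leeds", "York", "York"],
        "year": [2020, 2021, 2020, 2021],
        "visits": [10, 30, 20, 40],
    }
)

total_visits = df["visits"].sum()
mean_visits = df["visits"].mean()
cities = df["city"].unique()          # distinct values in one column
n_cities = df["city"].nunique()       # number of distinct values

# groupby splits the rows into subsets, here one per city.
visits_by_city = df.groupby("city")["visits"].sum()

# sort_values orders a Series (ascending unless told otherwise).
sorted_visits = df["visits"].sort_values(ascending=False)

# .loc selects by label and .iloc by integer position;
# both are used with square brackets.
by_label = df.loc[0, "city"]
by_position = df.iloc[3, 2]

# Save to CSV and pickle files in a temporary folder.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "visits.csv")
pickle_path = os.path.join(out_dir, "visits.pkl")
df.to_csv(csv_path, index=False)
df.to_pickle(pickle_path)
```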
Conditionals
- Use `if` statements to control whether or not a block of code is executed.
- Conditionals are often used inside loops.
- Use `else` to execute a block of code when an `if` condition is not true.
- Use `elif` to specify additional tests.
- Conditions are tested once, in order.
- Use `and` and `or` to combine multiple conditions in a single test.
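A short sketch of conditionals inside a loop. The masses and thresholds are arbitrary values chosen for illustration.

```python
# Hypothetical masses; the thresholds are arbitrary.
masses = [3.5, 9.2, 1.8, 5.0]
labels = []

for mass in masses:
    # Conditions are tested in order; only the first true branch runs.
    if mass > 9.0:
        labels.append("large")
    elif mass >= 3.0 and mass <= 9.0:   # 'and' needs both parts to be true
        labels.append("medium")
    else:
        labels.append("small")

# 'or' is true when at least one of the conditions is true.
extremes = []
for mass in masses:
    if mass < 2.0 or mass > 9.0:
        extremes.append(mass)

print(labels, extremes)
```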
Writing Functions
- Break programs down into functions to make them easier to understand.
- Define a function using `def` with a name, parameters, and a block of code.
- Defining a function does not run it.
- Arguments in a call are matched to parameters in the definition.
- Functions may return a result to their caller using `return`.
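As an illustration, here is a small function with one required parameter and one parameter that has a default value. The function name and its parameters are hypothetical.

```python
def fahrenheit_to_celsius(temperature, ndigits=1):
    """Convert a temperature from Fahrenheit to Celsius, rounded to ndigits places."""
    celsius = (temperature - 32) * 5 / 9
    return round(celsius, ndigits)


# Defining the function above did not run it; calling it does.
# Arguments are matched to parameters by position or by name.
freezing = fahrenheit_to_celsius(32)
boiling = fahrenheit_to_celsius(212)
warm = fahrenheit_to_celsius(100, ndigits=2)

print(freezing, boiling, warm)
```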
Tidy Data with Pandas
- In tidy data each variable forms a column, each observation forms a row, and each type of observational unit forms a table.
- Using pandas for data manipulation to reshape data is fundamental for preparing data for analysis.
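One common reshaping step is turning a "wide" table into tidy "long" form. The sketch below uses `DataFrame.melt` on a made-up table; the column names are assumptions for illustration, and other reshaping methods may suit other layouts better.

```python
import pandas as pd

# Hypothetical "wide" table: one column per year, so the variable
# "year" is spread across the column headers.
wide = pd.DataFrame(
    {
        "country": ["A", "B"],
        "2000": [1.0, 2.0],
        "2001": [1.5, 2.5],
    }
)

# melt gathers the year columns into rows: each row becomes one
# observation and each variable gets its own column.
tidy = wide.melt(id_vars="country", var_name="year", value_name="value")
tidy["year"] = tidy["year"].astype(int)

print(tidy)
```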
Data Visualisation
- Explored the use of pandas for basic data manipulation, ensuring correct indexing with DatetimeIndex to enable time-series operations like resampling.
- Used pandas’ built-in plot() for initial visualizations and faced issues with overplotting, leading to adjustments like data filtering and resampling to simplify plots.
- Introduced Plotly for advanced interactive visualizations, enhancing user engagement through dynamic plots such as line graphs, area charts, and bar plots with capabilities like dropdown selections.
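The indexing and resampling step mentioned above can be sketched as follows. The readings are invented, and the plotting call is left as a comment because it needs a plotting backend such as matplotlib to be installed.

```python
import pandas as pd

# Hypothetical readings taken every 12 hours over four days.
times = pd.date_range(
    start="2024-01-01", periods=8, freq=pd.Timedelta(hours=12)
)
df = pd.DataFrame({"time": times, "reading": range(8)})

# Time-series operations such as resampling need a DatetimeIndex.
df = df.set_index("time")

# Resampling to daily means reduces the number of points, which is
# one way to ease overplotting.
daily = df["reading"].resample("D").mean()

# daily.plot() would draw the simplified series.
print(daily)
```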
Wrap-Up
- Python supports a large community within and outwith research.
- Follow standard Python style (PEP 8) in your code.