Plotting and Programming in Python: Reference

Key Points

Running and Quitting
  • Python scripts are plain text files.

  • Use the Jupyter Notebook for editing and running Python.

  • The Notebook has Command and Edit modes.

  • Use the keyboard and mouse to select and edit cells.

  • The Notebook will turn Markdown into pretty-printed documentation.

  • Markdown does most of what HTML does.

Variables and Assignment
  • Use variables to store values.

  • Use print to display values.

  • Variables persist between cells.

  • Variables must be created before they are used.

  • Variables can be used in calculations.

  • Use an index to get a single character from a string.

  • Use a slice to get a substring.

  • Use the built-in function len to find the length of a string.

  • Python is case-sensitive.

  • Use meaningful variable names.

Data Types and Type Conversion
  • Every value has a type.

  • Use the built-in function type to find the type of a value.

  • Types control what operations can be done on values.

  • Strings can be added and multiplied.

  • Strings have a length (but numbers don’t).

  • Must convert numbers to strings or vice versa when operating on them.

  • Can mix integers and floats freely in operations.

  • Variables only change value when something is assigned to them.

Built-in Functions and Help
  • Use comments to add documentation to programs.

  • A function may take zero or more arguments.

  • Commonly-used built-in functions include max, min, and round.

  • Functions may only work for certain (combinations of) arguments.

  • Functions may have default values for some arguments.

  • Use the built-in function help to get help for a function.

  • The Jupyter Notebook has two ways to get help.

  • Every function returns something.

  • Python reports a syntax error when it can’t understand the source of a program.

  • Python reports a runtime error when something goes wrong while a program is executing.

  • Fix syntax errors by reading the source code, and runtime errors by tracing the program’s execution.

Libraries
  • Most of the power of a programming language is in its libraries.

  • A program must import a library module in order to use it.

  • Use help to learn about the contents of a library module.

  • Import specific items from a library to shorten programs.

  • Create an alias for a library when importing it to shorten programs.

Reading Tabular Data into DataFrames
  • Use the Pandas library to get basic statistics out of tabular data.

  • Use index_col to specify that a column’s values should be used as row headings.

  • Use DataFrame.info to find out more about a dataframe.

  • The DataFrame.columns variable stores information about the dataframe’s columns.

  • Use DataFrame.T to transpose a dataframe.

  • Use DataFrame.describe to get summary statistics about data.

Pandas DataFrames
  • Use DataFrame.iloc[..., ...] to select values by integer location.

  • Use : on its own to mean all columns or all rows.

  • Select multiple columns or rows using DataFrame.loc and a named slice.

  • Result of slicing can be used in further operations.

  • Use comparisons to select data based on value.

  • Select values or NaN using a Boolean mask.

Plotting
  • matplotlib is the most widely used scientific plotting library in Python.

  • Plot data directly from a Pandas dataframe.

  • Select and transform data, then plot it.

  • Many styles of plot are available: see the Python Graph Gallery for more options.

  • Can plot many sets of data together.

Lists
  • A list stores many values in a single structure.

  • Use an item’s index to fetch it from a list.

  • Lists’ values can be replaced by assigning to them.

  • Appending items to a list lengthens it.

  • Use del to remove items from a list entirely.

  • The empty list contains no values.

  • Lists may contain values of different types.

  • Character strings can be indexed like lists.

  • Character strings are immutable.

  • Indexing beyond the end of the collection is an error.

For Loops
  • A for loop executes commands once for each value in a collection.

  • A for loop is made up of a collection, a loop variable, and a body.

  • The first line of the for loop must end with a colon, and the body must be indented.

  • Indentation is always meaningful in Python.

  • Loop variables can be called anything (but it is strongly advised to give the loop variable a meaningful name).

  • The body of a loop can contain many statements.

  • Use range to iterate over a sequence of numbers.

  • The Accumulator pattern turns many values into one.

Looping Over Data Sets
  • Use a for loop to process files given a list of their names.

  • Use glob.glob to find sets of files whose names match a pattern.

  • Use glob and for to process batches of files.

Writing Functions
  • Break programs down into functions to make them easier to understand.

  • Define a function using def with a name, parameters, and a block of code.

  • Defining a function does not run it.

  • Arguments in a call are matched to the parameters in the definition.

  • Functions may return a result to their caller using return.

Variable Scope
  • The scope of a variable is the part of a program that can ‘see’ that variable.

Conditionals
  • Use if statements to control whether or not a block of code is executed.

  • Conditionals are often used inside loops.

  • Use else to execute a block of code when an if condition is not true.

  • Use elif to specify additional tests.

  • Conditions are tested once, in order.

  • Create a table showing variables’ values to trace a program’s execution.

Programming Style
  • Follow standard Python style in your code.

  • Use docstrings to provide online help.

Errors and Exceptions
  • Tracebacks can look intimidating, but they give us a lot of useful information about what went wrong in our program, including where the error occurred and what type of error it was.

  • An error having to do with the ‘grammar’ or syntax of the program is called a SyntaxError. If the issue has to do with how the code is indented, then it will be called an IndentationError.

  • A NameError will occur when trying to use a variable that does not exist. Possible causes are that a variable definition is missing, a variable reference differs from its definition in spelling or capitalization, or the code contains a string that is missing quotes around it.

  • Containers like lists and strings will generate errors if you try to access items in them that do not exist. This type of error is called an IndexError.

  • Trying to read a file that does not exist will give you a FileNotFoundError. Trying to read a file that is open for writing, or writing to a file that is open for reading, will give you an io.UnsupportedOperation error, which is a kind of OSError (IOError is an alias for OSError in Python 3).
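
The error types named above can be triggered and caught on purpose to see what each one looks like. The following is a minimal sketch; the names undefined_variable, letters, and the file name are hypothetical and chosen only for illustration.

```python
# Collect the name of each error type as it is caught.
caught = []

try:
    print(undefined_variable)  # hypothetical name that was never defined
except NameError as error:
    caught.append(type(error).__name__)
    print('NameError:', error)

try:
    letters = ['a', 'b', 'c']
    print(letters[3])  # valid indices are 0, 1, and 2
except IndexError as error:
    caught.append(type(error).__name__)
    print('IndexError:', error)

try:
    # Assumes no file with this name exists in the working directory.
    open('no_such_file_for_this_example.txt', 'r')
except FileNotFoundError as error:
    caught.append(type(error).__name__)
    print('FileNotFoundError:', error)
```

A SyntaxError or IndentationError cannot be caught this way in the same file, because Python reports it before any of the code runs.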

Defensive Programming
  • Program defensively, i.e., assume that errors are going to arise, and write code to detect them when they do.

  • Put assertions in programs to check their state as they run, and to help readers understand how those programs are supposed to work.

  • Use preconditions to check that the inputs to a function are safe to use.

  • Use postconditions to check that the output from a function is safe to use.

  • Write tests before writing code in order to help determine exactly what that code is supposed to do.
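
As a rough illustration of preconditions and postconditions, the sketch below uses a hypothetical function, normalize, that is not part of the lesson. The assertions at the top check the inputs; the one at the bottom checks the result.

```python
def normalize(values):
    """Rescale a list of numbers so the smallest is 0.0 and the largest is 1.0."""
    # Preconditions: the input must be safe to use.
    assert len(values) > 0, 'need at least one value'
    low, high = min(values), max(values)
    assert high > low, 'values must not all be equal'

    result = [(value - low) / (high - low) for value in values]

    # Postcondition: the output must span exactly 0.0 to 1.0.
    assert min(result) == 0.0 and max(result) == 1.0, 'result must span 0..1'
    return result


# A small test, written from the description of what the function should do.
assert normalize([2, 4, 6]) == [0.0, 0.5, 1.0]
print(normalize([2, 4, 6]))
```

Calling normalize([]) fails immediately with an AssertionError and a readable message, rather than failing later with a less obvious error.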

Debugging
  • Know what code is supposed to do before trying to debug it.

  • Make it fail every time.

  • Make it fail fast.

  • Change one thing at a time, and for a reason.

  • Keep track of what you’ve done.

  • Be humble.

Command-Line Programs
  • The sys library connects a Python program to the system it is running on.

  • The list sys.argv contains the command-line arguments that a program was run with.

  • Avoid silent failures.

  • The pseudo-file sys.stdin connects to a program’s standard input.
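
A minimal sketch of these ideas follows. The function names sum_arguments and count_lines and the script name sum.py are assumptions made for this example. To keep it runnable anywhere, it passes in a hand-made argument list and an in-memory stream; a real script would pass sys.argv and sys.stdin instead.

```python
import io
import sys


def sum_arguments(args):
    """Add up the numbers in an argument list shaped like sys.argv."""
    if len(args) < 2:
        # Avoid a silent failure: say what went wrong.
        print('usage: python sum.py NUMBER [NUMBER ...]', file=sys.stderr)
        return None
    # args[0] is the script name, so the values start at args[1].
    return sum(float(arg) for arg in args[1:])


def count_lines(stream):
    """Count the lines in a file-like object such as sys.stdin."""
    return sum(1 for _ in stream)


# Stand-ins for sys.argv and sys.stdin.
print(sum_arguments(['sum.py', '1', '2.5']))
print(count_lines(io.StringIO('first line\nsecond line\n')))
```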

Wrap-Up
  • Python supports a large and diverse community across academia and industry.

Feedback
  • We are constantly seeking to improve this course.

Reference

Running and Quitting

Variables and Assignment
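The key points for this topic (storing values, calculating with them, indexing, slicing, and len) can be sketched as follows. The variable names and values are made up for illustration.

```python
# Store values in variables, then use them in a calculation.
age = 42
first_name = 'Ahmed'
age_in_three_years = age + 3
print(first_name, 'will be', age_in_three_years, 'in three years')

# Indexing starts at 0; a slice [start:stop] stops just before stop.
element = 'helium'
first_letter = element[0]
first_three = element[0:3]
print(first_letter, first_three, len(element))
```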

Data Types and Type Conversion
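A short sketch of the key points for this topic, using made-up values: type reports a value's type, strings support + and *, and numbers and strings must be converted explicitly before they are combined.

```python
print(type(52))    # an integer
print(type('52'))  # a string

# Strings can be added (concatenated) and multiplied (repeated).
full_name = 'Ahmed' + ' ' + 'Walsh'
separator = '=' * 10

# Convert explicitly when mixing numbers and strings.
total = 1 + int('2')
label = str(1) + '2'

# Integers and floats can be mixed freely; the result is a float.
half = 1 / 2.0

# Variables only change value when something is assigned to them.
first = 1
second = 5 * first
first = 2
print('first is', first, 'and second is', second)
```

Changing first after computing second does not update second; it keeps the value it was assigned.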

Built-in Functions and Help
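The built-in functions named in the key points can be tried out as below. This is only a sketch with arbitrary example values.

```python
# This line is a comment; Python ignores it.
largest = max(1, 2, 3)
smallest = min('a', 'A', '0')  # characters are compared by their character codes

# round has a default for its second argument: zero decimal places.
rounded_default = round(3.712)
rounded_one = round(3.712, 1)
print(largest, smallest, rounded_default, rounded_one)

# Every function returns something; print returns None.
result = print('example')
print('result of print is', result)

# Show the documentation for a function.
help(round)
```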

Libraries
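The three import styles from the key points, sketched with the standard math module:

```python
import math               # import the whole module
from math import cos, pi  # import specific items to shorten programs
import math as m          # import under an alias

print('pi is', math.pi)
print('cos(pi) is', cos(pi))
print('cos(pi) via the alias is', m.cos(m.pi))
```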

Reading Tabular Data into DataFrames
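A minimal sketch of reading a table and inspecting it, assuming Pandas is installed. To stay self-contained it reads from an in-memory string rather than a file on disk; the column names and numbers are invented for this example.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file; the values are made up.
csv_text = io.StringIO(
    "country,gdp_1952,gdp_1957\n"
    "Aland,100.0,110.0\n"
    "Borduria,200.0,230.0\n"
)

# Use the 'country' column's values as row headings.
data = pd.read_csv(csv_text, index_col='country')

data.info()             # column names, types, and non-null counts
print(data.columns)     # a member variable, so no parentheses
print(data.T)           # transposed copy: rows become columns
print(data.describe())  # summary statistics for the numeric columns
```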

Pandas DataFrames
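The selection styles from the key points can be sketched on a small, made-up dataframe (the labels and values below are assumptions for illustration, and Pandas is assumed to be installed).

```python
import pandas as pd

# Hypothetical data; the values are invented.
data = pd.DataFrame(
    {
        'gdp_1952': [100.0, 200.0, 300.0],
        'gdp_1957': [110.0, 230.0, 290.0],
        'gdp_1962': [125.0, 260.0, 310.0],
    },
    index=['Aland', 'Borduria', 'Caledon'],
)

# Select one value by integer location (row 0, column 0).
first_value = data.iloc[0, 0]

# Use : on its own to mean all columns.
row = data.loc['Borduria', :]

# Named slices with loc include both end points.
subset = data.loc['Aland':'Borduria', 'gdp_1957':'gdp_1962']

# A comparison produces a Boolean mask; applying it keeps matching
# values and puts NaN everywhere else.
mask = subset > 200
masked = subset[mask]
print(masked)

# The result of slicing can be used in further operations.
print(subset.max())
```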

Plotting

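import matplotlib.pyplot as plt
# Example data (assumed values, for illustration only)
time = [0, 1, 2, 3]
position = [0, 100, 200, 300]
plt.plot(time, position, label='label')
plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.legend()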

Lists
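A short sketch of the list operations from the key points, using made-up values:

```python
pressures = [0.273, 0.275, 0.277]

pressures[0] = 0.265     # replace a value by assigning to its index
pressures.append(0.279)  # appending lengthens the list
del pressures[1]         # remove an item entirely
print(pressures)

empty = []                   # the empty list contains no values
mixed = [49, 'years', 27.0]  # values may have different types

# Strings can be indexed like lists, but they are immutable.
element = 'carbon'
print(element[0])
try:
    element[0] = 'C'
except TypeError as error:
    print('cannot change a string in place:', error)

# Indexing beyond the end of a collection is an error.
try:
    print(pressures[99])
except IndexError as error:
    print('index out of range:', error)
```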

For Loops

for number in range(0,5):
  print(number)
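
The accumulator pattern from the key points can be sketched the same way: start with an initial value, then update it once per item.

```python
# Sum the first ten positive integers.
total = 0
for number in range(1, 11):  # range stops before 11
    total = total + number
print(total)
```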

Looping Over Data Sets

import glob
import pandas as pd
for filename in glob.glob('*.txt'):
  data = pd.read_csv(filename)

Writing Functions

def add_numbers(a, b):
  result = a + b
  return result

add_numbers(1, 4)

Variable Scope
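A minimal sketch of scope, using a hypothetical function adjust and arbitrary numbers: pressure is global and visible everywhere, while temperature and new_value exist only inside the function.

```python
pressure = 103.9  # global variable


def adjust(temperature):
    # temperature and new_value are local to this function.
    new_value = temperature * 1.43 / pressure
    return new_value


adjusted = adjust(0.9)
print('adjusted value is', adjusted)

# new_value cannot be seen outside the function.
try:
    print(new_value)
    visible_outside = True
except NameError as error:
    visible_outside = False
    print('new_value is not visible here:', error)
```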

Conditionals

for m in [3, 6, 7, 2, 8]:
  if m > 5:
    print(m, 'is large')
  elif m == 5:
    print(m, 'is 5')
  else:
    print(m, 'is small')

Programming Style
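A docstring is a string placed as the first statement of a function. The sketch below uses a hypothetical function, average, to show how help picks it up.

```python
def average(values):
    """Return the mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# The docstring is kept in the running program and shown by help.
help(average)
print(average.__doc__)
print(average([1, 2, 3]))
```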

Glossary

Additive color model
A way to represent colors as the sum of contributions from primary colors such as red, green, and blue.
Argument
A value given to a function or program when it runs. The term is often used interchangeably (and inconsistently) with parameter.
Array
A container holding elements of the same type.
Assertion
An expression which is supposed to be true at a particular point in a program. Programmers typically put assertions in their code to check for errors; if the assertion fails (i.e., if the expression evaluates as false), the program halts and produces an error message. See also: invariant, precondition, postcondition.
Assign
To give a value a name by associating a variable with it.
Body
(of a function): the statements that are executed when a function runs.
Boolean
A value that is either true or false.
Call stack
A data structure inside a running program that keeps track of active function calls.
Case-insensitive
Treating text as if upper and lower case characters of the same letter were the same. See also: case-sensitive.
Case-sensitive
Treating text as if upper and lower case characters of the same letter are different. See also: case-insensitive.
Comment
A remark in a program that is intended to help human readers understand what is going on, but is ignored by the computer. Comments in Python, R, and the Unix shell start with a # character and run to the end of the line; comments in SQL start with --, and other languages have other conventions.
Compose
To apply one function to the result of another, such as f(g(x)).
Conditional statement
A statement in a program that might or might not be executed depending on whether a test is true or false.
Comma-separated values
(CSV) A common textual representation for tables in which the values in each row are separated by commas.
DataFrame
The way Pandas represents a table; a collection of series.
Default value
A value to use for a parameter if nothing is specified explicitly.
Defensive programming
The practice of writing programs that check their own operation to catch errors as early as possible.
Delimiter
A character or characters used to separate individual values, such as the commas between columns in a CSV file.
Docstring
Short for “documentation string”, this refers to textual documentation embedded in Python programs. Unlike comments, docstrings are preserved in the running program and can be examined in interactive sessions.
Documentation
Human-language text written to explain what software does, how it works, or how to use it.
Dotted notation
A two-part notation used in many programming languages in which thing.component refers to the component belonging to thing.
Element
An item in a list or an array. For a string, these are the individual characters.
Empty string
A character string containing no characters, often thought of as the “zero” of text.
Encapsulation
The practice of hiding something’s implementation details so that the rest of a program can worry about what it does rather than how it does it.
Floating-point number
A number containing a fractional part and an exponent. See also: integer.
For loop
A loop that is executed once for each value in some kind of set, list, or range. See also: while loop.
Function
A block of code that can be called and re-used elsewhere. Occurrence of a function name in the code is a function call. Functions may process input arguments and return the result back. Functions may also be used for logically grouping together pieces of code. In such cases, they don’t need to return any meaningful value and can be written without the return statement completely. Such functions return a special value None, which is a way of saying “nothing” in Python.
Function call
A use of a function in another piece of software.
Global variable
A variable defined outside of a function that can be used anywhere.
Immutable
Unchangeable. The value of immutable data cannot be altered after it has been created. See also: mutable.
Import
To load a library into a program.
In-place operators
An operator such as += that provides a shorthand notation for the common case in which the variable being assigned to is also an operand on the right hand side of the assignment. For example, the statement x += 3 means the same thing as x = x + 3.
Index
A subscript that specifies the location of a single value in a collection, such as a single pixel in an image.
Inner loop
A loop that is inside another loop. See also: outer loop.
Integer
A whole number, such as -12343. See also: floating-point number.
Invariant
An expression whose value doesn’t change during the execution of a program, typically used in an assertion. See also: precondition, postcondition.
Jupyter Notebook
Interactive coding environment allowing a combination of code and markdown.
Library
A collection of files containing functions used by other programs.
Local Variable
A variable defined inside of a function that can only be used inside of that function.
Loop Variable
The variable that keeps track of the progress of the loop.
Mask
A boolean object used for selecting data from another object.
Member
A variable contained within an object.
Method
A function which is tied to a particular object. Each of an object’s methods typically implements one of the things it can do, or one of the questions it can answer.
Modules
The files within a library containing functions used by other programs.
Mutable
Changeable. The value of mutable data can be altered after it has been created. See also: immutable.
Object
A collection of conceptually related variables (members) and functions using those variables (methods).
Outer loop
A loop that contains another loop. See also: inner loop.
Parameter
A variable named in the function’s declaration that is used to hold a value passed into the call. The term is often used interchangeably (and inconsistently) with argument.
Pipe
A connection from the output of one program to the input of another. When two or more programs are connected in this way, they are called a “pipeline”.
Postcondition
A condition that a function (or other block of code) guarantees is true once it has finished running. Postconditions are often represented using assertions.
Precondition
A condition that must be true in order for a function (or other block of code) to run correctly.
Regression
To re-introduce a bug that was once fixed.
Return statement
A statement that causes a function to stop executing and return a value to its caller immediately.
RGB
An additive model that represents colors as combinations of red, green, and blue. Each color’s value is typically in the range 0..255 (i.e., a one-byte integer).
Sequence
A collection of information that is presented in a specific order. For example, in Python, a string is a sequence of characters, while a list is a sequence of values of any type.
Series
A Pandas data structure to represent a column.
Shape
An array’s dimensions, represented as a vector. For example, a 5×3 array’s shape is (5,3).
Silent failure
Failing without producing any warning messages. Silent failures are hard to detect and debug.
Slice
A regular subsequence of a larger sequence, such as the first five elements or every second element.
Stack frame
A data structure that provides storage for a function’s local variables. Each time a function is called, a new stack frame is created and put on the top of the call stack. When the function returns, the stack frame is discarded.
Standard input
A process’s default input stream. In interactive command-line applications, it is typically connected to the keyboard; in a pipe, it receives data from the standard output of the preceding process.
Standard output
A process’s default output stream. In interactive command-line applications, data sent to standard output is displayed on the screen; in a pipe, it is passed to the standard input of the next process.
String
Short for “character string”, a sequence of zero or more characters.
Substring
A part of a string.
Syntax
The rules that define how code must be written for a computer to understand.
Syntax error
A programming error that occurs when statements are in an order or contain characters not expected by the programming language.
Test oracle
A program, device, data set, or human being against which the results of a test can be compared.
Test-driven development
The practice of writing unit tests before writing the code they test.
Traceback
The sequence of function calls that led to an error.
Tuple
An immutable sequence of values.
Type
The classification of something in a program (for example, the contents of a variable) as a kind of number (e.g. floating-point, integer), string, or something else.
Type of error
Indicates the nature of an error in a program. For example, in Python, an IOError refers to problems with file input/output. See also: syntax error.
Variable
A value that has a name associated with it.
While loop
A loop that keeps executing as long as some condition is true. See also: for loop.