Programming with Python

Data types

Learning Objectives

  • Identify built-in data types in Python
  • Differentiate between scalar and structured objects
  • Recognize mutable and immutable objects
  • Convert between data types in Python

Before we go much further into numerical modeling, we should stop and discuss some of the inner workings of Python. Recognizing the way values can be handled by Python will give you flexibility in programming and help you avoid common errors.

Early in the previous lesson, we saw that we could assign a value to a variable using the symbol =:

elevation_ft = 5430    # elevation of Boulder, CO in feet 


The variable name elevation_ft is not itself the value 5430. It is simply a label that points to a place in the memory where the object with the value 5430 is stored.

This is different from the way the symbol = is used in algebra. An equation like this one represents different things in Python and in algebra:

x = 4 + 1

In both cases, the letter ‘x’ corresponds to the value 5. In algebra, ‘x’ is equivalent to 5; the symbol is simply taking the place of the number. In Python, ‘x’ is not itself 5; it is a name that points to an object with a value of 5. The variable name ‘x’ is short-hand for the address where the object is stored in the memory.
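As a minimal sketch of what "a name that points to an object" means, the example below uses the built-in function id (which returns an identifier for an object) and the operator is (which tests whether two names point to the same object). The name y is made up for illustration, and print is written with parentheses so the sketch should run in both Python 2 and Python 3.

```python
x = 4 + 1              # x points to an object with the value 5
y = x                  # y now points to the same object as x

print(y is x)          # True: both names point to one object
print(id(x) == id(y))  # True: same object, same identifier

x = 'five'             # point the name x at a different object
print(y)               # 5: y still points to the original object
```

Reassigning x does not change the object with the value 5. It only changes which object the name x points to.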

Objects are classified into different classes or data types that define the kinds of things that a program can do with those objects. An integer (like 5430 above) is one type of object, the string “Hello, World!” is also an object, and the numpy array of elevation values in the previous lesson was another type of object.

Scalar objects

Objects are either scalar or non-scalar. Scalar objects are the building blocks of data. They hold a single value and cannot be divided. Non-scalar objects hold sets of elements within some internal structure. Computers operate directly on scalar objects but have to iterate through the elements of a non-scalar object in order to process it.

The term scalar comes from linear algebra, where it is used to differentiate a single number from a vector or matrix.

- Integers

We can use the built-in function type to see what type a particular object is:

type(5430)
int

The number 5430 is an object of type int, or integer. We can also use type to see the type of the object that a variable is assigned to:

type(elevation_ft)
int

The variable name elevation_ft is assigned to an object with the value 5430, which is of type int. Integer is one of several built-in data types in Python. Because they are built in, we don’t need to load a library to use them.

- Floats

Real numbers (numbers that can have a fractional part) are represented as floating point numbers, or floats:

elevation_m = 1655.064 # elevation of Boulder, CO in meters

type(elevation_m)
float

A number doesn’t need to have a meaningful fractional part to be a float. Just adding a decimal point to a whole number makes it a float:

print '7 is', type(7)
print '-' * 20
print '7. is', type(7.)
print '7.0 is', type(7.0)
7 is <type 'int'>
--------------------
7. is <type 'float'>
7.0 is <type 'float'>
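Conversion also works in the other direction. As a small sketch (the name whole_meters is made up for illustration), the built-in function int() converts a float to an integer by dropping the fractional part rather than rounding, and float() converts an integer back to a float:

```python
elevation_m = 1655.064         # elevation of Boulder, CO in meters

whole_meters = int(elevation_m)    # the fractional part is discarded
print(whole_meters)                # 1655
print(int(7.9))                    # 7, not 8
print(int(-7.9))                   # -7: int() truncates toward zero
print(float(whole_meters))         # 1655.0
```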

- Booleans

Other types of objects in Python are a bit more unusual. Boolean objects can take one of two values: True or False. We will see in a later lesson that boolean objects are produced by operations that compare values against one another and by conditional statements.

You’ll notice that the words True and False change color when you type them into a Jupyter Notebook. They look different because they are recognized as special keywords. This only works when True and False are capitalized, though! Python does not treat lower case true and false as boolean objects.

i_like_chocolate = True
type(i_like_chocolate)
bool


When used in an arithmetic operation, a boolean object acts like an integer. True takes a value of 1 and False a value of 0:

print '3 * True:', 3 * True
print '3.0 * True:', 3.0 * True
print '3.0 * False:', 3.0 * False
3 * True: 3
3.0 * True: 3.0
3.0 * False: 0.0


We can cast objects of any type into a boolean using the function bool():

print bool(127.3)
True

- NoneType

The most abstract of data types in Python is the NoneType. NoneType objects can only contain the special constant None. None is the value that an object takes when no value is set or in the absence of a value. None is a null or NoData value. It is not the same as False, it is not 0 and it is not an empty string. None is nothing.

In Python 2, None always compares as less than any value other than None (in Python 3, an ordering comparison such as < or > between None and another object raises a TypeError instead):

nothing = None
print type(nothing)

print nothing > -4
print nothing == nothing   # double == compares for equivalency
<type 'NoneType'>
False
True
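The claim that None is not the same as False, 0, or an empty string can be checked directly. This is only an illustrative sketch. It uses equality comparisons, which should behave the same way in Python 2 and Python 3:

```python
nothing = None

print(nothing == False)   # False: None is not the boolean False
print(nothing == 0)       # False: None is not zero
print(nothing == '')      # False: None is not an empty string
print(nothing is None)    # True: the usual way to test for None
```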

Why would you ever want to create an object that contains nothing at all? As you build more complex programs, you’ll find many situations where you might want to set a variable but don’t want to assign a value to it quite yet. For example, you might want your code to perform one action if the user sets a certain variable but perform a different action if the user does nothing:

input_from_user = None

# The user might or might not provide input here.
# If the user provides input, the value would be
# assigned to the variable input_from_user

if input_from_user is None:
    print "The user hasn't said anything!"
    
if input_from_user is not None:
    print "The user said:", input_from_user
The user hasn't said anything!


Try assigning an object of a different type to input_from_user to see how the script behaves.

Numeric data types

What type of object are these values?

  • 5.6
  • 1932
  • 7.0000

Solution


  • float
  • int
  • float

Casting and integer division

Think about the operations that occur when running the following statements. Why are their outputs different?

print 'a:', 100/3
print 'b:', float(100)/3
print 'c:', 100/float(3)
print 'd:', float(100/3)

Solution


a: 33
b: 33.3333333333
c: 33.3333333333
d: 33.0
  • (a) Dividing two integers results in an integer.
  • (b), (c) Casting either the dividend or the divisor to a float means that the operation is no longer integer division.
  • (d) The function float() is acting on the output of integer division. The remainder has already been discarded.
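The outputs above depend on Python 2, where / between two integers performs integer division. In Python 3, / between two integers returns a float. As a hedged sketch, the operators // (floor division) and % (remainder) make the intent explicit and should give the same results in both versions. The names quotient and remainder are made up for illustration.

```python
quotient = 100 // 3        # floor division: the remainder is discarded
remainder = 100 % 3        # the remainder that was discarded

print(quotient)            # 33
print(remainder)           # 1
print(100 // 3.0)          # 33.0: floor division with a float returns a float
print(float(100) / 3)      # true division, because one operand is a float
```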

Lemonade sales

You get hired to work for a highly successful lemonade stand. Their database is managed by a 7-year-old, though, so their data is a mess. These are their sales reports for FY2017:

sales_1q = ["50.3"] # thousand dollars
sales_2q = 108.52
sales_3q = 79
sales_4q = "82"

- Calculate the total sales for FY2017

Solution


total_sales = float(sales_1q[0]) + sales_2q + sales_3q + float(sales_4q)
print 'Total lemonade sales:', total_sales, 'thousand dollars'
Total lemonade sales: 319.82 thousand dollars

Casting bool

Any type of object can be cast to a boolean with the function bool(). Which of these objects converts to True and which to False?

  • a negative float
  • None
  • the boolean object True
  • the integer 0
  • the float 0.0
  • the string ‘string’
  • an empty string
  • a string that contains only a space
  • 3e-324
  • 2e-324
  • a list with one item
  • an empty list

Solution


  • a negative float: True
  • None: False
  • the boolean object True: True
  • the integer 0: False
  • the float 0.0: False
  • the string ‘string’: True
  • an empty string: False
  • a string that contains only a space: True
  • 3e-324: True
  • 2e-324: False
  • a list with one item: True
  • an empty list: False

Non-scalar (or structured) objects

Non-scalar objects contain multiple elements that can be separated into parts. Since they have an internal structure, we can use indexing to access the individual parts of a non-scalar object.

There are several built-in types of non-scalar objects in Python. They can be grouped according to their internal structure:

  • Sequences are structured objects where elements are kept in a known order. We use integer indexing and slicing to access elements based on their position.

  • Mapping objects map keys to values. Because the elements of a mapping object are not stored in order, we cannot select them based on their position. Instead, the keys serve as indices.

The differences between the two groups will make more sense after looking at some examples.

Sequences

- Strings

Objects of type string are simply sequences of characters with a defined order. Strings have to be enclosed in single quotes (' '), double quotes (" "), or triple single or double quotes (''' ''', """ """). A string enclosed in double quotes can itself contain single quotes:

print type("The judge said 'Nobody expects the Spanish Inquisition!'")
<type 'str'>

We can cast objects of any type into strings with the function str():

str(bool(6 < 2))
'False'

We can test whether a substring exists within a string using the keyword in:

print 'a' in 'program'
print 'at' not in 'battle'
True
False

There are many methods available for objects of type string:

string = "if it's in caps i'm trying to YELL!"

print string.lower()
print string.upper()
print string.capitalize()
print string.split()
print string.replace('YELL', 'fix my keyboard')
if it's in caps i'm trying to yell!
IF IT'S IN CAPS I'M TRYING TO YELL!
If it's in caps i'm trying to yell!
['if', "it's", 'in', 'caps', "i'm", 'trying', 'to', 'YELL!']
if it's in caps i'm trying to fix my keyboard!
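Because strings are sequences, the integer indexing and slicing described for sequences also apply to them. A short sketch, using a made-up variable named city:

```python
city = 'Boulder'

print(city[0])      # B: indexing starts at 0
print(city[-1])     # r: negative indices count from the end
print(city[0:4])    # Boul: a slice stops before the end index
print(len(city))    # 7: number of characters in the string
```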

- Lists

A list is exactly what it sounds like – a sequence of things. The objects contained in a list don’t have to be of the same type: one list can simultaneously contain numbers, strings, other lists, numpy arrays, and even functions. Like other sequences, lists are ordered. We can access the individual items in a list through an integer index.

Lists are created by putting values, separated by commas, inside square brackets:

shopping_list = ['funions', 'ice cream', 'guacamole']

We can change the individual values in a list using indexing:

shopping_list[0] = 'funyuns' # oops
print shopping_list
['funyuns', 'ice cream', 'guacamole']


There are many ways to change the contents of lists besides assigning new values to individual elements:

shopping_list.append('tortilla chips')   # add one item
print shopping_list
['funyuns', 'ice cream', 'guacamole', 'tortilla chips']


del shopping_list[0]   # delete the first item
print shopping_list
['ice cream', 'guacamole', 'tortilla chips']


shopping_list.reverse()   # reverse the order of the list (in place)
print shopping_list
['tortilla chips', 'guacamole', 'ice cream']


We can use operators to concatenate lists or build lists with repeated elements:

shopping_list = shopping_list + ['coffee', 'cheese']
print shopping_list
['tortilla chips', 'guacamole', 'ice cream', 'coffee', 'cheese']


print 3 * shopping_list[-1:]
['cheese', 'cheese', 'cheese']


There is one very important difference between lists and strings: lists can be modified in place while strings cannot.
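The sketch below illustrates that difference with a made-up variable named snack. Assigning to one position of a string raises a TypeError, so the try/except statement is used here only to let the script keep running after the error. To "change" a string we build a new string and reassign the name to it.

```python
snack = 'funions'

try:
    snack[3] = 'y'                  # strings do not support item assignment
except TypeError as error:
    print('TypeError: ' + str(error))

snack = snack.replace('io', 'yu')   # replace() returns a new string object
print(snack)                        # funyuns
```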

- Tuples

Like lists, tuples are simply sequences of objects. Tuples, however, are immutable objects. We can only change the values in a tuple by assigning the variable name to a new object.

Tuples are created by putting values in a sequence, separated by commas. For easier reading, they are usually inside parentheses:

things = ('toy cars', 42, 'dinosaur')
print type(things)
<type 'tuple'>

Because they are sequences, we can use indexing to access individual values in a tuple:

print things[0]
toy cars

However, because they are immutable objects, we cannot use indexing to change the values of a tuple:

things[0] = 'toy airplanes'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-151-546e5c83c872> in <module>()
----> 1 things[0] = 'toy airplanes'

TypeError: 'tuple' object does not support item assignment
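One way to get an updated tuple, sketched here under the assumption that a temporary list is acceptable, is to cast the tuple to a list with list(), change the list, and cast it back with tuple(). The name things_list is made up for illustration. The result is a new tuple object assigned to the old name, not a modified version of the original tuple.

```python
things = ('toy cars', 42, 'dinosaur')

things_list = list(things)         # a mutable copy of the tuple's values
things_list[0] = 'toy airplanes'   # lists support item assignment
things = tuple(things_list)        # reassign the name to a new tuple

print(things)    # ('toy airplanes', 42, 'dinosaur')
```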

Mapping types

- Dictionaries

Because values in sequences are stored in a known order, individual values in sequence-type objects can be accessed by their position through integer indices. Dictionaries are a type of object where values are not stored in any particular order. Dictionaries are unordered collections of key:value pairs (since Python 3.7, dictionaries preserve the order in which pairs were inserted, but values are still accessed by key rather than by position). They map (or match) keys, which can be any immutable type (strings, numbers, tuples), to values, which can be of any type (heterogeneous). Individual values in a dictionary are accessed by their keys.

We create dictionaries with curly brackets and pairs of keys and values. An empty dictionary would simply have no key:value pairs inside the curly brackets:

person = {'name':'Jack', 'age': 32}
print person
{'age': 32, 'name': 'Jack'}

Notice that the order of the key:value pairs is different in the dictionary definition than in the output! Because values in a dictionary are not stored in a particular order, they take an arbitrary order when the dictionary is displayed.

We can access and modify individual values in a dictionary with their keys:

person['age'] = 33
print person
{'age': 33, 'name': 'Jack'}

We can also use keys to add values to a previously defined dictionary:

person['address'] = 'Downtown Boulder'  
print person
{'age': 33, 'name': 'Jack', 'address': 'Downtown Boulder'}
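A few other common dictionary operations, shown as an illustrative sketch: the keyword in tests whether a key exists, del removes a key:value pair, and the method get returns None instead of raising an error when a key is missing. The key 'phone' is made up for the example.

```python
person = {'name': 'Jack', 'age': 33, 'address': 'Downtown Boulder'}

print('age' in person)        # True: 'age' is a key
print('Jack' in person)       # False: 'Jack' is a value, not a key

del person['address']         # remove a key:value pair
print(len(person))            # 2: two pairs remain

print(person.get('phone'))    # None: there is no key 'phone'
```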


String methods

Can you explain what this script does?

string = "if it's in caps i'm trying to YELL!"

print string.find('caps')
  • Modify the command so that it finds the substring (‘caps’) even if capitalization is different in the string (ex. ‘CAPS’).

  • What is the output of find if the substring is not in the string?

  • What happens if the substring appears more than once in the string? (ex. ‘in’)

Solution


loc = string.find('caps')
print string[loc:]
caps i'm trying to YELL!

The method find returns the start index of the substring.

# change capitalization for testing
string = "if it's in cAPs i'm trying to YELL!"

print 'If substring not in string:', string.find('caps')

# force string to lowercase
print 'If find substring:', string.lower().find('caps')

print "Returns only first occurrence of substring 'in':", string.lower().find('in')
If substring not in string: -1
If find substring: 11
Returns only first occurrence of substring 'in': 8

Numbers in strings

You decided to quit research and open a bar. You are using Python to create a sign.

age = 21   # <--- don't change this line!
sign = 'You must be ' + age + '-years-old to enter this bar'
print sign
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-28-253b33d26e7e> in <module>()
     1 age = 21   # <--- don't change this line!
----> 2 sign = 'You must be ' + age + '-years-old to enter this bar'
     3 print sign


TypeError: cannot concatenate 'str' and 'int' objects

Fix your code so it prints the text in sign correctly. Don’t change the first line!

Solution


age = 21
sign = 'You must be ' + str(age) + '-years-old to enter this bar'
print sign
You must be 21-years-old to enter this bar

Cheeeeeeeeese

What is the difference between these two statements? Why are their outputs different?

s1 = 3 * shopping_list[-1:]
s2 = 3 * shopping_list[-1]

Solution


print s1, type(s1)
print s2, type(s2)
['cheese', 'cheese', 'cheese'] <type 'list'>
cheesecheesecheese <type 'str'>

shopping_list[-1] is the last value in the list, which is a string. The second statement is therefore repeating a string three times.

shopping_list[-1:] is a slice of a list, so it is also a list (even if it only has one value). The first statement is therefore repeating a list three times.

Human readable numbers

When we write down a large integer, it’s customary to use commas (or periods, depending on the country) to separate the number into groups of three digits. It’s easier for humans to read a large number with separators but Python sees them as something else. What type of object is this? Why does Python read it as this object type?

my_account_balance = 15,752,000,000

Solution


my_account_balance = 15,752,000,000
type(my_account_balance)
tuple

You don’t actually need the parentheses to create a tuple. Python reads any sequence of objects separated by commas as a tuple.
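If a number arrives as text with separators, one possible way to turn it into an integer is to remove the commas with the string method replace and then cast the result with int(). This is only a sketch. The name balance_text is made up for illustration.

```python
balance_text = '15,752,000,000'

# Strip the separators, then cast the remaining digits to an integer
my_account_balance = int(balance_text.replace(',', ''))
print(my_account_balance)    # 15752000000
```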

Tiny tuples

Create a tuple that contains only one value. Confirm that it’s really a tuple. You might have to experiment!

Hint: Start with a tuple with two values and simplify it.

Solution


lil_tuple = 1,
type(lil_tuple)
tuple

Travel guide

  • Create an empty dictionary called “states”
  • Add 3 items to the dictionary. Map state names (the keys) to their abbreviations (the values) (ex. ‘Wyoming’:‘WY’). Pick easy ones! You can also use states from another country.

Solution


states = {}

states['Colorado'] = 'CO'
states['California'] = 'CA'
states['Florida'] = 'FL'
  • Use a variable in place of a key to access values in your states dictionary. For example, if I set the variable to “Wyoming”, the value should be “WY”.

Solution


selected_state = 'California'

print states[selected_state]
CA
  • Create a dictionary called “cities” that contains 3 key:value pairs. The keys should be the state abbreviations in your states dictionary and the values should be the name of one city in each of those states (ex. ‘WY’:‘Laramie’). Don’t start with an empty dictionary and add values to it – initialize the dictionary with all of the key:value pairs already in it.

Solution


cities = {'CO':'Denver', 'FL':'Miami', 'CA':'San Francisco'}

Travel guide, part II (Advanced)

  • Write a short script to fill in the blanks in this string for any state in your states dictionary.

    __________ is abbreviated ____ and has cities like ________

  • Refactor (rewrite, improve) your code so you only have to change one word in your script to change states.

Hint: The values in one of your dictionaries are the keys for the other dictionary

Solution


selected_state = 'Colorado'

print selected_state + ' is abbreviated ' + states[selected_state] + ' and has cities like ' + cities[states[selected_state]]
Colorado is abbreviated CO and has cities like Denver