Abstract Data Types

Data abstraction is a powerful concept in computer science that allows programmers to treat code as objects --- for example, car objects, chair objects, people objects, etc. That way, programmers don't have to worry about how code is implemented --- they just have to know what it does.

Data abstraction mimics how we think about the world. For example, when you want to drive a car, you don't need to know how the engine was built or what kind of material the tires are made of. You just have to know how to turn the wheel and press the gas pedal.

An abstract data type consists of two types of functions:

  • Constructors: functions that build the abstract data type.
  • Selectors: functions that retrieve information from the data type.

For example, say we have an abstract data type called city. This city object will hold the city's name, and its latitude and longitude. To create a city object, you'd use a constructor like

city = make_city(name, lat, lon)

To extract the information of a city object, you would use the selectors like

get_name(city)
get_lat(city)
get_lon(city)

For example, here is how we would use the make_city constructor to create a city object to represent Berkeley and the selectors to access its information.

>>> berkeley = make_city('Berkeley', 122, 37)
>>> get_name(berkeley)
'Berkeley'
>>> get_lat(berkeley)
122
>>> get_lon(berkeley)
37

Notice that we don't need to know how these functions were implemented. We are assuming that someone else has defined them for us.

It's okay if the end user doesn't know how functions were implemented. However, the functions still have to be defined by someone. We'll look into defining the constructors and selectors later in this discussion.

Question 1: Distance

We will now use those selectors to write distance, which computes the distance between two city objects. Recall that the distance between two coordinate pairs, (x1, y1) and (x2, y2) can be found by calculating the sqrt of (x1 - x2)**2 + (y1 - y2)**2. We have already imported sqrt for your convenience, so use that and the appropriate selectors to fill in the function.

from math import sqrt
def distance(city_1, city_2):
    """
    >>> city1 = make_city('city1', 0, 1)
    >>> city2 = make_city('city2', 0, 2)
    >>> distance(city1, city2)
    1.0
    """

"*** YOUR CODE HERE ***"
lat_1, lon_1 = get_lat(city_1), get_lon(city_1) lat_2, lon_2 = get_lat(city_2), get_lon(city_2) return sqrt((lat_1 - lat_2)**2 + (lon_1 - lon_2)**2)

Use OK to test your code:

python3 ok -q distance

Question 2: Closer city

Implement closer_city, a function that takes a latitude, longitude, and two cities, and returns the name of the city that is relatively closer to the provided latitude and longitude.

You may only use selectors and constructors (introduced above) for this question. You may also use the distance function defined above. Remember, the point of data abstraction, as we've said, is that we do not need to know how an abstract data type is implemented, but rather just how we can interact with and use the data type.

def closer_city(lat, lon, city1, city2):
    """ Returns the name of either city1 or city2, whichever is closest
        to coordinate (lat, lon).

        >>> berkeley = make_city('Berkeley', 37.87, 112.26)
        >>> stanford = make_city('Stanford', 34.05, 118.25)
        >>> closer_city(38.33, 121.44, berkeley, stanford)
        'Stanford'
        >>> bucharest = make_city('Bucharest', 44.43, 26.10)
        >>> vienna = make_city('Vienna', 48.20, 16.37)
        >>> closer_city(41.29, 174.78, bucharest, vienna)
        'Bucharest'
    """
"*** YOUR CODE HERE ***" return <REPLACE THIS>
new_city = make_city('arb', lat, lon) dist1 = distance(city1, new_city) dist2 = distance(city2, new_city) if dist1 < dist2: return get_name(city1) return get_name(city2)

Use OK to test your code:

python3 ok -q closer_city

Question 3: Closer City Abstraction

Run the following ok test to make sure that you are using abstraction barriers correctly! You should not need to change your code from the previous question to pass this test.

Use OK to test your code:

python3 ok -q check_abstraction

Dictionaries

You've actually seen several abstract data types! List, tuples, ranges, and even strings are examples of abstract data types. Dictionary is another example of abstract data types.

Dictionaries are unordered sets of key-value pairs. To create a dictionary, use the following syntax:

>>> singers = { 'Iggy Azalea': 'Fancy', 'Beyonce': 'Flawless', 'Adam Levine': 'Maps'}

The curly braces denote the key-value pairs in your dictionary. Each key-value pair is separated by a comma. For each pair, the key appears to the left of the colon and the value appears to the right of the colon. (This is a dictionary's constructor!) You can retrieve values from your dictionary by "indexing" using the key:

>>> singers['Beyonce']
'Flawless'
>>> singers['Iggy Azalea']
'Fancy'

You can update an entry for an existing key in the dictionary using the following syntax. What this means is that each key is unique. Be careful, adding a new key follows identical syntax!

>>> singers['Beyonce'] = 'Survivor'
>>> singers['Beyonce']
'Survivor'
>>> singers['Nicki Minaj'] = 'Anaconda' # new entry!
>>> singers['Nicki Minaj']
'Anaconda'

You can also check for membership of keys!

>>> 'Adam Levine' in singers
True

Recall how we can iterate through a list using for-loops. For example, you can do something like this:

>>> a = [1,2,3]
>>> for each in a:
...     print(each)
1
2
3

What happens if you iterate through a dictionary? Can you even iterate through a dictionary?? Notice what happens:

>>> shopping_cart = {"apple":3, "bananas":4, "orange":6}
>>> for each in shopping_cart:
...     print(each)
apple
bananas
orange

Notice that when you iterate through a dictionary, the set of keys is what you iterate through. How would you print out values instead? You can simply do:

>>> shopping_cart = {"apple":3, "bananas":4, "orange":6}
>>> for each in shopping_cart:
...     print(shopping_cart[each])
3
4
5

Question 4: Counter

Implement the function counter which takes in a string of words, and returns a dictionary where each key is a word in the message, and each value is the number of times that word is present in the original string.

def counter(message):
    """ Returns a dictionary of each word in message mapped
    to the number of times it appears in the input string.

    >>> x = counter('to be or not to be')
    >>> x['to']
    2
    >>> x['be']
    2
    >>> x['not']
    1
    >>> y = counter('run forrest run')
    >>> y['run']
    2
    >>> y['forrest']
    1
    """
    word_list = message.split()
"*** YOUR CODE HERE ***"
result_dict = {} for word in word_list: if word in result_dict: result_dict[word] += 1 else: result_dict[word] = 1 return result_dict

Use OK to test your code:

python3 ok -q counter

Question 5: Politician ADT

Now let's combine ADT and dictionaries! Let's try to make an ADT that represents a politician. The politician ADT has a constructor make_politician. There are three selectors, get_name, get_party, and get_age. When implementing an ADT, you have the freedom to choose how you want to represent it. Some common data structures used to represent ADTs are lists and dictionaries. The city ADT was implemented using a list. Now, let's use dictionaries to implement our politician ADT. You must use a dictionary for this question, or else you will not pass the tests.

Use OK to test your code:

python3 ok -q make_politician
python3 ok -q get_pol_name
python3 ok -q get_party
python3 ok -q get_age

Recursion and Tree Recursion

Question 6: Remove from sequence

Complete the recursive function remove which removes the first element in a sequence that is equal to a specified value.

def first(s):
    """Return the first element in a sequence."""
    return s[0]

def rest(s):
    """Return all elements in a sequence after the first"""
    return s[1:]

def remove(x, s):
    """Remove first element equal to x from sequence s.

    >>> remove(1,[])
    []
    >>> remove(1,[1])
    []
    >>> remove(1,[1,1])
    [1]
    >>> remove(1,[2,1])
    [2]
    >>> remove(1,[3,1,2])
    [3, 2]
    >>> remove(1,[3,1,2,1])
    [3, 2, 1]
    >>> remove(5, [3, 5, 2, 5, 11])
    [3, 2, 5, 11]
    """
"*** YOUR CODE HERE ***"
if not s: return [] elif first(s) == x: return rest(s) else: return [first(s)] + remove(x, rest(s))

Illustrated here is a more complete doctest that shows good testing methodology. It is a little cumbersome as documentation, but you'll want to think about it for your projects. Test every condition that might come up. Then you won't be surprised when it does.

Use OK to test your code:

python3 ok -q remove

Question 7: Fibonacci

Write the fibonacci function using tree recursion. The function should take in n and return the nth fibonacci number.

As a reminder, Fibonacci is defined as a function where:

Fibonacci Tree

def fibonacci(n):
    """Return the nth fibonacci number.

    >>> fibonacci(11)
    89
    >>> fibonacci(5)
    5
    >>> fibonacci(0)
    0
    >>> fibonacci(1)
    1
    """
"*** YOUR CODE HERE ***"
if n == 0: return 0 elif n == 1: return 1 else: return fibonacci(n - 1) + fibonacci(n - 2)

Use OK to test your code:

python3 ok -q fibonacci

Optional Questions

Question 8: MLB All Star

It's October, which means its baseball playoffs season! In this exercise, let's utilize dictionaries to see if we can model and learn more about some of our favorite players. In this problem, you will be implementing multiple functions.

As you can see within your hw06.py file, the dictionaries mapping players to their team and statistics respectively have been created already. However, instead of accessing these values directly, we'll be implementing two functions to retrieve the appropriate values as a layer of abstraction.

Implement the get_team and get_stats functions to retrieve the team or statistics given a player's name.

full_roster = {
    "Manny Machado" : "Dodgers",
    "Yasiel Puig" : "Dodgers",
    "Aaron Judge" : "Yankees",
    "Clayton Kershaw" : "Dodgers",
    "Giancarlo Stanton" : "Yankees"
}

full_stats = {
    "Manny Machado": ["SO", "1B", "3B", "SO", "HR"],
    "Yasiel Puig": ["3B", "3B", "1B", "1B", "SO"],
    "Aaron Judge": ["SO", "HR", "HR", "1B", "SO"],
    "Clayton Kershaw": ["1B", "SO", "SO", "1B", "SO"],
    "Giancarlo Stanton": ["HR", "SO", "3B", "SO", "2B"],
}

def get_team(player):
    """Returns team that the provided player is a member of.

    >>> get_team("Manny Machado")
    'Dodgers'
    >>> get_team("Aaron Judge")
    'Yankees'
    """
"*** YOUR CODE HERE ***"
return full_roster[player]
def get_stats(player): """Returns the statistics associated with the provided player. >>> get_stats("Manny Machado") ['SO', '1B', '3B', 'SO', 'HR'] >>> get_stats('Aaron Judge') ['SO', 'HR', 'HR', '1B', 'SO'] """
"*** YOUR CODE HERE ***"
return full_stats[player]

Use OK to test your code:

python3 ok -q get_team
python3 ok -q get_stats

Question 9: Baseball Statistics

If you're a Moneyball movie fan, you know that statistics helps teams find value in players that nobody else sees. In this problem, you'll be implementing two functions to calculate the team batting average for one team, and the mean slugging percentage for all teams.

We highly encourage you to use the functions that you've defined before to solve these problems, specifically the common_players and get_players functions from the previous two problems.

Two functions have already been defined for you to calculate batting average and slugging percentage given a player's statistics. Use these functions when you calculate the averages for each team.

# Following Functions have been given to you, do NOT modify

def calculate_batting_average(stats):
    hits = 0
    total_bats = 0
    for at_bat in stats:
        if at_bat != "SO":
            hits += 1
        total_bats += 1
    return float(round(hits/total_bats, 1))

def calculate_slugging_percent(stats):
    bases = 0
    total_bats = 0
    for at_bat in stats:
        if at_bat == "1B":
            bases += 1
        elif at_bat == "2B":
            bases += 2
        elif at_bat == "3B":
            bases += 3
        elif at_bat == "HR":
            bases += 4
        total_bats += 1
    return float(round(bases/total_bats, 1))

# Modify Functions below

def calculate_team_BA(team):
    """Given a single team name, returns the mean batting average of all players on that team. You are encouraged to use previous functions that you've defined already
    >>> calculate_team_BA('Dodgers')
    0.6
    >>> calculate_team_BA('Yankees')
    0.6
    """
"*** YOUR CODE HERE ***"
roster = get_players(team) total_BA = 0 total_players = 0 for player in roster: player_stats = get_stats(player) total_BA += calculate_batting_average(player_stats) total_players += 1 return total_BA / total_players
def calculate_all_team_SP(): """Returns a dictionary mapping every team to the average slugging percentage of all players on that team. You are encouraged to use previous functions that you've defined already. >>> calculate_all_team_SP() {'Dodgers': 1.2, 'Yankees': 1.8} """
"*** YOUR CODE HERE ***"
team_SP = {} rosters = common_players(full_roster) for team in rosters: total_SP = 0 total_players = 0 for player in rosters[team]: player_stats = get_stats(player) total_SP += calculate_slugging_percent(player_stats) total_players += 1 team_SP[team] = total_SP / total_players return team_SP

Use OK to test your code:

python3 ok -q calculate_team_BA
python3 ok -q calculate_all_team_SP