Convert list of objects to a list of integers and a lookup table

Tags: python lookup

To illustrate what I mean, here is an example:

messages = [
  ('Ricky',  'Steve',  'SMS'),
  ('Steve',  'Karl',   'SMS'),
  ('Karl',   'Nora',   'Email')
]

I want to convert this list, together with a definition of groups, into a list of integers and a lookup dictionary, so that each element in a group gets a unique id. That id should be the index of the element in the lookup table, like this:

messages_int, lookup_table = create_lookup_list(
              messages, ('person', 'person', 'medium'))

print messages_int
[ (0, 1, 0),
  (1, 2, 0),
  (2, 3, 1) ]

print lookup_table
{ 'person': ['Ricky', 'Steve', 'Karl', 'Nora'],
  'medium': ['SMS', 'Email'] }

I wonder if there is an elegant and pythonic solution to this problem.

I am also open to better terminology than create_lookup_list, etc.
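
To make the intended mapping concrete: decoding the integer list with the lookup table should give back the original messages. A minimal sketch of that inverse, where decode_lookup_list is only a hypothetical helper name used for illustration:

```python
def decode_lookup_list(encoded, lookup_table, groups):
    # Hypothetical helper: each id is treated as a position in the list
    # stored under its group's name in the lookup table.
    return [tuple(lookup_table[group][i] for group, i in zip(groups, row))
            for row in encoded]

messages_int = [(0, 1, 0), (1, 2, 0), (2, 3, 1)]
lookup_table = {'person': ['Ricky', 'Steve', 'Karl', 'Nora'],
                'medium': ['SMS', 'Email']}

decoded = decode_lookup_list(messages_int, lookup_table,
                             ('person', 'person', 'medium'))
# decoded holds the original tuples again, starting with ('Ricky', 'Steve', 'SMS')
```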


Here is my solution, it's not better - it's just different :)

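def create_lookup_list(data, keys):
  encoded = []
  # one list of distinct values per group name
  table = dict([(key, []) for key in keys])

  for record in data:
      msg_int = []
      for key, value in zip(keys, record):
          if value not in table[key]:
              table[key].append(value)
          # the id is the position of the value in its group's list
          msg_int.append(table[key].index(value))
      encoded.append(tuple(msg_int))

  return encoded, table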

Here is mine; the inner function lets me build the index tuple with a generator expression.

def create_lookup_list( data, format):
    table = {}
    indices = []
    def get_index( item, form ):
        row = table.setdefault( form, [] )
        try:
            return row.index( item )
        except ValueError:
            n = len( row )
            row.append( item )
            return n
    for row in data:
        indices.append( tuple( get_index( item, form ) for item, form in zip( row, format ) ))

    # returned in the order the question's call unpacks: integers first, table second
    return indices, table

Here is my own solution - I doubt it's the best

def create_lookup_list(input_list, groups):
    # use a dictionary for the indices so that the index lookup 
    # is fast (not necessarily a requirement)
    indices = dict((group, {}) for group in groups) 
    output = []

    # assign indices by iterating through the list
    for row in input_list:
        newrow = []
        for group, element in zip(groups, row):
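            if element in indices[group]:
                index = indices[group][element]
            else:
                # first time this element is seen in its group: next free id
                index = indices[group][element] = len(indices[group])
            newrow.append(index)
        output.append(tuple(newrow))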

    # create the lookup table
    lookup_dict = {}
    for group in indices:
        # order the elements by their assigned index
        lookup_dict[group] = sorted(indices[group].keys(),
                key=indices[group].get)

    return output, lookup_dict
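
For comparison, here is a more compact sketch of the same dictionary-based idea. It assumes a Python version with dict comprehensions (2.7 or 3.x), and the names rows and ids are just my own choices for illustration:

```python
def create_lookup_list(rows, groups):
    # One dict per distinct group name, mapping element -> id.
    # Groups that repeat (e.g. 'person' twice) share the same dict.
    ids = {group: {} for group in groups}

    # setdefault returns the existing id, or stores the next free one
    # (the current size of that group's dict) for an unseen element.
    encoded = [
        tuple(ids[group].setdefault(element, len(ids[group]))
              for group, element in zip(groups, row))
        for row in rows
    ]

    # Turn each element -> id dict into a list ordered by id.
    lookup = {group: sorted(mapping, key=mapping.get)
              for group, mapping in ids.items()}
    return encoded, lookup

messages = [
    ('Ricky', 'Steve', 'SMS'),
    ('Steve', 'Karl', 'SMS'),
    ('Karl', 'Nora', 'Email'),
]
messages_int, lookup_table = create_lookup_list(
    messages, ('person', 'person', 'medium'))
```

The default argument to setdefault is evaluated before the call, so a new element always receives the size of the dict before it was inserted, which is the next unused id.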
