Best strategy for dealing with incomplete lines of data from a file

By : EmlynC
Source: Stackoverflow.com
Question!

I use the following block of code to read lines out of a file 'f' into a nested list:

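t, strmat = [], []
for line in f:
    # Strip only the newline; a bare rstrip() would also remove
    # trailing tabs and silently drop empty fields at the end of a row.
    fields = line.rstrip('\n').split('\t')
    t.append(fields[0])        # first column
    strmat.append(fields[1:])  # remaining columns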

Sometimes, however, the data is incomplete and a row may look like this:

['955.159', '62.8168', '', '', '', '', '', '', '', '', '', '', '', '', '', '29', '30', '0', '0']

This puts a spanner in the works: I would like the list to be cast implicitly to floats, but the empty fields '' cause it to be cast as an array of strings instead (dtype: S12).

I could add an 'if' statement that converts every empty field to a null value such as None (0 would be wrong in this instance), but I am unsure whether that is the best approach.
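
A minimal sketch of that conversion, assuming Python's None plays the role of NULL. The helper name parse_field is hypothetical, and the row is a shortened version of the sample row above:

```python
def parse_field(field):
    """Return the field as a float, or None when the field is empty."""
    # An empty (or whitespace-only) field is treated as missing, not as 0.
    if field.strip() == '':
        return None
    return float(field)

# Shortened sample row, for illustration only.
row = ['955.159', '62.8168', '', '', '29', '30', '0', '0']
values = [parse_field(field) for field in row]
print(values)  # [955.159, 62.8168, None, None, 29.0, 30.0, 0.0, 0.0]
```

Note that a list containing None will not become a float array on its own; float('nan') could be returned instead of None if a purely numeric result is needed.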

  1. Is this the best strategy for dealing with incomplete data?
  2. Should I fix the data as I read the stream, or do it post hoc?


Answers

How you should deal with incomplete values depends on the context of your application (which you haven't mentioned yet).

For example, you can simply ignore missing values.
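
As a rough illustration of ignoring missing values, the sketch below skips empty fields when averaging one column. The function name column_mean and the sample rows are made up for the example, and whether skipping is appropriate still depends on the application:

```python
import math

def column_mean(rows, index):
    """Average column `index`, ignoring rows where that field is empty."""
    present = [float(row[index]) for row in rows
               if index < len(row) and row[index].strip() != '']
    # With no usable values there is nothing to average, so report NaN.
    if not present:
        return math.nan
    return sum(present) / len(present)

# Hypothetical rows, already split on tabs.
rows = [
    ['955.159', '62.8168', ''],
    ['956.159', '', '3.0'],
    ['957.159', '64.8168', '5.0'],
]
print(column_mean(rows, 2))  # 4.0
```

Here the first row simply does not contribute to the average of the last column, rather than being counted as 0.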

By : tux21b

