Splitting gzip file records into columns when a column value is enclosed in double quotes

By : Bhaskar
Source: Stackoverflow.com

I have a gzip file whose columns are separated by commas, but when a column value is enclosed in double quotes, the commas inside it should be kept as they are. I wrote the following code:

    import codecs
    import gzip

    input = gzip.open(file, "rb")
    reader = codecs.getreader("utf-8")
    input_file = reader(input)
    count = 0
    for line in input_file:
        # print 'count='
        # print count
        if len(line) != 0:
            # Naive split: this also breaks on commas inside quoted values
            col = line.split(',')

My data in the file looks like:

4798151,1137351,nam_p0,2762913,nam_r000,"NAM_Rack, Power & Cooling",3
4798151,1135623,nam_s0,2762914,nam_a0,"NAM_Advise, Transform & Manage",3

When I split the data on commas, a comma within double quotes should be ignored and stay inside its column. I am not sure how to add a condition that treats the text enclosed in double quotes as a single value. A quick response would be a great help. Thanks.

Use the csv module, which treats a double-quoted value as a single field even when it contains commas.


>>> import StringIO
>>> import csv
>>> line = '4798151,1137351,nam_p0,2762913,nam_r000,"NAM_Rack, Power & Cooling",3'
>>> handler = StringIO.StringIO(line)
>>> [row for row in csv.reader(handler, delimiter=',')]
[['4798151', '1137351', 'nam_p0', '2762913', 'nam_r000', 'NAM_Rack, Power & Cooling', '3']]
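
The session above uses the Python 2 StringIO module. As a minimal sketch, assuming Python 3 (where the equivalent class is io.StringIO), the difference between a plain split and csv.reader on the same sample line can be checked like this:

```python
import csv
import io

line = '4798151,1137351,nam_p0,2762913,nam_r000,"NAM_Rack, Power & Cooling",3'

# Plain split cuts the quoted value in two at its inner comma.
naive = line.split(",")

# csv.reader honours the double quotes and keeps the value in one field.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive))   # 8
print(len(parsed))  # 7
print(parsed[5])    # NAM_Rack, Power & Cooling
```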

In your case, you can pass the gzip file handle directly to csv.reader:

with gzip.open(file, 'rb') as handler:
    for row in csv.reader(handler, delimiter=","):
        # row processing HERE
        pass
By : klashxx
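
The snippet above relies on Python 2, where csv.reader accepts a file opened in binary mode. The following is a minimal sketch of the same idea for Python 3, assuming the file is UTF-8 encoded as in the question; the function name read_gzip_csv is hypothetical and only used for illustration. Opening the file in text mode ("rt") lets gzip handle the decoding, so codecs.getreader is not needed.

```python
import csv
import gzip


def read_gzip_csv(path):
    """Yield each non-empty row of a gzip-compressed CSV file as a list of strings.

    Hypothetical helper; assumes the file is UTF-8 encoded.
    """
    # newline="" is what the csv module recommends for files it reads.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        for row in csv.reader(handle, delimiter=","):
            if row:  # csv.reader returns [] for a blank line
                yield row
```

Each yielded row is a list such as ['4798151', '1137351', 'nam_p0', ...], with the quoted value kept intact as one element.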

By : Boneist

