Pandas Dataframe Data Type Conversion or Isomap Transformation

Question!

I load images with scipy's misc.imread and reshape each one into a 2304x3 ndarray. I then append each array to a list and convert the list to a DataFrame so that I can apply an Isomap transform to it. The DataFrame has 84 rows (one sample per image in the folder) and 2304 features, where each feature is an array of 3 elements. When I try to apply the Isomap transform, I get this error:

ValueError: setting an array element with a sequence.

I think the error occurs because the elements of my DataFrame are of object type. First I tried applying to_numeric to each column, but that raised an error, so I wrote a loop to convert each element to numeric. The results are still of object type. Here is my code:

import pandas as pd
from scipy import misc
from mpl_toolkits.mplot3d import Axes3D
import matplotlib
import matplotlib.pyplot as plt
import glob
from sklearn import manifold

samples = []
path = 'Datasets/ALOI/32/*.png'
files = glob.glob(path)
for name in files:
    img = misc.imread(name)
    img = img[::2, ::2]
    x = (img/255.0).reshape(-1,3)
    samples.append(x)

df = pd.DataFrame.from_records(samples, coerce_float = True)

for i in range(0,2304):
    for j in range(0,84):
        df[i][j] = pd.to_numeric(df[i][j], errors = 'coerce')
    df[i] = pd.to_numeric(df[i], errors = 'coerce')


print df[2303][83]
print df[2303].dtype
print df[2303][83].dtype

#iso = manifold.Isomap(n_neighbors=6, n_components=3)
#iso.fit(df) 
#manifold = iso.transform(df)
#print manifold.shape

The last four lines are commented out because they raise the error. The output I get is:

[ 0.05098039  0.05098039  0.05098039]
object
float64

As you can see, each element of the DataFrame holds float64 values, but the column as a whole is of object type.

Does anyone know how to convert the whole DataFrame to numeric?

Is there another way to apply Isomap?

By : semenoff


Answers

Do you need to keep each image as a 2304x3 array, with the three channel values of each pixel grouped together?

If not, replace the following line in your code

x = (img/255.0).reshape(-1,3)

with

x = (img/255.0).reshape(-1)
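
To see why the shape matters, here is a minimal sketch in Python 3 that uses random arrays as stand-ins for your images (the 48x48 size and the names images, nested, cell and flat are assumptions for illustration, not taken from your data). With reshape(-1, 3) the stacked samples form a 3-D array, and a cell that holds a 3-element array cannot be cast to a single float, which is the kind of failure your ValueError reports. With reshape(-1) the samples stack into an ordinary 2-D table of floats.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-ins for four downsampled 48x48 RGB images scaled to [0, 1).
images = [rng.random((48, 48, 3)) for _ in range(4)]

# reshape(-1, 3) keeps one RGB triple per pixel, so stacking gives a 3-D array.
nested = np.array([img.reshape(-1, 3) for img in images])
print(nested.shape)  # (4, 2304, 3)

# A cell that stores a whole array cannot become one float.
cell = np.empty(1, dtype=object)
cell[0] = nested[0, 0]  # a 3-element array stored inside a single cell
try:
    cell.astype(float)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# reshape(-1) flattens each image to one row of 48 * 48 * 3 = 6912 floats.
flat = np.vstack([img.reshape(-1) for img in images])
df = pd.DataFrame(flat)
print(df.shape)  # (4, 6912)
print((df.dtypes == np.float64).all())  # True
```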

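With this change, each sample is a flat vector of 6912 floats rather than 2304 three-element arrays. I hope this resolves your issue.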
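
As for applying Isomap, once every sample is a flat vector you can likely skip the DataFrame and pass a 2-D NumPy array directly. The sketch below assumes Python 3 and a recent scikit-learn. It replaces the image files with synthetic images that vary smoothly with a made-up rotation angle, so the names angles, a and b are hypothetical and only the Isomap parameters come from your code.

```python
import numpy as np
from sklearn import manifold

rng = np.random.default_rng(0)

# Assumed sizes: 84 samples of 48x48 RGB, matching the counts in the question.
n_samples, height, width = 84, 48, 48

# Synthetic "rotating object": each image is a smooth function of an angle,
# so neighbouring angles give similar images (a stand-in for the real files).
angles = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
a = rng.random((height, width, 3))
b = rng.random((height, width, 3))

samples = []
for theta in angles:
    img = 0.5 + 0.25 * (np.cos(theta) * a + np.sin(theta) * b)
    samples.append(img.reshape(-1))  # flatten to one row per image

# Stack the rows into a 2-D float array of shape (n_samples, n_features).
X = np.vstack(samples)

iso = manifold.Isomap(n_neighbors=6, n_components=3)
embedding = iso.fit_transform(X)

print(X.shape, X.dtype)   # (84, 6912) float64
print(embedding.shape)    # (84, 3)
```

If you still want a DataFrame, pd.DataFrame(X) should give all float64 columns, which Isomap accepts as well.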


