Can Only Compare Identically-Labeled DataFrame Objects – Python Pandas

The example below reproduces the error. Before pandas 0.19 this check applied only to DataFrames, not Series; from 0.19 onward it applies to both.

In [1]: df1 = pd.DataFrame([[1, 2], [3, 4]])

In [2]: df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

In [3]: df1 == df2
Exception: Can only compare identically-labeled DataFrame objects

One solution would be to sort the index first (Note: some functions require sorted indexes):

In [4]: df2.sort_index(inplace=True)

In [5]: df1 == df2
Out[5]:
      0     1
0  True  True
1  True  True

Note: == is also sensitive to the order of columns, so you may have to use sort_index(axis=1):

In [11]: df1.sort_index().sort_index(axis=1) == df2.sort_index().sort_index(axis=1)
Out[11]:
      0     1
0  True  True
1  True  True
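The two sorting steps can be wrapped in a small helper. This is only a sketch: the name `compare_sorted` is hypothetical (it is not a pandas function), and it assumes both frames carry the same set of row and column labels, just in a different order.

```python
import pandas as pd


def compare_sorted(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Sort rows and columns of both frames by label, then compare element-wise."""
    left_sorted = left.sort_index().sort_index(axis=1)
    right_sorted = right.sort_index().sort_index(axis=1)
    # == still raises ValueError if the label sets themselves differ.
    return left_sorted == right_sorted


df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

result = compare_sorted(df1, df2)
print(result)
```

Sorting copies both frames, so the inputs are left untouched, unlike sort_index(inplace=True).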

You can also try dropping the index if it is not needed for the comparison:

print(df1.reset_index(drop = True) == df2.reset_index(drop = True))

I have used this same technique in a unit test like so:

from pandas.testing import assert_frame_equal

assert_frame_equal(actual.reset_index(drop=True), expected.reset_index(drop=True))
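As a self-contained illustration of that test pattern, the sketch below builds two small frames (`actual` and `expected` are made-up data, not from a real test suite) and shows that the assertion passes only once the index has been reset.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical frames: same values, different row labels.
actual = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[10, 11])
expected = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Passes: both frames get a fresh 0..n-1 index before the check.
assert_frame_equal(actual.reset_index(drop=True), expected.reset_index(drop=True))

# Without the reset, the differing index labels make the assertion fail.
try:
    assert_frame_equal(actual, expected)
    index_mismatch_detected = False
except AssertionError:
    index_mismatch_detected = True

print(index_mismatch_detected)
```

If only the order of the labels differs, assert_frame_equal also accepts check_like=True, which ignores the order of index and columns.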

pandas also provides the equals method for testing equality. You use it like this:

df1.equals(df2)

The following example should also work:

import pandas as pd
import numpy as np

firstProductSet = {
    'Product1': ['Computer', 'Phone', 'Printer', 'Desk'],
    'Price1': [1200, 800, 200, 350]
}
df1 = pd.DataFrame(firstProductSet, columns=['Product1', 'Price1'])

secondProductSet = {
    'Product2': ['Computer', 'Phone', 'Printer', 'Desk'],
    'Price2': [900, 800, 300, 350]
}
df2 = pd.DataFrame(secondProductSet, columns=['Product2', 'Price2'])

# add the Price2 column from df2 to df1
df1['Price2'] = df2['Price2']

# create a new column in df1 to check if prices match
df1['pricesMatch?'] = np.where(df1['Price1'] == df2['Price2'], 'True', 'False')

# create a new column in df1 for the price difference
df1['priceDiff?'] = np.where(df1['Price1'] == df2['Price2'], 0, df1['Price1'] - df2['Price2'])

print(df1)

When no explicit instruction is given as to the alignment: == (aka DataFrame.__eq__)

In [1]: import pandas as pd

In [2]: df1 = pd.DataFrame(index=[0, 1, 2], data={'col1': list('abc')})

In [3]: df2 = pd.DataFrame(index=[2, 0, 1], data={'col1': list('cab')})

In [4]: df1 == df2
---------------------------------------------------------------------------
...
ValueError: Can only compare identically-labeled DataFrame objects

When alignment is explicitly broken: DataFrame.equals, DataFrame.values, DataFrame.reset_index()

In [5]: df1.equals(df2)
Out[5]: False

In [9]: df1.values == df2.values
Out[9]:
array([[False],
       [False],
       [False]])

In [10]: (df1.values == df2.values).all().all()
Out[10]: False

When alignment is explicitly enforced: DataFrame.eq, DataFrame.sort_index()

In [6]: df1.eq(df2)
Out[6]:
   col1
0  True
1  True
2  True

In [8]: df1.eq(df2).all().all()
Out[8]: True
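The three behaviours can be put side by side in one script. This is a minimal sketch using the same df1 and df2; the variable names `same_by_position`, `same_by_label` and `same_after_reindex` are illustrative only, and reindex_like is shown as one further way to line up the labels, not as the approach used in the session above.

```python
import pandas as pd

df1 = pd.DataFrame(index=[0, 1, 2], data={"col1": list("abc")})
df2 = pd.DataFrame(index=[2, 0, 1], data={"col1": list("cab")})

# Positional comparison: labels are ignored, values are compared row by row.
same_by_position = bool((df1.to_numpy() == df2.to_numpy()).all())

# Label-aligned comparison: rows are matched on their index labels first.
same_by_label = bool(df1.eq(df2).all().all())

# Reordering df2 to df1's labels makes the plain == operator usable as well.
same_after_reindex = bool((df1 == df2.reindex_like(df1)).all().all())

print(same_by_position, same_by_label, same_after_reindex)
```

Which one is appropriate depends on whether the row labels carry meaning: if they do, compare by label; if they are incidental, compare by position.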

This is an example of how to deal with this error when the DataFrames have different lengths: I have added rows of zeros to the shorter one. The DataFrames can come from a CSV file or any other source.

import pandas as pd
import numpy as np

# df1 with 9 rows
df1 = pd.DataFrame({
    'Name': ['John', 'Mike', 'Smith', 'Wale', 'Marry', 'Tom', 'Menda', 'Bolt', 'Yuswa'],
    'Age': [23, 45, 12, 34, 27, 44, 28, 39, 40]
})

# df2 with 8 rows
df2 = pd.DataFrame({
    'Name': ['John', 'Mike', 'Wale', 'Marry', 'Tom', 'Menda', 'Bolt', 'Yuswa'],
    'Age': [25, 45, 14, 34, 26, 44, 29, 42]
})

# get lengths of df1 and df2
df1_len = len(df1)
df2_len = len(df2)

diff = df1_len - df2_len

rows_to_be_added1 = rows_to_be_added2 = 0
# rows_to_be_added1 = np.zeros(diff)
if diff < 0:
    rows_to_be_added1 = abs(diff)
else:
    rows_to_be_added2 = diff

# add rows of zeros to df1 (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
if rows_to_be_added1 > 0:
    df1 = pd.concat([df1, pd.DataFrame(np.zeros((rows_to_be_added1, len(df1.columns))), columns=df1.columns)])

# add rows of zeros to df2
if rows_to_be_added2 > 0:
    df2 = pd.concat([df2, pd.DataFrame(np.zeros((rows_to_be_added2, len(df2.columns))), columns=df2.columns)])

# at this point we have two dataframes with the same number of rows, and maybe different indexes
# drop the indexes of both, so we can compare the dataframes and do other operations like update etc.
df2.reset_index(drop=True, inplace=True)
df1.reset_index(drop=True, inplace=True)

# add a new column to df1
df1['New_age'] = None

# compare the Age column of df1 and df2, and update the New_age column of df1
# with the Age column of df2 if they match, else None
df1['New_age'] = np.where(df1['Age'] == df2['Age'], df2['Age'], None)

# drop rows where Name is 0.0
df2 = df2.drop(df2[df2['Name'] == 0.0].index)

# now we don't get the error ValueError: Can only compare identically-labeled Series objects
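An alternative to padding with zeros, not taken from the example above, is to match rows on a key column with merge and compare inside the merged frame. The sketch below uses a shortened version of the data and assumes Name uniquely identifies a row; the suffixes `_1` and `_2` and the `ages_match` column are hypothetical names chosen for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["John", "Mike", "Smith", "Wale"],
    "Age": [23, 45, 12, 34],
})
df2 = pd.DataFrame({
    "Name": ["John", "Mike", "Wale"],
    "Age": [25, 45, 34],
})

# Left merge keeps every row of df1; names missing from df2 get NaN in Age_2.
merged = df1.merge(df2, on="Name", how="left", suffixes=("_1", "_2"))

# Both columns now live in one frame, so they share a single index.
merged["ages_match"] = merged["Age_1"] == merged["Age_2"]
print(merged)
```

Unlike the positional approach, this compares each person with the row of the same name, so a missing row in one frame does not shift every later comparison.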

The ValueError is raised if you try to compare DataFrames that have different indexes. DataFrame objects can only be compared with == when they are identically labeled.

To avoid the error you can either use the DataFrame.equals function or reset the indexes with the reset_index function.

The equals function compares two Series or DataFrames to see whether they have the same shape and elements. It returns a single boolean instead of raising, and it returns False when the labels differ.

In Python, a value is a piece of information stored within a particular object. We encounter a ValueError when a built-in operation or function receives an argument that is the right type but an inappropriate value.

Here, the data that we want to compare is the correct type, DataFrame, but the DataFrames have inappropriate indexes for comparison.
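Because the failure is an ordinary ValueError, it can be caught like any other exception. The sketch below uses two tiny made-up frames (`left` and `right` are illustrative names) with the same values but different index labels.

```python
import pandas as pd

left = pd.DataFrame({"x": [1, 2]}, index=["a", "b"])
right = pd.DataFrame({"x": [1, 2]}, index=["c", "d"])

try:
    left == right
    raised = False
except ValueError as err:
    # The type (DataFrame) is right; the labels are what pandas rejects.
    raised = True
    print(f"ValueError: {err}")
```

Catching the exception is mainly useful for reporting; the comparison itself still needs one of the fixes described in this article.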

import pandas as pd

df1 = pd.DataFrame({
        'Bodyweight (kg)': [76, 84, 93, 106, 120, 56],
        'Bench press (kg)': [135, 150, 170, 140, 180, 155]
    },
    index=['lifter_1', 'lifter_2', 'lifter_3', 'lifter_4', 'lifter_5', 'lifter_6'])

df2 = pd.DataFrame({
        'Bodyweight (kg)': [76, 84, 93, 106, 120, 56],
        'Bench press (kg)': [145, 120, 180, 220, 175, 110]
    },
    index=['lifter_A', 'lifter_B', 'lifter_C', 'lifter_D', 'lifter_E', 'lifter_F'])

print(df1)

print(df2)

Let’s run this part of the program to see the DataFrames:

          Bodyweight (kg)  Bench press (kg)
lifter_1               76               135
lifter_2               84               150
lifter_3               93               170
lifter_4              106               140
lifter_5              120               180
lifter_6               56               155

          Bodyweight (kg)  Bench press (kg)
lifter_A               76               145
lifter_B               84               120
lifter_C               93               180
lifter_D              106               220
lifter_E              120               175
lifter_F               56               110

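Comparing the two DataFrames with the == operator raises the ValueError, because their indexes differ:

print(df1 == df2)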

To solve this error, we can use the DataFrame.equals function. The equals function allows us to compare two Series or DataFrames to see if they have the same shape and elements. Let’s look at the revised code:

print(df1.equals(df2))

False
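Note that equals returns one boolean for the whole frame and also takes the labels into account. The following sketch, with two small made-up frames `a` and `b` and a hypothetical column `w`, shows it returning False for identical values under different labels and True once the labels agree.

```python
import pandas as pd

a = pd.DataFrame({"w": [76, 84]}, index=["lifter_1", "lifter_2"])
b = pd.DataFrame({"w": [76, 84]}, index=["lifter_A", "lifter_B"])

# Same values, different labels: equals() returns False instead of raising.
before = a.equals(b)

# Give a copy of b the labels of a; now the frames are considered equal.
b_relabelled = b.copy()
b_relabelled.index = a.index
after = a.equals(b_relabelled)

print(before, after)
```

So equals is a good fit for a yes/no check, while an element-wise comparison still needs matching labels.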

I am trying to create a Python app with two DataFrames, and I want to compare them using the == operator.

When I ran this code, I got the exception ValueError: Can only compare identically-labeled DataFrame objects.


import pandas as pd

dfa = pd.DataFrame({
        'Bodyweight (kg)': [760, 840, 930, 1060, 1200, 560],
        'Bench press (kg)': [1350, 1500, 1700, 1400, 1080, 1505]
    },
    index=['index_1', 'index_2', 'index_3', 'index_4', 'index_5', 'index_6'])

dfb = pd.DataFrame({
        'Bodyweight (kg)': [756, 840, 903, 1006, 1200, 560],
        'Bench press (kg)': [1405, 1020, 1080, 2200, 1075, 1010]
    },
    index=['index_A', 'index_B', 'index_C', 'index_D', 'index_E', 'index_F'])

print(dfa == dfb)

Traceback (most recent call last):
  File "main.py", line 7, in <module>
    print(dfa == dfb)
  File "/usr/lib/python3.8/site-packages/pandas/core/ops/__init__.py", line 701, in f
    self, other = _align_method_FRAME(self, other, axis, level=None, flex=False)
  File "/usr/lib/python3.8/site-packages/pandas/core/ops/__init__.py", line 510, in _align_method_FRAME
    raise ValueError(
ValueError: Can only compare identically-labeled DataFrame objects

Resetting the index of both DataFrames with reset_index(drop=True) gives them identical labels, so the comparison runs:

import pandas as pd

dfa = pd.DataFrame({
        'Bodyweight (kg)': [760, 840, 930, 1060, 1200, 560],
        'Bench press (kg)': [1350, 1500, 1700, 1400, 1080, 1505]
    },
    index=['index_1', 'index_2', 'index_3', 'index_4', 'index_5', 'index_6'])

dfb = pd.DataFrame({
        'Bodyweight (kg)': [756, 840, 903, 1006, 1200, 560],
        'Bench press (kg)': [1405, 1020, 1080, 2200, 1075, 1010]
    },
    index=['index_A', 'index_B', 'index_C', 'index_D', 'index_E', 'index_F'])

dfa = dfa.reset_index(drop=True)
dfb = dfb.reset_index(drop=True)

print(dfa == dfb)

   Bodyweight (kg)  Bench press (kg)
0            False             False
1             True             False
2            False             False
3            False             False
4             True             False
5             True             False

This ValueError is raised when we compare two DataFrames (or Series) that have different labels or indexes. The data in the two DataFrames may even be the same; if their indexes differ, the comparison fails.

ValueError: Can only compare identically-labeled DataFrame objects

This error occurs when you attempt to compare two pandas DataFrames and either the index labels or the column labels do not perfectly match.


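A related case is wanting to set df1["name"] = df2["name"] where df1["id"] == df2["id"]. A left merge on the key column matches the rows by id without comparing the two frames directly:

psb = pd.merge(dtl, dtlLookUp, how='left', on=['id'])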

df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY']]

df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY']].values

df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=['A'])
df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=['B'])
df1['A'] != df2['B']

ValueError: Can only compare identically-labeled Series objects
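For the Series case, two possible ways around the error are sketched below, using the same df1 and df2 as in the last snippet. The variable names `differs_by_position` and `differs_by_label` are illustrative, and which option is right depends on whether the labels are meaningful in your data.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=["A"])
df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=["B"])

# Option 1: compare by position, ignoring the labels entirely.
differs_by_position = df1["A"].to_numpy() != df2["B"].to_numpy()

# Option 2: align on labels with the flexible method Series.ne;
# a label present in only one Series compares as "not equal".
differs_by_label = df1["A"].ne(df2["B"])

print(differs_by_position)
print(differs_by_label)
```

The positional result has one entry per row, while the label-aligned result has one entry per label found in either Series.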

Abdullah