How to predict the resulting type after indexing a Pandas DataFrame

  • Thread starter: Pro Q

I have a Pandas DataFrame, defined as follows:

Code:
import pandas as pd

# Three rows with a non-default integer index (10, 20, 30)
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
                   'Age': [25, 30, 35],
                   'Location': ['Seattle', 'New York', 'Kona']},
                  index=[10, 20, 30])

However, when I index into this DataFrame, I can't accurately predict what type of object is going to result from the indexing:

Code:
# (1) str
df.iloc[0, df.columns.get_loc('Name')]
# (2) Series
df.iloc[0:1, df.columns.get_loc('Name')]

# (3) Series
df.iloc[0:2, df.columns.get_loc('Name')]
# (4) DataFrame
df.iloc[0:2, df.columns.get_loc('Name'):df.columns.get_loc('Age')]

# (5) Series
df.iloc[0, df.columns.get_loc('Name'):df.columns.get_loc('Location')]
# (6) DataFrame
df.iloc[0:1, df.columns.get_loc('Name'):df.columns.get_loc('Location')]

Note that the two expressions in each pair above return the same data. (E.g. (2) is a Series that contains the single string returned by (1), (4) is a DataFrame that contains the single column returned by (3), etc.)

Why do they output different types of objects? How can I predict what type of object will be output?

Given the data, it looks like the rule is based on how many slices (colons) you have in the index:

  • 0 slices ((1)): scalar value
  • 1 slice ((2), (3), (5)): Series
  • 2 slices ((4), (6)): DataFrame
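
The observed pattern can be checked directly by collecting the type of each of the six expressions. This is only a sketch that reproduces the observation on the frame from the question; the names `name_pos`, `age_pos`, `location_pos`, `results`, and `type_names` are introduced here for illustration and are not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
                   'Age': [25, 30, 35],
                   'Location': ['Seattle', 'New York', 'Kona']},
                  index=[10, 20, 30])

# Integer positions of the columns used in the question
name_pos = df.columns.get_loc('Name')          # 0
age_pos = df.columns.get_loc('Age')            # 1
location_pos = df.columns.get_loc('Location')  # 2

# The six expressions from the question, keyed by their number
results = {
    1: df.iloc[0, name_pos],                    # 0 slices
    2: df.iloc[0:1, name_pos],                  # 1 slice (rows)
    3: df.iloc[0:2, name_pos],                  # 1 slice (rows)
    4: df.iloc[0:2, name_pos:age_pos],          # 2 slices
    5: df.iloc[0, name_pos:location_pos],       # 1 slice (columns)
    6: df.iloc[0:1, name_pos:location_pos],     # 2 slices
}

# Record the type name of each result and print it
type_names = {number: type(value).__name__ for number, value in results.items()}
for number, type_name in type_names.items():
    print(f"({number}) {type_name}")
```

On this frame, each integer position removes one axis from the result and each slice keeps one, which matches the counts listed above.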

However, I'm not sure if this is always true, and even if it is always true, I want to know the underlying mechanism as to why it is like that.
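
One way to probe whether the rule is really about colons is to try list indexers, which contain no colon at all. The sketch below assumes the same frame as above; the variable names are illustrative only. It suggests that a list, like a slice, keeps its axis, so "number of non-scalar indexers" may be a more accurate way to state the rule than "number of slices".

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
                   'Age': [25, 30, 35],
                   'Location': ['Seattle', 'New York', 'Kona']},
                  index=[10, 20, 30])

# Scalar on both axes: both axes are dropped
scalar_both = df.iloc[0, 0]

# List on rows, scalar on columns: the row axis is kept, the column axis is dropped
list_rows = df.iloc[[0], 0]

# List on both axes: both axes are kept, even though only one cell is selected
list_both = df.iloc[[0], [0]]

for label, value in [('iloc[0, 0]', scalar_both),
                     ('iloc[[0], 0]', list_rows),
                     ('iloc[[0], [0]]', list_both)]:
    print(label, '->', type(value).__name__)
```

All three expressions select the same cell, yet the result has zero, one, or two dimensions depending on how many indexers are scalars.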

I've spent a while looking at the indexing documentation, but it doesn't seem to describe this behavior clearly. The documentation for the iloc function also doesn't describe the return types.

I'm also interested in the same question for loc instead of iloc, but, since loc is inclusive, the results aren't quite as bewildering. (That is, you can't get pairs of indexes with different types where the indexes should pull out the exact same data.)
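
For comparison, the same experiment can be repeated with loc, using the row label 10 and the column label 'Name' from the frame above. This is a sketch with illustrative variable names. Note that a one-label slice such as 10:10 still appears to return a Series rather than a scalar, so the dimension of the result seems to follow the kind of indexer and not the amount of data selected.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
                   'Age': [25, 30, 35],
                   'Location': ['Seattle', 'New York', 'Kona']},
                  index=[10, 20, 30])

# Scalar labels on both axes
loc_scalar = df.loc[10, 'Name']

# Label slice on rows (inclusive on both ends), scalar label on columns
loc_one_slice = df.loc[10:10, 'Name']

# Label slices on both axes
loc_two_slices = df.loc[10:10, 'Name':'Name']

for label, value in [("loc[10, 'Name']", loc_scalar),
                     ("loc[10:10, 'Name']", loc_one_slice),
                     ("loc[10:10, 'Name':'Name']", loc_two_slices)]:
    print(label, '->', type(value).__name__)
```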
