OiO.lk Community platform!


How to get the values of a dictionary type from a parquet file using pyarrow?

  • Thread starter: In78 (Guest)

I have a parquet file which I am reading with pyarrow.

Code:
In [83]: pq.read_schema('dummy_file.parquet').field('dummy_column').type
Out[83]: DictionaryType(dictionary<values=string, indices=int32, ordered=0>)

The schema says it is a column of dictionary type, which is similar to a SQL enum or a pandas category type. Now I want to find the values present in the dictionary. How do I do that?

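The pyarrow documentation for DictionaryType (https://arrow.apache.org/docs/pytho...ryType.html#pyarrow.DictionaryType.value_type) says: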

The dictionary values are found in an instance of DictionaryArray.

But how do I get this DictionaryArray?

