Delete a column from TFRecord Dataset (for feature selection)

  • Thread starter: Eagle (Guest)
I am trying to implement a Feature Selection component with the following plan in mind:

The implementation

  • The component takes an InputArtifact[Example] as input
  • Since the data is stored as TFRecords at the URI of the input artifact, I convert it into compatible numpy dictionaries and use sklearn to come up with the list of selected features
  • I delete the unwanted features from the input examples directly and write the result to an OutputArtifact[Example] (which has the same structure but fewer columns)

I am done with the first and second points, but I cannot figure out how to delete the chosen columns directly in the TFRecord dataset itself (which I am getting using tf.data.TFRecordDataset(train_uri, compression_type='GZIP')).
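
For illustration, here is a minimal sketch of one way the third step might be done, assuming every record is a serialized tf.train.Example. Each record is parsed back into a proto, the unwanted keys are deleted from its feature map, and the result is written to a new GZIP-compressed TFRecord file. The helper names drop_features and write_without_features, and the input_paths / output_path arguments, are hypothetical and not part of TFX or TensorFlow.

```python
import tensorflow as tf


def drop_features(serialized, features_to_drop):
    """Return a re-serialized tf.train.Example without the named features.

    Hypothetical helper: assumes `serialized` holds one tf.train.Example.
    """
    example = tf.train.Example.FromString(serialized)
    for name in features_to_drop:
        # Names that are absent from the record are skipped silently.
        if name in example.features.feature:
            del example.features.feature[name]
    return example.SerializeToString()


def write_without_features(input_paths, output_path, features_to_drop):
    """Copy GZIP-compressed TFRecords to output_path, dropping features.

    Hypothetical helper: `input_paths` is a file path or a list of file
    paths (not a directory), read eagerly one record at a time.
    """
    dataset = tf.data.TFRecordDataset(input_paths, compression_type="GZIP")
    options = tf.io.TFRecordOptions(compression_type="GZIP")
    with tf.io.TFRecordWriter(output_path, options=options) as writer:
        for record in dataset:
            # In eager mode each element is a scalar string tensor.
            writer.write(drop_features(record.numpy(), features_to_drop))
```

As far as I know, a tf.data.TFRecordDataset only reads records and cannot be modified in place, so this sketch rewrites the data to a new file instead. If train_uri is a directory, the record files inside it would have to be listed first, and the output artifact's URI would receive the rewritten files.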
 
