OiO.lk Community platform!

Tools that can combine a CSV with its metadata and feed both to an LLM for querying

  • Thread starter: quant (Guest)

I have a CSV file that looks like this:

Code:
import pandas as pd
df = pd.DataFrame({'HRHHID': [1,2,3,4,5], 'HEHOUSUT': [2,3,1,4,2], 'HETELHHD': [1,2,1,1,1]})

I also have a txt file with some "meta data" for this file that looks like this:

Code:
ATTACHMENT 6

CPS RECORD LAYOUT FOR BASIC LABOR FORCE ITEMS

STANDARD PUBLIC USE FILES

A1. HOUSEHOLD INFORMATION

NAME            SIZE        DESCRIPTION                                 LOCATION

HRHHID          15      HOUSEHOLD IDENTIFIER    (Part 1)                            1-15

                    EDITED UNIVERSE:    ALL HHLD's IN SAMPLE

                    Part 1. See Characters 71-75 for Part 2 of the Household Identifier.
                    
HEHOUSUT        2       TYPE OF HOUSING UNIT                                    31 - 32

                    EDITED UNIVERSE:    ALL HHLDs IN SAMPLE

                    VALID ENTRIES

                    0   OTHER UNIT
                    1   HOUSE, APARTMENT, FLAT
                    2   HU IN NONTRANSIENT HOTEL, MOTEL, ETC.
                    3   HU PERMANENT IN TRANSIENT HOTEL, MOTEL
                    4   HU IN ROOMING HOUSE
                    5   MOBILE HOME OR TRAILER W/NO PERM. ROOM ADDED
                    6   MOBILE HOME OR TRAILER W/1 OR MORE PERM. ROOMS ADDED
                    7   HU NOT SPECIFIED ABOVE
                    8   QUARTERS NOT HU IN ROOMING OR BRDING HS
                    9   UNIT NOT PERM. IN TRANSIENT HOTL, MOTL
                    10  UNOCCUPIED TENT SITE OR TRLR SITE
                    11  STUDENT QUARTERS IN COLLEGE DORM
                    12  OTHER UNIT NOT SPECIFIED ABOVE

HETELHHD        2       IS THERE A TELEPHONE IN THIS                                33 - 34
                    HOUSE/APARTMENT?

                    EDITED UNIVERSE:    HRINTSTA = 1

                    VALID ENTRIES

                    1   YES
                    2   NO

I know there are tools that take a table and let you query it, like pandas-ai (https://github.com/Sinaptik-AI/pandas-ai), and tools that can "process and transform" different types of data (pdf, csv, png, etc.), like unstructured (https://aws.amazon.com/marketplace/seller-profile?id=seller-swoqdseq3bxt2). However, I am looking for a tool that can combine the two files mentioned above and leverage an LLM to answer questions like: "How many households have a telephone?"
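
To make the goal concrete, here is a minimal sketch of the "combine" step such a tool would have to perform. It is not the API of pandas-ai, unstructured, or any other library. The helpers `parse_layout` and `build_context` are hypothetical names, the layout text is a shortened copy of the file above, and the sketch assumes every variable line follows the same NAME / SIZE / DESCRIPTION / LOCATION pattern. It uses a plain dict with the same columns as the DataFrame to stay dependency-free, and it leaves out the actual LLM call, which depends on the provider.

```python
import re

# Shortened copy of the record layout from the question (assumption: the full
# file follows the same NAME / SIZE / DESCRIPTION / LOCATION pattern).
LAYOUT = """\
HRHHID          15      HOUSEHOLD IDENTIFIER    (Part 1)                            1-15

                    EDITED UNIVERSE:    ALL HHLD's IN SAMPLE

HEHOUSUT        2       TYPE OF HOUSING UNIT                                    31 - 32

                    VALID ENTRIES

                    1   HOUSE, APARTMENT, FLAT
                    2   HU IN NONTRANSIENT HOTEL, MOTEL, ETC.

HETELHHD        2       IS THERE A TELEPHONE IN THIS                                33 - 34
                    HOUSE/APARTMENT?

                    VALID ENTRIES

                    1   YES
                    2   NO
"""

# A variable line starts in column 0: NAME, SIZE, description, LOCATION range.
VAR_RE = re.compile(r"^([A-Z][A-Z0-9]+)\s+(\d+)\s+(.*?)\s+\d+\s*-\s*\d+\s*$")
# A value line is indented: an integer code followed by its label.
CODE_RE = re.compile(r"^\s+(\d+)\s+(\S.*)$")


def parse_layout(text):
    """Hypothetical helper: map each variable to its description and value labels.

    Indented continuation lines of a description are ignored in this sketch.
    """
    codebook, current, in_codes = {}, None, False
    for line in text.splitlines():
        var = VAR_RE.match(line)
        if var:
            current = var.group(1)
            description = " ".join(var.group(3).split())
            codebook[current] = {"description": description, "codes": {}}
            in_codes = False
        elif current and "VALID ENTRIES" in line:
            in_codes = True
        elif current and in_codes and (code := CODE_RE.match(line)):
            codebook[current]["codes"][int(code.group(1))] = code.group(2).strip()
    return codebook


def build_context(data, codebook):
    """Hypothetical helper: render codebook plus rows as text for an LLM prompt."""
    lines = []
    for name, meta in codebook.items():
        labels = "; ".join(f"{k} = {v}" for k, v in meta["codes"].items())
        lines.append(f"{name}: {meta['description']}" + (f" ({labels})" if labels else ""))
    header = ",".join(data)
    rows = [",".join(str(v) for v in row) for row in zip(*data.values())]
    return "\n".join(lines + ["", header] + rows)


# Same columns as the DataFrame in the question.
data = {"HRHHID": [1, 2, 3, 4, 5],
        "HEHOUSUT": [2, 3, 1, 4, 2],
        "HETELHHD": [1, 2, 1, 1, 1]}
codebook = parse_layout(LAYOUT)

# Decode the telephone column so the sample question can be checked without an LLM.
phone_codes = codebook["HETELHHD"]["codes"]
households_with_phone = sum(1 for v in data["HETELHHD"] if phone_codes.get(v) == "YES")
print(households_with_phone)  # 4 of the 5 sample households are coded YES
print(build_context(data, codebook))
```

The text returned by `build_context` is what would be placed in front of the question "How many households have a telephone?" when prompting a model. Decoding the column directly gives a reference answer to compare the model's reply against.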