OiO.lk Community platform!


How to write a function to read csv files with different separators in pandas in Python?

  • Thread starter: hbstha123 (Guest)
I have a bunch of CSV files for different years named my_file_2019, my_file_2020, my_file_2023 and so on. Some files use a tab separator, while others use a semicolon.

I want to write a common function to extract data from all files.

This was my initial function:

Code:
import pandas as pd

def get_data(year):
    file = f"my_file_{year}.csv"

    # Read the file, assuming a tab separator
    df = pd.read_csv(file, sep="\t")

    # Filter for Germany
    df = df[df["CountryCode"] == "DE"]

    return df

I called the function as shown below to get the data for each year.

Code:
df_2019 = get_data(2019)
df_2020 = get_data(2020)
df_2021 = get_data(2021)
df_2022 = get_data(2022)
df_2023 = get_data(2023)

I got KeyError: 'CountryCode' when the separator was different.
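
A small sketch of why the error shows up at the filter rather than at the read: when the separator does not match, `pd.read_csv` typically does not fail; it returns a frame with a single wide column. The sample text below is hypothetical and only stands in for one of the semicolon-separated files.

```python
import io
import pandas as pd

# Hypothetical sample standing in for a semicolon-separated file.
sample = "CountryCode;Value\nDE;1\nFR;2\n"

# Wrong separator: no exception, but the whole header becomes one column name.
df_wrong = pd.read_csv(io.StringIO(sample), sep="\t")
print(list(df_wrong.columns))  # ['CountryCode;Value']

# Matching separator: the columns are split as expected.
df_right = pd.read_csv(io.StringIO(sample), sep=";")
print(list(df_right.columns))  # ['CountryCode', 'Value']
```

Since there is no column named exactly `CountryCode` in the first frame, `df["CountryCode"]` is what raises the `KeyError`.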

I then tried a try/except block, as shown:

Code:
def get_data(year):
    
    file = f"my_file_{year}.csv"
    
    try:
        df = pd.read_csv(file,
                    sep = "\t")
    
    except KeyError:
        df = pd.read_csv(file,
                    sep = ";")
    
    #filter for germany
    df = df[df["CountryCode"] == "DE"]
    
    return df

With this version, I can still read the files that use a tab separator, but not the ones that use a semicolon.

How can I fix this?
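
One possible fix, sketched under some assumptions: the `KeyError` comes from the `df["CountryCode"]` filter, which sits outside the `try` block, so the `except KeyError` branch never runs. The sketch below instead tries each candidate separator and checks whether the expected column is present. The helper name `read_with_fallback` and its parameters are hypothetical, and it assumes every file has a `CountryCode` header.

```python
import pandas as pd

def read_with_fallback(path, separators=("\t", ";"), required_column="CountryCode"):
    """Try each candidate separator until the required column appears."""
    for sep in separators:
        try:
            df = pd.read_csv(path, sep=sep)
        except pd.errors.ParserError:
            # A wrong separator can occasionally produce ragged rows; try the next one.
            continue
        if required_column in df.columns:
            return df
    raise ValueError(
        f"{required_column!r} not found in {path} using separators {separators}"
    )

def get_data(year):
    file = f"my_file_{year}.csv"
    df = read_with_fallback(file)

    # Filter for Germany
    return df[df["CountryCode"] == "DE"]
```

An alternative that may be worth trying is `pd.read_csv(file, sep=None, engine="python")`, which asks pandas to detect the delimiter itself. The explicit loop above is more verbose but makes the accepted separators and the failure case visible.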
 
