OiO.lk Community platform!

Oio.lk is an excellent forum for developers, providing a wide range of resources, discussions, and support for those in the developer community. Join oio.lk today to connect with like-minded professionals, share insights, and stay updated on the latest trends and technologies in the development field.
  You need to log in or register to access the solved answers to this problem.
  • You have reached the maximum number of guest views allowed
  • Please register below to remove this limitation

Read Excel merged cells strike and non strike data and write to separate files

  • Thread starter Thread starter Roshan Nuvvula
  • Start date Start date
R

Roshan Nuvvula

Guest
I have an Excel file where some of the cells got merged and some cells have warp Text with Strike and Non Strike in it.

Sampledata_Image

I want to write the data that non strike records to one Excel and other Strike records to another Excel along with the strike in the cell.

My output1:

Code:
col1        |   col2          |  col3    | col4    |col5  |
Sampletext1 | Combines TextC2 |Sample3   | text4   |text5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_1 |text5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_2 |text5 |

My output2:

Code:
col1        |   col2          |  col3    | col4    |col5  |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_3 |text5 |

My Expected output2

'text4_3` should be striked out in the output file as well.

I have tried using openpyxl in python

Code:
from openpyxl import load_workbook
from openpyxl import Workbook
import pandas as pd

input_file = 'myexecel.xslx'

Workbook = load_workbook(input_file)
for i in Workbook.worksheets:
    if i.sheet_state == "visible":
         sheetname = str(i).replace('<Worksheet "', '').replace('">', '').strip()
         ws = Workbook[sheetname]

         #unmerging the cells
        for merged_cell in ws.merged_cells:
            min_row, min_col, max_row, max_col = merged_cell.min_row, merged_cell.min_col, merged_cell.max_row, merged_cell.max_col
            data = ws.cell(row=min_row, column=min_col).value
            ws.unmerge_cells(start_row=min_row, start_column=min_col, end_row=max_row, end_column=max_col)

            for row in ws.iter_rows(min_row=min_row, min_col=min_col, max_row=max_row, max_col=max_col):
                for cell in row:
                    cell.value = data


        data_all = [[cell for cell in row] for row in ws.iter_rows(values_only=True)]
        df_raw = pd.DataFrame(data_all[1:], columns=headercols)
        df_raw["strike_flag"] = [any(cell.font.strikethrough for cell in row) for row in ws.iter_rows(min_row=2)]

with the above code I was able to find whether the cell have strike through or not. But not sure how to separate strike through and non strike through records.
<p>I have an Excel file where some of the cells got merged and some cells have warp Text with Strike and Non Strike in it.</p>
<p><a href="https://i.sstatic.net/Ap7Anz8J.png" rel="nofollow noreferrer"><img src="https://i.sstatic.net/Ap7Anz8J.png" alt="Sampledata_Image" /></a></p>
<p>I want to write the data that non strike records to one Excel and other Strike records to another Excel along with the strike in the cell.</p>
<p>My output1:</p>
<pre><code>col1 | col2 | col3 | col4 |col5 |
Sampletext1 | Combines TextC2 |Sample3 | text4 |text5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_1 |text5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_2 |text5 |
</code></pre>
<p>My output2:</p>
<pre><code>col1 | col2 | col3 | col4 |col5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_3 |text5 |
</code></pre>
<p><a href="https://i.sstatic.net/AJLgp2Y8.png" rel="nofollow noreferrer"><img src="https://i.sstatic.net/AJLgp2Y8.png" alt="My Expected output2" /></a></p>
<blockquote>
<p>'text4_3` should be striked out in the output file as well.</p>
</blockquote>
<p>I have tried using <code>openpyxl</code> in python</p>
<pre class="lang-py prettyprint-override"><code>
from openpyxl import load_workbook
from openpyxl import Workbook
import pandas as pd

input_file = 'myexecel.xslx'

Workbook = load_workbook(input_file)
for i in Workbook.worksheets:
if i.sheet_state == "visible":
sheetname = str(i).replace('<Worksheet "', '').replace('">', '').strip()
ws = Workbook[sheetname]

#unmerging the cells
for merged_cell in ws.merged_cells:
min_row, min_col, max_row, max_col = merged_cell.min_row, merged_cell.min_col, merged_cell.max_row, merged_cell.max_col
data = ws.cell(row=min_row, column=min_col).value
ws.unmerge_cells(start_row=min_row, start_column=min_col, end_row=max_row, end_column=max_col)

for row in ws.iter_rows(min_row=min_row, min_col=min_col, max_row=max_row, max_col=max_col):
for cell in row:
cell.value = data


data_all = [[cell for cell in row] for row in ws.iter_rows(values_only=True)]
df_raw = pd.DataFrame(data_all[1:], columns=headercols)
df_raw["strike_flag"] = [any(cell.font.strikethrough for cell in row) for row in ws.iter_rows(min_row=2)]

</code></pre>
<p>with the above code I was able to find whether the cell have strike through or not. But not sure how to separate strike through and non strike through records.</p>
 
Top