OiO.lk Community platform!


Python regex: split a string with multiple delimiters

  • Thread starter: Drewdin (Guest)
I know this question has been answered before, but my use case is slightly different. I am trying to set up a regex pattern that splits a few strings into a list.

Input Strings:

Code:
1. "ABC-QWERT01"
2. "ABC-QWERT01DV"
3. "ABCQWER01"

Criteria of the string:

ABC - QWERT 01 DV
 1  2   3   4  5

  1. The string always starts with three chars
  2. The dash is optional
  3. Then come 3-10 chars
  4. Then a two-digit number, 00-99, left-padded with zeros
  5. The suffix is 2 chars and is optional
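
One way to read the five criteria is as one capture group per part, matched in order, rather than as split delimiters. The sketch below is only an illustration under some assumptions: "chars" is taken to mean uppercase ASCII letters (as in the sample inputs), and the names PART_RE and split_code are made up for this example. Note that the third sample input, "ABCQWER01", has no suffix, so this sketch returns no 'DV' for it.

```python
import re

# Assumption: "chars" means uppercase ASCII letters, as in the sample inputs.
PART_RE = re.compile(
    r"([A-Z]{3})"     # 1. three leading chars
    r"(-)?"           # 2. optional dash
    r"([A-Z]{3,10})"  # 3. three to ten chars
    r"(\d{2})"        # 4. two digits, zero-padded
    r"([A-Z]{2})?"    # 5. optional two-char suffix
)


def split_code(text):
    """Return the parts of text as a list, or None if text does not fit the criteria."""
    match = PART_RE.fullmatch(text)
    if match is None:
        return None
    # Optional groups that did not take part in the match are None; drop them.
    return [part for part in match.groups() if part is not None]


for sample in ("ABC-QWERT01", "ABC-QWERT01DV", "ABCQWER01"):
    print(sample, "->", split_code(sample))
```

Using fullmatch rather than re.split means a string that breaks any of the criteria is rejected (None) instead of being split in some unexpected place.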

Expected Output

Code:
1. ['ABC','-','QWERT','01']
2. ['ABC','-','QWERT','01', 'DV']
3. ['ABC','QWER','01','DV']

I have tried the following patterns in a bunch of different ways, but I am missing something. My thought was to start at the beginning of the string, split after the first three chars or the dash, then split on the occurrence of two digits.

Pattern 1: r"([ -?, \d{2}])+" This works, but it doesn't split off the first three chars if the dash is missing.

Pattern 2: r"([^[a-z]{3}, -?, \d{2}])+" This fails to match, so nothing gets split.

Pattern 3: r"([^[a-z]{3}|-?, \d{2}])+" This fails to match, so nothing gets split.
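
A likely reason these patterns behave oddly is that square brackets define a set of single characters, not a sequence of sub-patterns. Inside [...], " -?" is a range from space to "?", which happens to cover the dash and all digits, while the commas, "{", "2" and "}" are literal set members. The small check below illustrates this for Pattern 1; the helper name in_class is made up for the example.

```python
import re

# The bracket expression from Pattern 1, on its own.
bracket = re.compile(r"[ -?, \d{2}]")


def in_class(ch):
    """True if the single character ch is a member of the bracket expression."""
    return bracket.fullmatch(ch) is not None


print(in_class("-"))  # True: "-" lies inside the range " " to "?"
print(in_class("7"))  # True: digits lie inside the same range
print(in_class("{"))  # True: "{" is a literal member, not a quantifier
print(in_class("A"))  # False: letters are outside the set

# With no dash present, only the digits act as a delimiter, and the
# repeated capture group keeps just the last character it matched.
print(re.split(r"([ -?, \d{2}])+", "ABCQWER01"))  # ['ABCQWER', '1', '']
```

In Patterns 2 and 3 the set closes at the first "]", so the commas, spaces and the final "]" that follow are required literally, which the input strings never contain.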

Any tips or suggestions?