Why is \b NOT matching beginning of word after a space? [duplicate]

  • Thread starter: kaay (Guest)
According to the docs for \b (https://docs.python.org/3/library/re.html): "r'\bat\b' matches 'at', 'at.', '(at)', and 'as at ay'"

... except this is not my experience:

Code:
import re
pat = re.compile(r'\bat\b')
print(pat.match('as at ay'))
print(pat.match('at ay'))
print(pat.match('as at'))
print(pat.match(' at'))
print(pat.match('at'))

returns:

Code:
None
<re.Match object; span=(0, 2), match='at'>
None
None
<re.Match object; span=(0, 2), match='at'>

Tested on Python 3.6.2, 3.6.8, 3.12.0
Windows 10 Enterprise 21H2
Poland, language and regional format: English (US)
Python gives: (('en_US', 'cp1252'), 1033, en_US)
for (locale.getdefaultlocale(), ctypes.windll.kernel32.GetUserDefaultUILanguage(), locale.windows_locale[ctypes.windll.kernel32.GetUserDefaultUILanguage()])

.findall works fine. Everywhere else (e.g. Notepad++ and https://regex101.com/), it works.

To pre-empt: this is NOT the same thing as https://stackoverflow.com/questions/21664290/why-doesnt-b-w-b-match-a-word. I AM using a raw string, and it works fine at the END of a word.
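
A likely explanation, going by the documented behaviour of the re module rather than anything specific to this setup: match() only tries the pattern at the very start of the string, while search() and findall() scan forward. That would account for every None above, since in those inputs 'at' does not begin at index 0, and it would also explain why .findall works. The sketch below reuses the pattern from the question; the variable names (text, anchored, found, all_hits, at_offset) are illustrative only.

```python
import re

pat = re.compile(r'\bat\b')
text = 'as at ay'

# match() is anchored at index 0, where 'as' begins, so the pattern cannot fit.
anchored = pat.match(text)

# search() scans forward and returns the first place the pattern fits.
found = pat.search(text)

# findall() scans the whole string as well.
all_hits = pat.findall(text)

# A compiled pattern's match() takes an optional start position;
# \b still sees the space before index 3, so the boundary holds there.
at_offset = pat.match(text, 3)

print(anchored)
print(found)
print(all_hits)
print(at_offset)
```

If that reading is right, the docs' phrase "matches" describes where the pattern can match inside a string, not what the match() method returns, and swapping pat.match for pat.search should give the expected results.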