OiO.lk Community platform!


How does regex filtering work in Python re while logging sensitive info?

  • Thread starter: John Bosman (Guest)

I am trying to write a Python script that redacts/hides certain data in a string before logging it to the console. Below is my code snippet so far.

Code:
import re
from logging import DEBUG, Logger, basicConfig, getLogger, Filter, LogRecord

SENSITIVE_PATTERNS = [
    (
        "email_address",
        r"([A-Za-z0-9]+[.-_])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Z|a-z]{2,})+",
    ),
]


def create_logger(sensitive_patterns: list = None) -> Logger:
    basicConfig(level=DEBUG)
    logger = getLogger()
    sensitive_data_filter = SensitiveDataFilter(sensitive_patterns)
    logger.addFilter(sensitive_data_filter)
    return logger


class SensitiveDataFilter(Filter):
    def __init__(self, patterns=None):
        super().__init__()
        self.patterns = patterns or []

    def filter(self, record: LogRecord) -> bool:
        for pattern in self.patterns:
            should_redact = re.search(pattern[1], record.msg)

            if should_redact:
                record.msg = re.sub(pattern[1], f"<HIDDEN {pattern[0]}>", record.msg)

        return True


logger = create_logger(
    sensitive_patterns=SENSITIVE_PATTERNS,
)

test1 = "[email protected]"
test2 = "A"*55
test3 = test2.lower()
logger.info(f"this is test1 : {test1}")
logger.info(f"this is a test3 : {test3}")
logger.info(f"this is a test2 : {test2}")

My objective is to hide certain strings in the logs. For example, I want to hide emails whenever they are logged. This part works fine: INFO:root:this is test1 : <HIDDEN email_address>. However, I also want to keep other log messages as they are when there is no redaction match, and this leads to an interesting problem. Whenever I log a long string that is all upper case, the script keeps executing and never ends (I am guessing it enters some kind of infinite loop?). When the same code runs with the string lowercased, it works: INFO:root:this is a test3 : aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
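
A possible explanation for the upper-case/lower-case difference, offered as a sketch rather than a confirmed diagnosis: inside a character class, a hyphen between two characters denotes a range, so `[.-_]` in the pattern covers every code point from `.` (U+002E) to `_` (U+005F). That range includes digits and upper-case letters but not lower-case letters or `-` itself. With an all-upper-case run, each character can then be consumed either by `[A-Za-z0-9]+` or by `[.-_]`, which gives the engine a very large number of ways to split the run before it concludes there is no `@`. The name `literal_class` below is a hypothetical alternative added only for comparison.

```python
import re

# Character class copied verbatim from the pattern in the question.
# The hyphen sits between "." and "_", so it denotes the range U+002E..U+005F.
range_class = re.compile(r"[.-_]")

# Hypothetical alternative for comparison: hyphen placed last, so it is literal.
literal_class = re.compile(r"[._-]")

# Print, for each sample character, whether each class accepts it.
for ch in ["A", "a", "7", "-", "."]:
    print(
        repr(ch),
        "range:", bool(range_class.fullmatch(ch)),
        "literal:", bool(literal_class.fullmatch(ch)),
    )
```

Running this shows which of the sample characters each class accepts; the interesting rows are `"A"` and `"-"`, where the two classes disagree.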

What am I missing here?

I have tried running it in debug mode. The code seems to be stuck in one of the internal calls of the re module, but I am not able to figure out why.
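
Being stuck inside `re` without ever returning is what heavy regex backtracking typically looks like, as opposed to a loop in the script itself. Below is a minimal sketch of one way the pattern might be rewritten, under these assumptions: the hyphen in the separator class is meant literally, the `|` in `[A-Z|a-z]` is unintended, and the capture groups are not needed. `EMAIL_PATTERN`, `redact`, and the address `alice@example.com` are hypothetical names and data introduced for illustration. The pattern is not a full email validator.

```python
import re
import time

# Assumed rewrite of the pattern from the question (not the asker's code):
# - "[._-]" keeps the hyphen literal by placing it last in the class
# - "[A-Za-z]" drops the "|" that "[A-Z|a-z]" also accepted
# - "(?:...)" makes the groups non-capturing, since they are never read
EMAIL_PATTERN = re.compile(
    r"(?:[A-Za-z0-9]+[._-])*[A-Za-z0-9]+@[A-Za-z0-9-]+(?:\.[A-Za-z]{2,})+"
)


def redact(message: str) -> str:
    """Replace email-like substrings; return the message unchanged if none match."""
    # sub() already leaves the string untouched when nothing matches,
    # so a separate search() beforehand is not required.
    return EMAIL_PATTERN.sub("<HIDDEN email_address>", message)


# "alice@example.com" is a made-up address used only for this example.
print(redact("this is test1 : alice@example.com"))

# Time the input that never finished with the original pattern.
start = time.perf_counter()
print(redact("this is a test2 : " + "A" * 55))
print(f"elapsed: {time.perf_counter() - start:.4f}s")
```

With the separator class limited to `.`, `_`, and `-`, a run of letters can only be consumed by `[A-Za-z0-9]+`, so the engine has one way to read it and gives up quickly when no `@` follows. The same function could be called from `SensitiveDataFilter.filter` in place of the `re.search` and `re.sub` pair.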
 
