OiO.lk Community platform!

Filtering file names and sizes from line based text file?

  • Thread starter: Omar Ahmed (Guest)

I have a text file in which each line holds read/write permissions, owner, group, file size, date, hours, minutes, and a file name with its extension. I am trying to use regex to extract the file size, the date, the hours and minutes, and the file name from each line into a tuple containing just those elements. The file contents are:

Code:
-rw-r--r-- 1 jttoivon hyad-all  164519 Dec 28 17:59 basics.ipynb
-rw-r--r-- 1 jttoivon hyad-all  164477 Nov  5 19:21 basics.ipynb.orig
-rw-r--r-- 1 jttoivon hyad-all  115587 Dec 11 11:50 bayes.ipynb
drwxr-xr-x 4 jttoivon hyad-all    4096 Nov 29 13:07 _build
-rw-r--r-- 1 jttoivon hyad-all  198820 Dec 11 11:50 clustering.ipynb
-rw-r--r-- 1 jttoivon hyad-all    6647 Dec 11 12:20 conf.py
-rw-r--r-- 1 jttoivon hyad-all   41828 Nov 28 13:26 example_figure2.png
-rw-r--r-- 1 jttoivon hyad-all  125079 Nov 28 13:26 example_figure2.xcf
-rw-r--r-- 1 jttoivon hyad-all   24139 Nov 28 12:03 example_figure.png
-rwxr-xr-x 1 jttoivon hyad-all     650 Nov 28 12:03 example_figure.py
-rw-r--r-- 1 jttoivon hyad-all   25399 Nov  2 21:25 exception_hierarchy.pdf
-rw-r--r-- 1 jttoivon hyad-all   43632 Nov  2 22:05 exception_hierarchy.png
-rw-r--r-- 1 jttoivon hyad-all   24366 Nov  2 21:26 exception_hierarchy.svg
-rw------- 1 jttoivon hyad-all   72095 Oct  3 17:36 extra.ipynb
-rw------- 1 jttoivon hyad-all 1207075 Nov 28 16:02 face.png

I am trying to find the right pattern to extract the file names, but the names come in many formats: some start with a "." or an "_", and others have two extensions. Because they are inconsistent, I couldn't find a single pattern that extracts them all.
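One way to sidestep the inconsistent name formats is to not describe the file name at all, and instead capture whatever follows the HH:MM time stamp up to the end of the line. The snippet below is only a sketch of that idea on three lines copied from the listing above; the names `sample` and `name_pattern` are made up for illustration, and it assumes the file name is always the last field on the line.

```python
import re

# Three lines taken from the listing above (hypothetical variable name).
sample = """\
-rw-r--r-- 1 jttoivon hyad-all  164477 Nov  5 19:21 basics.ipynb.orig
drwxr-xr-x 4 jttoivon hyad-all    4096 Nov 29 13:07 _build
-rw-r--r-- 1 jttoivon hyad-all    6647 Dec 11 12:20 conf.py
"""

# Skip past the HH:MM time stamp, then capture the rest of the line.
# re.MULTILINE makes "$" match at the end of every line, not only at the
# end of the whole string.
name_pattern = re.compile(r"\d{2}:\d{2}\s+(.+)$", re.MULTILINE)

# With a single capture group, findall returns a list of the captured strings.
names = name_pattern.findall(sample)
print(names)  # ['basics.ipynb.orig', '_build', 'conf.py']
```

Because `(.+)` accepts any characters, leading underscores, leading dots, and double extensions need no special handling.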

Here is my progress so far:

https://regex101.com/r/AoGD12/1

I managed to extract the file sizes, but some file names are missed, which causes mismatches when I build the tuples.

The task requires using only regex, not splitting.

Question: Write a function file_listing that loads the file src/listing.txt. It should return a list of tuples (size, month, day, hour, minute, filename). Use regular expressions to do this (either the match, search, findall, or finditer method).
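A possible shape for the whole function, offered as a sketch rather than the reference solution: one pattern matches an entire line, skips the first four fields, and captures the six requested ones. The helper `parse_listing` and the constant `LINE_RE` are hypothetical names introduced here so the parsing can be tried without the file. Converting size, day, hour, and minute to `int` is an assumption, since the task statement does not specify the element types.

```python
import re

# One line of the listing: four skipped fields, then the six captured ones.
LINE_RE = re.compile(
    r"^\S+\s+\d+\s+\S+\s+\S+\s+"   # permissions, link count, owner, group
    r"(\d+)\s+"                    # size
    r"([A-Z][a-z]{2})\s+"          # month, e.g. Dec
    r"(\d{1,2})\s+"                # day, one or two digits
    r"(\d{2}):(\d{2})\s+"          # hour and minute
    r"(.+)$",                      # file name: everything left on the line
    re.MULTILINE,
)


def parse_listing(text):
    """Return (size, month, day, hour, minute, filename) for each matching line."""
    result = []
    for m in LINE_RE.finditer(text):
        size, month, day, hour, minute, name = m.groups()
        # Assumption: numeric fields are wanted as int, month and name as str.
        result.append((int(size), month, int(day), int(hour), int(minute), name))
    return result


def file_listing(filename="src/listing.txt"):
    """Load the listing file and parse it with the regex above."""
    with open(filename) as f:
        return parse_listing(f.read())
```

For example, `parse_listing("-rw-r--r-- 1 jttoivon hyad-all  164519 Dec 28 17:59 basics.ipynb\n")` gives `[(164519, 'Dec', 28, 17, 59, 'basics.ipynb')]`. Lines that do not fit the pattern are silently skipped, which may or may not be what the exercise expects.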