Can I convert spectrograms generated with librosa back to audio?

  • Thread starter: Ramon Griffo (Guest)
I converted some audio files to spectrograms and saved them to files using the following code:

Code:
import os
from matplotlib import pyplot as plt
import librosa
import librosa.display
import IPython.display as ipd

audio_fpath = "./audios/"
spectrograms_path = "./spectrograms/"
audio_clips = os.listdir(audio_fpath)

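def generate_spectrogram(x, sr, save_name):
    # Magnitude STFT in dB; the phase of X is discarded at this point.
    X = librosa.stft(x)
    Xdb = librosa.amplitude_to_db(abs(X))
    fig = plt.figure(figsize=(20, 20), dpi=1000, frameon=False)
    ax = fig.add_axes([0, 0, 1, 1], frameon=False)
    ax.axis('off')
    librosa.display.specshow(Xdb, sr=sr, cmap='gray', x_axis='time', y_axis='hz', ax=ax)
    # Newer matplotlib releases take the JPEG quality through pil_kwargs.
    fig.savefig(save_name, pil_kwargs={'quality': 100}, bbox_inches=0, pad_inches=0)
    plt.close(fig)  # release the figure so memory does not grow across files
    librosa.cache.clear()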

for i in audio_clips:
    audio_length = librosa.get_duration(filename=audio_fpath + i)
    j = 60
    # Save one spectrogram per 60-second window; j marks the end of the window.
    while j < audio_length:
        x, sr = librosa.load(audio_fpath + i, offset=j-60, duration=60)
        save_name = spectrograms_path + i + str(j) + ".jpg"
        generate_spectrogram(x, sr, save_name)
        j += 60
        if j >= audio_length:
            # Last window: the final 60 seconds, overlapping the previous one.
            j = audio_length
            x, sr = librosa.load(audio_fpath + i, offset=j-60, duration=60)
            save_name = spectrograms_path + i + str(j) + ".jpg"
            generate_spectrogram(x, sr, save_name)

I wanted to keep as much detail and quality from the audio as possible, so that I could turn the spectrograms back into audio without too much loss (the image files are 80MB each).
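To make the "how much loss" part concrete, here is a rough sketch of what an 8-bit grayscale image can retain of a dB spectrogram. It assumes each image spans the 80 dB below its own peak (the default `top_db` of `librosa.amplitude_to_db`) and that the colormap was scaled to that range. It ignores JPEG compression and the resampling to image pixels, both of which would add further error. The names `DB_RANGE`, `db_to_pixels` and `pixels_to_magnitude` are made up for illustration, and the random array only stands in for real STFT magnitudes.

```python
import numpy as np

DB_RANGE = 80.0  # assumption: each image spans 80 dB below its own peak


def db_to_pixels(spec_db):
    """Scale a dB spectrogram to 8-bit gray; the loudest bin becomes 255."""
    peak = spec_db.max()
    scaled = (spec_db - (peak - DB_RANGE)) / DB_RANGE  # 0..1 within the range
    return np.round(np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)


def pixels_to_magnitude(pixels):
    """Map 8-bit gray back to linear magnitude relative to the image peak."""
    db_rel = pixels.astype(np.float64) / 255.0 * DB_RANGE - DB_RANGE  # -80..0 dB
    return 10.0 ** (db_rel / 20.0)


# Hypothetical data standing in for |STFT| values (about 60 dB of dynamic range).
rng = np.random.default_rng(0)
magnitude = rng.uniform(1e-3, 1.0, size=(8, 5))
spec_db = 20.0 * np.log10(magnitude)

pixels = db_to_pixels(spec_db)
# The absolute peak is not stored in the image; it is reused here only to compare.
recovered = pixels_to_magnitude(pixels) * magnitude.max()

error_db = np.abs(20.0 * np.log10(recovered) - spec_db)
print("worst-case error in dB:", error_db.max())
```

Under these assumptions the magnitudes come back to within half a gray level (about 0.16 dB), but only relative to each image's peak, and the phase is not in the image at all.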

Is it possible to turn them back to audio files? How can I do it?

Example spectrograms: https://i.sstatic.net/hmvBJ.png

I tried using librosa.feature.inverse.mel_to_audio, but it didn't work, and I don't think it applies.
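Since the code above saves a linear-frequency STFT magnitude rather than a mel spectrogram, one commonly used route back to audio is to estimate the missing phase with the Griffin-Lim algorithm (librosa ships this as `librosa.griffinlim`). The sketch below is a minimal NumPy-only version of that idea, not librosa's implementation. It starts from an exact magnitude array rather than from pixels read out of a JPEG. `N_FFT`, `HOP`, the helper names and the test tone are all assumptions chosen for illustration.

```python
import numpy as np

N_FFT, HOP = 512, 128  # hypothetical analysis settings for this sketch
WINDOW = np.hanning(N_FFT + 1)[:-1]  # periodic Hann window


def stft(x):
    """Centered STFT: rows are frequency bins, columns are frames."""
    x = np.pad(x, N_FFT // 2, mode="reflect")
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack(
        [x[i * HOP:i * HOP + N_FFT] * WINDOW for i in range(n_frames)], axis=1
    )
    return np.fft.rfft(frames, axis=0)


def istft(spec):
    """Windowed overlap-add inverse of stft()."""
    n_frames = spec.shape[1]
    frames = np.fft.irfft(spec, n=N_FFT, axis=0)
    out = np.zeros(N_FFT + HOP * (n_frames - 1))
    norm = np.zeros_like(out)
    for i in range(n_frames):
        out[i * HOP:i * HOP + N_FFT] += frames[:, i] * WINDOW
        norm[i * HOP:i * HOP + N_FFT] += WINDOW ** 2
    out /= np.maximum(norm, 1e-8)
    return out[N_FFT // 2:-(N_FFT // 2)]  # drop the centering padding


def griffin_lim(magnitude, n_iter=50, seed=0):
    """Estimate a signal whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))  # random start
    for _ in range(n_iter):
        rebuilt = stft(istft(magnitude * phase))
        phase = np.exp(1j * np.angle(rebuilt))  # keep phase, discard magnitude
    return istft(magnitude * phase)


def spectral_error(audio, magnitude):
    """Relative distance between the STFT magnitude of `audio` and the target."""
    return np.linalg.norm(np.abs(stft(audio)) - magnitude) / np.linalg.norm(magnitude)


# Hypothetical test tone: 440 Hz sine, 8192 samples at 8 kHz.
sr = 8000
signal = np.sin(2 * np.pi * 440 * np.arange(8192) / sr)
target = np.abs(stft(signal))  # magnitude only, as stored in a spectrogram

audio = griffin_lim(target)
print("relative spectral error:", spectral_error(audio, target))
```

The result is an estimate rather than the original waveform: the phase is guessed, so quality depends on the iteration count and on how accurately the magnitudes were recovered in the first place.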

I now have 1300 spectrogram files and want to train a Generative Adversarial Network on them so that I can generate new audio, but I don't want to do that if I won't be able to listen to the results later.
 
