
Increasing CPU usage over time in FFmpeg with Widevine DRM stream


Increasing CPU usage over time in FFmpeg with __memset_avx2_unaligned_erms while processing DRM .mpd streams
I’m facing a problem where CPU usage gradually increases when using FFmpeg to process a DRM-protected DASH stream (.mpd with Widevine encryption). Initially, the CPU usage is low, but after around 2 hours, the usage doubles, and it keeps increasing over time.
I used perf to analyze what’s happening, and this is what I see after a few hours:

Overhead  Shared Object     Symbol
  25.57%  ffmpeg            [.] __memset_avx2_unaligned_erms
   9.28%  ffmpeg            [.] aes_encrypt
   8.25%  [kernel]          [k] clear_page_rep
   3.17%  [kernel]          [k] asm_exc_page_fault
   1.68%  [kernel]          [k] __handle_mm_fault
   1.38%  ffmpeg            [.] __memmove_avx_unaligned_erms
   1.14%  ffmpeg            [.] _aesni_ctr32_ghash_6x
   1.01%  ffmpeg            [.] malloc_consolidate

Context
I am using FFmpeg to decrypt a Widevine DRM-protected DASH stream (.mpd), passing the key with -cenc_decryption_key, and to convert the result into HLS segments.
The issue arises after a few hours, and it seems to center around the function __memset_avx2_unaligned_erms, which ends up using about 25% of the CPU after several hours.
The workload is running with FFmpeg built statically with OpenSSL, AVX2 support, and other necessary dependencies.
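Since the profile also shows clear_page_rep and page-fault handling, one thing worth checking is whether the resident memory of the ffmpeg process grows along with the CPU usage. The following is only a minimal sketch, assuming Linux and the /proc/<pid>/status format; parse_vm_rss_kb and read_vm_rss_kb are hypothetical helper names and not part of FFmpeg or perf.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: extracts the VmRSS value (in kB) from text in the
// format of Linux's /proc/<pid>/status. Returns -1 if no VmRSS line is found
// or if its value cannot be parsed.
long parse_vm_rss_kb(const std::string& status_text) {
    const std::string key = "VmRSS:";
    std::istringstream lines(status_text);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.compare(0, key.size(), key) == 0) {
            std::istringstream fields(line.substr(key.size()));
            long kb = 0;
            if (fields >> kb) {
                return kb;
            }
            return -1;
        }
    }
    return -1;
}

// Hypothetical helper: reads /proc/<pid>/status for a running process
// (Linux only). Returns -1 if the file cannot be opened or parsed.
long read_vm_rss_kb(int pid) {
    assert(pid > 0 && "pid must be a positive process id");
    std::ifstream in("/proc/" + std::to_string(pid) + "/status");
    if (!in) {
        return -1;
    }
    std::stringstream buffer;
    buffer << in.rdbuf();
    return parse_vm_rss_kb(buffer.str());
}
```

Calling read_vm_rss_kb with the ffmpeg pid every few minutes and logging the result next to the CPU figures would show whether the two rise together. If they do, that would point toward growing allocations rather than toward the SIMD memset implementation itself.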

Questions:
Why is __memset_avx2_unaligned_erms being called more frequently as time progresses, and why does the CPU usage keep rising?
Could this be related to misaligned memory or an issue with the AVX2 SIMD implementation in FFmpeg or glibc?
What strategies can I use to prevent this CPU usage increase in long-running DRM-protected stream processing?
Should I consider disabling SIMD optimizations in FFmpeg for certain parts of the process? If so, how can I do this without losing performance benefits?
Any guidance or suggestions would be greatly appreciated, as I’m running out of ideas to resolve this issue.

What I’ve tried:
Perf analysis: perf shows __memset_avx2_unaligned_erms accounting for a growing share of CPU samples the longer the process runs.
Memory buffer optimization: I’ve attempted to reuse memory buffers to avoid excessive memory allocation and initialization, but the issue persists.
Reducing memset calls: I’ve reduced the size of memory buffers, trying to limit the impact of memset, but CPU usage still increases after extended processing.
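For reference, the buffer reuse idea described above can be sketched roughly as follows. This is an illustration under assumptions, not the code actually used and not anything from FFmpeg: ScratchBuffer is a hypothetical class that keeps a single allocation alive across packets and grows it only when a larger size is requested, so that steady-state processing avoids a fresh allocation and a full zero-fill for every packet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of FFmpeg: one scratch buffer that is kept
// alive across packets instead of being allocated and zeroed per packet.
class ScratchBuffer {
public:
    // Returns a pointer to at least `size` writable bytes.
    // On reuse the old contents are NOT cleared, so the caller must write
    // every byte it later reads.
    std::uint8_t* acquire(std::size_t size) {
        assert(size > 0 && "requested size must be non-zero");
        if (size > storage_.size()) {
            // Grows only when needed; resize() zero-fills just the new bytes.
            storage_.resize(size);
            ++grow_count_;
        }
        return storage_.data();
    }

    // Current size of the underlying storage in bytes.
    std::size_t capacity() const { return storage_.size(); }

    // Number of times the buffer had to grow; if this keeps climbing during
    // a long run, the requested sizes are increasing over time.
    std::size_t grow_count() const { return grow_count_; }

private:
    std::vector<std::uint8_t> storage_;
    std::size_t grow_count_ = 0;
};
```

The grow_count() counter is included only as a cheap diagnostic: in a long-running process it should level off quickly, and if it does not, the sizes being requested are themselves growing, which reuse alone cannot fix.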


