Can anybody help me out with maximizing GPU utilization when using PyTorch?

Thread starter: 차원규 (Guest)
I'm currently using an RTX 4090 with a 7950X CPU. My goal is to maximize GPU utilization when running a relatively simple model on tabular data. The model has fewer than 1 million parameters, and the data shape is (5,000,000, 120). When I train the model, only 18% of the GPU is utilized, and it takes around 3 hours to complete the training.
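To make the setup concrete, here is a simplified sketch of the kind of training I'm describing; the actual model and data pipeline differ, so every name below is a stand-in:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda")

# Stand-in tabular data with the shape mentioned above: (5,000,000, 120)
# features and a scalar target per row.
X = torch.randn(5_000_000, 120)
y = torch.randn(5_000_000, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

# A small MLP, well under 1M parameters.
model = nn.Sequential(
    nn.Linear(120, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One epoch shown; the real training runs for many epochs.
for xb, yb in loader:
    xb, yb = xb.to(device), yb.to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    optimizer.step()
```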

The main issue is that if I could get GPU utilization up to around 90%, the training time should drop to roughly one-fifth of the current three hours, which would save me a lot of time.

I've tried various solutions, such as adjusting the batch size, increasing the complexity of the model, and changing the DataLoader's num_workers, but none of these have worked well. Regardless of my adjustments, the GPU load remains around 10-15%. This has been frustrating.
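Roughly what the batch-size and num_workers adjustments looked like; pin_memory, persistent_workers, and the non_blocking copies shown here are common companion settings, not necessarily part of my original code:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda")
dataset = TensorDataset(torch.randn(5_000_000, 120), torch.randn(5_000_000, 1))

loader = DataLoader(
    dataset,
    batch_size=4096 * 4,      # tried 64, 4096, 4096*2, 4096*4
    shuffle=True,
    num_workers=8,            # tried 2, 4, 8
    pin_memory=True,          # page-locked host memory speeds up H2D copies
    persistent_workers=True,  # keep worker processes alive between epochs
)

for xb, yb in loader:
    # Overlap the host-to-device copy with GPU compute where possible.
    xb = xb.to(device, non_blocking=True)
    yb = yb.to(device, non_blocking=True)
    # ... forward / backward / optimizer step as usual ...
```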

From this, I've come up with the idea of using multiprocessing. Since a single model uses only 18% of my single GPU, there should be room to run four more models alongside it. I thought that running five different models simultaneously could push GPU utilization to around 100% and save a lot of time. However, things didn't work out well when I tried PyTorch's multiprocessing.
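A simplified sketch of that idea, running five independent trainings on the same GPU with torch.multiprocessing; the model and data here are placeholders rather than my real code:

```python
import torch
import torch.nn as nn
import torch.multiprocessing as mp
from torch.utils.data import DataLoader, TensorDataset


def train_one(rank: int):
    # Every worker process trains its own small model on the same GPU.
    device = torch.device("cuda:0")
    model = nn.Sequential(
        nn.Linear(120, 256), nn.ReLU(), nn.Linear(256, 1)
    ).to(device)
    data = TensorDataset(torch.randn(500_000, 120), torch.randn(500_000, 1))
    loader = DataLoader(data, batch_size=4096, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for xb, yb in loader:
        xb, yb = xb.to(device), yb.to(device)
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(xb), yb)
        loss.backward()
        optimizer.step()
    torch.save(model.state_dict(), f"model_{rank}.pt")


if __name__ == "__main__":
    # CUDA requires the 'spawn' start method, which mp.spawn uses by default.
    mp.spawn(train_one, nprocs=5, join=True)
```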

Can anyone help me with this, or is my idea not feasible with PyTorch?

What I've tried

  1. Increasing the batch size from 64 to 4096, 4096*2, and 4096*4
  2. Using num_workers (2,4,8)
  3. Adding more layers to the model
  4. Using torch.multiprocessing