OiO.lk Community platform!


Issue with training model on Kaggle GPU - only one GPU working

  • Thread starter: AYADI Nouamane
I'm currently trying to train a model on Kaggle using GPU resources, but it seems that only one GPU is being utilized instead of multiple. I'm using the following training code:

Code:
# Step 1: Install the required packages
#!pip install ultralytics xmltodict albumentations torch torchvision torchaudio

# Step 5: Train the YOLO model
import os
import torch
from ultralytics import YOLO

# Set WANDB_MODE to 'dryrun' to disable WandB logging
os.environ['WANDB_MODE'] = 'dryrun'

# Set up device for multiple GPUs
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = YOLO('yolov8x.pt')  # load a pretrained YOLOv8 model

# Check if multiple GPUs are available
if torch.cuda.device_count() > 1:
    print(f"Using {torch.cuda.device_count()} GPUs")
    model = torch.nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count()))).to(device)
else:
    model = model.to(device)

# Define the training configuration
data_yaml = """
train: /../images/train_combined_data
val:   /../images/val
test:  /../images/test
nc: 1
names: ['Hotspot']
"""

with open('data.yaml', 'w') as f:
    f.write(data_yaml)

# Train the model
model.train(
    data='data.yaml',
    epochs=50,  # Total number of training epochs
    batch=16,  
    imgsz=640,  # Target image size for training
    device='cuda'
)

I've checked Kaggle's documentation and it should support multiple GPUs for training. Is there something specific I need to add to my code to enable multi-GPU training, or is there a setting on Kaggle that I might have missed?

Any help or guidance on this issue would be greatly appreciated. Thank you!

How can I use both GPUs?
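One direction worth trying, offered as a sketch rather than a confirmed fix: Ultralytics' `train()` accepts a list of GPU indices in its `device` argument (for example `device=[0, 1]`) and, as far as I understand, sets up distributed training itself in that case. That would make the manual `torch.nn.DataParallel` wrapper unnecessary, and `device='cuda'` likely selects only a single GPU. The helper names `device_argument` and `train_on_all_gpus` below are made up for illustration; the paths and hyperparameters are copied from the question.

```python
def device_argument(gpu_count):
    """Build a value for the `device` argument of model.train().

    Hypothetical helper: returns 'cpu' with no GPU, a single index
    with one GPU, and a list of all indices with several GPUs.
    """
    if gpu_count <= 0:
        return "cpu"
    if gpu_count == 1:
        return 0
    return list(range(gpu_count))


def train_on_all_gpus():
    """Train on every visible GPU (assumes torch and ultralytics are installed)."""
    # Imported here so device_argument() can be used without these packages.
    import torch
    from ultralytics import YOLO

    # Load the plain YOLO object; no DataParallel wrapper and no .to(device).
    model = YOLO("yolov8x.pt")
    model.train(
        data="data.yaml",
        epochs=50,
        batch=16,   # kept from the question; a multiple of the GPU count
        imgsz=640,
        device=device_argument(torch.cuda.device_count()),
    )


# Preview what would be passed for 0, 1 and 2 visible GPUs.
for count in (0, 1, 2):
    print(count, "->", device_argument(count))
```

Calling `train_on_all_gpus()` in a notebook cell would start the run. This assumes the Kaggle session was started with a two-GPU accelerator selected, so that `torch.cuda.device_count()` reports 2.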