Scoring in hyperparameter tuning fails because of input variables with inconsistent numbers of samples

Thread starter: Parlu10 (Guest)

I'm doing hyperparameter tuning using sklearn's GridSearchCV.

Code:
    if cfg.tuning is True:
        print("extractingY...\n")
        y = extract_Y(test_dataloader)
        print("finished extraction\n")
        ht = hyp_tuning(model=model, optimizer=optimizer)
        param_grid = {
            'lr': [1e-2, 1.5e-2]
        }
        search = GridSearchCV(ht, param_grid, cv=5, scoring='precision')
        # x is a "dummy" matrix because I use a dataloader in fit and predict
        x = [0 for _ in range(len(y))]
        print(len(y))
        result = search.fit(x, y)
        print('Best: %f using %s' % (result.best_score_, result.best_params_))
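
For context, ht is a scikit-learn-compatible wrapper around my model: GridSearchCV tunes lr through it, while fit and predict actually read batches from my dataloaders rather than from x. Here is a minimal sketch of that interface (the class name and method bodies are simplified placeholders, not my real code):

Code:
# Simplified placeholder for the wrapper that hyp_tuning returns (not my real
# code): a scikit-learn-compatible estimator. GridSearchCV tunes 'lr' through
# the get_params/set_params it inherits from BaseEstimator; fit and predict
# ignore the dummy x and, in my real code, read batches from dataloaders.
from sklearn.base import BaseEstimator

class HypTuningSketch(BaseEstimator):
    def __init__(self, model=None, optimizer=None, lr=1e-2):
        self.model = model
        self.optimizer = optimizer
        self.lr = lr

    def fit(self, X, y=None):
        # placeholder: the real fit trains self.model with learning rate self.lr
        return self

    def predict(self, X):
        # placeholder: the real predict runs self.model over the test dataloader
        # and returns one label per non-padding target in it
        return [0 for _ in range(len(X))]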

However, scoring always fails with the following error:

Code:
C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\model_selection\_validation.py:982: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\model_selection\_validation.py", line 971, in _score
    scores = scorer(estimator, X_test, y_test, **score_params)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\metrics\_scorer.py", line 279, in __call__
    return self._score(partial(_cached_call, None), estimator, X, y_true, **_kwargs)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\metrics\_scorer.py", line 376, in _score
    return self._sign * self._score_func(y_true, y_pred, **scoring_kwargs)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\utils\_param_validation.py", line 213, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\metrics\_classification.py", line 2190, in precision_score
    p, _, _, _ = precision_recall_fscore_support(
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\utils\_param_validation.py", line 186, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\metrics\_classification.py", line 1775, in precision_recall_fscore_support
    labels = _check_set_wise_labels(y_true, y_pred, average, labels, pos_label)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\metrics\_classification.py", line 1547, in _check_set_wise_labels
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\metrics\_classification.py", line 99, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "C:\Users\Parlu\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\utils\validation.py", line 460, in check_consistent_length
    raise ValueError(
ValueError: Found input variables with inconsistent numbers of samples: [2403, 12014]

I tried printing the length of y, and the terminal says len(y) is 12014, not 2403 as the error reports. I really have no idea what the problem is.
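
The only pattern I can see is that 2403 is exactly the size of one cv=5 test fold of 12014 samples, so it looks like the scorer receives just one fold of y while my predict returns a label for every sample in the dataloader. A rough check of the fold sizes (this assumes a plain 5-fold split; for a classifier GridSearchCV may use a stratified split, but the fold sizes come out about the same):

Code:
# Rough check: splitting 12014 samples into 5 folds gives test folds of ~2403,
# which matches the smaller number in the error message.
from sklearn.model_selection import KFold

n_samples = 12014                # len(y) that I print before search.fit
dummy_x = [0] * n_samples        # same dummy matrix as above

fold_sizes = [len(test) for _, test in KFold(n_splits=5).split(dummy_x)]
print(fold_sizes)                # [2403, 2403, 2403, 2403, 2402]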

I'm also attaching the extract_Y function:

Code:
def extract_Y(dataloader):
    # collect the ground-truth labels from every batch of the dataloader,
    # dropping the -1 entries used as padding
    all_labels = []
    for data, lengths, targets in tqdm(dataloader):
        targets = targets.view(-1)#.cuda()
        mask = (targets != -1)  # keep only real (non-padding) labels
        all_labels.extend(targets[mask].cpu().numpy())
    return all_labels
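
To make the masking explicit, here is a tiny toy example of what the loop above does with the targets (made-up tensor, just for illustration):

Code:
# Toy illustration of the masking in extract_Y: flatten the targets and drop
# the -1 padding labels.
import torch

targets = torch.tensor([[0, 1, -1],
                        [1, -1, -1]])
flat = targets.view(-1)                # tensor([ 0,  1, -1,  1, -1, -1])
print(flat[flat != -1].cpu().numpy())  # [0 1 1]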