Not considering the nearest neighbor when computing X_aug targets.

In the get_label function, the nested loop starts at 1 and ends at num_nbrs inclusive (`range(1, num_nbrs + 1)`).

The loop's only goal is to compute the target value of x_new (`src` in the code), which is based on the target values of its parent points.

As described in #1, the points used in the loop are already not guaranteed to be x_new's parent points.

On top of that, the computation completely omits **the** nearest neighbor. Since x_new is generally not in X_min, the first element of the sorted list of distances between x_new and X_min is very unlikely to be 0. Not only is that distance nonzero, but the corresponding point is, in most cases, a parent point. As the **nearest** parent point, it should contribute heavily to x_new's target value.

To clarify why x_new is very unlikely to be in X_min: with k=3, it would require `random.uniform(0, alpha)` to return 0 or 1 three times in a row.
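A minimal sketch of the difference, using made-up data rather than the repository's own: when `src` is a row of X_min, the first sorted distance is 0 (the point itself), but when `src` is a synthetic point the first sorted distance is nonzero and belongs to a real neighbor. The helper name `sorted_distances` and the midpoint interpolation are assumptions made for illustration, not the project's generation code.

```python
import numpy as np

rng = np.random.default_rng(0)
X_min = rng.normal(size=(20, 2))  # hypothetical minority samples


def sorted_distances(src, X):
    """Return neighbor indices sorted by distance to src, and those distances."""
    distances = np.linalg.norm(X - src, axis=1)
    order = np.argsort(distances)
    return order, distances[order]


# Case 1: src is a row of X_min, so index 0 is src itself at distance 0.
order_in, dist_in = sorted_distances(X_min[5], X_min)

# Case 2: src is a synthetic point (here, an assumed midpoint of two rows),
# so index 0 is a genuine neighbor at a nonzero distance.
x_new = 0.5 * (X_min[5] + X_min[7])
order_out, dist_out = sorted_distances(x_new, X_min)

print("src in X_min     -> nearest index", order_in[0], "distance", dist_in[0])
print("src not in X_min -> nearest index", order_out[0], "distance", dist_out[0])
```

In the second case, skipping index 0 discards a legitimate neighbor instead of the point itself.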

```
            src = X_aug[i]
            distances = np.linalg.norm(X_min - src, axis=1)
            dist_indices_sorted = np.argsort(distances)
            # Recompute the distances to the k nearest neighbors of the original point
            numerator = 0
            denom = 0
            # For all k nearest neighbors.
            for nbr_indx in range(1, num_nbrs + 1):  # starts at 1, so index 0 (the nearest neighbor) is skipped
                # Look up the label of the destination point and its distance
                y_nbr = y_min[dist_indices_sorted[nbr_indx]]
                dist_nbr = distances[dist_indices_sorted[nbr_indx]]
                # band-aid code
                if dist_nbr == 0:
                    dist_nbr = alpha # What? If the points coincide, why take the other values into account at all?
                # Weighted-label code.

                numerator += (1 / dist_nbr) * y_nbr
                denom += (1 / dist_nbr)
```

I'm sure this is a simple oversight: a similar loop is implemented in oversample(), where the first point does need to be discarded.
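One possible shape for the fix, as a sketch only: start at index 0 so the nearest neighbor is included in the inverse-distance weighting. The function name `weighted_label`, the `eps` parameter, and the handling of an exact match (returning the mean label of the coinciding points instead of substituting `alpha`) are my assumptions, not the repository's code.

```python
import numpy as np


def weighted_label(src, X_min, y_min, num_nbrs, eps=1e-12):
    """Inverse-distance-weighted label of src from its num_nbrs nearest rows of X_min."""
    distances = np.linalg.norm(X_min - src, axis=1)
    # Take the first num_nbrs indices, i.e. start at 0 rather than 1.
    nearest = np.argsort(distances)[:num_nbrs]
    d = distances[nearest]
    labels = y_min[nearest]

    # Assumed policy: if src coincides with one or more points, use their labels directly.
    exact = d <= eps
    if np.any(exact):
        return float(np.mean(labels[exact]))

    weights = 1.0 / d
    return float(np.sum(weights * labels) / np.sum(weights))


X_min = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
y_min = np.array([0.0, 1.0, 5.0])

label = weighted_label(np.array([0.25, 0.0]), X_min, y_min, num_nbrs=2)
print(label)  # 0.25: the point at distance 0.25 dominates the one at 0.75
```

With the original `range(1, num_nbrs + 1)` indexing, the same query would instead be weighted over the points at distances 0.75 and 9.75, ignoring the closest one entirely.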
