Not considering the nearest neighbor when computing X_aug targets. #2

@TheLapino

In the get_label function, the nested loop starts at index 1 and ends at num_nbrs, inclusive.

The loop's only goal is to compute the target value of x_new (src in the code), which is based on the target values of its parent points.

As described in #1, the points used in the loop already have no guarantee of being x_new's parent points.

On top of that, the computation completely omits THE nearest neighbor. Since x_new is generally not in X_min, the sorted list of distances between x_new and X_min will almost never have 0 as its first element. So the point at index 0 is not x_new itself: in most cases it is one of its parent points, and being the NEAREST one, it should contribute the most to x_new's target value.

When I say that x_new is almost never in X_min: with k=3, it would require random.uniform(0, alpha) to return exactly 0 or 1 three times in a row.
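To illustrate the point, here is a small sketch with made-up data. X_min, x_new and the helper sorted_distances are hypothetical and not taken from the repository; the sketch also assumes that in oversample() the source point is itself a row of X_min. It only shows what sits at index 0 of the sorted distances in each situation.

```python
import numpy as np

# Hypothetical minority samples (illustrative only, not from the repository).
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])


def sorted_distances(src, X):
    """Return (neighbor indices sorted by distance, the sorted distances)."""
    distances = np.linalg.norm(X - src, axis=1)
    order = np.argsort(distances)
    return order, distances[order]


# Case 1: src is a row of X_min (assumed oversample() situation).
# Index 0 is src itself, at distance 0, so skipping it makes sense.
order_in, dist_in = sorted_distances(X_min[0], X_min)

# Case 2: src is a synthetic point that is not in X_min (get_label situation).
# Index 0 is a genuine neighbor at a nonzero distance.
x_new = np.array([0.25, 0.0])
order_out, dist_out = sorted_distances(x_new, X_min)

print(dist_in[0])   # 0.0
print(dist_out[0])  # 0.25
```

In the second case, starting the loop at 1 throws away the sample at distance 0.25, which is the closest one to x_new.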

Relevant excerpt from get_label:

    src = X_aug[i]
    distances = np.linalg.norm(X_min - src, axis=1)
    dist_indices_sorted = np.argsort(distances)
    # Recompute the distances to the k nearest neighbors of the original point.
    numerator = 0
    denom = 0
    # For all k nearest neighbors.
    for nbr_indx in range(1, num_nbrs + 1):
        # Look up the label of pt_dst and its distance.
        y_nbr = y_min[dist_indices_sorted[nbr_indx]]
        dist_nbr = distances[dist_indices_sorted[nbr_indx]]
        # band-aid code
        if dist_nbr == 0:
            dist_nbr = alpha  # What? If you're right on top of it, why take the other values into account... ?
        # Weighted-label code.

        numerator += (1 / dist_nbr) * y_nbr
        denom += (1 / dist_nbr)

I'm sure this is a simple oversight: a similar loop is implemented in oversample(), and there the first point does need to be discarded.
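A possible fix, as a minimal sketch rather than a drop-in patch: start the neighbor window at index 0 when the source point is synthetic. The function weighted_label, its skip_first flag and the sample data are hypothetical names introduced for illustration. Returning the label of a coinciding sample when a distance is exactly 0 is also an assumption on my part, replacing the alpha fallback in the current code.

```python
import numpy as np


def weighted_label(src, X_min, y_min, num_nbrs, skip_first=False):
    """Inverse-distance-weighted label of src from its num_nbrs nearest samples.

    skip_first=True reproduces the current behavior (loop from 1 to num_nbrs).
    skip_first=False includes the nearest neighbor (loop from 0 to num_nbrs - 1).
    """
    distances = np.linalg.norm(X_min - src, axis=1)
    order = np.argsort(distances)
    start = 1 if skip_first else 0
    nbrs = order[start:start + num_nbrs]
    d = distances[nbrs]

    # Assumption: if src coincides with one or more samples, use their labels
    # directly instead of substituting alpha for the zero distance.
    exact = d == 0
    if exact.any():
        return float(np.mean(y_min[nbrs][exact]))

    weights = 1.0 / d
    return float(np.sum(weights * y_min[nbrs]) / np.sum(weights))


# Hypothetical data: x_new lies very close to the sample labeled 10.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
y_min = np.array([10.0, 20.0, 30.0, 40.0])
x_new = np.array([0.25, 0.0])

fixed = weighted_label(x_new, X_min, y_min, num_nbrs=2)
skipped = weighted_label(x_new, X_min, y_min, num_nbrs=2, skip_first=True)

print(fixed)    # 12.5
print(skipped)  # between 20 and 30: the nearest label (10) is ignored
```

With the nearest neighbor included, the label stays close to 10, the label of the sample x_new almost sits on. With the current indexing, that sample has no influence at all.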
