c++ - Find similar distances between all values in vector and subset them -
given vector double values. want know distances between elements of vector have similar distance each other. in best case, result vector of subsets of original values subsets should have @ least n members.
//given vector<double> values = {1,2,3,4,8,10,12}; //with simple values example //some algorithm //desired result as: vector<vector<double> > subset; //in case of above example expect result like: //subset[0] = {1,2,3,4}; //distance 1 //subset[1] = {8,10,12}; //distance 2 //subset[2] = {4,8,12}; // distance 4 //subset[3] = {2,4}; //also distance 2 not connected subset[1] //subset[4] = {1,3}; //also distance 2 not connected subset[1] or subset[3] //many others if n 2. if n 3 (normally minimum) these small subsets should excluded.
this example simplified distances of integer numbers iterated , tested vector not case double or float.
my idea far
i thought of calculating distances , storing them in vector. creating difference distance matrix , thresholding matrix tolerance similar distances.
//calculate distances: result vector vector<double> distances; (int = 0; < values.size(); i++) (int j = 0; j < values.size(); j++) { if (i >= j) continue; distances.push_back(abs(values[i] - values[j])); } //calculate difference of these distances: result matrix mat diffdistances = mat::zero(size(distances.size(), distances.size()), cv_32fc1); (int = 0; < distances.size(); i++) (int j = 0; j < distances.size(); j++) { if (i >= j) continue; diffdistances.at<float>(i,j) = abs(distances[i], distances[j]); } //threshold matrix tolerance in difference distances threshold(diffdistances, diffdistances, maxdisttol, 255, cv_thresh_binary_inv); //get points similar distances vector<points> diffdistancepoints; findnonzero(diffdistances, diffdistancepoints);
at point stuck finding original values corresponding similar distances. should possible find them, seems complicated trace indices , wonder if there isn't easier way solve problem.
here solution works, long there no branches meaning, there no values closer 2*threshold
. valid neighbor region because neighboring bonds should differ less threshold, if understood @phann correctly.
the solution definitively neither fastest nor nicest possible solution. might use starting point:
#include <iostream> #include <vector> #include <algorithm> int main(){ std::vector< double > values = {1,2,3,4,8,10,12}; const unsigned int nvalues = values.size(); std::vector< std::vector< double > > distancematrix(nvalues - 1); // distancematrix has triangular shape // first vector contains distances value 0 // second row distances value 1 larger values // nth row distances value n-1 except covered std::vector< std::vector< double > > similardistancesubsets; double threshold = 0.05; std::sort(values.begin(), values.end()); (unsigned int = 0; < nvalues-1; ++i) { distancematrix.at(i).resize(nvalues-i-1); (unsigned j = i+1; j < nvalues; ++j){ distancematrix.at(i).at(j-i-1) = values.at(j) - values.at(i); } } (unsigned int = 0; < nvalues-1; ++i) { (unsigned int j = i+1; j < nvalues; ++j) { std::vector< double > thissubset; double thisdist = distancematrix.at(i).at(j-i-1); // distance belongs cluster if (thisdist < 0) continue; double mindist = thisdist - threshold; double maxdist = thisdist + threshold; thissubset.push_back(values.at(i)); thissubset.push_back(values.at(j)); //indicate clustered distancematrix.at(i).at(j-i-1) = -1; unsigned int lastindex = j; (unsigned int k = j+1; k < nvalues; ++k) { thisdist = distancematrix.at(lastindex).at(k-lastindex-1); // distance belongs cluster if (thisdist < 0) continue; // check if found new valid pair if ((thisdist > mindist) && (thisdist < maxdist)){ // update valid distance interval mindist = thisdist - threshold; mindist = thisdist - threshold; // add newly found point thissubset.push_back(values.at(k)); // indicate clustered distancematrix.at(lastindex).at(k-lastindex-1) = -1; // continue search here lastindex = k; } } if (thissubset.size() > 2) { similardistancesubsets.push_back(thissubset); } } } (unsigned int = 0; < similardistancesubsets.size(); ++i) { (unsigned int j = 0; j < similardistancesubsets.at(i).size(); ++j) { std::cout << similardistancesubsets.at(i).at(j); if (j != similardistancesubsets.at(i).size()-1) { std::cout << " "; } else { std::cout << std::endl; } } } }
the idea precompute distances , every pair of particles, starting smallest , larger neighbors, if there valid pair above it. if these collected in subset , added subset vector. every new value valid neighbor region has updated ensure neighboring distances differ less threshold. afterwards, program continues next smallest value , larger neighbors , on.
Comments
Post a Comment