How to Count the Distinct Pairs using HashMap?

时间：2020-09-20 13:49:13
分类：网络文摘
阅读：124 次

Given an array of integers and a target value, find the number of distinct pairs that sum to the target. The pairs (x,y) and (x’, y’) where x smaller or equal to y and x’ smaller or equal to y’ are considered distinct if (x,y) <= (x’, y’). As an illustration, sort the two arrays sorted(3, 2) = (2, 3). The two arrays are not distinct. The two arrays (2, 3) and (2, 4) are distinct.

For example, arr = [5, 7, 9, 13, 11, 6, 6, 3, 3], and a target value of k = 12, the 4 pairs (5, 7), (6, 6), (9, 3), (9, 3) – two instances of 3, all sum to 12 and there are only 3 distinct pairs, (5, 7), (3, 9) and (6, 6).

Sample Input:
arr = [1, 3, 46, 1, 3, 9], k = 47. There are 4 pairs where a[i] + a[j] = 4.
(1, 46), (46, 1), (46, 1) AND (1, 46) AND there is only 1 distinct pairs.

Count Distinct Pairs in C++ using unordered_map or multiset

We can sort the original numbers, and easily skip the duplicates for the first number (assuming we are always counting with a less number). However, this takes O(N.LogN) time.

We can then use a hashmap e.g. unordered_map in C++ or the std::multiset to record the number of frequencies for each number. Then for special cases such as K=A+A, we need to make sure the appearance of A is at least twice.

We can also push all the pairs into a set (unorderd_set), and simply return the number of the pairs in the set. This improves the complexity to O(N) at the cost of using additional O(N) space.

// O(nlogN) time and O(N) space
int countPairs(vector<int> arr, long k) {
    // O(nlogN) sorting
    // if we are not sorting, we may store the results using set
    // which will help us remove the duplicates
    sort(begin(arr), end(arr)); 
    // number frequencies
    unordered_map<int, int> data;
    for (const auto &n: arr) {
        data[n] ++;
    }
    int r = 0;
    for (int i = 0; i < arr.size(); i ++) {
        if ((i == 0) || (arr[i] != arr[i - 1])) { // skip duplicates
            if (data.find(k - arr[i]) != data.end()) { // find it in hash table
                if (k - arr[i] >= arr[i]) { // make sure the smaller comes first
                    if (k == arr[i] * 2) { // special case
                        if (data[arr[i]] < 2) { 
                            // duplicate number at least should appear twice
                            continue; 
                        }
                    }
                    r ++;
                }
            }
        }
    }
    return r;
}

// O(nlogN) time and O(N) space
int countPairs(vector<int> arr, long k) {
    // O(nlogN) sorting
    // if we are not sorting, we may store the results using set
    // which will help us remove the duplicates
    sort(begin(arr), end(arr)); 
    // number frequencies
    unordered_map<int, int> data;
    for (const auto &n: arr) {
        data[n] ++;
    }
    int r = 0;
    for (int i = 0; i < arr.size(); i ++) {
        if ((i == 0) || (arr[i] != arr[i - 1])) { // skip duplicates
            if (data.find(k - arr[i]) != data.end()) { // find it in hash table
                if (k - arr[i] >= arr[i]) { // make sure the smaller comes first
                    if (k == arr[i] * 2) { // special case
                        if (data[arr[i]] < 2) { 
                            // duplicate number at least should appear twice
                            continue; 
                        }
                    }
                    r ++;
                }
            }
        }
    }
    return r;
}

As a coding style, the above may have deep code indention. You may want to rewrite it with early exits – e.g. if something then continue/break.

–EOF (The Ultimate Computing & Technology Blog) —

评论列表

添加评论: 姓名：　验证码：