How to Remove Duplicates From Vector in C++
- Remove Duplicates From a Vector in C++ Using std::sort and std::unique
- Remove Duplicates From a Vector in C++ Using std::sort and std::unique With resize
- Remove Duplicates From a Vector in C++ Using std::set
- Remove Duplicates From a Vector in C++ Using std::unordered_set for Improved Performance
- Remove Duplicates From a Vector in C++ Using a Loop
- Conclusion
Efficiently managing data is a fundamental aspect of software development, and one common challenge is handling duplicate elements within a collection like a vector. In C++, where versatility and performance are crucial, knowing effective methods to remove duplicates from a vector is essential.
This article explores several techniques for deduplicating a vector in C++, from the standard library algorithms std::sort and std::unique to the containers std::set and std::unordered_set. It also covers a loop-based approach.
Remove Duplicates From a Vector in C++ Using std::sort and std::unique
Duplicate elements can be removed from a vector efficiently by combining std::sort and std::unique, two algorithms provided by the C++ Standard Template Library (STL).
The std::sort function sorts the elements in a specified range. In the context of removing duplicates, sorting is essential because it brings identical elements next to each other, and std::unique only detects duplicates that are adjacent.
#include <algorithm>
#include <vector>
std::sort(myVector.begin(), myVector.end());
The std::unique function, in turn, collapses each run of consecutive equal elements into a single element. It shifts the unique elements toward the beginning of the range and returns an iterator pointing to the end of the new unique range. It does not change the size of the vector; the elements from the returned iterator onward are left with unspecified values.
auto last = std::unique(myVector.begin(), myVector.end());
With the unique elements gathered at the front of the vector, the leftover tail can be erased using the erase member function:
myVector.erase(last, myVector.end());
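To see why the sort step matters, the following sketch applies std::unique and erase with and without sorting first. The helper names eraseAdjacentDuplicates and sortAndEraseDuplicates are hypothetical and chosen only for illustration; they are not part of the standard library.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: applies std::unique + erase WITHOUT sorting first,
// so only runs of equal neighbors are collapsed.
std::vector<int> eraseAdjacentDuplicates(std::vector<int> values) {
  auto last = std::unique(values.begin(), values.end());
  values.erase(last, values.end());
  return values;
}

// Hypothetical helper: same as above, but with the sort step in front,
// which makes every duplicate adjacent to its twin.
std::vector<int> sortAndEraseDuplicates(std::vector<int> values) {
  std::sort(values.begin(), values.end());
  return eraseAdjacentDuplicates(std::move(values));
}
```

For the input {1, 2, 1, 1, 2}, the unsorted variant yields {1, 2, 1, 2}, because the two remaining 1s and 2s are never neighbors, while the sorted variant yields {1, 2}.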
Now, let’s put these concepts into practice with a complete working example:
#include <algorithm>
#include <iostream>
#include <vector>
int main() {
  std::vector<int> myVector = {10, 23, 10, 324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  // Step 1: bring equal elements next to each other.
  std::sort(myVector.begin(), myVector.end());
  // Step 2: move the unique elements to the front.
  auto last = std::unique(myVector.begin(), myVector.end());
  // Step 3: drop the leftover tail.
  myVector.erase(last, myVector.end());

  std::cout << "Unique elements: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }
  return 0;
}
In the provided C++ code example, we begin by including the necessary headers for the Standard Template Library (STL) components we use:
#include <algorithm>
#include <iostream>
#include <vector>
In this example, our vector myVector is initialized with a set of integers, some of which are duplicates. The initial vector looks like this:
std::vector<int> myVector = {10, 23, 10, 324, 10, 10, 424,
649, 110, 110, 129, 40, 424};
Now, the first critical step is to sort the vector using std::sort. Sorting is necessary because std::unique only removes duplicates that are adjacent to each other.
The line of code below achieves this:
std::sort(myVector.begin(), myVector.end());
After sorting, our vector is transformed into a sorted sequence of elements:
10; 10; 10; 23; 40; 110; 110; 129; 324; 424; 424; 649;
Next, we use the std::unique function to collapse consecutive duplicate elements within the sorted range. It returns an iterator pointing to the end of the newly formed unique range:
auto last = std::unique(myVector.begin(), myVector.end());
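After std::unique, the range from myVector.begin() up to last holds the unique elements shown below. The vector itself still has its original size, and the elements from last onward have unspecified values: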
10; 23; 40; 110; 129; 324; 424; 649;
Finally, to drop that leftover tail, we use the erase member function. It removes the elements from the last iterator up to the end of the vector:
myVector.erase(last, myVector.end());
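The effect of each step on the vector's size can be checked with a small sketch like the one below. The DedupSizes struct and the traceSizes function are hypothetical names introduced here for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical struct: sizes observed at each step of the idiom.
struct DedupSizes {
  std::size_t afterUnique;  // vector size right after std::unique
  std::size_t uniqueCount;  // number of elements in [begin, last)
  std::size_t afterErase;   // vector size after erase
};

DedupSizes traceSizes(std::vector<int> values) {
  std::sort(values.begin(), values.end());
  auto last = std::unique(values.begin(), values.end());

  DedupSizes sizes{};
  // std::unique only rearranges elements; it never shrinks the container.
  sizes.afterUnique = values.size();
  sizes.uniqueCount =
      static_cast<std::size_t>(std::distance(values.begin(), last));

  // erase is the step that actually changes the size.
  values.erase(last, values.end());
  sizes.afterErase = values.size();
  return sizes;
}
```

For the 13-element vector used in this article, afterUnique is still 13, while uniqueCount and afterErase are both 8.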
Now, our vector is updated, containing unique elements only. To visualize the result, we loop through the modified vector and output the unique elements.
Code Output:
Unique elements: 10; 23; 40; 110; 129; 324; 424; 649;
This output reflects the vector after successfully removing duplicates using the std::sort and std::unique combination.
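The same three steps can be wrapped in a reusable function template so that they work for any element type that supports operator< and operator==. This is a minimal sketch; the name sortUnique is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: sorts the vector and removes duplicates in place.
// T must support operator< (for std::sort) and operator== (for std::unique).
template <typename T>
void sortUnique(std::vector<T>& values) {
  std::sort(values.begin(), values.end());
  values.erase(std::unique(values.begin(), values.end()), values.end());
}
```

Calling sortUnique(myVector) on the vector from the example above leaves it holding 10, 23, 40, 110, 129, 324, 424 and 649. Note that the original element order is lost, since the vector ends up sorted.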
Remove Duplicates From a Vector in C++ Using std::sort and std::unique With resize
When tasked with removing duplicate elements from a vector, we have seen that the combination of std::sort and std::unique is a convenient choice. However, an alternative approach is to use the resize function instead of erase to modify the vector’s size directly.
The initial steps remain the same: the vector is sorted using std::sort to bring duplicate elements together, and std::unique is applied to shift the unique elements to the front:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
std::sort(myVector.begin(), myVector.end());
auto last = std::unique(myVector.begin(), myVector.end());
Instead of erasing the duplicates with erase, we use the resize function, which directly modifies the size of the vector. The argument to resize is the distance between the beginning of the vector and the iterator returned by std::unique:
myVector.resize(std::distance(myVector.begin(), last));
Let’s explore this technique with a complete working example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
int main() {
  std::vector<int> myVector = {10, 23, 10, 324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  std::sort(myVector.begin(), myVector.end());
  auto last = std::unique(myVector.begin(), myVector.end());
  // Shrink the vector to the number of unique elements.
  myVector.resize(std::distance(myVector.begin(), last));

  std::cout << "Unique elements using resize: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }
  return 0;
}
The code begins by including the necessary headers and initializing a vector with duplicate elements. After sorting and applying std::unique, instead of erasing duplicates, we employ the resize function to adjust the size of the vector directly.
This results in a vector containing only the unique elements.
myVector.resize(std::distance(myVector.begin(), last));
The loop at the end iterates through the modified vector to display the unique elements.
Code Output:
Unique elements using resize: 10; 23; 40; 110; 129; 324; 424; 649;
This output reflects the vector after successfully removing duplicates using std::sort and std::unique, with the resize function serving as an alternative to the erase operation.
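The resize variant can be packaged as a function in the same way. The sketch below uses a hypothetical helper name, dedupeWithResize, and adds an explicit cast because std::distance returns a signed type while resize expects an unsigned size.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: sort + unique, then shrink with resize instead of erase.
std::vector<int> dedupeWithResize(std::vector<int> values) {
  std::sort(values.begin(), values.end());
  auto last = std::unique(values.begin(), values.end());
  // std::distance yields a signed difference_type; resize takes size_type.
  values.resize(static_cast<std::vector<int>::size_type>(
      std::distance(values.begin(), last)));
  return values;
}
```

One practical difference to keep in mind: the single-argument resize requires the element type to be default-constructible, because resize is also able to grow a vector, whereas erase has no such requirement. For element types like int this makes no difference.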
Remove Duplicates From a Vector in C++ Using std::set
In C++, another effective approach to removing duplicate elements from a vector is to use the std::set container. Unlike vectors, sets store only unique elements, making them a natural choice for deduplication tasks. In this section, we’ll explore the syntax and functionality of using std::set to achieve this goal.
The std::set container in C++ automatically maintains a sorted, unique collection of elements. To remove duplicates from a vector, we can initialize a set with the vector elements, and the set’s uniqueness property takes care of discarding duplicate values:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <vector>
std::set<int> uniqueSet(myVector.begin(), myVector.end());
Once the set is populated with unique elements, the assign function can be employed to overwrite the original vector with these unique values:
myVector.assign(uniqueSet.begin(), uniqueSet.end());
Now, let’s put these concepts into practice with a complete working example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <vector>
int main() {
  std::vector<int> myVector = {10, 23, 10, 324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  // The set keeps one copy of each value, in ascending order.
  std::set<int> uniqueSet(myVector.begin(), myVector.end());
  // Overwrite the vector with the set's contents.
  myVector.assign(uniqueSet.begin(), uniqueSet.end());

  std::cout << "Unique elements using std::set: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }
  return 0;
}
In this example, we begin by including the necessary headers and initializing a vector with duplicate elements.
The critical step is creating a set, uniqueSet, and populating it with the elements of the original vector. The set’s uniqueness property automatically ensures that only distinct elements are stored.
std::set<int> uniqueSet(myVector.begin(), myVector.end());
Following the creation of the set, we use the assign function to overwrite the original vector with the unique elements contained in the set:
myVector.assign(uniqueSet.begin(), uniqueSet.end());
Now, the vector, myVector, is updated to contain unique elements only. The loop at the end iterates through the modified vector to display the unique elements.
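Code Output:
Unique elements using std::set: 10; 23; 40; 110; 129; 324; 424; 649;
This output displays the vector after successfully removing duplicates using the std::set container. The set’s inherent uniqueness property simplifies the deduplication process, and because std::set keeps its elements ordered, the resulting vector is also sorted in ascending order.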
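As a generic variant of the same idea, the sketch below returns a new, deduplicated vector and leaves the input untouched. The name dedupeViaSet is hypothetical, and the element type is assumed to support operator<, which std::set needs by default.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: returns the distinct values of `values`
// in ascending order, without modifying the input.
template <typename T>
std::vector<T> dedupeViaSet(const std::vector<T>& values) {
  // Constructing the set discards repeated values.
  std::set<T> uniqueSet(values.begin(), values.end());
  // Copy the set's elements (already sorted) into a new vector.
  return std::vector<T>(uniqueSet.begin(), uniqueSet.end());
}
```

For instance, passing a vector holding 3, 1, 3, 2, 1 produces a vector holding 1, 2, 3.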
Remove Duplicates From a Vector in C++ Using std::unordered_set for Improved Performance
If performance is a priority, using std::unordered_set to remove duplicates from a vector can be an efficient choice. Unlike std::set, std::unordered_set doesn’t maintain a sorted order, which typically makes insertion and lookup faster.
The std::unordered_set container in C++ is an unordered associative container that stores unique elements. Due to its hash-based implementation, it provides constant-time average complexity for insertion and lookup operations.
In the context of removing duplicates, we can use it as follows:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_set>
#include <vector>
std::unordered_set<int> uniqueSet;
Starting from an empty unordered set, we pass over the original vector with std::remove_if and erase every element whose insertion into the set fails, because a failed insertion means the value has already been seen:
myVector.erase(std::remove_if(myVector.begin(), myVector.end(),
                              [&uniqueSet](const int& val) {
                                return !uniqueSet.insert(val).second;
                              }),
               myVector.end());
Note that the set must start out empty. If it were initialized with all of the vector’s elements, every insertion would fail and the vector would end up empty.
Let’s examine this approach with a complete working example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_set>
#include <vector>
int main() {
  std::vector<int> myVector = {10, 23, 10, 324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  // Starts empty; records each value the first time it is seen.
  std::unordered_set<int> uniqueSet;
  myVector.erase(std::remove_if(myVector.begin(), myVector.end(),
                                [&uniqueSet](const int& val) {
                                  // insert().second is false for a repeat.
                                  return !uniqueSet.insert(val).second;
                                }),
                 myVector.end());

  std::cout << "Unique elements using unordered_set: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }
  return 0;
}
The code begins by including the necessary headers and initializing a vector with duplicate elements. An initially empty std::unordered_set records the values that have been seen so far.
The std::remove_if and erase combination then removes duplicates from the original vector based on that set.
myVector.erase(std::remove_if(myVector.begin(), myVector.end(),
                              [&uniqueSet](const int& val) {
                                return !uniqueSet.insert(val).second;
                              }),
               myVector.end());
The predicate checks whether the insertion into the unordered set succeeds. If the element already exists in the set, it is a duplicate and is removed from the vector. The first occurrence of each value is kept, so the remaining elements stay in their original relative order.
Code Output:
Unique elements using unordered_set: 10; 23; 324; 424; 649; 110; 129; 40;
This output reflects the vector after successfully removing duplicates using std::unordered_set. The unordered set’s hash-based implementation gives average constant-time insertion and lookup, making it a performant choice for deduplication tasks.
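A variant that avoids a stateful predicate inside std::remove_if is to copy the first occurrence of each value into a new vector. This is a minimal sketch under that assumption; the names dedupePreserveOrder and seen are hypothetical.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: keeps the first occurrence of each value and
// preserves the original relative order of the elements.
std::vector<int> dedupePreserveOrder(const std::vector<int>& values) {
  std::unordered_set<int> seen;  // starts empty
  std::vector<int> result;
  result.reserve(values.size());
  for (int value : values) {
    // insert() returns a pair; its bool member is true only when
    // the value was not already in the set.
    if (seen.insert(value).second) {
      result.push_back(value);
    }
  }
  return result;
}
```

Applied to the vector used throughout this article, the result holds 10, 23, 324, 424, 649, 110, 129 and 40, in that order. The trade-off is the extra memory for the second vector.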
Remove Duplicates From a Vector in C++ Using a Loop
When simplicity matters more than speed, for example with small vectors, a straightforward approach is to use an explicit loop to remove duplicates from a vector in C++. This method keeps the first occurrence of each element, preserves the original order, and needs no additional container.
The core idea is to iterate through the vector and erase every element that has already appeared earlier. The condition inside the loop uses std::find to search the part of the vector that precedes the current position and, if a match is found, removes the current element:
#include <algorithm>
#include <iostream>
#include <vector>
for (auto it = myVector.begin(); it != myVector.end(); ++it) {
  if (std::find(myVector.begin(), it, *it) != it) {
    it = myVector.erase(it) - 1;
  }
}
Let’s explore this loop-based technique with a complete working example:
#include <algorithm>
#include <iostream>
#include <vector>
int main() {
  std::vector<int> myVector = {10, 23, 10, 324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  for (auto it = myVector.begin(); it != myVector.end(); ++it) {
    // Look for the current value among the elements before it.
    if (std::find(myVector.begin(), it, *it) != it) {
      // Duplicate: erase it and step back so ++it visits the next element.
      it = myVector.erase(it) - 1;
    }
  }

  std::cout << "Unique elements using a loop: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }
  return 0;
}
The code begins by including the necessary headers and initializing a vector with duplicate elements. The loop iterates through the vector using an iterator, and for each element, it checks if the element has already been encountered in the vector before the current position.
If a duplicate is found, the element is erased. The erase function returns an iterator to the element that followed the erased one, so the code steps back by one position to let the loop’s ++it land on that element. Stepping back is safe here because the first element can never be a duplicate of an earlier one.
for (auto it = myVector.begin(); it != myVector.end(); ++it) {
  if (std::find(myVector.begin(), it, *it) != it) {
    it = myVector.erase(it) - 1;
  }
}
This loop continues until the end of the vector is reached, effectively removing duplicates.
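Code Output:
Unique elements using a loop: 10; 23; 324; 424; 649; 110; 129; 40;
This output reflects the vector after successfully removing duplicates using a loop. This method runs in quadratic time in the worst case, since each element is compared against the elements before it and each erase shifts the elements after it, so it is slower than the other approaches on large vectors. In return, it provides a clear and straightforward solution for deduplicating a vector in C++.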
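The same loop can also be written without the step-back arithmetic by advancing the iterator only when nothing was erased. The sketch below shows this formulation; the helper name dedupeWithLoop is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: removes later duplicates in place,
// keeping the first occurrence of each value.
void dedupeWithLoop(std::vector<int>& values) {
  auto it = values.begin();
  while (it != values.end()) {
    // Search only the part of the vector that precedes `it`.
    if (std::find(values.begin(), it, *it) != it) {
      // erase returns an iterator to the element after the removed one,
      // so there is no need to increment in this branch.
      it = values.erase(it);
    } else {
      ++it;
    }
  }
}
```

This version behaves the same as the for loop above on the example vector and also handles an empty vector, since the loop body never runs in that case.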
Conclusion
Removing duplicates from a vector in C++ is a common task with various approaches, each offering its advantages and considerations. We explored several techniques in this article, including the use of std::sort and std::unique, std::set, std::unordered_set for improved performance, and a loop-based approach.
The std::sort and std::unique combination provides a simple and effective solution, while std::set offers an ordered alternative. For enhanced performance, especially with larger datasets, std::unordered_set presents a hash-based approach.
Additionally, a loop-based method provides a straightforward alternative. The choice of method depends on the specific requirements, emphasizing the importance of considering factors such as performance, simplicity, and the need for a sorted output. With these techniques, you can confidently tackle the task of deduplicating vectors in C++ based on your unique project constraints and goals.
Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.