C# Tip: Use a SortedSet to avoid duplicates and sort items


As you probably know, you can create collections of items without duplicates by using a HashSet<T> object.

It is quite useful to remove duplicates from a list of items of the same type.

How can we ensure that we always have sorted items? The answer is simple: SortedSet<T>!

HashSet: a collection without duplicates

A simple HashSet creates a collection of unordered items without duplicates.

This example

var hashSet = new HashSet<string>();
hashSet.Add("Turin");
hashSet.Add("Naples");
hashSet.Add("Rome");
hashSet.Add("Bari");
hashSet.Add("Rome");
hashSet.Add("Turin");

var resultHashSet = string.Join(',', hashSet);
Console.WriteLine(resultHashSet);

prints this string: Turin,Naples,Rome,Bari. Here the items come out in insertion order, but HashSet<T> makes no guarantee about ordering, so you should not rely on it.
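If you need to know whether an item was actually inserted, you can look at the value returned by Add. The snippet below is a minimal sketch of that behavior; the variable names are only illustrative.

```csharp
using System;
using System.Collections.Generic;

var cities = new HashSet<string>();

// Add returns true when the item was inserted,
// and false when an equal item was already in the set.
bool firstTurin = cities.Add("Turin");
bool secondTurin = cities.Add("Turin");

Console.WriteLine(firstTurin);   // True
Console.WriteLine(secondTurin);  // False
Console.WriteLine(cities.Count); // 1
```

The duplicate is rejected silently: no exception is thrown, and the collection keeps a single "Turin".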

SortedSet: a sorted collection without duplicates

To sort those items, we have two approaches.

You can simply sort the collection once you’ve finished adding items:

var hashSet = new HashSet<string>();
hashSet.Add("Turin");
hashSet.Add("Naples");
hashSet.Add("Rome");
hashSet.Add("Bari");
hashSet.Add("Rome");
hashSet.Add("Turin");

var items = hashSet.ToList<string>().OrderBy(s => s);

var resultHashSet = string.Join(',', items);
Console.WriteLine(resultHashSet);

Or, even better, use the right data structure, a SortedSet<T>:

var sortedSet = new SortedSet<string>();

sortedSet.Add("Turin");
sortedSet.Add("Naples");
sortedSet.Add("Rome");
sortedSet.Add("Bari");
sortedSet.Add("Rome");
sortedSet.Add("Turin");

var resultSortedSet = string.Join(',', sortedSet);
Console.WriteLine(resultSortedSet);

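Both approaches print Bari,Naples,Rome,Turin. But the second one does not require a separate sorting step or an intermediate list: a SortedSet<T> keeps its items in order as you add them, which pays off most when you read the sorted items repeatedly while the collection keeps changing.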
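To see that the order is maintained on every insertion, and not only when the set is first filled, here is a small sketch. The extra city and the variable names are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

var sortedCities = new SortedSet<string> { "Turin", "Naples", "Rome", "Bari" };

string before = string.Join(',', sortedCities);

// An item added later lands in its sorted position automatically.
sortedCities.Add("Milan");

string after = string.Join(',', sortedCities);

Console.WriteLine(before); // Bari,Naples,Rome,Turin
Console.WriteLine(after);  // Bari,Milan,Naples,Rome,Turin

// Min and Max expose the first and last items in sort order.
Console.WriteLine(sortedCities.Min); // Bari
Console.WriteLine(sortedCities.Max); // Turin
```

With the HashSet approach, you would have to call OrderBy again after each change to get the same result.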

Use custom sorting rules

What if we wanted to use a SortedSet with a custom object, like User?

public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public User(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

Of course, we can do that:

var set = new SortedSet<User>();

set.Add(new User("Davide", "Bellone"));
set.Add(new User("Scott", "Hanselman"));
set.Add(new User("Safia", "Abdalla"));
set.Add(new User("David", "Fowler"));
set.Add(new User("Maria", "Naggaga"));
set.Add(new User("Davide", "Bellone"));

foreach (var user in set)
{
    Console.WriteLine($"{user.LastName} {user.FirstName}");
}

But the code, even though it compiles, throws an exception at run time as soon as the set has to compare two items: our class doesn’t know how to compare things!

That’s why we must update our User class so that it implements the IComparable interface:

public class User : IComparable
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public User(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public int CompareTo(object obj)
    {
        var other = (User)obj;
        var lastNameComparison = LastName.CompareTo(other.LastName);

        return (lastNameComparison != 0)
            ? lastNameComparison
            : (FirstName.CompareTo(other.FirstName));
    }
}

In this way, everything works as expected:

Abdalla Safia
Bellone Davide
Fowler David
Hanselman Scott
Naggaga Maria

Notice that the second Davide Bellone has disappeared: CompareTo returned 0 when comparing it with the first one, so the set treated it as a duplicate.
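If you cannot (or don't want to) modify the class, SortedSet<T> also has a constructor that accepts an IComparer<T>. The sketch below is not part of the original example: to stay short and self-contained it uses a named tuple as a stand-in for the User class, and the byLastName comparer is a hypothetical name. Tuples already have a default ordering, so here the comparer replaces that ordering with the "last name, then first name" rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: order by last name, then by first name (ordinal comparison).
var byLastName = Comparer<(string FirstName, string LastName)>.Create((a, b) =>
{
    int lastNameComparison = string.CompareOrdinal(a.LastName, b.LastName);
    return lastNameComparison != 0
        ? lastNameComparison
        : string.CompareOrdinal(a.FirstName, b.FirstName);
});

// The comparer passed to the constructor decides both the order and what counts as a duplicate.
var users = new SortedSet<(string FirstName, string LastName)>(byLastName)
{
    ("Davide", "Bellone"),
    ("Scott", "Hanselman"),
    ("Safia", "Abdalla"),
    ("Davide", "Bellone"), // compares equal to the first entry, so it is not added
};

foreach (var user in users)
{
    Console.WriteLine($"{user.LastName} {user.FirstName}");
}
```

This keeps the sorting rule outside the type, which can be handy when the same objects need different orderings in different places.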

Wrapping up

Choosing the right data type is crucial for building robust and performant applications.

In this article, we’ve used a SortedSet to build a collection whose items are always sorted and free of duplicates.

I’ve never used it in a project. So, how did I know about it? I just explored the libraries I was using!

From time to time, spend a few minutes reading the documentation, take a look at the most common libraries, and so on: you’ll find lots of stuff that you never thought existed!

Toy with your code! Explore it. Be curious.

And have fun!

🐧

Source: https://www.code4it.dev/csharptips/sorted-set

