C++, Implement filter function for refactoring to keep code dry, clean and flexible

Sub-title: Refactor current function by filter function to make it more general and reduce repeated code

·

4 min read

You may be asked to add a function to get a list of nested items fulfilling certain condition and there is already another function doing similar thing, but with different condition. You can certainly create a new function to achieve this and ignore the existing one, but this may produce a lot of repeated code. So here, we suggest to use filter function and explain the usage with the following example.

Scenario

Example project: github.com/ivanyu199012/13-Refactor-FilterF..

Current code is as below (filter1.cpp):

PersonVector getPeopleOldThan18(const CityVector &cityVector);

int main()
{
  const CityVector cities = initData();
  std::cout << "----------------------------" << std::endl;
  std::cout << "People larger than 18:" << std::endl;
  PersonVector personVector = getPeopleOldThan18(cities);
  for (auto person : personVector)
  {
    std::cout << person->getName() << ", Age: " << person->getAge() << std::endl;
  }
}

PersonVector getPeopleOldThan18(const CityVector &cityVector)
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      if (pPerson->getAge() > 18)
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

There is an existing function getPeopleOldThan18() to get a vector of person older than 18 years old and each class City instance contains a vector of people.

City data structure(Model.cpp):

new Person("John", 34, 10000), // Name, Age, Income
const CityVector initData()
{
  std::vector<Person *> londonCitizens{new Person("John", 34, 10000),
                                       new Person("Peter", 54, 20000),
                                       new Person("Mary", 17, 3000)};

  std::vector<Person *> beijingCitizens{new Person("Wong Yin", 32, 10000),
                                        new Person("Xiu Ming", 52, 20000),
                                        new Person("Tai Man", 15, 3000)};

  std::vector<Person *> seoulCitizens{new Person("Mr. Pak", 32, 10000),
                                      new Person("Gilbin", 52, 20000),
                                      new Person("YoungSoo", 15, 3000)};

  return CityVector {new City("London", londonCitizens),
                    new City("Beijing", beijingCitizens),
                    new City("Seoul", seoulCitizens)};
}

Requirement: Add a function to retrieve people with income larger than 15000

You can add a new function getPeopleEarnMoreThan15000() and ignore the existing function as below(filter2.cpp):

// New function
PersonVector getPeopleEarnMoreThan15000(const CityVector &cityVector)
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      if (pPerson->getIncome() > 15000)
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

PersonVector getPeopleOldThan18(const CityVector &cityVector)
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      if (pPerson->getAge() > 18)
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

But except the if statements if (pPerson->getAge() > 18) and pPerson->getIncome() > 15000, remaining code of 2 functions are the same and a lot of repeated code is produced.

Better solution: Filter function

The concept is to add a new function getPeopleWithFilter() with an extra function parameter personFilter, which function's input is Person and output is boolean, and call the function in the if statement if(personFilter(pPerson)).(filter3.cpp)

PersonVector getPeopleWithFilter(const CityVector &cityVector,
 // Function Parameter
 const std::function<bool(Person *)> &personFilter )
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      // Call the filter function
      if (personFilter(pPerson))
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

Call above function with the existing function getPeopleOldThan18() and new function getPeopleIncomeMoreThan15000().

PersonVector getPeopleIncomeMoreThan15000(const CityVector &cityVector)
{
  std::function<bool(Person *)>
      personFilter = [](const Person *pPerson) -> bool
  {
    return pPerson->getIncome() > 15000;
  };
  // Call the above function
  return getPeopleWithFilter(cityVector, personFilter);
}

PersonVector getPeopleOldThan18(const CityVector &cityVector)
{
  std::function<bool(Person *)>
      personFilter = [](const Person *pPerson) -> bool
  {
    return pPerson->getAge() > 18;
  };
  // Call the above function
  return getPeopleWithFilter(cityVector, personFilter);
}

This code has 2 advantages:

  • Keep the code dry (reduce repeated code).
  • Only need to define the filter function and call getPeopleWithFilter() if you need to add another function to retrieve a list of nested item in the future.

Reason I wrote this blog

I tried to share the coding technique I have learnt and used in my work and hope that others can gain advantage from it. Also, I can refer to this blog if I forget the skill in the future.

Maybe some of you notice, I have not wrote any blog for about a year since I have moved to another company from last year and I spend a lot of time to adapt the new environment.

In this company, rather than developing new software, mostly I fix bug, improve and enhance current software. So refactoring code becomes my main task.

I have concluded some important points for refactoring the code.

  • Understand what the code is about.
  • Find out why the code is wrote like this.
  • For fixing bug, discovering root cause is crucial
    • If time is limited, you may fix problem with a fast temporary solution first and later and DON'T FORGET, pinpoint root cause
    • If root cause identified, this CAN probably lead to more serious problem later
  • MINIMIZE CODE CHANGE since more change may produce more bugs or errors. In this blog, we only modify the content of the function getPeopleOldThan18() without changing name, input and output.

Did you find this article valuable?

Support Ivan Yu by becoming a sponsor. Any amount is appreciated!