How to Convert String to Lower Case in C++
This article shows how to convert a string to lowercase in C++.
The first question to ask before converting a string in C++ is: what encoding does the input string use? If you apply std::tolower byte by byte to a string in a multi-byte encoding such as UTF-8, the non-ASCII characters will not be converted correctly.
The following function looks like a neat implementation of std::string lowercase conversion, but it does not convert every character to lowercase when the input is encoded in UTF-8.
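#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

// Lowercases each byte independently, so only single-byte characters change.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

int main() {
    // Assumes the source file is saved as UTF-8. A u8"" literal is avoided
    // because it no longer converts to std::string in C++20.
    std::string string1 = "ÅSH to LoWer WÅN";
    std::cout << "input string: " << string1 << std::endl
              << "output string: " << toLower(string1) << std::endl;
    return 0;
}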
The code above works for ASCII strings, and for single-byte encodings when a matching locale is set, but with UTF-8 input that contains non-ASCII characters, such as accented Latin letters, the output is not what we want.
Output:
input string: ÅSH to LoWer WÅN
output string: Åsh to lower wÅn
This is incorrect, since Å should have been lowercased to å. So, how can we get the correct output?
The best portable way to do this is to use the ICU (International Components for Unicode) library, which is mature and stable, widely available, and keeps your code cross-platform.
We only need to include the following headers in our source file. There is a good chance that ICU is already installed on your platform, so the code samples should work as they are. If you get IDE or compile-time errors, see the download instructions on the ICU documentation website.
#include <unicode/locid.h>
#include <unicode/unistr.h>
#include <unicode/ustream.h>
With the headers included, we can write the std::string to lowercase conversion as follows:
#include <unicode/locid.h>
#include <unicode/unistr.h>
#include <unicode/ustream.h>

#include <iostream>
#include <string>

int main() {
    // Assumes the source file is saved as UTF-8.
    std::string string1 = "ÅSH to LoWer WÅN";
    // fromUTF8 decodes the bytes as UTF-8 regardless of the default codepage.
    icu::UnicodeString unicodeString = icu::UnicodeString::fromUTF8(string1);
    std::cout << "input string: " << string1 << std::endl
              << "output string: " << unicodeString.toLower() << std::endl;
    return 0;
}
Note that we need to link against the ICU libraries when compiling this code:
g++ sample_code.cpp -licuio -licuuc -o sample_code
Run the code, and we get the correct output as expected:
input string: ÅSH to LoWer WÅN
output string: åsh to lower wån
The same function can process text in other languages that we might not expect as user input, and we can also pass a locale explicitly as an argument to toLower:
#include <unicode/locid.h>
#include <unicode/unistr.h>
#include <unicode/ustream.h>

#include <iostream>
#include <string>

int main() {
    // Assumes the source file is saved as UTF-8.
    std::string string2 = "Κάδμῳ ἀπιϰόμενοι.. Ελληνας ϰαὶ δὴ ϰαὶ γράμματα, οὐϰ ἐόντα πρὶν Ελλησι";
    icu::UnicodeString unicodeString2 = icu::UnicodeString::fromUTF8(string2);
    // Apply the lowercase rules of the Greek (Greece) locale.
    std::cout << unicodeString2.toLower(icu::Locale("el_GR")) << std::endl;
    return 0;
}
Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.