Regex Whitespace in Java
- Use the matches() Method to Find Whitespace Using Regular Expressions in Java
- Use the Pattern and Matcher Classes to Find Whitespace Using Regular Expressions in Java
- Use the String.replaceAll Method to Find Whitespace Using Regular Expressions in Java
- Conclusion
Handling and manipulating strings is a common task in Java programming. Sometimes, you might need to identify and work with whitespace within a string.
Regular expressions provide a powerful and flexible way to achieve this. In this article, we will explore several ways to find whitespace in a string using regular expressions in Java, covering the matches() method, the Pattern and Matcher classes, and the String.replaceAll method.
Use the matches() Method to Find Whitespace Using Regular Expressions in Java
The matches() method is a static method of the Pattern class. It takes two parameters: the regular expression to match and the input to test against it.
Its syntax is as follows:
public static boolean matches(String regex, CharSequence input)
The method returns true only if the entire input matches the regular expression; a match on just part of the input is not enough.
The code example below uses the matches() method to identify whitespace with several different regex constructs:
import java.util.regex.Pattern;

public class RegWhiteSpace {
  public static void main(String[] args) {
    // Pattern.matches() returns true only if the whole input matches the regex
    boolean whitespaceMatcher1 = Pattern.matches("\\s+", "   "); // three spaces
    boolean whitespaceMatcher2 = Pattern.matches("\\s", " ");
    boolean whitespaceMatcher3 = Pattern.matches("[\\t\\p{Zs}]", " ");
    boolean whitespaceMatcher4 = Pattern.matches("\\u0020", " ");
    boolean whitespaceMatcher5 = Pattern.matches("\\p{Zs}", " ");

    System.out.println("\\s+ ---------- " + whitespaceMatcher1);
    System.out.println("\\s ----------- " + whitespaceMatcher2);
    System.out.println("[\\t\\p{Zs}] --- " + whitespaceMatcher3);
    System.out.println("\\u0020 ------- " + whitespaceMatcher4);
    System.out.println("\\p{Zs} ------- " + whitespaceMatcher5);
  }
}
In the first case, the regex \s+ matches one or more whitespace characters. The input string "   " (three spaces) matches the pattern in full, so the boolean variable whitespaceMatcher1 is set to true.
boolean whitespaceMatcher1 = Pattern.matches("\\s+", "   ");
In the second case, the regex \s matches exactly one whitespace character. The input string " " (a single space) matches it, so whitespaceMatcher2 is set to true.
boolean whitespaceMatcher2 = Pattern.matches("\\s", " ");
The third case uses the regex [\t\p{Zs}], a character class that matches a single tab or a single character from the Unicode space separator category (Zs). It overlaps with \s on the plain space but is not equivalent to it: by default \s matches [ \t\n\x0B\f\r], so it covers line breaks that [\t\p{Zs}] does not, while \p{Zs} covers Unicode spaces such as the no-break space (U+00A0) that \s does not.
The input string is again " " (a single space), and whitespaceMatcher3 is set to true.
boolean whitespaceMatcher3 = Pattern.matches("[\\t\\p{Zs}]", " ");
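The difference between \s and [\t\p{Zs}] only shows up on inputs other than the plain space. As a small sketch (the class name WhitespaceClassComparison and both helper methods are made up for this illustration), the following compares the two patterns on a newline and on a no-break space:

```java
import java.util.regex.Pattern;

public class WhitespaceClassComparison {
  // Hypothetical helper: true if the whole input is one character matched by \s
  static boolean matchesBackslashS(String input) {
    return Pattern.matches("\\s", input);
  }

  // Hypothetical helper: true if the whole input is a tab or a Zs character
  static boolean matchesTabOrSeparator(String input) {
    return Pattern.matches("[\\t\\p{Zs}]", input);
  }

  public static void main(String[] args) {
    String newline = "\n";
    String noBreakSpace = "\u00A0";

    System.out.println("\\s on newline: " + matchesBackslashS(newline)); // true
    System.out.println("\\s on no-break space: " + matchesBackslashS(noBreakSpace)); // false
    System.out.println("[\\t\\p{Zs}] on newline: " + matchesTabOrSeparator(newline)); // false
    System.out.println("[\\t\\p{Zs}] on no-break space: " + matchesTabOrSeparator(noBreakSpace)); // true
  }
}
```

Which pattern fits depends on the input: \s is the usual choice for ASCII text with line breaks, while \p{Zs} is worth considering when the text may contain Unicode space characters.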
In the fourth case, the regex \u0020 matches the space character by its Unicode code point. Because the Java string literal is written as "\\u0020", the escape reaches the regex engine, which interprets it as U+0020. The input string is once again " " (a single space), so whitespaceMatcher4 is set to true.
boolean whitespaceMatcher4 = Pattern.matches("\\u0020", " ");
Finally, the fifth case uses the regex \p{Zs}, which matches any single character in the Unicode space separator category. The input string " " (a single space) belongs to that category, so whitespaceMatcher5 is set to true.
boolean whitespaceMatcher5 = Pattern.matches("\\p{Zs}", " ");
Output:
\s+ ---------- true
\s ----------- true
[\t\p{Zs}] --- true
\u0020 ------- true
\p{Zs} ------- true
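Because matches() tests the entire input, it reports false for a string that merely contains whitespace among other characters. A brief sketch of that behavior follows; the class name WholeInputMatch and the helper isAllWhitespace are illustrative names, not part of any library:

```java
import java.util.regex.Pattern;

public class WholeInputMatch {
  // Illustrative helper: true only if the input is non-empty and consists solely of whitespace
  static boolean isAllWhitespace(String input) {
    return Pattern.matches("\\s+", input);
  }

  public static void main(String[] args) {
    System.out.println(isAllWhitespace("   ")); // true
    System.out.println(isAllWhitespace("a b")); // false: letters are present
    System.out.println(isAllWhitespace(""));    // false: + needs at least one character
  }
}
```

To check whether whitespace occurs anywhere in a longer string, the find() method of the Matcher class is the better fit.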
Use the Pattern and Matcher Classes to Find Whitespace Using Regular Expressions in Java
In addition to the matches() method, Java provides the Pattern and Matcher classes, which offer more fine-grained control over regular expression matching. The Pattern class compiles a regular expression into a pattern, while the Matcher class performs matching operations on a given input string using that compiled pattern.
The find() method of the Matcher class is particularly useful here: it returns true as soon as some part of the input matches the pattern, so the whole input does not have to match.
public boolean find()
Below is a Java code example that uses the Pattern and Matcher classes to identify whitespace with several different regex constructs:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegWhiteSpaceMatcher {
  public static void main(String[] args) {
    // One or more whitespace characters, tested against three spaces
    Pattern pattern1 = Pattern.compile("\\s+");
    Matcher matcher1 = pattern1.matcher("   ");
    boolean whitespaceMatcher1 = matcher1.find();

    // A single whitespace character
    Pattern pattern2 = Pattern.compile("\\s");
    Matcher matcher2 = pattern2.matcher(" ");
    boolean whitespaceMatcher2 = matcher2.find();

    // A tab or a Unicode space separator
    Pattern pattern3 = Pattern.compile("[\\t\\p{Zs}]");
    Matcher matcher3 = pattern3.matcher(" ");
    boolean whitespaceMatcher3 = matcher3.find();

    // The space character given by its code point
    Pattern pattern4 = Pattern.compile("\\u0020");
    Matcher matcher4 = pattern4.matcher(" ");
    boolean whitespaceMatcher4 = matcher4.find();

    // Any Unicode space separator
    Pattern pattern5 = Pattern.compile("\\p{Zs}");
    Matcher matcher5 = pattern5.matcher(" ");
    boolean whitespaceMatcher5 = matcher5.find();

    System.out.println("\\s+ ---------- " + whitespaceMatcher1);
    System.out.println("\\s ----------- " + whitespaceMatcher2);
    System.out.println("[\\t\\p{Zs}] --- " + whitespaceMatcher3);
    System.out.println("\\u0020 ------- " + whitespaceMatcher4);
    System.out.println("\\p{Zs} ------- " + whitespaceMatcher5);
  }
}
In the first case, the pattern \s+ is compiled with Pattern.compile(), and a Matcher applies it to the string "   " (three spaces). The find() method locates the run of whitespace, so whitespaceMatcher1 is set to true.
Pattern pattern1 = Pattern.compile("\\s+");
Matcher matcher1 = pattern1.matcher("   ");
boolean whitespaceMatcher1 = matcher1.find();
In the second case, the pattern \s is compiled and applied to the string " " with a Matcher. The find() method detects the single whitespace character, so whitespaceMatcher2 is set to true.
The third case uses the pattern [\t\p{Zs}], which matches a single tab or Unicode space separator. It overlaps with \s on the plain space, although the two are not equivalent: \s also covers line breaks, and \p{Zs} also covers Unicode spaces such as the no-break space. The find() method finds the space in the input, so whitespaceMatcher3 is set to true.
In the fourth case, the escape \u0020 is compiled as a pattern that matches the space character. Applied to the input string " " through a Matcher, the find() method locates the space, and whitespaceMatcher4 is set to true.
Finally, the fifth case uses the pattern \p{Zs}, which matches a Unicode space separator. The Matcher finds one in the input string " ", so whitespaceMatcher5 is set to true.
Output:
\s+ ---------- true
\s ----------- true
[\t\p{Zs}] --- true
\u0020 ------- true
\p{Zs} ------- true
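The finer control that Matcher offers becomes visible when find() is called in a loop, since each call moves on to the next match. The following is a minimal sketch, assuming we want the position of every run of whitespace; the class name WhitespaceRuns and the helper findRuns are hypothetical names chosen for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WhitespaceRuns {
  // Compiled once and reused for every call
  private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

  // Hypothetical helper: returns "start-end" for each run of whitespace (end is exclusive)
  static List<String> findRuns(String input) {
    List<String> runs = new ArrayList<>();
    Matcher matcher = WHITESPACE_RUN.matcher(input);
    while (matcher.find()) {
      runs.add(matcher.start() + "-" + matcher.end());
    }
    return runs;
  }

  public static void main(String[] args) {
    System.out.println(findRuns("a  b\tc")); // [1-3, 4-5]
  }
}
```

Here start() and end() report where each match begins and ends, which the static Pattern.matches() method cannot provide.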
Use the String.replaceAll Method to Find Whitespace Using Regular Expressions in Java
The replaceAll method replaces every substring of a string that matches a given regular expression with a given replacement and returns the result as a new string. It can double as a whitespace check: remove everything the whitespace regex matches, then compare the length of the result with the length of the original.
Its syntax is as follows:
public String replaceAll(String regex, String replacement)
Here's a code example that uses the String.replaceAll method to identify whitespace with several different regex constructs:
public class RegWhiteSpaceReplace {
  public static void main(String[] args) {
    // If removing the matches shortens the string, it contained whitespace
    String input1 = "   "; // three spaces
    String whitespaceReplaced1 = input1.replaceAll("\\s+", "");
    boolean hasWhitespace1 = input1.length() != whitespaceReplaced1.length();

    String input2 = " ";
    String whitespaceReplaced2 = input2.replaceAll("\\s", "");
    boolean hasWhitespace2 = input2.length() != whitespaceReplaced2.length();

    String input3 = " ";
    String whitespaceReplaced3 = input3.replaceAll("[\\t\\p{Zs}]", "");
    boolean hasWhitespace3 = input3.length() != whitespaceReplaced3.length();

    System.out.println("\\s+ ---------- " + hasWhitespace1);
    System.out.println("\\s ----------- " + hasWhitespace2);
    System.out.println("[\\t\\p{Zs}] --- " + hasWhitespace3);
  }
}
In the first case, the method is applied to the string "   " (three spaces) with the regex \s+, which matches runs of whitespace. The resulting string, whitespaceReplaced1, has all whitespace removed. The boolean variable hasWhitespace1 is true when the original and modified strings differ in length, which means at least one whitespace character was removed.
String input1 = "   ";
String whitespaceReplaced1 = input1.replaceAll("\\s+", "");
boolean hasWhitespace1 = input1.length() != whitespaceReplaced1.length();
In the second case, the method is used on the string " " with the regex \s, which matches one whitespace character at a time. As in the first case, whitespaceReplaced2 holds the result, and hasWhitespace2 is true because the single space was removed.
The third case applies the method to the string " " with the regex [\t\p{Zs}], which matches a tab or a Unicode space separator. It overlaps with \s on the plain space but is not equivalent to it, since \s also covers line breaks and \p{Zs} also covers Unicode spaces such as the no-break space. The resulting string, whitespaceReplaced3, has the single space removed, so hasWhitespace3 is true.
Output:
\s+ ---------- true
\s ----------- true
[\t\p{Zs}] --- true
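Beyond detection, replaceAll is more commonly used to clean whitespace up. The following is a small sketch of two typical cases; the class name WhitespaceCleanup and the helpers normalize and removeWhitespace are hypothetical names introduced for this example:

```java
public class WhitespaceCleanup {
  // Hypothetical helper: collapses each run of whitespace to one space and trims both ends
  static String normalize(String input) {
    return input.replaceAll("\\s+", " ").trim();
  }

  // Hypothetical helper: removes every whitespace character matched by \s
  static String removeWhitespace(String input) {
    return input.replaceAll("\\s", "");
  }

  public static void main(String[] args) {
    System.out.println("[" + normalize("  Hello \t\n World  ") + "]"); // [Hello World]
    System.out.println("[" + removeWhitespace(" a b\tc ") + "]");      // [abc]
  }
}
```

Note that replaceAll compiles the regex on every call, so for code that runs often a precompiled Pattern may be preferable.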
Conclusion
Regular expressions give us several ways to identify whitespace within Java strings. The matches() method, the Pattern and Matcher classes, and the String.replaceAll method each offer distinct advantages for different use cases and coding preferences.
The matches() method is concise and suits checks on a whole string, while the Pattern and Matcher classes provide more control for finding whitespace inside a longer input. The String.replaceAll method, on the other hand, is the natural choice when the whitespace needs to be removed or replaced.
Understanding these methods gives us the flexibility to handle whitespace-related tasks effectively and to pick the approach that fits each situation.
Rupam Saini is an Android developer who also sometimes works as a web developer. He likes to read books and write about various things.