Character Classes
If we browse the Java regular expression documentation, we immediately find a table summarizing the regular expression constructs. That table is what we are going to look at here.
In the table below, the left-hand column lists the regular expression constructs, while the right-hand column describes the conditions under which each construct matches.
Construct | Description |
[abc] | a, b, or c (simple class) |
[^abc] | Any character except a, b, or c (negation) |
[a-zA-Z] | a through z, or A through Z, inclusive (range) |
[a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) |
[a-z&&[def]] | d, e, or f (intersection) |
[a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) |
[a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) |
Note: For the sake of simplicity, the examples below show only the console output, not the program that produced it. You can refer to the previous part for the sample code.
Simple classes – [abc]
This expression matches any single character listed inside the square brackets. Here we used [abcde], so every a, b, c, d, or e is replaced with *.
Input String: Tech.Bruiser
Regular Expression: [abcde]
Replacement String: *
Output String: T**h.Bruis*r
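The run above can be reproduced with a few lines of Java. This is only a minimal sketch: it assumes String.replaceAll as the replacement mechanism, and the class name SimpleClassDemo and the helper mask are hypothetical names chosen for illustration, not the sample program from the previous part.

```java
class SimpleClassDemo {
    // Hypothetical helper: replaces every match of the regex with '*'.
    static String mask(String input, String regex) {
        return input.replaceAll(regex, "*");
    }

    public static void main(String[] args) {
        // Each a, b, c, d, or e is matched individually and replaced.
        System.out.println(mask("Tech.Bruiser", "[abcde]")); // T**h.Bruis*r
    }
}
```

Note that the match is case-sensitive, which is why the uppercase B in Bruiser is left alone.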
Negation – [^abc]
To match all characters except those listed, insert the “^” metacharacter at the beginning of the character class.
Input String: Tech.Bruiser
Regular Expression: [^abcde]
Replacement String: *
Output String: *ec*******e*
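As a rough illustration of negation, the sketch below runs the same input through [^abcde]. It assumes String.replaceAll for the replacement; NegationDemo and mask are hypothetical names used only for this example.

```java
class NegationDemo {
    // Hypothetical helper: replaces every match of the regex with '*'.
    static String mask(String input, String regex) {
        return input.replaceAll(regex, "*");
    }

    public static void main(String[] args) {
        // Everything that is NOT a, b, c, d, or e is replaced,
        // including the dot and the uppercase letters.
        System.out.println(mask("Tech.Bruiser", "[^abcde]")); // *ec*******e*
    }
}
```

Only the two e characters and the c survive, because they are the only characters of the input that belong to the excluded set.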
Ranges – [a-zA-Z]
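To match a range of characters, put a hyphen between the first and last character of the range: for example, [a-zA-Z] matches any letter of the alphabet and [0-9] matches any digit. A range can also be negated with ^.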
Input String: Tech.Bruiser
Regular Expression: [a-e]
Replacement String: *
Output String: T**h.Bruis*r
Input String: Tech.Bruiser
Regular Expression: [^a-e]
Replacement String: *
Output String: *ec*******e*
Input String: Tech.Bruiser673
Regular Expression: [5-7]
Replacement String: *
Output String: Tech.Bruiser**3
Input String: Tech.Bruiser673
Regular Expression: [^5-7]
Replacement String: *
Output String: ************67*
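The four range examples above could be reproduced along these lines. This is a sketch under the assumption that String.replaceAll does the replacing; RangeDemo and mask are hypothetical names.

```java
class RangeDemo {
    // Hypothetical helper: replaces every match of the regex with '*'.
    static String mask(String input, String regex) {
        return input.replaceAll(regex, "*");
    }

    public static void main(String[] args) {
        System.out.println(mask("Tech.Bruiser", "[a-e]"));     // T**h.Bruis*r
        System.out.println(mask("Tech.Bruiser", "[^a-e]"));    // *ec*******e*
        System.out.println(mask("Tech.Bruiser673", "[5-7]"));  // Tech.Bruiser**3
        System.out.println(mask("Tech.Bruiser673", "[^5-7]")); // ************67*
    }
}
```

The range [a-e] behaves exactly like the simple class [abcde]; the hyphen is just shorthand for listing every character in between.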
Unions – [a-d[m-p]]
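What if we want to match characters from two or more ranges? A union is the right option for that: nest one character class inside another, and a character matches if it belongs to either of them.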
Input String: Tech.Bruiser673
Regular Expression: [a-c[S-U]]
Replacement String: *
Output String: *e*h.Bruiser673
Input String: Tech.Bruiser6738
Regular Expression: [1-3[5-7]]
Replacement String: *
Output String: Tech.Bruiser***8
Input String: Tech.Bruiser6738
Regular Expression: [1-3[A-C]]
Replacement String: *
Output String: Tech.*ruiser67*8
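A small sketch of the union examples follows, again assuming String.replaceAll; UnionDemo and mask are hypothetical names. It also checks the flat form of the first pattern, since the table notes that a union such as [a-d[m-p]] is equivalent to [a-dm-p].

```java
class UnionDemo {
    // Hypothetical helper: replaces every match of the regex with '*'.
    static String mask(String input, String regex) {
        return input.replaceAll(regex, "*");
    }

    public static void main(String[] args) {
        // T falls in S-U and c falls in a-c, so both are replaced.
        System.out.println(mask("Tech.Bruiser673", "[a-c[S-U]]"));  // *e*h.Bruiser673
        // The nested form and the flat form describe the same set.
        System.out.println(mask("Tech.Bruiser673", "[a-cS-U]"));    // *e*h.Bruiser673
        System.out.println(mask("Tech.Bruiser6738", "[1-3[5-7]]")); // Tech.Bruiser***8
        System.out.println(mask("Tech.Bruiser6738", "[1-3[A-C]]")); // Tech.*ruiser67*8
    }
}
```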
Intersections – [a-z&&[def]]
An intersection matches only the characters common to both sets, which are joined with &&. For [0-5&&[3-9]], the common values are 3, 4, and 5.
Input String: Tech.Bruiser6738
Regular Expression: [0-9&&[345]]
Replacement String: *
Output String: Tech.Bruiser67*8
Input String: Tech.Bruiser6738
Regular Expression: [a-z&&[c-e]]
Replacement String: *
Output String: T**h.Bruis*r6738
Input String: Tech.Bruiser6738
Regular Expression: [0-9&&[3-5]]
Replacement String: *
Output String: Tech.Bruiser67*8
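The intersection runs above might be reproduced like this. It is a minimal sketch that assumes String.replaceAll; IntersectionDemo and mask are hypothetical names.

```java
class IntersectionDemo {
    // Hypothetical helper: replaces every match of the regex with '*'.
    static String mask(String input, String regex) {
        return input.replaceAll(regex, "*");
    }

    public static void main(String[] args) {
        // Only characters that are in BOTH sets are replaced.
        System.out.println(mask("Tech.Bruiser6738", "[0-9&&[345]]")); // Tech.Bruiser67*8
        System.out.println(mask("Tech.Bruiser6738", "[a-z&&[c-e]]")); // T**h.Bruis*r6738
        System.out.println(mask("Tech.Bruiser6738", "[0-9&&[3-5]]")); // Tech.Bruiser67*8
    }
}
```

In the first and third runs the digit 3 is the only character of the input that is both a digit and in the set 3, 4, 5, so it is the only one replaced.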
Subtraction – [a-z&&[^m-p]]
Subtraction matches the characters of the first set that are not in the second set; it is written as an intersection with a negated class. For [0-9&&[^345]], the matching characters are 0, 1, 2, 6, 7, 8, and 9.
Input String: Tech.Bruiser6738
Regular Expression: [0-9&&[^345]]
Replacement String: *
Output String: Tech.Bruiser**3*
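To illustrate subtraction, the sketch below runs the pattern from the example and compares it with the equivalent flat class [0-26-9]. It assumes String.replaceAll; SubtractionDemo and mask are hypothetical names.

```java
class SubtractionDemo {
    // Hypothetical helper: replaces every match of the regex with '*'.
    static String mask(String input, String regex) {
        return input.replaceAll(regex, "*");
    }

    public static void main(String[] args) {
        // Digits are replaced unless they are 3, 4, or 5.
        System.out.println(mask("Tech.Bruiser6738", "[0-9&&[^345]]")); // Tech.Bruiser**3*
        // The same set written without subtraction: 0-2 and 6-9.
        System.out.println(mask("Tech.Bruiser6738", "[0-26-9]"));      // Tech.Bruiser**3*
    }
}
```

The digit 3 is kept because it is in the subtracted set, while 6, 7, and 8 are replaced.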