Escaping a String from getting regex parsed in Java

Question!

In Java, suppose I have a String variable S, and I want to search for it inside of another String T, like so:

   if (T.matches(S)) ...

(Note: the line above originally used T.contains(), until a few answers pointed out that contains() does not use regexes. My bad.)

But now suppose S may have unsavory characters in it. For instance, let S = "[hi". The left square bracket opens a character class that is never closed, so S is not a valid regex and T.matches(S) throws a PatternSyntaxException. Is there a function I can call to escape S so that this doesn't happen? In this particular case, I would like it to be transformed to "\[hi".

By : doodaddy


Answers

As Tom Hawtin said, you need to quote the pattern. You can do this in two ways (edit: actually three ways, as pointed out by @diastrophism):

  1. Surround the string with "\Q" and "\E" (this assumes S does not itself contain "\E", which would end the quoting early), like:

    if (T.matches("\\Q" + S + "\\E"))
    
  2. Use Pattern instead. The code would be something like this:

    Pattern sPattern = Pattern.compile(S, Pattern.LITERAL);
    if (sPattern.matcher(T).matches()) { /* do something */ }
    

    This way, you can cache the compiled Pattern and reuse it. If you are using the same regex more than once, you almost certainly want to do it this way.

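Note that matches() only succeeds when the whole of T matches the pattern. If you are using regular expressions to test whether a string is inside a larger string, you should either put .* at the start and end of the expression or call Matcher.find() instead of matches(). The .* approach will not work with Pattern.LITERAL, since the whole pattern is then literal and it will be looking for actual dots; it only works if you quote just S and leave the .* outside the quoted part. So, are you absolutely certain you want to be using regular expressions?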
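As a rough illustration of the substring case, here is a minimal sketch. The class and method names (LiteralSearch, literal, containsLiteral, containsViaWildcards) are made up for this example; only Pattern.compile, Pattern.LITERAL, Pattern.quote, Matcher.find and String.matches come from the JDK. It compiles S once as a literal so the Pattern can be reused, then searches with find(), and also shows the wildcard variant where only S is quoted.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class LiteralSearch {
    // Compile s once as a literal; the resulting Pattern can be reused for many inputs.
    static Pattern literal(String s) {
        return Pattern.compile(s, Pattern.LITERAL);
    }

    // Substring test: find() succeeds if the pattern occurs anywhere in text.
    static boolean containsLiteral(Pattern p, String text) {
        return p.matcher(text).find();
    }

    // Same test with matches(): only s is quoted, the wildcards stay outside the quoting.
    // The (?s) flag lets '.' match line terminators as well.
    static boolean containsViaWildcards(String s, String text) {
        return text.matches("(?s).*" + Pattern.quote(s) + ".*");
    }

    public static void main(String[] args) {
        Pattern p = literal("[hi");
        System.out.println(containsLiteral(p, "well, [hi there"));          // true
        System.out.println(p.matcher("well, [hi there").matches());         // false: not the whole string
        System.out.println(containsViaWildcards("[hi", "well, [hi there")); // true
    }
}
```

If all you need is a yes/no substring test, the find() version is probably the simpler of the two, since it does not depend on how '.' treats newlines.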



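Try Pattern.quote(String). It returns a literal pattern String for its argument, so every character that would otherwise have special meaning is matched as-is. For S = "[hi" it returns "\Q[hi\E" rather than "\[hi", but both match the same text.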
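A small sketch of how that might look in practice. QuoteExample and its methods are hypothetical names for this example; Pattern.quote and String.matches are the real JDK calls.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class QuoteExample {
    // Returns the quoted form of s, safe to embed in a larger regex.
    static String quoted(String s) {
        return Pattern.quote(s);
    }

    // True only if text is exactly equal to s, tested through the regex engine.
    static boolean matchesLiterally(String text, String s) {
        return text.matches(Pattern.quote(s));
    }

    public static void main(String[] args) {
        String s = "[hi";
        System.out.println(quoted(s));                      // \Q[hi\E
        System.out.println(matchesLiterally("[hi", s));     // true
        System.out.println(matchesLiterally("say [hi", s)); // false: matches() needs the whole string

        // An S that itself contains \E is still treated literally by Pattern.quote.
        String tricky = "a\\E[b";
        System.out.println(matchesLiterally(tricky, tricky)); // true
    }
}
```

Compared with concatenating "\\Q" and "\\E" by hand, Pattern.quote also copes with an S that already contains "\E".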



T.contains() (according to the javadoc: http://java.sun.com/javase/6/docs/api/java/lang/String.html) does not use regexes. contains() delegates to indexOf() only.

So, there are NO regexes used here. Were you thinking of some other String method?
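For comparison, a minimal sketch of the regex-free route. ContainsExample and containsPlain are made-up names; String.contains and String.indexOf are the standard methods. No escaping is needed because S is never parsed as a pattern.

```java
// Hypothetical helper class, for illustration only.
class ContainsExample {
    // Plain substring test; s is compared character by character, not as a regex.
    static boolean containsPlain(String text, String s) {
        return text.contains(s);
    }

    public static void main(String[] args) {
        String t = "say [hi there";
        String s = "[hi";
        System.out.println(containsPlain(t, s)); // true
        System.out.println(t.indexOf(s));        // 4
    }
}
```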

By : anjanb

