Java String.split() special character processing

introduction

  • JDK 1.8

    split function

    Note that the split function takes a regular expression as its argument, not a literal delimiter. The split function is defined as:

    /**
     * Splits this string around matches of the given <a
     * href="../util/regex/Pattern.html#sum">regular expression</a>.
     *
     * <p> This method works as if by invoking the two-argument {@link
     * #split(String, int) split} method with the given expression and a limit
     * argument of zero.  Trailing empty strings are therefore not included in
     * the resulting array.
     *
     * <p> The string {@code "boo:and:foo"}, for example, yields the following
     * results with these expressions:
     *
     * <blockquote><table cellpadding=1 cellspacing=0 summary="Split examples showing regex and result">
     * <tr>
     *  <th>Regex</th>
     *  <th>Result</th>
     * </tr>
     * <tr><td align=center>:</td>
     *     <td>{@code { "boo", "and", "foo" }}</td></tr>
     * <tr><td align=center>o</td>
     *     <td>{@code { "b", "", ":and:f" }}</td></tr>
     * </table></blockquote>
     *
     *
     * @param  regex
     *         the delimiting regular expression
     *
     * @return  the array of strings computed by splitting this string
     *          around matches of the given regular expression
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     */
    public String[] split(String regex) { ... }
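
    The Javadoc above says that the one-argument form behaves like split(regex, 0) and therefore drops trailing empty strings. A minimal sketch of that difference, using the "boo:and:foo" string from the Javadoc table; the class and method names (SplitLimitDemo, splitDefault, splitKeepTrailing) are illustrative and not part of the JDK:

```java
import java.util.Arrays;

// Illustrative demo class; the name is an assumption, not from the original article.
class SplitLimitDemo {

    // One-argument form: equivalent to split(regex, 0), trailing empty strings are removed.
    static String[] splitDefault(String s, String regex) {
        return s.split(regex);
    }

    // A negative limit keeps every piece, including trailing empty strings.
    static String[] splitKeepTrailing(String s, String regex) {
        return s.split(regex, -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitDefault("boo:and:foo", "o")));      // [b, , :and:f]
        System.out.println(Arrays.toString(splitKeepTrailing("boo:and:foo", "o"))); // [b, , :and:f, , ]
    }
}
```

    If trailing empty fields matter (for example, when parsing delimited records), the two-argument form with a negative limit is probably the safer choice.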
    

    special symbol processing

    The split function takes a regular expression as its argument, so a delimiter that is also a regular expression metacharacter needs special handling.

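    For example, . is a wildcard in regular expressions: by default it matches any single character except a line terminator (such as \n or \r). Splitting on "." therefore treats every character as a delimiter instead of splitting only at the literal dots.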
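
    To see why the unescaped dot is a problem, here is a small sketch that counts the pieces returned by split. The class and method names (UnescapedDotDemo, countParts) are made up for illustration:

```java
// Illustrative demo class; the names are assumptions, not from the original article.
class UnescapedDotDemo {

    // Returns how many pieces split produces for the given regex.
    static int countParts(String s, String regex) {
        return s.split(regex).length;
    }

    public static void main(String[] args) {
        // "." matches every character, so every piece is empty and all of them
        // are dropped as trailing empty strings.
        System.out.println(countParts("a.b.c", "."));   // 0

        // "\\." matches only a literal dot.
        System.out.println(countParts("a.b.c", "\\.")); // 3
    }
}
```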

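    A special symbol can be made to match literally in two ways:

    • Escape it with a backslash. For example, \. (written as "\\." in a Java string literal).
    • Put it in square brackets (a character class). For example, [.]

      Example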

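      import java.util.Arrays;

      // Escape the dot so that it matches literally
      String[] s1 = "a.b.c".split("\\.");
      System.out.println(Arrays.asList(s1)); // [a, b, c]

      // Or wrap the dot in a character class
      String[] s2 = "a.b.c".split("[.]");
      System.out.println(Arrays.asList(s2)); // [a, b, c]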
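
      Beyond the two approaches above, when the delimiter is not known in advance, one option is to let java.util.regex.Pattern.quote build the literal pattern instead of escaping by hand. This is an additional technique, not one the original text describes; the class and helper names (QuoteDelimiterDemo, splitLiteral) are hypothetical:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative demo class; the names are assumptions, not from the original article.
class QuoteDelimiterDemo {

    // Splits s on the delimiter taken literally, whatever characters it contains.
    static String[] splitLiteral(String s, String delimiter) {
        // Pattern.quote wraps the delimiter so regex metacharacters lose their meaning.
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("a.b.c", "."))); // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // [a, b, c]
    }
}
```

      This avoids having to remember which characters are metacharacters, at the cost of a slightly longer call.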
      

      Reference

      https://www.runoob.com/regexp/regexp-metachar.html
