Stepan Yakovenko - 5 months ago
Java Question

How to remove bad characters that are not suitable for utf8 encoding in MySQL?

I have dirty data. Sometimes it contains characters like this. I use this data to make queries like

WHERE a.address IN ('mydatahere')

For this character I get

org.hibernate.exception.GenericJDBCException: Illegal mix of collations (utf8_bin,IMPLICIT), (utf8mb4_general_ci,COERCIBLE), (utf8mb4_general_ci,COERCIBLE) for operation ' IN '

How can I filter out characters like this? I use Java.



Maybe this will help someone, as it helped me.

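public static String removeBadChars(String s) {
  if (s == null) return null;
  StringBuilder sb = new StringBuilder();
  for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    // Drop both halves of a surrogate pair: together they encode a character
    // outside the BMP, which needs 4 bytes in UTF-8.
    if (Character.isSurrogate(c)) continue;
    sb.append(c);
  }
  return sb.toString();
}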
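
The same idea can also be sketched over code points instead of chars. This is only an illustration, assuming the goal is to drop every character outside the Basic Multilingual Plane (the ones that need 4 bytes in UTF-8 and that MySQL's 3-byte utf8 charset cannot store), plus any unpaired surrogate. The class name `Utf8Mb3Filter` and the method name `stripSupplementary` are hypothetical, not part of any library.

```java
public class Utf8Mb3Filter {

    /**
     * Hypothetical helper: returns a copy of s without supplementary
     * code points (above U+FFFF) and without unpaired surrogates.
     */
    public static String stripSupplementary(String s) {
        if (s == null) return null;
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints()
         // Keep BMP code points only; an unpaired surrogate shows up here
         // as a BMP value in the range U+D800..U+DFFF, so reject it too.
         .filter(cp -> Character.isBmpCodePoint(cp) && !Character.isSurrogate((char) cp))
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\uD83D\uDE00" is a surrogate pair encoding one emoji (U+1F600).
        String dirty = "12 Main St \uD83D\uDE00 Apt 4";
        System.out.println(stripSupplementary(dirty));
    }
}
```

The cleaned string can then be bound as the parameter of the IN (...) query. Note that this silently discards data; if the characters need to be kept, the column's character set would have to support them instead.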