Stepan Yakovenko - 1 month ago
Java Question

How to remove bad characters that are not suitable for utf8 encoding in MySQL?

I have dirty data. Sometimes it contains characters like this. I use this data to make queries like

WHERE a.address IN ('mydatahere')


For this character I get


org.hibernate.exception.GenericJDBCException: Illegal mix of collations (utf8_bin,IMPLICIT), (utf8mb4_general_ci,COERCIBLE), (utf8mb4_general_ci,COERCIBLE) for operation ' IN '


How can I filter out characters like this? I use Java.

Thanks.

Answer

Maybe this will help someone, as it helped me.

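public static String removeBadChars(String s) {
  if (s == null) return null;
  StringBuilder sb = new StringBuilder(s.length());
  for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    // Characters outside the Basic Multilingual Plane (e.g. emoji) need 4 bytes
    // in UTF-8, which MySQL's 3-byte utf8 charset cannot store. In a Java String
    // they appear as surrogate pairs, so drop both halves of the pair; skipping
    // only the high surrogate would leave an orphaned low surrogate behind.
    if (Character.isSurrogate(c)) continue;
    sb.append(c);
  }
  return sb.toString();
}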
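
The same idea can also be written in terms of code points rather than individual chars. This is only a sketch, assuming Java 8 or later; the class name Utf8Mb3Filter and the method name stripSupplementary are made up for illustration and are not part of any library. It keeps every code point in the Basic Multilingual Plane (the ones that fit in 3 UTF-8 bytes) and drops supplementary code points as well as any stray, unpaired surrogate.

```java
final class Utf8Mb3Filter {

    private Utf8Mb3Filter() {}

    // Hypothetical helper: returns s without supplementary code points
    // (those needing 4 bytes in UTF-8) and without unpaired surrogates.
    static String stripSupplementary(String s) {
        if (s == null) return null;
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints()
            // Keep BMP code points only; a lone surrogate also shows up here
            // as a BMP value, so it is filtered out explicitly.
            .filter(cp -> Character.isBmpCodePoint(cp)
                    && !Character.isSurrogate((char) cp))
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }
}
```

The result of Utf8Mb3Filter.stripSupplementary(address) would then be the value bound into the IN (...) clause. Note that this silently discards data, so it is probably only appropriate when the dropped characters really are noise.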