Spark infer schema from Mysql DB on RDS(AWS environment),
data type in one of columns-"active" is tinyint(1) wait possible value:
It's not spark that converts the
TINYINT type into a
boolean but the j-connector used under the hood.
So, actually you don't need to specify a schema for that issue. Because what's actually causing this is the jdbc driver that treats the datatype
TINYINT(1) as the
BIT type (because the server silently converts
TINYINT(1) when creating tables).
You can check all the tips and gotchas of the jdbc connector in the MySQL official Connector/J Configuration Properties guide.
You just need to pass the right parameters for your jdbc connector by adding the following to your url connection :
val newUrl = s"$oldUrl&tinyInt1isBit=false" val data = spark.read.format("jdbc") .option("url", newUrl) // your other jdbc options .load