I am creating a database table with over 1.3 billion rows.
The stucture: ID, info1, info2, info3, city
In the 1.3 billion rows there are a lot of the same city's.
I was wondering, what would be faster (when looking up information with PDO PHP)
You have the right idea. Don't repeat the same city multiple times. It will slow down your queries in part because the city names will take up much more space in memory than integers. Larger tables take more time to process, and with almost a billion and a half rows, it should make a significant difference. See this posting.
Repeating the same name thousands of times also makes it a lot more likely that you'll introduce variations (typos) that make your query results unreliable. Finally, this design will make it much tougher to make changes to a city name.
Replace the plain city with a unique
city_id for each city. In another
city table, you'd have that same
city_id and a plain
name that holds the full city's name.
To profit from the change, you must let MySQL know of this relationship. When you create the tables, be sure to make
city_id the primary key of the city table, and in your big table make
city_id a foreign key.
This change will also allow you to expand your database more easily if you want to start tracking more info on a city. To learn more, look into database normalization.