Awais Mahmood Awais Mahmood - 2 months ago 7x
SQL Question

Database design considerations for many unused columns with every table has same schema

I am a web developer so I don't know a lot about databases. The company I joined recently has a very mature Desktop ERP built in .NET and SQL Server, they are providing services to huge corporate clients and there design is working fine. But they didn't develop any web based system. But there database design is quite unusual. Let me explain it and then I will post my questions.

So, Now I joined them to develop them a web based ERP (a replica of there desktop system in web). Since I am building the application from scratch, they have given me liberty to revamp any thing which I think would effect positively.

Now the design is,

  • They have around 150 tables in database.

  • Every table has the same schema definition.

  • They have divided the fields in three categories.

    • Strings (so they assign 50 varchar(250) fields in database).

    • DateTime (so they assign 15 smalldatetime fields in database).

    • Numeric (so they assign 30 Numeric() fields in database).

  • All columns have names as (these names don't terrify the developers, in a week or two they are accustomed and even remembered many of the fields associations):

    • Strings (S1, S2, S3, S4 and so on).

    • DateTime (D1, D2, D3, D4 and so on).

    • Numerics (N1, N2, N3, N4 and so on).

  • As I have told you the schema. Every table consist of 95 columns. And only 15-20 columns are actually been used. The remaining 75-80 columns are NULL.

  • The tables are well normalized and indexes are maintained.

  • Number of rows in most of the tables are less than 1000. Only the transaction table records touches several hundred thousands.

  • The precision for numeric columns are by default (1, 0). When any field is selected to be used then the precision is adjusted as per the requirement.

  • An empty database is of ~4MB.

  • This design makes there development quite easy. Since they have a number of columns and whenever they need a field. They just select the data type, i.e. String or Numeric or DateTime and the next available column is assigned.

  • Only 9-10 tables have image fields.

I think this information is quite enough. Now I want to ask

  • Since, I don't know a lot about SQL. Is this design viable for web environment (web API which will be called from web client as well as from mobile)?

  • Since, every table has 75-80 NULL columns. Does they cost us a lot of memory in future, when transaction records will touch millions? (considering that application is multi tenant)

  • What are your suggestions to improve this design?



You have two choices:

  1. Use it, and live with it.

  2. Completely redesign it.

I recommend #1 because #2 will be hard and will be viewed skeptically by your colleagues and boss. Any problems with you progress will be put down to your crazy database design.

The database you describe embodies the classic entity-value-attribute design error. Instead of defining tables modeled on real-world entities in the universe of discourse, and using the DBMS to enforce and infer logical relationships among them, the designers opted to remove all meaning from the database to the application. Entities that should be in the database are constructed in memory using application logic that supplies meaning to S1 and such. From a database perspective, it's an absolute nightmare.

It's also understandable. EVA designs usually arise where there's little database expertise, and where the problem domain is poorly understood. That adds up to "anything could go in the database", and an EVA design will indeed hold "anything". To the extent your customers determine the actual design -- that is, the supply independent meanings for each database column -- the application acts as a kind of DBMS proxy. The fact that every table has a vast number of unused columns suggests their use may be customer-determined: the customer can "add a column", and the application plucks one from the unused pile. No schema change necessary. It's dynamic!

There's a whole industry based on that idea. For example, the so-called "master data management" tools boil down to an EVA design where the customer designs a database within the application, and application uses the DBMS in much the way you describe.