We're developing a monitoring system. In our system values are reported by agents running on different servers. This observations reported can be values like:
IntMeasure -> INTEGER
FloatMeasure -> FLOAT
Status -> varchar(255)
I haven't done a formal study, but from my own experience I would guess that more than 80% of database design flaws are generated from designing with performance as the most important (if not only) consideration.
If a good design calls for multiple tables, create multiple tables. Don't automatically assume that joins are something to be avoided. They are rarely the true cause of performance problems.
The primary consideration, first and foremost in all stages of database design, is data integrity. "The answer may not always be correct, but we can get it to you very quickly" is not a goal any shop should be working toward. Once data integrity has been locked down, if performance ever becomes an issue, it can be addressed. Don't sacrifice data integrity, especially to solve problems that may not exist.
With that in mind, look at what you need. You have observations you need to store. These observations can vary in the number and types of attributes and can be things like the value of a measurement, the notification of an event and the change of a status, among others and with the possibility of future observations being added.
This would appear to fit into a standard "type/subtype" pattern, with the "Observation" entry being the type and each type or kind of observation being the subtype, and suggests some form of type indicator field such as:
create table Observations( ..., ObservationKind char( 1 ) check( ObservationKind in( 'M', 'E', 'S' )), ... );
But hardcoding a list like this in a check constraint has a very low maintainability level. It becomes part of the schema and can be altered only with DDL statements. Not something your DBA is going to look forward to.
So have the kinds of observations in their own lookup table:
ID Name Meaning == =========== ======= M Measurement The value of some system metric (CPU_Usage). E Event An event has been detected. S Status A change in a status has been detected.
(The char field could just as well be int or smallint. I use char here for illustration.)
Then fill out the Observations table with a PK and the attributes that would be common to all observations.
create table Observations( ID int identity primary key, ObservationKind char( 1 ) not null, DateEntered date not null, ..., constraint FK_ObservationKind foreign key( ObservationKind ) references ObservationKinds( ID ), constraint UQ_ObservationIDKind( ID, ObservationKind ) );
It may seem strange to create a unique index on the combination of Kind field and the PK, which is unique all by itself, but bear with me a moment.
Now each kind or subtype gets its own table. Note that each kind of observation gets a table, not the data type.
create table Measurements( ID int not null, ObservationKind char( 1 ) check( ObservationKind = 'M' ), Name varchar( 32 ) not null, -- Such as "CPU Usage" Value double not null, -- such as 55.00 ..., -- other attributes of Measurement observations constraint PK_Measurements primary key( ID, ObservationKind ), constraint FK_Measurements_Observations foreign key( ID, ObservationKind ) references Observations( ID, ObservationKind ) );
The first two fields will be the same for the other kinds of observations except the check constraint will force the value to the appropriate kind. The other fields may differ in number, name and data type.
Let's examine an example tuple that may exist in the Measurements table:
ID ObservationKind Name Value ... ==== =============== ========= ===== 1001 M CPU Usage 55.0 ...
In order for this tuple to exist in this table, a matching entry must first exist in the Observations table with an ID value of 1001 and an observation kind value of 'M'. No other entry with an ID value of 1001 can exist in either the Observations table or the Measurements table and cannot exist at all in any other of the "kind" tables (Events, Status). This works the same way for all the kind tables.
I would further recommend creating a view for each kind of observation which will provide a join of each kind with the main observation table:
create view MeasurementObservations as select ... from Observations o join Measurements m on m.ID = o.ID;
Any code that works solely with measurements would need to only hit this view instead of the underlying tables. Using views to create a wall of abstraction between the application code and the raw data greatly enhances the maintainability of the database.
Now the creation of another kind of observation, such as "Error", involves a simple Insert statement to the ObservationKinds table:
F Fault A fault or error has been detected.
Of course, you need to create a new table and view for these error observations, but doing so will have no impact on existing tables, views or application code (except, of course, to write the new code to work with the new observations).