Peter Wooster Peter Wooster - 1 year ago 27
MySQL Question

Should I use a denormaized field for amount outstanding in Ruby on Rails

I have a simple normalized model for invoices and payments.

The relevant fields are

invoices.id
invoices.amount_with_tax
invoice_payments.id

invoice_payments.invoice_id
invoice_payments.amount


The relations are

Invoice
has_many :invoice_payments
has_many :payments, through: :invoice_payments

InvoicePayment
belongs_to :invoice
belongs_to :payment

Payment
has_many :invoice_payments
has_many :invoices, through: :invoice_payments


The Payment model is not relevant to this question but exhibits the same problem.

The variables
invoices
and
@invoices
are ActiveRecord relations.

If I add an
amount_outstanding
field to the invoice table the code to get unpaid invoices is simple:

invoices.where('amount_outstanding > 0')


but the maintenance of the denormalized field is a nuisance.

My current code to get this without that field is complex:

invoices = invoices.left_outer_joins(:invoice_payments)
invoices = invoices.distinct.select('invoices.*, sum(invoice_payments.amount) as total_payments')
invoices = invoices.group('invoices.id')
invoices = invoices.having('sum(invoice_payments.amount) < invoices.amount_with_tax or sum(invoice_payments.amount) is null')


I have not done any performance testing on this, but it looks like it won't be very good when there are a lot of invoices in the system. The code generated by ActiveRecord is similar to the code I'd write in SQL, so it wouldn't be horrible.

Similar code will be needed on the Payment side to find payments that are not completely accounted to invoices.

Finding unpaid invoices is a frequent operation, as is reporting the amount outstanding. For a well run company the amount of uncollected accounts receivable is small compared to that which has been paid. Over time it will become a much smaller percentage of the total invoices.

Applying a payment to an invoice is usually done once and they are rarely reverted or edited. Most payments relate to one invoice and cover it completely.

Also attempting to determine if there are unpaid invoices or getting their count in erb e.g. returns a SQL error.

I personally do not like denormalized data, since the work to maintain it is usually about triple that of leaving it normalized. In this case the field must be updated if
amount_with_tax
changes, or if the
InvoicePayment
is edited, deleted or moved to another invoice. The code to calculate the value of the amount_outstanding however is very simple for a single invoice:

paid = invoice_payments.sum(:amount)
self.amount_outstanding = self.amount_with_tax - paid
save


I'm currently using callbacks on the Invoice and InvoicePayments models to maintain the
amount_outstanding
.

So the question is, is a denormalized field worth it in this case or is there a better way to achieve this?

Answer Source

It depends a lot on use cases that display or rely on amount outstanding. How often do you use this value and will the users accept little delay (considering that it would be slow to compute this value) just to display this value while you can still display all other information on the page promptly?

You mentioned you have not done performance testing for this scenario and also there is no information on the scope of system in terms of number of invoices processed. As with any financial system, all the transactions tend to be associated with a particular financial period, so its more appropriate to consider that period instead of all the invoices stored in the system to access any performance needs.

In general, the processing that is required without denormalized field should be able to handle quite a large number of invoices and of course it will depend on your server setup too. Also there is a possibility that you can still optimize the code and reduce the number of queries to get the same functionality.

If you haven't already seen the issues with performance, you might be attempting premature optimization which is not always a best practice. So I would consider avoiding denormalized field until I see the need for it and then do the necessary optimization as and when needed. Hope that helps.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download