Joel Sinofsky Joel Sinofsky - 2 months ago 12
SQL Question

Manipulating data WITHIN excel using SQL

I have an excel report I am working on which uses VBA to manipulate data.

Although there isn't a reason I can't use VBA for my purposes, I would rather do it in SQL as I am much more proficient in that language. In order to do so, I am facing a couple of challenges and have simplified my problem to one easier to solve.

Say I have a Excel spreadsheet with the number 1 in cell A1, 2 in cell A2, and 3 in cell A3. In cell A4 I am trying to sum these, resulting in the number 6.

The SUM() function would get me the answer I would like, but not the way I want to go about it.

Instead here is what I am trying to do...

1) read cells A1->A3 into a table object (call this x) using VBA
2) declare a string variable and set it to 'SELECT SUM(column1) FROM x'
3) execute that sql string
4) store the results in cell A4


Is this a feasible way of going about things? If so, could an example be posted?

Answer Source

Yes, it can be done. No, it shouldn't be done. If you need the sum of 3 cells, compute the sum of 3 cells - Excel has built-in functions specifically made for this.

=SUM(A1:A3)

Type that in [A4] and you'll get your sum without writing any code, in the most efficient way possible, and without surprising the Hell out of anyone looking at what you've done.


Still not convinced? Okay. Have a seat, grab some pop-corn, and enjoy the ride.

read cells A1->A3 into a table object (call this x) using VBA

Excel isn't a database, it doesn't have tables - not in the way you mean the word "table". But that's no showstopper.

Say your workbook has 3 sheets, code-named Sheet1, Sheet2 and Sheet3 (that's the default anyway). So you have Sheet1!A1:A3 populated with some numbers that you want to SUM up using , because... doesn't matter why, just because.

Since we want the sum to be written into Sheet1!A4, we won't be using Sheet1 as our "table" - rather we'll treat it as our output.

So we'll copy Sheet1!A1:A3 to Sheet2:

Sheet2.Range("A1").Value = "Values" 'our column header
Sheet1.Range("A1:A3").Copy Sheet2.Range("A2") 'our values

Next, we need something that can treat Sheet2 as a "table", and execute SQL queries against it. So we'll set up an ADODB/OLEDB connection to Sheet2, execute the SQL query, get a Recordset object with the results, and then dump the value into Sheet1!A4.

Sloppy late-bound code doing this would look like this:

Public Sub OverkillSum()

    Dim connection As Object
    Set connection = CreateObject("ADODB.Connection")
    connection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
                    "Data Source=" & ThisWorkbook.FullName & ";" & _
                    "Extended Properties=""Excel 8.0;HDR=Yes;"";"

    Dim recordset As Object
    Set recordset = connection.Execute("SELECT SUM(Values) As Total FROM [Sheet2$]")

    Sheet1.Range("A4").Value = recordset.Fields("Total").Value
    recordset.Close
    connection.Close

End Sub

Note that the connection string requires ThisWorkbook.FullName, so that won't work in a throw-away workbook that you haven't saved yet.

Between what's above, and =SUM(A1:A3) in cell A4, the design decision should be a no-brainer.

  • Late-bound calls get resolved at run-time. This is overhead that can be avoided by referencing the ADODB type library and using ADODB.Connection and ADODB.Recordset types instead of working with Object interfaces.
  • Connection to the workbook is also extraneous run-time overhead.
  • Querying the worksheet through OLEDB is absolutely not justified for computing the sum of 3 values.
  • You need to remember to clean up your connections and recordsets!
  • Don't do that.
  • Just don't.
  • Sheet1.Range("A4").Value = Application.WorksheetFunction.Sum(Sheet1.Range("A1:A3")) is the one-liner equivalent of the native Excel worksheet function solution - that's still overkill, but at least it remains in the realm of Excel and doesn't involve flying to the Moon and back.
  • Did I say don't do that?

This type of solution is useful for other purposes, e.g. when you have a workbook laid out as a table, that contains information that should live in a database but somehow lives in an Excel worksheet, with so much data that opening it through Workbooks.Open and computing a complex aggregate (perhaps involving WHERE and GROUP BY clauses) would be inefficient with SUMIFS or other non-SQL means.