The Quantum Physicist The Quantum Physicist - 1 month ago 22
C Question

Avoid matrix half-vectorization in LAPACK

The answer to my question is most likely "No", but maybe someone has a smart solution to this problem?

Here's the problem. For example, the lapack function

calculates the eigenvalues of a complex Hermitian matrix. The problem is that all C++ implementations of matrices store either row-major or column-major matrices, while
zheev()
takes a dense upper or lower triangular matrix.

So my question is: Is there any way to avoid copying my matrix to a new array that stores only the lower or upper triangular part and use my current full matrix in lapack?

Answer

The example you linked on zheev() already makes use of an unpacked LDA*N=N*N matrix. Indeed, the hermitian matrix does not need to be packed: you may not have to make a copy of your matrix. Watch out: zheev() modifies the matrix A!

LAPACK handles other storage mode for matrices: see the naming scheme of LAPACK. For instance:

  • zheev(): the memory footprint N*N and the storage are similar to the one of general unpacked N*N matrices. Depending on the value of the argument UPLO, the upper triangular part is used or ignored. Anyway, the matrix can be populated as if it were a general unpacked matrix of size N*N. In this case, the value of the argument UPLO should not change the obtained eigenvalues.
  • zhpev(): packed format. Either the upper diagonal items or the lower diagonal items are stored, depending on the value of the argument UPLO. The memory footprint for matrix storage is (N*(N+1))/2.
  • zhbev(): dedicated to band storage.

As you work with C or C++, here is a sample code using zheev() via the interface LAPACKE. It can be compiled by gcc main.c -o main -llapacke -llapack -lblas -lm -Wall. Moreover, this code ensures that the function zheev() returns the right eigenvectors, not the left ones. The left eigenvectors are the the complex conjugates of the right ones, as explained here.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <complex.h>
#include <time.h>
#include <lapacke.h>


int main(void){

    int n=200;

    srand(time(NULL));

    // allocate the matrix, unpacked storage
    double complex** A=malloc(n*sizeof(double complex*));
    if(A==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
    A[0]=malloc(n*n*sizeof(double complex));
    if(A[0]==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
    int i;
    for(i=1;i<n;i++){
        A[i]=&A[0][n*i];
    }

    //populte the matrix, only the lower diagonal part
    int j;
    for(i=0;i<n;i++){
        for(j=0;j<i;j++){
            A[i][j]=rand()/((double)RAND_MAX)+rand()/((double)RAND_MAX)*I;

        }
        A[i][i]=rand()/((double)RAND_MAX)+42.;
    }



    //saving the matrix, because zheev() changes it.
    double complex** As=malloc(n*sizeof(double complex*));
    if(As==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
    As[0]=malloc(n*n*sizeof(double complex));
    if(As[0]==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
    for(i=1;i<n;i++){
        As[i]=&As[0][n*i];
    }
    for(i=0;i<n;i++){
        for(j=0;j<i;j++){
            As[i][j]=A[i][j];
        }
        As[i][i]=A[i][i];
    }
    //transpose part, conjugate
    for(i=0;i<n;i++){
        for(j=i+1;j<n;j++){
            As[i][j]=conj(A[j][i]);
        }
    }
    /*
for(i=0;i<n;i++){
for(j=0;j<n;j++){
printf("%g+I%g ",creal(As[i][j]),cimag(As[i][j]));
}
printf("\n");
}
     */

    //let's get the eigenvalues and the eigenvectors!
    double* w=malloc(n*sizeof(double));
    if(w==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
    int lda = n;
    int info = LAPACKE_zheev(LAPACK_ROW_MAJOR, 'V', 'L', n, A[0], lda,  w);
    if(info<0){
        fprintf(stderr,"argument %d has illegal value\n",-info);
    }
    if(info>0){
        fprintf(stderr,"algorithm failed to converge... bad condition number ?\n");
    }

    //printing the eigenvalues...
    for(i=0;i<n;i++){
        printf("%d %g\n",i,w[i]);
    }

    //let's check that the column i of A is now a right eigenvectors corresponding to the eigenvalue w[i]...
    int l=42;

    double complex *res=malloc(n*sizeof(double complex));
    if(res==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
    for(i=0;i<n;i++){
        res[i]=0;
        for(j=0;j<n;j++){
            res[i]+=As[i][j]*A[j][l];
        }
        res[i]-=w[l]*A[i][l];
    }
    double norm2res=0;
    double norm2vec=0;
    for(i=0;i<n;i++){
        norm2res+=creal(res[i])*creal(res[i])+cimag(res[i])*cimag(res[i]);
        norm2vec+=creal(A[i][l])*creal(A[i][l])+cimag(A[i][l])*cimag(A[i][l]);
    }
    printf("the norm of the eigenvector is %g\n",sqrt(norm2vec));
    printf("||Ax-\\lambda*x||_2/||x||_2 is about %g\n",sqrt(norm2res/norm2vec));
    //free the matrix
    free(A[0]);
    free(A);

    free(As[0]);
    free(As);

    free(w);
    free(res);
    return 0;
}

In the code above, a copy of the matrix is performed, but that is not required by LAPACKE_zheev(). Dealing with a matrix of 2000*2000, the memory footprint of the code above is about 167MB. That's more than twice of the size of the matrix (64MB) because a copy is performed. But it would be less than twice if the copy were not performed. Hence, LAPACKE_zheev() does not perform any copy of the matrix. Notice also that LAPACKE_zheev() allocates some space for the working array.