Jeff Jeff - 1 year ago 114
Python Question

Most efficient way of compressing sparse graph in ASCII?

I am working on software that deals with sparse matrices. They're not huge (anywhere from 15x15 to ~300x300). I'd like to be able to store a representation of the matrix in a short string so that I can store it as a value in a CSV file (along with many other things).

What I've tried so far is to treat the matrix as a binary string, and convert to base62:

import numpy as np
import networkx as nx

def graphToHash(a,numnodes):
def baseN(num,b,numerals="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"):
return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])
return str(numnodes) + '!' + baseN(int(''.join([str(i) for i in flatten_list(a)]),2), 62)

def flatten_list(l):
l1=[item for sublist in l if isinstance(sublist,list) or isinstance(sublist,np.ndarray) for item in sublist]
l=l1+[item for item in l if not isinstance(item,list) and not isinstance(item,np.ndarray)]
return l

# example
import sys
len(hash) # ~4300 characters

This works fine for small graphs (30 nodes is ~150 characters). However larger graphs are bit unwieldy (160 nodes is ~4300 and requires me to increase the recursion limit).

Because the graph is binary and sparse, I know I can do better. Ideally I'd like to continue using strings that are {0-9,a-z,A-Z} because I know these will not pose any issues in my CSV file.

What is the most efficient way to compress a binary sparse matrix?

Answer Source

How about using the sparse6 format. It uses printable ASCII characters.

import networkx as nx
G = nx.connected_watts_strogatz_graph(160,4,.3,1000)
s = nx.generate_sparse6(G)

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download