Edward Edward - 1 year ago 68
Python Question

How can I manipulate an array declared as `self.tab = [('_', 0)] * 100` without explicitly knowing what it contains?

I am writing Python code that reads each character from a file and records its number of occurrences. As it is a homework assignment, I am not allowed to change the way the array was declared.

The array was declared in this way:

def __init__(self):
    self.tab = [('_', 0)] * 100
    self.size = 0

Now, every time I read a character, I check whether I have already seen it:

def add(self, c):  # c is the character that was read

    for i in range(0, self.size):
        if self.tab[i] == (c, ):  # this is where my problem occurs.
                                  # How should I check if the
                                  # character given as an argument is
                                  # present in the array I declared?

            self.tab[i] = ?  # Here I want to add 1 to the number
                             # of occurrences of the character.
                             # How should I do it?

As I said in the question, I don't know which character a slot holds or what count is stored in the second element of its tuple. I want to be able to add 1 to the number of occurrences without knowing how many occurrences there were.

I don't expect an answer that gives me the exact solution to my particular situation. All I need is a set of rules and examples showing how to work in such cases.

Answer Source

As I mentioned in the comment, this is not a great data structure to use for this problem.

Firstly, tuples are immutable, i.e., they can't be updated. To change a string or integer in one of those self.tab tuples you basically need to create a new tuple and replace the original one. So there's really not much point in initialising the list with 100 tuples that are going to be discarded. Secondly, it's not efficient to do a linear scan over a list to look for matching characters.
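The point about immutability can be illustrated with a minimal sketch. The name `tab` and the shortened table length are only for illustration and are not part of the assignment.

```python
# A small stand-in for self.tab (illustrative name, shorter than the real table).
tab = [('_', 0)] * 3

# Tuples cannot be modified in place: item assignment raises TypeError.
try:
    tab[0][1] = 1
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Instead, unpack the old tuple, build a new one, and store it in the same slot.
ch, count = tab[0]
tab[0] = (ch, count + 1)

print(assignment_failed)  # True
print(tab)                # [('_', 1), ('_', 0), ('_', 0)]
```

Only the list slot is rebound; the original `('_', 0)` tuple is left untouched, which is why the other slots still show a count of 0.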

The sensible way to do this task in Python would be to use the Counter class defined in the collections module. However, it's also quite easy to implement this using a plain dictionary, or a defaultdict.
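For comparison, here is a rough sketch of those three alternatives side by side. The variable names are illustrative, and the input string is just a sample.

```python
from collections import Counter, defaultdict

text = 'this is a test'

# Counter does the counting directly from any iterable of hashable items.
counts = Counter(text)

# The same result with a plain dict: get() supplies 0 for unseen characters.
plain = {}
for c in text:
    plain[c] = plain.get(c, 0) + 1

# defaultdict(int) creates a missing entry as 0 on first access.
dd = defaultdict(int)
for c in text:
    dd[c] += 1

print(counts['t'], plain['s'], dd[' '])  # 3 3 3
```

All three look a character up by hashing rather than by scanning, so the cost of each update does not grow with the number of distinct characters seen.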

But anyway, here's one way to do it using the data structure given in the question.

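class CharCounter(object):
    def __init__(self):
        self.tab = [('_', 0)] * 100
        self.size = 0

    def add(self, c): # c is the character that was read
        # Scan the used slots plus the first empty one.
        for i in range(1 + self.size):
            ch, count = self.tab[i]
            if ch == c:
                # Tuples are immutable, so store a new tuple in the slot.
                self.tab[i] = (c, count + 1)
                break
        else:
            # No match: put the new character in the first free slot.
            self.tab[self.size] = (c, 1)
            self.size += 1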

# test
counter = CharCounter()
for c in 'this is a test':
    counter.add(c)

for i in range(counter.size):
    print(i, counter.tab[i])

Output:

0 ('t', 3)
1 ('h', 1)
2 ('i', 2)
3 ('s', 3)
4 (' ', 3)
5 ('a', 1)
6 ('e', 1)

Note that this code does not add any _ chars found in the input. Presumably, _ is being used to indicate an empty table slot; it would be more usual in Python to use an empty string, None, or perhaps a sentinel object (eg an instance of object).
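If the declaration were allowed to change (it is not in this assignment), a variant along these lines would let `_` be counted like any other character. This is only a sketch: `SentinelCharCounter` is a hypothetical name, and `None` is just one of the possible empty-slot markers mentioned above.

```python
class SentinelCharCounter(object):
    """Hypothetical variant of CharCounter that marks empty slots with None."""

    def __init__(self):
        # None cannot collide with any character read from a file.
        self.tab = [None] * 100
        self.size = 0

    def add(self, c):
        # Only the used slots are scanned, so no empty marker is ever unpacked.
        for i in range(self.size):
            ch, count = self.tab[i]
            if ch == c:
                self.tab[i] = (c, count + 1)
                break
        else:
            # No match among the used slots: claim the next empty one.
            self.tab[self.size] = (c, 1)
            self.size += 1


counter = SentinelCharCounter()
for c in 'a_b_':
    counter.add(c)

print(counter.tab[:counter.size])  # [('a', 1), ('_', 2), ('b', 1)]
```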
