haimen haimen - 3 months ago 14
Python Question

Identify alternating uppercase and lowercase characters in Python

I have data as follows:

data['word']

1 Word1
2 WoRdqwertf point
3 lengthy word
4 AbCdEasc
5 Not to be filtered
6 GiBeRrIsH
7 zSxDcFvGnnn


I want to find the rows whose strings contain alternating uppercase and lowercase letters and remove them. For example, in the data above,
WoRdqwertf, AbCdEasc, GiBeRrIsH, zSxDcFvGnnn
have alternating case, and those rows need to be removed.

Note that the first row, which contains
Word1
should not be removed, because it has only one uppercase letter followed by a lowercase one. I want to remove a row only when it contains an uppercase, lowercase, uppercase sequence or a lowercase, uppercase, lowercase sequence. My output should be:

data['word']

1 Word1
3 lengthy word
5 Not to be filtered


Can anybody help me or suggest how to approach this problem?

Answer

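You can use string methods: keep a word only if it is in title case (it equals w.title()) or entirely lowercase (it equals w.lower()). Note that this filters individual words rather than whole rows, so a row that still has acceptable words left is kept in shortened form.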

l = ['Word1', 'WoRdqwertf point', 'lengthy word', 'AbCdEasc',
     'Not to be filtered', 'GiBeRrIsH', 'zSxDcFvGnnn']

n = []
for section in l:
    # Keep only words that are title case (e.g. 'Word1') or all lowercase.
    new_section = []
    for w in section.split():
        if w == w.title() or w == w.lower():
            new_section.append(w)
    s = ' '.join(new_section)
    # Skip sections where no word survived the filter.
    if s:
        n.append(s)
print(n)

One-liner:

print(list(filter(bool, [' '.join(w for w in section.split() if w == w.title() or w == w.lower()) for section in l])))

Output:

['Word1', 'point', 'lengthy word', 'Not to be filtered']
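
The output above keeps 'point' from the second row, whereas the question asks for the whole row to be dropped. One possible alternative, sketched here under the assumption that "alternating" means any three consecutive ASCII letters in an upper, lower, upper or lower, upper, lower arrangement, is a regular expression applied to each row. The names ALTERNATING, has_alternating_case and drop_alternating_rows are made up for this illustration and do not come from the original answer.

```python
import re

# Assumed reading of the rule in the question: three consecutive ASCII
# letters whose case alternates (upper-lower-upper or lower-upper-lower).
ALTERNATING = re.compile(r'[A-Z][a-z][A-Z]|[a-z][A-Z][a-z]')


def has_alternating_case(text):
    """Return True if text contains an alternating-case run of three letters."""
    return ALTERNATING.search(text) is not None


def drop_alternating_rows(rows):
    """Keep only the rows that contain no alternating-case run."""
    return [row for row in rows if not has_alternating_case(row)]


words = ['Word1', 'WoRdqwertf point', 'lengthy word', 'AbCdEasc',
         'Not to be filtered', 'GiBeRrIsH', 'zSxDcFvGnnn']
print(drop_alternating_rows(words))
# ['Word1', 'lengthy word', 'Not to be filtered']
```

Under this reading, ordinary mixed-case names such as McDonald would also be flagged, so the pattern may need tightening for real data. If data is a pandas DataFrame, as data['word'] suggests, the same pattern could probably be used as data[~data['word'].str.contains(ALTERNATING.pattern)], though that variant is not tested here.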