Ben Smith - 3 months ago
Python Question

Python read from a file, and only do work if a string isn't found

So I'm trying to make a reddit bot that will exec code from a submission. I have my own sub for controlling these clients.

while __name__ == '__main__':
    string = open('config.txt').read()
    for submission in subreddit.get_new(limit = 1):
        if submission.url not in string:
            f.write(submission.url + "\n")
            f.close()
            f = open('config.txt', "a")
            string = open('config.txt').read()


So what this is supposed to do is read from the config file, then only do work if the submission url isn't in config.txt. However, it always sees the most recent post and does its work. This is how f is opened.

if not os.path.exists('file'):
    open('config.txt', 'w').close()
f = open('config.txt', "a")

Answer

First a critique of your existing code (in comments):

# the next two lines are not needed; open('config.txt', "a")
# will create the file if it doesn't exist.
# (note that they test for 'file', not 'config.txt')
if not os.path.exists('file'):
    open('config.txt', 'w').close()
f = open('config.txt', "a")

# this is an unusual condition which will confuse readers
while __name__ == '__main__':
    # the next line will open a file handle and never explicitly close it
    # (it will probably get closed automatically when it goes out of scope,
    # but it's not good form)
    string = open('config.txt').read()
    for submission in subreddit.get_new(limit = 1):
        # the next line should check for a full-line match; as written, it 
        # will match "http://www.test.com" if "http://www.test.com/level2"
        # is in config.txt
        if submission.url not in string:
            f.write(submission.url + "\n")
            # the next two lines could be replaced with f.flush()
            f.close()
            f = open('config.txt', "a")
            # this is a cumbersome way to keep your string synced with the file,
            # and it never explicitly releases the new file handle
            string = open('config.txt').read()
    # If subreddit.get_new() doesn't return any results, this will act as
    # a busy loop, repeatedly requesting new results as fast as possible.
    # If that is undesirable, you might want to sleep here.
# file handle f should get closed after the loop
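
To illustrate the matching comment above, here is a minimal sketch of the difference between a substring test and a full-line test. The file contents and URL are made-up values taken from the comment's example, not from your real config.txt.

```python
# Hypothetical file contents: one URL per line, the way config.txt is written.
contents = "http://www.test.com/level2\n"

# Hypothetical new submission URL.
url = "http://www.test.com"

# Substring test (what the original code does): matches, because url
# appears inside the longer stored line.
substring_match = url in contents

# Full-line test: no match, because no stored line equals url exactly.
seen = set(contents.splitlines())
line_match = url in seen

print(substring_match, line_match)
```

With the substring test the shorter URL would be treated as already handled; with the set of whole lines it would not.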

None of the problems pointed out above should keep your code from working (except maybe the imprecise matching). But simpler code may be easier to debug. Here's some code that does the same thing. Note: I assume there is no chance any other process is writing to config.txt at the same time. You could try this code (or your code) with pdb, line-by-line, to see whether it works as expected.

import time
import praw
r = praw.Reddit(...)
subreddit = r.get_subreddit(...)

if __name__ == '__main__':
    # open config.txt for reading and writing without truncating. 
    # moves pointer to end of file; closes file at end of block
    with open('config.txt', "a+") as f:
        # move pointer to start of file
        f.seek(0) 
        # make a set of existing lines; reading also moves pointer to end of file
        lines = set(f.read().splitlines())

        while True:
            got_one = False
            for submission in subreddit.get_new(limit=1):
                got_one = True
                if submission.url not in lines:
                    lines.add(submission.url)
                    f.write(submission.url + "\n")
                    # write data to disk immediately
                    f.flush()
                    ...
            if not got_one:
                # wait a little while before trying again
                time.sleep(10)
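
If you want to check the file logic without touching reddit at all, something like the following sketch may help. It pulls the read-then-append steps into two helpers and runs them against a temporary file. The names load_seen, record_if_new and demo, and the example.com URLs, are made up for this illustration; they are not part of praw or of the code above.

```python
import os
import tempfile


def load_seen(f):
    """Return the set of lines already stored in the open file f."""
    f.seek(0)
    return set(f.read().splitlines())


def record_if_new(f, seen, url):
    """Append url to f and to seen if it is new; return True if it was added."""
    if url in seen:
        return False
    seen.add(url)
    f.write(url + "\n")
    # write data to disk immediately
    f.flush()
    return True


def demo():
    """Feed a few made-up URLs through the helpers, then reopen the file."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.txt")
        # First run: "a+" creates the file, and the second copy of /a is skipped.
        with open(path, "a+") as f:
            seen = load_seen(f)
            for url in ["http://example.com/a",
                        "http://example.com/a",
                        "http://example.com/b"]:
                results.append(record_if_new(f, seen, url))
        # Second run: simulates restarting the script; /a is still remembered.
        with open(path, "a+") as f:
            seen = load_seen(f)
            results.append(record_if_new(f, seen, "http://example.com/a"))
    return results


if __name__ == '__main__':
    print(demo())
```

If the helpers behave as intended, a URL is only reported as new the first time it is seen, including after the file is closed and reopened.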