Python Search A File For Text Using Input From Another File

October 25, 2023

I'm new to python and programming. I need some help with a python script. There are two files each containing email addresses (more than 5000 lines). Input file contains email addr

Solution 1:

Maybe I'm missing something, but why not use a pair of sets?

#!/usr/local/cpython-3.3/bin/python

data_filename = 'dfile1.txt'
input_filename = 'ifile1.txt'

with open(input_filename, 'r') as input_file:
    input_addresses = set(email_address.rstrip() for email_address in input_file.readlines())

with open(data_filename, 'r') as data_file:
    data_addresses = set(email_address.rstrip() for email_address in data_file.readlines())

print(input_addresses.intersection(data_addresses))

Solution 2:

mitan8 gives the problem you have, but this is what I would do instead:

with open(inputfile, "r") as f:
    names = set(i.strip() for i in f)

with open(datafile, "r") as f:
    for name in f:
        if name.strip() in names:
            # strip() drops the trailing newline so print() does not emit a blank line
            print(name.strip())

This avoids reading the larger datafile into memory.

If you want to write to an output file, you could do this for the second with statement:

with open(datafile, "r") as i, open(outputfile, "w") as o:
    for name in i:
        if name.strip() in names:
            o.write(name)
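
As a self-contained sketch of the same idea, assuming Python 3: the function name filter_lines, the demo helper, and the file names and addresses are made up for illustration and are not part of the answer above. Only the input file is loaded into a set; the data file is streamed line by line.

```python
import os
import tempfile


def filter_lines(input_path, data_path, output_path):
    """Copy lines of data_path whose stripped text appears in input_path."""
    # Only the (smaller) input file is held in memory, as a set.
    with open(input_path, "r") as f:
        names = set(line.strip() for line in f)

    matched = 0
    # The data file is streamed one line at a time.
    with open(data_path, "r") as src, open(output_path, "w") as dst:
        for line in src:
            if line.strip() in names:
                dst.write(line)
                matched += 1
    return matched


def demo():
    """Run filter_lines on two tiny made-up files; return (count, output lines)."""
    with tempfile.TemporaryDirectory() as tmp:
        input_path = os.path.join(tmp, "ifile.txt")
        data_path = os.path.join(tmp, "dfile.txt")
        output_path = os.path.join(tmp, "ofile.txt")
        with open(input_path, "w") as f:
            f.write("a@example.com\nb@example.com\n")
        with open(data_path, "w") as f:
            f.write("b@example.com\nc@example.com\na@example.com\n")
        count = filter_lines(input_path, data_path, output_path)
        with open(output_path) as f:
            return count, f.read().splitlines()
```

Matches come out in the order they appear in the data file, because that file is read sequentially.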

Solution 3:

Here's what I would do:

names = []
with open(inputfile) as f:
    for line in f:
        names.append(line.rstrip("\n"))

myEmails = set(names)  # set lookups are fast

# datafile is the file being searched; matches are written to emails.txt
with open(datafile) as fd, open("emails.txt", "w") as output:
    for line in fd:
        c = line.rstrip("\n")
        if c in myEmails:
            print(c)  # for console
            output.write(c + "\n")  # for writing to file

Solution 4:

I think your issue stems from the following:

name = fd.readline()
if name[1:-1] in names:

name[1:-1] slices each email address so that the first and last characters are dropped. Dropping the last character (a newline '\n') might be reasonable in general, but when you load the list of names with

withopen(inputfile, 'r') as f:
    names = f.readlines()

you are including the newlines. So don't slice the lines you read from fd at all, i.e.

if name in names:
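
A tiny illustration of why the slice never matches, using a made-up address and assuming the names list was built with readlines() as shown above:

```python
name = "alice@example.com\n"      # a line as read from a file (made-up address)
names = ["alice@example.com\n"]   # what readlines() returns: newlines are kept

sliced = name[1:-1]               # drops the first character and the newline
sliced_found = sliced in names    # False: "lice@example.com" is not in the list
whole_found = name in names       # True: both sides still end in "\n"
```

Comparing raw lines can still miss the last line of a file if it has no trailing newline, which is why the other answers strip both sides before comparing.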

Solution 5:

I think you can remove name = fd.readline(), since the for loop already gives you the line. The loop reads one line per iteration, so calling readline() inside it reads an additional line each time. Also, I think name[1:-1] should be name, since you don't want to strip the first and last characters when searching. The with statement automatically closes the files it opened.
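
The effect of the extra readline() can be seen in a small sketch, assuming Python 3 and using io.StringIO as a stand-in for an open text file (the addresses are made up):

```python
import io

# io.StringIO stands in for an open text file here (an assumption for the demo).
fd = io.StringIO("a@example.com\nb@example.com\nc@example.com\nd@example.com\n")

checked = []
for line in fd:
    name = fd.readline()          # consumes one more line than the loop already did
    checked.append(name.strip())

# checked now holds only every second address; the others were never tested.
```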

PS: How I'd do it:

with open("dfile1") as dfile, open("ifile") as ifile:
    lines = "\n".join(set(dfile.read().splitlines()) & set(ifile.read().splitlines()))
print(lines)
with open("ofile", "w") as ofile:
    ofile.write(lines)

In the above solution, basically I'm taking the intersection (the elements that are part of both sets) of the lines of the two files to find the common lines.
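
For reference, the & operator on Python sets is the intersection, while | is the union. A minimal sketch with made-up data:

```python
dfile_lines = {"a@example.com", "b@example.com"}   # made-up data
ifile_lines = {"b@example.com", "c@example.com"}

common = dfile_lines & ifile_lines      # intersection: lines present in both sets
combined = dfile_lines | ifile_lines    # union: lines present in either set
```

Sets are unordered, so the joined output does not follow the order of either file; wrapping the result in sorted() is one way to get a stable order.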
