I have the following script that opens a file containing two columns, IP and domain, like
108.170.206.91 || com.invitemedia.product2pixel
It reverses the domain name, because it is stored in reversed FQDN form, and then strips it down to the registered domain using the publicsuffix module, giving something like
invitemedia.com
It works well, but it is a bit slow. Can anyone help me make it faster?
Here's my script:
    from __future__ import print_function
    from publicsuffix import PublicSuffixList

    psl = PublicSuffixList()
    d = {}
    f = open(file, 'r')
    for n, line in enumerate(f):
        ip, reversed_domain_1 = line.split('|')
        try:
            # Turn the reversed name back into its normal order
            reversed_domain_2 = reversed_domain_1.split('.')
            reversed_domain_3 = list(reversed(reversed_domain_2))
            domain = ('.'.join(reversed_domain_3)).strip('.')
            domain = psl.get_public_suffix(domain)
            # Collect the set of domains seen for each IP
            if ip in d:
                d[ip].add(domain)
            else:
                d[ip] = set([domain])
        except:
            print(domain)
    for ip, domains in d.iteritems():
        print("%s|%s" % (ip, domains), file=output)
You can use a defaultdict for the d variable that you are manipulating. If you use a slice instead of reversed and the like, performance may improve too.
With a defaultdict you do not need to check whether the key exists before manipulating it:
    d = defaultdict(set)
    d[1].add(2)
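Putting both suggestions together, the loop might look roughly like the sketch below. This is not the asker's exact script: to_forward, group_domains and last_two_labels are names I made up, the sample lines use a single | separator, and last_two_labels is only a stand-in so the snippet runs without the publicsuffix package. In real use you would pass psl.get_public_suffix instead. I also strip whitespace from each field, which the original does not do.

```python
from collections import defaultdict


def to_forward(reversed_domain):
    # "com.example.www" -> "www.example.com"; the [::-1] slice reverses
    # the list of labels without calling reversed() and list().
    return ".".join(reversed_domain.strip().split(".")[::-1]).strip(".")


def group_domains(lines, get_suffix):
    # get_suffix is any callable that maps a hostname to its registered
    # domain, for example psl.get_public_suffix.
    d = defaultdict(set)  # missing keys start out as an empty set
    for line in lines:
        ip, reversed_domain = line.split("|")
        d[ip.strip()].add(get_suffix(to_forward(reversed_domain)))
    return d


def last_two_labels(domain):
    # Hypothetical stand-in for psl.get_public_suffix: keeps the last
    # two labels only, which is wrong for suffixes such as "co.uk".
    return ".".join(domain.split(".")[-2:])


sample = [
    "108.170.206.91|com.example.www\n",
    "108.170.206.91|com.example.mail\n",
    "10.0.0.1|org.example.static\n",
]
result = group_domains(sample, last_two_labels)
```

Whether the slice gives a measurable speedup depends on your data, so it is worth timing both versions on a real file before committing to either.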