How Do You Get Python to Parse Web Pages to Find a String?

Problem scenario
You know a string is buried in a series of web pages. How do you get Python to read the web pages and find the string?

Solution
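Change the URLs in "listofurls" to the pages you want to search, and set the "searchterm" variable to the word or regular expression you are looking for. Then run this Python 3 program (it needs the requests package, which you can install with "pip install requests"):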

import re
import requests

# Pages to search; replace these with your own URLs.
listofurls = [
    'https://www.continualintegration.com/',
    'https://www.continualintegration.com/miscellaneous-articles/page/1/',
    'https://www.continualintegration.com/miscellaneous-articles/page/2/',
    'https://www.continualintegration.com/miscellaneous-articles/page/3/',
    'https://www.continualintegration.com/miscellaneous-articles/page/4/',
]

# re.search treats this value as a regular expression, not only a plain string.
searchterm = "learned"

def finder(yoururl, searchterm):
    # Download the page and search its raw HTML source for the pattern.
    # The timeout keeps the script from hanging on an unresponsive server.
    r = requests.get(yoururl, timeout=10)
    found = re.search(searchterm, r.text)
    if found:
        print("The string/pattern '" + searchterm + "' was found when searching " + yoururl)
    else:
        print("The string/pattern '" + searchterm + "' was not found when searching " + yoururl)


# Check every page in the list.
for x in listofurls:
    finder(x, searchterm)
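
Note that re.search interprets the search term as a regular expression, so characters such as "." or "+" carry special meaning. If you want the term matched literally, one option is to escape it first with re.escape. The sketch below is a minimal illustration of that idea; the helper name contains_literal and the sample HTML are assumptions made up for this example and are not part of the program above.

```python
import re

def contains_literal(text, term, ignore_case=False):
    """Return True if term occurs in text as a literal string (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape neutralizes regex metacharacters such as "." and "+".
    return re.search(re.escape(term), text, flags) is not None

# Made-up page source used only for this illustration.
sample_html = "<p>We learned C++ in version 2x0.</p>"

# As a raw pattern, "2.0" also matches "2x0" because "." matches any character.
print(re.search("2.0", sample_html) is not None)

# Escaped, "2.0" has to appear exactly, and it does not appear in the sample.
print(contains_literal(sample_html, "2.0"))

# Unescaped, each "+" would be read as regex syntax rather than a literal plus sign.
print(contains_literal(sample_html, "C++"))

# Literal matching can still ignore case if you ask for it.
print(contains_literal(sample_html, "LEARNED", ignore_case=True))
```

To use this with the program above, you could call contains_literal(r.text, searchterm) in place of re.search(searchterm, r.text).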

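The program searches r.text, which is the raw HTML source, so a match can come from a tag attribute or a script rather than from text a reader actually sees. If that distinction matters to you, one possible approach is to pull out the text nodes first with the standard library's html.parser module. This is a rough sketch under that assumption; the class name VisibleTextExtractor, the helper visible_text, and the sample markup are invented for illustration.

```python
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping <script> and <style> contents (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)

def visible_text(html):
    """Return the text content of html with tags, scripts, and styles removed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(chunk.strip() for chunk in parser.chunks if chunk.strip())

# Made-up markup: "learned" appears only in an attribute and inside a script.
sample_html = '<div class="learned"><script>var x = "learned";</script><p>Nothing here.</p></div>'

print("learned" in sample_html)                # searches the raw source
print("learned" in visible_text(sample_html))  # searches the extracted text only
```

In the program above, you could pass visible_text(r.text) to re.search instead of r.text.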