How to webscrape Facebook with Python
=====================================

.. post:: Apr 4, 2020
   :exclude:
   :tags: tutorial, python, webscrape, requests, beautifulsoup
   :author: adriaan

The problem
-----------

I am sure you have seen posts from a company's or small business's Facebook page where they ask you to like, share and follow in order to win a prize.
But how do these businesses then determine who has won? They will have to go and compare the names of all the people who:

* liked the post,
* shared the post and
* followed their page

and then randomly select a winner from that group of people.

If only about 10 people engaged with that promotional post, it won't take too long, but imagine if hundreds of people participated.
Determining the winner would then take quite a lot of time...

To help save a bit of time, we are going to construct a program that will scrape the names of all the people who engaged with the promotional post and randomly select a winner from the eligible participants.

**Right! Let's get started!**

The data source
---------------

We are interested in the names of people who follow the business's page and who liked and shared the post in question.
All of this information is available on Facebook.

In the end we will have three lists:

* a list containing the names of people who follow the business's Facebook page,
* a list of people who have liked the promotional post and
* a list of people who have shared the post.

The names that appear in all three lists will be put in a final list, and a random winner will be selected from that list of eligible names.

Coding the web scraper
----------------------

Requirements
````````````

For this program to work, you will need to install:

* Python 3.x
* Requests library
* Beautiful Soup library
* Selenium library

Get the HTML data
`````````````````

The first thing we need to do is get the HTML data of the Facebook post we are interested in into our Python script.
We can do this using the Python library, ``requests``.

Web scraping Facebook is a bit different, as the content is behind a login page.
We will need to first have a look at the HTML code of the login page.
For this project, we are going to log into ``https://mbasic.facebook.com``:

* start working your way through the HTML until you find the ``