TriggerScrape is a Python script that can be used, for example, to map out the cluster of Swedish sites containing highly anti-immigrant content.
It does this by the following procedure:
- Start at some entry point that has many outgoing links
- Collect all of its outgoing links
- Randomly choose a subsample of those links and visit them
- Count how many trigger words are found on the visited pages
- Revisit those pages with a probability set by the previous step
- If the ratio of trigger words to visited links is high, use that site as the next starting point and restart from the first step
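The steps above can be sketched roughly as follows. This is a minimal illustration of one possible reading of the procedure, not the script's actual code: the names `extract_links`, `count_triggers`, `link_hits`, `crawl` and `fetch_http`, the parameters `sample_size`, `threshold` and `max_rounds`, and the regex-based link extraction are all assumptions made for the example. In particular, it assumes that "revisiting by probability" means picking one sampled link with a weight equal to its trigger-word count, and that a "high" ratio means at least `threshold` trigger words per visited link.

```python
import random
import re
import urllib.request
from urllib.parse import urljoin, urlparse

# Hypothetical, simplified link pattern; a real crawler would use an HTML parser.
LINK_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def fetch_http(url, timeout=10):
    """Download a page and return its text, or "" if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def extract_links(base_url, html):
    """Collect all outgoing http(s) links on a page as absolute URLs."""
    links = set()
    for href in LINK_RE.findall(html):
        url = urljoin(base_url, href)
        if urlparse(url).scheme in ("http", "https"):
            links.add(url)
    return sorted(links)


def count_triggers(html, trigger_words):
    """Count how many word tokens on a page are trigger words."""
    return sum(1 for w in re.findall(r"\w+", html.lower()) if w in trigger_words)


def link_hits(url, fetch, trigger_words, sample_size, rng):
    """Visit a random subsample of a page's links; map each link to its trigger count."""
    links = extract_links(url, fetch(url))
    sample = rng.sample(links, min(sample_size, len(links)))
    return {link: count_triggers(fetch(link), trigger_words) for link in sample}


def crawl(entry, fetch, trigger_words, sample_size=5, threshold=1.0,
          max_rounds=10, seed=0):
    """Return the list of starting points accepted into the cluster."""
    rng = random.Random(seed)
    cluster = [entry]
    current = entry
    for _ in range(max_rounds):
        hits = link_hits(current, fetch, trigger_words, sample_size, rng)
        if sum(hits.values()) == 0:
            break  # no sampled link contained any trigger word
        # Revisit one sampled link, chosen with probability proportional to its hits.
        candidate = rng.choices(list(hits), weights=list(hits.values()), k=1)[0]
        cand_hits = link_hits(candidate, fetch, trigger_words, sample_size, rng)
        ratio = sum(cand_hits.values()) / len(cand_hits) if cand_hits else 0.0
        if ratio >= threshold:
            # High ratio: use this site as the next starting point.
            if candidate not in cluster:
                cluster.append(candidate)
            current = candidate
    return cluster
```

Under these assumptions, a real run might be started with something like `crawl(entry_url, fetch_http, trigger_words)`, where `entry_url` and `trigger_words` are supplied by the user. Passing the fetch function in as a parameter also makes it possible to try the logic on a small in-memory set of pages without touching the network.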
It looks something like this:
In the end it produces a list such as:
and if you give it enough time, it will map out most of the sites in that cluster.