Here’s a quick script to check the spelling on multiple web pages at once. To run it on Linux or Mac (and possibly also on Bash for Windows):
- The script requires that lynx and aspell are installed on the computer.
- Save the code in a file called spellcheck_urls (no file extension necessary).
- Make the file executable: chmod u+x spellcheck_urls
- Create a file that contains a list of URLs, with one URL per line. Save it as urls.txt in the same directory as the script.
- Read the script carefully to be sure that you understand exactly what it does.
- If it looks like it does what you want, type ./spellcheck_urls to run it.
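Before running the script, you can quickly verify that both tools are available on your PATH. This is just a convenience check, not part of the script itself:

```shell
# Check that lynx and aspell are installed before running the script
for cmd in lynx aspell; do
    if command -v "$cmd" > /dev/null 2>&1; then
        echo "$cmd: found"
    else
        echo "$cmd: missing - install it first"
    fi
done
```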
#!/bin/bash
URLS_FILE='urls.txt'
OUTPUT_DIR='output'
REPORT_FILE="$OUTPUT_DIR/report.txt"

# Create the output directory if it does not exist
if [ ! -d "$OUTPUT_DIR" ]; then
    echo 'creating the output directory'
    mkdir "$OUTPUT_DIR"
else
    echo 'output directory exists - skipping...'
fi

# Read the URLs from the text file, one per line
while IFS='' read -r l || [ -n "$l" ]; do
    echo "processing $l"
    # Write a header for this URL
    echo "URL: $l" >> "$REPORT_FILE"
    echo "========================================" >> "$REPORT_FILE"
    # Download the text content of the current URL and spellcheck it
    lynx -dump "$l" | aspell list | sort | uniq -c >> "$REPORT_FILE"
    echo "" >> "$REPORT_FILE"
    echo "" >> "$REPORT_FILE"
    # Throttle the requests here, if you want
    sleep 3
done < "$URLS_FILE"
After the script finishes, check the file output/report.txt for the results.
If anyone has suggestions for improvement, please leave a comment below.