A quick-and-easy way to extract the body from an HTML file:
cat myfile.html | tr -d '\n' | grep -o -E '<\s*body[^>]*>.*<\s*/\s*body\s*>'
- cat – prints the file contents
- tr – removes the newlines, so the whole document becomes a single line
- grep (with a regular expression) – prints only the matching part, i.e. the body element
You can use a redirect (">") to send the output to a file instead of standard output:
cat myfile.html | tr -d '\n' | grep -o -E '<\s*body[^>]*>.*<\s*/\s*body\s*>' > output.html
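To try the approach on a small input, here is a minimal sketch. It assumes GNU grep (for `\s` with `-E`); the sample file, the temporary directory and the variable names are made up for this illustration. The pattern uses a plain `.*` and a literal `>`, because `grep -E` has no lazy `.*?` quantifier and GNU grep reads `\>` as an end-of-word anchor rather than a closing angle bracket.

```shell
#!/bin/sh
# Hypothetical sample input, written to a temporary directory for this demo only.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.html"
printf '%s\n' \
  '<html>' \
  '<head><title>Demo</title></head>' \
  '<body class="main">' \
  '<p>Hello</p>' \
  '</body>' \
  '</html>' > "$sample"

# Flatten the file to one line, then print only the part that matches the body element.
body=$(cat "$sample" | tr -d '\n' | grep -o -E '<\s*body[^>]*>.*<\s*/\s*body\s*>')
printf '%s\n' "$body"
# prints: <body class="main"><p>Hello</p></body>

# Same pipeline, with the result redirected to a file instead of standard output.
cat "$sample" | tr -d '\n' | grep -o -E '<\s*body[^>]*>.*<\s*/\s*body\s*>' > "$tmpdir/output.html"
```

Because `.*` is greedy, the match runs from the first opening body tag to the last closing one, which is fine for a document with a single body element. The match is also case-sensitive, so a file that uses `<BODY>` would need `grep -i`.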