26/Oct/2013 Recover a website as a single PDF
The goal of this post is to turn an entire website into one PDF. First, download the site recursively (with wget) into a collection of HTML files; the script below then converts each file into a PDF and concatenates those PDFs into a single one.
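The recursive download itself can be done with wget; a minimal sketch, where the URL and recursion depth are placeholders you'd adapt to your site:

wget --recursive --level=5 --adjust-extension --convert-links --no-parent \
     https://example.com/docs/

Here --recursive follows links, --level limits how deep the crawl goes, --adjust-extension ensures the saved files end in .html, --convert-links rewrites links so the local copy browses correctly, and --no-parent keeps the crawl below the start URL.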

Prerequisites

You'll need pdftk, wget and wkhtmltopdf. Make sure that you have a wkhtmltopdf version that terminates properly, for example version 0.9.9.

If you're on OS X, you can install all of these tools via Homebrew. The formula for pdftk can be found here.
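The installation might look like this (the pdftk formula is an assumption, since it is provided by a third-party formula rather than the core Homebrew repository):

brew install wget wkhtmltopdf   # available as regular formulas
brew install pdftk              # assumes you've added the third-party pdftk formula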

The script

#!/bin/bash

echo "Collecting files from subfolders..."
for FILENAME in $(find . -type f -name '*.html' -print | sed 's/^\.\///')
do
    mv "$FILENAME" "$(basename "$FILENAME")"
done

echo "Converting into PDF files..."
find . -name \*.html | sed 's/.html$//g' | xargs -n 1 -I X wkhtmltopdf --quiet X.html X.pdf

echo "Concatenating the PDF files..."
pdftk *.pdf cat output book.pdf
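The flattening step can be checked in isolation in a throwaway directory; all paths below are made up for the demonstration:

#!/bin/bash
set -e

# Build a small fake wget output tree.
WORKDIR=$(mktemp -d)
cd "$WORKDIR"
mkdir -p site/sub
echo '<html><body>index</body></html>' > site/index.html
echo '<html><body>page</body></html>'  > site/sub/page.html

# Same move logic as in the script above.
for FILENAME in $(find . -type f -name '*.html' -print | sed 's/^\.\///')
do
    mv "$FILENAME" "$(basename "$FILENAME")"
done

ls *.html   # index.html and page.html now sit at the top level

Note that files sharing a basename in different subfolders (e.g. two index.html files) will overwrite each other during this step.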
