Space: Dataset-Tools/pdf-to-page-images-dataset
davanstrien (HF Staff) committed on Sep 19, 2024

Commit 662b961 (1 parent: 6d2b0a3)

create card template

Files changed (1)

dataset_card_template.py ADDED

+DATASET_CARD_TEMPLATE = """
+# Dataset Card for {hf_repo}
+## Dataset Description
+This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
+- **Number of images:** {num_images}
+- **Number of PDFs processed:** {num_pdfs}
+- **Sample size per PDF:** {sample_size}
+- **Created on:** {creation_date}
+## Dataset Creation
+### Source Data
+The images in this dataset were generated from user-uploaded PDF files.
+### Processing Steps
+1. PDF files were uploaded to the PDFs to Page Images Converter.
+2. Each PDF was processed, converting selected pages to images.
+3. The resulting images were saved and uploaded to this dataset.
+## Dataset Structure
+The dataset consists of JPEG images, each representing a single page from the source PDFs.
+### Data Fields
+- `images/`: A folder containing all the converted images.
+### Data Splits
+This dataset does not have specific splits.
+## Additional Information
+- **Contributions:** Thanks to the PDFs to Page Images Converter for creating this dataset.
+"""
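
The commit adds only the template; how the Space fills it in is not shown here. A minimal sketch of one plausible way to render it, assuming plain `str.format` with keyword arguments matching the placeholder names. `CARD_TEMPLATE_EXCERPT` is a shortened stand-in for `DATASET_CARD_TEMPLATE`, and `render_card` and the sample values are hypothetical, not part of this commit.

```python
from datetime import date

# Shortened stand-in for DATASET_CARD_TEMPLATE, using the same placeholder
# names as the file added in this commit.
CARD_TEMPLATE_EXCERPT = """
# Dataset Card for {hf_repo}
- **Number of images:** {num_images}
- **Number of PDFs processed:** {num_pdfs}
- **Sample size per PDF:** {sample_size}
- **Created on:** {creation_date}
"""


def render_card(hf_repo, num_images, num_pdfs, sample_size, creation_date=None):
    """Fill the template placeholders (hypothetical helper, not in the commit)."""
    if creation_date is None:
        # Default to today's date in ISO format (an assumption about the format).
        creation_date = date.today().isoformat()
    return CARD_TEMPLATE_EXCERPT.format(
        hf_repo=hf_repo,
        num_images=num_images,
        num_pdfs=num_pdfs,
        sample_size=sample_size,
        creation_date=creation_date,
    )


# Illustrative values only.
card = render_card("user/my-pdf-pages", 120, 4, 30, creation_date="2024-09-19")
print(card)
```

Because `str.format` raises `KeyError` when a placeholder has no matching argument, a call that omits one of the five fields fails loudly rather than producing a card with a literal `{num_pdfs}` left in it.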