Felladrin committed · Commit 4ea3d66 · verified · 1 Parent(s): 6c6d16c

Upload folder using huggingface_hub
README.md ADDED
@@ -0,0 +1,160 @@
---
license: apache-2.0
pipeline_tag: image-classification
library_name: transformers.js
tags:
- deep-fake
- ViT
- detection
- Image
- transformers-4.49.0.dev0
- precision-92.12
- v2
base_model:
- prithivMLmods/Deep-Fake-Detector-v2-Model
---

# Deep-Fake-Detector-v2-Model (ONNX)

This is an ONNX version of [prithivMLmods/Deep-Fake-Detector-v2-Model](https://huggingface.co/prithivMLmods/Deep-Fake-Detector-v2-Model). It was automatically converted and uploaded using [this Hugging Face Space](https://huggingface.co/spaces/onnx-community/convert-to-onnx).

## Usage with Transformers.js

See the pipeline documentation for `image-classification`: https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ImageClassificationPipeline

---

![fake q.gif](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/PVkTbLOEBr-qNkTws3UsD.gif)

# **Deep-Fake-Detector-v2-Model**

# **Overview**

The **Deep-Fake-Detector-v2-Model** is a state-of-the-art deep learning model designed to detect deepfake images. It leverages the **Vision Transformer (ViT)** architecture, specifically the `google/vit-base-patch16-224-in21k` model, fine-tuned on a dataset of real and deepfake images. The model is trained to classify images as either "Realism" or "Deepfake" with high accuracy, making it a powerful tool for detecting manipulated media.

```
Classification report:

              precision    recall  f1-score   support

     Realism     0.9683    0.8708    0.9170     28001
    Deepfake     0.8826    0.9715    0.9249     28000

    accuracy                         0.9212     56001
   macro avg     0.9255    0.9212    0.9210     56001
weighted avg     0.9255    0.9212    0.9210     56001
```

**Confusion Matrix**:
```
[[True Positives, False Negatives],
 [False Positives, True Negatives]]
```
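For reference, a report and confusion matrix in this format can be produced with scikit-learn. The snippet below is a minimal sketch, not code from this repository: `y_true` and `y_pred` are hypothetical label indices (0 = Realism, 1 = Deepfake) collected from an evaluation set.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical ground-truth and predicted label indices (0 = Realism, 1 = Deepfake)
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0]

# Per-class precision, recall, F1, and support, as in the report above
print(classification_report(y_true, y_pred, target_names=["Realism", "Deepfake"], digits=4))

# Rows correspond to true classes, columns to predicted classes
print(confusion_matrix(y_true, y_pred))
```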

![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/VLX0QDcKkSLIJ9c5LX-wt.png)

**<span style="color:red;">Update:</span>** The previous model checkpoint was obtained using a smaller classification dataset. Although it performed well in evaluation scores, its real-time performance was average due to limited variation in the training set. The new update includes a larger dataset to improve the detection of fake images.

| Repository | Link |
|------------|------|
| Deep Fake Detector v2 Model | [GitHub Repository](https://github.com/PRITHIVSAKTHIUR/Deep-Fake-Detector-Model) |

# **Key Features**
- **Architecture**: Vision Transformer (ViT) - `google/vit-base-patch16-224-in21k`.
- **Input**: RGB images resized to 224x224 pixels.
- **Output**: Binary classification ("Realism" or "Deepfake").
- **Training Dataset**: A curated dataset of real and deepfake images.
- **Fine-Tuning**: The model is fine-tuned using Hugging Face's `Trainer` API with advanced data augmentation techniques.
- **Performance**: Achieves high accuracy and F1 score on validation and test datasets.

# **Model Architecture**
The model is based on the **Vision Transformer (ViT)**, which treats images as sequences of patches and applies a transformer encoder to learn spatial relationships. Key components include:
- **Patch Embedding**: Divides the input image into fixed-size patches (16x16 pixels).
- **Transformer Encoder**: Processes patch embeddings using multi-head self-attention mechanisms.
- **Classification Head**: A fully connected layer for binary classification.

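To make these components concrete, the sketch below inspects the model configuration (the same values appear in the `config.json` included in this upload) and prints the corresponding architecture hyperparameters. It only fetches the config file, not the full weights.

```python
from transformers import AutoConfig

# Download only config.json for the fine-tuned checkpoint
config = AutoConfig.from_pretrained("prithivMLmods/Deep-Fake-Detector-v2-Model")

print(config.model_type)           # "vit"
print(config.image_size)           # 224 -> input resolution
print(config.patch_size)           # 16  -> 16x16 patches, so (224 / 16)^2 = 196 patches per image
print(config.hidden_size)          # 768 -> patch embedding dimension
print(config.num_hidden_layers)    # 12  -> transformer encoder blocks
print(config.num_attention_heads)  # 12  -> self-attention heads per block
print(config.id2label)             # {0: 'Realism', 1: 'Deepfake'} -> binary classification head
```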
# **Training Details**
- **Optimizer**: AdamW with a learning rate of `1e-6`.
- **Batch Size**: 32 for training, 8 for evaluation.
- **Epochs**: 2.
- **Data Augmentation**:
  - Random rotation (±90 degrees).
  - Random sharpness adjustment.
  - Random resizing and cropping.
- **Loss Function**: Cross-Entropy Loss.
- **Evaluation Metrics**: Accuracy, F1 Score, and Confusion Matrix.

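The snippet below is a minimal sketch of a fine-tuning setup matching these hyperparameters. It is illustrative only: the curated dataset is not published, so `train_dataset`/`eval_dataset` are placeholders, and the exact augmentation parameters (rotation range, sharpness factor, crop scale) are assumptions.

```python
from torchvision import transforms
from transformers import (
    Trainer,
    TrainingArguments,
    ViTForImageClassification,
    ViTImageProcessor,
)

base_model = "google/vit-base-patch16-224-in21k"
processor = ViTImageProcessor.from_pretrained(base_model)

# ViTForImageClassification applies cross-entropy loss for single-label classification
model = ViTForImageClassification.from_pretrained(
    base_model,
    num_labels=2,
    id2label={0: "Realism", 1: "Deepfake"},
    label2id={"Realism": 0, "Deepfake": 1},
)

# Augmentations corresponding to the bullets above (parameter values are assumptions)
augment = transforms.Compose([
    transforms.RandomRotation(degrees=90),
    transforms.RandomAdjustSharpness(sharpness_factor=2),
    transforms.RandomResizedCrop(size=224),
])

def preprocess(example):
    # `example` is assumed to have a PIL "image" and an integer "label" field
    pixel_values = processor(images=augment(example["image"]), return_tensors="pt")["pixel_values"][0]
    return {"pixel_values": pixel_values, "labels": example["label"]}

training_args = TrainingArguments(
    output_dir="deep-fake-detector-v2",
    learning_rate=1e-6,              # AdamW is the Trainer's default optimizer
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    num_train_epochs=2,
)

# train_dataset / eval_dataset are placeholders for the curated real-vs-deepfake dataset
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```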
# **Inference with Hugging Face Pipeline**
```python
from transformers import pipeline

# Load the model
pipe = pipeline('image-classification', model="prithivMLmods/Deep-Fake-Detector-v2-Model", device=0)

# Predict on an image
result = pipe("path_to_image.jpg")
print(result)
```

# **Inference with PyTorch**
```python
from transformers import ViTForImageClassification, ViTImageProcessor
from PIL import Image
import torch

# Load the model and processor
model = ViTForImageClassification.from_pretrained("prithivMLmods/Deep-Fake-Detector-v2-Model")
processor = ViTImageProcessor.from_pretrained("prithivMLmods/Deep-Fake-Detector-v2-Model")

# Load and preprocess the image
image = Image.open("path_to_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Perform inference
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
predicted_class = torch.argmax(logits, dim=1).item()

# Map class index to label
label = model.config.id2label[predicted_class]
print(f"Predicted Label: {label}")
```
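Since this repository contains the converted ONNX graphs (see the `onnx/` files below), here is a minimal sketch of running the same prediction with ONNX Runtime in Python. It assumes `onnxruntime` and `huggingface_hub` are installed; the repository id is a placeholder, and the full-precision `onnx/model.onnx` can be swapped for one of the quantized variants (e.g. `onnx/model_fp16.onnx`) to trade accuracy for size.

```python
import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from PIL import Image
from transformers import ViTImageProcessor

# Placeholder: replace with the id of this ONNX repository
repo_id = "<this-onnx-repo-id>"

# Download the full-precision ONNX graph and load the matching image processor
onnx_path = hf_hub_download(repo_id=repo_id, filename="onnx/model.onnx")
processor = ViTImageProcessor.from_pretrained("prithivMLmods/Deep-Fake-Detector-v2-Model")

# Preprocess exactly as in the PyTorch example, but keep the tensors as NumPy arrays
image = Image.open("path_to_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="np")

# Run the graph (the export has a single pixel_values input and a single logits output)
session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
(logits,) = session.run(None, {input_name: inputs["pixel_values"]})

# Map the highest-scoring class index to its label (see id2label in config.json)
predicted_class = int(np.argmax(logits, axis=-1)[0])
id2label = {0: "Realism", 1: "Deepfake"}
print(f"Predicted Label: {id2label[predicted_class]}")
```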

# **Dataset**
The model is fine-tuned on a curated dataset containing:
- **Real Images**: Authentic images of human faces.
- **Fake Images**: Deepfake images generated using advanced AI techniques.

# **Limitations**
- The model is trained on a specific dataset and may not generalize well to other deepfake datasets or domains.
- Performance may degrade on low-resolution or heavily compressed images.
- The model is designed for image classification and does not detect deepfake videos directly.

# **Ethical Considerations**

- **Misuse**: This model should not be used for malicious purposes, such as creating or spreading deepfakes.
- **Bias**: The model may inherit biases from the training dataset. Care should be taken to ensure fairness and inclusivity.
- **Transparency**: Users should be informed when deepfake detection tools are used to analyze their content.

# **Future Work**
- Extend the model to detect deepfake videos.
- Improve generalization by training on larger and more diverse datasets.
- Incorporate explainability techniques to provide insights into model predictions.

# **Citation**

```bibtex
@misc{Deep-Fake-Detector-v2-Model,
  author = {prithivMLmods},
  title = {Deep-Fake-Detector-v2-Model},
  initial = {21 Mar 2024},
  second_updated = {31 Jan 2025},
  latest_updated = {02 Feb 2025}
}
```
config.json ADDED
@@ -0,0 +1,33 @@
{
  "_attn_implementation_autoset": true,
  "_name_or_path": "prithivMLmods/Deep-Fake-Detector-v2-Model",
  "architectures": [
    "ViTForImageClassification"
  ],
  "attention_probs_dropout_prob": 0.0,
  "encoder_stride": 16,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.0,
  "hidden_size": 768,
  "id2label": {
    "0": "Realism",
    "1": "Deepfake"
  },
  "image_size": 224,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "Deepfake": 1,
    "Realism": 0
  },
  "layer_norm_eps": 1e-12,
  "model_type": "vit",
  "num_attention_heads": 12,
  "num_channels": 3,
  "num_hidden_layers": 12,
  "patch_size": 16,
  "problem_type": "single_label_classification",
  "qkv_bias": true,
  "torch_dtype": "float32",
  "transformers_version": "4.49.0"
}
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2e0662a81744aca769e240fd1daa3f44225c50d0565fd8fb8f702d1f26609c91
size 343401688
onnx/model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cafa8f6543f2892d7d63263c5484077b5416bf7507b4e5eb7a42f7ee7637b442
size 51450010
onnx/model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:07eee7514086ff66fdec3e3f55155e76cdb8e00a788c185e4c95d58040811191
size 171801382
onnx/model_int8.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d8238e449fc640c2f0959e36e8aa778bd006065c0a8af4c7f6f54c8be9c9d9e4
size 87333629
onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e239476e94466b225d490147c2f3c1ea64e969f0e09ae4f772e15a16d8e393a9
size 56757898
onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d3ac808c1e632abe7d62160e67db63d10f6a952ebc0507cf3b6d5cd0de4a2594
size 49718585
onnx/model_quantized.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3519c22b9695f99ddc00821228eeac91239065a90bfbdb4917858b3ec1dcfc42
size 87333629
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3519c22b9695f99ddc00821228eeac91239065a90bfbdb4917858b3ec1dcfc42
size 87333629
preprocessor_config.json ADDED
@@ -0,0 +1,23 @@
{
  "do_convert_rgb": null,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.5,
    0.5,
    0.5
  ],
  "image_processor_type": "ViTFeatureExtractor",
  "image_std": [
    0.5,
    0.5,
    0.5
  ],
  "resample": 2,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "height": 224,
    "width": 224
  }
}
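For reference, this configuration corresponds to the following preprocessing arithmetic (a sketch, not code shipped in this repository): resize to 224x224 with bilinear resampling ("resample": 2), rescale pixel values by 1/255 (rescale_factor), then normalize each channel with mean 0.5 and std 0.5, which maps inputs from [0, 255] to [-1, 1].

```python
import numpy as np
from PIL import Image

def preprocess(image: Image.Image) -> np.ndarray:
    """NumPy equivalent of the settings in preprocessor_config.json."""
    image = image.convert("RGB").resize((224, 224), resample=Image.BILINEAR)  # "resample": 2 = bilinear
    pixels = np.asarray(image, dtype=np.float32) * 0.00392156862745098        # rescale_factor = 1/255
    pixels = (pixels - 0.5) / 0.5                                             # (x - image_mean) / image_std -> [-1, 1]
    return pixels.transpose(2, 0, 1)[np.newaxis]                              # HWC -> NCHW, shape (1, 3, 224, 224)
```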
quantize_config.json ADDED
@@ -0,0 +1,18 @@
{
  "modes": [
    "fp16",
    "q8",
    "int8",
    "uint8",
    "q4",
    "q4f16",
    "bnb4"
  ],
  "per_channel": true,
  "reduce_range": true,
  "block_size": null,
  "is_symmetric": true,
  "accuracy_level": null,
  "quant_type": 1,
  "op_block_list": null
}