Commit deff42f8 authored by ZeroAct

update readme

parent 9ba2b2a2
.gitignore
@@ -131,4 +131,6 @@ dmypy.json
 # etc
 *.png
 *.jpg
-*.pth
\ No newline at end of file
+*.pth
+./doc/*
\ No newline at end of file
# Scene Text Remover Pytorch Implementation
This is a minimal implementation of [Scene text removal via cascaded text stroke detection and erasing](https://arxiv.org/pdf/2011.09768.pdf). This GitHub repository is for studying image in-painting for scene text erasing. Thank you :)
## Requirements
Python 3.7 or later with all [requirements.txt](./requirements.txt) dependencies installed, including `torch>=1.6`. To install run:
```
$ pip install -r requirements.txt
```
## Model Summary
![model architecture](./doc/model.png)
This model is composed of U-Net-like sub-modules.
`Gd` detects the text stroke mask `Ms` from `I` and `M`, and `G'd` refines it into a more precise stroke mask `M's`.
Similarly, `Gr` generates the text-erased image `Ite`, and `G'r` refines it into a more precise output `I'te`.
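The cascade described above can be sketched in PyTorch roughly as follows. Note this is an illustration only: the `TinyUNet` stand-in, the channel counts, and the exact concatenation order are assumptions, not the repository's actual architecture.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    # Toy stand-in for each U-Net-like sub-module; the real networks
    # are deeper encoder-decoders with skip connections.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, out_ch, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)

class CascadedRemover(nn.Module):
    def __init__(self):
        super().__init__()
        self.Gd  = TinyUNet(4, 1)  # I (3ch) + Mm (1ch) -> Ms
        self.Gd2 = TinyUNet(5, 1)  # I + Mm + Ms        -> Ms_
        self.Gr  = TinyUNet(5, 3)  # I + Mm + Ms_       -> Ite
        self.Gr2 = TinyUNet(8, 3)  # I + Mm + Ms_ + Ite -> Ite_

    def forward(self, I, Mm):
        # Each stage sees the original inputs plus the previous
        # stage's prediction, refining it step by step.
        Ms   = self.Gd(torch.cat([I, Mm], 1))
        Ms_  = self.Gd2(torch.cat([I, Mm, Ms], 1))
        Ite  = self.Gr(torch.cat([I, Mm, Ms_], 1))
        Ite_ = self.Gr2(torch.cat([I, Mm, Ms_, Ite], 1))
        return Ms, Ms_, Ite, Ite_
```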
## Custom Dictionary
To avoid confusion, I renamed the variables as follows:
- `I` : Input image (with text)
- `Mm` : Text area mask (`M` in the model)
- `Ms` : Text stroke mask; output of `Gd`
- `Ms_` : Text stroke mask; output of `G'd`
- `Msgt` : Text stroke mask; ground truth
- `Ite` : Text erased image; output of `Gr`
- `Ite_` : Text erased image; output of `G'r`
- `Itegt` : Text erased image; ground truth
## Prepare Dataset
You need to prepare background images in the `backs` directory and binary text images in the `font_mask` directory.
![background image, text image example](./doc/back.png)
[part of background image sample, text binary image sample]
Executing `python create_dataset.py` will automatically generate the `I`, `Itegt`, `Mm`, and `Msgt` data.
(If you already have `I`, `Itegt`, `Mm`, and `Msgt`, you can skip this step.)
```
├─dataset
│  ├─backs
│  │   # background images
│  ├─font_mask
│  │   # text binary images
│  ├─train
│  │  ├─I
│  │  ├─Itegt
│  │  ├─Mm
│  │  └─Msgt
│  └─val
│     ├─I
│     ├─Itegt
│     ├─Mm
│     └─Msgt
```
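As a rough illustration of what one generation step might look like, the sketch below composites a binary text stroke onto a background to produce the four training arrays. The function name, the fixed red text color, and the bounding-box text-area mask are hypothetical; the repository's `create_dataset.py` may work differently.

```python
import numpy as np

def compose_sample(back, stroke, color=(255, 0, 0)):
    """Paint a binary text stroke onto a background image.

    back:   HxWx3 uint8 background (becomes the ground truth Itegt)
    stroke: HxW bool text stroke mask (becomes Msgt)
    Returns (I, Itegt, Mm, Msgt).
    """
    Itegt = back.copy()                    # clean image is the target
    I = back.copy()
    I[stroke] = color                      # draw the text strokes
    Msgt = stroke.astype(np.uint8) * 255
    # Crude text-area mask: bounding box of the strokes.
    Mm = np.zeros_like(Msgt)
    ys, xs = np.nonzero(stroke)
    if len(ys):
        Mm[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = 255
    return I, Itegt, Mm, Msgt
```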
I generated my dataset from 709 background images and 2,410 font masks,
using 17,040 pairs for training and 4,260 pairs for validation.
![](./doc/dataset_example.png)
Thanks to [sina-Kim](https://github.com/sina-Kim) for helping me gather background images.
## Train
All you need to do is:
```
python train.py
```
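A single optimization step presumably combines losses on both stroke masks and both erased images. Below is a minimal sketch using plain L1 terms only; the `step` helper, its signature, and the equal weighting are assumptions, and the paper's full objective includes additional terms.

```python
import torch
import torch.nn.functional as F

def step(model, opt, I, Mm, Msgt, Itegt):
    # One optimization step: L1 losses on both predicted stroke masks
    # and both erased images, all supervised by the same ground truths.
    Ms, Ms_, Ite, Ite_ = model(I, Mm)
    loss = (F.l1_loss(Ms, Msgt) + F.l1_loss(Ms_, Msgt)
            + F.l1_loss(Ite, Itegt) + F.l1_loss(Ite_, Itegt))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```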
## Result
From the left:
`I`, `Itegt`, `Ite`, `Ite_`, `Msgt`, `Ms`, `Ms_`
* Epoch 2
![](./doc/epoch1.png)
* Epoch 5
![](./doc/epoch5.png)
* Epoch 10
![](./doc/epoch10.png)
* Epoch 30
![](./doc/epoch30.png)
These results are not good enough for a real task; I think the reasons are the small dataset and the simplified model.
Still, implementing the paper was a good experience for me.
## Issue
If you have trouble running this code, please use the issue tab. Thank you.
Files added:
- doc/back.png (14.2 KiB)
- doc/dataset_example.png (222 KiB)
- doc/epoch1.PNG (124 KiB)
- doc/epoch10.PNG (151 KiB)
- doc/epoch30.PNG (117 KiB)
- doc/epoch5.PNG (147 KiB)
- doc/model.png (280 KiB)
- doc/text.png (1.13 KiB)
requirements.txt:
opencv-python
matplotlib
numpy
tqdm