Gradient vanishing of G in the DCGAN example · pytorch/examples#822

0 comments · 0 reactions · 0 assignees · Python · 9,429 forks

help wanted

Repository metrics

Stars: (21,634 stars)
PR merge metrics: (No merged PRs in 30d)

Description

Hello,

I trained the DCGAN example with the default hyperparameters on the downloaded "img_align_celeba" dataset (recommended in the tutorial). The results show severe vanishing gradients for G: while Loss_D keeps decreasing towards 0, Loss_G grows very large (towards 100).

It seems that D becomes so strong that G can no longer train effectively. I did not modify the code. Do you know what happened?

Thanks!
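
To put the reported numbers in perspective, here is a minimal sketch. It assumes Loss_G is the binary cross-entropy of D(G(z)) against the "real" label, i.e. Loss_G = -log(D(G(z))) averaged over the batch. The helper name `implied_d_output` is hypothetical and only inverts that formula, so the value it returns is the geometric mean of D(G(z)) over the batch under that assumption.

```python
import math


def implied_d_output(loss_g):
    """Invert Loss_G = -log(D(G(z))) (assumed loss form) to get D(G(z))."""
    return math.exp(-loss_g)


# A few illustrative loss values; log(2) ~ 0.69 corresponds to D(G(z)) = 0.5,
# the point where D cannot tell generated samples from real ones.
for loss in (math.log(2), 5.0, 20.0, 100.0):
    print(f"Loss_G = {loss:7.2f}  ->  D(G(z)) ~ {implied_d_output(loss):.3e}")
```

Under this assumption, a Loss_G approaching 100 means D assigns generated samples a probability that is effectively zero, which is consistent with the observation that D has overpowered G.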

Contributor guide

Research direction: Investigate the default hyperparameters in the DCGAN example. Check the learning rates for generator and discriminator, and consider adjusting them. Look at the loss function implementation to ensure it matches the original DCGAN paper. Additionally, try using different GAN training techniques like label smoothing or adding noise to the discriminator inputs.
Tech stack: python
Domain: machine learning
Issue type: Research
Difficulty: 3
Estimated time: 3-5 days
Activity status: Stale
Clarity: Needs investigation
Prerequisites: Understanding of GANs; familiarity with PyTorch; basic knowledge of DCGAN
Newbie friendliness: 20
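
Two of the techniques named in the research direction above, label smoothing and adding noise to the discriminator inputs, can be sketched as follows. This is an illustration under stated assumptions, not the example's actual code: the helper names (`smoothed_real_labels`, `add_instance_noise`, `discriminator_loss`) and the values `smooth=0.9` and `noise_std=0.1` are hypothetical choices, and `netD` is assumed to end in a sigmoid so that `nn.BCELoss` applies.

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()


def smoothed_real_labels(batch_size, smooth=0.9, device="cpu"):
    """One-sided label smoothing: real targets are `smooth` instead of 1.0."""
    return torch.full((batch_size,), smooth, dtype=torch.float, device=device)


def add_instance_noise(images, std=0.1):
    """Add zero-mean Gaussian noise to the images fed to the discriminator."""
    return images + std * torch.randn_like(images)


def discriminator_loss(netD, real, fake, smooth=0.9, noise_std=0.1):
    """Discriminator loss with smoothed real labels and noisy inputs.

    Assumes `real` and `fake` have the same batch size and that `netD`
    returns one probability per sample.
    """
    batch_size = real.size(0)
    # detach() keeps the discriminator update from backpropagating into G.
    real_out = netD(add_instance_noise(real, noise_std)).view(-1)
    fake_out = netD(add_instance_noise(fake.detach(), noise_std)).view(-1)
    real_targets = smoothed_real_labels(batch_size, smooth, real.device)
    fake_targets = torch.zeros(batch_size, device=real.device)
    return criterion(real_out, real_targets) + criterion(fake_out, fake_targets)
```

Both changes make the discriminator's task slightly harder, which may keep Loss_D from collapsing to 0 as quickly. Whether they resolve the behavior reported in this issue would need to be verified experimentally, alongside the learning-rate check also suggested above.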
