Deep Learning Coursera Notes

In this post I want to share the key terms from the deep learning courses I took on Coursera, so that I can keep them in mind for later.


NEURAL NETWORKS AND DEEP LEARNING

  • Supervised learning with neural networks
  • Binary classification (e.g. logistic regression)
  • Logistic regression, SVM -> traditional learning algorithms
  • Cost function
  • Gradient descent
  • Derivatives
  • Vectorization and broadcasting
  • Numpy, iPython, Jupyter
  • Activation functions (softmax, ReLU, leaky ReLU, tanh (hyperbolic tangent), swish (a self-gated activation function))
  • Forward / Back propagation
  • Random initialization
  • Shallow neural networks
  • CNN (for image)
  • Recurrent NN (Sequence data)
  • Deep reinforcement learning
  • Regression (standard NN)
  • Structured data – Database, Unstructured data – audio, image, text
  • Tensorflow, Chainer, Theano, Pytorch
  • Normalization
    • Standard score
    • T test
    • Standardized moment
    • Coefficient of variation
    • Feature scaling (min-max)
  • Circuit Theory
  • Parameters
    • W, b
  • Hyperparameters
    • learning rate
    • #iterations
    • #hidden layers
    • #hidden units
    • activation function
    • momentum
    • minibatch size
    • regularization
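
Several of the terms above (binary classification, cost function, gradient descent, vectorization and broadcasting, the parameters W and b, and the learning rate and #iterations hyperparameters) can be tied together in one small NumPy sketch. This is only an illustration, not code from the course: the function names, the toy data, and the default values are my own assumptions.

```python
import numpy as np


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic_regression(X, y, learning_rate=0.1, num_iterations=1000):
    """Fit the parameters W, b with batch gradient descent.

    X has shape (n_features, m_examples); y has shape (1, m_examples).
    learning_rate and num_iterations are the hyperparameters.
    """
    n, m = X.shape
    W = np.zeros((n, 1))  # zeros are fine here; hidden layers need random init
    b = 0.0
    eps = 1e-12           # keeps log() away from zero
    costs = []
    for _ in range(num_iterations):
        # Forward propagation, vectorized over all m examples.
        # The scalar b is broadcast across every column.
        A = sigmoid(W.T @ X + b)
        # Cross-entropy cost function, averaged over the examples.
        cost = -np.mean(y * np.log(A + eps) + (1 - y) * np.log(1 - A + eps))
        costs.append(cost)
        # Back propagation: derivatives of the cost w.r.t. W and b.
        dZ = A - y
        dW = (X @ dZ.T) / m
        db = np.sum(dZ) / m
        # Gradient descent update.
        W -= learning_rate * dW
        b -= learning_rate * db
    return W, b, costs


def predict(W, b, X):
    """Return 0/1 labels by thresholding the predicted probability at 0.5."""
    return (sigmoid(W.T @ X + b) > 0.5).astype(int)


# Toy, linearly separable data (made up for this example).
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))
y = (X[0:1] + X[1:2] > 0).astype(float)

W, b, costs = train_logistic_regression(X, y)
accuracy = np.mean(predict(W, b, X) == y)
print(f"first cost {costs[0]:.3f}, last cost {costs[-1]:.3f}, accuracy {accuracy:.2f}")
```

There is no Python loop over the training examples: the matrix products handle all of them at once, which is the point of vectorization.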

IMPROVING DEEP NEURAL NETWORKS: HYPERPARAMETER TUNING, REGULARIZATION AND OPTIMIZATION

  • Dataset -> Training / Dev (hold-out validation) / Test sets
    • For large datasets the split can be 98/1/1. Traditionally it is 70/30 or 60/20/20.
  • Bias / variance.
    • high bias = underfitting
      • bigger network (almost always helps)
      • train longer (NN architecture search) (does not always help)
    • high variance = overfitting
      • more data
      • regularization
        • L1 regularization
        • L2 regularization (lambd) – more commonly recommended and preferred.
        • Dropout regularization (keep prob)
        • Data augmentation
        • Early stopping
      • NN architecture search
  • Speed-up the training
    • normalizing the inputs
      • subtract mean
      • normalize variance
    • vanishing / exploding gradients
    • weight initialization of deep networks
      • Xavier initialization
      • He initialization
    • gradient checking -> backpropagation control
      • don't use in training
      • dtheta, dtheta approx.
      • remember regularization
      • does not work with dropout
  • Optimization algorithms
    • (stochastic) gradient descent
    • momentum
    • RMSProp
    • Adam
  • Mini batch
  • Exponentially weighted averages
  • Bias correction
  • Learning rate decay
  • The problem of local optima
  • HYPERPARAMETER TUNING
    • try random values
    • convnets, ResNets
    • panda approach, babysitting one model (if system resources are limited) or baby fish (caviar) approach (otherwise)
    • batch normalization
    • covariate shift
    • softmax regression
    • hardmax: the biggest element becomes 1, the others 0
  • Frameworks
    • Caffe/Caffe2
    • CNTK
    • DL4J
    • Keras
    • Lasagne
    • mxnet
    • PaddlePaddle
    • Tensorflow
    • Theano
    • Torch
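
The gradient checking notes above (dtheta versus dtheta approx, "remember regularization") can be sketched in NumPy as follows. This is a minimal sketch under my own assumptions: the toy cost function, the names, and the thresholds in the comments are illustrative and not taken from the course material. It compares the analytic gradient with a two-sided finite-difference approximation.

```python
import numpy as np

LAMBD = 0.1  # hypothetical L2 regularization strength


def cost(theta):
    """Toy cost J(theta) = sum(theta^3) / 3 + (LAMBD / 2) * sum(theta^2)."""
    return np.sum(theta ** 3) / 3.0 + 0.5 * LAMBD * np.sum(theta ** 2)


def gradient(theta):
    """Analytic gradient of the toy cost, including the regularization term."""
    return theta ** 2 + LAMBD * theta


def gradient_without_regularization(theta):
    """Deliberately buggy gradient that forgets the regularization term."""
    return theta ** 2


def gradient_check(cost_fn, grad_fn, theta, epsilon=1e-7):
    """Relative difference between dtheta and its numerical approximation."""
    dtheta = grad_fn(theta)
    dtheta_approx = np.zeros_like(theta)
    for i in range(theta.size):
        theta_plus = theta.copy()
        theta_minus = theta.copy()
        theta_plus[i] += epsilon
        theta_minus[i] -= epsilon
        # Two-sided (centered) difference for the i-th component.
        dtheta_approx[i] = (cost_fn(theta_plus) - cost_fn(theta_minus)) / (2 * epsilon)
    numerator = np.linalg.norm(dtheta_approx - dtheta)
    denominator = np.linalg.norm(dtheta_approx) + np.linalg.norm(dtheta)
    return numerator / denominator


theta = np.array([1.0, -2.0, 3.0])
good = gradient_check(cost, gradient, theta)
bad = gradient_check(cost, gradient_without_regularization, theta)
print(f"correct gradient: {good:.2e}, regularization forgotten: {bad:.2e}")
```

A very small relative difference suggests the backpropagation code is right, while a large one points to a bug such as the forgotten regularization term. Because the check perturbs one parameter at a time, it is slow, which is why it is used for debugging and not inside the training loop.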

STRUCTURING MACHINE LEARNING PROJECTS

  • Orthogonalization (needed for training to be successful enough) (like tuning a radio) (development set (training))
    • fit training set well on cost function
      • bigger NN or better optimization algorithms
    • fit dev. set well on cost function
      • regularization or bigger training set
    • fit test set well on cost function
      • bigger dev set
    • performs well in real world
      • dev set is not set correctly, the cost function is not evaluating the right thing
  • Single number evaluation metric
    • P (precision) (e.g. of the examples predicted as cat, 95% really are cats)
    • R (recall) (e.g. 90% of the actual cats were recognized correctly)
    • F1 score – harmonic mean of precision and recall (the classifier with the higher F1 value is the more successful one)
  • Satisficing and optimizing metric
    • decide which metric is satisficing and which is optimizing.
  • Train/dev/test sets distribution
  • When to change dev/test sets and metrics
  • Human level performance
    • avoidable bias / bayes optimal error (best possible error)
    • reducing bias/variance
    • surpassing human-level performance
  • ERRORS
    • training
      • variance
      • more data
      • regularization (L2, dropout, augmentation)
      • NN architecture / hyperparameter search
    • dev
    • human-level errors
      • avoidable bias
      • train bigger model
      • train longer
      • use better optimization algorithms (momentum, RMSProp, Adam)
      • NN architecture
      • Hyperparameter search
      • RNN/CNN
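
The single number evaluation metric listed above (precision, recall, F1) can be computed with a few lines of plain Python. This is a small sketch, assuming labels are encoded as 1 for "cat" and 0 for "not cat"; the function name and the example labels are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = cat, 0 = not cat)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cats found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-cats called cat
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cats missed

    # Precision: of everything predicted as cat, how much really is a cat.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all actual cats, how many were recognized.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The harmonic mean is dominated by the smaller of the two values, so a classifier cannot reach a high F1 by maximizing precision alone or recall alone.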