• Datablock syntax:

    • Recall that DataBlock is only a blueprint; it does not load any data until dataloaders is called
    • Use item_tfms to apply per-item transforms such as Resize, so every image ends up the same size
    • When creating the DataLoaders, pass the path that contains the specific training/validation portions (see the sketch below)
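    A minimal sketch of the syntax; the path, splitter, and label function here are assumptions for illustration:

    ```python
    from fastai.vision.all import *

    path = Path('data')  # assumed layout: data/train/<class>/..., data/valid/<class>/...

    # A blueprint only: nothing is loaded until .dataloaders() is called
    dblock = DataBlock(
        blocks=(ImageBlock, CategoryBlock),   # inputs are images, targets are categories
        get_items=get_image_files,            # collect image file paths
        splitter=GrandparentSplitter(train_name='train', valid_name='valid'),
        get_y=parent_label,                   # label = name of the parent folder
        item_tfms=Resize(224),                # per-item transform: resize every image
    )

    dls = dblock.dataloaders(path)            # pass the path holding the train/valid portions
    ```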
  • Show sample batches:

    • Call show_batch on the train or valid DataLoader: dls.train.show_batch(max_n=..., nrows=...), and likewise dls.valid.show_batch
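    Continuing the sketch above:

    ```python
    # Show 9 training samples in a 3x3 grid
    dls.train.show_batch(max_n=9, nrows=3)

    # Show 4 validation samples in a single row
    dls.valid.show_batch(max_n=4, nrows=1)
    ```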
  • To define custom loss functions (with a hand-written gradient), subclass torch.autograd.Function

    • ctx → a context object where you can store information needed for the backward computation
      • We store tensors via ctx.save_for_backward and access them later via ctx.saved_tensors
    • Then, implement a forward and a backward method → both of these are static methods in Python
      • Use the @staticmethod decorator to declare that both methods are static
    • Call the function via its .apply method instead of instantiating it (see the sketch below)
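    A minimal sketch; MySquaredError is a made-up example loss, not anything from the course:

    ```python
    import torch

    class MySquaredError(torch.autograd.Function):
        @staticmethod
        def forward(ctx, pred, target):
            # Stash tensors needed by backward on the context object
            ctx.save_for_backward(pred, target)
            return ((pred - target) ** 2).mean()

        @staticmethod
        def backward(ctx, grad_output):
            # Retrieve what forward stored
            pred, target = ctx.saved_tensors
            grad_pred = grad_output * 2 * (pred - target) / pred.numel()
            # Return one gradient per forward input (None: no grad for target)
            return grad_pred, None

    pred = torch.randn(4, requires_grad=True)
    target = torch.rand(4)
    loss = MySquaredError.apply(pred, target)  # .apply, not MySquaredError()
    loss.backward()
    ```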
  • Loading a pretrained model from fastai:

    • Use the appropriate learner (cnn_learner, renamed vision_learner in newer fastai) and specify the pretrained architecture
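    For example, assuming the dls from earlier and resnet18 as the architecture:

    ```python
    from fastai.vision.all import *

    # vision_learner downloads pretrained weights for the given architecture by default
    learn = vision_learner(dls, resnet18, metrics=accuracy)
    learn.fine_tune(1)
    ```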
  • torch.max(t, dim=...)

    • dim=0 → reduces column-wise (selects the max from each COLUMN)
    • dim=1 → reduces row-wise (selects the max from each ROW); it also returns the argmax indices
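    Concretely:

    ```python
    import torch

    t = torch.tensor([[1, 5, 2],
                      [4, 0, 3]])

    vals, idxs = torch.max(t, dim=0)  # max of each column: tensor([4, 5, 3])
    vals, idxs = torch.max(t, dim=1)  # max of each row:    tensor([5, 4])
    ```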
  • PyTorch’s implementation of NLLLoss only accepts comparing a distribution to class-index targets

    • You can’t compute cross entropy in PyTorch between two distributions using just this
    • CROSS ENTROPY AND THE LIKE ONLY ACCEPT ONE-DIMENSIONAL TENSORS OF CLASS INDICES AS TARGETS
    • IF YOU WANT TO COMPARE TWO DISTRIBUTIONS WITH ONE ANOTHER, USE KULLBACK-LEIBLER (KL) DIVERGENCE LOSS instead of cross entropy to measure how far apart the two tensors are
    • KL divergence measures the difference between cross entropy and entropy: KL(p‖q) = H(p, q) - H(p)
    • Or, implement it from scratch
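    A minimal sketch of both cases (the tensor shapes here are just for illustration):

    ```python
    import torch
    import torch.nn.functional as F

    student_logits = torch.randn(8, 10)
    teacher_logits = torch.randn(8, 10)

    # KL divergence between two distributions:
    # F.kl_div expects log-probabilities as input and probabilities as target
    kl = F.kl_div(
        F.log_softmax(student_logits, dim=1),
        F.softmax(teacher_logits, dim=1),
        reduction='batchmean',
    )

    # Cross entropy, by contrast, wants a 1-D tensor of class indices
    targets = torch.randint(0, 10, (8,))
    ce = F.cross_entropy(student_logits, targets)
    ```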
  • Make sure that we are using THE CORRECT VALUES FOR ALL CALCULATIONS!

    • Namely → understand the difference between predicted probabilities (values between 0 and 1) and labels (class indices, i.e. the index of the highest probability)
  • NOTE → FOR CWTM, WE ARE SUBTRACTING THE STUDENT’S PREDICTED PROBABILITIES FROM THE TRUE PROBABILITIES (THE ONE-HOT LABELS)

    • This is not the same as subtracting indexes. We don’t care about the predicted class but rather the probability
    • And, the probabilities for the true labels can only be zero or one
    • So, subtract the probabilities at the CORRECT index.
      • torch.index_select lets you select indices along a dimension
        • CRITICAL NOTE → IT APPLIES THE SAME INDEX LIST TO EVERY SLICE ALONG THAT DIMENSION, NOT ONE INDEX PER ROW!
        • So, use TORCH.GATHER for per-row selection instead (see the sketch after this list)
      • Use torch.rand to generate random tensors for testing, and torch.randint for a tensor of random integers
    • IN FASTAI, DO NOT PASS THE SAME MODEL INSTANCE TO BOTH LEARNERS, AS IT MEANS THAT THE UNDERLYING WEIGHTS WILL BE SHARED!
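    A minimal sketch of the index_select vs. gather distinction, using random tensors for testing:

    ```python
    import torch

    probs = torch.rand(4, 3)            # fake student probabilities
    labels = torch.randint(0, 3, (4,))  # fake class indices

    # index_select applies the SAME column indices to every row
    cols = torch.index_select(probs, dim=1, index=torch.tensor([0, 2]))  # shape (4, 2)

    # gather picks one entry per row: the probability at each row's correct class
    picked = probs.gather(1, labels.unsqueeze(1)).squeeze(1)             # shape (4,)

    # The true probability at the correct index is always 1, so the difference is:
    diff = 1.0 - picked
    ```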
  • The loss function IS JUST STANDARD CROSS ENTROPY → WE ARE NOT MEASURING A DIFFERENCE BETWEEN DISTRIBUTIONS

    • We instead just want cross entropy between the student’s predictions and the labels
    • But, the GRADIENT will carry the CWTM weighting (see the sketch below).
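    Putting the pieces together: a sketch of a custom autograd Function whose forward is plain cross entropy and whose backward re-weights each sample’s gradient. The exact weighting used here (teacher’s max probability, normalized over the batch) is an assumption for illustration, not a verified reference implementation:

    ```python
    import torch
    import torch.nn.functional as F

    class CWTMLoss(torch.autograd.Function):
        @staticmethod
        def forward(ctx, student_logits, labels, teacher_probs):
            ctx.save_for_backward(student_logits, labels, teacher_probs)
            return F.cross_entropy(student_logits, labels)

        @staticmethod
        def backward(ctx, grad_output):
            student_logits, labels, teacher_probs = ctx.saved_tensors
            probs = F.softmax(student_logits, dim=1)
            onehot = F.one_hot(labels, probs.shape[1]).float()
            # Per-sample weight: teacher's max probability, normalized over the batch
            w = teacher_probs.max(dim=1).values
            w = w / w.sum()
            # Standard cross-entropy gradient (probs - onehot), scaled per sample
            grad = (probs - onehot) * w.unsqueeze(1)
            return grad_output * grad, None, None

    # Random tensors for testing
    logits = torch.randn(4, 3, requires_grad=True)
    labels = torch.randint(0, 3, (4,))
    teacher = F.softmax(torch.randn(4, 3), dim=1)
    loss = CWTMLoss.apply(logits, labels, teacher)
    loss.backward()
    ```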
  • Use torch.where for conditional elementwise operations
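    For example:

    ```python
    import torch

    x = torch.tensor([-2.0, 3.0, -1.0, 4.0])
    # Elementwise: keep x where the condition holds, otherwise use 0
    relu_like = torch.where(x > 0, x, torch.zeros_like(x))  # tensor([0., 3., 0., 4.])
    ```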