Erosis
Erosis t1_j72rzdl wrote
Reply to comment by SAbdusSamad in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
You'll probably be fine learning transformers directly, but a better understanding of RNNs might make some of the NLP tutorials/papers containing transformers more easily comprehensible.
Attention is a very important component of transformers, but attention can be applied to RNNs, too.
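For intuition, here's a minimal sketch of scaled dot-product attention, the core operation inside a transformer (shapes are toy values, and this is a bare-bones illustration rather than a full multi-head implementation):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k); each query attends over all keys
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # query/key similarity, scaled
    weights = F.softmax(scores, dim=-1)            # attention distribution over positions
    return weights @ v                             # weighted average of the values

q = k = v = torch.randn(1, 4, 8)  # toy self-attention: queries, keys, values all the same
out = scaled_dot_product_attention(q, k, v)  # -> (1, 4, 8)
```

The same mechanism predates transformers: it was originally bolted onto RNN encoder-decoders for machine translation (Bahdanau et al., 2014).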
Erosis t1_j07aho6 wrote
Reply to comment by Deep-Station-1746 in [P] Implemented Vision Transformers 🚀 from scratch using TensorFlow 2.x by TensorDudee
Yet people here praise PyTorch when TensorFlow equivalents are often faster in production. TensorFlow still has relevance and gets a bit too much hate here (and I personally prefer PyTorch).
Erosis t1_ivwar5d wrote
Reply to comment by DreamyPen in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
Yes, this is 'instance' or 'sample' weighting. You can choose to apply this weight to the loss or the gradients before your parameter update.
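As a rough sketch of the loss-level version (the model, data, and the 1.0/0.3 weights here are placeholders, not recommendations):

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 1)                # stand-in model
x = torch.randn(32, 10)                       # toy batch
targets = torch.randint(0, 2, (32,)).float()  # binary labels
reliable = torch.rand(32) > 0.5               # True = sample came from the trusted source
weights = torch.where(reliable, torch.tensor(1.0), torch.tensor(0.3))

logits = model(x).squeeze(1)
per_sample = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
loss = (weights * per_sample).mean()          # unreliable samples contribute less
loss.backward()                               # their gradients are scaled down the same way
```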
Erosis t1_ivu2gnv wrote
Reply to comment by DreamyPen in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
Trees complicate it a bit more. I've never done it for something like that, but xgboost's instance weighting is one example: the `fit` function of its scikit-learn API accepts a `sample_weight` argument.
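Something like this (toy data; the 1.0/0.3 weights are just for illustration):

```python
import numpy as np
from xgboost import XGBRegressor

X = np.random.randn(100, 5)
y = np.random.randn(100)
# Suppose the first 60 rows come from the reliable source
sample_weight = np.where(np.arange(100) < 60, 1.0, 0.3)

model = XGBRegressor(n_estimators=50)
model.fit(X, y, sample_weight=sample_weight)  # each row's loss contribution is scaled
```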
I know that TensorFlow has a new-ish library for trees (TensorFlow Decision Forests). You could potentially write a manual gradient-descent loop there with modified minibatch gradients.
Erosis t1_ivtyokc wrote
Reply to comment by DreamyPen in [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
You could use a custom training loop where you down-weight the gradients of the unreliable samples before you do parameter updates.
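In TensorFlow, for example, such a loop could look roughly like this (model, data, and weights are all placeholders); scaling each sample's loss before the backward pass scales its gradient contribution by the same factor:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # stand-in model
optimizer = tf.keras.optimizers.Adam()
bce = tf.keras.losses.BinaryCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.NONE)

@tf.function
def train_step(x, y, w):
    # w: one reliability weight per sample (e.g. 1.0 trusted, 0.3 noisy)
    with tf.GradientTape() as tape:
        per_sample = bce(y, model(x))          # one loss value per sample
        loss = tf.reduce_mean(w * per_sample)  # down-weighting here scales the gradients
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

x = tf.random.normal((32, 10))
y = tf.cast(tf.random.uniform((32, 1)) > 0.5, tf.float32)
w = tf.where(tf.random.uniform((32,)) > 0.5, 1.0, 0.3)
train_step(x, y, w)
```

If you don't need a fully custom loop, Keras's `model.fit` also accepts a `sample_weight` argument directly.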
Erosis t1_ivtxuwn wrote
Reply to [Discussion] Can we train with multiple sources of data, some very reliable, others less so? by DreamyPen
Are the outputs of your model binary? You could instead set the target of your uncertain data points to somewhere closer to the middle instead of 0 and 1.
If you are training in batches, you could reduce the size of the gradient updates coming from the unreliable data.
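For the first suggestion, a small sketch of softening the binary targets of the unreliable samples (the 0.8/0.2 values are arbitrary, just to illustrate):

```python
import numpy as np

y = np.array([1, 0, 1, 0], dtype=float)
reliable = np.array([True, True, False, False])
# Pull unreliable targets toward the middle; keep trusted ones hard
y_soft = np.where(reliable, y, np.where(y == 1, 0.8, 0.2))
# y_soft -> [1.0, 0.0, 0.8, 0.2]; train with a loss that accepts soft labels,
# e.g. binary cross-entropy with probability targets
```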
Erosis t1_irewcom wrote
Reply to [N] I Have Released the YouTube Series Discussing and Implementing Activation Functions by itsstylepoint
Your videos have been great so far! Can't wait for more modeling content.
Erosis OP t1_ir9kj9k wrote
Reply to comment by KeikakuAccelerator in [R] Google announces Imagen Video, a model that generates videos from text by Erosis
I'm referring to their new Make-A-Video model, but I suppose they just announced that a few days ago. Hopefully they fully release that model.
Erosis OP t1_ir8cdlx wrote
Reply to comment by IntelArtiGen in [R] Google announces Imagen Video, a model that generates videos from text by Erosis
It seems that Google is being very conservative with the release of their diffusion models, even compared to Meta's and OpenAI's closed-source approaches.
Luckily, Stability AI seems to be working on a video-generating diffusion model.
Submitted by Erosis t3_xws0p1 in MachineLearning
Erosis t1_ir0tjwg wrote
I believe there's a small typo in figure 3.8.
> g-h) The clipped planes are then weighted
should be:
> g-i) The clipped planes are then weighted
Let me know if I'm mistaken here. Good stuff so far!
Edit: Added a GitHub issue regarding this.
Erosis t1_jegj2l9 wrote
Reply to comment by Educational-Net303 in [News] Twitter algorithm now open source by John-The-Bomb-2
Twitter is already established as a brand to near saturation and Elon has more money than god. It's the perfect combo for ML philanthropy. Now waiting for that Tesla vision algorithm...