Dartagnjan

Dartagnjan t1_jdzo44q wrote

Is anyone in need of a machine learning protégé? I am looking for a doctorate position in the German- or English-speaking world.

My experience is in deep learning, specifically GNNs applied to problems in the sciences. I would like to remain broadly within deep learning, but would not mind changing topic to another application or to a more theoretical research project.

I am also interested in theoretical questions, e.g. given a well-defined problem (such as approximating the solution of a PDE), what can we say about the "training difficulty": is optimization possible at all (cf. neural tangent kernel analysis), how do architectures help facilitate optimization, and what are solid mathematical foundations for deep learning theory?

I have a strong mathematical background with knowledge of functional analysis and differential geometry, and I also hold a BSc in Physics, adjacent to my main mathematical education.

Last week I also started getting into quantum machine learning (QML) with PennyLane and find that area quite interesting as well.

Please get in touch if you think I could be a good fit for your research group, or if you know of an open position that might fit my profile.

2

Dartagnjan OP t1_j108k4y wrote

That is what I have already done. So far, the loss just oscillates but remains high, which leads me to believe that either I am not training in the right way, i.e. maybe the difference between the easy and hard training examples is too drastic to bridge, or my model is simply not capable of handling the harder examples.

1

Dartagnjan OP t1_j103ef6 wrote

  1. I have already tried my own version of selective backprop, but thanks for the link; this is exactly what I was looking for. I want to know how other people implement it and whether I did something wrong (see the sketch after this list for what I mean by it).
  2. Overfitting on the hard examples is a test I have already carried out multiple times, but not yet on the latest experiments. Thanks for reminding me of this. From it I can infer whether my model's capacity is definitely too low: if I cannot overfit, it is. Still, even if I can overfit on the hard examples, that does not mean the model can handle easy and hard examples at the same time.
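
For reference, this is roughly the kind of selective-backprop step I mean: forward the whole batch, but backpropagate only through the highest-loss examples. The names (`selective_backprop_step`, `keep_frac`) are just illustrative, and the published method samples examples probabilistically by loss percentile rather than taking a hard top-k like this sketch does.

```python
import torch

def selective_backprop_step(model, optimizer, loss_fn, x, y, keep_frac=0.5):
    # loss_fn must return per-example losses, e.g. constructed with reduction="none"
    per_example_loss = loss_fn(model(x), y)
    # keep only the hardest (highest-loss) fraction of the batch for the backward pass
    k = max(1, int(keep_frac * per_example_loss.numel()))
    hard_idx = torch.topk(per_example_loss, k).indices
    optimizer.zero_grad()
    per_example_loss[hard_idx].mean().backward()
    optimizer.step()
```
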
15

Dartagnjan OP t1_j103e5a wrote

Yes, I already have batch_size=1. I am looking into sharding the model across multiple GPUs now. In my case, not being able to predict on the 1% of super hard examples means that those examples have features that the model has not yet learned to understand. The labeling is very close to perfect, with mathematically proven error bounds...
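
By sharding I mean simple model parallelism, splitting the layers across devices. A rough sketch, assuming two GPUs and a hypothetical two-stage split (the layer sizes are placeholders, not my actual architecture):

```python
import torch.nn as nn

class ShardedModel(nn.Module):
    def __init__(self):
        super().__init__()
        # first half of the network lives on one GPU, the second half on another
        self.stage1 = nn.Sequential(nn.Linear(128, 256), nn.ReLU()).to("cuda:0")
        self.stage2 = nn.Linear(256, 1).to("cuda:1")

    def forward(self, x):
        # move activations between devices at the stage boundary
        h = self.stage1(x.to("cuda:0"))
        return self.stage2(h.to("cuda:1"))
```
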

> focal loss, hard-example mining

I think these are exactly the keywords that I was missing in my search.
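
In case it helps anyone else who lands here, this is how I understand focal loss in the binary case (a sketch following Lin et al., not my actual setup; gamma and alpha are the usual defaults):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    # per-example BCE; exp(-bce) recovers the predicted probability of the true class
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # down-weight easy examples (high p_t) so the hard ones dominate the gradient
    return (alpha_t * (1.0 - p_t) ** gamma * bce).mean()
```
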

5