UnusualClimberBear
UnusualClimberBear t1_jd3mklf wrote
Reply to comment by tekktokk in [R] What do we think about Meta-Interpretive Learning? by tekktokk
Even for protein folding it has been superseded by deep models. It might be useful for critical tasks where errors are not allowed and everything is deterministic, but I'm not an expert in the field.
UnusualClimberBear t1_jd3gqap wrote
Reply to comment by tekktokk in [R] What do we think about Meta-Interpretive Learning? by tekktokk
Usually, the problem is the combinatorial number of rules that could apply. Here they seem able to find a subset of candidate rules with polynomial complexity, but since Table 7 of the second paper contains tiny (w.r.t. ML/RL data) problem instances, I would answer yes to your questions. ILP comes with strong guarantees, while ML comes with a statistical risk. These guarantees aren't free.
UnusualClimberBear t1_jd2myu1 wrote
Sounds like a rebranding of Inductive Logic Programming. It does not scale, while all recent advances are about scaling simple systems. Consider that for a vanilla transformer the bottleneck is often the attention, because it is O(N^2) in sequence length, and people are switching to linear attention.
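To make the N^2 point concrete, here is a minimal NumPy sketch (just an illustration, not from any of the papers discussed): the score matrix is N x N, which is exactly what linear-attention variants try to avoid materializing.

```python
import numpy as np

def vanilla_attention(Q, K, V):
    """Single-head attention: the score matrix is N x N, hence quadratic compute/memory."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # shape (N, N): the quadratic bottleneck
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                  # shape (N, d)

N, d = 4096, 64
Q, K, V = (np.random.randn(N, d) for _ in range(3))
out = vanilla_attention(Q, K, V)   # the (N, N) score matrix alone is ~16.8M floats (~67 MB in float32)
```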
UnusualClimberBear t1_jczl0bn wrote
Reply to comment by currentscurrents in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
Better to light a candle than to buy an AMD GPU for anything close to cutting edge.
UnusualClimberBear t1_jcfd2jt wrote
Reply to comment by SomewhereAtWork in [D] Is it possible to train LLaMa? by New_Yak1645
Yes, doable on a low budget if you have no fear of legal actions...
UnusualClimberBear t1_jceked4 wrote
Pure PR. And please do not slow down other projects.
UnusualClimberBear t1_jbnoo8n wrote
Reply to comment by potatoandleeks in [D] Is it possible to train LLaMa? by New_Yak1645
You can rent some (but not thousands) on vast.ai for around $1.50 an hour.
UnusualClimberBear t1_jbngux4 wrote
Reply to [D] Is it possible to train LLaMa? by New_Yak1645
Training from scratch required 2048 A100s for 21 days, and that seems to be only the final run.
I guess you can fine-tune it with much lower resources; 16 A100s seems reasonable, since going lower will require quantization or partial loading of the model.
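If you go the quantization route, a sketch of what that typically looks like with the Hugging Face transformers + bitsandbytes stack (the checkpoint path is a placeholder, and you still need legitimate access to the weights):

```python
# Sketch: load the model in 8-bit so fine-tuning fits on far fewer GPUs.
# Assumes transformers and bitsandbytes are installed; the path below is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/llama-checkpoint"   # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_8bit=True,    # 8-bit quantization via bitsandbytes
    device_map="auto",    # spread layers across the available GPUs automatically
)
```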
UnusualClimberBear t1_j7pdue6 wrote
Reply to comment by EmbarrassedFuel in Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
This is because the information is in the books.
(free online) http://www.cds.caltech.edu/~murray/amwiki/index.php/Main_Page
https://www.amazon.com/Modern-Control-Systems-12th-Edition/dp/0136024580
Yet nonlinearity breaks everything there. The usual approach is to linearize around well-chosen operating points and compute the control using the closest linearization.
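As a toy illustration of that linearize-then-control recipe (my own sketch, not taken from the books above): linearize an inverted pendulum around the upright equilibrium and compute an LQR gain with SciPy.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Pendulum dynamics: theta_ddot = -(g/l) sin(theta) + u / (m l^2)
g, l, m = 9.81, 1.0, 1.0

# Linearize around the upright equilibrium theta = pi (sin(pi + x) ~ -x):
A = np.array([[0.0, 1.0],
              [g / l, 0.0]])          # unstable: small deviations grow
B = np.array([[0.0],
              [1.0 / (m * l**2)]])

Q = np.diag([10.0, 1.0])              # penalize angle error more than angular velocity
R = np.array([[0.1]])

P = solve_continuous_are(A, B, Q, R)  # continuous-time Riccati equation
K = np.linalg.solve(R, B.T @ P)       # LQR gain: u = -K x, valid only near the linearization point
```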
UnusualClimberBear t1_j7opc2r wrote
Reply to comment by UnusualClimberBear in Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
Also, if your world is deterministic but you cannot build a good model of it, you may be close to the situation of games such as Go, and Monte Carlo Tree Search algorithms are an option to consider (variants of UCT, with or without function approximation). A minimal sketch of the selection rule follows.
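The core of those methods is the UCT selection rule; a bare-bones sketch (node bookkeeping is assumed to live elsewhere):

```python
import math

def uct_score(value_sum, visits, parent_visits, c=1.414):
    """UCB for Trees: exploit the empirical mean, explore rarely visited children."""
    if visits == 0:
        return float("inf")                      # always expand unvisited children first
    exploit = value_sum / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """children maps action -> (value_sum, visits); pick the action maximizing the UCT score."""
    return max(children, key=lambda a: uct_score(*children[a], parent_visits))
```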
UnusualClimberBear t1_j7lvpz8 wrote
Reply to Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
Looks like an optimal control problem rather than an RL one. RL is there for situations where no good model is available. If stochasticity is present but you still have a good model once the uncertainty is known, then model predictive control is a good way to go.
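A toy receding-horizon sketch of that idea (names and optimizer are my own choices; `f` is your one-step model, `cost` your stage cost):

```python
import numpy as np
from scipy.optimize import minimize

def mpc_action(x0, f, cost, horizon=10, u_dim=1):
    """Plan an open-loop control sequence with the model, execute only the first action, replan next step."""
    def total_cost(u_flat):
        u_seq = u_flat.reshape(horizon, u_dim)
        x, c = x0, 0.0
        for u in u_seq:
            c += cost(x, u)
            x = f(x, u)                # roll the (known or learned) model forward
        return c

    res = minimize(total_cost, np.zeros(horizon * u_dim), method="L-BFGS-B")
    return res.x.reshape(horizon, u_dim)[0]    # receding horizon: apply only the first action
```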
UnusualClimberBear t1_j7cjxu9 wrote
Reply to comment by AdFew4357 in Are PhDs in statistics useful for ML research? [D] by AdFew4357
Basically, current trends just ignore any reasonable practice, such as proper train/valid/test splits. For now, the bigger, the better. This requires quite a lot of engineering-support skills (parallelization and data pipelining in particular) rather than theory-related ones.
UnusualClimberBear t1_j7chhkq wrote
This is all about timing. Currently, stats/maths capabilities are not at their best.
UnusualClimberBear t1_izaa5xr wrote
Reply to comment by Ulfgardleo in [Discussion] Suggestions on Trust Region Methods For Natural Gradient by randomkolmogorov
For RL you also need to account for the uncertainty over the state-action pairs you almost ignored during data collection but would like to use more. A gradient on a policy behaves differently from a gradient on a supervised loss.
UnusualClimberBear t1_iza00fp wrote
Reply to comment by randomkolmogorov in [Discussion] Suggestions on Trust Region Methods For Natural Gradient by randomkolmogorov
TRPO is often too slow for applications because of that line search, and researchers often prefer PPO, which also comes with some guarantees in terms of KL on the state distribution and is faster. I'd be curious to hear about your problem if it turns out that TRPO is the best choice.
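For reference, the clipped surrogate that PPO optimizes instead of TRPO's constrained step, sketched in NumPy (notation: ratio of new to old policy probabilities; standard advantage estimates assumed):

```python
import numpy as np

def ppo_clipped_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """PPO surrogate: clip the probability ratio so one update cannot move the policy too far."""
    ratio = np.exp(new_logp - old_logp)                        # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))            # maximize surrogate = minimize its negative
```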
UnusualClimberBear t1_iz9ohx8 wrote
TRPO follows the same direction as NPG, with a maximal step size chosen to still satisfy the quadratic approximation of the KL constraint (sketched below). I'm not sure what you would like to do better.
Nicolas Le Roux gave a nice talk on RL seen as an optimization problem: https://slideslive.com/38935818/policy-optimization-in-reinforcement-learning-rl-as-blackbox-optimization
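To make the step-size point concrete, a small sketch of the NPG/TRPO update, assuming you already have the policy gradient and the Fisher matrix (in practice the solve is done with conjugate gradient rather than explicitly):

```python
import numpy as np

def trpo_step(grad, fisher, max_kl=0.01):
    """Natural-gradient direction F^{-1} g, scaled so the quadratic KL approximation stays below max_kl."""
    direction = np.linalg.solve(fisher, grad)            # F^{-1} g
    quad = grad @ direction                              # g^T F^{-1} g
    step_size = np.sqrt(2.0 * max_kl / (quad + 1e-8))    # largest step with 0.5 * s^T F s <= max_kl
    return step_size * direction
```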
UnusualClimberBear t1_iyvpzci wrote
Reply to [D] Score 4.5 GNN paper from Muhan Zhang at Peking University was amazingly accepted by NeurIPS 2022 by Even_Stay3387
Because the area chair is the one making the recommendation, and he managed to convince his senior area chair. You can indeed suspect collusion, but without reading the paper, judging from the reviews it looks like a typical paper in the 10%-60% quality quantile, and at this level acceptance is pretty random.
UnusualClimberBear t1_iylwrfc wrote
ID3 classically builds a tree by maximizing information gain at each split, thus discarding some irrelevant variables.
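A minimal sketch of the information-gain computation ID3 greedily maximizes (my own illustration):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, feature_values):
    """Entropy reduction from splitting on a categorical feature; ID3 picks the feature maximizing this."""
    n = len(labels)
    remainder = sum(
        (np.sum(feature_values == v) / n) * entropy(labels[feature_values == v])
        for v in np.unique(feature_values)
    )
    return entropy(labels) - remainder
```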
You may also be interested in energy-based models.
UnusualClimberBear t1_irzxolu wrote
It is unlikely that you will personally shine at DeepMind. If you think the startup is doing the right thing and that you can have an impact there, then it is probably the better career choice.
UnusualClimberBear t1_iqna2rw wrote
Reply to [Discussion] If we had enough memory to always do full batch gradient descent, would we still need rmsprop/momentum/adam? by 029187
The full gradient does not work well for NNs. Plus, Adam keeps a coarse estimate of the curvature, so it is closer to a second-order method, even if you can find functions where its estimates are poor.
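A stripped-down sketch of the Adam update, to show where that coarse curvature estimate lives (the second-moment term in the denominator):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; v tracks squared gradients, a cheap diagonal stand-in for curvature."""
    m = b1 * m + (1 - b1) * grad           # first moment (momentum)
    v = b2 * v + (1 - b2) * grad ** 2      # second moment (per-parameter scale, the "curvature" proxy)
    m_hat = m / (1 - b1 ** t)              # bias correction (t starts at 1)
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```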
UnusualClimberBear t1_jd9109w wrote
Reply to comment by [deleted] in [D] ICML 2023 Reviewer-Author Discussion by zy415
First, they know that publication is now a big circus and that most papers are clever solutions to problems that don't exist, or beautiful explanations that cannot be leveraged. Acceptance is random if your work is not in the top 2% but still in the top 60%.
Publication as proof of work is toxic