pyepyepie

pyepyepie t1_jc7l2m4 wrote

Write me a poem about right-wing vs left-wing:

Right wing, left wing,
Politics divide us and make us see.
Different views, different opinions,
One thing is clear, it's a battle royale.

The right with its conservative views,
Firm in its beliefs, it sticks to the rules.
The left, ever so liberal,
In favor of social justice, it won't relent.

Both sides struggle to gain the upper hand,
Each thinking their way is the right way.
One thing is clear, we can't agree,
So let us compromise and find a way.

In the end, it's the people that matter,
Not the right or the left, that will make the matter.
Let us come together and make a stance,
Unity and respect will be our gain.

So right and left, let us find a way,
To come together and make a new day.

ChatGPT:

Right-wing and left-wing, two sides of a coin
Each with its own ideology to join
One touting tradition, the other progress
Each thinks their way is the best

The right-wing values individuality
Free markets and small governmentality
They believe in personal responsibility
And defend their rights and liberty

The left-wing sees society as a whole
Inequality and injustice they want to control
They promote social welfare and public good
And want to create a fairer neighbourhood

Right-wing and left-wing, different as can be
Both with their own views, but not easy to agree
Their debates can be fierce, their opinions strong
But a balance between them is where we belong

So let's listen to each other, let's try to understand
For a democracy to flourish, we need both sides at hand
We can't dismiss one, for the other to prevail
We need both wings to make our society sail.

1

pyepyepie t1_jbz9363 wrote

The TLDR of XAI is that you can "see" (or think you see) how features influence the decisions of your models. For example, if you have a sentence "buy this pill to get skinny!!!!!" and you try to classify whether it's spam, the "!!!" might be marked as very spammy. You often find this by masking the "!!!" and seeing that the message is maybe no longer classified as spam (often you look at the output distribution). Of course, there are many more sophisticated methods and a lot of impressive work in this area, but that's the TLDR.
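Something like this toy sketch is what I mean (a minimal masking example, assuming a scikit-learn text classifier; the tiny spam dataset below is made up for illustration):

```python
# Minimal sketch of the masking idea above. The toy dataset and pipeline
# are made up for illustration; any text classifier would do.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["buy this pill to get skinny !!!!!", "meeting moved to 3pm",
               "win money now !!!!!", "see you at lunch"]
train_labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

# Keep punctuation tokens so "!!!!!" is a feature of its own.
clf = make_pipeline(CountVectorizer(token_pattern=r"\S+"), LogisticRegression())
clf.fit(train_texts, train_labels)

text = "buy this pill to get skinny !!!!!"
masked = text.replace("!!!!!", "")  # "perturb" the input by removing one feature

p_original = clf.predict_proba([text])[0, 1]
p_masked = clf.predict_proba([masked])[0, 1]
# The drop in spam probability is a crude estimate of how "spammy" the "!!!" is.
print(f"P(spam) with '!!!': {p_original:.2f}, without: {p_masked:.2f}")
```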

There are many explainability methods; it's a very hot topic. This might be yet another paper, or not. The title makes no sense at all - there are a gazillion explainability methods for transformers. I am sorry, I did not read all of the paper, so I should probably not talk too much. It just looks very similar to things I have already seen.

Generally speaking, you should start using XAI if you do ML. If you do NLP, look into the proven methods first, e.g. SHAP and LIME. If you work with trees, look into TreeSHAP. If you work with vision, look into what I shared here. Sorry if my preceding comments were inaccurate, but I hope I still provide some value here :).
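For the tree case, a minimal TreeSHAP sketch looks roughly like this (assuming the shap package and a scikit-learn forest; the toy data is made up):

```python
# Minimal TreeSHAP sketch. Assumes the `shap` package is installed;
# the toy data below is made up purely for illustration.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # label driven mostly by feature 0

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])  # per-feature contributions for 5 rows
print(shap_values)
```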

2

pyepyepie t1_jbz57hd wrote

To be fair, the paper looks interesting. The news title is garbage, but that's not the fault of the authors, who did a pretty cool job. Anyway, it seems like a nice application of a very well-known idea, which is cool.

By the way, is measuring the perturbation's influence on the loss a common idea? I am mostly aware of using perturbations to see how the regression value or class probabilities change - and the perturbation is done on the inputs, not the params (edit: incorrect, they do the perturbation on the inputs).

edit: "We follow the results of the studies [Koh and Liang, 2017; Bis et al., 2021] to approximate the perturbation effect directly through the model's parameters when executing Leaving-One-Out experiments on the input. The influence function estimating the perturbation of an input z is then derived as:" - it seems I misunderstood it due to their notation. Seems like a pretty standard method.
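To illustrate the simpler version I had in mind (not the paper's influence-function machinery): perturb an input and compare how the loss and the class probabilities move. The data and model below are made up:

```python
# Toy sketch of input perturbation (not the paper's influence functions):
# nudge one input feature and compare the change in loss vs. in probabilities.
# Data and model are made up for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 2] > 0).astype(int)
model = LogisticRegression().fit(X, y)

x = X[:1].copy()
x_pert = x.copy()
x_pert[0, 0] += 0.5  # small perturbation on one input feature

p, p_pert = model.predict_proba(x)[0, 1], model.predict_proba(x_pert)[0, 1]
loss = log_loss([y[0]], [[1 - p, p]], labels=[0, 1])
loss_pert = log_loss([y[0]], [[1 - p_pert, p_pert]], labels=[0, 1])
print(f"delta P(class=1): {p_pert - p:+.3f}, delta loss: {loss_pert - loss:+.3f}")
```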

1

pyepyepie t1_jbu75ec wrote

Let's agree to disagree. Your example shows random data, while I am talking about how much of the information in the data your plot actually shows after dimensionality reduction (you can't know).

Honestly, I am not sure what your work actually means since the details are kept secret. I think you could shut me up by reporting a little more or releasing the data - but more importantly, it would make your work a significant contribution.

Edit: I would like to see a comparison of the plot with a very simple method, e.g. the mean of word embeddings. My hypothesis is that it would look similar as well.
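To be concrete, this is the kind of baseline I mean (assuming gensim and its downloader; the GloVe model name is just one example of pretrained vectors):

```python
# Sketch of the "mean of word embeddings" baseline. Assumes gensim and its
# downloader; "glove-wiki-gigaword-50" is just one example of pretrained vectors.
import numpy as np
import gensim.downloader as api

word_vectors = api.load("glove-wiki-gigaword-50")  # pretrained word embeddings

def sentence_embedding(sentence: str) -> np.ndarray:
    """Average the word vectors of the in-vocabulary tokens."""
    tokens = [t for t in sentence.lower().split() if t in word_vectors]
    if not tokens:
        return np.zeros(word_vectors.vector_size)
    return np.mean([word_vectors[t] for t in tokens], axis=0)

emb = sentence_embedding("a very simple sentence embedding baseline")
print(emb.shape)  # (50,)
```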

11

pyepyepie t1_jbu3245 wrote

I think you misunderstood my comment. What I am saying is that since you have no way to measure how well UMAP worked and how much of the variance of the data this plot contains, the fact that it "seems similar" means nothing (I am really not an expert on it; if I get it wrong, feel free to correct me). Additionally, I am not sure how balanced the dataset you used for classification is, or whether sentence embeddings are even the right approach for that specific task.

It might be the case, for example, that the OpenAI embeddings + the FFW network classify the data perfectly (or as well as possible) because the dataset is very imbalanced and the annotation is imperfect or the categories are very similar. In that case, 89% vs 91% could be a huge difference. In fact, for some datasets the "majority classifier" would yield high accuracy, so I would start by reporting precision & recall.
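Concretely, something like this is what I mean by checking a majority baseline and reporting precision & recall (scikit-learn, with made-up imbalanced data):

```python
# Sketch of comparing against a majority-class baseline and reporting
# precision & recall instead of accuracy alone (toy imbalanced data below).
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] > 1.3).astype(int)  # ~90% of labels are 0, so accuracy alone misleads
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

majority = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
model = LogisticRegression().fit(X_tr, y_tr)

print("majority baseline accuracy:", majority.score(X_te, y_te))
print(classification_report(y_te, model.predict(X_te), digits=3))
```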

Again, I don't want to be "the negative guy", but there are serious flaws that prevent me from drawing any conclusion from this (and I find the project very important and interesting). Could you release the data of your experiments (vectors, dataset) so other people (I might as well) can look into it more deeply?

7

pyepyepie t1_jbtsc3n wrote

Your plot doesn't mean much - when you use UMAP you can't even measure the explained variance, and differences can be more nuanced than what you get from the results. I would evaluate with some semantic similarity or ranking task.
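To illustrate the explained-variance point: PCA tells you how much variance a 2D plot keeps, while UMAP exposes no comparable number (sketch assumes scikit-learn and umap-learn, with made-up data):

```python
# Sketch: PCA reports how much variance a 2D projection keeps; UMAP exposes
# no comparable quantity. Assumes scikit-learn and umap-learn; toy data below.
import numpy as np
from sklearn.decomposition import PCA
import umap

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 128))  # stand-in for sentence embeddings

pca_2d = PCA(n_components=2).fit(X)
print("variance explained by the 2D PCA plot:", pca_2d.explained_variance_ratio_.sum())

umap_2d = umap.UMAP(n_components=2, random_state=0).fit_transform(X)
print(umap_2d.shape)  # (500, 2) - but no explained-variance analogue to report
```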

For the "91% vs 89%": you need to pick the classification task very carefully, and if you don't describe what it was, then it also literally means nothing.

That being said, thanks for the efforts.

39

pyepyepie t1_jad28c2 wrote

I think it's a cool effort. Regarding feedback: personally, I would use an independent project in production only if I had no alternative. For example, using SHAP was really painful even though the problem it solves is narrower than this one and it has many contributors. That being said, it's a cool educational tool.

1

pyepyepie t1_j9uanug wrote

In all honesty, at some point any type of evaluation that is not qualitative is simply a joke. I observed this a long time ago while working on NMT and trying to base the results on BLEU score - it literally meant nothing. Trying to force new metrics based on simple rules or computation will probably fail - I believe we need humans or stronger LLMs in the loop. E.g., humans should rank the outputs of multiple LLMs, and the same humans should do so for multiple different language models, not just for the new one. Otherwise, I view it as a meaningless self-promoting paper (LLMs are not interesting enough to read about if there are no new ideas and no better performance). Entropy is good for language models that are like "me language model me no understand world difficult hard", not GPT-3-like ones.
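A toy illustration of why I say BLEU "meant nothing": a perfectly reasonable paraphrase can get a near-zero score while a near-verbatim output scores high (NLTK's sentence-level BLEU; the example sentences are made up):

```python
# Toy illustration of why surface metrics like BLEU can mislead: a fine
# paraphrase gets a near-zero score while a near-copy scores high.
# Uses NLTK's sentence-level BLEU; the example sentences are made up.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the cat is sleeping on the sofa".split()
paraphrase = "a feline naps on the couch".split()        # fine translation, low overlap
copy_like = "the cat is sleeping on the couch".split()   # near-verbatim output

smooth = SmoothingFunction().method1
print(sentence_bleu([reference], paraphrase, smoothing_function=smooth))  # ~0
print(sentence_bleu([reference], copy_like, smoothing_function=smooth))   # high
```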

Edit: this semantic uncertainty looks interesting but I would still rather let humans rank the results.

8

pyepyepie t1_j9gnf6y wrote

I have heard this anecdote, but I was kind of hoping for non-trivial cases from everyday life at work. I feel I understand SGD perfectly fine without having learned to solve complicated DEs, but it's probably limiting me on other tasks, or limiting my ability to analyze ML algorithms. Are you sure it's the right hierarchy to say that SGD is rooted in differential equations? I mean, I agree you are right, it is a differential equation, but are the methods you learn in differential equations courses useful for ML?

I found a nice article about the link to SGD: https://tivadardanka.com/blog/why-does-gradient-descent-work - but I am not sure I am convinced (again, I am still an idiot about it, so I shouldn't have any opinion regarding links to differential equations lol - but to me, trying to fit SGD into the framework of differential equations goes against the KISS principle). Sorry if I am going too deep; I am just trying to figure out how much effort to put into it (I could actually study it all day for fun, but we have work and so on) since we only have so much time :)
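For what it's worth, the connection the article draws (as I understand it) is just that gradient descent is the forward-Euler discretization of the gradient-flow ODE dtheta/dt = -grad L(theta); a tiny made-up quadratic example:

```python
# Tiny sketch of the link the article draws (as I understand it): gradient
# descent is the forward-Euler discretization of the gradient-flow ODE
# d(theta)/dt = -grad L(theta). The quadratic loss here is made up.
import numpy as np

def grad_L(theta: np.ndarray) -> np.ndarray:
    return 2 * (theta - np.array([3.0, -1.0]))  # gradient of ||theta - [3, -1]||^2

theta_gd = np.zeros(2)
theta_ode = np.zeros(2)
lr = 0.1  # learning rate == Euler step size

for _ in range(100):
    theta_gd = theta_gd - lr * grad_L(theta_gd)        # gradient descent step
    theta_ode = theta_ode + lr * (-grad_L(theta_ode))  # forward-Euler ODE step

print(theta_gd, theta_ode)  # the two updates coincide; both converge to [3, -1]
```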

Thanks for the answer! Between your message and my own thinking today, I am convinced it's terrible that I don't know it and that I should learn it ASAP.

1

pyepyepie t1_j9fn745 wrote

> Differential Equations

I have a (somewhat) strong math background (I took many math courses with the math departments of the universities I studied at) and a strong SW background (web, then MLE for a few years) - however, I have never used or studied differential equations (god knows why). I understand quite deeply how calculus and linear algebra relate to neural networks, and probability relates to the field everywhere by definition - but could you explain when you actually need knowledge of differential equations? I ask out of ignorance, since I have never studied it. Could you link it to ML concepts which I probably don't understand well because of that gap? Also, I would add optimization to the answer :)

Edit 2: also, how deeply would you suggest learning it? https://www.youtube.com/watch?v=9fQkLQZe3u8 - what do you think about this one?

1

pyepyepie t1_j9fgpej wrote

A week is not enough, dude - try a few more days and maybe you will beat the market! A tip on how you can beat 99% of the people who do ML for the stock market: search for extrapolation with machine learning, and then search for how well it works. You can try "how well does extrapolation work with machine learning". If you feel lazy, you can ask ChatGPT.
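In case it saves someone a search, here is a made-up toy example of the extrapolation problem: a tree-based model fit on x in [0, 1] just predicts a flat value outside the range it has seen:

```python
# Toy sketch of the extrapolation problem: a tree-based model fit on x in
# [0, 1] predicts a flat value outside that range. Data and model choice
# are made up for illustration.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, size=(500, 1))
y_train = 3 * x_train[:, 0] + rng.normal(scale=0.05, size=500)  # roughly y = 3x

model = RandomForestRegressor(random_state=0).fit(x_train, y_train)

x_test = np.array([[0.5], [2.0], [10.0]])
print(model.predict(x_test))  # ~1.5 for x=0.5, but stuck near ~3 for x=2 and x=10
```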

1

pyepyepie t1_j9evz4c wrote

Personally, I think plagiarism is a terrible word to use in this case. I also don't like this shaming of young researchers who seem to come with good intentions. That being said, I don't particularly enjoy reading ML papers. I feel I learn more from math and ML books, and I only read the papers I need for my work, or the classics.

1

pyepyepie t1_j99prs0 wrote

Reply to comment by TeamRocketsSecretary in [D] Please stop by [deleted]

LOL, I don't know what to say. I personally don't have anything smart to say about this question currently; it's as if you asked me whether there is extraterrestrial life. Sure, I would watch it on Netflix if I had time, but generally speaking, it's way out of my field of interest. When you say snake oil, do you mean AI ExPeRtS? Why would you care about it? I think it's good that ML is becoming mainstream.

1

pyepyepie t1_j95f3m2 wrote

I actually think your approach shows the idea better than the original paper. However, the original paper's method can be implemented with smaller language models, which might be better for people who want to deploy it. Overall, I think the application is almost trivial, and I am not surprised it worked well for you (given the crazy power of LLMs).

Great work!

9

pyepyepie t1_j95e9ka wrote

Reply to comment by TeamRocketsSecretary in [D] Please stop by [deleted]

I have implemented GPT-like (transformer) models almost since they came out (not exactly - I worked with the decoder in the context of NMT and with encoders a lot, like everyone who does NLP, so not GPT-like per se, but I understand the tech) - and I also argue you guys are just guessing. Do you understand how funny it looks when people claim what it is and what it isn't? Did you talk with the weights?

Edit: what I agree with is that this discussion is a waste of time in this sub.

2

pyepyepie t1_j93fd53 wrote

Reply to comment by goolulusaurs in [D] Please stop by [deleted]

You 100% do not deserve to be downvoted. You are also not the one who initiated this (old) discussion; you reacted to the original post.

All you said is that you can't know, that it can't be measured, and that he is literally guessing - which I think is just saying that there is no rigorous way to discuss the topic and you are sick of empty claims - and I 100% agree. It's probably the most responsible take you can have on this subject, in my opinion - get 10000 upvotes from me :)

2

pyepyepie t1_j93btxb wrote

Reply to comment by csreid in [D] Please stop by [deleted]

I agree, and there are no stupid questions! So you might be a good programmer or ML engineer, but then you start studying chess and now you are the idiot asking stupid questions (or getting downvoted because you used the incorrect term). I really like your comment.

0