Comments

Kapri111 t1_j8qxq26 wrote

I've worked on some of those topics, but from a human-computer interaction perspective: for example, how sentiment analysis distorts information perception.

1

Mikarz t1_j8r11wh wrote

If you’re going to need an NLP-related dataset, go to https://aclanthology.org (THE database for NLP research) and search “Reddit dataset” along with some keywords that you’re interested in. Read the papers. There are loads of annotated Reddit datasets out there. Good luck with your thesis.

1

suflaj t1_j8qwt5d wrote

People usually create datasets when they work on something new. I don't know why you would think that the mere existence of a dataset means you can't do anything with it, or that you even need to outperform anything.

0

mems_m OP t1_j8qwyhk wrote

They want us to find an existing dataset because of the short time we have, and novelty is a big part of the assessment.

1

suflaj t1_j8qx0qv wrote

As I've said, there is no reason you can't do something novel with that. You just can't do what someone else has already done with it.

3

mems_m OP t1_j8qx61s wrote

The thing is that almost everything I can do has already been done on the public datasets I find.

1

suflaj t1_j8qxasd wrote

That's more an issue with how you're searching. You mention sentiment analysis, for example, but that is a problem that has been considered solved for years. There is no novelty to add here besides a bigger model.

Obviously you need to stop looking at what people have done, and start looking at what they skipped or did poorly in the process. One such thing is tokenization of text. You can't tell me that it's all figured out.

5

timelyparadox t1_j8qxcss wrote

Yes, and finding small, novel things to do is a big part of how you show you are worth a master's degree.

2

redflexer t1_j8ryw02 wrote

Actually, I find this notion harmful. I expect senior PhD students to be able to assess whether an idea in their field is novel, feasible, and in the right scope given fixed resources. I would never expect that from master's students. That of course does not mean that students can’t have great ideas, but it’s not mandatory for a degree.

1

mems_m OP t1_j8qx1eg wrote

Novelty could be in the data or in the methods applied.

1

2blazen t1_j8r3le4 wrote

You'd want to find a more in-depth topic for a master's thesis. Reddit scraping and sentiment analysis sounds more like an assignment. Ask your supervisor whether they have a topic they're researching, and whether you can join. See if your university has example projects or, even better, open projects. Look through past years' theses to see if you can continue working on any of them (hint: the future work section). Once you find a topic that interests you and is niche enough, it will still be too broad, so you have to narrow it down to research questions. For that, you have to start researching the challenges of the topic in depth.

Don't panic, there are many topics that need research. I'm starting my thesis in audio processing (health AI / speaker embeddings / impaired speech / diagnosis assistance) and it's the wild west over here, though that's partially because the data is not publicly accessible.

0