Submitted by Kebet-Mendez t3_10fz4b4 in MachineLearning

Let's say I have a dataset of real estate listings. One column holds the text describing the listing, and another holds, for example, the number of rooms. In most cases, the number of rooms appears in both places: in the description text and in the dedicated column.

But for some observations, the number of rooms is in the description text but not in the column "number of rooms". So I have missing data.

I could try to fill in the missing data by applying regex to the description text, but the number of possible phrasings seems too big.
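For reference, a regex baseline might look like the rough sketch below. The pattern, the `WORD_TO_NUM` lookup, and the `extract_rooms` helper are all made up for illustration, and the sketch assumes English descriptions that mention "rooms" or "bedrooms" right after the count. Every other phrasing needs yet another pattern, which is exactly the problem.

```python
import re

# Hypothetical lookup for spelled-out numbers; real listings would need more.
WORD_TO_NUM = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
               "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

# A digit or number word, optional space/hyphen, then "room(s)" or "bedroom(s)".
ROOM_PATTERN = re.compile(
    r"\b(\d+|" + "|".join(WORD_TO_NUM) + r")[\s-]*(?:bed)?rooms?\b",
    re.IGNORECASE,
)

def extract_rooms(description):
    """Return the first room count found in the text, or None if nothing matches."""
    match = ROOM_PATTERN.search(description)
    if match is None:
        return None
    token = match.group(1).lower()
    return int(token) if token.isdigit() else WORD_TO_NUM[token]

print(extract_rooms("Bright 3-room flat near the park"))   # 3
print(extract_rooms("Cozy house with two bedrooms"))       # 2
print(extract_rooms("Spacious family home, great views"))  # None
```

Note that this treats "rooms" and "bedrooms" as the same thing, which may not match what the dedicated column actually counts.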

Is there a machine learning technique in NLP that allows me to do that? Since the data is present in both columns for most observations, the dataset is "naturally labelled".

If there is, what are these techniques called? I would like to search for them, but I don't know the proper keywords to google.

5

Comments

Dear-Acanthisitta698 t1_j4zqkv8 wrote

Text QA (question answering) might work. Give the description as the passage and ask a question such as "How many rooms does this house have?".
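A minimal sketch of that idea, assuming each listing is a dict with hypothetical `description` and `rooms` keys and that the QA model is passed in as a plain function `qa_fn(question, context)` returning the answer text. The helper names are illustrative. The rows that already have a room count serve as a free test set for checking how well the model does before trusting it on the missing ones.

```python
import re

QUESTION = "How many rooms does this house have?"

def parse_count(answer_text):
    """Pull the first integer out of a QA answer span, or return None."""
    match = re.search(r"\d+", answer_text)
    return int(match.group()) if match else None

def fill_missing_rooms(rows, qa_fn):
    """Return copies of rows, filling 'rooms' via qa_fn where it is None."""
    filled = []
    for row in rows:
        rooms = row["rooms"]
        if rooms is None:
            rooms = parse_count(qa_fn(QUESTION, row["description"]))
        filled.append({**row, "rooms": rooms})
    return filled

def accuracy_on_labelled(rows, qa_fn):
    """Fraction of already-labelled rows where qa_fn recovers the known count."""
    labelled = [r for r in rows if r["rooms"] is not None]
    if not labelled:
        return None
    hits = sum(
        parse_count(qa_fn(QUESTION, r["description"])) == r["rooms"]
        for r in labelled
    )
    return hits / len(labelled)
```

With the Hugging Face `transformers` library installed, one option for `qa_fn` would be to wrap `pipeline("question-answering")`, for example `lambda q, c: qa(question=q, context=c)["answer"]`. Answers spelled out in words ("three") would need extra handling in `parse_count`.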

2