
CVxTz t1_j40ohf4 wrote

You need a dataset of a few thousand to a few million examples of inputs (documents plus other contextual info like location data) and outputs (estimates and other attributes like number of bedrooms) in order to build such a feature. Depending on the quality and amount of data you have and on your performance requirements, this can range from a project of a few months to nearly impossible. (Note: if you have no data, as you said, or expect zero error, then this is impossible to do.)

1

CuriousCesarr OP t1_j41at7f wrote

As far as I can tell, a few hundred homes would be available to use as a dataset.

1

NamerNotLiteral t1_j41cbv0 wrote

A few hundred is way too few. I would be comfortable with a few thousand homes' data, and more comfortable yet if I could scrape Zillow or something on top of that.

(but that has its own issues, both legally and in terms of data drift, since Zillow data would be American while you're European).

1

[deleted] t1_j41d9av wrote

[deleted]

1

NamerNotLiteral t1_j41dryy wrote

Circumventing bot blocking protocols is a trivial matter.

The potential lawsuit, on the other hand, is not.

2

CuriousCesarr OP t1_j54lfq2 wrote

Sorry for the late reply but I had a very busy period. In the end, I found a small Greek ML company that was excited about the project and we entered deeper discussions. I also updated my post to reflect this. Have a great day! :)

1