Submitted by evanthebouncy t3_zxef0f in MachineLearning

Foundation models can generate realistic images from prompts, but do these models understand their own drawings? Generating SVG (Scalable Vector Graphics) gives us a unique opportunity to ask this question. SVG is programmatic, consisting of shapes such as circles, rectangles, and lines. The model must therefore decompose the target object into meaningful parts, approximate each part using simple shapes, and then arrange the parts together in a meaningful way.
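For illustration, here is a hand-written example (my own sketch, not actual model output) of how "a person" might decompose into SVG primitives: a circle for the head, a rectangle for the torso, and lines for the limbs.

```
<!-- Hand-written illustration (not model output): a person decomposed into primitives -->
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="160" viewBox="0 0 100 160">
  <circle cx="50" cy="25" r="20" fill="peachpuff" stroke="black"/>        <!-- head -->
  <rect x="35" y="45" width="30" height="60" fill="steelblue"/>           <!-- torso -->
  <line x1="35" y1="55" x2="10" y2="90" stroke="black" stroke-width="4"/> <!-- left arm -->
  <line x1="65" y1="55" x2="90" y2="90" stroke="black" stroke-width="4"/> <!-- right arm -->
  <line x1="42" y1="105" x2="35" y2="155" stroke="black" stroke-width="4"/><!-- left leg -->
  <line x1="58" y1="105" x2="65" y2="155" stroke="black" stroke-width="4"/><!-- right leg -->
</svg>
```

The interesting question is whether the model can produce this kind of part-by-part structure on its own, and whether the parts end up in sensible positions relative to each other.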

Check out the blog post (5-min read) for the full report: https://medium.com/p/74ec9ca106b4

tl;dr:
GPT can symbolically decompose an object into parts, is okay at approximating the parts using SVG, is bad at putting the parts together, and is Egyptian.

Happy to take comments and Q&A here :D

--evan


Comments


Shir_man t1_j21rb06 wrote

I did this too some time ago. Cool experiment, and I enjoyed your results.


evanthebouncy OP t1_j22d2ay wrote

Woah the Mona Lisa man himself!

Yeah ofc I'm aware of your work. I think everyone generating SVG has used variants of your prompt. Nice to meet you


Shir_man t1_j23cu16 wrote

Oh, that was unexpected! Thank you for following my experiments; I appreciate it :)


suspicious_Jackfruit t1_j21mu6v wrote

Didn't think about SVG. I got ChatGPT to draw ASCII art instead; it drew itself as a human, but with a larger head.


evanthebouncy OP t1_j22bzqb wrote

Ya, I find that if you ask it to draw X where X isn't something commonly drawn (e.g. you can ask it to draw boogieboogie), it'll default to drawing a person.


shadowylurking t1_j248ivr wrote

This seems super interesting, checking it out!


slashdave t1_j1zwb0a wrote

SVG supports complex shapes via paths, as well as text, embedded images (photos), and complex fills, including gradients.
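For example, here is a minimal hand-written sketch (element and attribute names as in the SVG spec) of a curved path filled with a linear gradient, none of which is covered by circles/rectangles/lines alone:

```
<!-- Minimal sketch: a path filled with a linear gradient -->
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="120" viewBox="0 0 200 120">
  <defs>
    <linearGradient id="sky" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#87ceeb"/>
      <stop offset="100%" stop-color="#ffffff"/>
    </linearGradient>
  </defs>
  <!-- A curved "hill" drawn with a quadratic Bezier path, filled with the gradient above -->
  <path d="M 0 100 Q 100 20 200 100 L 200 120 L 0 120 Z" fill="url(#sky)" stroke="black"/>
</svg>
```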
