They mention it a few times, but kind of hand-wave it away:
> Solutions such as Git LFS [9] and DVC [10] provide a
> light-weight facade for adding large files to Git repositories but do
> not provide sufficient integration to support the needs of industry
> ML datasets as described in Sec. 2.
I'm not sure what they mean by 'sufficient integration', but whatever the insufficiencies, why not address those? Considering all the authors work at XetHub, I'm pretty sure this is an advertisement disguised as a research paper.
theDaninDanger t1_j42vf8i wrote
Reply to comment by PassionatePossum in [R] Git is for Data (CIDR 2023) - Extending Git to Support Large-Scale Data by rajatarya