Companies are training LLMs on all the data they can find, but this data is not the world; it is discourse about the world. The rank-and-file developers at these companies, in their naivete, do not see that distinction…So, as these LLMs become increasingly but asymptotically fluent, tantalizingly close to accuracy but ultimately incomplete, developers complain that they are short on data. They have their general-purpose computer program, and if only they had the entire world in data form to shove into it, then it would be complete.

  • kromem@lemmy.world
    4 months ago

    Something you might find interesting, given our past discussions: the Gospel of Thomas uses the Greek eikon rather than a Coptic word (Coptic being what the rest of the work is written in). Read through the lens of Plato’s ideas of the form of a thing (eidelon), the thing itself, an attempt at an accurate copy of the thing (eikon), and the embellished copy of the thing (phantasm), one of the modern words best translating the philosophical context of eikon in the text would arguably be ‘simulacra.’

    So wherever the existing English translations use ‘image’, replace it with ‘simulacra’ and it will be a more interesting, and likely more accurate, read.

    (Was just double checking an interlinear copy of Plato’s Sophist to make sure this train of thought was correct, inspired by the discussion above.)