Don't you think that LLM inference is already very democratic? I'm not saying there is no room for improvement there -- there is still a lot to do in speculative decoding, quantization and other areas. I'm saying that any 16-year-old with a decent enough personal computer can run, fully locally, one of the latest open-weights models like Llama3-8b, which beats almost everything we had a year ago.
The part of this ecosystem that is as non-democratized as it can be is training. It's currently impossible to train a decent enough model with the resources available to one person.