Beren writes:
"Similarly, we can derive an equivalent of a FLOP count. Each LLM call/generation can be thought of as trying to perform a single computational task – one Natural Language OPeration (NLOP). For the sake of argument, let’s say that generating approximately 100 tokens from a prompt counts as a single NLOP. From this, we can compute the NLOPs per second of different LLMs. For GPT4, we get on the order of 1 NLOP/sec. For GPT3.5 turbo, it is about 10x faster so 10 NLOPs/sec. Here there is a huge gap from CPUs which can straightforwardly achieve billions of FLOPs/sec. However, a single NLOP is much more complex than a CPU processor instruction, so a direct comparison is unfair."
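To make the quoted arithmetic concrete, here is a minimal Python sketch. The 100-tokens-per-NLOP definition and the 1 and 10 NLOPs/sec figures come from the quote. The token rates are simply back-derived from those figures rather than measured, and the function and variable names are hypothetical.

```python
# Working definition from the quote: ~100 generated tokens = 1 NLOP.
TOKENS_PER_NLOP = 100


def nlops_per_sec(tokens_per_sec: float, tokens_per_nlop: int = TOKENS_PER_NLOP) -> float:
    """Convert a generation speed in tokens/sec into NLOPs/sec."""
    return tokens_per_sec / tokens_per_nlop


# Assumption: token rates implied by the quote's ~1 NLOP/sec for GPT-4
# and "about 10x faster" for GPT-3.5 turbo. These are not benchmarks.
gpt4_tokens_per_sec = 100.0
gpt35_tokens_per_sec = 10 * gpt4_tokens_per_sec

gpt4_nlops = nlops_per_sec(gpt4_tokens_per_sec)
gpt35_nlops = nlops_per_sec(gpt35_tokens_per_sec)

# Illustrative stand-in for "billions of FLOPs/sec" on a CPU.
cpu_flops_per_sec = 1e9
gap = cpu_flops_per_sec / gpt4_nlops

print(f"GPT-4:   {gpt4_nlops} NLOPs/sec")
print(f"GPT-3.5: {gpt35_nlops} NLOPs/sec")
print(f"CPU ops per GPT-4 NLOP: {gap:.0e}")
```

The same conversion works for any producer of output, as long as you are willing to guess a tokens-per-second rate for it.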
This got me thinking about how the internet had turned us (well before the advent of LLMs) into a new form of computer. Depending on the risk, context and complexity of a given task, you can recruit other people to perform the same sort of NLOPs through services like Fiverr, much as we now do with an LLM. The ops/sec are extremely low and any single op takes a long time to finish, but for simple graphic design, spreadsheet work, coding and writing, it used to work in basically the same way we now use Midjourney and ChatGPT.
The question remains open whether LLMs will make those people's throughput much higher or replace the market for this kind of human work entirely.