Hacker News

This discussion started with what a data engineer does, and the diversity of roles. I wasn't trying to push my workflow on anyone. With what I'm doing right now (which includes data and SWE, so I'm answering more holistically), there is a flow:

---

Step 1:

- I do a lot of exploratory and one-off analysis, some of which leads to internal memos and the like, and some of which goes nowhere. I do a lot of prototyping too.

- I do a lot of whiteboarding with stakeholders. This is also open-ended and exploratory. I might have a hundred mock-ups before I build something which would go into prod (which isn't a lot of time; a mock-up might be 5 minutes, so a hundred represents a few days' time).

This helps make sure: (1) I have enough flexibility in my architecture to handle likely use-cases, without overengineering for things which will never happen; and (2) I pick the right set of things to build.

---

Step 2:

I build high-fidelity versions of the above. These, I can review e.g. with focus groups, in 1:1s, and in meetings.

---

Step 3:

I build production-ready deployable code. Probably about a third of the things in step 2 reach step 3.

---

LLMs do relatively little for step 3. If I have time, I'll have GPT do a code review. It's sometimes helpful. It sounds like you spend most of your time here, so you might get less benefit than I do.

For step 2, they can often build my high-fidelity mockup for me, which is nice. What they can't do yet is do so in a way which is consistent with the rest of my codebase (front-end theming, code style, tools used, etc.). I'll get something working end-to-end quickly, but not necessarily something I can leverage directly for step 3.

However, in step 1, they've had a transformational impact. Exploratory work is 99% throw-away code (even the stuff which eventually makes it to prod; by that point, it has a clean rewrite).

One more change: in step 1, I can also try different libraries and tools. LLMs are at the level of a very junior programmer, which is still a lot better than I am in a tool I've never used. Evaluating e.g. a library used to mean a couple of days of learning for the equivalent of 5 minutes to 1 day of building (usually just to discover it's useless for my use-case). With an LLM, I have a feasible, lousy first version in minutes. This means I can try a half-dozen libraries in an hour. That didn't fit into my timelines pre-LLM, and definitely does now. I end up using better libraries in my code, which leads to better architecture.

So YMMV.

I'm posting since I like reading stories like the above myself. Contexts vary, and it's helpful to see how things are done in contexts other than my own. If others have them, please feel free to share too.



Hmm. I’m pretty convinced. I often get mired in steps 1 and 2. Partly because of my ADHD but also cuz of the tedium of writing code.

Which LLM are you using? ChatGPT enterprise? Something offline data / sql centric?


ChatGPT a plurality of the time. I go through the API with a little script I wrote. If that doesn't work, I step up to GPT-4. The costs are nominal (<$3/month). ChatGPT has gotten worse over time, so the need to escalate is more frequent; when it first came out, it was excellent.
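
Roughly, the script is this shape (a sketch, assuming the `openai` Python package; the specific model names and the escalation flag are illustrative, not my exact code):

```python
# Sketch of a minimal CLI wrapper around the chat completions API.
# Model names and the `escalate` flag are illustrative assumptions.
import os
import sys

def build_request(prompt, escalate=False):
    """Build the chat-completions payload; use GPT-4 only when escalating."""
    return {
        "model": "gpt-4" if escalate else "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
    }

def main():
    from openai import OpenAI  # pip install openai
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    req = build_request(" ".join(sys.argv[1:]) or sys.stdin.read())
    resp = client.chat.completions.create(**req)
    print(resp.choices[0].message.content)

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    main()
```

Nothing fancy; the point is that piping a prompt in from the shell has far less friction than the web UI, and the payload is trivial to tweak per task.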

API (rather than web) is more convenient and avoids a lot of privacy / data security issues. I wouldn't use it for highly secure things, but most of what I do is open-source, or just isn't that special.

I have analogous scripts to run various local LLMs, but with my setup, the init / cooldown time is long enough that it's easier to use a web API. Plus my GPU is often otherwise occupied. Most of what I use my GPU for is text (not code) tasks, and I find the open-source models good enough there. I've heard worse things about them for code, but I haven't experimented enough to see if they'd be adequate. Some of that is getting used to how the system works, good / bad prompts, etc.

ollama + a second GPU + a running chat process would likely solve the problem for around $2k, so about the equivalent of a bit over a half-century of calls to the OpenAI API. If I were dealing with something secure, that'd probably make sense. For what I'm doing now, it doesn't.
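
The back-of-envelope behind that comparison, using the figures above:

```python
# ~$2k of local hardware vs. <$3/month of API spend.
hardware_cost = 2000          # dollars: second GPU + the rest of the setup
api_cost_per_month = 3        # dollars: upper bound on my monthly API spend
breakeven_years = hardware_cost / api_cost_per_month / 12
print(round(breakeven_years, 1))  # prints 55.6 -- a bit over half a century
```

And that's before electricity, so the real break-even is even further out.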



