Hacker News

This discussion started with what a data engineer does, and the diversity of roles. I wasn't trying to push my workflow on anyone. With what I'm doing right now (which includes data and SWE, so I'm answering more holistically), there is a flow:

---

Step 1:

- I do a lot of exploratory and one-off analysis, some of which leads to internal memos and the like, and some of which goes nowhere. I do a lot of prototyping too.

- I do a lot of whiteboarding with stakeholders. This is also open-ended and exploratory. I might have a hundred mock-ups before I build something which would go into prod (which isn't a lot of time; a mock-up might be 5 minutes, so a hundred represents a few days' time).

This helps make sure: (1) I have enough flexibility in my architecture to handle likely use-cases, without overengineering for things which will never happen; and (2) I pick the right set of things to build.

---

Step 2:

I build high-fidelity versions of the above. These, I can review e.g. with focus groups, in 1:1s, and in meetings.

---

Step 3:

I build production-ready deployable code. Probably about a third of the things in step 2 reach step 3.

---

LLMs do relatively little for step 3. If I have time, I'll have GPT do a code review. It's sometimes helpful. It sounds like you spend most of your time here, so you might get less benefit than I do.

For step 2, they can often build my high-fidelity mockup for me, which is nice. What they can't do yet is do so in a way which is consistent with the rest of my codebase (front-end theming, code style, tools used, etc.). I'll get something working end-to-end quickly, but not necessarily something I can leverage directly for step 3.

However, in step 1, they've had a transformational impact. Exploratory work is 99% throw-away code (even the stuff which eventually makes it to prod; by that point, it has a clean rewrite).

One more change: in step 1, I can also try different libraries and tools. LLMs are at the level of a very junior programmer, which is still a lot better than I am in a tool I've never used. Evaluating e.g. a library used to mean a couple of days of learning for the equivalent of 5 minutes to 1 day of building (usually just to discover it's useless for my use-case). With an LLM, I have a feasible, lousy first version in minutes. This means I can try a half-dozen libraries in an hour. That didn't fit into my timelines pre-LLM, and definitely does now. I end up using better libraries in my code, which leads to better architecture.

So YMMV.

I'm posting since I like reading stories like the above myself. Contexts vary, and it's helpful to see how things are done in contexts other than my own. If others have them, please feel free to share too.



Hmm. I’m pretty convinced. I often get mired in steps 1 and 2. Partly because of my ADHD but also cuz of the tedium of writing code.

Which LLM are you using? ChatGPT enterprise? Something offline data / sql centric?


ChatGPT a plurality of the time. I go through the API with a little script I wrote. If that doesn't work, I step up to GPT-4. The costs are nominal (<$3/month). ChatGPT has gotten worse over time, so the need to escalate is more frequent; when it first came out, it was excellent.
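
Roughly, the script is this shape (a sketch, assuming the `openai` Python package; the specific model names and the escalation flag are illustrative, not my exact code):

```python
# Sketch of a minimal CLI wrapper around the chat completions API.
# Model names and the `escalate` flag are illustrative assumptions.
import os
import sys

def build_request(prompt, escalate=False):
    """Build the chat-completions payload; use GPT-4 only when escalating."""
    return {
        "model": "gpt-4" if escalate else "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
    }

def main():
    from openai import OpenAI  # pip install openai
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    req = build_request(" ".join(sys.argv[1:]) or sys.stdin.read())
    resp = client.chat.completions.create(**req)
    print(resp.choices[0].message.content)

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    main()
```

Nothing fancy; the point is that piping a prompt in from the shell has far less friction than the web UI, and the payload is trivial to tweak per task.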

API (rather than web) is more convenient and avoids a lot of privacy / data security issues. I wouldn't use it for highly secure things, but most of what I do is open-source, or just isn't that special.

I have analogous scripts to run various local LLMs, but with my setup, the init / cooldown time is long enough that it's easier to use a web API. Plus my GPU is often otherwise occupied. Most of what I use my GPU for is text (not code) tasks, and I find the open-source models good enough there. I've heard worse things about them for code, but I haven't experimented enough to see if they'd be adequate. Some of that is getting used to how the system works, good / bad prompts, etc.

ollama + a second GPU + a running chat process would likely solve the problem for around $2k, so about the equivalent of a bit over a half-century of calls to the OpenAI API. If I were dealing with something secure, that'd probably make sense. For what I'm doing now, it doesn't.
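
The back-of-envelope behind that comparison, using the figures above:

```python
# ~$2k of local hardware vs. <$3/month of API spend.
hardware_cost = 2000          # dollars: second GPU + the rest of the setup
api_cost_per_month = 3        # dollars: upper bound on my monthly API spend
breakeven_years = hardware_cost / api_cost_per_month / 12
print(round(breakeven_years, 1))  # prints 55.6 -- a bit over half a century
```

And that's before electricity, so the real break-even is even further out.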



