I agree it's way too hard as a one-hour interview question! When I interviewed a...

layer8 · on Aug 12, 2022

I’ve implemented glob matchers by regex-replacing the glob pattern, turning it into a regex, and then using that regex. I wonder how that would have gone in an interview. :)

burntsushi · on Aug 12, 2022

This is actually how glob matching is currently implemented in ripgrep: https://github.com/BurntSushi/ripgrep/blob/a9d97a1dda0a07a8e...

Works quite well because it can take advantage of the host of optimizations inside the regex engine.

IshKebab · on Aug 12, 2022

Honestly even that is way too hard. The people who originally came up with DFA and NFA engines and all that didn't do it under pressure in 40 minutes. You're basically selecting for people who have done it before.

naniwaduni · on Aug 12, 2022

But Dijkstra's shortest-paths algorithm is, of course, a perfect interview question, having infamously been derived in half an hour over a coffee date with his fiancée.

google234123 · on Aug 12, 2022

This is completely unfair comparison. Deriving is not comparable to reciting/applying something... Dijkstra's is pretty intuitive once you know it and about 10 or 12 line of python code? Quantum mechanics took years and multiple geniuses to figure out yet somehow undergrads and graduate students are able to learn it in one or two semesters.

IshKebab · on Aug 13, 2022

I'm not sure I follow - interview questions are supposed to be about deriving things, not reciting/applying them. In any case I'm pretty sure he was being sarcastic. You aren't hiring Dijkstra.

google234123 · on Aug 13, 2022

You are right. However they also test that you know and are able apply and recite the fundamentals.

compsciphd · on Aug 13, 2022

we probably got asked the same Q, but I believe pike's solution is close to the optimum solution to the google interview Q (in terms of being able to code out on a board in a short period of time).

If I remember correctly I was asked to be able to do a regex that handled . ? and *

The problem I have with this question is that if you spend 10-15 minutes going down a very wrong path, it's very very difficult to correct yourself in a pressurized environment.

I've tried to take the question and reformat it.

i.e.

I ask, what language do you want to program in (presuming I'm comfortable with the languages they want to use, otherwise I wouldn't be in the room, as wouldn't be a fair evaluation).

I give them the basic "strcmp" from pike's implementation (i.e. move character by character matching). i.e. something like

  int match(char *s, char *r) {
    if (*s == '\0' && *r == '\0') {
      return 1;
    } else if (*s == *r) {
      return match(s+1, r+1);
    } else {
      return 0;
    }
  }

First Q. What does this do? (explained above)

Second Q. how could we modify it work to match a substring (my explanation would be to just iterate over text like pike, but as with everything need to be open to where the interview takes you, but sometimes might need to say "that works for this, but for the future Qs, might want to think of it in this way", and give them some code, as again, I dont want them to code themselves into a corner that they don't have time to get out of).

Third Q. how could we add anchor support (ala regex, with a bit of explanation if they don't know what they are) (ala pike, determines where we are in the text, ala for ^ no iterating, and for $ we should be at a '\0' in the text

Fourth Q. how could we add "." support (just a || change to the primary matching condition, with being smart to check for '\0' which shouldn't match, but in a pressurized environment they might not get that without prodding right away)

Fifth Q. how could we add "?" support (now we have to look ahead a bit ala what pike does with )

Sixth Q (and the hardest), how could we add support.

I want to see if they can take some existing code, digest it and then see how they think to expand it. As there are many deliverables along the way, I'd hope I'd get a decent amount of "signal" even if they can't get to "?" or "*"

I also give them the link to kernighan's discussion on pike's code at the end.

if people have critiques of this approach to the Q (or further ways to improve it, I'm open)