Vijaye Raji urges skill shift to AI work

- Vijaye Raji used two May 2 X posts to tell software engineers the job is shifting from hand-writing SaaS code to designing and verifying AI systems.
- The concrete tell is where he pointed people next: free OpenAI and Microsoft materials on prompt engineering, evals, and model testing workflows.
- That matters because major AI platforms now treat variability as normal, and systematic evaluation as part of shipping.

Software engineering is getting a new center of gravity. Not because coding disappears, but because AI systems change what “good engineering” looks like. In two May 2 posts on X, Vijaye Raji argued that engineers should stop thinking like pure SaaS builders and start thinking like people who design, test, and supervise probabilistic systems. He paired that with a second post pushing free learning resources from OpenAI and Microsoft, basically a reading list for the transition. (developers.openai.com)

### Who is Raji talking to?

He’s talking to working software engineers, especially people whose instincts were shaped by classic SaaS. In that world, you write deterministic code, define the edge cases, and expect the same input to produce the same output. Raji’s point is that AI apps don’t behave like that. The model is part of the product, and the model is variable by design. That changes the craft. (developers.openai.com)

### What’s the actual shift?

The shift is from “produce code” to “produce reliable behavior.” That sounds subtle, but it’s a big deal. In an LLM stack, the hard part often isn’t writing one more function. It’s choosing the model, structuring prompts, controlling context, handling failure modes, and deciding what the system should do when […] (developers.openai.com) […] variability means traditional software testing is not enough. (developers.openai.com)

### Why do evals keep coming up?

Because evals are the bridge between a cool demo and a dependable product. OpenAI frames an evaluation as a test with grading logic for model outputs, and its developer docs push teams to build evals programmatically and use them to compare prompts, models, and parameters over time. Microsoft is pushing t[…] (developers.openai.com) […] safety, and set thresholds before release. (platform.openai.com)

### Why is that different from normal QA?

Normal QA assumes the software should do the same thing every time. AI QA assumes the answer may vary, so the job becomes statistical and behavioral. You don’t just ask “did the function return the right value?” You ask whether the system stayed grounded, followed instructions, called tools correctly, avoided h[…] (platform.openai.com) […]-in evaluators now cover quality, safety, reliability, and agent behavior, and that tells you where the field is heading. (learn.microsoft.com)

### So is prompt engineering the whole job now?

No, and that’s the part people often miss. Prompting matters, but the stronger signal in the tooling is systems thinking. Microsoft still teaches prompt engineering as a core skill, but it places it inside a larger workflow of experimentation, evaluation, observability, and deployment. (learn.microsoft.com) […] are how you know whether a change actually helped. (learn.microsoft.com)

### Why share OpenAI and Microsoft resources?

Because they are the mainstream playbooks now. If you want to learn this shift without paying for a bootcamp, the official docs are unusually practical. OpenAI has guides for evals, datasets, and evaluation best practices. Microsoft has parallel material on prompt engineering, evalua[…] (learn.microsoft.com) […] because the market is still sorting out the new baseline skill set. (developers.openai.com)

### What does this mean for engineers right now?

It means the durable skill is judgment. You still need to code, but the premium moves toward decomposition, architecture, test design, and failure handling. The engineer who can make an AI system observable and trustworthy is more valuable than the engineer who only ships raw output fast. That’s not hype; it’s where the platform docs themselves are putting the work. (developers.openai.com)

### Bottom line?

Raji’s posts matter because they capture a real turn in the job. The AI era is not just “learn prompts.” It’s “learn to run a messy model like production software,” with evals, thresholds, monitoring, and system design doing the heavy lifting. (developers.openai.com)
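
### What does a minimal eval look like?

Here is a rough sketch of the eval-and-threshold idea described above, not any vendor's actual API. Everything in it is an assumption made for illustration: `fake_model` is a hypothetical stand-in for a real LLM call, the two test cases are invented, the grader is a simple substring check, and the 0.8 release threshold is arbitrary. The point is only the shape of the workflow: run each case many times, grade every output, and gate release on a pass rate instead of a single run.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    expected: str  # substring the output must contain to count as a pass


def fake_model(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call: usually right, sometimes not."""
    answers = {"Capital of France?": "Paris", "2 + 2?": "4"}
    if rng.random() < 0.1:  # simulate run-to-run variability
        return "I'm not sure."
    return f"The answer is {answers[prompt]}."


def grade(output: str, case: EvalCase) -> bool:
    """Grading logic (assumed): pass if the expected substring appears."""
    return case.expected in output


def pass_rate(
    model: Callable[[str, random.Random], str],
    cases: List[EvalCase],
    trials: int,
    seed: int = 0,
) -> float:
    """Run every case `trials` times and return the fraction of passes."""
    rng = random.Random(seed)
    results = [
        grade(model(case.prompt, rng), case)
        for case in cases
        for _ in range(trials)
    ]
    return sum(results) / len(results)


def release_gate(rate: float, threshold: float) -> bool:
    """Ship only if the measured pass rate meets the agreed threshold."""
    return rate >= threshold


# Invented example data and an arbitrary threshold, for illustration only.
CASES = [EvalCase("Capital of France?", "Paris"), EvalCase("2 + 2?", "4")]
THRESHOLD = 0.8

rate = pass_rate(fake_model, CASES, trials=50)
print(f"pass rate: {rate:.2%}")
print("ship" if release_gate(rate, THRESHOLD) else "hold")
```

Swapping `fake_model` for a real client call and rerunning the same cases after each prompt or model change is one way to see whether a change actually helped. A real setup would likely need richer graders than a substring match.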

Shared from Scout - Be the smartest in the room.