
Syscalls pt.0 - Understanding syscalls

os_internals, reverse_engineering, debugging

What is a syscall?

A syscall is the mechanism a user-space program uses to ask the kernel to do something privileged for it.

That definition is simple, but in practice people use the word "syscall" to refer to several different things at once:

  • the CPU instruction that switches into kernel mode
  • the OS service being requested
  • the number used to identify that service
  • the user-mode stub that prepares the call

Those are related, but they are not the same thing.
If you mix them up, things become confusing very quickly.

I think the cleanest way to understand syscalls is to start from the basic question:

Why do they exist at all?

Why do syscalls exist?

A normal program does not run with full privileges.

It cannot directly:

  • talk to hardware however it wants
  • schedule threads
  • map arbitrary memory pages with kernel privileges
  • access files by bypassing the operating system
  • manipulate other processes freely

If any user program could do that directly, the system would be chaos.

So modern operating systems split execution into at least two worlds:

  • user mode, where normal applications live
  • kernel mode, where the operating system enforces rules and manages resources

When a program needs something that belongs to the kernel world, it has to ask for it through a controlled interface.

That interface is the syscall boundary.

The basic idea

At a high level, the flow usually looks like this:

flowchart LR
    A["Your program"] --> B["User-mode library / API"]
    B --> C["Syscall stub"]
    C --> D["CPU transition to kernel mode"]
    D --> E["Kernel dispatcher"]
    E --> F["Kernel subsystem"]

This is the part that stays conceptually the same across platforms.

What changes is:

  • the library layer
  • the calling convention
  • the instruction used to enter the kernel
  • the numbering and naming of the services
  • the amount of abstraction each OS puts in front of you

And this is why people who learn syscalls only from one OS often get lost when they move to another one.

Linux syscalls

Linux is usually the easiest place to learn this topic.

If you call something like read(), write() or mmap(), you are typically calling a libc wrapper that will eventually perform the real syscall into the kernel.

When people study "raw syscalls" on Linux, they often skip libc and go directly to the kernel interface:

  • put the syscall number in the right register
  • place the arguments in the expected registers
  • execute the kernel-entry instruction (syscall on x86_64, svc on ARM64)
  • read the return value

This is why Linux examples are everywhere in shellcode posts and low-level PoCs.
The model is relatively direct, and the barrier between the user API and the kernel interface feels thinner.
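
As a rough illustration of those four steps, here is a minimal Python sketch. It assumes Linux on x86_64 or aarch64 and goes through libc's generic syscall() function, so it is not a truly raw kernel entry: libc still loads the registers and executes the instruction. What it does make visible is the "number plus arguments in, return value out" model. The helper name raw_write and the SYS_WRITE table are mine; the two numbers are the write entries from the x86_64 and arm64 syscall tables.

```python
import ctypes
import os
import platform
import sys

# Number of write() in the Linux syscall table of two architectures.
# Other architectures use other numbers, so they are deliberately left out.
SYS_WRITE = {"x86_64": 1, "aarch64": 64}


def raw_write(fd, data):
    """Call write() by number through libc's generic syscall() function.

    Returns the number of bytes written, or None where this sketch does not
    apply (not Linux, or an architecture missing from SYS_WRITE).
    """
    nr = SYS_WRITE.get(platform.machine())
    if not sys.platform.startswith("linux") or nr is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    # Step 1: the syscall number. Step 2: the arguments.
    # Step 3 (the entry instruction) happens inside libc's syscall().
    # Step 4: the return value comes back to us.
    return libc.syscall(
        ctypes.c_long(nr),
        ctypes.c_long(fd),
        ctypes.c_char_p(data),
        ctypes.c_size_t(len(data)),
    )


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    print("bytes written:", raw_write(write_end, b"hello"))
```

Compare this with os.write(fd, data): the result is the same, but there the number and the register setup stay hidden inside the wrapper.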

That does not mean it is universal.

Even inside Linux, syscall details depend on:

  • the architecture
  • the ABI
  • the syscall table of that architecture
  • how errors are reported
  • what the specific kernel supports

So "Linux syscalls" are already not one single thing. They are just easier to reason about at first.
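
To make "how errors are reported" a bit more concrete: at the Linux kernel boundary, a failing call returns a small negative number in the return register (the range -4095 to -1 is reserved for this), and the libc wrapper turns that into -1 plus errno. The sketch below decodes such a raw value. The function names are hypothetical, and it assumes the value has already been read as a signed, register-width integer. Other kernels signal failure differently, so this is a Linux-only illustration.

```python
import errno
import os

MAX_ERRNO = 4095  # Linux reserves -4095..-1 in the return register for errors


def decode_raw_return(value):
    """Interpret a raw Linux syscall return value.

    Returns ("ok", value) on success or ("error", errno_number) on failure.
    """
    if -MAX_ERRNO <= value < 0:
        return ("error", -value)
    return ("ok", value)


def describe(value):
    """Render a raw return value the way a debugger note might."""
    kind, payload = decode_raw_return(value)
    if kind == "error":
        name = errno.errorcode.get(payload, "unknown")
        return f"error {name}: {os.strerror(payload)}"
    return f"ok, returned {payload}"


if __name__ == "__main__":
    print(describe(3))               # e.g. a file descriptor from open()
    print(describe(-errno.ENOENT))   # what a failed open() looks like raw
```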

Windows syscalls

Windows is where the vocabulary starts hurting people.

A lot of people say "syscall" when they are actually talking about one of these layers:

  • Win32 API
  • NT API
  • the actual syscall transition

These are not equivalent.

For example, if a program calls CreateFileW or VirtualAlloc, it is not jumping directly into the kernel from that high-level API. There are usually user-mode layers in between (kernel32 and kernelbase are the common ones). Eventually the call tends to flow down into ntdll, into functions such as NtCreateFile or NtAllocateVirtualMemory, and from there the actual syscall boundary is reached.

So if you are reversing Windows code, you should keep three layers clearly separated in your head:

  1. Win32 API – the friendly documented interface most code uses
  2. NT API – the lower-level interface exposed through ntdll
  3. syscall transition – the actual jump into kernel mode

This distinction matters a lot in offsec and reversing work.

Why? Because many discussions about userland hooks, EDR visibility, direct syscalls, or WoW64 tricks depend entirely on knowing which layer you are actually looking at.
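
One way to keep the three layers apart is to write them down as data. The sketch below is only a mental-model aid, not a description of how Windows resolves calls internally. The two Win32-to-NT pairs are real function names, but the table is a tiny hand-written sample, and the prefix check in classify is a naive heuristic I am assuming for illustration.

```python
from enum import Enum


class Layer(Enum):
    WIN32 = "Win32 API"
    NT = "NT API (ntdll)"
    TRANSITION = "syscall transition"


# Tiny hand-written sample, not a complete or authoritative mapping.
WIN32_TO_NT = {
    "CreateFileW": "NtCreateFile",
    "VirtualAlloc": "NtAllocateVirtualMemory",
}


def classify(symbol):
    """Guess a symbol's layer from the sample table and the Nt/Zw prefix."""
    if symbol in WIN32_TO_NT:
        return Layer.WIN32
    if symbol.startswith(("Nt", "Zw")):
        return Layer.NT
    return None  # unknown to this sketch


def call_path(win32_name):
    """List the layers a call from the sample table passes through."""
    nt_name = WIN32_TO_NT[win32_name]
    return [
        (Layer.WIN32, win32_name),
        (Layer.NT, nt_name),
        (Layer.TRANSITION, f"kernel entry from the {nt_name} stub"),
    ]


if __name__ == "__main__":
    for layer, what in call_path("CreateFileW"):
        print(f"{layer.value:20} {what}")
```

When you are staring at a call stack or a hook report, asking "which of these three rows am I on?" is usually the first useful question.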

Another thing that makes Windows annoying is that syscall numbers are not stable in the way beginners often expect. They can change across versions and builds, so copying syscall IDs from some random table on the internet is a good way to misunderstand what is going on.
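
Because the numbers move around, tooling usually reads them from the ntdll that is actually loaded instead of hardcoding them. A minimal sketch of that idea follows, assuming the common unhooked x64 stub prologue (mov r10, rcx followed by mov eax, imm32). The byte strings are made up for illustration, and 0x55 is a placeholder, not a number to rely on.

```python
import struct

# Typical start of an unhooked x64 ntdll stub:
#   4C 8B D1        mov r10, rcx
#   B8 xx xx xx xx  mov eax, <syscall number>
PROLOGUE = bytes.fromhex("4c8bd1b8")


def syscall_number_from_stub(stub):
    """Return the value loaded into eax, or None if the prologue does not match."""
    if len(stub) < 8 or not stub.startswith(PROLOGUE):
        return None
    (number,) = struct.unpack_from("<I", stub, 4)
    return number


# Hypothetical stub bytes: prologue, placeholder number 0x55, syscall, ret.
sample = bytes.fromhex("4c8bd1b855000000") + bytes.fromhex("0f05c3")
# A stub whose first bytes were replaced by a jmp, as an inline hook might do.
hooked = bytes.fromhex("e9aabbccdd000000")

if __name__ == "__main__":
    print(syscall_number_from_stub(sample))
    print(syscall_number_from_stub(hooked))
```

Returning None for the second buffer is deliberate: when the prologue does not look the way you expect, that is itself information about what sits between you and the kernel.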

macOS syscalls

macOS is similar enough to Unix-like systems that people often assume it behaves "basically like Linux".

That assumption is useful only for five minutes.

Yes, the same general idea exists:

  • user code runs in user space
  • the kernel is privileged
  • a controlled transition is needed to request privileged work

But the userland, the ABI details, the kernel implementation and the ecosystem are different.

On macOS, the path typically goes through libSystem, and if you are on modern Apple hardware, you are also very likely dealing with ARM64. So even if the concept feels familiar, the practical mechanics are not the same as on a Linux x86_64 machine.
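
A small example of how the contract differs: on x86_64 macOS, the value placed in the syscall-number register carries a class in its upper bits, so the BSD write call (number 4) is requested as 0x2000004, while Linux x86_64 uses plain 1 for write. The sketch below only does that arithmetic. The constant names mirror the ones I remember from the XNU headers, so treat them as an assumption. Apple also treats the raw interface as private, which is one more reason the supported path goes through libSystem.

```python
SYSCALL_CLASS_SHIFT = 24
SYSCALL_CLASS_MACH = 1  # Mach traps
SYSCALL_CLASS_UNIX = 2  # BSD-style syscalls

BSD_WRITE = 4  # write() in the BSD syscall table


def macos_x86_64_number(syscall_class, number):
    """Combine a class and a table index into the value the stub loads."""
    return (syscall_class << SYSCALL_CLASS_SHIFT) | number


def split_macos_x86_64_number(value):
    """Split a combined value back into (class, table index)."""
    mask = (1 << SYSCALL_CLASS_SHIFT) - 1
    return value >> SYSCALL_CLASS_SHIFT, value & mask


if __name__ == "__main__":
    print(hex(macos_x86_64_number(SYSCALL_CLASS_UNIX, BSD_WRITE)))
    print(split_macos_x86_64_number(0x2000004))
```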

This is one of the easiest ways to get humbled in low-level work:
you realize that understanding the concept is not enough if you do not understand the platform-specific contract.

x86_64 vs ARM64

The operating system is only half of the story.

The other half is the architecture.

Even if two systems both support "syscalls", that does not mean:

  • they use the same instruction
  • they pass arguments the same way
  • they preserve the same registers
  • they return errors in the same form

x86_64

On x86_64, the transition is usually made with the syscall instruction. Older 32-bit x86 code typically used a software interrupt or sysenter instead.

If you have done Linux shellcode or Windows ntdll reversing, you have probably seen this instruction already.

But this is where people make a mistake:
they see the same instruction on two systems and assume the whole model is the same.

It is not.

The surrounding ABI is still OS-specific.

So even on the same CPU architecture:

  • Linux x86_64 syscall conventions are one thing
  • Windows x64 syscall conventions are another thing
  • the surrounding libraries and error models differ too
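
To show how far apart the two conventions are even though the instruction is the same, the sketch below models where the number and the arguments go in each case. The register lists reflect the commonly described Linux x86_64 kernel ABI and the layout used by x64 ntdll stubs. The class, the function, and the "stack[n]" labels are hypothetical simplifications: the real Windows stack layout also involves a return address and shadow space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyscallAbi:
    name: str
    number_reg: str
    arg_regs: tuple
    return_reg: str
    spills_to_stack: bool  # do extra arguments continue on the stack?


LINUX_X86_64 = SyscallAbi(
    "Linux x86_64", "rax", ("rdi", "rsi", "rdx", "r10", "r8", "r9"), "rax", False
)
WINDOWS_X64 = SyscallAbi(
    "Windows x64", "eax", ("r10", "rdx", "r8", "r9"), "eax", True
)


def place_arguments(abi, number, args):
    """Return a {location: value} layout for one call under the given ABI."""
    if len(args) > len(abi.arg_regs) and not abi.spills_to_stack:
        raise ValueError(f"{abi.name} takes at most {len(abi.arg_regs)} arguments")
    layout = {abi.number_reg: number}
    for index, value in enumerate(args):
        if index < len(abi.arg_regs):
            layout[abi.arg_regs[index]] = value
        else:
            layout[f"stack[{index - len(abi.arg_regs)}]"] = value
    return layout


if __name__ == "__main__":
    print(place_arguments(LINUX_X86_64, 1, ["fd", "buf", "len"]))
    print(place_arguments(WINDOWS_X64, 0x55, ["a", "b", "c", "d", "e"]))
```

Both layouts end in the same syscall instruction, which is exactly why the instruction alone tells you so little.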

ARM64

On ARM64, you will usually see svc used for the transition.

Again, same concept:

  • user mode asks
  • kernel handles it

But the mechanics are different:

  • different registers
  • different ABI expectations
  • different assembly idioms
  • different reversing patterns
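
As one example of a different reversing pattern: on x86_64 you look for the two bytes 0F 05, while on ARM64 every instruction is a fixed 32-bit word and svc carries a 16-bit immediate inside it. The sketch below decodes that immediate from the A64 encoding. Linux uses svc #0, and macOS code is usually seen with svc #0x80. The function names are mine, and scanning raw bytes like this ignores data mixed into code, so treat it as a rough first pass only.

```python
import struct

# A64 SVC encoding: 11010100 000 <imm16> 000 01
SVC_MASK = 0xFFE0001F
SVC_BITS = 0xD4000001


def decode_svc(word):
    """Return the 16-bit immediate if `word` is an A64 SVC, else None."""
    if (word & SVC_MASK) != SVC_BITS:
        return None
    return (word >> 5) & 0xFFFF


def find_svc(code):
    """Yield (offset, immediate) for each SVC in little-endian A64 code."""
    for offset in range(0, len(code) - 3, 4):
        (word,) = struct.unpack_from("<I", code, offset)
        imm = decode_svc(word)
        if imm is not None:
            yield offset, imm


if __name__ == "__main__":
    # nop, then svc #0, then svc #0x80 (hand-assembled sample bytes)
    blob = bytes.fromhex("1f2003d5" "010000d4" "011000d4")
    for offset, imm in find_svc(blob):
        print(f"svc #{imm:#x} at offset {offset}")
```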

This matters more and more now because ARM64 is no longer a niche platform.
It is relevant in mobile, in macOS, and increasingly in general low-level research.

If you only learn syscalls through x86_64 examples, ARM64 will force you to understand the why, not just the syntax.

And honestly, that is probably a good thing.

So what is the "real syscall"?

If I had to explain it in the least confusing way possible, I would say this:

A syscall is not just the instruction.
A syscall is not just the API.
A syscall is the controlled request path from user space into kernel space.

Depending on context, people may focus on one piece of that path more than the others.

For example:

  • a reverser may care about where the transition actually happens
  • an exploit developer may care about ABI details and argument setup
  • a defender may care about where hooks are placed
  • a developer may never see any of this because libc or Win32 hides it

All of them are touching the same idea from different angles.

Why do offsec people care so much?

Because syscalls sit very close to the operating system boundary.

If you are doing:

  • malware analysis
  • reverse engineering
  • shellcode development
  • EDR bypass research
  • low-level debugging
  • process injection analysis
  • WoW64 research

then syscalls stop being an abstract OS topic and become a practical one.

At some point you will need to answer questions like:

  • What layer am I in right now?
  • Am I still in a high-level API or already in the NT layer?
  • Is this x86 code, x64 code, or ARM64 code?
  • What actually crosses into the kernel here?
  • What is the defender seeing?
  • What assumptions am I making that only hold on one platform?

And most of those questions become much easier once your syscall mental model is solid.

Final idea to keep in mind

The mistake is not "not knowing syscall numbers".

The real mistake is thinking syscalls are just a table of numbers.

They are not.

They are a contract between:

  • a program
  • an architecture
  • an ABI
  • and an operating system kernel

The concept is universal.
The implementation is not.

That is the part worth learning first.

What comes next

In the next posts I want to go deeper into:

  • the difference between Win32, NT API and the actual syscall boundary
  • what syscall stubs look like
  • why ABI details matter
  • how to recognize the transition in a debugger
  • how Linux, Windows and ARM64 examples differ in practice

This post is only the ground floor.
The interesting part starts once you begin following the boundary for real.


