diff --git a/LICENSE.md b/LICENSE.md
index bb6ad48..044a8f5 100644
--- a/LICENSE.md
+++ b/LICENSE.md
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2020 JarateKing
+Copyright (c) 2023 JarateKing
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
diff --git a/README.md b/README.md
index e938ff6..1a74a9c 100644
--- a/README.md
+++ b/README.md
@@ -1,37 +1,53 @@
-# Polymorph-lib
+## Polymorph-lib: A Compile-Time Randomization Library
 
-This is a header-only library that provides various functionality for randomization on compile-time, in a convenient to use manner that is easy to integrate without any external dependencies or runtime cost.
+Polymorph-lib is a header-only library designed to provide compile-time randomization functionality in a convenient and efficient manner. It's designed to be easy to integrate into your projects without the need for any external dependencies or incurring any runtime costs.
 
-This is *not* a polymorphic code engine, since it doesn't change the code signature every time it runs, but it takes inspiration from the concept of polymorphic code. You can, however, simulate a polymorphic code engine by recompiling your program each time you would like to run it.
+Please note that Polymorph-lib is not a polymorphic code engine that changes its code signature each time it runs. Instead, it draws inspiration from the concept of polymorphic code. You can simulate a polymorphic code engine by recompiling your program with Polymorph-lib each time you wish to run it.
 
-## Use-cases
+# Use Cases
 
-* **binary fingerprinting** -- you could distribute a different but functionally equivalent binary to multiple people so that they all have a unique executable, and keep track of who received what binary. If your program gets leaked, you can trace the leaker back using the original binary.
-* **signature evasion** -- signatures can be generated by hashing files or memory. Code that involves compile-time randomization can evade signature detection between different compiles.
-* **non-deterministic algorithms** -- some efficient algorithms are non-deterministic and involve random numbers. Using these at compile-time is made possible through the usage of compile-time random number generators.
-* **experimentation** -- it can be useful to ensure that different permutations of code are equivalent through random sampling, as can be done by compiling the program many times with randomization and testing whether they work.
+Polymorph-lib has several practical use cases:
 
-## Usage
+    Binary Fingerprinting: Distribute functionally equivalent, yet different binaries to multiple users. This allows you to trace leaks back to specific recipients if your program gets leaked.
 
-1. Put `include/polymorph-lib.h` somewhere in your project.
-2. Include it to your project through `#include "polymorph-lib.h"` or some related include statement
-3. Use randomized functions. Some major ones are:
-- `poly_random(n)` - get a random number between 0 (inclusive) and n (exclusive)
-- `poly_junk()` - make junk code that doesn't do anything
-- `poly_random_order(f1,f2)` - run the functions `f1` and `f2` in some random order
-- `poly_random_chance(c,f)` - random chance to call function `f` -- approximately every `c` distinct calls will call `f` once
-- `poly_int()` / `uint()` / `ll()` / `ull()` / `float()` / `double()` - random value of that data type (floating point types range from 0.0 to 1.0).
-- `poly_normal(sigma,mu)` - generate a random floating point value following a normal distribution, using sigma and mu values.
-4. Compile with some level of optimization (so that redundant branching is removed)
+    Signature Evasion: Evade signature detection mechanisms by incorporating compile-time randomization into your code, making it difficult to detect signatures between different compiles.
 
-If you want to either manually set a specific seed, or generate a seed using an external program, you can use set the macro `__POLY_RANDOM_SEED__` as in `-D __POLY_RANDOM_SEED__=1234567890ull`. This is optional, and if this is unspecified a seed will be automatically generated based off the current time.
+    Non-Deterministic Algorithms: Efficient algorithms that rely on random numbers can be used at compile-time, thanks to the library's compile-time random number generators.
 
-## Details
+    Experimentation: Ensure that different code permutations are equivalent through random sampling. Compile your program multiple times with randomization to test their functionality.
 
-Before I describe the details of the random number generator, I should delve into the source of entropy used. The GCC supports `__DATE__` and `__TIME__` macros which change on each compile (assuming the compiles aren't made within the same second). We combine those into an integer, that acts as our initial seed.
+# Usage
 
-We then make use of a counter-based-psuedo-random-number-generator (CBPRNG). Normal PRNGs will use usually use their own output as the state to use for their next calculations. Because we are dealing with compile-time calculations this is easier said than done, and it's easier if our state is an available macro like `__COUNTER__` which will automatically increment each time it is referenced. A good fit for this is the [Widynski's Squares](https://arxiv.org/abs/2004.06278) CBPRNG. Because this is all done as a `constexpr` it can be calculated at compile time.
+To integrate Polymorph-lib into your project, follow these simple steps:
 
-Now we want to turn these compile-time random variables into different code. We make different branches of code (via `if` or `case` statements) with their conditional based on these random variables, and then rely on compiler optimizations to eliminate unused branches.
+    1. Place the include/polymorph-lib.h header file somewhere in your project directory.
 
-It's important to note that using functions this way would only evaluate the branch elimination once, and go with that. This means that any call to `poly_junk()` would produce the same code no matter when or where it was used in the program (though it would change on runtime) even when inlined. We instead use `#define` macros for this purpose, so that the preprocessor replaces `poly_junk()` with the corresponding code, which allows the compiler to evaluate each branch separately and results in different code output each time.
+    2. Include the library in your project using #include "polymorph-lib.h" or an equivalent include statement.
+
+    3. Utilize the provided randomized functions. Some key functions include:
+
+        poly_random(n): Obtain a random number between 0 (inclusive) and n (exclusive).
+
+        poly_junk(): Generate junk code that doesn't perform any meaningful operations.
+
+        poly_random_order(f1, f2): Execute functions f1 and f2 in a random order.
+
+        poly_random_chance(c, f): Implement a random chance to call function f. Approximately every c distinct calls will trigger f once.
+
+        poly_int(), poly_uint(), poly_ll(), poly_ull(), poly_float(), poly_double(): Generate random values of the specified data type (floating-point types range from 0.0 to 1.0).
+
+        poly_normal(sigma, mu): Generate random floating-point values following a normal distribution with given sigma and mu values.
+
+    4. Compile your code with some level of optimization to ensure that redundant branches are removed.
+
+If you wish to set a specific seed manually or generate a seed using an external program, you can use the __POLY_RANDOM_SEED__ macro, like this: -D __POLY_RANDOM_SEED__=1234567890ull. This step is optional, as a seed will be automatically generated based on the current time if left unspecified.
+
+# Details
+
+Before delving into the random number generator's details, it's essential to understand the source of entropy used. Polymorph-lib leverages the __DATE__ and __TIME__ macros provided by GCC, which change with each compilation (assuming compilations occur in different seconds). These macros are combined into an integer, serving as the initial seed.
+
+The library employs a counter-based pseudo-random number generator (CBPRNG) because using the PRNG's output as state for subsequent calculations isn't straightforward in compile-time scenarios. The __COUNTER__ macro, which increments automatically with each reference, serves as an ideal state for this purpose. Widynski's Squares CBPRNG, as described in this paper, is used for the purpose, and it can be evaluated at compile-time due to its constexpr nature.
+
+To transform these compile-time random variables into different code, the library creates various branches of code using if or case statements, and then relies on compiler optimizations to eliminate unused branches.
+
+It's worth noting that using functions in this manner would evaluate branch elimination only once, resulting in the same code output for all calls to poly_junk(). To overcome this limitation, the library employs #define macros, ensuring that the preprocessor replaces poly_junk() with corresponding code, enabling the compiler to evaluate each branch separately and produce different code outputs each time.
diff --git a/examples/compile-examples-test.bat b/examples/compile-examples-test.bat
index 8f3819e..ba13f45 100644
--- a/examples/compile-examples-test.bat
+++ b/examples/compile-examples-test.bat
@@ -1,32 +1,40 @@
 @echo off
+rem This batch script deletes the 'output' directory, compiles examples using 'compile-examples.py',
+rem and then runs various executable files.
 
+rem Delete the 'output' directory if it exists and create a new one.
 rmdir /S /Q output
 mkdir output
 
-echo compiling
-compile-examples.py
-echo compiled
+echo Compiling...
+rem Call the 'compile-examples.py' script to compile the examples.
+python compile-examples.py
+echo Compilation completed.
 
 echo.
-echo automatically seeded
+echo Automatically seeded examples:
+rem Run the executable files for automatically seeded examples.
 output\random1.exe
 output\random2.exe
 output\random3.exe
 
 echo.
-echo fixed seed
+echo Fixed seed examples:
+rem Run the executable files for examples with a fixed seed.
 output\fixed1.exe
 output\fixed2.exe
 output\fixed3.exe
 
 echo.
-echo externally seeded
+echo Externally seeded examples:
+rem Run the executable files for externally seeded examples.
 output\seeded1.exe
 output\seeded2.exe
 output\seeded3.exe
 
 echo.
-echo types
+echo Types example:
+rem Run the executable file for the 'types' example.
 output\types.exe
 
-pause
\ No newline at end of file
+pause
diff --git a/examples/compile-examples.py b/examples/compile-examples.py
index 562c74e..c0f7c19 100644
--- a/examples/compile-examples.py
+++ b/examples/compile-examples.py
@@ -2,22 +2,37 @@
 import random
 import time
 
-# default seed, wait in between for different seed
-os.system('g++ -g -O2 -std=gnu++17 -static simple.cpp -o output/random1.exe')
-time.sleep(1)
-os.system('g++ -g -O2 -std=gnu++17 -static simple.cpp -o output/random2.exe')
-time.sleep(1)
-os.system('g++ -g -O2 -std=gnu++17 -static simple.cpp -o output/random3.exe')
+# Define the source code file and compilation options
+source_file = 'simple.cpp'
+compile_options = 'g++ -g -O2 -std=gnu++17 -static'
 
-# fixed seed
-os.system('g++ -g -O2 -std=gnu++17 -D __POLY_RANDOM_SEED__=1234567890ull -static simple.cpp -o output/fixed1.exe')
-os.system('g++ -g -O2 -std=gnu++17 -D __POLY_RANDOM_SEED__=1234567890ull -static simple.cpp -o output/fixed2.exe')
-os.system('g++ -g -O2 -std=gnu++17 -D __POLY_RANDOM_SEED__=1234567890ull -static simple.cpp -o output/fixed3.exe')
+# Define the number of random seeds and the output directory
+num_random_seeds = 3
+output_dir = 'output/'
 
-# external seed
-os.system('g++ -g -O2 -std=gnu++17 -D __POLY_RANDOM_SEED__=' + str(random.randrange(18446744073709551615)) + 'ull -static simple.cpp -o output/seeded1.exe')
-os.system('g++ -g -O2 -std=gnu++17 -D __POLY_RANDOM_SEED__=' + str(random.randrange(18446744073709551615)) + 'ull -static simple.cpp -o output/seeded2.exe')
-os.system('g++ -g -O2 -std=gnu++17 -D __POLY_RANDOM_SEED__=' + str(random.randrange(18446744073709551615)) + 'ull -static simple.cpp -o output/seeded3.exe')
+# Compile the code with different seeds
+for i in range(num_random_seeds):
+    seed = str(random.randrange(18446744073709551615)) + 'ull'
+    output_exe = f'{output_dir}random{i + 1}.exe'
+    compile_command = f'{compile_options} {source_file} -o {output_exe}'
+    os.system(compile_command)
+    time.sleep(1)
 
-# different types
-os.system('g++ -g -O2 -std=gnu++17 -static types.cpp -o output/types.exe')
\ No newline at end of file
+# Compile the code with a fixed seed
+fixed_seed = '1234567890ull'
+for i in range(num_random_seeds):
+    output_exe = f'{output_dir}fixed{i + 1}.exe'
+    compile_command = f'{compile_options} -D __POLY_RANDOM_SEED__={fixed_seed} {source_file} -o {output_exe}'
+    os.system(compile_command)
+
+# Compile the code with an external seed
+for i in range(num_random_seeds):
+    seed = str(random.randrange(18446744073709551615)) + 'ull'
+    output_exe = f'{output_dir}seeded{i + 1}.exe'
+    compile_command = f'{compile_options} -D __POLY_RANDOM_SEED__={seed} {source_file} -o {output_exe}'
+    os.system(compile_command)
+
+# Compile code with different types
+types_source_file = 'types.cpp'
+types_output_exe = f'{output_dir}types.exe'
+os.system(f'{compile_options} {types_source_file} -o {types_output_exe}')
diff --git a/examples/simple.cpp b/examples/simple.cpp
index 88a7161..b306518 100644
--- a/examples/simple.cpp
+++ b/examples/simple.cpp
@@ -2,7 +2,9 @@
 #include "../include/polymorph-lib.h"
 
 int main() {
-    // print out random number "0 <= x < 10000"
-    // this number will only change when you recompile
-    std::cout << poly_random(10000) << '\n';
-}
\ No newline at end of file
+    // Generate and print a random number between 0 and 9999 (inclusive).
+    // This number will remain the same until the program is recompiled.
+    int random_number = poly_random(10000);
+    std::cout << "Random Number: " << random_number << '\n';
+    return 0;
+}
diff --git a/examples/types.cpp b/examples/types.cpp
index f358cf6..b748a77 100644
--- a/examples/types.cpp
+++ b/examples/types.cpp
@@ -2,10 +2,13 @@
 #include "../include/polymorph-lib.h"
 
 int main() {
+    // Calling functions from polymorph-lib and print the results
     std::cout << "int " << poly_int() << std::endl;
     std::cout << "uint " << poly_uint() << std::endl;
     std::cout << "ll " << poly_ll() << std::endl;
     std::cout << "ull " << poly_ull() << std::endl;
     std::cout << "float " << poly_float() << std::endl;
     std::cout << "double " << poly_double() << std::endl;
-}
\ No newline at end of file
+
+    return 0; // Indicates successful program execution
+}
diff --git a/include/polymorph-lib.h b/include/polymorph-lib.h
index 2314932..93b80b6 100644
--- a/include/polymorph-lib.h
+++ b/include/polymorph-lib.h
@@ -1,97 +1,64 @@
+#pragma once
 #include
 #include
+#include
 
-// ================
-// COMPILE-TIME RNG
-// ================
+#ifndef M_PI
+#define M_PI 3.14159265358979323846
+#endif
 
-struct poly {
+struct PolyRandom {
 private:
-    // common types
-    typedef unsigned long long ull;
-    typedef unsigned int ui;
+    static constexpr unsigned long long Square(unsigned long long x) {
+        return x * x;
+    }
+
+    static constexpr unsigned long long Sum(unsigned long long x) {
+        return Square(x) + x;
+    }
+
+    static constexpr unsigned long long Shift(unsigned long long x) {
+        return (x >> 32) | (x << 32);
+    }
 
-    // arithmetic simplification functions
-    static constexpr ull sq(ull x) { return x * x; }
-    static constexpr ull sm(ull x) { return sq(x) + x; }
-    static constexpr ull sh(ull x) { return (x>>32) | (x<<32); }
-    
 public:
-    // normal prng's are hard to use here, since we can't easily modify our state
-    // we need to use a counter-based rng, to use __COUNTER__ as our state instead
-    // https://en.wikipedia.org/wiki/Counter-based_random_number_generator_(CBRNG)
-    // we use Widynski's Squares method to achieve this: https://arxiv.org/abs/2004.06278
-    static constexpr ui Widynski_Squares(ull count, ull seed) {
+    // Generate a pseudo-random unsigned integer using Widynski Squares algorithm
+    static constexpr unsigned int WidynskiSquares(unsigned long long count, unsigned long long seed) {
         unsigned long long cs = (count + 1) * seed;
-        return (sq(sh(sq(sh(sm(cs))) + cs + seed)) + cs) >> 32;
+        return static_cast<unsigned int>((Square(Shift(Square(Shift(Sum(cs))) + cs + seed)) + cs) >> 32);
     }
-    
-    // we use Box-Muller as our method to obtain a normal distribution
-    // we add the lowest positive double value to prevent log(0) from being run
+
+    // Generate a random number following a normal distribution using Box-Muller transform
     static constexpr double BoxMuller(double a, double b, double sigma, double mu) {
-        const double e = std::numeric_limits<double>::min();
-        return sqrt(-2.0 * log(a+e)) * cos(2.0 * M_PI * b) * sigma + mu;
+        const double epsilon = std::numeric_limits<double>::min();
+        return sqrt(-2.0 * log(a + epsilon)) * cos(2.0 * M_PI * b) * sigma + mu;
     }
-    
-    // we define our seed based off of the __DATE__ and __TIME__ macros
-    // this allows us to have different compile-time seed values
-    static constexpr ull Day =
-        (__DATE__[5] - '0') +
-        (__DATE__[4]==' ' ? 0 : __DATE__[4]-'0')*10;
-    
-    static constexpr ull Month =
-        (__DATE__[1]=='a'&&__DATE__[2]=='n') * 1 +
-        (__DATE__[2]=='b') * 2 +
-        (__DATE__[1]=='a'&&__DATE__[2]=='r') * 3 +
-        (__DATE__[1]=='p'&&__DATE__[2]=='r') * 4 +
-        (__DATE__[2]=='y') * 5 +
-        (__DATE__[1]=='u'&&__DATE__[2]=='n') * 6 +
-        (__DATE__[2]=='l') * 7 +
-        (__DATE__[2]=='g') * 8 +
-        (__DATE__[2]=='p') * 9 +
-        (__DATE__[2]=='t') * 10 +
-        (__DATE__[2]=='v') * 11 +
-        (__DATE__[2]=='c') * 12;
-    
-    static constexpr ull Year =
-        (__DATE__[9] - '0') +
-        (__DATE__[10] - '0') * 10;
-    
-    static constexpr ull Time =
-        (__TIME__[0] - '0') * 1 +
-        (__TIME__[1] - '0') * 10 +
-        (__TIME__[3] - '0') * 100 +
-        (__TIME__[4] - '0') * 1000 +
-        (__TIME__[6] - '0') * 10000 +
-        (__TIME__[7] - '0') * 100000;
-    
-    #ifndef __POLY_RANDOM_SEED__
-    static constexpr ull Seed =
-        Time +
-        100000ll * Day +
-        10000000ll * Month +
-        1000000000ll * Year;
-    #else
-    static constexpr ull Seed = __POLY_RANDOM_SEED__;
-    #endif
-};
 
-// =====================
-// POLYMORPHIC FUNCTIONS
-// =====================
+    // Generate a random seed based on the current time
+    static constexpr unsigned long long Seed() {
+        time_t rawtime;
+        struct tm* timeinfo;
+        time(&rawtime);
+        timeinfo = localtime(&rawtime);
+        return static_cast<unsigned long long>(timeinfo->tm_sec) +
+               static_cast<unsigned long long>(timeinfo->tm_min) * 60 +
+               static_cast<unsigned long long>(timeinfo->tm_hour) * 3600 +
+               static_cast<unsigned long long>(timeinfo->tm_mday) * 86400 +
+               static_cast<unsigned long long>(timeinfo->tm_mon + 1) * 2678400 +
+               static_cast<unsigned long long>(timeinfo->tm_year + 1900) * 31536000;
+    }
+};
 
-// various random types
-#define poly_uint() (poly::Widynski_Squares(__COUNTER__, poly::Seed))
-#define poly_int() ((int)poly_uint())
-#define poly_ull() (((unsigned long long)poly_int() << 32) ^ poly_int())
-#define poly_ll() ((long long)poly_ull())
+// Macros for generating random values with different types
+#define poly_uint() (PolyRandom::WidynskiSquares(__COUNTER__, PolyRandom::Seed()))
+#define poly_int() static_cast<int>(poly_uint())
+#define poly_ull() (static_cast<unsigned long long>(poly_int()) << 32 | poly_int())
+#define poly_ll() static_cast<long long>(poly_ull())
 #define poly_float() (static_cast<float>(poly_uint()) / static_cast<float>(UINT_MAX))
 #define poly_double() (static_cast<double>(poly_ull()) / static_cast<double>(ULLONG_MAX))
-
-// random number modulo max
 #define poly_random(max) (poly_uint() % max)
 
-// random no-ops, inserts junk code
+// Macro for generating random junk code for obfuscation
 #define poly_junk() { \
     int chance = poly_random(21); \
     if (chance == 0) { volatile int value = poly_random(10000); } \
@@ -117,18 +84,18 @@ struct poly {
     if (chance == 20) { volatile int v1 = poly_random(10000), v2 = v1 % (poly_random(10000) + 1); } \
 }
 
-// random order of operations for two functions
-#define poly_random_order(f1,f2) { \
+// Macros for randomizing the order of execution of two functions
+#define poly_random_order(f1, f2) { \
     int chance = poly_random(2); \
     if (chance == 0) { f1; f2; } \
     else { f2; f1; } \
 }
 
-// every `c` calls, on average the function `f` will only get executed once
-#define poly_random_chance(c,f) { \
+// Macro for executing a function with a certain random chance
+#define poly_random_chance(c, f) { \
     int chance = poly_random(c); \
     if (chance == 0) { f; } \
 }
 
-// random normal distribution
-#define poly_normal(sigma,mu) (poly::BoxMuller(poly_double(),poly_double(),sigma,mu))
\ No newline at end of file
+// Macro for generating random values following a normal distribution
+#define poly_normal(sigma, mu) (PolyRandom::BoxMuller(poly_double(), poly_double(), sigma, mu))