Encryption/Decryption Learning Notes
Compared to CSAPP, "Encryption and Decryption" leans more towards practical application, with rich but somewhat scattered content. The assembly code syntax differs from CSAPP, using Intel syntax.
Basic Knowledge
Introductions to topics such as APIs, Unicode, and Little-endian are omitted here.
(However, placing Win32 API and WOW64 in this section feels quite discouraging—oh well, I'll just refer to the documentation when needed in the future—)
Dynamic Analysis Techniques
- OllyDbg (learned about hardware breakpoints, message breakpoints, conditional breakpoints, memory breakpoints, and the trace function)
- x64dbg
- MDebug
- WinDbg
Static Analysis Techniques
- PEiD/ExeinfoPE
- ODDisasm, BeaEngine, Udis86, Capstone, AsmJit, Keystone
- IDA is the GOAT! (IDA offers a feature I wasn't aware of before: pressing F1 on a system API brings up its usage. However, this only supports .hlp files, so it's practically useless.)
- WinHex/010editor
Reverse Engineering Techniques
32-bit Software Reverse Engineering Techniques
Startup Function
WinAPI centralized area. Skip if you don't understand.
Functions
Another chance to review stack calls for the Nth time.
Common calling conventions (VARARG indicates that the number of parameters is uncertain):

①: Only applicable when the stack balancer is the caller.
Data Structures
Local variables: Stored on the stack.
Global variables: Located in the .data section / cs:xxxx.
Arrays: Accessed via base-plus-index addressing. To understand assembly code for arrays, I frantically reviewed the relevant chapters of CSAPP.
Virtual Functions - 1/15/2022
Calling a virtual function: the object (usually allocated by `new` or `malloc`) begins with a virtual table pointer (VPTR), which points to the virtual function table (VTBL) that stores the addresses of all virtual functions. A call first loads the VPTR from the object, then calls through the matching VTBL entry.
Based on the virtual table, the number of virtual functions in the class and the code of the virtual functions can be restored.
Question: In the assembly code at address 0040101B, why is `eax = *VTBL = **Add()`, instead of `eax = *VTBL = &Add()`?
Control Statements — 1/17/2022
(Most of this has been covered in CS:APP; no particularly important points here.)
- `a & (-b)`, where b is a power of 2, clears the low bits of a, which is equivalent to rounding a down to a multiple of b.
- The `sbb A, B` instruction: A = A - B - CF.
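Both facts are easy to check by hand. The following is a minimal sketch, not code from the book: `round_down` and `sbb` are hypothetical helper names, and `sbb` models a 32-bit register with unsigned operands.

```python
MASK32 = 0xFFFFFFFF

def round_down(a: int, b: int) -> int:
    """a & (-b): clear the low bits of a (b must be a power of two)."""
    return a & -b

def sbb(a: int, b: int, cf: int):
    """Model of `sbb A, B` on 32-bit unsigned values.

    Returns (A - B - CF) truncated to 32 bits, plus the new carry (borrow) flag.
    """
    result = (a - b - cf) & MASK32
    new_cf = 1 if a < b + cf else 0
    return result, new_cf

print(round_down(37, 8))      # 37 rounded down to a multiple of 8
print(sbb(5, 5, 1))           # `sbb eax, eax` with CF=1 yields all ones
print(sbb(5, 5, 0))           # `sbb eax, eax` with CF=0 yields zero
```

The `sbb eax, eax` pattern is why this instruction shows up in branchless code: it turns the carry flag into a mask of all zeros or all ones.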
Loop Statements — 1/23/2022
As explained in CS:APP, the essence is a jump from a higher address to a lower address.
Corrections:
- In both schemes, `i < 5` should be changed to `i <= 5`.
- The comment at address 0x40102E in the unoptimized code should read "from higher to lower address region."
Mathematical Operators — 1/23/2022
Addition and subtraction are accelerated using the `lea` instruction.
Multiplication is accelerated using shift instructions.
For division, when the divisor is known, a constant (similar to a modular inverse) is multiplied, and the high bits of the result are taken. Specifically, if the divisor is a power of two, a right shift is used directly.
If the result is negative, the value is incremented by 1 (likely related to rounding toward zero for negative numbers).
Text String——1/23/2022
The commonly used C string ends with `'\0'`. Other types include DOS strings (ending with `$`), PASCAL strings (with a single-byte ANSI character at the beginning indicating the length), and Delphi strings (with two-byte or four-byte length prefixes).
If an instruction like `mov ecx, FFFFFFFF` appears, it often indicates that the program is obtaining the length of a string. The corresponding assembly code is as follows.
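A common form of this idiom (an assumption here, since the original listing is not reproduced) is `mov ecx, FFFFFFFF` / `xor eax, eax` / `repne scasb` / `not ecx` / `dec ecx`. The sketch below simulates those steps in Python; `strlen_repne_scasb` is a hypothetical name, and `memory` stands in for the process address space.

```python
MASK32 = 0xFFFFFFFF

def strlen_repne_scasb(memory: bytes, edi: int) -> int:
    """Simulate the inline strlen idiom built on `repne scasb`."""
    ecx = 0xFFFFFFFF          # mov ecx, FFFFFFFF
    al = 0                    # xor eax, eax (search for the NUL byte)
    while ecx != 0:           # repne: stop when ecx == 0 or a match is found
        byte = memory[edi]
        edi += 1              # scasb advances edi (direction flag clear)
        ecx = (ecx - 1) & MASK32
        if byte == al:
            break
    ecx = ~ecx & MASK32       # not ecx -> bytes scanned, including the NUL
    ecx = (ecx - 1) & MASK32  # dec ecx -> length without the NUL
    return ecx

print(strlen_repne_scasb(b"hello\x00junk", 0))
```

Starting `ecx` at -1 lets the scan run until the terminator, and the `not`/`dec` pair converts the remaining count back into a length.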

Instruction Modification Techniques — 1/24/2022
Not much to say, so I'll just post a summary image directly.

64-bit Software Reverse Engineering — 1/24/2022
Registers and Loop Statements
- There is only one calling convention, with the following rules (which differ from CSAPP again):
- The first four parameters are passed via registers, and the rest are pushed onto the stack.
- The order of register usage is: rcx, rdx, r8, r9 (xmm0~xmm3 for floating-point values).
- The order of stack parameters: from right to left.
- The stack reserves 32 bytes of space for these four parameters, even if only two are passed.
- The `rep` prefix: repeats the following string instruction, decrementing `ecx` after each iteration, until `ecx` becomes zero. The count is checked before each iteration, so if `ecx` is already 0 the instruction is not executed at all.
- The `stos` instruction: stores the data in `eax` into the memory address pointed to by `edi`, then increments `edi` by 4 bytes (the dword form). It is often used with the `rep` prefix to fill stack space with `0xcccccccc` (displayed as "烫烫烫" when the bytes are read as GBK text).
- The `movsb` instruction: copies data from the memory location pointed to by `esi` to the memory location pointed to by `edi`, then automatically increments both registers by 1 byte (due to the 'b' suffix). It is often used with the `rep` prefix to copy arrays, structs, etc.
- For struct parameter passing: if the struct size is ≤ 8 bytes, it is passed directly via registers. If larger, the address of the struct is passed, and its contents are accessed via offsets.
- Member function calls in classes have one additional parameter compared to regular functions: the `this` pointer.
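What is `__security_check_cookie`?
It is the check routine of the stack cookie (/GS) protection. At function entry, the global `__security_cookie` is XORed with the `rsp` pointer and the result is stored on the stack next to the local variables. Before the function returns, the stored value is XORed with `rsp` again and passed to `__security_check_cookie`, which aborts the program if the value no longer matches `__security_cookie`. (Filling unused stack space with `0xcc` in debug builds is a separate mechanism.) Since the cookie is randomized when the program starts, its value cannot be predicted, making it an effective way to detect stack overflows.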
What does this statement mean? (P150)
mov eax, ds:(jpt_140001060 - 140000000h)[rcx+rax*4]
An operation like `(a)[b+c*d]` is similar to `a(b,c,d)` in AT&T syntax, meaning `b + c*d + a`. Here, it loads a jump table entry, preparing for the branch of a `switch` statement.
- Besides jump tables, `switch` statements can also be optimized using decision trees.
- The machine code for the 64-bit indirect `call` instruction is `FF 15 xx xx xx xx`, where the last four bytes are a signed offset relative to the address of the next instruction (RIP-relative), not an absolute memory address.
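To make the relative offset concrete, here is a small sketch that computes which memory slot an `FF 15` call reads its target from. The function name `rip_relative_slot` and the sample address are made up for illustration.

```python
import struct

def rip_relative_slot(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the address of the pointer used by `call qword ptr [rip+disp32]`."""
    if len(insn_bytes) < 6 or insn_bytes[:2] != b"\xff\x15":
        raise ValueError("not an FF 15 call")
    # disp32 is a signed little-endian value, counted from the NEXT instruction.
    disp = struct.unpack_from("<i", insn_bytes, 2)[0]
    return (insn_addr + 6 + disp) & 0xFFFFFFFFFFFFFFFF

# Hypothetical instruction at 0x140001000 with displacement 0x0FFA.
print(hex(rip_relative_slot(0x140001000, bytes.fromhex("ff15fa0f0000"))))
```

The returned address is typically an import table entry; the real call target is the 8-byte pointer stored there.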
Mathematical Operators
- Various optimization techniques for division
  - For divisors of the form $2^n$, the formula is `x >> n` (when the dividend is positive) and `(x + 2**n - 1) >> n` (when the dividend is negative).
  - For divisors of the form $-2^n$, it is the same as above, but the result must be negated.
  - For positive divisors not of the form $2^n$, there are two optimization methods. Here, we take 64-bit as an example.
    - The first is `x * c >> 64 >> n` (when the dividend is positive) and `(x * c >> 64 >> n) + 1` (when the dividend is negative). Here, c is a positive number, and n may be 0.
    - The second is `(x * c >> 64) + x >> n` (when the dividend is positive) and `((x * c >> 64) + x >> n) + 1` (when the dividend is negative). Here, c is a negative number.
    - The divisor can be calculated using $d \approx 2^{64+n} / c$, where $c$ is the magic_num read as an unsigned value.
  - A brief proof for the second formula (let $d$ be the divisor): due to overflow, the signed magic number is $c - 2^{64}$, and $(x(c - 2^{64}) \gg 64) + x = xc \gg 64$, which reduces to the case of the first formula. The rest of the process is omitted.
    Q: Why add x? Because adding x is equivalent to adding $2^{32}$ to $c$ (in a 32-bit example). Previously, $c$ was `0x92492493`. After converting to signed, it becomes `0x92492493-0x100000000`, and adding $2^{32}$ restores it to its original value. Without adding x, the result would be negative.
  - For negative divisors not of the form $-2^n$, the formulas are essentially the same as above, except that in the first formula, c is negative; in the second formula, c is positive, and the "+" in the middle should be changed to "-". Here, $d \approx -2^{64+n} / (2^{64} - c)$, where $c$ is the magic_num read as an unsigned value.
  - For unsigned division, when the divisor is of the form $2^n$, the formula is `x >> n`.
  - For unsigned division, when the divisor is not of the form $2^n$, there are two formulas:
    - The first is `x * c >> 64 >> n`, where $d \approx 2^{64+n} / c$.
    - The second is `(x - (x * c >> 64) >> n1) + (x * c >> 64) >> n2`, where $d \approx 2^{64+n_1+n_2} / (2^{64} + (2^{n_1} - 1)c)$, and $c$ is the magic_num.
  - A brief proof for the second formula (let $d$ be the divisor): I will never prove this in my life. The formula given in the book is incorrect and holds only in a special case. The link below also confirms the correctness of my formula.
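The formulas can be checked numerically. The sketch below uses the 32-bit variants (shift by 32 instead of 64) with the widely published magic numbers for dividing by 7: `0x92492493` with `n = 2` for signed division, and `0x24924925` with `n1 = 1`, `n2 = 2` for unsigned division. Function names are hypothetical; Python's `>>` on negative integers behaves like an arithmetic shift, which matches `sar`.

```python
def to_signed32(value: int) -> int:
    """Reinterpret a 32-bit pattern as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def sdiv7(x: int) -> int:
    """Signed x / 7 via the second formula: ((x * c >> 32) + x >> n), +1 if x < 0."""
    c = to_signed32(0x92492493)        # negative as a signed magic number
    t = (((x * c) >> 32) + x) >> 2
    return t + 1 if x < 0 else t

def udiv7(x: int) -> int:
    """Unsigned x / 7 via (x - t >> n1) + t >> n2 with t = x * c >> 32."""
    t = (x * 0x24924925) >> 32
    return (((x - t) >> 1) + t) >> 2

def c_div(x: int, d: int) -> int:
    """C-style division, truncating toward zero."""
    q = abs(x) // abs(d)
    return q if (x < 0) == (d < 0) else -q

for x in (-15, -7, -1, 0, 6, 7, 100):
    print(x, sdiv7(x), c_div(x, 7))
# Recover the divisor from the unsigned magic number: 2**(32+n) / c
print(2 ** 34 / 0x92492493)
```

The last line prints a value just under 7, which is how the divisor is recognized from the magic number when reading disassembly.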
Some optimization techniques for modulo
Note: In C, the sign of the modulo result is the same as the sign of the dividend, and its value is the same as when both operands are positive. This differs from Python.
For example, calculating `5 % 3`, `(-5) % 3`, `5 % (-3)`, and `(-5) % (-3)` in C yields `2 -2 2 -2`, while in Python it yields `2 1 -1 -2`.
- For divisors of the form $2^n$, there are two optimization methods:
  - The first is `x & ((1 << n) - 1)` (when the dividend is positive) and `((x & ((1 << n) - 1)) - 1 | (~ ((1 << n) - 1))) + 1` (when the dividend is negative). Here, subtracting 1 and then adding 1 is to handle the special case where the remainder is 0.
  - When the dividend is negative, another optimization formula is `((x + ((1 << n) - 1)) & ((1 << n) - 1)) - ((1 << n) - 1)`. (According to my tests, compilers mostly use this formula for optimization.)

Explanation of the rationality of the two formulas:
Formula 1:
C handles negative modulo as the mirror image of positive modulo. It simply sets all bits above the low $n$ bits to 1, and `| (~ ((1 << n) - 1))` does exactly that. Actually, the preceding `& ((1 << n) - 1)` can be omitted, but to reduce condition checks, it is unified with the positive case.
When the low $n$ bits of $x$ are all 0, treating the value as negative and setting the high bits to 1 would give $-2^n$ instead of 0. By subtracting 1 first, the low $n$ bits of $x$ are set to 1, and after setting the high bits to 1 and then adding 1, the result becomes 0. Generally, if you subtract 1 at the beginning, the final modulo range becomes $-2^n + 1$ ~ $0$.
Formula 2:
Tabulate, for a small $n$, the values of $x$, the C result of $x \bmod 2^n$, and `x & ((1 << n) - 1)`. Observation shows that when `x < 0`, the C result in column $x$ equals the masked value in column $x + 2^n - 1$ minus $2^n - 1$. Thus, by generalization, the above formula is derived.
- When the divisor is not of the form $2^n$, the optimization method is `x - x / c * c`, where $c$ is the divisor.
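A quick way to convince yourself that both formulas for negative dividends agree with C's `%` is to compare them against a reference. This is a sketch with hypothetical helper names; Python's `&`, `|` and `~` treat negative integers as infinite-width two's complement, so the bit tricks carry over unchanged.

```python
def c_mod(x: int, m: int) -> int:
    """C-style remainder: the sign follows the dividend."""
    r = abs(x) % abs(m)
    return -r if x < 0 else r

def mod_pow2_formula1(x: int, n: int) -> int:
    """((x & mask) - 1 | ~mask) + 1, valid for x < 0."""
    mask = (1 << n) - 1
    return (((x & mask) - 1) | ~mask) + 1

def mod_pow2_formula2(x: int, n: int) -> int:
    """((x + mask) & mask) - mask, valid for x < 0."""
    mask = (1 << n) - 1
    return ((x + mask) & mask) - mask

for x in (-9, -8, -5, -1):
    print(x, c_mod(x, 4), mod_pow2_formula1(x, 2), mod_pow2_formula2(x, 2))
```

For non-negative dividends the plain mask `x & ((1 << n) - 1)` is enough, which is why compilers branch on the sign (or fold the sign in) before applying one of these.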
Virtual Functions
- Making the destructor virtual results in the compiler generating a regular destructor and, in the virtual table, a destructor with `delete`.
- To prevent virtual destructors from freeing memory multiple times, VC++ adds a parameter to the destructor (if the parameter is 1, it frees the memory), while GCC places two destructor addresses in the virtual table.
- Memory layout of a single object:
- Constructor call order (can serve as a basis for reconstructing class inheritance hierarchies):
  - Call the virtual base class constructor (multiple calls in inheritance order).
  - Call the ordinary base class constructor (multiple calls in inheritance order).
  - Call the constructors of object members (multiple calls in definition order).
  - Call the derived class constructor.
- The destructor call order is the reverse.
- Virtual table population process for derived classes (can serve as a basis for reconstructing class inheritance hierarchies):
  - Copy the base class's virtual table.
  - If a derived class virtual function overrides a base class virtual function, replace the corresponding entry with the derived class's virtual function address.
  - If the derived class has new virtual functions, append them to the end of the virtual table.
- Memory layout of single inheritance objects:
- A characteristic of multiple inheritance is that the constructor performs two operations to initialize the virtual table.
- Memory layout of multiple inheritance objects:
- To prevent memory redundancy of the base class in diamond inheritance, virtual inheritance (`virtual public <class name>`) is used. The specific implementation involves passing an additional parameter in the constructor to indicate whether to call the virtual base class constructor.
- To facilitate locating the position of the virtual base class in the object's memory, an 8-byte virtual base class offset table (located in the global data area) is created, with the last 4 bytes indicating the offset of the virtual base class in the current virtual base class offset table.
- Another way to determine virtual inheritance is whether the constructor initializes the virtual base class offset table.
- Memory layout of diamond inheritance objects (gradually losing sanity):
- In IDA, `vftable` represents the virtual table, while `vbtable` represents the virtual base class offset table. IDA can even automatically indicate which class a `vftable`/`vbtable` address points to, which is quite intelligent.
- The only difference between the virtual table of an abstract base class and that of single inheritance is that the abstract base class's entries for pure virtual functions point to `_purecall`, which displays an error message and exits the program. If a class has a `_purecall` virtual table entry, it can be suspected to be an abstract class.
Demo Version Protection Techniques
Serial Number Protection Methods - 2/4/2022
- APIs for reading registration codes: `GetWindowText`, `GetDlgItemText`, `GetDlgItemInt`, etc.
- APIs for displaying registration code correctness: `MessageBox`, `MessageBoxEx`, `ShowWindow`, `CreateDialogParam`, `DialogBoxParam`, etc.
- For protection methods that use plaintext comparison of registration codes, you can open the memory window in OllyDbg and press Ctrl+B to search for the entered serial number (to locate the memory address of the input). In most cases, the actual serial number is located within ±90h bytes around this address.
- The Asm2Clipboard plugin in OllyDbg can be used to extract disassembly and embed it into C code. When converting, pay attention to stack balance, data formats, assembly syntax, and string references.
Warning Window - 2/7/2022
- Window ID Extraction: Use exescope or Resource Hacker.
- The prototype of `DialogBoxParam` is as follows:
int DialogBoxParam(
HINSTANCE hInstance,
LPCTSTR lpTemplateName,
HWND hWndParent,
DLGPROC lpDialogFunc,
LPARAM dwInitParam
);
- Two methods to remove the warning window:
- Use assembly to skip the warning window.
- Replace the parameters of the warning window with those of a normal window.
Time Limit – 2/8/2022
- Common timer functions: `SetTimer()`, `KillTimer()`, `timeSetEvent()`, `GetTickCount()`, `timeGetTime()`.
- Common functions for retrieving time: `GetSystemTime()`, `GetLocalTime()`, `GetFileTime()`.
- Two methods to bypass time restrictions:
  - Skip time-related functions.
  - Skip functions that check for timeout and trigger exit.
- Speed Gear can be used to assist in debugging (not yet successful).
Menu Function Restrictions — 2/9/2022
- Related functions:
  - The `EnableMenuItem()` function is defined as `BOOL EnableMenuItem(HMENU hMenu, UINT uIDEnableItem, UINT uEnable)`. The `uEnable` parameter combines flags such as `MF_ENABLED` (0h), `MF_GRAYED` (1h), `MF_DISABLED` (2h), `MF_BYCOMMAND`, and `MF_BYPOSITION`.
  - The `EnableWindow()` function is defined as `BOOL EnableWindow(HWND hWnd, BOOL bEnable)`. The function returns a non-zero value if the window was previously disabled, and 0 if it was previously enabled.
- Method to remove restrictions (only applicable when the full version and trial version files are identical): modify the enable parameter (`uEnable` or `bEnable`) pushed before the calls to these two functions.
KeyFile Protection——2/10/2022
- Related Functions:
  - LODS instruction: `lods byte ptr [esi]` moves the byte pointed to by `esi` into `al`, then increments `esi`.
- Analysis Approach:
  1. Use Process Monitor to monitor the program's file operations to identify the KeyFile's filename.
  2. Edit the KeyFile using a hex editor.
  3. Set a breakpoint on CreateFile in the debugger to check the pointer to the opened filename and note the returned handle.
  4. Set a breakpoint on the ReadFile function to analyze the file handle passed to ReadFile and the buffer address. The file handle is usually the same as the one returned in step 3. Set a memory breakpoint on the bytes stored in the buffer to monitor the content read from the KeyFile.
Network Verification - 2/12/2022
- Related Functions:
  - `send()` function, extended by Microsoft as `WSASend()`:
    int send( SOCKET s, const char FAR *buf, int len, int flags );
  - `recv()` function, extended by Microsoft as `WSARecv()`:
    int recv( SOCKET s, char FAR *buf, int len, int flags );
- Analysis Approach:
  - Analyze the sent and received data packets. (Key step)
  - Two methods:
    - Write a server to receive and send data. If the client uses a domain name to log in, modify the hosts file; if it connects directly via IP, set a breakpoint at `inet_addr` or `connect` and redirect the IP to the local machine. Alternatively, proxy software can achieve this.
    - Directly modify the client program. First, paste the correct received data packet to a blank address, then skip the `send()` and `recv()` functions and replace them with code that handles the correct data packets. Finally, bypass dialogs such as "Connection Failed."
Question: The getasm.py script does not seem to be compatible with IDA 7.6. How can the following code be reimplemented?
#coding=utf-8
##"Encryption and Decryption" Fourth Edition
##code by DarkNess0ut
import os
import sys

def Getasm(ea_from, ea_to, range1, range2):
    fp = open("code.txt","w")
    ea = ea_from
    while ea < ea_to:
        cmd = GetMnem(ea)
        if cmd == "mov" or cmd == "lea":
            opcode = Dword(NextNotTail(ea)-4)
            if opcode < 0: #opcode < 0, handles instructions like mov edx, [ebp-350]; otherwise, handles mov edx, [ebp+350]
                opcode = (~opcode + 1)
            Message("-> %08X %08X\n" % (ea, opcode))
            if range1 <= opcode <= range2:
                delta = opcode - range1
                MakeComm(ea, "// +0x%04X" % delta) # Add comment to IDA
                fp.write("%08X %s\n" % (ea, GetDisasm(ea)))
        ea = NextNotTail(ea)
    fp.close()
    Message("OK!")

Getasm(0x401000,0x40F951,0x41AE68,0x0041AEC1);
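One possible answer, offered as a sketch rather than a verified port: IDA 7.x renamed the old `idc` helpers, and to my understanding the replacements are `print_insn_mnem` (was `GetMnem`), `get_wide_dword` (was `Dword`), `next_not_tail` (was `NextNotTail`) and `set_cmt` (was `MakeComm`), with `GetDisasm` still available. Check these names against the `idc.py` shipped with your IDA version. The function below takes the API module as a parameter (`api`, a name chosen here) so the logic can also be exercised outside IDA.

```python
def to_signed32(value):
    """Reinterpret a 32-bit pattern as signed (Dword-style reads are unsigned)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def get_asm(api, ea_from, ea_to, range1, range2, out_path="code.txt"):
    """Comment every mov/lea whose trailing dword falls inside [range1, range2].

    `api` is expected to be the `idc` module when run inside IDA 7.x.
    """
    with open(out_path, "w") as fp:
        ea = ea_from
        while ea < ea_to:
            next_ea = api.next_not_tail(ea)
            if api.print_insn_mnem(ea) in ("mov", "lea"):
                # The last 4 bytes of the instruction, treated as a displacement;
                # abs() covers forms like [ebp-350] as in the original script.
                value = abs(to_signed32(api.get_wide_dword(next_ea - 4)))
                print("-> %08X %08X" % (ea, value))
                if range1 <= value <= range2:
                    api.set_cmt(ea, "// +0x%04X" % (value - range1), 0)
                    fp.write("%08X %s\n" % (ea, api.GetDisasm(ea)))
            ea = next_ea
    print("OK!")

# Inside IDA 7.x (assumed usage):
#   import idc
#   get_asm(idc, 0x401000, 0x40F951, 0x41AE68, 0x41AEC1)
```

`print()` is used instead of `Message()` because IDAPython redirects standard output to the output window.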
CD Detection - 2/11/2022
- Related Functions:
  - `GetDriveType()`, retrieves the type of a disk drive

    UINT GetDriveType( LPCTSTR lpRootPathName );
    /* Return values:
       0: Drive type cannot be determined.
       1: Root path does not exist.
       2: Removable storage.
       3: Fixed drive (hard disk).
       4: Remote drive (network).
       5: CD-ROM drive.
       6: RAM disk. */
  - `GetLogicalDrives()`, retrieves logical drive letters, no parameters

    /* Return values:
       Returns 0 on failure; otherwise, returns a bitmask representing currently available drives, e.g.,
       bit 0 drive A
       bit 1 drive B
       bit 2 drive C
       ...... */
  - `GetLogicalDriveStrings()`, retrieves root drive paths of logical drives

    DWORD GetLogicalDriveStrings( DWORD nBufferLength, LPTSTR lpBuffer );
    /* Return values:
       Returns 0 on failure
       Returns the actual number of characters on success */
  - `GetFileAttributes()`, determines the attributes of a specified file

    DWORD GetFileAttributes( LPCTSTR lpFileName );
- Analysis Methods:
  - For simpler CD detection (first obtain all drive lists, then check the type of each drive; if it is a CD-ROM drive, use `CreateFile()` or `FindFirstFile()` to check for file existence, attributes, size, content, etc.), simply set breakpoints on the above functions, locate where the CD drive is checked, and modify the conditional instructions.
  - For enhanced types (where critical data for the program is stored on the CD), multiple copies can be made using burning tools, or virtual drive programs can be used to simulate the original CD (among which Daemon Tools ~~is not only free but~~ can also simulate some encrypted CDs).
Running Only One Instance - 2/11/2022
- Implementation Methods:
  - Window Lookup Method: If a window with the same class name and title is found, exit the program. Implemented using `FindWindowA()` and `GetWindowText()`.

    HWND FindWindowA(
        LPCTSTR lpClassName,
        LPCTSTR lpWindowName
    ); // Returns 0 if no matching window is found.
  - Using Mutex Objects: Generally implemented with `CreateMutexA()`, which creates a named or unnamed mutex object (what is this?).

    HANDLE CreateMutexA(
        LPSECURITY_ATTRIBUTES lpMutexAttributes, // Security attributes
        BOOL bInitialOwner,                      // Initial ownership of the mutex
        LPCTSTR lpName                           // Pointer to the mutex name
    ); // If the function succeeds, it returns a handle to the mutex object.
  - Using Shared Sections (Section). This section has read, write, and shared protection attributes, allowing multiple instances to share the same memory block. Place a variable as a counter in this section, and all instances of the application can share this variable, thereby determining whether another instance is already running.
- Bypass Methods:
  - Modify the application's window title.
  - Alter the return values of functions like `FindWindow()` (or modify the conditional instructions).
Common Breakpoint Setting Techniques——2/11/2022
Mastering Win32 programming techniques is still very important!

Encryption Algorithms
The notes in this section primarily focus on algorithm identification, without delving into the specific processes of the algorithms (except for public key algorithms, as they lack distinctive constants), assembly analysis, and crack operations.
Tools such as IDA's FindCrypt or PEiD's Krypto ANALyzer can be used to assist in algorithm analysis.
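The idea behind such tools can be sketched in a few lines: search the binary for byte patterns of well-known constants. The following is only an illustration, not how FindCrypt or Krypto ANALyzer are actually implemented; the `SIGNATURES` table is a small hand-picked assumption using constants listed in these notes, and it only looks for little-endian 32-bit values.

```python
import struct

# Hypothetical mini signature table: algorithm hint -> constants that must all appear.
SIGNATURES = {
    "MD5/SHA-1 initial digest": [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476],
    "SHA-1 round constants": [0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6],
    "TEA/XTEA delta": [0x9E3779B9],
    "SM3 initial digest": [0x7380166F, 0x4914B2B9, 0x172442D7, 0xDA8A0600],
    "CRC32 polynomial": [0xEDB88320],
}

def scan(blob: bytes):
    """Return the names of all signatures whose constants occur in blob."""
    hits = []
    for name, constants in SIGNATURES.items():
        if all(struct.pack("<I", c) in blob for c in constants):
            hits.append(name)
    return hits

sample = b"\x90" * 4 + struct.pack("<4I", 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
sample += b"\xb9\x79\x37\x9e"   # 0x9E3779B9 stored little-endian
print(scan(sample))
```

A single 4-byte constant can match by accident, so real tools weigh multi-constant tables (S-boxes, round constant arrays) more heavily.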
One-Way Hash Algorithm — 2/27/2022
MD5
Important Constants:
- Initial message digest: `67452301h`, `efcdab89h`, `98badcfeh`, `10325476h`.
- 32-bit values corresponding to `floor(2**32 * abs(sin(i)))`, such as `d76aa478h`.
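The second family of constants can be regenerated directly from the formula, which is handy when checking whether a table found in a binary is the standard MD5 table or a modified one. A minimal sketch (the names `md5_sine_constant` and `MD5_T` are chosen here for illustration):

```python
import math

def md5_sine_constant(i: int) -> int:
    """floor(2**32 * abs(sin(i))) for i in 1..64, with i in radians."""
    return int(2 ** 32 * abs(math.sin(i))) & 0xFFFFFFFF

MD5_T = [md5_sine_constant(i) for i in range(1, 65)]
print(["%08xh" % t for t in MD5_T[:4]])
```

Comparing a suspicious table against `MD5_T` entry by entry quickly shows which constants, if any, a variant has changed.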
Possible Variants:
- Modify the initial four constants.
- Change the padding method of the original string.
- Alter the processing steps of the hash transformation.
SHA
SHA-1 constants: `5a827999h`, `6ed9eba1h`, `8f1bbcdch`, `ca62c1d6h`.
SHA-1 160-bit initial message digest: `67452301h`, `efcdab89h`, `98badcfeh`, `10325476h`, `c3d2e1f0h`.
Initial message digests for SHA-256, SHA-384, and SHA-512:

SM3
Publicly available national cryptographic algorithm, process overview: https://zhuanlan.zhihu.com/p/129692191
Possible code implementation: https://blog.csdn.net/a344288106/article/details/80094878
Constants? `79CC4519h`, `7A879D8Ah`
Initial message digest? `7380166Fh`, `4914B2B9h`, `172442D7h`, `DA8A0600h`, `A96F30BCh`, `163138AAh`, `E38DEE4Dh`, `B0FB0E4Eh`
Symmetric Encryption Algorithm – 2/28/2022
RC4
Decryption script sourced online:
import base64

def rc4_main(key="init_key", message="init_message"):
    """Decrypt a Base64-encoded message with RC4 using the given key."""
    print("RC4 decryption main function called successfully")
    print('\n')
    s_box = rc4_init_sbox(key)
    crypt = rc4_excrypt(message, s_box)
    return crypt

def rc4_init_sbox(key):
    """Key-scheduling algorithm (KSA): scramble the S-box with the key."""
    s_box = list(range(256))
    print("Original s-box: %s" % s_box)
    print('\n')
    j = 0
    for i in range(256):
        j = (j + s_box[i] + ord(key[i % len(key)])) % 256
        s_box[i], s_box[j] = s_box[j], s_box[i]
    print("Scrambled s-box: %s" % s_box)
    print('\n')
    return s_box

def rc4_excrypt(plain, box):
    """Keystream generation (PRGA) and XOR; RC4 encryption and decryption are identical."""
    print("Decryption program called successfully.")
    print('\n')
    plain = base64.b64decode(plain.encode('utf-8'))
    plain = bytes.decode(plain)
    res = []
    i = j = 0
    for s in plain:
        i = (i + 1) % 256
        j = (j + box[i]) % 256
        box[i], box[j] = box[j], box[i]
        t = (box[i] + box[j]) % 256
        k = box[t]
        res.append(chr(ord(s) ^ k))
    print("res is used to decrypt the string, decrypted result: %s" % res)
    print('\n')
    cipher = "".join(res)
    print("Decrypted string: %s" % cipher)
    print('\n')
    print("Decrypted output (without any encoding):")
    print('\n')
    return cipher

# Ciphertext bytes taken from the challenge binary
a = [0xc6,0x21,0xca,0xbf,0x51,0x43,0x37,0x31,0x75,0xe4,0x8e,0xc0,0x54,0x6f,0x8f,0xee,0xf8,0x5a,0xa2,0xc1,0xeb,0xa5,0x34,0x6d,0x71,0x55,0x8,0x7,0xb2,0xa8,0x2f,0xf4,0x51,0x8e,0xc,0xcc,0x33,0x53,0x31,0x0,0x40,0xd6,0xca,0xec,0xd4]
s = ""
for i in a:
    s += chr(i)
# rc4_excrypt expects Base64 text, so wrap the raw bytes first
s = str(base64.b64encode(s.encode('utf-8')), 'utf-8')
rc4_main("Nu1Lctf233", s)
TEA
The constant `0x9e3779b9` is derived from the 32-bit golden ratio.
(Note: XTEA/XXTEA also use this constant.)
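The derivation of the constant, and the reason decryption code starts from `sum = 0xC6EF3720`, can be shown in a short sketch of the standard TEA round function. The key and plaintext values below are arbitrary test inputs, not data from the book.

```python
import math

MASK = 0xFFFFFFFF
DELTA = int(2 ** 32 * (math.sqrt(5) - 1) / 2)   # 2**32 divided by the golden ratio

def tea_encrypt(v, k):
    v0, v1 = v
    total = 0
    for _ in range(32):
        total = (total + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + k[0]) ^ (v1 + total) ^ ((v1 >> 5) + k[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + k[2]) ^ (v0 + total) ^ ((v0 >> 5) + k[3]))) & MASK
    return v0, v1

def tea_decrypt(v, k):
    v0, v1 = v
    total = (DELTA * 32) & MASK                  # where encryption left off
    for _ in range(32):
        v1 = (v1 - (((v0 << 4) + k[2]) ^ (v0 + total) ^ ((v0 >> 5) + k[3]))) & MASK
        v0 = (v0 - (((v1 << 4) + k[0]) ^ (v1 + total) ^ ((v1 >> 5) + k[1]))) & MASK
        total = (total - DELTA) & MASK
    return v0, v1

key = (1, 2, 3, 4)                               # arbitrary 128-bit test key
block = (0x01234567, 0x89ABCDEF)
print(hex(DELTA), hex((DELTA * 32) & MASK))
print(tea_decrypt(tea_encrypt(block, key), key) == block)
```

Seeing either `0x9e3779b9` or `0xC6EF3720` in a loop of 32 rounds is therefore a strong hint for TEA or one of its relatives.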
Decryption script:
#include <stdio.h>
#include <stdint.h>
#define DELTA 0x9e3779b9
#define MX (((z>>5^y<<2) + (y>>3^z<<4)) ^ ((sum^y) + (key[(p&3)^e] ^ z)))  /* XXTEA mixing macro, unused here */

void btea (uint32_t* v, int n, const uint32_t* k) {  // however the 'n' is useless
    uint32_t v0=v[0], v1=v[1], sum=0xC6EF3720, i;    /* set up; sum = delta * 32 */
    uint32_t delta=0x9e3779b9;                       /* a key schedule constant */
    uint32_t k0=k[0], k1=k[1], k2=k[2], k3=k[3];     /* cache key */
    (void)n;
    for (i=0; i<32; i++) {                           /* basic cycle start */
        v1 -= ((v0<<4) + k2) ^ (v0 + sum) ^ ((v0>>5) + k3);
        v0 -= ((v1<<4) + k0) ^ (v1 + sum) ^ ((v1>>5) + k1);
        sum -= delta;
    }                                                /* end cycle */
    v[0]=v0; v[1]=v1;
}

int main()
{
    uint32_t v[2]= {0x3e8947cb,0xcc944639};
    uint32_t w[2]= {0x31358388,0x3b0b6893};
    uint32_t x[2]= {0xda627361,0x3b2e6427};
    uint32_t const k[4]= {17477,16708,16965,17734};
    int n = 2; //The absolute value of n indicates the length of v, positive for encryption, negative for decryption
    // v is the data to be encrypted/decrypted, consisting of two 32-bit unsigned integers
    // k is the encryption/decryption key, consisting of four 32-bit unsigned integers, i.e., a 128-bit key
    btea(v, -n, k);
    printf("%x %x ",v[0],v[1]);
    btea(w, -n, k);
    printf("%x %x ",w[0],w[1]);
    btea(x, -n, k);
    printf("%x %x",x[0],x[1]);
    return 0;
}
IDEA
The 52 decryption subkeys are the inverses of the encryption subkeys: additive inverses modulo $2^{16}$ and multiplicative inverses modulo $2^{16}+1$.
The subkeys should be used in the reverse order of the encryption key.
The decryption code is omitted. Please search for bouncycastle on your own.
BlowFish
Based on the Feistel network.
P-array (derived from the fractional part of Pi): `243f6a88h`, `85a308d3h`, `13198a2eh`, `03707344h`
Decryption code omitted.
AES (Rijndael)
Decryption modes include:
- ECB (Electronic Code Book) mode
- CBC (Cipher Block Chaining) mode
- CTR (Counter) mode
- CFB (Cipher Feedback Mode) mode
- OFB (Output Feedback) mode
Knowing these isn't very useful—you'll still have to try them one by one when the time comes.
S-Box:


Online decryption websites can only decrypt their own, not others'.
Decryption website: http://tool.chacuo.net/cryptaes
SM4
https://zhuanlan.zhihu.com/p/363900323
S-box:
Values of the system parameters
Specific values of the 32 fixed parameters
Refer to the link in the SM2 section for the decryption tool.
Public Key Encryption Algorithm - 3/1/2022
RSA
- Generation of Public and Private Keys:
  - First, we arbitrarily choose two prime numbers $p$ and $q$, and compute $n = pq$.
  - Using Euler's totient function $\varphi(n)$ (the number of positive integers less than or equal to $n$ that are coprime to $n$), we find $\varphi(n) = (p-1)(q-1)$.
  - Select an integer $e$ smaller than $\varphi(n)$ such that $e$ is coprime to $\varphi(n)$. Then, compute the modular inverse $d$ of $e$ modulo $\varphi(n)$, i.e., $ed \equiv 1 \pmod{\varphi(n)}$.
  - Destroy $p$ and $q$.

  Thus, we obtain the public key $(n, e)$ and the private key $(n, d)$.
- Encrypting Information:
  Suppose B wants to send a message $m$ to A, and B knows the $n$ and $e$ generated by A. B converts $m$ into a positive integer smaller than $n$ using a pre-agreed format (such as Unicode). Then, B encrypts $m$ into $c$ using the formula $c \equiv m^e \pmod{n}$, which can be computed efficiently using fast exponentiation.
- Decrypting Information:
  After receiving the message $c$ from B, A can use their private key $d$ to decode it. A recovers $m$ from $c$ using the formula $m \equiv c^d \pmod{n}$.
For information on attacks against RSA, which is a specialized topic in cryptography, you can refer to this article.
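The key generation, encryption, and decryption steps can be traced with toy numbers. The values below ($p = 61$, $q = 53$, $e = 17$, message 65) are the common textbook example and are an assumption here, not necessarily the values used in the original notes. `pow(e, -1, phi)` requires Python 3.8 or later.

```python
# Toy RSA walk-through; real keys use primes of 1024 bits or more.
p, q = 61, 53                 # assumed example primes
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # private exponent: e * d == 1 (mod phi)

m = 65                        # message, already encoded as an integer < n
c = pow(m, e, n)              # encryption: c = m^e mod n (fast exponentiation)
recovered = pow(c, d, n)      # decryption: m = c^d mod n

print("n =", n, "phi =", phi, "d =", d)
print("ciphertext =", c, "recovered =", recovered)
```

In reverse engineering, spotting a modular exponentiation with a small exponent such as 17 or 65537 next to a large constant is a typical sign of RSA.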
ElGamal
- Key Pair Generation:
  - Select a large prime $p$, a random number $g$, and a random number $x$ such that $g < p$ and $x < p$.
  - Compute $y = g^x \bmod p$.
  - The public key is $(y, g, p)$, and the private key is $x$.
- Encryption and Decryption:
  - Select a random number $k$ such that $1 < k < p-1$ and $\gcd(k, p-1) = 1$.
  - Compute $a = g^k \bmod p$.
  - Compute $b = y^k M \bmod p$, where $(a, b)$ is the ciphertext.
  - For decryption, compute $M = b / a^x \bmod p$.
- Signature:
  - Select a random number $k$ such that $1 < k < p-1$ and $\gcd(k, p-1) = 1$.
  - Compute $a = g^k \bmod p$.
  - Let the plaintext be $M$, and find a solution $b$ that satisfies $M = (xa + kb) \bmod (p-1)$. It can be proven that such a $b$ is unique. The signature is $(a, b)$.
  - To verify the signature, ensure that $y^a a^b \equiv g^M \pmod{p}$ and $1 \le a < p$.
Attacks on Discrete Logarithms: BSGS, Pollard-Rho, Index-Calculus Algorithm, Pohlig-Hellman Algorithm, etc.
If the same

DSA—3/2/2022
Used for signing, not for encryption/decryption.
- Key pair generation:
  - $p$ is a prime number of $L$ bits, where $512 \le L \le 1024$ and $L$ is a multiple of 64.
  - $q$ is a prime factor of $p-1$, 160 bits long.
  - $g = h^{(p-1)/q} \bmod p$, with $h < p-1$ and $g > 1$.
  - $x$ is the private key, $x < q$.
  - $y = g^x \bmod p$.
  - $p$, $q$, $g$, and $y$ are public keys.
  - $k$ is a random number, $0 < k < q$, to be discarded after use.
- Signature generation:
  - Input plaintext $M$, public keys $p$, $q$, $g$, private key $x$, and random number $k$.
  - `r = g**k % p % q`
  - `s = inv(k) * (SHA-1(M) + x*r) % q`, where `inv(k)` is the modular multiplicative inverse of $k$ modulo $q$.
  - The signature is $(r, s)$.
- Signature verification:
  - Input plaintext $M'$, public keys $p$, $q$, $g$, $y$, and signature $r'$, $s'$.
  - First, ensure $0 < r' < q$ and $0 < s' < q$.
  - `w = inv(s') % q`
  - `u1 = ((SHA-1(M')) * w) % q`
  - `u2 = (r' * w) % q`
  - `v = (g**u1 * y**u2) % p % q`
  - If $v = r'$, the signature verification is successful.
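The signing and verification steps translate almost line by line into code. The sketch below uses deliberately tiny, insecure parameters ($p = 283$, $q = 47$, $h = 2$, all assumptions chosen so the arithmetic is easy to follow) and the standard library's SHA-1; function names are hypothetical.

```python
import hashlib
import random

p, q = 283, 47                       # toy primes with q dividing p - 1 (282 = 6 * 47)
g = pow(2, (p - 1) // q, p)          # g = h^((p-1)/q) mod p with h = 2

def sha1_int(message: bytes) -> int:
    return int.from_bytes(hashlib.sha1(message).digest(), "big")

def keygen():
    x = random.randrange(1, q)       # private key
    return x, pow(g, x, p)           # (x, y = g^x mod p)

def sign(message: bytes, x: int):
    while True:
        k = random.randrange(1, q)   # per-signature secret, never reused
        r = pow(g, k, p) % q
        s = pow(k, -1, q) * (sha1_int(message) + x * r) % q
        if r != 0 and s != 0:        # retry in the rare degenerate case
            return r, s

def verify(message: bytes, r: int, s: int, y: int) -> bool:
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)
    u1 = sha1_int(message) * w % q
    u2 = r * w % q
    v = pow(g, u1, p) * pow(y, u2, p) % p % q
    return v == r

x, y = keygen()
r, s = sign(b"hello", x)
print(verify(b"hello", r, s, y))
```

With a 47-element subgroup, forged or altered messages can verify by chance, so this is only useful for following the algebra, not for judging security.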
ECC with GF(p)——March 3, 2022
Due to the high mathematical knowledge requirements, it is mainly used in Crypto challenges and is less likely to be encountered in Reverse Engineering. Therefore, the specific principles are not provided here.
Algorithm principles from Wikipedia: https://en.wikipedia.org/wiki/Elliptic-curve_cryptography
Algorithm principles + Python implementation from Zhihu: https://zhuanlan.zhihu.com/p/101907402
Potentially useful ECC template:
import collections
import random

EllipticCurve = collections.namedtuple('EllipticCurve', 'name p a b g n h')

curve = EllipticCurve(
    'secp256k1',
    # Field characteristic.
    p=int(input('p=')),
    # Curve coefficients.
    a=int(input('a=')),
    b=int(input('b=')),
    # Base point.
    g=(int(input('Gx=')),
       int(input('Gy='))),
    # Subgroup order.
    n=int(input('k=')),
    # Subgroup cofactor.
    h=1,
)


# Modular arithmetic ##########################################################

def inverse_mod(k, p):
    """Returns the inverse of k modulo p.
    This function returns the only integer x such that (x * k) % p == 1.
    k must be non-zero and p must be a prime.
    """
    if k == 0:
        raise ZeroDivisionError('division by zero')

    if k < 0:
        # k ** -1 = p - (-k) ** -1 (mod p)
        return p - inverse_mod(-k, p)

    # Extended Euclidean algorithm.
    s, old_s = 0, 1
    t, old_t = 1, 0
    r, old_r = p, k

    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t

    gcd, x, y = old_r, old_s, old_t

    assert gcd == 1
    assert (k * x) % p == 1

    return x % p


# Functions that work on curve points #########################################

def is_on_curve(point):
    """Returns True if the given point lies on the elliptic curve."""
    if point is None:
        # None represents the point at infinity.
        return True

    x, y = point

    return (y * y - x * x * x - curve.a * x - curve.b) % curve.p == 0


def point_neg(point):
    """Returns -point."""
    assert is_on_curve(point)

    if point is None:
        # -0 = 0
        return None

    x, y = point
    result = (x, -y % curve.p)

    assert is_on_curve(result)

    return result


def point_add(point1, point2):
    """Returns the result of point1 + point2 according to the group law."""
    assert is_on_curve(point1)
    assert is_on_curve(point2)

    if point1 is None:
        # 0 + point2 = point2
        return point2
    if point2 is None:
        # point1 + 0 = point1
        return point1

    x1, y1 = point1
    x2, y2 = point2

    if x1 == x2 and y1 != y2:
        # point1 + (-point1) = 0
        return None

    if x1 == x2:
        # This is the case point1 == point2.
        m = (3 * x1 * x1 + curve.a) * inverse_mod(2 * y1, curve.p)
    else:
        # This is the case point1 != point2.
        m = (y1 - y2) * inverse_mod(x1 - x2, curve.p)

    x3 = m * m - x1 - x2
    y3 = y1 + m * (x3 - x1)
    result = (x3 % curve.p, -y3 % curve.p)

    assert is_on_curve(result)

    return result


def scalar_mult(k, point):
    """Returns k * point computed using the double and point_add algorithm."""
    assert is_on_curve(point)

    if k < 0:
        # k * point = -k * (-point)
        return scalar_mult(-k, point_neg(point))

    result = None
    addend = point

    while k:
        if k & 1:
            # Add.
            result = point_add(result, addend)

        # Double.
        addend = point_add(addend, addend)

        k >>= 1

    assert is_on_curve(result)

    return result


# Keypair generation and ECDHE ################################################

def make_keypair():
    """Generates a random private-public key pair."""
    private_key = curve.n
    public_key = scalar_mult(private_key, curve.g)

    return private_key, public_key


private_key, public_key = make_keypair()
print("private key:", hex(private_key))
print("public key: (0x{:x}, 0x{:x})".format(*public_key))
SM2
A national cryptographic algorithm based on ECC.
SM2~SM4 encryption and decryption tool: https://github.com/ASTARCHEN/snowland-smx-python
Other Algorithms — March 1, 2022
CRC32
Can only be used for file verification, not for encryption.
The key to recognizing it in a binary is spotting the crctab lookup table that is generated during initialization.
The algorithm is just a few lines of code:
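#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t crctab[256]; // unsigned, so that >> is a logical shift

// Build the 256-entry lookup table.
void gentable(void) {
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t crc = i;
        for (int j = 0; j < 8; j++) {
            if (crc & 1)
                crc = (crc >> 1) ^ 0xedb88320u; // bit-reversed form of 0x04c11db7
            else
                crc >>= 1;
        }
        crctab[i] = crc;
    }
}

int main(void)
{
    // Sample input; in practice Data holds the byte values of your file.
    const unsigned char Data[] = "123456789";
    size_t Len = strlen((const char *)Data);

    gentable();
    uint32_t dwCRC = 0xffffffffu;
    for (size_t i = 0; i < Len; i++) {
        dwCRC = crctab[(dwCRC ^ Data[i]) & 0xff] ^ (dwCRC >> 8);
    }
    dwCRC = ~dwCRC;
    printf("%08x\n", (unsigned)dwCRC); // cbf43926 for "123456789"
    return 0;
}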
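Since the point above is recognizing the table, here is a small Python sketch that rebuilds it and cross-checks the result against the standard library's `zlib.crc32`. The function names `make_table` and `crc32` are my own; the first few table entries it prints are the constants worth searching for in a disassembly.

```python
import zlib


def make_table(poly=0xEDB88320):
    """Build the 256-entry table for the reflected CRC-32 polynomial."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


CRCTAB = make_table()


def crc32(data: bytes) -> int:
    """Table-driven CRC-32, same loop as the C version above."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = CRCTAB[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


sample = b"123456789"
print([hex(v) for v in CRCTAB[:4]])          # table constants to look for
print(hex(crc32(sample)), hex(zlib.crc32(sample)))
```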
Base64
Since the encoding table might be changed during competitions, or even some logic of the algorithm might be modified, it is necessary to understand its implementation process.
Because 3 × 8 = 4 × 6 = 24, every three input bytes map to exactly four Base64 characters.
The mapping method is straightforward: list the 24 bits of the three bytes as a binary string (in big-endian order), then divide it into four groups of 6 bits each.
Each 6 bits can only represent values from 0 to 63, which correspond to the 64 characters in the Base64 table (array). These bits are replaced by their corresponding characters.
To keep the encoded string length a multiple of 4, any 6-bit group left without input bits when the input runs out (note that this is different from a group whose value is 0) is written as the padding character =.
An example is shown in the figure below:

Core code:
// i indexes the output (res) and j the input (str), so the bound len is presumably the encoded length, not the input length.
// str should be unsigned char so that the right shifts do not sign-extend.
for (i = 0, j = 0; i < len - 2; j += 3, i += 4)
{
    res[i] = base64_table[str[j] >> 2]; // First 6 bits of the first byte
    res[i + 1] = base64_table[(str[j] & 0x3) << 4 | (str[j + 1] >> 4)]; // Last 2 bits of the first byte + first 4 bits of the second byte
    res[i + 2] = base64_table[(str[j + 1] & 0xf) << 2 | (str[j + 2] >> 6)]; // Last 4 bits of the second byte + first 2 bits of the third byte
    res[i + 3] = base64_table[str[j + 2] & 0x3f]; // Last 6 bits of the third byte
}
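Because a challenge may swap the encoding table, it helps to have an encoder and decoder that take the table as a parameter. The following is a minimal sketch; `b64encode_custom`, `b64decode_custom`, and the case-swapped table are hypothetical names and values chosen for illustration. The decoder simply translates the custom alphabet back to the standard one and reuses the stdlib decoder, so it assumes only the table was changed and that the table does not contain "=".

```python
import base64

STD_TABLE = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64encode_custom(data: bytes, table: str = STD_TABLE) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack up to three bytes into a 24-bit big-endian integer (missing bytes count as 0).
        n = int.from_bytes(chunk.ljust(3, b"\x00"), "big")
        chars = [table[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        pad = 3 - len(chunk)
        if pad:
            # Groups that received no input bits become '='.
            chars[4 - pad:] = ["="] * pad
        out.extend(chars)
    return "".join(out)


def b64decode_custom(text: str, table: str = STD_TABLE) -> bytes:
    # Map the custom alphabet back to the standard one, then decode normally.
    return base64.b64decode(text.translate(str.maketrans(table, STD_TABLE)))


custom = STD_TABLE.swapcase()  # example of a modified table
encoded = b64encode_custom(b"flag{demo}", custom)
print(encoded, b64decode_custom(encoded, custom))
```

If the algorithm's bit logic was modified as well, the decoder has to be rewritten by hand rather than delegated to `base64.b64decode`.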
Common Encryption Library Interfaces and Their Identification – 3/3/2022
Miracl, FGInt, Crypto++, OpenSSL, and more.
Application of Encryption Algorithms in Software Protection - 3/3/2022
Many software products demonstrate that security and user experience are often contradictory.
- Do not rely on self-designed algorithms.
- Use mature and highly secure cryptographic algorithms whenever possible.
- Regularly update encryption keys.
- Update algorithms or security mechanisms periodically, if cost permits.
- Strictly follow standard-recommended security parameters and use standardized security algorithms or protocols.
- Examine self-designed security mechanisms from an attacker's perspective.
- Remove information prompts useful to attackers when using open-source cryptographic algorithm libraries.
- Stay updated with the latest advancements in cryptographic algorithms.
Windows Kernel Fundamentals
Kernel Theory Fundamentals - 1/4/2023
- Virtual memory of user-mode programs is isolated from each other, while kernel-mode programs share a common virtual memory space. Hence, if a kernel driver crashes, it results in a blue screen.
- User-mode programs cannot access kernel-mode memory, but the reverse is possible.

- User-mode programs operate at privilege level R3, while kernel drivers run at R0 (the highest level).
- In Windows x64, user-mode programs use the virtual address range 0x0000'00000000 to 0x7FFF'FFFFFFFF. Addresses from 0xFFFF8000'00000000 upward are reserved for kernel mode.
- The Windows driver framework is divided into NT drivers, WDM drivers, and KMDF drivers.
- Each driver object creates one or more device objects, and each device object contains a pointer to the next device object, forming a device chain.
- Windows organizes devices in a tree structure known as the device tree. Nodes in the device tree are called "device nodes," and the root node is referred to as the "root device node." Typically, the root device node is depicted at the bottom of the device tree.
- Device objects in a device stack fall into three roles: Filter Device Objects (Filter DO), Function Device Objects (FDO), and Physical Device Objects (PDO). The first device object created sits at the bottom of the device stack, and the last one created sits at the top.
- Communication between R3 and R0 occurs through IRPs (similar to packets in network communication). IRPs are passed down the device stack.
- IRQL levels are defined as follows (higher values indicate higher priority):
#if defined(_AMD64_)
//
// Interrupt Request Level definitions
//
#define PASSIVE_LEVEL 0 // Passive release level
#define LOW_LEVEL 0 // Lowest interrupt level
#define APC_LEVEL 1 // APC interrupt level
#define DISPATCH_LEVEL 2 // Dispatcher level
#define CMCI_LEVEL 5 // CMCI handler level
#define CLOCK_LEVEL 13 // Interval clock level
#define IPI_LEVEL 14 // Interprocessor interrupt level
#define DRS_LEVEL 14 // Deferred Recovery Service level
#define POWER_LEVEL 14 // Power failure level
#define PROFILE_LEVEL 15 // timer used for profiling.
#define HIGH_LEVEL 15 // Highest interrupt level
#endif
One related blue screen error code is IRQL_NOT_LESS_OR_EQUAL:
The IRQL_NOT_LESS_OR_EQUAL bug check has a value of 0x0000000A. This bug check indicates that Microsoft Windows or a kernel-mode driver accessed paged memory at an invalid address while at a raised interrupt request level (IRQL). The cause is typically a bad pointer or a pageability problem.
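The x64 user/kernel address split listed in the notes above can be turned into a quick check, which is handy when reading debugger output. This is a sketch only: `classify` is a name I made up, and the "non-canonical" gap between the two ranges assumes the usual 48-bit virtual addressing.

```python
USER_MAX = 0x00007FFF_FFFFFFFF    # top of the user-mode range
KERNEL_MIN = 0xFFFF8000_00000000  # bottom of the kernel-mode range


def classify(addr: int) -> str:
    """Classify a 64-bit virtual address as user, kernel, or non-canonical."""
    if not 0 <= addr <= 0xFFFFFFFF_FFFFFFFF:
        raise ValueError("not a 64-bit address")
    if addr <= USER_MAX:
        return "user"
    if addr >= KERNEL_MIN:
        return "kernel"
    return "non-canonical"


for address in (0x7FFDE000, 0xFFFFF806_451DA980, 0x00008000_00000000):
    print(hex(address), classify(address))
```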
Core Important Data Structures — 1/5/2023
Kernel Objects
Common types include Dispatcher objects, I/O objects, process objects, and thread objects.
SSDT
https://m0uk4.gitbook.io/notebooks/mouka/windowsinternal/ssdt-hook
The System Service Dispatch Table (SSDT) is simply an array of addresses of kernel routines on 32-bit operating systems, or an array of relative offsets to the same routines on 64-bit operating systems.
SSDT is the first member of the Service Descriptor Table kernel memory structure as shown below:
typedef struct tagSERVICE_DESCRIPTOR_TABLE {
SYSTEM_SERVICE_TABLE nt; //effectively a pointer to Service Dispatch Table (SSDT) itself
SYSTEM_SERVICE_TABLE win32k;
SYSTEM_SERVICE_TABLE sst3; //pointer to a memory address that contains how many routines are defined in the table
SYSTEM_SERVICE_TABLE sst4;
} SERVICE_DESCRIPTOR_TABLE;
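In x64, the relation between an SSDT entry and its function address is shown below (KiServiceTable is the table base stored in the first field of KeServiceDescriptorTable, and each entry is a signed 32-bit value):
FuncAddr = KiServiceTable + (KiServiceTable[index] >> 4)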
SSDT lookup:
.foreach /ps 1 /pS 1 ( offset {dd /c 1 nt!KiServiceTable L poi(nt!KeServiceDescriptorTable+10)}){ r $t0 = ( offset >>> 4) + nt!KiServiceTable; .printf "%p - %y\n", $t0, $t0 }
SSDT(shadow) struct:
struct SSDTStruct
{
LONG* pServiceTable;
PVOID pCounterTable;
#ifdef _WIN64
ULONGLONG NumberOfServices;
#else
ULONG NumberOfServices;
#endif
PCHAR pArgumentTable;
};
Function Index to real function address:
readAddress = (ULONG_PTR)(ntTable[FunctionIndex] >> 4) + SSDT(Shadow)BaseAddress;
SSDT(shadow) lookup:
On x64, no symbols are available for these entries.
0: kd> !process 0 0 mspaint.exe
PROCESS ffff850e48ee1080
SessionId: 1 Cid: 0adc Peb: 219f280000 ParentCid: 12c8
DirBase: 28b00002 ObjectTable: ffffe5088ad08e80 HandleCount: 296.
Image: mspaint.exe
0: kd> .process /p ffff850e48ee1080
Implicit process is now ffff850e`48ee1080
.cache forcedecodeuser done
0: kd> dps nt!KeServiceDescriptorTableShadow
fffff806`451da980 fffff806`45095570 nt!KiServiceTable # SSDT base address
fffff806`451da988 00000000`00000000
fffff806`451da990 00000000`000001cf
fffff806`451da998 fffff806`45095cb0 nt!KiArgumentTable
fffff806`451da9a0 fffff528`64b6b000 win32k!W32pServiceTable # SSDT Shadow base address
fffff806`451da9a8 00000000`00000000
fffff806`451da9b0 00000000`000004da
fffff806`451da9b8 fffff528`64b6c84c win32k!W32pArgumentTable
fffff806`451da9c0 00000000`00111311
fffff806`451da9c8 00000000`00000000
fffff806`451da9d0 ffffffff`80000010
fffff806`451da9d8 00000000`00000000
fffff806`451da9e0 00000000`00000000
fffff806`451da9e8 00000000`00000000
fffff806`451da9f0 00000000`00000000
fffff806`451da9f8 00000000`00000000
0: kd> dd /c 1 win32k!W32pServiceTable l10
fffff528`64b6b000 ff972820 # GDI Function offset
fffff528`64b6b004 ff972940
fffff528`64b6b008 ff972a60
fffff528`64b6b00c ff972b80
fffff528`64b6b010 ff972ca2
fffff528`64b6b014 ff972dc0
fffff528`64b6b018 ff972ee0
fffff528`64b6b01c ff973000
fffff528`64b6b020 ff973120
fffff528`64b6b024 ff973240
fffff528`64b6b028 ff973363
fffff528`64b6b02c ff973487
fffff528`64b6b030 ff9735a0
fffff528`64b6b034 ff9736c0
fffff528`64b6b038 ff9737e0
fffff528`64b6b03c ff973900
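To make the offset-to-address formula concrete, the sketch below applies it to the first entry of the dump above (table base fffff528`64b6b000, entry ff972820). The function names are hypothetical. Treating the low 4 bits as an argument count is my understanding of the x64 entry encoding, not something stated in these notes.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def ssdt_entry_to_address(table_base: int, entry: int) -> int:
    """Resolve a 32-bit x64 SSDT entry to an absolute routine address."""
    # Entries are signed 32-bit values, so reinterpret before shifting.
    if entry & 0x80000000:
        entry -= 1 << 32
    # Python's >> on a negative int is an arithmetic shift, as required here.
    return (table_base + (entry >> 4)) & MASK64


def ssdt_entry_low_bits(entry: int) -> int:
    """Low 4 bits of the entry (assumed to encode an argument count)."""
    return entry & 0xF


base = 0xFFFFF52864B6B000  # win32k!W32pServiceTable from the dump
first = 0xFF972820         # first dword in the table
print(hex(ssdt_entry_to_address(base, first)), ssdt_entry_low_bits(first))
```

The negative offset shows that the routine sits below the table base in memory, which is why the entries must be sign-extended rather than treated as unsigned.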
TEB
TEB (Thread Environment Block) stores frequently used thread-related data. It resides in user address space at a lower address than the PEB. Each thread in a process has its own TEB. On 32-bit Windows, all TEBs of a process are laid out in a stack-like manner in linear memory starting at 0x7FFDE000, with each TEB occupying a full 4KB, and the region grows toward lower addresses. In user mode, the current thread's TEB sits in its own 4KB segment addressed through the CPU's FS register, so the TEB begins at FS:[0]. In user-mode debugging, the WinDbg pseudo-register $thread gives the TEB address.
FS:[000] Points to the SEH chain pointer
FS:[004] Thread stack top
FS:[008] Thread stack bottom
FS:[00C] SubSystemTib
FS:[010] FiberData
FS:[014] ArbitraryUserPointer
FS:[018] Points to the TEB itself
FS:[020] Process PID
FS:[024] Thread ID
FS:[02C] Points to the thread local storage pointer
FS:[030] PEB structure address (process structure)
FS:[034] Last error number
// Thread Environment Block (TEB)
typedef struct _TEB
{
NT_TIB Tib; /* 00h */
PVOID EnvironmentPointer; /* 1Ch */
CLIENT_ID Cid; /* 20h */
PVOID ActiveRpcHandle; /* 28h */
PVOID ThreadLocalStoragePointer; /* 2Ch */
struct _PEB *ProcessEnvironmentBlock; /* 30h */
ULONG LastErrorValue; /* 34h */
ULONG CountOfOwnedCriticalSections; /* 38h */
PVOID CsrClientThread; /* 3Ch */
struct _W32THREAD* Win32ThreadInfo; /* 40h */
ULONG User32Reserved[0x1A]; /* 44h */
ULONG UserReserved[5]; /* ACh */
PVOID WOW32Reserved; /* C0h */
LCID CurrentLocale; /* C4h */
ULONG FpSoftwareStatusRegister; /* C8h */
PVOID SystemReserved1[0x36]; /* CCh */
LONG ExceptionCode; /* 1A4h */
struct _ACTIVATION_CONTEXT_STACK *ActivationContextStackPointer; /* 1A8h */
UCHAR SpareBytes1[0x28]; /* 1ACh */
GDI_TEB_BATCH GdiTebBatch; /* 1D4h */
CLIENT_ID RealClientId; /* 6B4h */
PVOID GdiCachedProcessHandle; /* 6BCh */
ULONG GdiClientPID; /* 6C0h */
ULONG GdiClientTID; /* 6C4h */
PVOID GdiThreadLocalInfo; /* 6C8h */
ULONG Win32ClientInfo[62]; /* 6CCh */
PVOID glDispatchTable[0xE9]; /* 7C4h */
ULONG glReserved1[0x1D]; /* B68h */
PVOID glReserved2; /* BDCh */
PVOID glSectionInfo; /* BE0h */
PVOID glSection; /* BE4h */
PVOID glTable; /* BE8h */
PVOID glCurrentRC; /* BECh */
PVOID glContext; /* BF0h */
NTSTATUS LastStatusValue; /* BF4h */
UNICODE_STRING StaticUnicodeString; /* BF8h */
WCHAR StaticUnicodeBuffer[0x105]; /* C00h */
PVOID DeallocationStack; /* E0Ch */
PVOID TlsSlots[0x40]; /* E10h */
LIST_ENTRY TlsLinks; /* F10h */
PVOID Vdm; /* F18h */
PVOID ReservedForNtRpc; /* F1Ch */
PVOID DbgSsReserved[0x2]; /* F20h */
ULONG HardErrorDisabled; /* F28h */
PVOID Instrumentation[14]; /* F2Ch */
PVOID SubProcessTag; /* F64h */
PVOID EtwTraceData; /* F68h */
PVOID WinSockData; /* F6Ch */
ULONG GdiBatchCount; /* F70h */
BOOLEAN InDbgPrint; /* F74h */
BOOLEAN FreeStackOnTermination; /* F75h */
BOOLEAN HasFiberData; /* F76h */
UCHAR IdealProcessor; /* F77h */
ULONG GuaranteedStackBytes; /* F78h */
PVOID ReservedForPerf; /* F7Ch */
PVOID ReservedForOle; /* F80h */
ULONG WaitingOnLoaderLock; /* F84h */
ULONG SparePointer1; /* F88h */
ULONG SoftPatchPtr1; /* F8Ch */
ULONG SoftPatchPtr2; /* F90h */
PVOID *TlsExpansionSlots; /* F94h */
ULONG ImpersionationLocale; /* F98h */
ULONG IsImpersonating; /* F9Ch */
PVOID NlsCache; /* FA0h */
PVOID pShimData; /* FA4h */
ULONG HeapVirualAffinity; /* FA8h */
PVOID CurrentTransactionHandle; /* FACh */
PTEB_ACTIVE_FRAME ActiveFrame; /* FB0h */
PVOID FlsData; /* FB4h */
UCHAR SafeThunkCall; /* FB8h */
UCHAR BooleanSpare[3]; /* FB9h */
} TEB, *PTEB;
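The offset comments at the start of the structure can be sanity-checked mechanically. Below is a sketch that models only the first fields of the 32-bit TEB with `ctypes`, using 4-byte integers in place of pointers so that it gives the same offsets on any host. The class names `NT_TIB32`, `CLIENT_ID32`, and `TEB32Head` are my own and cover only this prefix of the layout.

```python
import ctypes

u32 = ctypes.c_uint32  # stand-in for a 32-bit pointer or ULONG


class NT_TIB32(ctypes.Structure):
    _fields_ = [
        ("ExceptionList", u32),         # FS:[0x00] SEH chain
        ("StackBase", u32),             # FS:[0x04]
        ("StackLimit", u32),            # FS:[0x08]
        ("SubSystemTib", u32),          # FS:[0x0C]
        ("FiberData", u32),             # FS:[0x10]
        ("ArbitraryUserPointer", u32),  # FS:[0x14]
        ("Self", u32),                  # FS:[0x18] linear address of the TEB
    ]


class CLIENT_ID32(ctypes.Structure):
    _fields_ = [("UniqueProcess", u32), ("UniqueThread", u32)]


class TEB32Head(ctypes.Structure):
    _fields_ = [
        ("Tib", NT_TIB32),
        ("EnvironmentPointer", u32),
        ("Cid", CLIENT_ID32),
        ("ActiveRpcHandle", u32),
        ("ThreadLocalStoragePointer", u32),
        ("ProcessEnvironmentBlock", u32),
        ("LastErrorValue", u32),
    ]


for name, _ in TEB32Head._fields_:
    print(f"{name:28s} +0x{getattr(TEB32Head, name).offset:02X}")
```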
PEB
https://www.cnblogs.com/viwilla/p/5109966.html
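The content below is outdated and describes 32-bit Windows. On x64 Windows 10/11, the PEB pointer sits at offset 0x60 of the TEB, and the TEB is reached through the GS register instead of FS.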
PEB (Process Environment Block) stores process information, and each process has its own PEB. It resides in user address space. In Windows 2000, the PEB of every process sits at the fixed address 0x7FFDF000, which is within the user address space, so programs can access it directly.
The exact PEB address should be obtained from offset 0x1b0 of the system's EPROCESS structure (this offset is specific to the Windows version). However, since EPROCESS is located in the system address space, accessing this structure requires ring0 privileges.
Alternatively, the PEB location can be retrieved from the TEB structure at offset 0x30. The FS segment register points to the current TEB structure:
mov eax, dword ptr fs:[0x30]
Or via the TEB pointer:
mov eax, dword ptr fs:[0x18] ;eax = *TEB
mov eax, dword ptr [eax+0x30] ;eax = *PEB
In user-mode debugging, the WinDbg pseudo-register $proc gives the PEB address.
//Process Environment Block
typedef struct _PEB
{
UCHAR InheritedAddressSpace; // 00h
UCHAR ReadImageFileExecOptions; // 01h
UCHAR BeingDebugged; // 02h
UCHAR Spare; // 03h
PVOID Mutant; // 04h
PVOID ImageBaseAddress; // 08h
PPEB_LDR_DATA Ldr; // 0Ch
PRTL_USER_PROCESS_PARAMETERS ProcessParameters; // 10h
PVOID SubSystemData; // 14h
PVOID ProcessHeap; // 18h
PVOID FastPebLock; // 1Ch
PPEBLOCKROUTINE FastPebLockRoutine; // 20h
PPEBLOCKROUTINE FastPebUnlockRoutine; // 24h
ULONG EnvironmentUpdateCount; // 28h
PVOID* KernelCallbackTable; // 2Ch
PVOID EventLogSection; // 30h
PVOID EventLog; // 34h
PPEB_FREE_BLOCK FreeList; // 38h
ULONG TlsExpansionCounter; // 3Ch
PVOID TlsBitmap; // 40h
ULONG TlsBitmapBits[0x2]; // 44h
PVOID ReadOnlySharedMemoryBase; // 4Ch
PVOID ReadOnlySharedMemoryHeap; // 50h
PVOID* ReadOnlyStaticServerData; // 54h
PVOID AnsiCodePageData; // 58h
PVOID OemCodePageData; // 5Ch
PVOID UnicodeCaseTableData; // 60h
ULONG NumberOfProcessors; // 64h
ULONG NtGlobalFlag; // 68h
UCHAR Spare2[0x4]; // 6Ch
LARGE_INTEGER CriticalSectionTimeout; // 70h
ULONG HeapSegmentReserve; // 78h
ULONG HeapSegmentCommit; // 7Ch
ULONG HeapDeCommitTotalFreeThreshold; // 80h
ULONG HeapDeCommitFreeBlockThreshold; // 84h
ULONG NumberOfHeaps; // 88h
ULONG MaximumNumberOfHeaps; // 8Ch
PVOID** ProcessHeaps; // 90h
PVOID GdiSharedHandleTable; // 94h
PVOID ProcessStarterHelper; // 98h
PVOID GdiDCAttributeList; // 9Ch
PVOID LoaderLock; // A0h
ULONG OSMajorVersion; // A4h
ULONG OSMinorVersion; // A8h
ULONG OSBuildNumber; // ACh
ULONG OSPlatformId; // B0h
ULONG ImageSubSystem; // B4h
ULONG ImageSubSystemMajorVersion; // B8h
ULONG ImageSubSystemMinorVersion; // BCh
ULONG ImageProcessAffinityMask; // C0h
ULONG GdiHandleBuffer[0x22]; // C4h
PVOID ProcessWindowStation; // ???
} PEB, *PPEB;
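One practical use of this layout is reading the BeingDebugged byte at offset 0x02, which is what many anti-debugging checks look at. The sketch below parses the first fields of a 32-bit PEB out of a raw byte buffer. The buffer is synthetic (filled in by hand for the demo) and `PEB32Head` is a hypothetical name covering only the fields up to ProcessHeap.

```python
import ctypes


class PEB32Head(ctypes.LittleEndianStructure):
    _fields_ = [
        ("InheritedAddressSpace", ctypes.c_uint8),     # 00h
        ("ReadImageFileExecOptions", ctypes.c_uint8),  # 01h
        ("BeingDebugged", ctypes.c_uint8),             # 02h
        ("Spare", ctypes.c_uint8),                     # 03h
        ("Mutant", ctypes.c_uint32),                   # 04h
        ("ImageBaseAddress", ctypes.c_uint32),         # 08h
        ("Ldr", ctypes.c_uint32),                      # 0Ch
        ("ProcessParameters", ctypes.c_uint32),        # 10h
        ("SubSystemData", ctypes.c_uint32),            # 14h
        ("ProcessHeap", ctypes.c_uint32),              # 18h
    ]


# Synthetic memory dump: debugger attached, image loaded at 0x00400000.
raw = bytearray(ctypes.sizeof(PEB32Head))
raw[0x02] = 1
raw[0x08:0x0C] = (0x00400000).to_bytes(4, "little")

peb = PEB32Head.from_buffer_copy(bytes(raw))
print("BeingDebugged:", peb.BeingDebugged)
print("ImageBaseAddress:", hex(peb.ImageBaseAddress))
```

With a real dump, `raw` would be the bytes read from the PEB address obtained as described above.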
Kernel Debugging Basics – 1/10/2023
See link: http://blog.junyu33.me/2023/01/10/winkernel_environ.html