最近南邮登录不上去了,原因不清楚。
问题
Hello,RE!
先通过看汇编复习一下以前的知识,推荐英特尔的64ia32指令集手册
.text:00401500 push ebp
.text:00401501 mov ebp, esp
.text:00401503 and esp, 0FFFFFFF0h
.text:00401506 sub esp, 90h
.text:0040150C call ___main
.text:00401511 mov dword ptr [esp], offset fmt ; fmt
.text:00401518 call __Z6printfPKcz ; printf(char const*,...)
.text:0040151D mov dword ptr [esp+75h], 67616C66h
.text:00401525 mov dword ptr [esp+79h], 6C65577Bh
.text:0040152D mov dword ptr [esp+7Dh], 656D6F63h
.text:00401535 mov dword ptr [esp+81h], 5F6F545Fh
.text:00401540 mov dword ptr [esp+85h], 575F4552h
.text:0040154B mov dword ptr [esp+89h], 646C726Fh
.text:00401556 mov word ptr [esp+8Dh], 7D21h
.text:00401560 mov byte ptr [esp+8Fh], 0
.text:00401568 jmp short loc_401592
main 函数先对齐栈顶,分配栈空间,将一个字符串的指针先“入栈”(准确的来说是直接对栈上的内存赋值),然后调用标准c的printf输出一个字符串。完事了将字符串的flag用4字节int的形式“入栈”;另外,windows上的字节序是小端,先存储LSB(Least significant byte)。然后跳转。
.text:00401592 lea eax, [esp+11h]
.text:00401596 mov [esp+4], eax
.text:0040159A mov dword ptr [esp], offset format ; "%s"
.text:004015A1 call __Z5scanfPKcz ; scanf(char const*,...)
.text:004015A6 cmp eax, 0FFFFFFFFh
.text:004015A9 setnz al
.text:004015AC test al, al
.text:004015AE jnz short loc_40156A
将栈上的一个地址“入栈”,根据后文可以看出来这是一个指向局部变量char数组的指针;“压入”格式字符串,调用c的scanf,表示要读入一段字符串。跟-1(0xFFFFFFFF
)比较,留意scanf的调用是否出错。比较的汇编看着比较捉急。
.text:004015B0 loc_4015B0: ; CODE XREF: _main:loc_401590↑j
.text:004015B0 mov dword ptr [esp], offset aFlag_0 ; "flag"
.text:004015B7 call __Z6printfPKcz ; printf(char const*,...)
.text:004015BC mov dword ptr [esp], offset byte_410030 ; fmt
.text:004015C3 call __Z6printfPKcz ; printf(char const*,...)
.text:004015C8 mov dword ptr [esp], offset byte_410064 ; fmt
.text:004015CF call __Z6printfPKcz ; printf(char const*,...)
.text:004015D4 mov dword ptr [esp], offset byte_41008F ; fmt
.text:004015DB call __Z6printfPKcz ; printf(char const*,...)
.text:004015E0 call _getchar
.text:004015E5 call _getchar
.text:004015EA mov eax, 0
.text:004015EF leave
.text:004015F0 retn
ReadAsm2
int main(int argc, char const *argv[])
{
char input[] = {0x0, 0x67, 0x6e, 0x62, 0x63, 0x7e, 0x74, 0x62, 0x69, 0x6d,
0x55, 0x6a, 0x7f, 0x60, 0x51, 0x66, 0x63, 0x4e, 0x66, 0x7b,
0x71, 0x4a, 0x74, 0x76, 0x6b, 0x70, 0x79, 0x66 , 0x1c};
func(input, 28);
printf("%s\n",input+1);
return 0;
}
结合上边的C源码,下边的汇编的作用是对字符数组里面的每个字节和对应的索引进行异或然后写入原来的位置
00000000004004e6 <func>:
4004e6: 55 push rbp
4004e7: 48 89 e5 mov rbp,rsp
4004ea: 48 89 7d e8 mov QWORD PTR [rbp-0x18],rdi
4004ee: 89 75 e4 mov DWORD PTR [rbp-0x1c],esi
4004f1: c7 45 fc 01 00 00 00 mov DWORD PTR [rbp-0x4],0x1
4004f8: eb 28 jmp 400522 <func+0x3c>
4004fa: 8b 45 fc mov eax,DWORD PTR [rbp-0x4]
4004fd: 48 63 d0 movsxd rdx,eax
400500: 48 8b 45 e8 mov rax,QWORD PTR [rbp-0x18]
400504: 48 01 d0 add rax,rdx
400507: 8b 55 fc mov edx,DWORD PTR [rbp-0x4]
40050a: 48 63 ca movsxd rcx,edx
40050d: 48 8b 55 e8 mov rdx,QWORD PTR [rbp-0x18]
400511: 48 01 ca add rdx,rcx
400514: 0f b6 0a movzx ecx,BYTE PTR [rdx]
400517: 8b 55 fc mov edx,DWORD PTR [rbp-0x4]
40051a: 31 ca xor edx,ecx
40051c: 88 10 mov BYTE PTR [rax],dl
40051e: 83 45 fc 01 add DWORD PTR [rbp-0x4],0x1
400522: 8b 45 fc mov eax,DWORD PTR [rbp-0x4]
400525: 3b 45 e4 cmp eax,DWORD PTR [rbp-0x1c]
400528: 7e d0 jle 4004fa <func+0x14>
40052a: 90 nop
40052b: 5d pop rbp
40052c: c3 ret
所以答案是
data = [
0x0, 0x67, 0x6e, 0x62, 0x63, 0x7e, 0x74, 0x62, 0x69, 0x6d,
0x55, 0x6a, 0x7f, 0x60, 0x51, 0x66, 0x63, 0x4e, 0x66, 0x7b,
0x71, 0x4a, 0x74, 0x76, 0x6b, 0x70, 0x79, 0x66, 0x1c
]
new = []
for i, b in enumerate(data):
new.append(chr((i^b)&(0xFF)))
print(''.join(new[1:]))
Py交易
先安装 uncompyle6 py包。如果pip安装之后输入uncompyle6 -h
提示找不到命令,可以先把py包卸载了,然后从源代码安装。
现在下来之后用uncompyle6 Py.pyc > Py.py
反编译
# Python27
# uncompyle6 version 3.6.4
# Python bytecode 2.7 (62211)
# Decompiled from: Python 2.7.17 (default, Nov 7 2019, 10:07:09)
# [GCC 7.4.0]
# Embedded file name: 1.py
# Compiled at: 2017-06-03 10:20:43
import base64
def encode(message):
s = ''
for i in message:
x = ord(i) ^ 32
x = x + 16
s += chr(x)
return base64.b64encode(s)
correct = 'XlNkVmtUI1MgXWBZXCFeKY+AaXNt'
flag = ''
print 'Input flag:'
flag = raw_input()
if encode(flag) == correct:
print 'correct'
else:
print 'wrong'
# okay decompiling Py.pyc
所以答案是:
# Python37
''.join([chr(((c-16)^32)&0xFF) for c in base64.b64decode('XlNkVmtUI1MgXWBZXCFeKY+AaXNt')])
maze
程序下载下来,用ida分析汇编,可以得出以下主要的程序逻辑(用python表示):
# encoding=utf8
# python37
R = " ******* * **** * **** * *** *# *** *** *** *********"
def func(s):
b, v24, v28, r = 5, 0, 0, 0
while b < len(s):
if s[b] == chr(0x4f): # O
r = v24 == 0
v24 -= 1
elif s[b] == chr(0x6f): # o
v24 += 1
r = v24 < 8
elif s[b] == chr(0x2e): # .
r = v28 == 0
v28 -= 1
elif s[b] == chr(0x30): # 0
v28 += 1
r = v28 < 8
c = R[v24 + v28 * 8]
if c != chr(0x20) and c != chr(0x23):
break
b += 1
if r and R[v24+v28*8] == chr(0x23):
print("Congratulation!")
else:
print("Wrong flag!")
注意到形如R[v24 + v28 * 8]
的数组访问方式,容易让人想起和图遍历有关的算法课上机作业 :laughing: 。正如题目的名字所暗示的那样,R数组其实是一个迷宫;输入的flag其实一条路径,程序根据这条路径去遍历迷宫,如果到达了#
就说明这条路径是答案了。
可以用DFS找到这条路径:
# encoding=utf8
# python37
R = " ******* * **** * **** * *** *# *** *** *** *********"
direction = [[-1, 0, 'O'], [1, 0, 'o'], [0, -1, '.'], [0, 1, '0']]
seen = set()
stack = []
found = []
def dfs(x, y):
if 0 <= x+y*8 < len(R):
if R[x+y*8] == '#':
print(''.join(stack), len(stack)) # 输出答案
found.append(True)
if R[x+y*8] == '*' or found or (x, y) in seen:
return
seen.add((x, y))
for a, b, c in direction:
stack.append(c)
dfs(x+a, y+b)
stack.pop()
dfs(0, 0, '')
小插曲:一开始看汇编的时候看到程序有个判断检查flag长度的逻辑:cmp rbx, 18h
,下意识的认为输入的长度是18;dfs解出答案后看到路径的长度是18还有点怀疑。但其实那是16进制的18h
,也就是24=18+6,6是nctf{
和}
的长度。
480小时精通C++
IDA打开找main函数,发现main函数的作用只是输出一段加密之后的字符串而已;要将关注点放到StringEncryptFunction
函数。这个函数又调用了480个类似的函数;幸运的是这480个函数几乎完全相同,区别仅仅在于各自使用了一个不同的字符串(001001001
~ 480480480
)。整个加密逻辑可以用 python 转写成以下代码
def bar(array):
for i in range(1, 481):
a = '{:03}'.format(i) * 3
for j in range(len(array)):
array[j] = j^array[j]^ord(a[j % len(a)])
因为用到了异或,所以加密算法同时也是解密算法,而且顺序没关系。答案如下:
import re
a = list(int(f'0x{b}', base=16) for b in re.findall(r'..', '62646163734e346a6f60715f673c6e5b4561777c337657635b7831717b5f74447577297d'))
bar(a)
print(''.join(map(chr, a)))
之前没了解过x64 C++类方法的调用约定,这次算是学习了。this
指针被当成第一个参数,x64 gcc 会把this
指针通过rdi
传递,也就是相当于函数的第一个参数。
Single
IDA找到main函数。这个程序会判断输入是否满足一定条件,满足条件的就是答案了。用python转写主要代码逻辑如下:
# encoding=utf8
# python37
def f70e(s):
"""限制输入s为数字0~9,而且长度小于81"""
if len(s) > 0x51:
exit(-1)
for v14 in range(len(s)):
if s[v14] <= 0x2f or s[v14] > 0x39 :
exit(-1)
def f78b(s, v30):
"""对于输入s不为'0'的部分,如果v30同样位置的数据不为0程序就会失败退出;
否则转换为整数并存到v30同样的位置;"""
for v14 in range(len(s)):
if s[v14] != 0x30:
if s[v14] == 0x00:
exit(-1)
if v30[v14] == 0:
v30[v14] = s[v14] - 0x30
else:
exit(-1)
def f833(v38):
for v28 in range(9):
s = [0] * 0xa
for v24 in range(9):
d = (v28 << 3) + v28 + v24
s[v38[d]] += 1
for v24 in range(1, 9+1):
if s[v24] != 0x1:
exit(-1)
f8fe = f833
def f9c9(v48):
v28, v24 = 3, 3
for v34 in range(9):
s = b'\x00' * 0xa
for v30 in range(v28-3, v28):
for v2c in range(v24-3, v24):
d = (v30 << 3) + v30 + v2c
s[v48[d]] += 1
for v30 in range(1, 9+1):
if s[v30] != 1:
exit(-1)
if v24 == 9:
v24 = 3
v28 += 3
else:
v24 += 3
def fad4(v8):
f833(v8)
f8fe(v8)
f9c9(v8)
def main():
s = [int(d) for d in input().strip()]
c = [int(d, base=16) for d in 'This is a sudoku']
f70e(s)
f78b(s, c)
fad4(c)
函数f78b将输入转换到一个数组全局变量c,其它函数按照一定的规则校验这个数组c。关注点主要在 f833, f8fe, f9c9。这三个函数计算c索引的过程有点迷惑,收集这个三个函数计算出来的索引观察之后发现函数的逻辑如下:
- f833依次检查1-9是否在
c[i*9:(i+1)*9]
出现并仅出现一次; - f9c9依次检查1-9是否在
c[i*3:i*3+3]
,c[i*3+9:i*3+12]
,c[i*3+18:i*3+21]
出现并仅出现一次
到最后发现答案和数独有关;全局变量数组c是一个数独题目:f833检查行,f9c9检查宫。需要解决下面这个数独:
00 03 00 | 06 00 00 | 00 00 00
06 00 00 | 00 03 02 | 04 09 00
00 09 00 | 01 00 07 | 00 06 00
------------------------------
07 04 06 | 00 00 00 | 00 00 00
00 01 08 | 00 00 00 | 06 03 00
00 00 00 | 00 00 00 | 01 04 07
------------------------------
00 08 00 | 09 00 04 | 00 07 00
00 07 04 | 02 01 00 | 00 00 06
00 00 00 | 00 00 03 | 00 01 00
不过让人疑惑的是,除了4个指令编码有差异以外,函数f8fe和f833 完 全 一 致;按照这个题目是个数独的猜想,f8fe应该检查列才对。
有一个地方需要注意一下,因为f78b的关系,需要数独答案中已经提供的数字改成0才能通过。
echo -n 401095728057800001802040305000321589500479002923586000105060203300008950269750804 | ./single
bt
IDA找到main函数。程序读取一个字符串byte_601100,先后用两个函数sub_400666和sub_4006BE生成字符串分别与s2和a7d8dcdcaed592e比较,如果一致,那么字符串byte_601100就是答案。两个函数用python转写如下:
def sub_400666(n):
if n>0x3f:
return
Global.s1[Global.dw1064] = Global.bt1100[n]
Global.dw1064 += 1
sub_400666(2*n+1)
sub_400666(2*n+2)
def sub_4006BE(n):
if n>0x3f:
return
sub_4006BE(2*n+1)
Global.s1[Global.dw1064] = Global.bt1100[n]
Global.dw1064 += 1
sub_4006BE(2*n+2)
不难看出来这其实是二叉树的两种遍历方式:前序遍历和中序遍历。题目名字为’bt’,应该就是 backtrace 的缩写。收集前序遍历的索引、对结果字符串s2进行排序就可以得到答案:
def func(length):
indices = []
def bar(n):
if n>length:
return
indices.append(n)
bar(2*n+1)
bar(2*n+2)
bar(0)
return indices
indices = func(0x3f)
s2 = 'bcec8d7dcda25d91ed3e0b720cbb6cf202b09fedbc3e017774273ef5d5581794'
print('nctf{%s}' % ''.join([c for _, c in sorted(enumerate(s2), key=lambda t: indices[t[0]])]))
不过有一点存在疑惑:只用前序遍历的结果s2就可以得到答案了,为什么题目还要再使用中序遍历再检查一次?
Notes
- x86 uses a full descending stack
push
firstly decrementesp
by 4, and then stores its operand
- x86 instructions
- Intel syntax: dst<-src
cmp s1, s2
Subtract s2 from s1test s1, s2
Bitwise ANDmovsx
/movzx
Sign-extended/zero-extendedcdqe
Sign-extends a double word in eax to quad wordidiv
Signed divide AX/DX:AX/EDX:EAX/RDX:RAX, with result stored in: AL/AX/EAX/RAX <– quotient, AH/DX/EDX/RDX <– remainder- AL← Quotient, AH ←Remainder.
- System V AMD64 ABI
- The first 6 integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8 and R9
- gcc passes
this
pointer as the first argument
fs
andgs
: base-pointer addressesfs:0x28
on linux stores a sentinel stack-guard value