Capturing process crash information on Linux (core_pattern)

2019-08-26

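All articles on this blog are licensed as free to repost, non-commercial, no derivatives, attribution required. Please cite the source when reposting. Thank you.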


Ubuntu core dump (core_pattern) control

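Earlier articles covered how to capture the crash scene after a kernel crash. The other case is a process crash.

On Linux, when a process crashes, the kernel captures the crash and writes the process's core dump to a file. The file is named core by default, but the name can be changed through configuration, for example by editing the contents of /proc/sys/kernel/core_pattern.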

core_pattern

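Since Linux kernel 2.6.19, core_pattern can contain not only a file name for saving the core dump, but also a pipe character followed by a user-space program or script. If the first character of core_pattern is the pipe character (|), then when a process crashes the kernel runs the program or script after the pipe with root privileges and passes the core dump to it on standard input. This also provides a way to hide a system backdoor: a backdoor script can be placed after the pipe character so that it spawns a reverse shell under specific conditions.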
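To make the two forms concrete, here is a minimal Python sketch that tells a piped handler apart from a plain file-name template. The function name classify_core_pattern is hypothetical and only illustrates the rule described above; it is not part of any kernel or library API.

```python
def classify_core_pattern(pattern):
    """Classify a core_pattern string as a piped handler or a file template.

    Returns ("pipe", argv_template) when the first character is "|",
    otherwise ("file", template).
    """
    pattern = pattern.rstrip("\n")
    if pattern.startswith("|"):
        # Everything after the pipe is a command line; the first word
        # is the handler's path, the rest are its argument templates.
        return "pipe", pattern[1:].split()
    return "file", pattern


print(classify_core_pattern("|/usr/local/bin/ezs3-coredump %e %s %p"))
print(classify_core_pattern("core"))
```

On a real system the input would typically come from reading /proc/sys/kernel/core_pattern.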

The configuration file is located at:

 less /etc/sysctl.d/30-ezs3.conf 
 
kernel.panic = 10
kernel.printk = 3 4 1 3
vm.swappiness = 5
vm.max_map_count = 1048576
kernel.core_pattern = |/usr/local/bin/ezs3-coredump %e %s %p
kernel.pid_max = 1048576
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.default.arp_filter = 1
net.ipv4.conf.all.arp_announce = 1
net.ipv4.conf.default.arp_announce = 1
net.ipv4.conf.all.arp_ignore = 2
net.ipv4.conf.default.arp_ignore = 2
net.core.rmem_max = 104857600
net.core.wmem_max = 104857600
net.core.optmem_max = 104857600
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_rmem = 65536 20971520 104857600
net.ipv4.tcp_wmem = 65536 20971520 104857600
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_congestion_control = htcp

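Here, the parameter kernel.core_pattern = |/usr/local/bin/ezs3-coredump %e %s %p defines the script that runs on a core dump. The kernel replaces %e with the executable name, %s with the number of the signal that caused the dump, and %p with the PID of the crashed process. The earlier version of this script was not smart enough; the current script is the result of an upgrade by bean.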
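The substitution can be sketched in Python as follows. This is a simplified model, not the kernel's implementation: it handles only %e, %s, %p and the literal %%, and the function name expand_handler_args is hypothetical.

```python
import re


def expand_handler_args(argv_template, exe_name, signal_number, pid):
    """Expand %e, %s, %p (and %%) in each word of a handler command line."""
    values = {"e": exe_name, "s": str(signal_number), "p": str(pid), "%": "%"}

    def substitute(match):
        return values[match.group(1)]

    # Words are split first and expanded afterwards, so a value that
    # contains spaces still ends up in a single argument.
    return [re.sub(r"%([esp%])", substitute, word) for word in argv_template]


template = "/usr/local/bin/ezs3-coredump %e %s %p".split()
print(expand_handler_args(template, "sleep", 6, 12345))
```

With these sample values the handler would see "sleep", "6" and "12345" as its first, second and third arguments.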

Check whether the core_pattern setting has taken effect:

 cat /proc/sys/kernel/core_pattern 

Test whether core dumps actually work:

sleep 200 &
Then kill the process above with signal 6 (SIGABRT):
kill -6 pid
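The same test can be scripted. The sketch below, assuming Python 3 and a sleep binary on PATH, starts a child process, sends it SIGABRT, and reports how it ended. Whether a core dump is actually produced still depends on core_pattern and the resource limits of the system it runs on.

```python
import signal
import subprocess


def crash_child_with_sigabrt():
    """Start `sleep 200`, abort it with SIGABRT, and return its exit status."""
    child = subprocess.Popen(["sleep", "200"])
    child.send_signal(signal.SIGABRT)
    # For a child killed by a signal, subprocess reports the negative
    # signal number as the return code.
    return child.wait()


returncode = crash_child_with_sigabrt()
print(returncode)
```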

Under normal conditions, the following log file is produced:
/var/log/ezcloudstor# tailf  ezs3-coredump.log

If the log file is not produced, run the handler manually:
root@node1:/var/log/ezcloudstor# /usr/local/bin/ezs3-coredump 
Check the output for obvious errors. In the 5.5 environment, this showed that the psutil package was missing.


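Notes:

When customizing the core dump file path, make sure the permissions on that path are set correctly.

For a setuid program run by an ordinary user, suid_dumpable must be changed from its default of 0 before a core dump is generated; 2 ("suidsafe") is the value used here.

To make sure every user can generate core dumps in the specified path, configure the following:

  1. Create a core dump directory with mode 777.
  2. Set /proc/sys/kernel/core_pattern to a file name template inside that directory.
  3. Set /proc/sys/fs/suid_dumpable to 2 so that setuid programs can generate dumps.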
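The three steps above can be sketched in Python as below. The function name configure_core_dumps, the proc_root parameter (there only so the sketch can be tried against a scratch directory), and the core.%e.%p file name are assumptions for illustration. Writing to the real /proc requires root, and values written this way do not survive a reboot unless they are also placed in a sysctl configuration file.

```python
import os


def configure_core_dumps(core_dir, proc_root="/proc"):
    """Apply the three steps: directory, core_pattern, suid_dumpable."""
    # Step 1: world-writable dump directory. chmod is explicit because
    # the mode passed to makedirs would be filtered by the umask.
    os.makedirs(core_dir, exist_ok=True)
    os.chmod(core_dir, 0o777)

    # Step 2: point core_pattern at a file name template in that directory.
    # The "core.%e.%p" name is an illustrative choice.
    pattern = os.path.join(core_dir, "core.%e.%p")
    with open(os.path.join(proc_root, "sys/kernel/core_pattern"), "w") as f:
        f.write(pattern + "\n")

    # Step 3: allow setuid programs to dump core.
    with open(os.path.join(proc_root, "sys/fs/suid_dumpable"), "w") as f:
        f.write("2\n")
    return pattern


# Example (as root): configure_core_dumps("/var/log/cores")
```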

Core dump script (bash)

AsiaInfo (亚信) 5.5 version, cluster c

root@Storage-c2:~# cat /usr/local/bin/coredump_gen
#!/bin/bash


# echo "|/usr/local/bin/coredump_gen %e %p %s" > /proc/sys/kernel/core_pattern

MAX_CORE_FILES=3
MAX_CUR_CORE_FILES=$((MAX_CORE_FILES-1))

CORE_PATH="/var/log"
DEBUG_LOG=/var/log/debug_core.log

rotate_same_core()
{
    local dest_prefix="core\.${1}\."

    local core_num=`ls $CORE_PATH --sort=time |grep ${dest_prefix} | wc -l`
    
    if [ $core_num -gt $MAX_CUR_CORE_FILES ]; then
        cores_delete=`ls $CORE_PATH  --sort=time |grep ${dest_prefix} |tail -n $(($core_num-MAX_CUR_CORE_FILES))`
        for core in ${cores_delete}
        do
            echo `date` " DELETE   $CORE_PATH/${core}" >> $DEBUG_LOG
            rm -f $CORE_PATH/${core}
        done 
    fi
}

# core_pattern arguments, in order: %e (program name) %p (pid) %s (signal)
prog_name=$1
pid_num=$2
sig_num=$3

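# Early exit: as deployed, the handler stops here, so nothing below runs
# and no core file is written.
exit 0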

dest_filename="$CORE_PATH/core."${prog_name}.${pid_num}.sig${sig_num}

#rotate_same_core ${prog_name}
#cat <&0 >$dest_filename
echo `date` " GENERATE $dest_filename" >> $DEBUG_LOG

root@Storage-c2:~# 
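For comparison, the rotation done by rotate_same_core can be sketched in Python. This is a rough equivalent under the same naming scheme (core.<program>.*), not a drop-in replacement; the function name rotate_cores is hypothetical. Like the bash version, it keeps max_core_files - 1 existing files so that the dump about to be written brings the total to max_core_files.

```python
import glob
import os


def rotate_cores(core_dir, prog_name, max_core_files=3):
    """Delete the oldest cores of one program, leaving room for a new one."""
    pattern = os.path.join(core_dir, "core.{}.*".format(glob.escape(prog_name)))
    # Newest first, so everything past the kept slice is the oldest.
    cores = sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
    stale = cores[max_core_files - 1:]
    for path in stale:
        os.remove(path)
    return stale
```

Using glob instead of parsing ls output avoids problems with unusual file names.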

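The 5.5 version cannot use the 7.0 core dump script directly yet. It fails as shown below. The relevant error is the ImportError for psutil. The "command not found" lines appear only because the sysctl file was loaded with source, which makes the shell run every setting as a command; sysctl -p /etc/sysctl.d/30-ezs3.conf is the usual way to apply such a file.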

root@Storage-b6:~# source /etc/sysctl.d/30-ezs3.conf 
kernel.panic: command not found
kernel.printk: command not found
vm.swappiness: command not found
Traceback (most recent call last):
  File "/usr/local/bin/ezs3-coredump", line 7, in <module>
    import psutil
ImportError: No module named psutil
kernel.core_pattern: command not found
net.ipv4.conf.all.rp_filter: command not found
net.ipv4.conf.default.rp_filter: command not found
net.core.rmem_max: command not found
root@Storage-b6:~# ll
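A quick way to catch this kind of problem before installing the handler is to check that its imports resolve. The sketch below assumes Python 3 (the handler itself targets /usr/bin/python) and only checks top-level module names; the function name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# psutil is the third-party dependency that was missing on 5.5;
# tarfile ships with the standard library.
print(missing_modules(["psutil", "tarfile"]))
```

An empty list means both modules can be found by the interpreter that runs the check, which should be the same interpreter the handler uses.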

Core dump script (latest 7.0 version, Python)

root@converger-124:~# cat /usr/local/bin/ezs3-coredump
#! /usr/bin/python
import os
import re
import sys
import time
import glob
import psutil
import tarfile
import logging
from ezs3.log import EZLog
from ezs3.command import do_cmd

EZLog.init_handler(logging.INFO, '/var/log/ezcloudstor/ezs3-coredump.log')
logger = EZLog.get_logger('ezs3-coredump')

MB = 1024 * 1024
# if os is created in small partition, skip core dump  
TINY_OS_PARTITION_THRESHOLD = 32*1024*1024*1024

# if core is generated by same program and same signal,
# in SAME_CORE_RECENT_THRESHOLD second ,we will skip the second core
SAME_CORE_RECENT_THRESHOLD = 10*60
MAX_DUPLICATED_CORE = 3

# preserve 30% os space at least 
OS_SPACE_TO_RESERVED = 30


def purge_extra_cores(core_path, name, sig):
    logger.info('purging extra cores start')

    if not re.search(r'_[0-9]+$', name):
        prefix = 'ezcore.{}.sig{}.*'.format(name, sig)
    else:
        name = '_'.join(name.split('_')[:-1]) + '_[0-9]*'
        prefix = 'ezcore.{}.sig{}.*'.format(name, sig)
    old_cores = glob.glob(os.path.join(core_path, prefix))
    if len(old_cores) >= MAX_DUPLICATED_CORE:
        old_cores.sort(key=os.path.getmtime, reverse=True)

        try:
            for core in old_cores[MAX_DUPLICATED_CORE-1:]:
                logger.info("purge {}".format(core))
                os.unlink(core)
        except Exception as e:
            logger.exception("Exception happened when purge extra core {} ({})"
                         .format(core,str(e)))

    logger.info('purging extra cores done')

def dump_core(core_path, name, sig, pid):
    core_file = os.path.join(core_path, 'ezcore.{}.sig{}.{}'.format(name, sig, pid))
    try:
        logger.info('start dumping %s', core_file)
        loop = 0 
        with open(core_file, 'wb+') as f:
            while True:
                if loop % 100 == 0:
                    du = psutil.disk_usage(core_file)
                    if du.free * 100.0 / du.total < OS_SPACE_TO_RESERVED:
                        raise RuntimeError('not enough disk space, skip core dumping. du={} to_preserve={}'
                                           .format(du, OS_SPACE_TO_RESERVED))

                loop += 1
                data = sys.stdin.read(MB)
                if data:
                    f.write(data)
                else:
                    logger.info('finish dumping %s', core_file)
                    os.chdir(core_path)
                    old_file_name = 'ezcore.{}.sig{}.{}'.format(name, sig, pid)
                    new_file_name = 'ezcore.{}.sig{}.{}.tar'.format(name, sig, pid)
                    do_cmd("echo 'tar -zcf %s %s && rm %s'|at now" %(new_file_name, old_file_name, old_file_name))
                    return
    except Exception:
        logger.exception('unable to dump %s', core_file)
        if os.path.isfile(core_file):
            os.remove(core_file)

def same_core_exists_recently(core_path, name, sig):
    now = time.time()
    if not re.search(r'_[0-9]+$', name):
        prefix = 'ezcore.{}.sig{}.*'.format(name, sig)
    else:
        name = '_'.join(name.split('_')[:-1]) + '_[0-9]*'
        prefix = 'ezcore.{}.sig{}.*'.format(name, sig)
    old_cores = glob.glob(os.path.join(core_path, prefix)) 
    for old_core in old_cores:
        interval = now -os.stat(old_core).st_mtime
        if interval < SAME_CORE_RECENT_THRESHOLD:
            logger.info("same core {} have exists recently ({} < {})"
                        .format(old_core, interval, SAME_CORE_RECENT_THRESHOLD))
            return True

    return False

def os_partition_too_small():
    du = psutil.disk_usage('/')
    if du.total < TINY_OS_PARTITION_THRESHOLD:
        logger.info("will not store the core file , because of the tiny os partition")
        return True

    return False


def main(name, sig, pid):
    logger.info("program {} signal {} pid {}".format(name,sig,pid))

    if os_partition_too_small():
        return

    core_path = "/var/log/cores"
    if not os.path.isdir(core_path):
        os.makedirs(core_path)

    if same_core_exists_recently(core_path, name, sig):
        return

    purge_extra_cores(core_path, name, sig)
    dump_core(core_path, name, sig, pid)


if __name__ == '__main__':
    main(sys.argv[1], sys.argv[2], sys.argv[3])
root@converger-124:~# 
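The core of dump_core is a chunked copy from standard input with a periodic free-space check. Below is a minimal Python 3 sketch of that idea. It is an assumption-laden rewrite, not the script above: it uses shutil.disk_usage from the standard library instead of psutil, takes the input stream as a parameter, and the name copy_core is hypothetical. Under Python 3 the stream passed in would need to be sys.stdin.buffer, because the core dump is binary data.

```python
import os
import shutil

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time


def copy_core(stream, core_file, min_free_percent=30, check_every=100):
    """Copy a binary stream to core_file; return True on success.

    The partial file is removed if free space on the target file system
    drops below min_free_percent or an I/O error occurs.
    """
    directory = os.path.dirname(os.path.abspath(core_file))
    try:
        with open(core_file, "wb") as out:
            chunk_index = 0
            while True:
                # Re-check free space every `check_every` chunks.
                if chunk_index % check_every == 0:
                    usage = shutil.disk_usage(directory)
                    if usage.free * 100.0 / usage.total < min_free_percent:
                        raise RuntimeError("not enough free disk space")
                chunk_index += 1
                data = stream.read(CHUNK_SIZE)
                if not data:
                    return True
                out.write(data)
    except (OSError, RuntimeError):
        if os.path.isfile(core_file):
            os.remove(core_file)
        return False


# Example inside a handler: copy_core(sys.stdin.buffer, "/var/log/cores/ezcore.sleep.sig6.12345")
```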

Reference links

https://www.jianshu.com/p/20d7326cc07a

The following link describes a backdoor built on core_pattern, which is quite interesting:

https://xz.aliyun.com/t/1098
