#82 SSH 打洞

2015-07-15

网络代理是一种常见的网络技术,其中 SSH 由于其普遍性,用来代理是最顺手不过的。

助记:

  • DLR 独立日, 分别对应三种最常见的代理模式 Dynamic, Local, Remote
  • ProxyPort:DestHost:DestPort User@ProxyHost

比如有三台机器 A, B, C, 其中 A 是本地机器,B 是代理机器,C 是目标机器(注意:A,B,C 可以是同一台机器,或者其中任意两个是同一台机器):

  • 在本地机器上建立隧道,代理机器做跳板,将本地端口绑定到目的机器端口:
// A:80 => B => C:80
ssh -L 80:C:80 root@B
  • 将代理机器的指定端口绑定到目的机器:
// B:80 => C:80
ssh -R 80:C:80 root@B

上面两种都是端口映射模式,应该能够满足所有端口对端口的代理。
最后的 -D 就是动态代理,类似 NAT,将发往指定端口的数据转发到出去。

ssh -D 80 root@B

这个时候,如果设置系统网络代理为 socks5://B:80,然后本地的所有网络请求都会经过 B 中转,比如我访问 C:80,C 服务器接受到的是来自 B 的访问。
PS: 这个 socks5 要是展开带另起一篇,这里就只需要知道,这种代理的方式有一个专门的名字叫 socks 就行了。

如果在 Windows 上,PuTTY 就是一个很好的 SSH 客户端,也可以从来建立代理,这就不展开讲了。
下次我研究研究 PuTTY,另外再发一篇文章。

Update @ 2021-10-01:

  1. 有一些 Linux 命令直接支持 socks5 代理,比如 wget, curl。
  2. 编程时,发起网络请求也可以设置 socks5 代理。

Update @ 2016-09-16:

Update @ 2017-06-09:

#81 MySQL 编码字符集

2015-05-11

各种类型的编码

mysql> SHOW VARIABLES LIKE 'character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | gbk                        |
| character_set_connection | gbk                        |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | gbk                        |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8mb3                    |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

5.7 及更早版本默认字符集和 Collation 是 latin1 和 latin1_swedish_ci
8.0 之后改成 utf8mb4 和 utf8mb4_0900_ai_ci

上面就有 6 种 全局的编码:

  • client 客户端编码
  • connection 连接编码
  • database 数据库编码,创建表时的默认编码
    load data, 以及创建
  • filesystem 文件系统编码
  • results 结果编码
  • server 服务端编码,创建数据库时的默认编码
  • system 系统编码

还有两种局部的编码:表的编码和列(字段)的编码。
列主要是 char, varchar, text 类型。

字符集

# 都可以 LIKE 搜索
SHOW CHARACTER SET;
SHOW CHARSET;
SHOW CHAR SET;
Charset Description Default collation Maxlen
armscii8 ARMSCII-8 Armenian armscii8_general_ci 1
ascii US ASCII ascii_general_ci 1
big5 Big5 Traditional Chinese big5_chinese_ci 2
binary Binary pseudo charset binary 1
cp1250 Windows Central European cp1250_general_ci 1
cp1251 Windows Cyrillic cp1251_general_ci 1
cp1256 Windows Arabic cp1256_general_ci 1
cp1257 Windows Baltic cp1257_general_ci 1
cp850 DOS West European cp850_general_ci 1
cp852 DOS Central European cp852_general_ci 1
cp866 DOS Russian cp866_general_ci 1
cp932 SJIS for Windows Japanese cp932_japanese_ci 2
dec8 DEC West European dec8_swedish_ci 1
eucjpms UJIS for Windows Japanese eucjpms_japanese_ci 3
euckr EUC-KR Korean euckr_korean_ci 2
gb18030 China National Standard GB18030 gb18030_chinese_ci 4
gb2312 GB2312 Simplified Chinese gb2312_chinese_ci 2
gbk GBK Simplified Chinese gbk_chinese_ci 2
geostd8 GEOSTD8 Georgian geostd8_general_ci 1
greek ISO 8859-7 Greek greek_general_ci 1
hebrew ISO 8859-8 Hebrew hebrew_general_ci 1
hp8 HP West European hp8_english_ci 1
keybcs2 DOS Kamenicky Czech-Slovak keybcs2_general_ci 1
koi8r KOI8-R Relcom Russian koi8r_general_ci 1
koi8u KOI8-U Ukrainian koi8u_general_ci 1
latin1 cp1252 West European latin1_swedish_ci 1
latin2 ISO 8859-2 Central European latin2_general_ci 1
latin5 ISO 8859-9 Turkish latin5_turkish_ci 1
latin7 ISO 8859-13 Baltic latin7_general_ci 1
macce Mac Central European macce_general_ci 1
macroman Mac West European macroman_general_ci 1
sjis Shift-JIS Japanese sjis_japanese_ci 2
swe7 7bit Swedish swe7_swedish_ci 1
tis620 TIS620 Thai tis620_thai_ci 1
ucs2 UCS-2 Unicode ucs2_general_ci 2
ujis EUC-JP Japanese ujis_japanese_ci 3
utf16 UTF-16 Unicode utf16_general_ci 4
utf16le UTF-16LE Unicode utf16le_general_ci 4
utf32 UTF-32 Unicode utf32_general_ci 4
utf8mb3 UTF-8 Unicode utf8mb3_general_ci 3
utf8mb4 UTF-8 Unicode utf8mb4_0900_ai_ci 4
SELECT * FROM information_schema.character_sets;
SELECT * FROM information_schema.character_sets WHERE CHARACTER_SET_NAME LIKE "%utf%";
CHARACTER_SET_NAME DEFAULT_COLLATE_NAME DESCRIPTION MAXLEN
utf8mb3 utf8mb3_general_ci UTF-8 Unicode 3
utf16 utf16_general_ci UTF-16 Unicode 4
utf16le utf16le_general_ci UTF-16LE Unicode 4
utf32 utf32_general_ci UTF-32 Unicode 4
utf8mb4 utf8mb4_0900_ai_ci UTF-8 Unicode 4

排序规则

SHOW COLLATION;

这就多了,两百多。

SHOW COLLATION LIKE "%ascii%";
Collation Charset Id Default Compiled Sortlen Pad_attribute
ascii_bin ascii 65 Yes 1 PAD SPACE
ascii_general_ci ascii 11 Yes Yes 1 PAD SPACE
SELECT * FROM information_schema.collations WHERE CHARACTER_SET_NAME = "utf8mb4" AND COLLATION_NAME LIKE "%zh%";
COLLATION_NAME CHARACTER_SET_NAME ID IS_DEFAULT IS_COMPILED SORTLEN PAD_ATTRIBUTE
utf8mb4_zh_0900_as_cs utf8mb4 308 Yes 0 NO PAD

命名规则

字符集名称,语言,通用后缀

  • ai Accent-insensitive 重音不敏感
  • as Accent-sensitive 重音敏感
  • ci Case-insensitive 大小写不敏感
  • cs Case-sensitive 大小写敏感
  • ks Kana-sensitive 假名敏感(日语)
  • bin 二进制

8.0 之后,很多编码中多了 0900 字样,表示 Unicode 9.0 规范。

参考资料与拓展阅读

#80 Python 单例模式

2015-04-06

使用全局变量

我个人比较喜欢的方式,简单高效。

class Application:
    pass

APP = Application()

装饰器(使用特定方法来获取对象实例)

import threading

class singleton:
    _instance_lock = threading.Lock()

    def __init__(self, decorated):
        self._decorated = decorated

    def instance(self):
        if not hasattr(self, '_instance'):
            with singleton._instance_lock:
                if not hasattr(self, '_instance'):
                    self._instance = self._decorated()
        return self._instance

    def __call__(self):
        raise TypeError('Singletons must be accessed through `instance()`.')

    def __instancecheck__(self, inst):
        return isinstance(inst, self._decorated)

    def __getattr__(self, name):
        if name == '_instance':
            raise AttributeError("'singleton' object has no attribute '_instance'")
        return getattr(self.instance(), name)

@singleton
class Application:
    pass

Application.instance()

装饰器

def singleton(cls):
    _instances = {}

    def _inner(*args, **kargs):
        if cls not in _instances:
            _instances[cls] = cls(*args, **kargs)
        return _instances[cls]

    return _inner

元类

class Application:
    _instance_lock = threading.Lock()

    def __new__(cls, *args, **kwargs):
        if not hasattr(cls, "_instance"):
            with cls._instance_lock:
                if not hasattr(cls, "_instance"):
                    cls._instance = object.__new__(cls)
        return cls._instance

改成通用方式:

import threading

class SingletonMeta(type):
    _instances = {}
    _instance_lock = threading.Lock()

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            with cls._instance_lock:
                if cls not in cls._instances:
                    instance = super().__call__(*args, **kwargs)
                    cls._instances[cls] = instance
        return cls._instances[cls]

class Application(metaclass=SingletonMeta):
    pass

#79 Djanog 密码算法:PBKDF2

2015-04-06

实现

在 Python 2.7.8 (July 2, 2014),或 Python 3.4 (March 17, 2014) 以上版本,标准库 hashlib 有 pbkdf2_hmac:

import base64
import hashlib
import random
import secrets

def pbkdf2(password, salt, iterations, dklen=0, digest=None):
    if digest is None:
        digest = hashlib.sha256
    if not dklen:
        dklen = None
    password = password.encode('utf-8')
    salt = salt.encode('utf-8')
    return hashlib.pbkdf2_hmac(
        digest().name, password, salt, iterations, dklen)

### pbkdf2_sha256 ###

ALGORITHM = 'pbkdf2_sha256'
ITERATIONS = 30000
DIGEST = hashlib.sha256

def encode(password, salt, iterations=None):
    assert password is not None
    assert salt and '$' not in salt
    if not iterations:
        iterations = ITERATIONS
    hash = pbkdf2(password, salt, iterations, digest=DIGEST)
    hash = base64.b64encode(hash).decode('ascii').strip()
    return "%s$%d$%s$%s" % (ALGORITHM, iterations, salt, hash)

def decode(encoded):
    algorithm, iterations, salt, hash = encoded.split('$', 3)
    assert algorithm == ALGORITHM
    return {
        'algorithm': algorithm,
        'hash': hash,
        'iterations': int(iterations),
        'salt': salt,
    }

def verify(password, encoded):
    algorithm, iterations, salt, hash = encoded.split('$', 3)
    assert algorithm == ALGORITHM
    encoded_2 = encode(password, salt, int(iterations))
    return secrets.compare_digest(encoded.encode('utf-8'),
                                  encoded_2.encode('utf-8'))

salt_length = 32
# salt = bytes([random.randint(0, 255) for i in range(salt_length + 20)])
salt = random.randbytes(salt_length + 20)
salt = salt.replace(b'$', b'').decode('latin')[:salt_length]
# print(repr(salt))
a = encode('123456', salt)
print(repr(a))
print(decode(a))
# 'pbkdf2_sha256$30000$x\x82¨\x07Ó[Ám¬9V¬\x13·sÉ\x1eEܱ3\x04Ü\x07\x05BÁfM\x9fV×$/jqvasjfZX6c9xCytBVqddAJFve2DXcLWClTAsZvl48='
print(verify('234567', a))
print(verify('123456', a))

参考资料与拓展阅读

#77 Django 密码

2015-02-05

https://docs.djangoproject.com/en/2.0/topics/auth/passwords/

在 Django 中,密码哈希存储在 auth_user 表中的 password 字段中。该字段的值包含了算法名称、迭代次数、盐值和哈希值,它们都是使用特定的格式进行编码的。例如,一个密码哈希的值可能如下所示:

pbkdf2_sha256$150000$V7v0fjbhMIhq$Gd/0XuOqK3ib6NBNGVwIKGXcFTUiNbTzdTNN8RiW24E=

用美元符号切割,得到 4 部分:

  • pbkdf2_sha256 算法名称
  • 150000 迭代次数
  • V7v0fjbhMIhq
  • Gd/0XuOqK3ib6NBNGVwIKGXcFTUiNbTzdTNN8RiW24E=

支持的哈希算法

PASSWORD_HASHERS = [
    'django.contrib.auth.hashers.PBKDF2PasswordHasher',
    'django.contrib.auth.hashers.PBKDF2SHA1PasswordHasher',
    'django.contrib.auth.hashers.Argon2PasswordHasher',
    'django.contrib.auth.hashers.BCryptSHA256PasswordHasher',
    'django.contrib.auth.hashers.BCryptPasswordHasher',
]

默认使用的是 PBKDF2PasswordHasher

PBKDF2 是基于密钥的密码派生函数,它使用一个伪随机函数(PRF)和一个盐值来从给定密码派生出一个密钥。
PBKDF2 算法的核心是迭代的使用 PRF 函数,将每一次迭代的输出与前一次迭代的输出进行异或,生成最终的派生密钥。

大概逻辑如下:

import hashlib
import binascii
import os

class PBKDF2PasswordHasher:
    algorithm = 'pbkdf2_sha256'
    iterations = 100000

    def salt(self):
        return binascii.hexlify(os.urandom(16)).decode()

    def encode(self, password, salt):
        dk = hashlib.pbkdf2_hmac('sha256', password.encode(), salt.encode(), self.iterations)
        return '%d$%s$%s' % (self.iterations, salt, binascii.hexlify(dk).decode())

    def verify(self, password, encoded):
        iterations, salt, dk = encoded.split('$')
        iterations = int(iterations)
        hashed_password = self.encode(password, salt)
        return hashed_password == encoded

生成密码

from django.contrib.auth.hashers import make_password

password = '123456'
hashed_password = make_password(password)

验证密码

from django.contrib.auth.hashers import check_password

is_correct_password = check_password(password, hashed_password)

#75 unzip 压缩工具

2015-01-19
zip -e archive.zip file1.txt file2.txt

unzip -l archive.zip # 列出

# 解压缩
unzip archive.zip
unzip -o archive.zip # 覆盖
unzip archive.zip file1.txt file2.txt -d /path/to/directory/ # 指定文件,指定目录
unzip -P password archive.zip # 有密码的压缩文件

#74 Linux 信号

2015-01-17

信号

信号可以被捕获,或者被忽略,除了 9 KILL 和 19 STOP

信号值

Ubuntu:

$ kill -l
HUP INT QUIT ILL TRAP IOT BUS FPE KILL USR1 SEGV USR2 PIPE ALRM TERM STKFLT CHLD CONT STOP TSTP TTIN TTOU URG XCPU XFSZ VTALRM PROF WINCH POLL PWR SYS

$ type kill
kill is a shell builtin

$ /usr/bin/kill -L
 1 HUP      2 INT      3 QUIT     4 ILL      5 TRAP     6 ABRT     7 BUS
 8 FPE      9 KILL    10 USR1    11 SEGV    12 USR2    13 PIPE    14 ALRM
15 TERM    16 STKFLT  17 CHLD    18 CONT    19 STOP    20 TSTP    21 TTIN
22 TTOU    23 URG     24 XCPU    25 XFSZ    26 VTALRM  27 PROF    28 WINCH
29 POLL    30 PWR     31 SYS

CentOS:

$ kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
 6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX
  1. HUP 终端关闭
  2. INT Ctrl + C
  3. QUIT 终端退出 Ctrl + 4 or \
  4. ILL
  5. TRAP
  6. IOT/ABRT
  7. BUS
  8. FPE
  9. KILL 常来自 kill -9
  10. USR1
  11. SEGV
  12. USR2
  13. PIPE
  14. ALRM
  15. TERM
  16. STKFLT
  17. CHLD 子进程终止
  18. CONT
  19. STOP
  20. TSTP 终端挂起 Ctrl + Z
  21. TTIN
  22. TTOU
  23. URG
  24. XCPU
  25. XFSZ
  26. VTALRM
  27. PROF
  28. WINCH 终端窗口大小改变
  29. POLL
  30. PWR
  31. SYS

由于 9 KILL 不能被捕获,设计软件时可以捕获 15 TERM, 进行清理操作,然后需要杀进程时,先尝试 kill -15

附: 信号表(来自 man 7 signal

Signal x86/ARM
most others
Alpha/
SPARC
MIPS PARISC Notes
SIGHUP 1 1 1 1
SIGINT 2 2 2 2
SIGQUIT 3 3 3 3
SIGILL 4 4 4 4
SIGTRAP 5 5 5 5
SIGABRT 6 6 6 6
SIGIOT 6 6 6 6
SIGBUS 7 10 10 10
SIGEMT - 7 7 -
SIGFPE 8 8 8 8
SIGKILL 9 9 9 9
SIGUSR1 10 30 16 16
SIGSEGV 11 11 11 11
SIGUSR2 12 31 17 17
SIGPIPE 13 13 13 13
SIGALRM 14 14 14 14
SIGTERM 15 15 15 15
SIGSTKFLT 16 - - 7
SIGCHLD 17 20 18 18
SIGCLD - - 18 -
SIGCONT 18 19 25 26
SIGSTOP 19 17 23 24
SIGTSTP 20 18 24 25
SIGTTIN 21 21 26 27
SIGTTOU 22 22 27 28
SIGURG 23 16 21 29
SIGXCPU 24 24 30 12
SIGXFSZ 25 25 31 30
SIGVTALRM 26 26 28 20
SIGPROF 27 27 29 21
SIGWINCH 28 28 20 23
SIGIO 29 23 22 22
SIGPOLL Same as SIGIO
SIGPWR 30 29/- 19 19
SIGINFO - 29/- - -
SIGLOST - -/29 - -
SIGSYS 31 12 12 31
SIGUNUSED 31 - - 31

#73 Apache: client denied by server configuration

2015-01-04

2.2 版本下的权限控制语句在 2.4 下已经不再适用,需要做一些调整。

Order deny,allow
Deny from all

应该改成:Require all denied

Order deny,allow
Deny from all
Allow from example.com

应该改成:Require host example.com

Order allow,deny
Allow from all

应该改成:Require all granted

嗯,挺好,新语法更加简单。