Home (93) - 码厩

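#130 华人 (Ethnic Chinese) vs. 华侨 (Overseas Chinese Nationals)

Personal 2016-03-16

Every Spring Festival Gala sends greetings to overseas 华人 and 华侨, but I had never thought about how the two terms differ. Some material I came across online recently finally made it clear to me.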

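#129 Removing File Attributes

Linux 2016-03-15

# Reset the access and modification times to the epoch (interpreted in local time)
touch -amt 197001010000.00 <file>

sudo apt install -y xattr
# List all extended attributes
xattr -l <file>
# Remove all extended attributes
xattr -c <file>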
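A rough Python equivalent of the commands above, as a minimal sketch. The helper names `reset_timestamps` and `clear_xattrs` are made up for illustration. Note that `os.utime` with `(0, 0)` sets the UTC epoch, whereas `touch -t` interprets the timestamp in local time, and that `os.listxattr` / `os.removexattr` are only available on Linux.

```python
import os


def reset_timestamps(path):
    """Set both the access time and the modification time to the Unix epoch (UTC)."""
    os.utime(path, times=(0, 0))


def clear_xattrs(path):
    """Remove every extended attribute from path and return the removed names.

    A filesystem without xattr support raises OSError on listing; that case is
    treated here as "nothing to remove". Removing protected attributes (for
    example in the security.* namespace) may still raise PermissionError.
    """
    try:
        names = os.listxattr(path)
    except OSError:
        return []
    for name in names:
        os.removexattr(path, name)
    return names
```

Calling `reset_timestamps("some_file")` followed by `clear_xattrs("some_file")` roughly mirrors the `touch` and `xattr -c` steps.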

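History

In the early 1970s, IBM developed SEQUEL (Structured English Query Language) for managing relational databases (RDBs).
In the late 1970s, IBM and Oracle each began developing SQL-based RDBMSs.
PS: IBM's products include the famous DB2, one of the earliest commercial SQL databases.
PS: At the time, Oracle was still called Relational Software, Inc.
In 1980, SEQUEL was renamed SQL because of a trademark conflict.
Although the official pronunciation is ess-cue-el, many people still pronounce it /ˈsiːkwəl/.
In 1986, it was standardized by the American National Standards Institute (ANSI X3.135-1986).
In 1987, ISO adopted ANSI SQL (ISO 9075:1987), which is why this version is also known as SQL-87.
Later revisions followed: SQL-89, SQL-92, SQL:1999, SQL:2003, and so on.
The standard is presumably developed and maintained by ISO these days, though it hardly matters here.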

Versions

SQL-86 (or SQL-87) is the ISO 9075:1987 standard of 1987
SQL-89 is the ISO/IEC 9075:1989 standard of 1989
SQL-92 is the ISO/IEC 9075:1992 standard of 1992
SQL:1999 is the ISO/IEC 9075:1999 standard of 1999
SQL:2003 is the ISO/IEC 9075:2003 standard of 2003
SQL:2006 is the ISO/IEC 9075:2006 standard of 2006
SQL:2008 is the ISO/IEC 9075:2008 standard of 2008
SQL:2011 is the ISO/IEC 9075:2011 standard of 2011
SQL:2016 is the ISO/IEC 9075:2016 standard of 2016

Year	Name	Alias	Comments
1986	SQL-86	SQL-87	First formalized by ANSI
1989	SQL-89		Minor revision that added integrity constraints
1992	SQL-92	SQL2	Major revision (ISO 9075)
1999	SQL:1999	SQL3
2003	SQL:2003
2006	SQL:2006
2008	SQL:2008
2011	SQL:2011
2016	SQL:2016
2019	SQL:2019

SQL:1999
Added regular expression matching, recursive queries (e.g. transitive closure), triggers, support for procedural and control-of-flow statements, nonscalar types (arrays), and some object-oriented features (e.g. structured types), support for embedding SQL in Java (SQL/OLB) and vice versa (SQL/JRT)
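To make "recursive queries (e.g. transitive closure)" concrete, here is a small sketch that runs a `WITH RECURSIVE` query through Python's built-in `sqlite3` module. SQLite is used only because it needs no server; the `edge` table and its contents are invented for the example, and a reasonably recent SQLite (3.8.3 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edge (src TEXT, dst TEXT);
    INSERT INTO edge VALUES ('a', 'b'), ('b', 'c'), ('c', 'd');
""")

# Collect every node reachable from 'a' by repeatedly following edges.
# UNION (rather than UNION ALL) drops duplicates, so cycles would also terminate.
rows = conn.execute("""
    WITH RECURSIVE reach(node) AS (
        SELECT 'a'
        UNION
        SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.node
    )
    SELECT node FROM reach ORDER BY node
""").fetchall()

reachable = [row[0] for row in rows]
print(reachable)
conn.close()
```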

2003

Introduced XML-related features (SQL/XML), window functions, standardized sequences, and columns with autogenerated values (including identity columns)

2006
ISO/IEC 9075-14:2006 defines ways that SQL can be used with XML. It defines ways of importing and storing XML data in an SQL database, manipulating it within the database, and publishing both XML and conventional SQL-data in XML form. In addition, it lets applications integrate queries into their SQL code with XQuery, the XML Query Language published by the World Wide Web Consortium (W3C), to concurrently access ordinary SQL-data and XML documents.

2008
Legalizes ORDER BY outside cursor definitions. Adds INSTEAD OF triggers, the TRUNCATE statement, and the FETCH clause

2011
Adds temporal data (PERIOD FOR). Enhancements for window functions and the FETCH clause.

2016
Adds row pattern matching, polymorphic table functions, JSON

2019
Adds Part 15, multidimensional arrays (MDarray type and operators)

https://en.wikibooks.org/wiki/Structured_Query_Language
https://en.wikibooks.org/wiki/MySQL
https://en.wikibooks.org/wiki/PostgreSQL
https://en.wikibooks.org/wiki/SQLite
https://en.wikipedia.org/wiki/SQL_reserved_words

ISO 9075

The 2016 edition of the SQL standard is divided into 9 parts (Parts 5, 6, 7, 8, and 12 are missing from the numbering, presumably because they were withdrawn):

Part 1: Framework (SQL/Framework)
Basic concepts
Part 2: Foundation (SQL/Foundation)
Core syntax
Part 3: Call-Level Interface (SQL/CLI)
Presumably the interface for calling SQL from programming languages
Part 4: Persistent stored modules (SQL/PSM)
Procedural programming in SQL
Part 9: Management of External Data (SQL/MED)
Part 10: Object language bindings (SQL/OLB)
Covers SQLJ for Java
Part 11: Information and definition schemas (SQL/Schemata)
Part 13: SQL Routines and types using the Java™ programming language (SQL/JRT)
Also Java-related
Part 14: XML-Related Specifications (SQL/XML)
XML-related

PS: The common title prefix "ISO/IEC 9075-<n>:2016 – Information technology – Database languages – SQL –" is omitted above.

PS: There is also an extension standard: ISO/IEC 13249 SQL Multimedia and Application Packages

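Dialects

Most databases do not implement the standard strictly, so SQL statements written for one platform often do not run unchanged on another.

The following are two major SQL dialects: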

T-SQL(Transact-SQL): SQLServer
PL/SQL: Oracle

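#127 Thoughts on 《过年好》

Film & TV 2016-03-07

I watched a pirated copy, guilty as charged...

I just watched 《过年好》, starring Zhao Benshan and Yan Ni. It was quite good, and it broke my impression that anything featuring Zhao Benshan's crew is bound to be a bad film.

One scene has stuck with me: the mother (Yan Ni) and her daughter are arguing while the grandfather (Zhao Benshan) silently downs one drink after another beside them.

And then there is Xiaotian, who seems a little slow and keeps singing 《一人我饮酒醉》.

...

But why do the posters make it look as if it is being marketed as a New Year comedy (贺岁片)?

Its theme is similar to that of 《一切都好》, starring Zhang Guoli and Yao Chen; both films could have taken an art-house direction.

Perhaps art-house films simply have no box-office market in China. When big names are hired to make a film, the return on investment has to be calculated. The investors' goal is to grab as large a slice as possible of the year-end box-office pie, and packaging the film as a New Year comedy is clearly the way to do that.

Alone, I drink until I am drunk
Drunk, I pair the fair one off with a match
My two eyes follow her alone
I only ask that someday we return as two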

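#126 Excel Tips

Excel 2016-02-20

2016/02/20, Excel Tips
2017/01/21, Python Excel 01
2017/01/22, Python Excel 02
2022/01/26, Learning Excel
2022/04/18, A Math Improvement Plan for Kids
2022/04/21, Excel Operations in Go

Excel file > Sheet (worksheet) > Rows and columns

Basics

Sheetname!A1 refers to a cell on another sheet
[filename]Sheetname!A1 refers to a cell in another workbook
References
1. =A1 relative reference
2. =$A$1 absolute reference
3. Mixed references
  1. =A$1 relative column + absolute row
  2. =$A1 relative row + absolute column
4. Press F4 to cycle through the reference styles
Operators: +, -, *, /, (, )
Search wildcards
1. ? matches a single character
2. * matches any number of characters
Strings
1. A leading single quote marks the value as text; a green triangle appears in the top-left corner of the cell
  1. The right-click menu offers: Convert to Number
2. =A1&B1 concatenates strings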
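The reference styles listed above differ in what happens when a formula is copied to another cell: relative parts shift with the copy, parts pinned with `$` stay put. The sketch below imitates that behaviour for a single-cell reference. It is not how Excel is implemented, and `shift_reference` and its helpers are hypothetical names for this example; it also does no bounds checking when shifting up or left.

```python
import re

# Optional "$", column letters, optional "$", row number, e.g. "$A1" or "B$12".
_REF = re.compile(r"^(\$?)([A-Z]+)(\$?)(\d+)$")


def col_to_num(col):
    """Convert column letters to a 1-based number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """Convert a 1-based column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def shift_reference(ref, rows, cols):
    """Return ref as it would look after copying it `rows` down and `cols` right."""
    m = _REF.match(ref)
    if m is None:
        raise ValueError("not a single-cell reference: %r" % ref)
    col_abs, col, row_abs, row = m.groups()
    if not col_abs:  # relative column: moves with the copy
        col = num_to_col(col_to_num(col) + cols)
    if not row_abs:  # relative row: moves with the copy
        row = str(int(row) + rows)
    return f"{col_abs}{col}{row_abs}{row}"


for ref in ("A1", "$A$1", "A$1", "$A1"):
    print(ref, "->", shift_reference(ref, rows=1, cols=1))
```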

Functions

NOW
AND, OR, NOT, XOR
Lookup
1. LOOKUP、MATCH、INDEX
2. VLOOKUP
3. XLOOKUP
4. HLOOKUP
Statistics
1. SUM: sum
  1. SUMIF
  2. SUMIFS
2. MAX, MIN, AVERAGE, MEDIAN: maximum, minimum, mean, median
  1. MAXA
  2. MAXIFS
  3. MINA
  4. MINIFS
  5. AVERAGEIF
  6. AVERAGEIFS
3. COUNT: count
  1. COUNTIF
  2. COUNTIFS
  3. COUNTA
  4. COUNTBLANK
4. PRODUCT (a1 * b1 * a2 * b2 * ...)
5. SUMPRODUCT (a1 * b1 + a2 * b2 + ...)
6. Standard deviation
  1. STDEV
  2. STDEVA
  3. STDEVP
  4. STDEVPA
  5. STDEV.P
  6. STDEV.S
7. Variance
  1. VAR
  2. VARA
  3. VARP
  4. VARPA
  5. VAR.P
  6. VAR.S
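The many STDEV/VAR variants mostly come down to one distinction: sample statistics (divide by n - 1) versus population statistics (divide by n). A quick sketch with Python's `statistics` module shows the two side by side; the data set is made up, and the mapping to Excel function names in the comments reflects my understanding of those functions.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32

pop_var = statistics.pvariance(data)    # like VAR.P / VARP: 32 / 8
pop_sd = statistics.pstdev(data)        # like STDEV.P / STDEVP: sqrt(32 / 8)
sample_var = statistics.variance(data)  # like VAR.S / VAR: 32 / 7
sample_sd = statistics.stdev(data)      # like STDEV.S / STDEV: sqrt(32 / 7)

print("population:", pop_var, pop_sd)
print("sample:    ", sample_var, sample_sd)
```

The sample versions are always a little larger than the population versions for the same data, which is a handy sanity check when picking between them.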

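Tips

Diagonal header cell
Format Cells > Border > diagonal line
Type the top-right text in the cell, insert a line break, type the bottom-left text, then pad with spaces to align the two.
Freeze Panes
Alt + Enter: line break within a cell
Ctrl + ;: insert the current date
Ctrl + Shift + ;: insert the current time
Fill handle
1. Select a cell and drag the fill handle: fill with a series
  1. With Ctrl held: fill with copies
2. Double-click the fill handle: fill with a series down to the end of the adjacent column
  1. With Ctrl held: fill with copies down to the end of the adjacent column
Select a row or column and drag it with Shift held to move it
Move a table / row / column: when the black four-headed arrow appears on the border, drag with Shift held
Undo Ctrl + Z / Redo Ctrl + Y
Select a range and press Delete to clear its data

References and further reading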

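#125 China's Provincial-Level Administrative Divisions

Chinese Geography 2016-02-10

North China
- Beijing Municipality (京)
- Tianjin Municipality (津)
- Hebei Province (冀)
- Shanxi Province (晋)
- Inner Mongolia Autonomous Region (蒙)
Northeast China
- Liaoning Province (辽)
- Jilin Province (吉)
- Heilongjiang Province (黑)
East China
- Shanghai Municipality (沪)
- Jiangsu Province (苏)
- Zhejiang Province (浙)
- Anhui Province (皖)
- Fujian Province (闽)
- Jiangxi Province (赣)
- Shandong Province (鲁)
Central China
- Henan Province (豫)
- Hubei Province (鄂)
- Hunan Province (湘)
South China
- Guangdong Province (粤)
- Guangxi Zhuang Autonomous Region (桂)
- Hainan Province (琼)
Southwest China
- Chongqing Municipality (渝)
- Sichuan Province (川)
- Guizhou Province (贵)
- Yunnan Province (云)
- Tibet Autonomous Region (藏)
Northwest China
- Shaanxi Province (陕)
- Gansu Province (甘)
- Qinghai Province (青)
- Ningxia Hui Autonomous Region (宁)
- Xinjiang Uygur Autonomous Region (新)
Hong Kong, Macao, and Taiwan
- Hong Kong Special Administrative Region (港)
- Macao Special Administrative Region (澳)
- Taiwan Province (台)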

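34 provincial-level administrative divisions in total:

23 provinces
5 autonomous regions: Inner Mongolia, Xinjiang, Tibet, Ningxia, Guangxi
4 municipalities: Beijing, Tianjin, Shanghai, Chongqing
2 special administrative regions: Hong Kong, Macao

Hebei, Henan
Hubei, Hunan, Jiangxi
Guangdong, Guangxi, Fujian, Hainan
Shandong, Shanxi
Liaoning, Jilin, Heilongjiang
Xinjiang, Tibet, Inner Mongolia
Shaanxi, Gansu, Qinghai, Ningxia
Sichuan, Chongqing
Guizhou, Yunnan
Jiangsu, Zhejiang, Anhui
Hong Kong, Macao, Taiwan
Beijing, Tianjin, Shanghai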

#124 MySQL Indexes

MySQL 2016-02-08

CREATE [UNIQUE | FULLTEXT | SPATIAL] INDEX index_name
    [index_type]
    ON tbl_name (key_part,...)
    [index_option]
    [algorithm_option | lock_option] ...

key_part:
    col_name [(length)] [ASC | DESC]

index_option: {
    KEY_BLOCK_SIZE [=] value
  | index_type
  | WITH PARSER parser_name
  | COMMENT 'string'
}

index_type:
    USING {BTREE | HASH}

algorithm_option:
    ALGORITHM [=] {DEFAULT | INPLACE | COPY}

lock_option:
    LOCK [=] {DEFAULT | NONE | SHARED | EXCLUSIVE}

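By function

Primary key index (PRIMARY KEY)
Unique index (UNIQUE)
Ordinary index (KEY)
Full-text index (FULLTEXT)
Spatial index (SPATIAL)

By algorithm

B-Tree
Hash
R-Tree (used by spatial indexes)
Inverted index (used by full-text indexes)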

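Clustered Index

A clustered index is a special kind of index that determines the physical storage order of the rows in a table.
In a clustered index the rows are stored on disk in index-key order, so rows with adjacent keys are physically close together, which speeds up lookups and range scans by key.
A table can have only one clustered index, because its rows can be physically ordered in only one way.

All other indexes are called secondary indexes.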

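InnoDB uses the primary key as the clustered index when one exists; otherwise it uses the first UNIQUE index whose columns are all NOT NULL; if there is none either, it generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic, monotonically increasing 6-byte row ID column.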
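The selection order InnoDB follows can be written down as a tiny decision function. This is only a model of the rule for illustration, not anything taken from MySQL itself; the `Column` and `Index` classes and `choose_clustered_index` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Column:
    name: str
    nullable: bool = True


@dataclass
class Index:
    name: str
    columns: List[Column]
    unique: bool = False


def choose_clustered_index(primary_key: Optional[Index], indexes: List[Index]) -> str:
    """Return the name of the index that would serve as the clustered index."""
    # 1. A primary key always wins.
    if primary_key is not None:
        return primary_key.name
    # 2. Otherwise the first UNIQUE index (in definition order) with no nullable column.
    for idx in indexes:
        if idx.unique and all(not col.nullable for col in idx.columns):
            return idx.name
    # 3. Otherwise a hidden index over a synthetic row ID.
    return "GEN_CLUST_INDEX"


email = Index("uk_email", [Column("email", nullable=True)], unique=True)
code = Index("uk_code", [Column("code", nullable=False)], unique=True)
print(choose_clustered_index(None, [email, code]))  # uk_email is skipped: nullable column
```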

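Note: every index speeds up queries at the cost of slower inserts, deletes, and updates to the indexed columns. This is especially true of the clustered index, because inserting rows out of key order can force the rows to be physically reorganized.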

#123 Python tempfile

Python 2016-02-03

import tempfile
import time

timestr = time.strftime("%Y%m%d%H%M%S")

Temporary files

# tempfile.mktemp(suffix='', prefix='tmp', dir=None)
# tempfile.mkstemp(suffix=None, prefix=None, dir=None, text=False)

filepath_temp = tempfile.mktemp(suffix='.html', prefix='.cache_%s_' % timestr)
print(filepath_temp)
# /tmp/.cache_20210929205452_zlmi3pkf.html

filepath_temp = tempfile.mktemp(suffix='.html', prefix='.cache_%s_' % timestr,
                                dir='/opt/apps/markjour/tmp/')
print(filepath_temp)
# /opt/apps/markjour/tmp/.cache_20210929205452_mcr7nj3e.html

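fd, filepath_temp = tempfile.mkstemp(suffix='.html', prefix='.cache_%s_' % timestr, text=True)

The difference between mkstemp and mktemp: mkstemp creates the file and returns a tuple (file descriptor, path), whereas mktemp only returns a path and creates nothing.
Another process can create a file at the path mktemp returned before you open it, so mktemp is UNSAFE; use mkstemp instead.
mktemp has been marked Deprecated since Python 2.3, but it can still be called today.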
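Since mkstemp hands back an already-open file descriptor, the caller is responsible for both closing it and deleting the file. One common way to handle that, sketched here with made-up file contents, is to wrap the descriptor with `os.fdopen` and remove the file in a `finally` block:

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".html", prefix=".cache_", text=True)
try:
    # os.fdopen takes ownership of fd; leaving the with-block closes it.
    with os.fdopen(fd, "w") as fh:
        fh.write("<p>cached</p>")

    # The file stays on disk until it is removed explicitly.
    with open(path) as fh:
        content = fh.read()
finally:
    os.remove(path)

print(content)
```

If automatic cleanup is preferred, `tempfile.NamedTemporaryFile` does the closing and deleting itself.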

Temporary directories

# tempfile.mkdtemp(suffix=None, prefix=None, dir=None)
dirpath_temp = tempfile.mkdtemp()
print(dirpath_temp)
# /tmp/tmp1dj2cl0d

dirpath_temp = tempfile.mkdtemp(prefix='build_',
                                dir='/opt/apps/markjour/tmp/')
print(dirpath_temp)
# /opt/apps/markjour/tmp/build_9k5synh5
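A directory created with mkdtemp is never removed automatically. When the directory is only needed for the duration of a block, `tempfile.TemporaryDirectory` can be used as a context manager instead; a minimal sketch (the file name `output.txt` is arbitrary):

```python
import os
import tempfile

with tempfile.TemporaryDirectory(prefix="build_") as dirpath:
    artifact = os.path.join(dirpath, "output.txt")
    with open(artifact, "w") as fh:
        fh.write("done")
    existed_inside = os.path.exists(artifact)

# On leaving the with-block the directory and everything in it are deleted.
removed_after = not os.path.exists(dirpath)
print(existed_inside, removed_after)
```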

#122 Python zipfile

Python zip 2016-02-02

import zipfile
import os

zip_path = "/path/to/zip/file.zip"
extract_path = "/path/to/extract/directory"

os.makedirs(extract_path, exist_ok=True)

with zipfile.ZipFile(zip_path, 'r') as fp:
    # Option 1: extract everything at once
    fp.extractall(extract_path)

    # Option 2: extract the members one by one
    for filename in fp.namelist():

        # Remove an existing file first, to avoid FileExistsError
        extract_file_path = os.path.join(extract_path, filename)
        if os.path.isfile(extract_file_path):
            # print("Skipping:", filename)
            # continue
            os.remove(extract_file_path)

        # Extract; the directory structure is created automatically
        fp.extract(filename, extract_path)
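To try the extraction without having an archive at hand, the following self-contained sketch builds a small zip in a temporary directory, extracts it, and reads one member back. The member names and contents are invented for the example.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as workdir:
    zip_path = os.path.join(workdir, "demo.zip")
    extract_path = os.path.join(workdir, "out")

    # Create an archive with a nested member and a top-level member.
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("docs/readme.txt", "hello")
        zf.writestr("data.csv", "a,b\n1,2\n")

    # extractall creates extract_path and any sub-directories as needed.
    with zipfile.ZipFile(zip_path, "r") as zf:
        names = zf.namelist()
        zf.extractall(extract_path)

    with open(os.path.join(extract_path, "docs", "readme.txt")) as fh:
        readme = fh.read()

print(names, readme)
```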

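#121 Linux File Descriptors

Linux 2016-02-01

File descriptors

The "everything is a file" design idea comes from early Unix and remains one of the most important concepts in the Unix/Linux world (although, according to material on the Plan 9 operating system, it was never carried through completely).

Maximum number of open files

System-wide limit

cat /proc/sys/fs/file-nr
# 17781 0   9223372036854775807
# allocated, allocated but unused, maximum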

cat /proc/sys/fs/file-max
# 9223372036854775807

sudo sysctl -a | grep file
# fs.file-max = 9223372036854775807
# fs.file-nr = 17813    0   9223372036854775807
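The three numbers in file-nr are easy to pick apart programmatically. A small sketch, where `parse_file_nr` is a hypothetical helper and the sample string is the output shown above:

```python
import os


def parse_file_nr(text):
    """Split the content of /proc/sys/fs/file-nr into its three counters."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


sample = "17781\t0\t9223372036854775807\n"
info = parse_file_nr(sample)
print(info)

# On a Linux machine the live values can be read the same way.
if os.path.exists("/proc/sys/fs/file-nr"):
    with open("/proc/sys/fs/file-nr") as fh:
        print(parse_file_nr(fh.read()))
```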

To change it:

echo 100000000 | sudo tee /proc/sys/fs/file-max
sudo sysctl -w fs.file-max=100000000

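Per-process limit

# ulimit is a shell builtin
ulimit -a       # print all resource limits
ulimit -n       # maximum number of open files for the current shell
ulimit -n 10240 # change the maximum number of open files for the current shell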
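The same limit can be inspected and adjusted from inside a Python process with the standard `resource` module (Unix only). The sketch below raises the soft limit to the hard limit, which an unprivileged process is normally allowed to do; the error handling is there because some platforms reject the request.

```python
import resource

# Soft limit: what is currently enforced. Hard limit: the ceiling for the soft limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("before: soft=%s hard=%s" % (soft, hard))

try:
    # Raising the soft limit up to the hard limit needs no extra privileges.
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
except (ValueError, OSError) as exc:
    print("could not raise the soft limit:", exc)

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("after:  soft=%s hard=%s" % (new_soft, new_hard))
```

The change applies to the current process and to any children it starts afterwards, much like `ulimit -n` in a shell.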

ps -ef | grep sshd | grep -Fv 'grep '
root        1080       1  0 10月16 ?      00:00:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
sudo ls -l /proc/$(pgrep -o sshd)/fd
总用量 0
lr-x------ 1 root root 64 10月 17 12:04 0 -> /dev/null
lrwx------ 1 root root 64 10月 17 12:04 1 -> 'socket:[25487]'
lrwx------ 1 root root 64 10月 17 12:04 2 -> 'socket:[25487]'
lrwx------ 1 root root 64 10月 17 12:04 3 -> 'socket:[28674]'
lrwx------ 1 root root 64 10月 17 12:04 4 -> 'socket:[28676]'
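The /proc/<pid>/fd directory can also be read from code. As a rough sketch (Linux only; `open_fds` is a hypothetical helper), this lists the descriptors of the current process together with what each one points to:

```python
import os


def open_fds(pid="self"):
    """Map each open file descriptor of a process to the target of its /proc symlink."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listing and reading, e.g. the
            # one os.listdir itself used to scan the directory.
            continue
    return result


for fd, target in sorted(open_fds().items()):
    print(fd, "->", target)
```

Reading another process's descriptors works the same way with its PID, but usually requires root, as in the sshd example above.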

/etc/security/limits.conf

echo "* soft nofile 65535" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 65535" | sudo tee -a /etc/security/limits.conf

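For login users, these limits are applied according to the PAM configuration in /etc/pam.d/login:
if it contains session required /lib/security/pam_limits.so, the module loads /etc/security/limits.conf and applies the per-user limits defined there.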

<domain>        <type>  <item>  <value>

domain
a user name
a group name, with @group syntax
the wildcard *, for default entry
the wildcard %, can be also used with %group syntax, for maxlogin limit
NOTE: group and wildcard limits are not applied to root.
To apply a limit to the root user, must be the literal username root.
type
"soft" for enforcing the soft limits
"hard" for enforcing hard limits
item
core - limits the core file size (KB)
data - max data size (KB)
fsize - maximum filesize (KB)
memlock - max locked-in-memory address space (KB)
nofile - max number of open file descriptors
rss - max resident set size (KB)
stack - max stack size (KB)
cpu - max CPU time (MIN)
nproc - max number of processes
as - address space limit (KB)
maxlogins - max number of logins for this user
maxsyslogins - max number of logins on the system
priority - the priority to run user process with
locks - max number of file locks the user can hold
sigpending - max number of pending signals
msgqueue - max memory used by POSIX message queues (bytes)
nice - max nice priority allowed to raise to values: [-20, 19]
rtprio - max realtime priority
chroot - change root to directory (Debian-specific)
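Each entry in limits.conf is just four whitespace-separated fields, so a line is simple to parse. A minimal sketch, assuming `#` starts a comment; `LimitEntry` and `parse_limits_line` are hypothetical names, and no validation of the field values is attempted.

```python
from typing import NamedTuple, Optional


class LimitEntry(NamedTuple):
    domain: str
    type: str
    item: str
    value: str


def parse_limits_line(line: str) -> Optional[LimitEntry]:
    """Parse one limits.conf line; return None for blank lines and comments."""
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected <domain> <type> <item> <value>: %r" % line)
    return LimitEntry(*fields)


for raw in ("* soft nofile 65535", "@dev hard nproc 4096", "# a comment"):
    print(parse_limits_line(raw))
```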
