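#719 Loki Logging System
Architecture Logging Loki 2021-12-05
Loki
A logging system from Grafana Labs. The project is only a few years old and still relatively young, but it is already fairly well known.
The best-known logging stack in the industry is ELK. It builds a full-text index over the logs, which makes searching fast and very flexible, but the large index also makes storage comparatively expensive.
Loki instead splits each log entry into three parts: timestamp, labels, and log line. The labels are the index, stored in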
Promtail
Grafana
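Grafana is a dashboard tool commonly used for monitoring. It does not collect or store data itself; it reads from other data sources.
Through built-in plugins, Grafana supports a range of relational and time-series databases (Zabbix is usually paired with MySQL for storage, and Prometheus can itself be regarded as a time-series database), as well as data sources such as Loki and Elasticsearch.
Hands-on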
Install Loki & Promtail
# Get the latest version number
# LOKI_VERSION=$(curl -s https://api.github.com/repos/grafana/loki/releases/latest | jq -r .tag_name)
LOKI_VERSION=$(curl -s https://api.github.com/repos/grafana/loki/releases/latest | grep -Po '"tag_name": "\Kv[0-9.]+')
# Download loki & promtail
curl -O -L "https://github.com/grafana/loki/releases/download/${LOKI_VERSION}/loki-linux-amd64.zip"
curl -O -L "https://github.com/grafana/loki/releases/download/${LOKI_VERSION}/promtail-linux-amd64.zip"
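# loki    : 18M -> 57M (after unzip)
# promtail: 21M -> 74M (after unzip)
# Unpack & install
# unzip treats a second argument as a member to extract, so unpack each archive separately
unzip loki-linux-amd64.zip
unzip promtail-linux-amd64.zip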
sudo mv -n loki-linux-amd64 /usr/local/bin/loki
sudo mv -n promtail-linux-amd64 /usr/local/bin/promtail
# chmod a+x /usr/local/bin/{loki,promtail} # already 755
# Download the config files
sudo -E wget -qO /etc/loki.config.yaml "https://raw.githubusercontent.com/grafana/loki/${LOKI_VERSION}/cmd/loki/loki-local-config.yaml"
sudo -E wget -qO /etc/promtail.config.yaml "https://raw.githubusercontent.com/grafana/loki/${LOKI_VERSION}/clients/cmd/promtail/promtail-local-config.yaml"
ls -l /etc/{loki,promtail}.config.yaml
# Start loki
loki -config.file /etc/loki.config.yaml
# Check from another terminal
browse http://localhost:3100/metrics
# Start promtail
promtail -config.file /etc/promtail.config.yaml
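Besides Promtail, a log line can also be sent to Loki by hand through its HTTP push API, which makes the timestamp / labels / log line split visible. The following is a minimal sketch, assuming the local instance started above listens on port 3100; the function name `loki_payload` and the label `job=demo` are made up for illustration.

```shell
#!/usr/bin/env bash
# Build a Loki push payload: one stream (a label set) with one [timestamp, line] entry.
# Arguments: label name, label value, timestamp in nanoseconds, log line.
loki_payload() {
  local name=$1 value=$2 ts_ns=$3 line=$4
  # Escape backslashes and double quotes so the line is a valid JSON string.
  line=$(printf '%s' "$line" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"streams":[{"stream":{"%s":"%s"},"values":[["%s","%s"]]}]}\n' \
    "$name" "$value" "$ts_ns" "$line"
}

# Usage against the local instance (assumes GNU date for %N):
# loki_payload job demo "$(date +%s%N)" 'hello loki' |
#   curl -s -H 'Content-Type: application/json' -X POST \
#        --data-binary @- http://localhost:3100/loki/api/v1/push
# Query it back by label:
# curl -G -s http://localhost:3100/loki/api/v1/query_range \
#      --data-urlencode 'query={job="demo"}'
```

Only the label set is indexed; the log line itself is stored as-is, which is why queries start from a label selector such as `{job="demo"}`.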
Install Grafana
sudo apt-get install -y apt-transport-https software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
# Beta version
# echo "deb https://packages.grafana.com/oss/deb beta main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install -y grafana
# Message seen during install: cannot create home directory "/usr/share/grafana"
# sudo systemctl daemon-reload
# sudo systemctl enable grafana-server
sudo systemctl start grafana-server
browse http://localhost:3000
References and further reading
- WeChat official account 云原生实验室 (Cloud Native Lab): 使用 Grafana 和 Loki 监控传说中的武当纵云梯
- WeChat official account 云原生实验室 (Cloud Native Lab): 轻量级云原生日志收集方案 Loki
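#718 Immutable Operating Systems
Operating Systems Fedora 2021-12-04
Trade-off: flexibility
Advantages: stability and security (a read-only file system is hard to corrupt and easy to roll back)
- Desktop
  - Fedora Silverblue: based on Fedora, manages applications with Flatpak, and supports atomic updates.
  - Vanilla OS: based on Debian Sid, uses the Ext4 file system, and installs software through the Apx tool.
  - openSUSE Aeon: hides the complexity of Btrfs, suits desktop users, and combines Flatpak with a stable system base.
  - NixOS
  - Endless OS
- Server
  - Fedora CoreOS: focuses on minimalism and automatic atomic updates; suited to hosting containers or Kubernetes clusters.
  - Flatcar Container Linux: carries on the legacy of CoreOS Container Linux and focuses on containerized environments.
  - openSUSE MicroOS (Server Edition): based on openSUSE Tumbleweed; provides transactional updates and a Btrfs-based immutable root file system.
  - AWS Bottlerocket
  - Talos Linux: focused on Kubernetes clusters.
- Other
  - Ubuntu Core
  - Photon OS
  - blendOS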
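#717 Repost: Java's Risks and Opportunities in the Cloud-Native Era
Java Cloud Native 2021-12-03
Today, the 25-year-old Java is still the most dominant programming language. It has long held first place in programming-language rankings and has a community of twelve million developers. Forty-five billion physical devices worldwide run Java technology, and more than twenty-five billion Java virtual machine process instances run in virtualized environments in cloud data centers (figures from an Oracle WebCast).
These figures testify to Java's achievements over the past 25 years, and they are also the solid ramparts that keep the Java ecosystem in its position as the "number one" programming language. In competing with other languages, Java's confidence has never come from how advanced or convenient its syntax and class libraries are, but from its huge user base and extremely mature software ecosystem, which cannot be shaken overnight. Yet this Java empire, which still looks indestructible, is far from secure in its dominance; it is no exaggeration to say it is beset by crises. Foreseeable threats capable of shaking its foundations are already brewing, and they are arriving with the cloud-native era.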
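#716 curl: Running Through an SSH Tunnel
SSH Network Proxy Curl 2021-12-03
ssh user@host curl http://anyurl
Alternatively, set up an SSH proxy first and then run curl:
ssh -D 19888 -CNfq user@host
curl -x socks5h://localhost:19888 http://anyurl
The trailing CNfq is easy to remember, for those who get the joke.
For the meaning of each flag, see man ssh: -D opens a local SOCKS proxy port, -C enables compression, -N runs no remote command, -f sends ssh to the background, and -q suppresses most messages.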
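Because -f leaves the tunnel running in the background, it is easy to forget to close it. A minimal sketch of a wrapper that opens the tunnel, runs one curl request through it, and closes it again via an SSH control socket (-M, -S, -O exit). The function name `tunnel_curl`, the socket path, and the `RUN` dry-run hook are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Run one curl request through a temporary SSH SOCKS tunnel, then close the tunnel.
# RUN is a hypothetical hook: set RUN=echo to print the commands instead of running them.
tunnel_curl() {
  local target=$1 url=$2 port=${3:-19888}
  local sock="/tmp/tunnel-curl-$port.sock"
  # -M/-S: create a control socket so the background tunnel can be closed later.
  ${RUN:-} ssh -D "$port" -CNfq -M -S "$sock" "$target" || return
  ${RUN:-} curl -x "socks5h://localhost:$port" "$url"
  local status=$?
  # Ask the background master process to exit, which tears the tunnel down.
  ${RUN:-} ssh -S "$sock" -O exit "$target"
  return "$status"
}

# Usage:
# tunnel_curl user@host http://anyurl
# RUN=echo tunnel_curl user@host http://anyurl   # dry run, prints the three commands
```

The function returns curl's exit status, so a failed request is still visible to the caller after the tunnel is closed.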
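#715 A Problem Caused by Nginx Proxy Buffering
Nginx 2021-12-03
Background: the company has two environments, each running a web service that includes a fairly slow endpoint.
Today we got a report that in environment B the endpoint shows nothing for a long time and then prints everything at once when it finishes, whereas in environment A the output is streamed bit by bit.
The colleagues involved found the experience in A better and wanted B to behave the same way.
On inspection, it turned out that one of the two sits behind an Nginx proxy, which immediately suggested proxy buffering.
We asked ops to add proxy_buffering off; to the relevant server block, and that did fix the problem.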
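One way to see the difference between the two environments from the command line is to timestamp each line of the response as it arrives. A minimal sketch, assuming the endpoint emits line-oriented output; the function name `stamp_lines` and the URL in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Prefix every line read from stdin with the whole seconds elapsed since the start.
stamp_lines() {
  local start=$SECONDS line
  while IFS= read -r line; do
    printf '[+%ss] %s\n' "$((SECONDS - start))" "$line"
  done
}

# Usage (hypothetical URL); -N turns off curl's own output buffering:
# curl -sN http://host/slow-endpoint | stamp_lines
```

With proxy buffering in the way, the lines would be expected to arrive together near the end with almost the same stamp; with proxy_buffering off; the stamps should spread out over the run.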
#714 Susam Pal: Timing with curl
Developer Curl 2021-11-28
Here is a command I use often while measuring why an HTTP request is taking too long:
curl -L -w "time_namelookup: %{time_namelookup}
time_connect: %{time_connect}
time_appconnect: %{time_appconnect}
time_pretransfer: %{time_pretransfer}
time_redirect: %{time_redirect}
time_starttransfer: %{time_starttransfer}
time_total: %{time_total}
" https://example.com/
Here is the same command written as a one-liner, so that I can copy it easily from this page with a triple-click whenever I need it in future:
curl -L -w "time_namelookup: %{time_namelookup}\ntime_connect: %{time_connect}\ntime_appconnect: %{time_appconnect}\ntime_pretransfer: %{time_pretransfer}\ntime_redirect: %{time_redirect}\ntime_starttransfer: %{time_starttransfer}\ntime_total: %{time_total}\n" https://example.com/
Here is how the output of the above command typically looks:
$ curl -L -w "time_namelookup: %{time_namelookup}\ntime_connect: %{time_connect}\ntime_appconnect: %{time_appconnect}\ntime_pretransfer: %{time_pretransfer}\ntime_redirect: %{time_redirect}\ntime_starttransfer: %{time_starttransfer}\ntime_total: %{time_total}\n" https://example.com/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
...
</html>
time_namelookup: 0.001403
time_connect: 0.245464
time_appconnect: 0.757656
time_pretransfer: 0.757823
time_redirect: 0.000000
time_starttransfer: 0.982111
time_total: 0.982326
In the output above, I have omitted most of the HTML output and replaced the omitted part with ellipsis for the sake of brevity.
The list below provides a description of each number in the output above. This information is picked straight from the manual page of curl 7.20.0. Here are the details:
time_namelookup
: The time, in seconds, it took from the start until the name resolving was completed.
time_connect
: The time, in seconds, it took from the start until the TCP connect to the remote host (or proxy) was completed.
time_appconnect
: The time, in seconds, it took from the start until the SSL/SSH/etc connect/handshake to the remote host was completed. (Added in 7.19.0)
time_pretransfer
: The time, in seconds, it took from the start until the file transfer was just about to begin. This includes all pre-transfer commands and negotiations that are specific to the particular protocol(s) involved.
time_redirect
: The time, in seconds, it took for all redirection steps including name lookup, connect, pretransfer and transfer before the final transaction was started. time_redirect shows the complete execution time for multiple redirections. (Added in 7.12.3)
time_starttransfer
: The time, in seconds, it took from the start until the first byte was just about to be transferred. This includes time_pretransfer and also the time the server needed to calculate the result.
time_total
: The total time, in seconds, that the full operation lasted. The time will be displayed with millisecond resolution.
An important thing worth noting here is that the difference between time_appconnect and time_connect tells us how much time is spent in the SSL/TLS handshake. For a cleartext connection without SSL/TLS, time_appconnect is reported as zero. Here is an example output that demonstrates this:
$ curl -L -w "time_namelookup: %{time_namelookup}\ntime_connect: %{time_connect}\ntime_appconnect: %{time_appconnect}\ntime_pretransfer: %{time_pretransfer}\ntime_redirect: %{time_redirect}\ntime_starttransfer: %{time_starttransfer}\ntime_total: %{time_total}\n" http://example.com/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
...
</html>
time_namelookup: 0.001507
time_connect: 0.247032
time_appconnect: 0.000000
time_pretransfer: 0.247122
time_redirect: 0.000000
time_starttransfer: 0.512645
time_total: 0.512853
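Since every value is measured from the start of the request, the time spent in each phase is the difference between neighbouring values. A minimal sketch that reads the -w output shown above and prints those differences; the function name `curl_phases` and the phase labels are made up for this example, and it ignores time_redirect, so it is meant for a single request without redirects.

```shell
#!/usr/bin/env bash
# Read "time_xxx: value" lines (the -w output) on stdin and print per-phase durations.
# Lines that do not start with "time_" (such as the HTML body) are ignored.
curl_phases() {
  awk -F': *' '
    /^time_/ { t[$1] = $2 + 0 }
    END {
      # time_appconnect is 0 for cleartext HTTP, so report no TLS time in that case.
      tls = (t["time_appconnect"] > 0) ? t["time_appconnect"] - t["time_connect"] : 0
      printf "dns:      %.6f\n", t["time_namelookup"]
      printf "tcp:      %.6f\n", t["time_connect"] - t["time_namelookup"]
      printf "tls:      %.6f\n", tls
      printf "wait:     %.6f\n", t["time_starttransfer"] - t["time_pretransfer"]
      printf "transfer: %.6f\n", t["time_total"] - t["time_starttransfer"]
    }'
}

# Usage: pipe the output of the timing command into it, for example
# curl -s -w "time_namelookup: %{time_namelookup}\n..." https://example.com/ | curl_phases
```

For the HTTPS output shown earlier, the tls line comes out as 0.512192 (0.757656 minus 0.245464); for the cleartext output it is 0.000000.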
Also note that time_redirect is zero in both outputs above. That is because no redirection occurs while visiting example.com. Here is another example that shows how the output looks when a redirection occurs:
$ curl -L -w "time_namelookup: %{time_namelookup}\ntime_connect: %{time_connect}\ntime_appconnect: %{time_appconnect}\ntime_pretransfer: %{time_pretransfer}\ntime_redirect: %{time_redirect}\ntime_starttransfer: %{time_starttransfer}\ntime_total: %{time_total}\n" https://susam.in/blog
<!DOCTYPE HTML>
<html>
...
</html>
time_namelookup: 0.001886
time_connect: 0.152445
time_appconnect: 0.465326
time_pretransfer: 0.465413
time_redirect: 0.614289
time_starttransfer: 0.763997
time_total: 0.765413
When faced with a potential latency issue in web services, this is often one of the first commands I run several times from multiple clients, because the results from this command help to get a quick sense of the layer that might be responsible for the latency issue.
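#713 Go LDAP
Golang LDAP 2021-11-26
Basic LDAP operations with go-ldap/ldap, including search, add, modify, and delete.
#712 Repost: Who Blacklisted Your IP
Computer Networking Email 2021-11-24
There are many reasons an email can fail to be delivered: an unsubscribe, an unreachable server, a malformed or nonexistent address, being classified as spam, a rejected sender or recipient, and so on. Today, let's talk about rejected IPs and domains.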
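#711 UCloud UFile (US3) File Sharing
Cloud Services UCloud UFile 2021-11-22
UFile offers no per-file sharing mechanism. The only option is a public bucket.
PS: Qiniu Cloud is the same.
A bucket of type public is equivalent to public-read on Alibaba Cloud OSS: every file in the bucket can be downloaded directly by URL, with no signature required.
PS: If you do not want the files to be enumerated and downloaded, put a random string or a UUID in the file path; in short, the path must not be predictable.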
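A minimal sketch of generating such an unpredictable path before uploading, assuming a system with /dev/urandom; the function name `random_key` and the key layout (random segment, then the original file name) are choices made for this example, not anything UFile requires.

```shell
#!/usr/bin/env bash
# Print an object key whose first path segment is 32 random hex characters
# (16 bytes from /dev/urandom), followed by the original file name.
random_key() {
  local name=$1 token
  token=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
  printf '%s/%s\n' "$token" "$name"
}

# Usage:
# key=$(random_key report.pdf)   # upload the file under "$key" and share the resulting URL
```

The file name stays readable for whoever receives the link, while the random segment keeps the URL from being guessed from other known URLs in the bucket.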