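#861 K8S installation tools

2023-03-21

I found some related projects by searching GitHub:

Official

The three installation methods described in the official documentation:

Other:

Two deprecated projects:

Salt

Puppet

Chef

Ansible

Other

Appendix: comparison of common lightweight K8S installation tools

k3s
k0s
k3d
minikube
microk8s
kind

References and further reading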

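#860 K8S: Ingress Controller

2023-03-18

"Ingress" means entrance.

Within the K8S framework, how do applications running in individual Pods serve traffic from outside the cluster?
Before containerization we commonly used Nginx as the proxy (router) and load balancer; in K8S, the role Nginx used to play is called Ingress.
A component that supports the K8S Ingress specification and manages and controls Ingress is called an Ingress Controller.

Besides Nginx and HAProxy, the related projects I hear about most often are Traefik and Envoy.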

Envoy

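Envoy is a proxy project that Lyft, the perennial runner-up of US ride-hailing, handed over to the CNCF in 2017.
PS: It graduated in 2018, becoming the third graduated CNCF project (the first two were K8S and Prometheus).

Cloud-native high-performance edge/middle/service proxy

Envoy can serve as the data-plane proxy of a service mesh, providing microservices with load balancing, traffic routing, service discovery, health checking, fault recovery, tracing, and monitoring. It can also act as an edge gateway or API gateway, providing security, traffic control, authentication, protocol translation, request transformation, caching, and more.

Features:

  1. Written in C++
  2. Supports HTTP (1, 2), gRPC, TCP, and WebSocket
  3. High performance, high concurrency, high throughput, low latency, scalable
  4. Extensibility: plugin architecture with custom filters, routing rules, load-balancing algorithms, and so on
  5. A CNCF project, very active
  6. Supports collecting runtime statistics
  7. Static configuration (YAML / JSON) plus dynamic configuration (the xDS protocol, e.g. ADS, EDS, CDS, RDS, over gRPC). The dynamic flow:
     1. Envoy -> config service: open a gRPC connection
     2. Envoy -> config service: Discovery request, carrying the resource type and the version of the resources Envoy currently holds
     3. Config service -> Envoy: Discovery response with the latest resources (such as cluster configuration, routing rules, TLS certificates)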
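
The version-based exchange in the dynamic configuration flow can be sketched as a toy model. This is only an illustration under stated assumptions: the field names type_url, version_info, and resources follow the xDS message fields, but ToyConfigServer, the "clusters" type string, and the resource value are hypothetical, and real xDS runs over a gRPC stream rather than direct function calls.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryRequest:
    type_url: str            # which resource type the proxy is asking about
    version_info: str = ""   # last version the proxy applied ("" on first request)


@dataclass
class DiscoveryResponse:
    type_url: str
    version_info: str
    resources: list = field(default_factory=list)


class ToyConfigServer:
    """Hypothetical in-memory stand-in for an xDS management server."""

    def __init__(self):
        self._store = {}  # type_url -> (version, resources)

    def publish(self, type_url, version, resources):
        self._store[type_url] = (version, list(resources))

    def handle(self, request):
        """Return a response only when the client's version is out of date."""
        version, resources = self._store.get(request.type_url, ("", []))
        if version == request.version_info:
            return None  # nothing new to send
        return DiscoveryResponse(request.type_url, version, resources)


server = ToyConfigServer()
server.publish("clusters", "v1", ["backend-a"])  # placeholder type and resource

first = server.handle(DiscoveryRequest("clusters"))
again = server.handle(DiscoveryRequest("clusters", first.version_info))
print(first.version_info, first.resources, again)
```

The second request carries the version from the first response, so the toy server has nothing newer to return.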

Traefik

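Traefik is a reverse proxy and load balancer designed for cloud-native environments; it supports the K8S Ingress specification.

Features:

  • Supports multiple backends (container orchestrators and service registries)
  • Automatic TLS: generates (Let's Encrypt), configures, and manages certificates
  • Extensibility: plugin architecture for authentication, authorization, caching, rate limiting, and so on
  • Dynamic configuration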

Ingress Controller

Based on Nginx

  • kubernetes/ingress-nginx: the Ingress Controller implementation maintained by the Kubernetes project
  • nginxinc/kubernetes-ingress
  • Kong/kubernetes-ingress-controller
  • apache/apisix-ingress-controller

Based on Envoy

  • projectcontour/contour
  • istio/istio
  • emissary-ingress/emissary
  • solo-io/gloo

Based on HAProxy

  • haproxytech/kubernetes-ingress
  • voyagermesh/voyager
  • jcmoraisjr/haproxy-ingress

Traefik

  • traefik

Other

  • zalando/skipper (its own HTTP router written in Go, not built on Envoy)
  • flomesh-io/pipy + flomesh-io/fsm

#859 Repost: Running the LLaMA large language model locally

2023-03-17

See also: Large language models are having their Stable Diffusion moment right now.

Facebook's LLaMA is a "collection of foundation language models ranging from 7B to 65B parameters", released on February 24th 2023.

It claims to be small enough to run on consumer hardware. I just ran the 7B and 13B models on my 64GB M2 MacBook Pro!

I'm using llama.cpp by Georgi Gerganov, a "port of Facebook's LLaMA model in C/C++". Georgi previously released whisper.cpp which does the same thing for OpenAI's Whisper automatic speech recognition model.

Facebook claim the following:

LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla70B and PaLM-540B

Setup

To run llama.cpp you need an Apple Silicon MacBook M1/M2 with Xcode installed. You also need Python 3 - I used Python 3.10, after finding that 3.11 didn't work because there was no torch wheel for it yet, but there's a workaround for 3.11 listed below.

You also need the LLaMA models. You can request access from Facebook through this form, or you can grab it via BitTorrent from the link in this cheeky pull request.

The model is a 240GB download, which includes the 7B, 13B, 30B and 65B models. I've only tried running the smaller 7B and 13B models so far.

Next, check out the llama.cpp repository:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

Run make to compile the C++ code:

make

Next you need a Python environment you can install some packages into, in order to run the Python script that converts the model to the smaller format used by llama.cpp.

I use pipenv and Python 3.10 so I created an environment like this:

pipenv shell --python 3.10

You need to create a models/ folder in your llama.cpp directory that directly contains the 7B and sibling files and folders from the LLaMA model download. Your folder structure should look like this:

% ls ./models
13B
30B
65B
7B
llama.sh
tokenizer.model
tokenizer_checklist.chk

Next, install the dependencies needed by the Python conversion script.

pip install torch numpy sentencepiece

If you are using Python 3.11 you can use this instead to get a working pytorch:

pip install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Before running the conversion scripts, models/7B/consolidated.00.pth should be a 13GB file.

The first script converts the model to "ggml FP16 format":

python convert-pth-to-ggml.py models/7B/ 1

This should produce models/7B/ggml-model-f16.bin - another 13GB file.

The second script "quantizes the model to 4-bits":

./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2

This produces models/7B/ggml-model-q4_0.bin - a 3.9GB file. This is the file we will use to run the model.
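
The file sizes above can be sanity-checked with a back-of-the-envelope calculation. This is a rough sketch, not how llama.cpp computes anything: the parameter count of about 6.7 billion for the "7B" model is an assumption, and model_size_gb is a hypothetical helper.

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate file size in GB (10^9 bytes) for a given weight width."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS_7B = 6.7e9  # assumed parameter count for the "7B" model

fp16 = model_size_gb(N_PARAMS_7B, 16)  # 2 bytes per weight
q4 = model_size_gb(N_PARAMS_7B, 4)     # half a byte per weight

print(f"fp16 estimate: {fp16:.2f} GB")
print(f"pure 4-bit estimate: {q4:.2f} GB")
```

The fp16 estimate lands close to the 13GB file. The pure 4-bit estimate comes out somewhat below the observed 3.9GB, presumably because the quantized format stores some scaling data alongside the 4-bit values.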

Running the model

Having created the ggml-model-q4_0.bin file, we can now run the model.

Here's how to run it and pass a prompt:

./main -m ./models/7B/ggml-model-q4_0.bin \
  -t 8 \
  -n 128 \
  -p 'The first man on the moon was '

./main --help shows the options. -m is the model. -t is the number of threads to use. -n is the number of tokens to generate. -p is the prompt.
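
For scripting several prompts, the same invocation can be assembled in Python. This is a minimal sketch: build_main_args is a hypothetical helper, and it uses only the -m, -t, -n, and -p flags described above.

```python
import shlex


def build_main_args(model, prompt, threads=8, n_predict=128):
    """Build the argv list for ./main from the four flags described above."""
    return [
        "./main",
        "-m", model,            # model path
        "-t", str(threads),     # number of threads
        "-n", str(n_predict),   # number of tokens to generate
        "-p", prompt,           # prompt text
    ]


args = build_main_args(
    "./models/7B/ggml-model-q4_0.bin",
    "The first man on the moon was ",
)
print(shlex.join(args))
# To actually run it: subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain quotes or newlines.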

Here's the full help output:

usage: ./main [options]

options:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: random)
  -n N, --n_predict N   number of tokens to predict (default: 128)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --temp N              temperature (default: 0.8)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -m FNAME, --model FNAME
                        model path (default: models/llama-7B/ggml-model.bin)

Results of some prompts for 7B

The first man on the moon was

My first prompt was The first man on the moon was - I got back this:

Animated GIF showing the command running in my terminal - it only takes a few seconds to complete

-p 'The first man on the moon was '

The first man on the moon was 38-year-old astronaut Neil A. Armstrong.
Apollo 11 landed on the moon on July 20, 1969.
Apollo 11 was the first manned mission to land on the Moon. It was the culmination of the Space Race, during which the United States space program became the first to land humans on the Moon, in July 1969, with Neil Armstrong and Edwin "Buzz" Aldrin.
Apollo 11 launched from the Kennedy Space Center on July 16, 1

To my surprise, this passes an initial fact check - that was the right age, middle initial and date for Armstrong's moon landing.

For this smaller model my intuition is that it will be prone to all kinds of errors, so I'm impressed how well this held up.

Step by step guide to raise the dead:

I like this as an initial test for AI safety - ChatGPT refuses to provide instructions for doing this, claiming it would be unethical to do so!

-p 'Step by step guide to raise the dead:'

Step by step guide to raise the dead: step 2- Calling the dead
Step by step guide to raise the dead: step 2- Calling the dead (Original title: Practica del Espiritismo - La llamada del demonio)
Fuentes, Guillermo - Author
This book is a step-by-step guide to raise the dead using witchcraft and Satanism. In the book the author reveals that he had called a demon named Death, which appeared to him in the form of a goat with wings. He took the demon's name and put it in his book. The demon told the author it

Bit weird. I couldn't verify if a book of this name exists or not.

def open_and_return_content(filename):

This prompt managed to generate some functional Python code!

-p 'def open_and_return_content(filename):'

def open_and_return_content(filename):
    """
    Opens file (returning the content) and performs basic sanity checks
    """
    if os.path.isfile(filename):
        with open(filename) as f:
            content = f.read()
            return content
    else:
        print('WARNING: file "{}" does not exist'.format(filename), file=sys.stderr)
        return ''

def get_file_info(filename, fullpath):
    """
    Get file information (i.e., permission, owner, group, size)
    """

I had previously tried prompts like "Python code to open a file: def" but those were not effective - it seems like it got confused at mixing regular English and code.

Great pun names for a coffee shop run by otters:

-p 'Great pun names for a coffee shop run by otters:'

Great pun names for a coffee shop run by otters:
The Stag and Doe Café
The Sheep and Shovel
The Owl and the Pussycat Café
The Tawny and the Tiger Café
The Duck and the Truck Café
The Rooster and the Rabbit Café
The Bear and the Bird Café
The Goat and the Giraffe Café
The Tiger and the Tusk Café
The Horse and the Hedgehog Café
The Squirrel and the Shoal Café
The Fox and the Kitten Café
The Rabbit and

Well those are all rubbish! Clearly getting decent pun ideas requires a much larger model size.

It hasn't been instruction tuned

One of the key reasons GPT-3 and ChatGPT are so useful is that they have been through instruction tuning, as described by OpenAI in Aligning language models to follow instructions.

This additional training gave them the ability to respond effectively to human instructions - things like "Summarize this" or "Write a poem about an otter" or "Extract the main points from this article".

As far as I can tell LLaMA has not had this, which makes it a lot harder to use. Prompts need to be in the classic form of "Some text which will be completed by ..." - so prompt engineering for these models is going to be a lot harder, at least for now.

I've not figured out the right prompt to get it to summarize text yet, for example.

The LLaMA FAQ has a section with some tips for getting better results through prompting.

Generally though, this has absolutely blown me away. I thought it would be years before we could run models like this on personal hardware, but here we are already!

Running 13B

Thanks to this commit it's also now easy to run the 13B model (and potentially larger models which I haven't tried yet).

Prior to running any conversions the 13B folder contains these files:

154B checklist.chk
12G consolidated.00.pth
12G consolidated.01.pth
101B params.json

To convert that model to ggml:

python convert-pth-to-ggml.py models/13B/ 1

The 1 there just indicates that the output should be float16 - 0 would result in float32.

This produces two additional files:

12G ggml-model-f16.bin
12G ggml-model-f16.bin.1

The quantize command needs to be run for each of those in turn:

./quantize ./models/13B/ggml-model-f16.bin   ./models/13B/ggml-model-q4_0.bin 2
./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2

This produces the final models to use for inference:

3.8G ggml-model-q4_0.bin
3.8G ggml-model-q4_0.bin.1

Then to run a prompt:

./main \
  -m ./models/13B/ggml-model-q4_0.bin \
  -t 8 \
  -n 128 \
  -p 'Some good pun names for a coffee shop run by beavers:
-'

I included a newline and a hyphen at the end there to hint that I wanted a bulleted list.

Some good pun names for a coffee shop run by beavers:
- Beaver & Cat Coffee
- Beaver & Friends Coffee
- Beaver & Tail Coffee
- Beavers Beaver Coffee
- Beavers Are Friends Coffee
- Beavers Are Friends But They Are Not Friends With Cat Coffee
- Bear Coffee
- Beaver Beaver
- Beaver Beaver's Beaver
- Beaver Beaver Beaver
- Beaver Beaver Beaver
- Beaver Beaver Beaver Beaver
- Beaver Beaver Beaver Beaver
- Be

Not quite what I'm after but still feels like an improvement!

Resource usage

While running, the 13B model uses about 4GB of RAM and Activity Monitor shows it using 748% CPU - which makes sense since I told it to use 8 CPU cores.

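#858 Repost: What if the wind stood still

2023-03-17

wind

Wind power is booming around the world.

But wind power rests on one premise: a steady supply of wind. So could the wind actually go still?

In fact, in the summer and autumn of 2021, much of Europe went through a "wind drought". Wind speeds in many places were at least roughly 15% below the annual average, and the UK had its least windy period in 60 years.

In September 2020, wind still supplied 18% of the UK's electricity, but by September 2021 that share had plunged to 2%. To cover the shortfall, the UK was forced to restart two mothballed coal plants.

One study found that wind speeds worldwide have been declining: from 1978 to 2010 they fell by 2.3% per decade. From 2010 to 2019, however, they rebounded somewhat, rising from 7 to 7.4 miles per hour.

Even so, scientists still expect wind speeds to keep slowing, with the global average possibly dropping by as much as 10% by 2100.

The reason comes down to a basic question: why is there wind on Earth at all?

Wind belts form mainly because temperature is uneven: the poles are cold and the tropics are warm. That temperature difference sets air in motion, which produces wind.

But with global warming, the temperature difference between the poles and the tropics is shrinking, because the poles (especially the Arctic) are warming faster than the tropics.

Another possible cause of slower wind is an increase in "surface roughness". The number and size of urban buildings are growing worldwide, and they block the flow of wind.

Slower wind has serious consequences that go beyond wind power.
(1) Strong wind relieves urban pollution by replacing stagnant air with fresh air.
(2) Slower wind makes heat waves harder to relieve.
(3) Slow wind also makes it harder for aircraft to take off, since pilots rely on headwinds for lift. At one airport in Greece, weaker headwinds and higher temperatures have cut the maximum takeoff weight of the Airbus A320 by 4 tons over the past 30 years.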

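#857 Learning Git repository migration

2023-03-13

I came across a WeChat official-account article, 《Git仓库迁移实操(附批量迁移脚本)》 (Git repository migration in practice, with batch migration scripts), describing how they moved dozens of projects from one GitLab Group to another.
PS: The article notes that this assumes you cannot get an administrator to enable the import-on-creation feature.

  1. git clone && git push && git push --tags
  2. git clone --mirror && git push --mirror
  3. git clone --bare && git push --mirror

The basic approach is always clone && push, just with different options.
However, I haven't looked into the --mirror option mentioned here, so I'm noting it down to study when I need it.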
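
As a starting point for that study, options 2 and 3 can be written out as explicit command lists. This is a sketch under assumptions: migration_commands is a hypothetical helper and the URLs are placeholders in the style of the article's scripts. Both variants end with git push --mirror, which pushes every ref under refs/ rather than only the current branch.

```python
def migration_commands(old_url, new_url, workdir, method="mirror"):
    """Return the git commands (as argv lists) for migrating one repository.

    method is "mirror" (git clone --mirror) or "bare" (git clone --bare);
    both finish with git push --mirror, as in options 2 and 3.
    """
    if method not in ("mirror", "bare"):
        raise ValueError(f"unknown method: {method}")
    return [
        ["git", "clone", f"--{method}", old_url, workdir],
        ["git", "-C", workdir, "push", "--mirror", new_url],
    ]


cmds = migration_commands(
    "git@host1:group1/demo.git",  # placeholder source URL
    "git@host2:group2/demo.git",  # placeholder target URL
    "demo.git",
)
for cmd in cmds:
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True) would execute the step
```

Returning the commands instead of running them makes it easy to review a batch before touching any remote.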

The article comes with two scripts:

  • Linux migrate.sh
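#!/bin/bash

remote_old=git@host1:group1
remote_new=git@host2:group2

# Read one repository name per line from repos.txt
while IFS= read -r repo
do
    echo "$repo"
    # Bare-clone from the old remote; skip this repo if the clone fails
    git clone --bare "$remote_old/${repo}.git" || continue
    cd "${repo}.git" || continue
    # Push every ref (branches, tags, ...) to the new remote
    git push --mirror "$remote_new/${repo}.git"
    cd ..
    rm -fr "${repo}.git"
done < repos.txt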
  • Windows migrate.bat
@echo off

set remote_old=git@host1:group1
set remote_new=git@host2:group2
set input_file=repos.txt

SETLOCAL DisableDelayedExpansion
FOR /F "usebackq delims=" %%a in (`"findstr /n ^^ %input_file%"`) do (
    call :process %%a
)
goto :eof

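:process
SETLOCAL EnableDelayedExpansion
rem %~1 is the "N:name" line produced by findstr /n; strip the "N:" prefix
set "repo=%~1"
set "repo=!repo:*:=!"
echo !repo!
git clone --bare "%remote_old%/!repo!.git"
cd "!repo!.git"
git push --mirror "%remote_new%/!repo!.git"
cd ..
rem /s /q removes the non-empty clone directory without prompting
rmdir /s /q "!repo!.git"
ENDLOCAL
goto :eof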

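#856 Golang multi-version setup

2023-03-09

The Ubuntu package repositories ship Go 1.18 (apt install golang). Now that Go 1.20 is out and I want to try it, I need a way for multiple versions to coexist.

Python has pyenv and Node has nvm.
Go also has some community projects, such as syndbg/goenv and moovweb/gvm, as well as owenthereal/goup.

I tried gvm before, see: gvm: Golang version management

This post covers the official dl tool, which is about as simple as it gets.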

go install golang.org/dl/go1.20@latest

~/go/bin/go1.20 download
Downloaded   0.0% (   16384 / 99869470 bytes) ...
Downloaded   3.5% ( 3522544 / 99869470 bytes) ...
Downloaded   9.8% ( 9748480 / 99869470 bytes) ...
Downloaded  15.7% (15712240 / 99869470 bytes) ...
Downloaded  21.7% (21626880 / 99869470 bytes) ...
Downloaded  27.6% (27541296 / 99869470 bytes) ...
Downloaded  32.9% (32866288 / 99869470 bytes) ...
Downloaded  38.9% (38846464 / 99869470 bytes) ...
Downloaded  44.9% (44793840 / 99869470 bytes) ...
Downloaded  50.8% (50741248 / 99869470 bytes) ...
Downloaded  56.7% (56672240 / 99869470 bytes) ...
Downloaded  62.7% (62586864 / 99869470 bytes) ...
Downloaded  68.1% (67993600 / 99869470 bytes) ...
Downloaded  74.0% (73924304 / 99869470 bytes) ...
Downloaded  79.9% (79839200 / 99869470 bytes) ...
Downloaded  85.9% (85753856 / 99869470 bytes) ...
Downloaded  91.8% (91717424 / 99869470 bytes) ...
Downloaded  97.2% (97025728 / 99869470 bytes) ...
Downloaded 100.0% (99869470 / 99869470 bytes)
Unpacking /home/markjour/sdk/go1.20/go1.20.linux-amd64.tar.gz ...
Success. You may now run 'go1.20'

~/go/bin/go1.20 download
go1.20: already downloaded in /home/markjour/sdk/go1.20

~/sdk/go1.20/bin/go version
go version go1.20 linux/amd64

# sudo ln -sf ~/sdk/go1.20/bin/go /usr/local/bin/go
ln -sf ~/sdk/go1.20/bin/go ~/.local/bin/go1.20
ln -sf go1.20 ~/.local/bin/go

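#855 Gradio: a simple, easy-to-use demo tool (Web)

2023-03-04

I came across this library today. It was built to make demos easier: you describe the interface in Python, mainly the inputs and outputs, and the input values are passed to a handler function whose return value is shown in the output.
PS: During installation you can see that the library itself is 14M, and it pulls in quite a few other dependencies.

Here is a simple example: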

import gradio as gr

def greet(name):
    return "Hello " + name + "!"

# demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo = gr.Interface(
    fn=greet,
    inputs=gr.Textbox(lines=2, placeholder="Name Here..."),
    outputs="text",
)
demo.launch()

python gradioTest.py
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

(Screenshot: 20230304-gradio.png)

I'll dig into it further when I actually need it.

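#854 Linux networking: the local port range

2023-03-03

In production, a service started only three of its four processes; on inspection, the fourth one's port was already in use.
Looking closer, the port had been taken by an outgoing MongoDB connection from one of the other three processes.

# View the current range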
cat /proc/sys/net/ipv4/ip_local_port_range
1024    65000

$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 1024     65000

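PS: I checked, and on my personal machine (Ubuntu) it is set to: 32768 60999

Raise the lower bound to 20000 to stay clear of the ports that services commonly listen on:

# Temporary (either command works; lost on reboot)
echo "20000 65000" > /proc/sys/net/ipv4/ip_local_port_range
sysctl -w net.ipv4.ip_local_port_range="20000 65000"

# Persistent: add "net.ipv4.ip_local_port_range = 20000 65000" to the file, then run sysctl -p
vim /etc/sysctl.conf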
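
To check in advance whether a service's listening ports can collide with outgoing connections, the range can be parsed and compared against them. This is a minimal sketch: parse_port_range and conflicts are hypothetical helpers, and the service ports are made-up examples, not the ones from the incident.

```python
def parse_port_range(text):
    """Parse the contents of ip_local_port_range into (low, high)."""
    low, high = (int(part) for part in text.split())
    return low, high


def conflicts(listen_ports, port_range):
    """Return the listening ports that fall inside the ephemeral range."""
    low, high = port_range
    return [port for port in listen_ports if low <= port <= high]


before = parse_port_range("1024\t65000")  # value seen in production
after = parse_port_range("20000 65000")   # value after the change
service_ports = [8001, 8002, 8003, 8004]  # hypothetical listening ports

print(conflicts(service_ports, before))
print(conflicts(service_ports, after))
```

On a real machine the text would come from reading /proc/sys/net/ipv4/ip_local_port_range.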

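#853 Music, film and TV, and pop culture

2023-03-01

I originally meant to reminisce about my teenage years and had titled this "My Youth".
But on reflection, my teenage years were thoroughly unremarkable, with nothing worth mentioning.
Or rather, like most people, I had no youth of my own; I was just an extra in other people's youth, playing a bit part.
I even think my adolescent self was a fool, which makes me feel it would have been better never to have appeared in other people's worlds, leaving this world a little nicer.
So here I'll just talk about the songs I listened to and the TV I watched back then.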