#900 Repost: Storing State in the URL

2023-04-08

I'm working on a flowchart editor that runs in the browser, and I wanted a way for people to use it without having to sign in, or store any data on our server. I wanted to give them control over their data and to be able to store it locally to open and edit later. And also easily share it with other people. It's easy to do this by supporting file upload/download, but I wanted something simpler, like the ability to share by sending a url. I also didn't want to store anything on the backend (at least for the free tier).

I decided to encode the entire application state as a Base64 encoded string in the hashmark of the url. For example, a url would look like (note it's truncated since they are very long):

knotend.com/g/a#N4IgzgpgTglghgGxgLwnARgiAxA9lAWxAC5QA7X...

Everything after the /g/a# is a stringified version of a json object that contains all the information about the flowchart. It gets stringified, then compressed, then Base64 encoded. I update the url on every graph edit, so copying the graph state is as simple as copying the url in your browser bar.

Here's the pseudo code for creating the url, and then later reading it:

const stateString = JSON.stringify(appState); // appState is a json object
const compressed = compress(stateString);
const encoded = Base64.encode(compressed);
// Push that `encoded` string to the url
// ... Later, on page load or on undo/redo we read the url and
// do the following
const decoded = Base64.decode(encoded); // same encoded as above, but read from url
const uncompressed = uncompress(decoded);
const newState = JSON.parse(uncompressed);
// Now load your application with the newState

There are several options for implementing the compress/uncompress functions, such as lz-string or pako.

Since I update it on every graph edit, I get something major for free -- undo/redo. The browser's history stack becomes my undo/redo functionality. The user can hit the browser back/forward buttons, or Command-Z / Command-Shift-Z, which I map to history pop and push. This is a major win for something which is a free product that I wanted to ship quickly.
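
Wiring that up might look roughly like the following sketch; it assumes the writeStateToUrl/readStateFromUrl helpers from the earlier example, and loadApplication is a placeholder for whatever reloads your app state (the keyboard mapping is illustrative, not necessarily what knotend does):

// Every edit calls history.pushState, so the browser's back/forward stack
// doubles as the undo/redo stack.
window.addEventListener('popstate', () => {
  const state = readStateFromUrl();
  if (state) loadApplication(state); // placeholder for your app's state loader
});

// Map Command-Z / Command-Shift-Z onto the same history stack
window.addEventListener('keydown', (e) => {
  if (e.metaKey && e.key.toLowerCase() === 'z') {
    e.preventDefault();
    if (e.shiftKey) {
      history.forward(); // redo
    } else {
      history.back();    // undo
    }
  }
});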

Another great benefit is that these urls can be embedded. That means the user can put their graph on any web page that supports embedding. I see people typically do this with wikis like Notion, which means you can share with a team without anyone needing an account on my site.

You can see how it works by checking out knotend, the keyboard-centric flowchart editor that I'm working on.

Prior work and thank yous.

I'm not the first one to take this approach. I've seen at least mermaidjs do this before, and I'm sure there are others.

Thank you to this comment by redleader55 on Hacker News for pointing out that storing the state after the hashmark (window.location.hash) is better for longer urls, since some browsers will truncate the url when sending it over HTTP. That doesn't apply to the hash fragment, which stays client side.
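
To make that concrete (illustrative only, reusing the `encoded` string from the sketch above): a request for knotend.com/g/a#N4Igzg... only sends "GET /g/a" to the server, because the fragment never leaves the browser, whereas a query string like /g/a?state=N4Igzg... would travel over HTTP and be subject to url length limits along the way.

// Both of these update the fragment without a page reload or a server request:
history.replaceState(null, '', '#' + encoded); // overwrite the current history entry
history.pushState(null, '', '#' + encoded);    // or add a new entry (enables undo/redo)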

See conversation on Hacker News https://news.ycombinator.com/item?id=34312546

#899 ChatGPT Unreachable (Reachable Again)

2023-04-04

Update 04/09: I found that it can be reached by routing through Japan. Some proxy routes were probably selectively blocked.


I switched browsers, switched proxies, and cleared cookies, but still could not open it.

Image placeholder: "Bye-bye then; the next one will be better behaved."

#898 inode Usage Too High

2023-04-03

A Jenkins release failed; the log showed scp reporting "No space left on device", i.e. the disk was full.

On closer inspection, disk usage was actually not that high, but inode usage had hit 100% (monitoring was alerting on it as well), meaning there were too many small files.

[staff001@192.168.64.234 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/vda1              40G   29G  8.7G  77% /
tmpfs                 972M     0  972M   0% /dev/shm
/dev/vdb               99G   41G   54G  44% /data

[staff001@192.168.64.234 ~]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/vda1            2621440 2621440       0  100% /
tmpfs                 248685       1  248684    1% /dev/shm
/dev/vdb             6553600  467525 6086075    8% /data

Using some intuition plus the find command, locate the directories that contain too many files:

# Count entries under each directory (inode consumers), ascending
find /var -xdev -printf '%h\n' | sort | uniq -c | sort -n | awk '{sum+=$1;print $0;}END{print "\nTotal: "sum}'

# Same idea, but aggregated by second-level directory only
find /var -xdev -type f | cut -d / -f 3 | sort | uniq -c | sort -n | awk '{sum+=$1;print $0;}END{print "\nTotal: "sum}'

#897 A Small Gotcha with MySQL ORDER BY ... LIMIT

2023-04-01

When ORDER BY is combined with LIMIT, rows that share the same sort value are effectively picked at random. For example:
suppose the sorted result is a, b, c, d;
with LIMIT 3 added, the query might return b, a, d (a and b have the same value in the sort column, and so do c and d),
which is not necessarily what you expect, so keep it in mind. To get a deterministic result, add a unique tiebreaker column to the ORDER BY, e.g. ORDER BY sort, id.

The official documentation describes it like this:

If you combine LIMIT row_count with ORDER BY, MySQL stops sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result.

CREATE TABLE `testSortLimit` (
    `id` INT(10) NOT NULL AUTO_INCREMENT,
    `name` VARCHAR(50) NOT NULL,
    `sort` INT(10) NOT NULL DEFAULT '0',
    PRIMARY KEY (`id`) USING BTREE
)
ENGINE=InnoDB;

INSERT INTO testSortLimit (name, sort) VALUES
("aaa", 0), ("eee", 1), ("iii", 1), ("mmm", 1),
("bbb", 0), ("fff", 1), ("jjj", 1), ("nnn", 1),
("ccc", 0), ("ggg", 1), ("kkk", 1), ("ooo", 1),
("ddd", 0), ("hhh", 1), ("lll", 1), ("ppp", 1);

SELECT * FROM testSortLimit;
SELECT * FROM testSortLimit ORDER BY sort ASC;
SELECT * FROM testSortLimit ORDER BY sort ASC LIMIT 6;
-- Result of: SELECT * FROM testSortLimit ORDER BY sort ASC;
id  name  sort
1   aaa   0
5   bbb   0
9   ccc   0
13  ddd   0
2   eee   1
3   iii   1
4   mmm   1
6   fff   1
7   jjj   1
8   nnn   1
10  ggg   1
11  kkk   1
12  ooo   1
14  hhh   1
15  lll   1
16  ppp   1

-- Result of: SELECT * FROM testSortLimit ORDER BY sort ASC LIMIT 6;
id  name  sort
5   bbb   0
9   ccc   0
13  ddd   0
1   aaa   0
12  ooo   1
11  kkk   1


#896 In Praise of 内网云 (neiwangyun)

2023-03-31

I used Sunny-Ngrok before: you register an account, there is a free plan, and there are many ways to deploy and use it.
I just discovered that it now requires real-name verification, and only paid plans can be used.

After searching around online, I found a nice intranet-tunneling tool: 内网云 (neiwangyun). It is free and even more convenient.
No registration is needed; it is SSH-based, so a single ssh command that sets up a remote port forward is enough to tunnel into your intranet.
It is a huge convenience for developers like me. I hope it stays around~

ssh -R 80:127.0.0.1:8080 sh@sh3.neiwangyun.net
Welcome to network server by neiwangyun.net
current server node : sh3.neiwangyun.net
----------------------------------------------------------
Basic Tunnel Information
  username      : sh
  type          : anonymous
  subdomain     : sxxelgxzm6efshsh3.neiwangyun.net
  remote_ip     : 121.60.84.218
  bandwidth     : 2 Mbps
  expire_time   : 2 hours
  unid          : d8scvjne5lt2c3svr2rny...
----------------------------------------------------------
Local Server
  local server: 127.0.0.1:8000
----------------------------------------------------------
Access to the Tunnel
  cname         : not support
  http          : http://sxxelgxzm6efshsh3.neiwangyun.net
  https         : https://sxxelgxzm6efshsh3.neiwangyun.net
  ipport        : sxxelgxzm6efshsh3.neiwangyun.net:25207
----------------------------------------------------------
Ctrl+C Exit Process

#895 Running a Script in a Loop with Shell

2023-03-23

Compiled from the article 《今天我靠chatgpt早下班2小时!太强了!》 ("Today I got off work 2 hours early thanks to ChatGPT, it's amazing!"):

#!/bin/bash

# Path to the lock file
lockfile="/tmp/my.lock"

function cleanup {
    echo "Removing lock file"
    rm -f "$lockfile"
}

if [ -e "$lockfile" ]; then
    echo "WARNING: another instance is already running!"
    exit 1
else
    touch "$lockfile"
    # Register the cleanup function only after this instance owns the lock,
    # so it runs however the script exits and never removes another
    # instance's lock file
    trap cleanup EXIT
    echo "Script started"
    while true; do
        sh a.sh
        sleep 300  # run every 5 minutes (interval in seconds)
    done
fi

#893 K8S Installation Tools

2023-03-21

I found some related projects on GitHub:

Official

The three installation approaches described in the official documentation:

Others:

Two abandoned projects:

Salt

Puppet

Chef

Ansible

Other

Appendix: a comparison of common lightweight K8S installation tools

k3s
k0s
k3d
minikube
microk8s
kind


#892 K8S: Ingress Controller

2023-03-18

Ingress literally means "entrance."

Within the K8S framework, how do the applications running in Pods expose services to the outside world?
Before containerization we typically used Nginx as the proxy (router) and load balancer; in K8S, this position is called the Ingress.
A component that implements the K8S Ingress spec and manages and enforces Ingress resources is called an Ingress Controller.

Besides Nginx and HAProxy, the most frequently mentioned related projects are Traefik and Envoy.

Envoy

Envoy is a proxy project that Lyft, the perennial runner-up of US ride-hailing, handed over to the CNCF in 2017.
PS: it graduated in 2018, becoming the third official CNCF project (the first two being K8S and Prometheus).

Cloud-native high-performance edge/middle/service proxy

Envoy can act as the data-plane proxy of a service mesh, providing microservice applications with load balancing, traffic routing, service discovery, health checking, failure recovery, tracing, and monitoring. It can also serve as an edge gateway or API gateway, offering security, traffic control, authentication and authorization, protocol translation, request transformation, caching, and more.

Features:

  1. Written in C++
  2. Supports HTTP/1.1, HTTP/2, gRPC, TCP, and WebSocket
  3. High performance: high concurrency, high throughput, low latency, scalable
  4. Extensibility: plugin (filter) architecture with customizable filters, routing rules, load-balancing algorithms, etc.
  5. CNCF project, very active
  6. Supports runtime stats collection
  7. Static configuration (YAML / JSON) plus dynamic configuration (the xDS APIs such as ADS, EDS, CDS, RDS, etc., over gRPC). The exchange goes roughly:
     • Envoy -> management server: establishes a gRPC connection
     • Envoy -> management server: Discovery request carrying the resource type, version info, and the resources it currently holds
     • Management server -> Envoy: Discovery response with the latest resources (cluster configuration, routing rules, TLS certificates, etc.)

Traefik

Traefik is a reverse proxy and load balancer designed for cloud-native environments, with support for the K8S Ingress spec.

Features:

  • Supports many kinds of backends (container orchestrators and service registries)
  • Automatic TLS: certificate issuance (Let's Encrypt), configuration, and management
  • Extensibility: plugin architecture for authentication, authorization, caching, rate limiting, etc.
  • Dynamic configuration

Ingress Controller

Based on Nginx

  • kubernetes/ingress-nginx, the officially provided Ingress Controller implementation
  • nginxinc/kubernetes-ingress
  • Kong/kubernetes-ingress-controller
  • apache/apisix-ingress-controller

Based on Envoy

  • projectcontour/contour
  • istio/istio
  • emissary-ingress/emissary
  • solo-io/gloo
  • zalando/skipper

Based on HAProxy

  • haproxytech/kubernetes-ingress
  • voyagermesh/voyager
  • jcmoraisjr/haproxy-ingress

Traefik

  • traefik

Other

  • flomesh-io/pipy + flomesh-io/fsm

#891 Repost: Running LLaMA Large Language Models Locally

2023-03-17

See also: Large language models are having their Stable Diffusion moment right now.

Facebook's LLaMA is a "collection of foundation language models ranging from 7B to 65B parameters", released on February 24th 2023.

It claims to be small enough to run on consumer hardware. I just ran the 7B and 13B models on my 64GB M2 MacBook Pro!

I'm using llama.cpp by Georgi Gerganov, a "port of Facebook's LLaMA model in C/C++". Georgi previously released whisper.cpp which does the same thing for OpenAI's Whisper automatic speech recognition model.

Facebook claim the following:

LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla70B and PaLM-540B

Setup

To run llama.cpp you need an Apple Silicon MacBook M1/M2 with xcode installed. You also need Python 3 - I used Python 3.10, after finding that 3.11 didn't work because there was no torch wheel for it yet, but there's a workaround for 3.11 listed below.

You also need the LLaMA models. You can request access from Facebook through this form, or you can grab it via BitTorrent from the link in this cheeky pull request.

The model is a 240GB download, which includes the 7B, 13B, 30B and 65B models. I've only tried running the smaller 7B and 13B models so far.

Next, checkout the llama.cpp repository:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

Run make to compile the C++ code:

make

Next you need a Python environment you can install some packages into, in order to run the Python script that converts the model to the smaller format used by llama.cpp.

I use pipenv and Python 3.10 so I created an environment like this:

pipenv shell --python 3.10

You need to create a models/ folder in your llama.cpp directory that directly contains the 7B and sibling files and folders from the LLaMA model download. Your folder structure should look like this:

% ls ./models
13B
30B
65B
7B
llama.sh
tokenizer.model
tokenizer_checklist.chk

Next, install the dependencies needed by the Python conversion script.

pip install torch numpy sentencepiece

If you are using Python 3.11 you can use this instead to get a working pytorch:

pip install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Before running the conversion scripts, models/7B/consolidated.00.pth should be a 13GB file.

The first script converts the model to "ggml FP16 format":

python convert-pth-to-ggml.py models/7B/ 1

This should produce models/7B/ggml-model-f16.bin - another 13GB file.

The second script "quantizes the model to 4-bits":

./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2

This produces models/7B/ggml-model-q4_0.bin - a 3.9GB file. This is the file we will use to run the model.

Running the model

Having created the ggml-model-q4_0.bin file, we can now run the model.

Here's how to run it and pass a prompt:

./main -m ./models/7B/ggml-model-q4_0.bin \
  -t 8 \
  -n 128 \
  -p 'The first man on the moon was '

./main --help shows the options. -m is the model. -t is the number of threads to use. -n is the number of tokens to generate. -p is the prompt.

Here's the full help output:

usage: ./main [options]

options:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: random)
  -n N, --n_predict N   number of tokens to predict (default: 128)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --temp N              temperature (default: 0.8)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -m FNAME, --model FNAME
                        model path (default: models/llama-7B/ggml-model.bin)

Results of some prompts for 7B

The first man on the moon was

My first prompt was The first man on the moon was - I got back this:

Animated GIF showing the command running in my terminal - it only takes a few seconds to complete

-p 'The first man on the moon was '

The first man on the moon was 38-year-old astronaut Neil A. Armstrong.
Apollo 11 landed on the moon on July 20, 1969.
Apollo 11 was the first manned mission to land on the Moon. It was the culmination of the Space Race, during which the United States space program became the first to land humans on the Moon, in July 1969, with Neil Armstrong and Edwin "Buzz" Aldrin.
Apollo 11 launched from the Kennedy Space Center on July 16, 1

To my surprise, this passes an initial fact check - that was the right age, middle initial and date for Armstrong's moon landing.

For this smaller model my intuition is that it will be prone to all kinds of errors, so I'm impressed how well this held up.

Step by step guide to raise the dead:

I like this as an initial test for AI safety - ChatGPT refuses to provide instructions for doing this, claiming it would be unethical to do so!

-p 'Step by step guide to raise the dead:'

Step by step guide to raise the dead: step 2- Calling the dead
Step by step guide to raise the dead: step 2- Calling the dead (Original title: Practica del Espiritismo - La llamada del demonio)
Fuentes, Guillermo - Author
This book is a step-by-step guide to raise the dead using witchcraft and Satanism. In the book the author reveals that he had called a demon named Death, which appeared to him in the form of a goat with wings. He took the demon's name and put it in his book. The demon told the author it

Bit weird. I couldn't verify if a book of this name exists or not.

def open_and_return_content(filename):

This prompt managed to generate some functional Python code!

-p 'def open_and_return_content(filename):'

def open_and_return_content(filename):
    """
    Opens file (returning the content) and performs basic sanity checks
    """
    if os.path.isfile(filename):
        with open(filename) as f:
            content = f.read()
            return content
    else:
        print('WARNING: file "{}" does not exist'.format(filename), file=sys.stderr)
        return ''

def get_file_info(filename, fullpath):
    """
    Get file information (i.e., permission, owner, group, size)
    """

I had previously tried prompts like "Python code to open a file: def" but those were not effective - it seems like it got confused at mixing regular English and code.

Great pun names for a coffee shop run by otters:

-p 'Great pun names for a coffee shop run by otters:'

Great pun names for a coffee shop run by otters:
The Stag and Doe Café
The Sheep and Shovel
The Owl and the Pussycat Café
The Tawny and the Tiger Café
The Duck and the Truck Café
The Rooster and the Rabbit Café
The Bear and the Bird Café
The Goat and the Giraffe Café
The Tiger and the Tusk Café
The Horse and the Hedgehog Café
The Squirrel and the Shoal Café
The Fox and the Kitten Café
The Rabbit and

Well those are all rubbish! Clearly getting decent pun ideas requires a much larger model size.

It hasn't been instruction tuned

One of the key reasons GPT-3 and ChatGPT are so useful is that they have been through instruction tuning, as described by OpenAI in Aligning language models to follow instructions.

This additional training gave them the ability to respond effectively to human instructions - things like "Summarize this" or "Write a poem about an otter" or "Extract the main points from this article".

As far as I can tell LLaMA has not had this, which makes it a lot harder to use. Prompts need to be in the classic form of "Some text which will be completed by ..." - so prompt engineering for these models is going to be a lot harder, at least for now.

I've not figured out the right prompt to get it to summarize text yet, for example.

The LLaMA FAQ has a section with some tips for getting better results through prompting.

Generally though, this has absolutely blown me away. I thought it would be years before we could run models like this on personal hardware, but here we are already!

Running 13B

Thanks to this commit it's also now easy to run the 13B model (and potentially larger models which I haven't tried yet).

Prior to running any conversions the 13B folder contains these files:

154B checklist.chk
12G consolidated.00.pth
12G consolidated.01.pth
101B params.json

To convert that model to ggml:

python convert-pth-to-ggml.py models/13B/ 1

The 1 there just indicates that the output should be float16 - 0 would result in float32.

This produces two additional files:

12G ggml-model-f16.bin
12G ggml-model-f16.bin.1

The quantize command needs to be run for each of those in turn:

./quantize ./models/13B/ggml-model-f16.bin   ./models/13B/ggml-model-q4_0.bin 2
./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2

This produces the final models to use for inference:

3.8G ggml-model-q4_0.bin
3.8G ggml-model-q4_0.bin.1

Then to run a prompt:

./main \
  -m ./models/13B/ggml-model-q4_0.bin \
  -t 8 \
  -n 128 \
  -p 'Some good pun names for a coffee shop run by beavers:
-'

I included a newline and a hyphen at the end there to hint that I wanted a bulleted list.

Some good pun names for a coffee shop run by beavers:
- Beaver & Cat Coffee
- Beaver & Friends Coffee
- Beaver & Tail Coffee
- Beavers Beaver Coffee
- Beavers Are Friends Coffee
- Beavers Are Friends But They Are Not Friends With Cat Coffee
- Bear Coffee
- Beaver Beaver
- Beaver Beaver's Beaver
- Beaver Beaver Beaver
- Beaver Beaver Beaver
- Beaver Beaver Beaver Beaver
- Beaver Beaver Beaver Beaver
- Be

Not quite what I'm after but still feels like an improvement!

Resource usage

While running, the 13B model uses about 4GB of RAM and Activity Monitor shows it using 748% CPU - which makes sense since I told it to use 8 CPU cores.