Python (5) - 码厩

#79 Deskewing a tilted image with the Hough transform

Python OpenCV Hough transform Image processing 2021-06-04

Original image (content tilted by 15°):

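Principle

Many topics are involved here and I know none of them: image processing, mathematics, signal transforms, and more.

Still, the material I found seems to relate to the two concepts below.

If anyone reading this understands the underlying principle and can explain it to me (ninedoors#126), I would be very grateful!

Fourier Transform (FT)

The Fourier transform is a linear integral transform that converts a signal between the time (or spatial) domain and the frequency domain. It has many applications in physics and engineering.

My understanding is that it expresses a varying curve as a sum of trigonometric functions.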
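To make the "sum of trigonometric functions" idea concrete, here is a minimal sketch using only the standard library. The test signal, the helper name `dft`, and the 0.1 cut-off are assumptions chosen for illustration; the post's own code uses `np.fft.fft2` on a 2-D image instead.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# Hypothetical signal: 3 cycles with amplitude 1.0 plus 5 cycles with amplitude 0.5.
N = 64
signal = [
    math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 5 * t / N)
    for t in range(N)
]

spectrum = dft(signal)
# For a real sine of amplitude A, the magnitude in its frequency bin is A * N / 2.
amplitudes = [2 * abs(c) / N for c in spectrum[: N // 2]]
dominant = [k for k, a in enumerate(amplitudes) if a > 0.1]
print(dominant)  # [3, 5]
```

The two sine components are recovered as the only bins with a non-negligible amplitude, which is the sense in which the transform splits a curve into trigonometric parts.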

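Hough Transform

The Hough transform is a feature-extraction technique widely used in image analysis, computer vision, and digital image processing. It is used to identify features in an object, such as lines. Roughly, the algorithm works as follows: given an object and the kind of shape to identify, it votes in a parameter space to determine the shape of the object, and the result is decided by the local maxima in the accumulator space.

Wikipedia puts it clearly: it analyzes the features of an image and identifies the shapes in it.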
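The voting idea can be sketched in a few lines of plain Python. This is an illustrative toy, not what `cv2.HoughLinesP` does internally; the function name `hough_strongest_line`, the point set, and the one-degree angular resolution are all assumptions.

```python
import math
from collections import Counter


def hough_strongest_line(points, theta_steps=180):
    """Vote in (rho, theta) space and return the most-voted line.

    Each point (x, y) votes for every line rho = x*cos(theta) + y*sin(theta)
    that passes through it; rho is rounded to the nearest integer bin.
    """
    accumulator = Counter()
    for x, y in points:
        for step in range(theta_steps):
            theta = math.pi * step / theta_steps
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            accumulator[(rho, step)] += 1
    (rho, step), votes = accumulator.most_common(1)[0]
    return rho, step * 180 / theta_steps, votes


# Hypothetical input: 20 points on the horizontal line y = 5 plus two outliers.
points = [(x, 5) for x in range(0, 200, 10)] + [(3, 17), (110, 2)]
rho, theta_deg, votes = hough_strongest_line(points)
print(rho, theta_deg, votes)  # 5 90.0 20
```

The accumulator cell with the most votes corresponds to the line shared by the most points; the outliers never collect enough votes to matter.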

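Steps

Prepare the original image (grayscale conversion + size adjustment) to drop irrelevant data and reduce computation
Apply the Fourier transform to get the frequency-domain image
Detect lines with the Hough transform
Compute the skew angle
Rotate to correct the image

Code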

import logging
import math

import cv2
import numpy as np

logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s [%(name)s:%(funcName)s#%(lineno)s] %(message)s')
LOG = logging.getLogger(__name__)

# 1. Read the file as grayscale
filepath = '/tmp/rotated.png'
img = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
if img is None:
    LOG.error('image does not exist or cannot be read: %s', filepath)
    raise SystemExit(1)

# 2. Pad the image
h, w = img.shape[:2]
new_h = cv2.getOptimalDFTSize(h)  # optimal size for the DFT
new_w = cv2.getOptimalDFTSize(w)
right = new_w - w
bottom = new_h - h
# Border padding: cv2.copyMakeBorder(src, top, bottom, left, right, borderType, dst=None)
#   BORDER_CONSTANT    constant: every added pixel is set to value
#   BORDER_REPLICATE   repeat the edge pixel
#   BORDER_REFLECT     mirror, edge pixel included
#   BORDER_REFLECT_101 mirror, edge pixel not repeated
#   BORDER_WRAP        wrap around to the opposite edge
nimg = cv2.copyMakeBorder(img, 0, bottom, 0, right, borderType=cv2.BORDER_CONSTANT, value=0)
cv2.imshow('new image', nimg)

# 3. Fourier transform to get the frequency-domain image
f = np.fft.fft2(nimg)
fshift = np.fft.fftshift(f)
magnitude = np.log(np.abs(fshift) + 1)  # +1 avoids log(0) = -inf
LOG.info(magnitude)
# Binarize
magnitude_uint = magnitude.astype(np.uint8)
ret, thresh = cv2.threshold(magnitude_uint, 11, 255, cv2.THRESH_BINARY)
LOG.info(ret)
cv2.imshow('thresh', thresh)
LOG.info(thresh.dtype)

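# 4. Probabilistic Hough line transform
lines = cv2.HoughLinesP(thresh, 2, np.pi/180, 30, minLineLength=40, maxLineGap=100)
if lines is None:  # HoughLinesP returns None when no line is found
    lines = []
LOG.info('line number: %d', len(lines))
# Create a new white image and mark the detected lines on it
lineimg = np.ones(nimg.shape, dtype=np.uint8)
lineimg = lineimg * 255
for index, line in enumerate(lines):
    LOG.info('draw line#%d: %s', index, line)
    x1, y1, x2, y2 = line[0]
    # lineimg has a single channel, so only the first colour component (0, black) is used
    cv2.line(lineimg, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.imshow('line image', lineimg)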

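# 5. Compute the skew angle
piThresh = np.pi / 180
pi2 = np.pi / 2
angles = []
for index, line in enumerate(lines):
    LOG.info('line#%d: %s <===============', index, line)
    x1, y1, x2, y2 = line[0]
    if x2 - x1 == 0:
        LOG.debug('skip 1')  # vertical segment: the slope is undefined
        continue
    slope = (y2 - y1) / (x2 - x1)
    LOG.debug('slope: %r', slope)
    # NOTE: slope is a tangent, not an angle in radians, so the first test only
    # approximates "within about 1 degree of horizontal", and the pi/2 test
    # does not actually detect near-vertical segments.
    if abs(slope) < piThresh or abs(slope - pi2) < piThresh:
        LOG.debug('skip 2: %r', slope)
        continue
    angle = math.atan(slope)
    LOG.info('angle 1: %r', angle)
    angle = angle * (180 / np.pi)  # radians to degrees
    LOG.info('angle 2: %r', angle)
    # Spectrum lines are perpendicular to the text lines, hence -90; the division
    # by w/h presumably compensates for a non-square image.
    angle = (angle - 90)/(w/h)
    LOG.info('angle 3: %r', angle)
    angles.append(angle)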

if not angles:
    LOG.info('The image is already straight enough, leave it alone!')
else:
    # from numpy.lib.function_base import average
    # angle = average(angles)
    LOG.info('variance: %r', np.array(angles).var())
    LOG.info('standard deviation: %r', np.array(angles).std())
    # angle = np.mean(angles)
    import statistics
    # LOG.debug(statistics.multimode(angles))
    # angle = statistics.mode(angles)
    # statistics.StatisticsError: geometric mean requires a non-empty dataset  containing positive numbers
    # statistics.StatisticsError: harmonic mean does not support negative values
    angle = statistics.median(angles)
    if 180 > angle > 90:
        angle = 180 - angle
    elif -180 < angle < -90:
        angle = 180 + angle
    LOG.info('==> %r, %r', angles, angle)

    # 6. Rotate
    center = (w//2, h//2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    LOG.debug('=========== RotationMatrix2D ===========')
    for i in M:
        LOG.debug(i)
    LOG.debug('^======================================^')
    rotated = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)
    cv2.imshow('rotated', rotated)

cv2.waitKey(0)
cv2.destroyAllWindows()
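As a side note on step 5, the orientation of a segment can also be computed with `math.atan2`, which copes with vertical segments without a special case. The sketch below is an alternative illustration under stated assumptions, not the formula used above: the helper name `median_segment_angle` and the sample segments are hypothetical, and no aspect-ratio correction is applied.

```python
import math
import statistics


def median_segment_angle(segments):
    """Median orientation, in degrees within (-90, 90], of (x1, y1, x2, y2) segments."""
    angles = []
    for x1, y1, x2, y2 in segments:
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))  # works for vertical segments too
        # A segment has no direction, so fold the result into (-90, 90].
        if angle > 90:
            angle -= 180
        elif angle <= -90:
            angle += 180
        angles.append(angle)
    return statistics.median(angles)


# Hypothetical segments: three tilted by about 15 degrees and one vertical outlier.
segments = [(0, 0, 100, 27), (10, 5, 110, 32), (100, 27, 0, 0), (0, 0, 0, 50)]
print(round(median_segment_angle(segments), 1))  # 15.1
```

Taking the median, as the code above also does, keeps a single stray segment from skewing the estimate.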

References and further reading

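#78 Running a Django app from a single file

I wrote an earlier post on this for Python 2.7 + Django 1.x (link). While reading about Django I remembered it and tried to run it, only to find that it no longer runs, so here is an update tried with Python 3.8 + Django 2.2 / Django 3.2.
PS: It still has no practical value; it is just for fun.
The changes from Django 2.2 to Django 3.2 do not affect anything used in this single file, so the same code works for both.
The code is basically the same as the earlier version, so I won't paste it here. If you are interested, open it here: code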

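#77 About CPython's Shannon Plan

Python 2021-05-18

I just saw the news today that Microsoft is funding Guido van Rossum's Shannon Plan, which aims to make Python 5 times faster within 4 years. The speedup is meant to be painless and cause no compatibility problems. The first results may show up in 3.11, due next year: at least twice as fast.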

#76 Rotating images with Pillow

Python PIL Image processing 2021-05-17

Sorry, another filler post.

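#75 Developing Python extensions in Go

Golang Python PythonExt 2021-05-15

Python has a good partner in C/C++: Python provides productivity, while C/C++ takes care of efficiency.
This post explores the possibility of mixed Python + Go development.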

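#74 Python weak references

Python 2021-02-20

Garbage collection

Garbage Collector, abbreviated GC

Garbage collection in Python (CPython) is based mainly on simple reference counting, backed by a cycle collector (the gc module) for reference cycles.

Weak references

In computer programming, a weak reference, as opposed to a strong reference, is a reference that does not protect the referenced object from being collected by the garbage collector. An object referenced only by weak references is considered unreachable (or weakly reachable) and may therefore be collected at any time. Some garbage-collected languages, such as Java, C#, Python, Perl, and Lisp, support weak references to varying degrees.

In one sentence: a weak reference does not increase the reference count, which makes it friendlier to a reference-counting GC.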
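A small check of that claim, assuming CPython (where `sys.getrefcount` exposes the reference count). The class name `Node` is hypothetical; a user-defined class is used because plain `object()` instances cannot be weakly referenced.

```python
import sys
import weakref


class Node:
    """Hypothetical class used only for this demonstration."""


node = Node()
before = sys.getrefcount(node)

strong = node                  # a strong reference: the count goes up by one
after_strong = sys.getrefcount(node)

weak = weakref.ref(node)       # a weak reference: the count is unchanged
after_weak = sys.getrefcount(node)

print(after_strong - before, after_weak - after_strong)  # 1 0
print(weakref.getweakrefcount(node))                     # 1
```

Only the differences are compared, since the absolute value returned by `sys.getrefcount` includes temporary references made by the call itself.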

Garbage collection and the circular-reference problem

import gc

IDS = {}

class A:
    def __del__(self):
        _id = id(self)
        print('A.__del__ %s: 0x%x' % (IDS[_id], _id))

OBJS = {i: A() for i in range(3)}
for i, obj in OBJS.items():
    _id = id(obj)
    IDS[_id] = f'OBJS[{i}]'
    print('%s: 0x%x' % (IDS[_id], _id))
OBJS[1].attr = OBJS[1]
print('1' * 50)
print('====> del OBJS[0]')
del OBJS[0]
print('2' * 50)
print('====> del OBJS[1]')
del OBJS[1]
print('3' * 50)
print('====> del OBJS[2]')
del OBJS[2]
print('4' * 50)
gc.collect()

import weakref

print()
print('=' * 50)
class B:
    def __init__(self, obj):
        self.attrs = [obj]
    def __del__(self):
        _id = id(self)
        print('B.__del__ %s: 0x%x' % (IDS[_id], _id))
a = A()
b = B(a)
a.xyz = b
IDS[id(a)] = 'a'
IDS[id(b)] = 'b'
del a, b  # do nothing
print('=' * 40)
gc.collect()  # will del a and b

print()
print('=' * 50)
class C:
    def __init__(self, obj):
        self.attrs = [weakref.ref(obj)]
    def __del__(self):
        _id = id(self)
        print('C.__del__ %s: 0x%x' % (IDS[_id], _id))
a = A()
c = C(a)
a.xyz = c
IDS[id(a)] = 'a'
IDS[id(c)] = 'c'
del a, c
print('=' * 40)
gc.collect()

Standard library: weakref

class weakref.ref(object[, callback])  (callback: called when the referent is about to be finalized)
weakref.proxy(object[, callback])
weakref.getweakrefcount(object)
weakref.getweakrefs(object)
class weakref.WeakKeyDictionary([dict])
.keyrefs()
class weakref.WeakValueDictionary([dict])
.valuerefs()
class weakref.WeakSet([elements])

Set class that keeps weak references to its elements. An element will be discarded when no strong reference to it exists any more.
class weakref.WeakMethod(method)
class weakref.finalize(obj, func, /, *args, **kwargs)
weakref.ReferenceType
weakref.ProxyType
weakref.CallableProxyType
weakref.ProxyTypes

import weakref

class Klass:
    pass

obj = Klass()
ref = weakref.ref(obj)
print(ref())
del obj
print(ref())  # None

obj = Klass()
p = weakref.proxy(obj)
print(p)
del obj
print(p)  # ReferenceError: weakly-referenced object no longer exists
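Two other entries from the list above, `WeakValueDictionary` and `finalize`, can be sketched the same way. The `Image` class and the `events` list are hypothetical names for illustration, and the immediate cleanup after `del` relies on CPython's reference counting.

```python
import gc
import weakref


class Image:
    """Hypothetical cached resource."""

    def __init__(self, name):
        self.name = name


cache = weakref.WeakValueDictionary()   # values are held only weakly
events = []

img = Image('logo')
cache['logo'] = img
weakref.finalize(img, events.append, 'logo released')  # runs once img is collected

print('logo' in cache)          # True while a strong reference exists
del img
gc.collect()                    # not required on CPython here, but harmless
print('logo' in cache, events)  # False ['logo released']
```

A cache built this way never keeps an object alive on its own, which is the usual reason to reach for `WeakValueDictionary`.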

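References and further reading

Wikipedia, Weak reference
https://docs.python.org/3/library/weakref.html
Zhihu column, 一文看懂弱引用！Python中的weakref有什么用？ (Weak references explained: what is weakref in Python for?)
cnblogs (博客园), 爱编程的小灰灰, Python3标准库：weakref对象的非永久引用 (Python 3 standard library: weakref, impermanent references to objects)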

#73 The Python logging module [editing in progress]

Python logging Logs 2021-02-08

:) This post is still being edited and is not available for viewing yet...

#72 Tornado: max_clients limit reached, request queued

Python Tornado Exceptions 2021-02-05

I noticed many Tornado "max_clients limit reached, request queued." messages in the logs.

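#71 Python type hints

Python Python3 2021-01-28

Type hint: translated literally from English, this would read as "input hint".

Dynamically typed languages have a trait that is both an advantage and a drawback: static type checking is hard, and IDEs and other development tools struggle to determine a variable's type accurately from the code.
Type annotations, introduced in Python 3.0 and gradually refined since, make static type checking possible for Python.
PS: The Python runtime does not enforce type annotations; it gives no hints or warnings.
PS: Some earlier tools could do type checking through comments (type comments), to the same effect.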
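A quick way to see that annotations are recorded but not enforced. This sketch assumes a variant of the `greeting` function that calls `str()` on its argument, so that the mistyped call does not fail for an unrelated reason.

```python
def greeting(name: str) -> str:
    # str() keeps the call below from raising; the annotation alone would not stop it.
    return 'Hello ' + str(name)


# Annotations are stored on the function object but not checked when it is called.
print(greeting.__annotations__)
print(greeting(42))  # an int is accepted at runtime; a static checker would flag it
```

A static checker such as mypy would report the `greeting(42)` call, while the interpreter runs it without complaint.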

PEP

Key: the first letter is the PEP type (S = Standards Track, I = Informational) and the second is the status (F = Final, A = Accepted, P = Provisional); the columns that follow are PEP number, creation date, Python version, and title.

SF 3107 [2006-12-02] (3.0 ) Function Annotations  function annotations first introduced
SF 3141 [2007-04-23] (    ) A Type Hierarchy for Numbers
SF  424 [2012-07-14] (3.4 ) A method for exposing a length hint
SF  451 [2013-08-08] (3.4 ) A ModuleSpec Type for the Import System
SP  484 [2014-09-29] (3.5 ) Type Hints            type hints
IF  483 [2014-12-19] (    ) The Theory of Type Hints
IF  482 [2015-01-08] (    ) Literature Overview for Type Hints
SF  526 [2016-08-09] (3.6 ) Syntax for Variable Annotations
SA  544 [2017-03-05] (3.8 ) Protocols: Structural subtyping (static duck typing)
SA  560 [2017-09-03] (3.7 ) Core support for typing module and generic types
SA  563 [2017-09-08] (3.7 ) Postponed Evaluation of Annotations
SA  561 [2017-09-09] (3.7 ) Distributing and Packaging Type Information
SA  585 [2019-03-03] (3.9 ) Type Hinting Generics In Standard Collections
SA  586 [2019-03-14] (3.8 ) Literal Types
SA  591 [2019-03-15] (3.8 ) Adding a final qualifier to typing
SA  589 [2019-03-20] (3.8 ) TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys
SA  593 [2019-04-26] (3.9 ) Flexible function and variable annotations
SA  604 [2019-08-28] (3.10) Allow writing union types as ``X | Y``
SA  613 [2020-01-21] (    ) Explicit Type Aliases     introduces explicit type aliases
SA  647 [2020-10-07] (3.10) User-Defined Type Guards  introduces TypeGuard, for narrowing types during type checking
S   649 [2021-01-11] (    ) Deferred Evaluation Of Annotations Using Descriptors
S   655 [2021-01-30] (3.11) Marking individual TypedDict items as required or potentially-missing

Basic usage

If type checking is a concern for a particular function or class, the @no_type_check decorator can be applied to it.

def greeting(name: str) -> str:
    return 'Hello ' + name

Vector = list[float]

def scale(scalar: float, vector: Vector) -> Vector:
    return [scalar * num for num in vector]

Union

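from typing import Optional, Union

Address = tuple[str, int]

# The parameter needs a name; a function that simply returns is annotated -> None
# (NoReturn is for functions that never return).
def connect(address: Union[Address, str]) -> None:
    pass

Optional[T] is shorthand for Union[T, None].

def get_argument(name: str, default: Optional[str] = None) -> Union[str, None]:
    pass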

Other commonly used types

Any
Callable
ClassVar
NewType
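The four names above can be seen together in one short sketch. Everything here (`UserId`, `Counter`, `apply`, `describe`) is a hypothetical example, not code from the typing documentation.

```python
from typing import Any, Callable, ClassVar, NewType

UserId = NewType('UserId', int)   # a distinct type for checkers; a plain int at runtime


class Counter:
    instances: ClassVar[int] = 0  # class-level attribute, not a per-instance field

    def __init__(self) -> None:
        Counter.instances += 1


def apply(func: Callable[[int], int], value: int) -> int:
    """Call func, which must take one int and return an int."""
    return func(value)


def describe(value: Any) -> str:
    """Any switches off static checking for this parameter."""
    return repr(value)


uid = UserId(7)
Counter()
print(apply(lambda n: n * 2, uid), Counter.instances, describe(uid))  # 14 1 7
```

`NewType` costs nothing at runtime: `UserId(7)` is simply the integer 7, and the distinction exists only for the type checker.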

Stub files

https://github.com/python/typeshed
Collection of library stubs for Python, with static types

Type-checking tools

IDEs such as PyCharm can be configured to do type checking.
Editors such as VSCode or Atom can also support type checking through plugins.

mypy
pyright
pytype
pyre

See also:

mypy

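References and further reading

Python documentation, typing: Support for type hints
Alibaba Cloud Developer Community, 利用Stub File标注Python文件类型 (Annotating Python file types with stub files)
Zhihu, Python 类型检查指南 (A guide to Python type checking)
Typing Python with typing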

#70 Python f-string

Python Python3 2021-01-25

Earlier string formatting methods

1. Formatting with the `format` method

This method is not used much (it corresponds to string.Formatter).

Format String Syntax

replacement_field ::=  "{" [field_name] ["!" conversion] [":" format_spec] "}"

field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
arg_name          ::=  [identifier | digit+]
attribute_name    ::=  identifier
element_index     ::=  digit+ | index_string
index_string      ::=  <any source character except "]"> +

conversion        ::=  "r" | "s" | "a"

format_spec       ::=  [[fill]align][sign][#][0][width][grouping_option][.precision][type]
fill              ::=  <any character>
align             ::=  "<" | ">" | "=" | "^"
sign              ::=  "+" | "-" | " "
width             ::=  digit+
grouping_option   ::=  "_" | ","
precision         ::=  digit+
type              ::=  "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

print('{} {}'.format('Hello', 'World'))
print('There are three people in my family: {0}, {1}, and I, and I love my {0} a little more.'.format('father', 'mother'))
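The grammar above is easier to read next to a few concrete cases. The following lines are illustrative examples of the `field_name`, `conversion`, and `format_spec` parts; the values are arbitrary.

```python
centered = '[{:*^11}]'.format('mid')     # fill '*', centre-align, width 11
rounded = '[{:>8.3f}]'.format(3.14159)   # right-align, width 8, precision 3
grouped = '[{:+,}]'.format(1234567)      # explicit sign and thousands separator
hexed = '[{:#06x}]'.format(255)          # alternate form, zero-padded to width 6
# field_name with attribute access and indexing, plus the !r conversion
fields = '{0.real} {1[0]} {2!r}'.format(1 + 2j, ['a', 'b'], 'x')

print(centered)  # [****mid****]
print(rounded)   # [   3.142]
print(grouped)   # [+1,234,567]
print(hexed)     # [0x00ff]
print(fields)    # 1.0 a 'x'
```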

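2. Template string formatting

I have only seen this method in the documentation and have never actually used it.
PS: The abandoned PEP 215 once proposed the syntax $'a = $a, b = $b'.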

Template strings

from string import Template
Template('$who likes $what').substitute(who='tim', what='kung pao')  # 'tim likes kung pao'
Template('$who likes $what').safe_substitute({'who': 'tim'})  # 'tim likes $what' (missing keys are left in place)

3. Percent (%) formatting

This is probably still the most mainstream way to format strings.

printf-style String Formatting

print('Hello %s' % 'World')
print('action %s cost %.3f seconds' % ('download', 0.123456789))
print('%(language)s has %(number)03d quote types.' % {'language': "Python", "number": 2})

f-strings, new in Python 3.6

Formatted string literals

f_string          ::=  (literal_char | "{{" | "}}" | replacement_field)*
replacement_field ::=  "{" f_expression ["="] ["!" conversion] [":" format_spec] "}"
f_expression      ::=  (conditional_expression | "*" or_expr)
                         ("," conditional_expression | "," "*" or_expr)* [","]
                       | yield_expression
conversion        ::=  "s" | "r" | "a"
format_spec       ::=  (literal_char | NULL | replacement_field)*
literal_char      ::=  <any code point except "{", "}" or NULL>

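It reuses a lot from the format-method syntax (the format spec and the conversion part), but it is more powerful.
The main difference is that the braces may contain conditional expressions and other expressions (including yield).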

a = 3.1415926
f'{a}'
f'{a:.2f}'

f"{1 + 1}", f"{{1 + 1}}", f"{{{1 + 1}}}"
# ('2', '{1 + 1}', '{2}')
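A few more cases that exercise the grammar above: a nested replacement field inside the format spec, a conditional expression, and ordinary calls. The variable names are arbitrary.

```python
width, precision = 10, 3
value = 3.1415926
items = ['a', 'b', 'c']

padded = f'[{value:{width}.{precision}f}]'                  # nested fields in the format spec
parity = f'{"even" if len(items) % 2 == 0 else "odd"}'      # conditional expression
summary = f'{len(items)} items: {", ".join(items)}'         # function and method calls

print(padded)   # [     3.142]
print(parity)   # odd
print(summary)  # 3 items: a, b, c
```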

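Note: backslash escapes cannot be used inside the expression part of an f-string (this restriction was lifted in Python 3.12)!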

f'{John\'s}'
# SyntaxError: f-string expression part cannot include a backslash

`r`, `s`, `a`

!r -> repr()
!s -> str()
!a -> ascii()

PS: ascii() was introduced in Python 3. It is similar to repr(), but its output uses only ASCII characters.
For example: a = '中国'; print(f'{a!a} {ascii(a)}') prints '\u4e2d\u56fd' '\u4e2d\u56fd'.

print(f'{a!r}')
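The difference between the conversions shows up most clearly on a class that defines both `__str__` and `__repr__`. The `Point` class below is a hypothetical example.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        return f'({self.x}, {self.y})'

    def __repr__(self):
        return f'Point(x={self.x}, y={self.y})'


p = Point(1, 2)
plain = f'{p}'        # no conversion: format() falls back to __str__ here
as_str = f'{p!s}'     # str(p)
as_repr = f'{p!r}'    # repr(p)
as_ascii = f'{"é"!a}'  # ascii('é'): non-ASCII characters are escaped

print(plain, as_str, as_repr, as_ascii)
```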

`=`

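Someone proposed adding !d to print the expression itself, followed by an equals sign and the computed value, for example f'{1 + 1!d}' => 1 + 1=2.
It was eventually implemented like this (Python 3.8+):

a = 3.14
b = 1
print(f'{a + b=}')   # a + b=4.140000000000001
print(f'{a + b = }') # a + b = 4.140000000000001

Quite fun!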
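The `=` specifier also combines with a conversion or a format spec, which the grammar above permits. A short sketch, assuming Python 3.8 or later; the variable names are arbitrary.

```python
a = 3.14
b = 1
name = 'Tim'

with_spec = f'{a + b=:.2f}'   # the format spec applies to the value
default = f'{name=}'          # repr() is used when nothing else is given
as_str = f'{name=!s}'         # an explicit conversion overrides that default

print(with_spec)  # a + b=4.14
print(default)    # name='Tim'
print(as_str)     # name=Tim
```

Whitespace around `=` inside the braces is preserved in the output, which is why the two forms shown earlier print differently.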

References and further reading

Bentley University, A Guide to f-string Formatting in Python
RealPython, Python 3's f-Strings: An Improved String Formatting Syntax (Guide)
Python, PEP 498 -- Literal String Interpolation
